
Ask HN: What's the 2025 stack for a self-hosted photo library with local AI?

62 comments

·June 30, 2025

First of all, this is purely a personal learning project for me, aiming to combine three of my passions: photography, software engineering, and my family memories. I have a large collection of family photos and want to build an interactive experience to explore them, à la Google Photos or Apple Photos.

My goal is to create a system with smart search capabilities, and one of the most important requirements is that it must run entirely on my local hardware. Privacy is key, but the main driver is the challenge and joy of building it myself (and, obviously, to learn).

The key features I'm aiming for are:

Automatic identification and tagging of family members (local face recognition).

Generation of descriptive captions for each photo.

Natural language search (e.g., "Show me photos of us at the beach in Luquillo from last summer").

I've already prompted AI tools for a high-level project plan, and they provided a solid blueprint (e.g., Ollama with LLaVA, a vector DB like ChromaDB, you know the drill). Now, I'm highly interested in the real-world human experience. I'm looking for advice, learning stories, and the little details that only come from building something similar.

What tools, models, and best practices would you recommend for a project like this in 2025? Specifically, I'm curious about combining structured metadata (EXIF), face recognition data, and semantic vector search into a single, cohesive application.

Any and all advice would be deeply appreciated. Thanks!

crobibero

I think Immich checks a lot of these

https://immich.app/

sircastor

Immich is what I'm using right now. I'm running it in a Docker container on my Synology. It was very advantageous to spin up another docker container on my laptop to do the face recognition work because the Synology was going to take forever on it.

We no longer are auto uploading to Google or Apple.

So far, I really like it. I haven't quite gone 100%, as we're still uploading with Synology's photo app, but Immich provides a much more refined, featured interface.

old-gregg

May I ask: why not use Synology's own photo stack? The web UI is pretty good, the iPhone app is great, it runs locally without depending on Synology servers, and does have face recognition and all other features.

darknavi

If you want a solid "just upload the photos" experience, PhotoSync on iOS is really great.

I think you can use Immich to just look at a folder and not use the backup from phone bits.

noncoml

> We no longer are auto uploading to Google or Apple.

May I ask why? Just curious as the main reason I use Immich is for the auto upload

Edit: Ugh. Can’t read. I somehow read don’t auto upload to Immich.

adezxc

because you don't want your data being held by Google or Apple?

import

Self hosting and owning your own data

sz4kerto

This. It's a fascinating project; it is hard to believe that a FLOSS project can be of such high quality. In my book it's on the level of Postgres (although it's a smaller project, probably).

denysvitali

Their frontend is amazing, their apps are not as performant, and the backend is (IMHO) the worst of them all.

No hate here, I'm really grateful for what they've achieved so far, but I think there's a lot of room for improvement (e.g: proper R/W query split, native S3 integration, faster endpoints, ...). I already mentioned it in their channel (they're a really welcoming community!) and I'm working on an alternative drop-in replacement backend (written in Go) [1] that will hopefully bring all the needed improvements.

TL;DR: It's definitely good, especially for an open-source project, and the team is very dedicated - but it's definitely not Postgres-good

[1]: https://github.com/denysvitali/immich-go-backend

darkwater

Why the focus on S3 for a self-hosted app? Anyway kudos for the effort, I'm not experiencing performance issues in my locally self-hosted Immich installation but more performant software is always welcome.

esseph

Looking at the world around me, so much of it is driven by open source. In fact, I can't name a single piece of electronics around me that isn't using it.

lucideer

Been running immich on my home server for about a year now.

Near zero maintenance stack, incredibly easy to update, the client mobile apps even notify you (unobtrusively) when your server has an update available. The UI is just so polished & features so stable it's hard to believe it's open source.

mossTechnician

This may not interest you, but Ente checks most of these boxes for me. It has face recognition and AI-based object search out of the box, and you can self-host their open-source server without any restrictions. The models they used might be useful for your project.

jamesxv7

Ente is a tremendous proposal. I don't know why I hadn't heard of it before, but I don't think it meets what I'm looking for. But the fact that the software is completely open is impressive.

akho

The Ente self-hosting proposition seems strange. Why would I want to e2e encrypt my photos that I self-host? Sounds like it will only make life more difficult.

mossTechnician

1. "Self-hosted" doesn't always mean "on your own hardware." Some people rent VPSes. This helps keep their data safe.

2. The software is provided without modification; I think it would be stranger to remove the encryption.

idatum

> Some people rent VPSes. This helps keep their data safe.

This is exactly how I self-host Ente and it has been great.

Machine learning for image detection has worked really well for me, especially facial recognition for family members (easy to find that photo to share).

I have the client on my Android mobile, Fire tablet (via F-Droid), and my Windows laptop.

My initial motivation was to replace "cloud" storage for getting photos copied off the phone as soon as possible.

barbazoo

Their pricing page doesn't say anything as far as I can find, but do you still pay Ente if you self host the server as well as the photos ("S3-compatible object storage")?

marcusb

> do you still pay Ente if you self host the server as well as the photos ("S3-compatible object storage")?

No. (I self-host Ente and use their published ios app.)

nico

I don't know about the photo-management aspects. However, I've had very good experiences running gemma3 (4b and 12b) locally via ollama

I've used gemma to process pictures and get descriptions, and also to answer questions about the pictures (e.g. is there a bicycle in the picture?). Haven't tried it for face recognition, but if you have already identified someone in one photo, it can probably tell you if the person in that photo is also in another photo

Just one caveat: if you are processing thousands of pictures, it will take a while to process them all (depending on your hardware and picture size). You could also try creating a processing pipeline, first extracting faces or bounding boxes of the faces with something like opencv, and then passing those to gemma3
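For anyone wanting to try the second half of that pipeline, here is a minimal sketch of asking a local model a question about one image (or one cropped face) through Ollama's HTTP API. It assumes an Ollama server on its default port with the `gemma3:4b` tag already pulled; `build_request` and `ask_about_image` are illustrative helper names, not part of any library, and the OpenCV cropping step is left out.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(image_bytes: bytes, question: str, model: str = "gemma3:4b") -> dict:
    """Build the JSON body for a single, non-streaming vision prompt."""
    return {
        "model": model,
        "prompt": question,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ask_about_image(image_bytes: bytes, question: str, model: str = "gemma3:4b") -> str:
    """Send the request to a running Ollama server and return the text answer."""
    body = json.dumps(build_request(image_bytes, question, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Usage would look something like `ask_about_image(open("photo.jpg", "rb").read(), "Is there a bicycle in the picture?")`. Since each call handles one image, a batch over thousands of photos is just a loop, which is where the processing time mentioned above adds up.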

Please post repo link if you ever decide to open source

jamesxv7

Thanks nico for sharing your experience! That's really helpful. The idea of using OpenCV to create a processing pipeline for face detection before passing it to Gemma is brilliant; I hadn't thought of that. I'll definitely look into using gemma with ollama.

And for sure, if I get this to a point where it's open-source, I'll post the link here!

iforgotpassword

I currently use photoprism, but it's moving rather slowly. Facial recognition misses a lot of faces, and the automatic clustering works fine at first, but once you've tagged a few thousand faces the implementation grinds to a halt and the background worker runs for hours pegging a single CPU core.

The dev is really reluctant to accept external contributions, which has driven away a lot of curious folks willing to contribute.

Immich seems to be the other extreme. It's moving really fast with a lot of contributors, but stuff occasionally breaks and the setup is fiddly. On the other hand, the AI features are 100x more powerful. I just don't like the UI as much as photoprism's. I wish there was some kind of blend of the two, on a middle ground of their dev philosophies.

darkwater

While Immich ships development releases every 2-3 weeks on average, and a breaking one every 4-6 months, they are approaching the stable release, so the pace should slow down a bit. The setup, to be honest, is pretty standard IMO.

coffeecoders

I have been building something like this but for personal use.

As of now, I use a SentenceTransformer model to chunk files, BLIP for captioning (“Family vacation in Banff, February 2025”), and MTCNN with InsightFace for face detection. My index stores captions, face embeddings, and EXIF metadata (date, GPS) for queries like “show photos of us in Banff last winter.” I’m working on integrating ChromaDB for faster searches.

Eventually, I aim to store indexes as:

{
  "filename": "/Vacation/Banff/Wife.jpg",
  "chunk_id": 0,
  "text": "Family at Banff, February 2025",
  "caption_embedding": [0.1, 0.2, ...],
  "face_embeddings": [{"name": "NT", "embedding": [0.3, 0.4, ...]}, ...],
  "exif": {
    "DateTimeOriginal": "2025:02:15",
    "GPSCoordinates": "18.387, -65.992"
  }
}
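
A record shaped like this could be queried roughly as follows. This is a sketch under assumptions, not the linked project's code: `PhotoRecord`, `parse_exif_date`, and `search` are hypothetical names, face matches are reduced to a set of already-resolved person names, and the query embedding is assumed to come from the same model that embedded the captions. The idea is to filter on the structured fields (person, EXIF date) first and only then rank the survivors by caption similarity.

```python
import math
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class PhotoRecord:
    filename: str
    caption_embedding: list[float]
    people: set[str]   # names already resolved from face embeddings
    taken: date        # parsed from EXIF DateTimeOriginal


def parse_exif_date(value: str) -> date:
    """EXIF dates use colons as separators, e.g. '2025:02:15' or '2025:02:15 10:30:00'."""
    return datetime.strptime(value[:10], "%Y:%m:%d").date()


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query_embedding, person=None, start=None, end=None, top_k=5):
    """Filter on structured fields first, then rank the survivors by caption similarity."""
    candidates = [
        r for r in records
        if (person is None or person in r.people)
        and (start is None or r.taken >= start)
        and (end is None or r.taken <= end)
    ]
    candidates.sort(
        key=lambda r: cosine(r.caption_embedding, query_embedding), reverse=True
    )
    return candidates[:top_k]
```

For "show photos of us in Banff last winter", the date range and person would go into the filter arguments and only the "Banff" part would need to be embedded. A vector DB such as ChromaDB would replace the linear scan, but the filter-then-rank shape stays the same.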

I also built a UI (like Spotlight Search) to search through these indexes.

Code (in progress): https://github.com/neberej/smart-search

wooben

I've been running Nextcloud in Docker with the Recognize and Memories apps for about a year and a half now. It's on an off-lease refurbished Dell Precision tower from 2018.

I'm using docker compose to include some supporting containers like go-vod (for hardware transcoding), another nextcloud instance to handle push notifications to the clients, and redis (for caching). I can share some more details, foibles and pitfalls if you'd like.

I initiated a rescan last week, which stacks background jobs in a queue that gets called by cron 2 or 3 times a day. Recognize has been cranking through 10k-20k photos per day, with good results.

I've installed a desktop client on my dad's laptop so he can dump all of the family hard drives we've accumulated over the years. The client does a good job of clearing up disk space after uploading, which is a huge advantage in my setup. My dad has used the OneDrive client before, so he was able to pick up this process very quickly.

Nextcloud also has a decent mobile client that can auto-upload photos and videos, which I recently used to help my mother-in-law upload media from her 7-year-old iPhone.

jan_tse

I run a pretty similar configuration on a Pi 4 with an external hard drive attached, which I offload to other hard drives from time to time. The mobile app auto-syncs specific folders when my phone is connected to the home network. It's not fast, performance-wise, but I mainly need a backup solution.

Gonna check the apps that you mentioned. Feel free to share more details of your set up. Why are you running 2 instances? Edit: I see, probably for the memories app.

nicoburns

It's not self-hosted, but https://ente.io/ is an independent commercial solution with E2E encrypted cloud storage and local AI (EDIT: apparently you can also self-host)

gavin_gee

I swear the single best feature for me would be:

Take my photo catalog stored in Google Photos, Apple Photos, OneDrive, and Amazon Photos. Collate it into a single store and dedupe. Then build a proper timeline and geo/map view for all the photos.
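
Once the exports from each service are sitting in one directory tree, the dedupe step can be sketched with nothing but the standard library. This is a minimal example under a big assumption: duplicates are byte-for-byte identical files. Copies that a service has re-encoded or resized will hash differently and would need something like perceptual hashing instead. `file_digest` and `find_duplicates` are illustrative names.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large videos don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root by content hash; return only groups with more than one file."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each returned group is a set of paths with identical content, so keeping the first entry and removing (or hard-linking) the rest gives a single deduplicated store.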

darknavi

Take a look at something like rclone and it immediately becomes clear that the photo app vendors you listed have no interest in allowing their users to easily access their data programmatically from their services in any meaningful way.

Example: https://rclone.org/googlephotos/#limitations

Glaring example:

> The current google API does not allow photos to be downloaded at original resolution. This is very important if you are, for example, relying on "Google Photos" as a backup of your photos. You will not be able to use rclone to redownload original images. You could use 'google takeout' to recover the original photos as a last resort

swyx

(and semantically index/search, face recognition... what else does AI get us these days?)

rusk

iPhoto used to do this. The Mac photos app that has replaced it since is nowhere near as good.

In fact I would go so far as to say my personal photo management never really recovered from the transition.

joesweetsox

Are any of these systems doing true image-based entity resolution? It seems like it's only pairwise similarity checking. If you are trying to index, say, 20 years of family photos, how do they link kindergarteners to their adult images?
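
To make the distinction concrete, here is a toy sketch of one way pairwise scores can be turned into identities: link every pair of face embeddings above a similarity threshold, then take connected components (single-linkage clustering via union-find). Whether any of the apps mentioned in this thread work this way is not something the sketch claims; the function names and the 0.8 threshold are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_faces(embeddings, threshold=0.8):
    """Link faces whose pairwise similarity meets the threshold, then merge transitively."""
    parent = list(range(len(embeddings)))

    def find(i):
        # Walk up to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(embeddings)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

With this kind of linking, a childhood face and an adult face that are not similar enough to match directly can still end up in one cluster if photos from the years in between bridge the gap. The same chaining can also wrongly merge two different people (siblings, for instance), so the threshold matters a lot.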

chrisgd

This is my dream. I started building something that would upload all my photos from my phone to my desktop, back them up somewhere and then present them 6 at a time on a local website solely so you could look at them again and decide if you wanted to keep them. Heart any you wanted to keep, favorite some, and delete the rest then show me 6 more.

The addition of an AI tool is a great idea.

weinzierl

In addition to all of that I want an AI solution that pre-selects good images for me, so I do not have to go through all of them manually. Similar to Apple Memories or Featured Photos. Is there anything self-hosted like that?

slackpad

Haven’t tried it yet (I’d love to find something like this too) but I saw a conference talk on https://docs.voxel51.com/ that looked pretty interesting. It is kind of a data frame for images with a GUI for exploring them. They make it pretty easy to rip various models over your images to add tags, and to evaluate the results.