Face Recognition

Last Updated: April 21, 2026

To recognise faces, PhotoPrism uses a multi-stage AI pipeline that detects faces, generates embeddings, and clusters similar faces so they can be easily organised by person.

The canonical engineering reference for this pipeline is the package README at internal/ai/face/README.md. This page summarises the developer-facing behaviour; consult the README for the latest thresholds, benchmarks, and test recipes.

How It Works

The face recognition pipeline runs in three stages:

Detection: the ONNX SCRFD detector locates faces in the 720 px thumbnail of each photo (see Thumbnails for how thumbnails are generated with libvips).
Embedding: TensorFlow runs FaceNet to generate a 512-dimensional vector for each face.
Clustering: similar embeddings are grouped with the DBSCAN algorithm so clusters can be assigned to people.

Detection Engine

As of the April 2026 release, PhotoPrism ships a single face detector — the legacy Pigo cascade classifier has been removed from the code base.

ONNX SCRFD 0.5g is an ONNX Runtime-backed CNN that delivers higher recall on occluded, off-axis, or poorly lit faces than the previous Pigo detector. Implementation details:

Consumes 720 px thumbnails with a 640 px model input.
Scheduled on the meta/vision workers, in lock-step with the FaceNet embedding step.
The inference thread count defaults to half the available CPUs (minimum 1 thread).
The prebuilt runtime targets glibc ≥ 2.27 on amd64 / arm64.
When FACE_ENGINE=auto, the detector is used whenever the bundled SCRFD model is present; otherwise, detection is disabled rather than falling back to a legacy engine.

For backwards compatibility, legacy FACE_ENGINE=pigo values are silently mapped to ONNX in internal/ai/face/engine.go:ParseEngine so older configuration files keep working after upgrade. New configurations should use auto, onnx, or none.
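The alias, fallback, and thread defaults described above can be sketched as follows. This is a minimal illustration, not the code in internal/ai/face/engine.go: the names `parseEngine`, `resolveEngine`, and `defaultThreads` are hypothetical, and treating unknown values as `auto` is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// Engine is an illustrative type; the values are the ones listed on this page.
type Engine string

const (
	EngineAuto Engine = "auto"
	EngineONNX Engine = "onnx"
	EngineNone Engine = "none"
)

// parseEngine normalises a configured value. The legacy "pigo" value is
// aliased to ONNX. Falling back to "auto" for anything else is an
// assumption of this sketch.
func parseEngine(s string) Engine {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "onnx", "pigo":
		return EngineONNX
	case "none":
		return EngineNone
	default:
		return EngineAuto
	}
}

// resolveEngine turns "auto" into a concrete engine: ONNX when the bundled
// model is present, otherwise none (no legacy fallback).
func resolveEngine(e Engine, modelPresent bool) Engine {
	if e != EngineAuto {
		return e
	}
	if modelPresent {
		return EngineONNX
	}
	return EngineNone
}

// defaultThreads returns half the given CPU count, but never less than 1.
func defaultThreads(numCPU int) int {
	if n := numCPU / 2; n >= 1 {
		return n
	}
	return 1
}

func main() {
	fmt.Println(parseEngine("pigo"))
	fmt.Println(resolveEngine(parseEngine("auto"), false))
	fmt.Println(defaultThreads(runtime.NumCPU()))
}
```

Keeping parsing separate from resolution means the configured value can stay `auto` while the effective engine depends on whether the model file is available at start-up.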

Configuration

We recommend that only advanced users and developers change these parameters. All face-related environment variables and CLI flags are listed in Config Options › Face Recognition; this page only highlights the knobs most relevant to detector behaviour.

Detection Settings

Environment Variable	CLI Flag	Default	Description
PHOTOPRISM_FACE_ENGINE	--face-engine	auto	Detection engine (`auto`, `onnx`, `none`). Legacy `pigo` is accepted and aliased to `onnx`.
PHOTOPRISM_FACE_ENGINE_THREADS	--face-engine-threads	runtime.NumCPU()/2 (≥1)	Number of ONNX inference threads.
PHOTOPRISM_FACE_SIZE	--face-size	50	Minimum face size in `PIXELS` (20-10000).
PHOTOPRISM_FACE_SCORE	--face-score	9.0	Base quality threshold before scale-dependent offsets are added.
PHOTOPRISM_FACE_OVERLAP	--face-overlap	42	Maximum allowed IoU in `PERCENT` when deduplicating markers (preserved from legacy behaviour).
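
For reference, IoU (intersection over union) is the area shared by two boxes divided by the area of their union. The sketch below computes it for axis-aligned boxes and compares the result with the default of 42, which this example assumes is a percentage. The `Box` type and the `iou` and `overlaps` helpers are hypothetical and are not part of the PhotoPrism code base.

```go
package main

import (
	"fmt"
	"math"
)

// Box is an axis-aligned rectangle given by its top-left corner and size.
type Box struct{ X, Y, W, H float64 }

// iou returns the intersection-over-union of two boxes in the range 0..1.
func iou(a, b Box) float64 {
	w := math.Min(a.X+a.W, b.X+b.W) - math.Max(a.X, b.X)
	h := math.Min(a.Y+a.H, b.Y+b.H) - math.Max(a.Y, b.Y)
	if w <= 0 || h <= 0 {
		return 0 // the boxes do not intersect
	}
	inter := w * h
	union := a.W*a.H + b.W*b.H - inter
	return inter / union
}

// overlaps reports whether the IoU of two boxes, expressed as a percentage,
// exceeds the allowed maximum.
func overlaps(a, b Box, maxPercent float64) bool {
	return iou(a, b)*100 > maxPercent
}

func main() {
	a := Box{0, 0, 100, 100}
	b := Box{50, 0, 100, 100} // shares half of its width with a
	c := Box{20, 0, 100, 100} // shares most of its width with a

	fmt.Printf("a/b IoU %.2f, duplicate: %v\n", iou(a, b), overlaps(a, b, 42))
	fmt.Printf("a/c IoU %.2f, duplicate: %v\n", iou(a, c), overlaps(a, c, 42))
}
```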

Run scheduling is configured through the face model entry in vision.yml; FaceEngineRunType() simply forwards to vision.Config.RunType(ModelTypeFace). There is no separate FACE_ENGINE_RUN flag. See the package README for the full run-mode semantics.

Clustering Settings

After changing any of the clustering parameters, we strongly recommend running photoprism faces reset in a terminal to remove existing clusters and markers. Otherwise, inconsistencies may cause unexpected behaviour or errors.

Environment Variable	CLI Flag	Default	Description
PHOTOPRISM_FACE_CLUSTER_SIZE	--face-cluster-size	80	Minimum size of automatically clustered faces in `PIXELS` (20-10000).
PHOTOPRISM_FACE_CLUSTER_SCORE	--face-cluster-score	15	Minimum `QUALITY` score of automatically clustered faces (1-100).
PHOTOPRISM_FACE_CLUSTER_CORE	--face-cluster-core	4	`NUMBER` of faces forming a cluster core (1-100).
PHOTOPRISM_FACE_CLUSTER_DIST	--face-cluster-dist	0.64	Similarity `DISTANCE` of faces forming a cluster core (0.1-1.5).
PHOTOPRISM_FACE_MATCH_DIST	--face-match-dist	0.46	Similarity `OFFSET` for matching faces with existing clusters (0.1-1.5).

Additional merge-tuning knobs (PHOTOPRISM_FACE_MERGE_MAX_RETRY, PHOTOPRISM_FACE_CLUSTER_RADIUS, PHOTOPRISM_FACE_COLLISION_DIST, PHOTOPRISM_FACE_EPSILON_DIST, PHOTOPRISM_FACE_SKIP_CHILDREN, PHOTOPRISM_FACE_ALLOW_BACKGROUND) are documented in Config Options › Face Recognition and in the package README.

Tuning Tips

A reasonable range for the similarity distance between face embeddings is 0.60 to 0.70; higher values are more aggressive and produce larger clusters with more false positives.
To form clusters from fewer faces, reduce the cluster core (PHOTOPRISM_FACE_CLUSTER_CORE) to 3 or 2 similar faces.
Use FACE_ENGINE=auto to let PhotoPrism decide whether detection is possible with the bundled model.
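
To illustrate how the core size and the distance threshold interact, the sketch below counts the neighbours of each point within the threshold and checks whether that count reaches the core size. It covers only the core-point test from DBSCAN, not a full clustering run, and it is not PhotoPrism's implementation. The function names are hypothetical, and counting the point itself towards the core size is an assumption of this example.

```go
package main

import (
	"fmt"
	"math"
)

// dist returns the Euclidean distance between two vectors of equal length.
func dist(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// corePoints marks every point that has at least `core` points (itself
// included) within distance `eps`.
func corePoints(points [][]float64, eps float64, core int) []bool {
	result := make([]bool, len(points))
	for i, p := range points {
		n := 0
		for _, q := range points {
			if dist(p, q) <= eps {
				n++
			}
		}
		result[i] = n >= core
	}
	return result
}

func main() {
	// 2-D toy points stand in for 512-dimensional embeddings.
	points := [][]float64{{0, 0}, {0.1, 0}, {0, 0.1}, {0.1, 0.1}, {2, 2}}

	fmt.Println(corePoints(points, 0.64, 4)) // four close points form a core
	fmt.Println(corePoints(points, 0.64, 5)) // a core of 5 is never reached
}
```

Lowering the core size lets smaller groups qualify, while raising the distance lets more distant faces count as neighbours, which is why both changes tend to increase false positives.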

Face Embeddings

After detection, PhotoPrism uses TensorFlow to run FaceNet and generate a 512-dimensional embedding for each face. The embeddings are used to:

Match faces across different photos.
Cluster similar faces using the DBSCAN algorithm.
Assign faces to people after manual confirmation.

Normalisation

All embeddings are L2-normalised to unit length (‖x‖₂ = 1) at the following points:

Creation time, after TensorFlow inference (NewEmbedding).
Midpoint calculation when merging clusters (EmbeddingsMidpoint).
Deserialisation when loading from persisted JSON (UnmarshalEmbedding / UnmarshalEmbeddings).
Audit time, when photoprism faces audit --fix re-normalises historical embeddings and re-links markers.

With unit vectors, Euclidean distance is a rank-equivalent substitute for cosine similarity, so all thresholds on this page are expressed in the Euclidean domain.
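
The equivalence follows from the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2·a·b, which reduces to 2 − 2·cos(a, b) when both vectors have unit length. A smaller Euclidean distance therefore always means a higher cosine similarity. The sketch below checks this numerically; the helper names are hypothetical and do not mirror NewEmbedding or Embedding.Dist.

```go
package main

import (
	"fmt"
	"math"
)

// normalize returns a copy of v scaled to unit L2 length.
// A zero vector is returned unchanged (all zeros).
func normalize(v []float64) []float64 {
	var sum float64
	for _, x := range v {
		sum += x * x
	}
	out := make([]float64, len(v))
	norm := math.Sqrt(sum)
	if norm == 0 {
		return out
	}
	for i, x := range v {
		out[i] = x / norm
	}
	return out
}

// dot returns the dot product, which equals the cosine similarity
// when both inputs are unit vectors of equal length.
func dot(a, b []float64) float64 {
	var sum float64
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// euclidean returns the Euclidean distance between two vectors of equal length.
func euclidean(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

func main() {
	a := normalize([]float64{3, 4, 0})
	b := normalize([]float64{1, 2, 2})

	d := euclidean(a, b)
	cos := dot(a, b)

	// Both values agree up to floating-point rounding.
	fmt.Printf("d^2 = %.6f, 2 - 2cos = %.6f\n", d*d, 2-2*cos)
}
```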

Tensor Memory

FaceNet embeddings are generated through TensorFlow bindings that allocate tensors in C memory; those allocations are only released by Go GC finalisers. To keep memory bounded during extended indexing runs, PhotoPrism periodically forces garbage collection and returns freed C buffers to the OS. Tune with PHOTOPRISM_TF_GC_EVERY (default 200; 0 disables).

Commands

Learn more about CLI commands ›

Performance Notes

The October 2025 optimisations (still current) improved the hot paths on the embedding side:

Benchmark	Before	After
`Embedding.Dist`	~242 ns/op	~155 ns/op
`EmbeddingsMidpoint`	~194 µs/op, 528 KB/op	~99 µs/op, 4 KB/op
Face matching (1024 faces)	1024 distance checks	~16 candidates evaluated (≈0.55 ms/op)
Cluster materialisation	~29.8 µs/op, 384 allocs	~14.8 µs/op, 64 allocs

Re-run BenchmarkEmbeddingDist and BenchmarkEmbeddingsMidpoint after any detector or embedding adjustment to catch regressions early.