Two weeks ago I relaunched finescenery.com on a new home. Same photography, new foundation.
This is the follow-up I promised: how the site is actually built, end to end, and why a pipeline made more sense than a CMS.
The mess that started it
After 20 years of shooting, the archive looks like every long-running personal archive: 117,166 files spread across half a dozen drives.
Some are RAW originals from a D700 in 2008. Some are NEFs and ARWs from this year. Some are Lightroom exports that got dragged off to a hard drive five laptops ago. The catalog has rough star ratings (1 to 5) and ~26,500 keyword assignments.
The previous portfolio site had hand-picked images and hand-typed captions, which meant it never grew past a few hundred pieces and updating it was a chore.
I wanted a site that:
- pulls directly from the Lightroom catalog, so curating in Lightroom curates the site
- generates titles, alt-text, and descriptions automatically, with quality good enough to ship
- groups photos by where and what (country, region, place, scene, mood) without me hand-tagging each one
- stays fast: static pages, no CMS, no database calls at request time
That’s what PhotoMind became. It’s a Python project that owns one SQLite database (~268 MB) and writes content into the Astro repo for finescenery.com.
How it works
Three layers, roughly.
Indexing
Scans every photo on disk. Extracts EXIF via ExifTool. Parses Lightroom XMP sidecars. Computes CLIP embeddings via Hugging Face Transformers on PyTorch. Links each file 1:1 to the Lightroom catalog.
The output is the SQLite "ground truth" of what exists, where it lives, and how Lightroom rates it.
Site export
Selects the portfolio (currently every Lightroom-rendered JPEG with rating 2+). Deduplicates burst frames. Maps every piece to a canonical URL. Runs an agentic title-generation pass for variety. Encodes responsive AVIF + WebP at five sizes via Pillow and pillow-avif-plugin. Writes Astro markdown.
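The final "writes Astro markdown" step is the simplest of the bunch. A sketch of what one piece file might look like; the frontmatter fields here are illustrative, not PhotoMind's real schema:

```python
def astro_markdown(piece: dict) -> str:
    """Render one piece as an Astro content-collection markdown file.
    Field names are hypothetical; the real pipeline's schema may differ."""
    lines = [
        "---",
        f'title: "{piece["title"]}"',
        f'slug: "{piece["slug"]}"',
        f'description: "{piece["description"]}"',
        f"tags: [{', '.join(repr(t) for t in piece['tags'])}]",
        "---",
        piece["body"],
    ]
    return "\n".join(lines) + "\n"

md = astro_markdown({
    "title": "Fog Lifting off the Ridge",
    "slug": "fog-lifting-off-the-ridge",
    "description": "Morning fog clears a forested ridge.",
    "tags": ["mood/quiet", "feature/fog"],
    "body": "Morning fog clears a forested ridge as first light reaches the valley.",
})
```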
Astro builds it. Cloudflare Pages serves it globally.
The interesting parts
A few of the steps are unusual enough to be worth calling out.
Lightroom is the source of truth
Every “winner” image on the site starts as a JPEG that I’ve personally exported from Lightroom into portfolio-1-star/, which holds everything rated 1+; the export step then keeps the 2+ subset that actually ships. PhotoMind links each of those JPEGs back to its source RAW via basename and EXIF matching.
Curating in Lightroom is curating the site. There is no double-bookkeeping between catalog and site, and no admin panel to maintain.
Captions are written by an AI bucket-batcher
Letting a vision LLM generate captions one by one produces formulaic slop. Every other title becomes “Golden Light Over the Ocean” or some near-variant.
Instead, photos are grouped into buckets of up to ten by (narrative, place, subregion, region, country, year-month). Claude Sonnet sees the whole bucket at once, with a prompt that bans formulaic lead words (“Golden”, “Sunlit”, “Amber”, “Storm”, and friends) and demands variety.
Output goes into a site_metadata table keyed by (piece_id, prompt_version), which makes re-runs idempotent. Changing the prompt creates fresh rows instead of clobbering old ones.
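The idempotency falls out of the composite key. A minimal sketch of the pattern (column names beyond the key are hypothetical):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE site_metadata (
    piece_id TEXT NOT NULL,
    prompt_version INTEGER NOT NULL,
    title TEXT,
    PRIMARY KEY (piece_id, prompt_version)
)""")

def save_title(piece_id: str, prompt_version: int, title: str) -> None:
    """Re-running the same prompt version overwrites its own row;
    bumping prompt_version writes a fresh row and leaves old ones intact."""
    db.execute(
        """INSERT INTO site_metadata (piece_id, prompt_version, title)
           VALUES (?, ?, ?)
           ON CONFLICT(piece_id, prompt_version) DO UPDATE SET title = excluded.title""",
        (piece_id, prompt_version, title),
    )

save_title("p1", 1, "Fog Lifting off the Ridge")
save_title("p1", 1, "Fog Lifting off the Ridge")   # re-run: same row, no duplicate
save_title("p1", 2, "Ridge in Morning Fog")        # new prompt version: new row
```

Rolling back a bad prompt is then just selecting the previous `prompt_version`.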
The descriptions come in two tiers in the same pass:
- a short one (12 to 22 words) for the <meta name="description"> tag and RSS
- a longer one (25 to 42 words) for the body of the piece page
Saves a second prompt and keeps the two consistent.
URLs are generated from a hand-curated gazetteer
Lightroom keywords are messy: “Lazovsky”, “Lazovsky Reserve”, “lazovsky reserve”, “Lazo”.
A hand-written gazetteer resolves them all to canonical (kind, slug) tags at every hierarchy tier (country, region, subregion, place). A tokenizer rolls them up into the breadcrumb. The geographic tree builds itself from that.
Tail countries with under 10 pieces collapse into /world/. Sub-galleries with under five pieces collapse into the parent.
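The resolution step itself is a lookup table plus normalisation. A toy fragment of the idea, with a made-up gazetteer entry set:

```python
# Illustrative gazetteer fragment: messy Lightroom keywords → canonical (kind, slug).
GAZETTEER = {
    "lazovsky": ("place", "lazovsky-reserve"),
    "lazovsky reserve": ("place", "lazovsky-reserve"),
    "lazo": ("place", "lazovsky-reserve"),
    "primorye": ("region", "primorye"),
    "russia": ("country", "russia"),
}

def canonical_tags(keywords: list[str]) -> set[tuple[str, str]]:
    """Normalise and resolve raw keywords. Unknown keywords are dropped,
    so the URL tree only ever contains slugs the gazetteer vouches for."""
    return {
        GAZETTEER[k]
        for k in (kw.strip().lower() for kw in keywords)
        if k in GAZETTEER
    }

tags = canonical_tags(["Lazovsky Reserve", "lazo", "Primorye", "Russia", "misc"])
```

Four messy spellings collapse to one place slug, and "misc" never reaches a URL.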
Themes are a parallel axis
Three taxonomies sit alongside the geographic tree: feature, composition, mood. Hand-curated lists (30 features, 13 compositions, 19 moods) drawn from raw vision-tag frequencies across the corpus.
Same tokenizer pattern: feed it the raw vision output, get back the canonical site tags. That’s how /mood/golden-hour/ and /feature/storm-clouds/ exist without me ever sitting down to tag photos by hand.
Search is CLIP, client-side
Every piece has a 512-dim CLIP embedding. When you type “misty forest” or “sea stacks at dawn” into the search box, the query is encoded in the browser via a small precomputed text-encoder and ranked by cosine similarity over those vectors.
There’s no server doing the matching, just static vector files and a few hundred lines of browser JavaScript. The same CLIP model that powered the original gallery-tag propagation now powers search at zero runtime cost.
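The ranking itself is nothing more than cosine similarity over the stored vectors. The browser code is JavaScript, but the same math in Python, with toy 3-dim vectors standing in for the real 512-dim CLIP embeddings:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec: list[float], pieces: dict[str, list[float]]) -> list[str]:
    """Return piece ids best-first by similarity to the encoded query."""
    return sorted(pieces, key=lambda pid: cosine(query_vec, pieces[pid]), reverse=True)

pieces = {
    "misty-forest": [0.9, 0.1, 0.0],
    "sea-stacks":   [0.0, 0.2, 0.9],
}
best = rank([1.0, 0.0, 0.0], pieces)  # query vector close to "misty-forest"
```

In the real browser version the only extra work is encoding the query text with the precomputed text encoder before this ranking runs.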
What I learned
Four things I didn’t expect.
AI captioning works, but bucket-batching is the difference. Per-piece prompts produce templated copy. Per-bucket prompts with explicit variety constraints produce something I’d ship. The cost is the same.
Local vision LLMs already understand your photos. You just need to ask nicely. Qwen-VL running on my own GPU described 5,500+ images before I called a single hosted API. A clear prompt and a downscaled image are most of the work. Hosted models still win on style and variety; the local ones do the heavy lifting for free.
Lightroom-as-source-of-truth is liberating. Once it’s set up, I just rate pictures in Lightroom and the site catches up on the next export run. Zero copy-paste between catalog and site.
Static is faster than everything else. The whole site is plain HTML, CSS, and AVIF. Cloudflare serves it for free. There’s no admin panel because there’s nothing to admin.
The pipeline is a personal tool: one photographer, one archive, one site. But the patterns generalise. If you’re sitting on years of RAW files and want to put them on the web without losing your weekends, tell me. I’m always interested in how other people solve this problem.
Standing on the shoulders of
None of this would exist without the open-source projects below. Thank you to the maintainers; these are the tools doing the heavy lifting.