Plan: Sort Antique Rifle Photos and Extract Serial Numbers

A script that takes a folder of ~100–1,000 rifle photos shot in an assembly-line session and produces a CSV/JSON manifest mapping each image to a rifle group and an extracted serial number.

The shoot has four signals that simplify the problem:

1. Each rifle’s photos are contiguous in capture order.
2. Each group typically starts with a “hero” (full-rifle) shot.
3. There is a ~10–20 second EXIF timestamp gap between groups, vs. ~2–5 seconds within a group.
4. Background is consistent across rifles (removes “scene change” as a signal but simplifies LLM reads).

Because of signal 3 (the timestamp gap), a timestamp-gap split does the bulk of the grouping work; visual similarity and embeddings are not needed. A vision LLM is used only for serial number OCR and for verifying ambiguous group boundaries.

Pipeline

Phase 1 — Ingest and order

Walk the input folder; collect all image files (jpg/jpeg/heic/png/tiff).
Read EXIF DateTimeOriginal (fall back to DateTime, then file mtime) for each image.
Sort images by capture timestamp ascending. Record the sorted order as the canonical sequence.
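
A minimal sketch of the ingest step, using only the standard library. The EXIF reader is passed in as a callable (read_exif here is a hypothetical stand-in for whatever exifread or Pillow wrapper ends up in rifles/ingest.py), and the function names are illustrative. Accepting ".tif" alongside ".tiff" and breaking timestamp ties by file name are assumptions, not part of the plan.

```python
from datetime import datetime
from pathlib import Path

# Extensions from the plan; ".tif" is an added assumption.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".heic", ".png", ".tif", ".tiff"}
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"  # standard EXIF date-time layout


def find_images(root: Path) -> list[Path]:
    """Recursively collect image files, matching extensions case-insensitively."""
    return [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    ]


def capture_time(path: Path, exif: dict) -> tuple[datetime, str]:
    """Return (timestamp, source): DateTimeOriginal, then DateTime, then mtime."""
    for tag in ("DateTimeOriginal", "DateTime"):
        raw = str(exif.get(tag, "")).strip()
        if raw:
            try:
                return datetime.strptime(raw, EXIF_FORMAT), tag
            except ValueError:
                pass  # malformed value; try the next source
    return datetime.fromtimestamp(path.stat().st_mtime), "mtime"


def ordered_images(root: Path, read_exif) -> list[tuple[Path, datetime, str]]:
    """Canonical sequence: (path, timestamp, source) sorted by capture time.

    read_exif is a hypothetical callable: Path -> dict of EXIF tag name to string.
    """
    rows = [(p, *capture_time(p, read_exif(p))) for p in find_images(root)]
    # File name is the tie-breaker so equal timestamps still sort deterministically.
    return sorted(rows, key=lambda row: (row[1], row[0].name))
```

Recording the timestamp source per image makes it easy to spot files that fell back to mtime, which is the case most likely to break the gap heuristic.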

Phase 2 — Group by timestamp gap

Compute the delta in seconds between each consecutive pair of images.
Split into groups wherever the delta exceeds a threshold (default 8s; make it a CLI flag).
Assign each group a stable ID (rifle_001, rifle_002, …).
Emit a preliminary manifest (rifle_id → list of image paths, with per-image timestamps and intra-group deltas) for inspection before any API spend.
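
The grouping step can be sketched as below. The function names and the manifest field names (path, timestamp, delta_seconds) are illustrative assumptions; the only behavior taken from the plan is that a new group starts whenever the delta between consecutive images exceeds the threshold.

```python
from datetime import datetime


def split_by_gap(timestamps: list[datetime], gap_seconds: float = 8.0) -> list[list[int]]:
    """Group indices of time-sorted timestamps.

    A new group starts whenever the delta to the previous image exceeds gap_seconds.
    """
    groups: list[list[int]] = []
    for i, ts in enumerate(timestamps):
        if i == 0 or (ts - timestamps[i - 1]).total_seconds() > gap_seconds:
            groups.append([])
        groups[-1].append(i)
    return groups


def preliminary_manifest(paths: list[str], timestamps: list[datetime],
                         gap_seconds: float = 8.0) -> dict[str, list[dict]]:
    """Map rifle_001, rifle_002, ... to per-image records with intra-group deltas."""
    manifest: dict[str, list[dict]] = {}
    for n, indices in enumerate(split_by_gap(timestamps, gap_seconds), start=1):
        manifest[f"rifle_{n:03d}"] = [
            {
                "path": paths[i],
                "timestamp": timestamps[i].isoformat(),
                # The first image of a group has no intra-group delta.
                "delta_seconds": None if i == indices[0]
                else (timestamps[i] - timestamps[i - 1]).total_seconds(),
            }
            for i in indices
        ]
    return manifest
```

For example, images captured at 0, 3, 6, 20, 23 and 40 seconds split into three groups of 3, 2 and 1 images with the default 8-second threshold.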

Phase 3 — Hero-shot verification (optional but recommended)

For each group, ask a vision LLM whether the first image is a hero shot (whole rifle visible, side-on, fills most of frame).
If the first image is not a hero shot, flag the group as needs_review in the manifest. Common causes:
- User shot a close-up of the next rifle before its hero shot.
- Gap threshold split a single rifle in two (no hero at the “new” group’s start).
Do not auto-merge or auto-split at this stage — just flag. Manual review is faster than chasing edge cases programmatically.
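
The flag-only behavior might look like the sketch below. The vision call is abstracted as is_hero_shot, a hypothetical callable that takes an image path and returns True or False; the review_reason wording is also an assumption.

```python
from typing import Callable


def flag_non_hero_groups(
    groups: dict[str, list[str]],
    is_hero_shot: Callable[[str], bool],
) -> dict[str, dict]:
    """Check only the first image of each group; flag, never merge or split."""
    review: dict[str, dict] = {}
    for rifle_id, images in groups.items():
        hero = bool(images) and is_hero_shot(images[0])
        review[rifle_id] = {
            "needs_review": not hero,
            "review_reason": "" if hero else "first image is not a hero shot",
        }
    return review
```

Only one image per group goes to the model here, so the cost of this phase scales with the number of rifles rather than the number of photos.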

Phase 4 — Serial number extraction

For each group, send all images in a single vision LLM call. Prompt asks for structured JSON:
- serial_numbers: list of all stamped/engraved numbers visible across images (receiver, barrel, stock cartouche — antiques often have several).
- maker / model / caliber / markings: best-effort identification.
- notes: free-text observations.
- confidence: low / medium / high per serial number.
Use the API’s JSON-mode / response schema feature so output is parseable.
Cache responses keyed by group content hash so re-runs are free.
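
One way to implement the cache, as a sketch: hash the bytes of every image in the group and store the parsed response as a JSON file named after the hash. Folding the prompt and model name into the key is an assumption beyond the plan (it forces a fresh call when either changes), and call_llm is a hypothetical stand-in for the provider adapter in rifles/llm.py.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def group_hash(image_paths: list[Path], prompt: str, model: str) -> str:
    """Content hash of a group: model, prompt, then each image's bytes in order."""
    h = hashlib.sha256()
    h.update(model.encode("utf-8"))
    h.update(b"\0")
    h.update(prompt.encode("utf-8"))
    for path in image_paths:
        h.update(b"\0")
        h.update(hashlib.sha256(path.read_bytes()).digest())
    return h.hexdigest()


def cached_extract(
    image_paths: list[Path],
    prompt: str,
    model: str,
    cache_dir: Path,
    call_llm: Callable[[list[Path], str, str], dict],
) -> dict:
    """Return the cached response if present; otherwise call the model and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{group_hash(image_paths, prompt, model)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))
    result = call_llm(image_paths, prompt, model)
    entry.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return result
```

Because the key depends on file contents rather than paths, renaming or moving the photo folder does not invalidate the cache.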

Phase 5 — Output

Write manifest.csv and manifest.json with columns: rifle_id, image_count, image_files, first_timestamp, last_timestamp, serial_numbers, maker, model, caliber, markings, notes, confidence, needs_review, review_reason.
Write a simple contact_sheet.html showing each group as a row of thumbnails with the extracted data — for fast human review and correction.
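
A sketch of the CSV and JSON writers using only the standard library. The column list is taken from the plan; joining list-valued fields with "; " in the CSV and leaving missing fields blank are assumptions. The HTML contact sheet is not covered here.

```python
import csv
import json
from pathlib import Path

COLUMNS = [
    "rifle_id", "image_count", "image_files", "first_timestamp", "last_timestamp",
    "serial_numbers", "maker", "model", "caliber", "markings", "notes",
    "confidence", "needs_review", "review_reason",
]


def write_manifests(rows: list[dict], out_dir: Path) -> None:
    """Write manifest.json (lists preserved) and manifest.csv (lists flattened)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "manifest.json").write_text(
        json.dumps(rows, indent=2, default=str), encoding="utf-8"
    )
    with open(out_dir / "manifest.csv", "w", newline="", encoding="utf-8") as f:
        # Unknown keys are ignored; missing keys are written as empty cells.
        writer = csv.DictWriter(f, fieldnames=COLUMNS, extrasaction="ignore", restval="")
        writer.writeheader()
        for row in rows:
            flat = {
                key: "; ".join(map(str, value)) if isinstance(value, list) else value
                for key, value in row.items()
            }
            writer.writerow(flat)
```

Serial numbers are best kept as strings end to end, since spreadsheet tools tend to strip leading zeros from cells that look numeric.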

Tooling decisions

Language: Python 3.11+. Mature EXIF / image libs and easy LLM SDKs.
EXIF: exifread or Pillow + pillow-heif for HEIC support.
Vision LLM: Gemini 2.5 Pro or GPT-5 — both handle many images per call and engraved-metal OCR well. Pick whichever you already have an API key for. Estimated total cost at ~50 rifles: under $5.
No embeddings / no CLIP / no Tesseract — the timestamp signal makes them unnecessary at this volume. Revisit only if the gap heuristic fails.

CLI shape

rifles-sort \
  --input ./photos \
  --output ./out \
  --gap-seconds 8 \
  --provider gemini \
  --model gemini-2.5-pro

# Optional phase selectors (append to the command above):
#   --skip-llm          phases 1+2 only, for a cheap dry run
#   --verify-heroes     phase 3
#   --extract-serials   phase 4

Default behavior: run all phases.
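
A possible argparse layout for rifles/cli.py, as a sketch. The flag names come from the CLI shape above; the defaults for --provider and --model, the "openai" choice name, and the way the phase flags combine are assumptions. Here --skip-llm limits the run to phases 1 and 2, and giving neither --verify-heroes nor --extract-serials runs both LLM phases.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="rifles-sort")
    parser.add_argument("--input", required=True, help="folder of photos")
    parser.add_argument("--output", required=True, help="folder for manifests")
    parser.add_argument("--gap-seconds", type=float, default=8.0)
    parser.add_argument("--provider", choices=["gemini", "openai"], default="gemini")
    parser.add_argument("--model", default="gemini-2.5-pro")
    parser.add_argument("--skip-llm", action="store_true",
                        help="phases 1+2 only, for a cheap dry run")
    parser.add_argument("--verify-heroes", action="store_true", help="run phase 3")
    parser.add_argument("--extract-serials", action="store_true", help="run phase 4")
    return parser


def phases_to_run(args: argparse.Namespace) -> set[int]:
    """Resolve flags to phase numbers; with no selectors, all phases run."""
    if args.skip_llm:
        return {1, 2}  # assumption: --skip-llm wins over the phase selectors
    selected: set[int] = set()
    if args.verify_heroes:
        selected.add(3)
    if args.extract_serials:
        selected.add(4)
    return {1, 2, 5} | (selected or {3, 4})
```

With this layout, `rifles-sort --input ./photos --output ./out` resolves to phases 1 through 5, and adding `--extract-serials` alone skips the hero check.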

Files to create

rifles/__init__.py
rifles/cli.py — argparse entry point.
rifles/ingest.py — file walk + EXIF read + sort.
rifles/group.py — gap-based grouping.
rifles/llm.py — provider-agnostic vision call (Gemini + OpenAI adapters).
rifles/verify.py — hero-shot check.
rifles/extract.py — serial extraction prompt + JSON parse.
rifles/output.py — CSV/JSON/HTML writers.
rifles/cache.py — disk cache keyed by content hash.
pyproject.toml — deps: pillow, pillow-heif, exifread, google-generativeai or openai, jinja2 (contact sheet), rich (logs).

Verification

Dry run on a small sample (10–20 images, 2–3 known rifles):
- Confirm timestamp grouping matches reality.
- Confirm at least one serial number is read correctly per group.
Full run: spot-check 10% of groups in the contact sheet.
Edge cases to deliberately test:
- Group whose first image is a close-up (should flag needs_review).
- A rifle with multiple visible serials.
- A worn/illegible serial (should produce low confidence, not a guess).

Decisions and scope

In scope: grouping, serial extraction, manifest output, contact sheet.
Out of scope (for v1):
- Visual-similarity-based grouping (embeddings/CLIP).
- Auto-merging / auto-splitting flagged groups.
- Web UI for review (HTML contact sheet only).
- Rifle identification beyond what the vision model returns from markings.
Assumptions:
- EXIF timestamps are present and accurate.
- User has an API key for at least one major vision LLM provider.
- All images are from a single shoot session (no need to handle multi-session merging).

Further considerations

Gap threshold tuning: 8s is a starting guess that sits between the ~2–5s intra-group deltas and the ~10–20s inter-group gaps. The dry-run output (phases 1+2) should make it easy to see actual intra- vs. inter-group deltas and adjust. Option: auto-detect the threshold by finding the split between the two modes in the observed deltas.
HEIC handling: iPhone shots are often HEIC. pillow-heif handles this, but some LLM APIs prefer JPEG — may need a transcode step before upload.
Cost guardrail: add a --max-images flag so that accidentally pointing the script at a huge folder doesn’t burn API credits.
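
The auto-detect option under gap threshold tuning could be sketched as follows. This uses a simple largest-jump heuristic, which is a stand-in and not a proper bimodal fit: sort the observed deltas, find the widest empty interval between neighboring values, and put the threshold at its midpoint. The function name is hypothetical.

```python
from __future__ import annotations


def suggest_gap_threshold(deltas: list[float]) -> float | None:
    """Suggest a threshold at the midpoint of the widest gap in the sorted deltas.

    Returns None when there are too few deltas or they are all identical.
    """
    ordered = sorted(deltas)
    if len(ordered) < 2:
        return None
    # (width, low, high) for each pair of neighboring values.
    jumps = [(high - low, low, high) for low, high in zip(ordered, ordered[1:])]
    width, low, high = max(jumps)
    if width <= 0:
        return None
    return (low + high) / 2
```

For deltas of 2, 3, 4, 5, 3, 12, 15, 2, 4 and 18 seconds, the widest empty interval is between 5 and 12, so the suggestion is 8.5. A single very long pause (for example, a break mid-shoot) would dominate this heuristic, so the result is better treated as a hint to print in the dry run than as an automatic setting.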
