# SEARCH COORDINATION NOTES

- `SearchIndex` always targets `<collection>/.dorgy/chroma` and keeps a peer manifest at `<collection>/.dorgy/search.json`; never write to global `~/.dorgy` locations so collections stay portable.
- Use `SearchEntry.from_record` (or equivalent helpers) to ensure `FileRecord.document_id`, tags, categories, needs-review flags, and timestamps are normalized consistently before talking to ChromaDB. Text payloads must be sanitized via `normalize_search_text`.
- `descriptor_document_text` prioritizes descriptor previews and falls back to vision captions/labels; reuse it so org/watch/mv flows index consistent text regardless of asset type.
- `SearchIndex` acquires a threading lock around ChromaDB operations; reuse one instance across org/watch/mv/search flows instead of instantiating multiple clients per command.
- Lifecycle helpers (`ensure_index`, `update_entries`, `delete_entries`, `drop_index`) centralize store initialization, manifest writing, and state metadata updates. Call them instead of invoking `SearchIndex` internals directly so timestamps/flags remain accurate.
- Inject custom ChromaDB clients via the constructor’s `client_factory` in tests to avoid touching real persistence; unit tests must cover manifest updates for both upsert and delete paths.
- Configuration defaults live under `config.search.*` (default limit, auto-enable toggles, optional embedding function). Keep CLI flags and docs aligned with these settings. Treat `cli.search_default_limit` as deprecated: read it only when the new `config.search` block is unset.
- When adding lifecycle commands (`--with-search`, `--init-store`, `--drop-store`, etc.), update SPEC/README/ARCH plus this file so automation consumers know how to initialize or audit the index.
- `dorgy search` requires an active ChromaDB index; the CLI reports an error when a collection has not been initialized via `dorgy search --init-store` or `dorgy org` (which now auto-enables search). Substring lookups go through `SearchIndex.contains`, and semantic lookups go through `SearchIndex.query` (embedding-backed `collection.query`). `--init-store` rebuilds the store, `--reindex` force-refreshes the index, and `--drop-store` tears it down. JSON and table outputs depend on the stored `document_id` and snippet payloads and on similarity scores (including distances and spaces), so keep those fields backward compatible when evolving the schema.
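
The deprecation rule for `cli.search_default_limit` noted above can be sketched as a small resolver. This is only an illustration, not dorgy's actual config loader: the plain-dict config shape, the `resolve_search_limit` helper, the `default_limit` key name, and the fallback value of 10 are all assumptions made for the example.

```python
from typing import Any, Mapping

FALLBACK_LIMIT = 10  # hypothetical built-in default, not taken from dorgy


def resolve_search_limit(config: Mapping[str, Any]) -> int:
    """Prefer the search block; read the legacy key only when that block is unset."""
    search_block = config.get("search")
    if search_block:
        # The new block is set, so the legacy key is ignored entirely,
        # even if the block does not define a limit itself.
        return int(search_block.get("default_limit", FALLBACK_LIMIT))
    # A missing or empty search block counts as "unset" in this sketch.
    legacy = config.get("cli", {}).get("search_default_limit")
    if legacy is not None:
        return int(legacy)
    return FALLBACK_LIMIT
```

Keeping the fallback in one helper means CLI flags and docs only need to describe a single precedence order: the search block first, then the legacy key, then the built-in default.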

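The lock and `client_factory` notes above could look roughly like the following in a unit test. This is a sketch under stated assumptions: `IndexSketch`, `FakeClient`, `FakeCollection`, and the collection name `"dorgy"` are hypothetical stand-ins, not the real `SearchIndex` API, and manifest handling is left out. The fakes only mimic the shape of the client and collection calls used here.

```python
import threading
from typing import Any, Callable


class FakeCollection:
    """In-memory stand-in exposing only the collection methods used below."""

    def __init__(self) -> None:
        self.docs: dict[str, str] = {}

    def upsert(self, ids: list[str], documents: list[str]) -> None:
        self.docs.update(zip(ids, documents))

    def delete(self, ids: list[str]) -> None:
        for doc_id in ids:
            self.docs.pop(doc_id, None)


class FakeClient:
    """Stand-in client that never touches disk."""

    def __init__(self) -> None:
        self.collections: dict[str, FakeCollection] = {}

    def get_or_create_collection(self, name: str) -> FakeCollection:
        return self.collections.setdefault(name, FakeCollection())


class IndexSketch:
    """Hypothetical stand-in for SearchIndex: one client, one lock, reused across flows."""

    def __init__(self, client_factory: Callable[[], Any]) -> None:
        self._client = client_factory()  # built once per instance
        self._lock = threading.Lock()

    def upsert(self, entries: dict[str, str]) -> None:
        with self._lock:  # serialize access to the shared client
            collection = self._client.get_or_create_collection("dorgy")
            collection.upsert(ids=list(entries), documents=list(entries.values()))

    def delete(self, ids: list[str]) -> None:
        with self._lock:
            self._client.get_or_create_collection("dorgy").delete(ids=ids)


# Usage: inject the fake so the test never creates a real store.
fake = FakeClient()
index = IndexSketch(client_factory=lambda: fake)
index.upsert({"a": "alpha", "b": "beta"})
index.delete(["a"])
```

Because the factory is called once in the constructor, sharing one instance across flows also shares one client and one lock, which is the behaviour the notes ask callers to preserve.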