2026-05-22 22:32:18 +02:00
43 changed files with 1162 additions and 1743 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -36,7 +36,7 @@ jobs:
            uv.lock

      - name: Install dependencies
-        run: uv sync --all-extras --all-groups
+        run: just install-dev

      - name: Run formatting, lint, and type checks
        run: just check
@@ -73,7 +73,7 @@ jobs:
            uv.lock

      - name: Install dependencies
-        run: uv sync --all-extras --all-groups
+        run: just install-dev

      - name: Run test suite
        run: just test
--- a/.github/workflows/docs-pages.yml
+++ b/.github/workflows/docs-pages.yml
@@ -42,7 +42,7 @@ jobs:
            uv.lock

      - name: Install dependencies
-        run: uv sync --all-extras --all-groups
+        run: just install-dev

      - name: Build docs
        run: just check-docs
--- a/README.md
+++ b/README.md
@@ -6,19 +6,14 @@ Code for detecting and classifying bat echolocation calls in high-frequency
 audio recordings.

 > [!WARNING]
-> `batdetect2` 2.0.0b1 is out.
-> This is a beta release and we are gathering user feedback.
-> If you run into issues or have feedback on the new workflows, please use the
-> GitHub issues page to let us know.
->
+> `batdetect2` 2.0.1 is out.
 > There are many changes and new recommended workflows.
 > We have left the previous `batdetect2.api` module intact, but if you run
 > into issues or want to upgrade, see the
 > [migration guide](docs/source/legacy/migration-guide.md) in the docs site.
 >
 > This update also ships with a refreshed default model.
-> It was trained in the same way and on the same data as before, but you should
-> still expect small output differences in some cases.
+> It was trained in the same way and on the same data as before, but you should still expect small output differences in some cases.

 ## What is BatDetect2

@@ -36,10 +31,6 @@ You can use the tool from the command line (terminal) or from Python as needed.

 We have [extensive documentation](docs/source/index.md) on how to use
 `batdetect2`.
-
-The docs site is still being built and will be live soon.
-If you want a quick peek for now, see the `docs/` folder in this repository.
-
 See our [getting started](docs/source/getting_started.md) guide and then jump
 into any of our tutorials:

@@ -144,7 +135,7 @@ which you can find
 ```
@article{batdetect2_2022,
    title     = {Towards a General Approach for Bat Echolocation Detection and Classification},
-    author    = {Mac Aodha, Oisin and  Mart\'{i}nez Balvanera, Santiago and  Damstra, Elise and  Cooke, Martyn and  Eichinski, Philip and  Browning, Ella and  Barataud, Michel and  Boughey, Katherine and  Coles, Roger and  Giacomini, Giada and MacSwiney G., M. Cristina and  K. Obrist, Martin and Parsons, Stuart and  Sattler, Thomas and  Jones, Kate E.},
+    author    = {Mac Aodha, Oisin and  Mart\'{i}nez Balvanera, Santiago and  Damstra, Elise and  Cooke, Martyn and  Eichinski, Philip and  Browning, Ella and  Barataud, Michel and  Boughey, Katherine and  Coles, Roger and  Giacomini, Giada and MacSwiney G., M. Cristina and  K. Obrist, Martin and Parsons, Stuart and  Sattler, Thomas and  Jones, Kate E.},
    journal   = {bioRxiv},
    year      = {2022}
 }
--- a/docs/plan.md
+++ b/docs/plan.md
@@ -0,0 +1,441 @@
+# Documentation Plan
+
+## Goal
+
+Build documentation around the main user stories:
+
+1. Run inference with the CLI on one folder of audio.
+2. Use the Python API for inference with fine-grained control over outputs,
+   including per-file workflows, class scores, features, and batch processing.
+3. Train or fine-tune a custom model.
+4. Evaluate a model and understand what the metrics mean.
+5. Understand the concepts needed to use BatDetect2 correctly.
+
+The docs should provide:
+
+- a simple happy path in tutorials,
+- richer task-oriented guidance in how-to guides,
+- complete lookup material in reference,
+- deep conceptual coverage in understanding.
+
+Note: the current docs tree uses `explanation/`. For Diataxis consistency, this
+plan uses `understanding/` as the target name for that conceptual section.
+
+## Current State Review
+
+### Looks reasonably complete
+
+- `docs/source/index.md`: good top-level orientation and navigation.
+- `docs/source/getting_started.md`: solid install and entry-point guidance.
+- `docs/source/explanation/*.md`: the conceptual pages are currently the
+  strongest part of the docs, especially pipeline overview, thresholds,
+  preprocessing consistency, and targets.
+- `docs/source/how_to/configure-*.md` and related target/data pages: practical
+  support docs for preprocessing, targets, ROI mapping, and dataset formats are
+  in decent shape.
+- `docs/source/reference/cli/*.rst`: CLI reference wiring exists and should
+  render useful option-level documentation from the Click commands.
+
+### Partially complete
+
+- `docs/source/how_to/run-batch-predictions.md`: useful, but thin.
+- `docs/source/how_to/tune-detection-threshold.md`: useful, but too brief for
+  a key workflow.
+- `docs/source/reference/preprocessing-config.md`
+- `docs/source/reference/postprocess-config.md`
+- `docs/source/reference/targets-config-workflow.md`
+
+These are good summaries, but they do not yet feel like complete references for
+all the customization surfaces available in the code.
+
+### Clearly incomplete or scaffolded
+
+- `docs/source/tutorials/run-inference-on-folder.md`
+- `docs/source/tutorials/integrate-with-a-python-pipeline.md`
+- `docs/source/tutorials/train-a-custom-model.md`
+- `docs/source/tutorials/evaluate-on-a-test-set.md`
+
+All four main tutorials are still starter scaffolds. This is the biggest gap in
+the current user story.
+
+### Major mismatch to resolve
+
+- `README.md` still tells an older story built around `batdetect2 detect` and
+  `batdetect2.api`.
+- The docs site tells the newer story built around `batdetect2 predict` and
+  `batdetect2.api_v2`.
+
+This creates avoidable confusion for users and should be treated as a priority
+documentation alignment issue.
+
+### Legacy documentation is not yet placed clearly
+
+The repo still contains meaningful legacy documentation material, but it is not
+yet presented as a clearly marked legacy path inside the docs.
+
+Users need two things:
+
+- a clear message that these docs exist for the previous BatDetect2 workflow,
+- a clear recommendation that new users should prefer the newer CLI/API
+  workflows and migrate where possible.
+
+## Legacy Documentation Plan
+
+### Goals
+
+1. Preserve access to the old workflow documentation.
+2. Prevent new users from accidentally following legacy guidance.
+3. Give current users a clear migration path from legacy to current workflows.
+
+### Proposed location
+
+Add a dedicated legacy area inside the docs, for example:
+
+- `docs/source/legacy/index.md`
+- `docs/source/legacy/cli-detect.md`
+- `docs/source/legacy/python-api.md`
+- `docs/source/legacy/feature-extraction.md`
+- `docs/source/legacy/migration-guide.md`
+
+This keeps the material available without mixing it into the main happy-path
+docs.
+
+### User-facing messaging
+
+Add clear notices in all relevant navigation entry points.
+
+Suggested message pattern:
+
+"If you want to use the previous version of BatDetect2, see the legacy
+documentation. For new workflows, we recommend using the current `predict`
+CLI and `BatDetect2API` interfaces."
+
+Places that should link to the legacy docs:
+
+- `docs/source/index.md`
+- `docs/source/getting_started.md`
+- `README.md`
+- tutorial landing pages where users may be coming from older workflows
+- any page that mentions the old `detect` command or old Python API
+
+### Migration guide plan
+
+Add a dedicated migration guide that explains:
+
+1. who should migrate now and who may need to stay on the legacy workflow,
+2. the mapping from old CLI commands to new CLI commands,
+3. the mapping from old Python API calls to new `api_v2` / `BatDetect2API`
+   patterns,
+4. what changed in outputs, terminology, and configuration,
+5. how legacy feature extraction concepts map to the new API surfaces,
+6. what behavior differences users should validate before switching,
+7. a short migration checklist.
+
+High-priority migration mappings to document:
+
+- `batdetect2 detect` -> `batdetect2 predict directory`
+- old `batdetect2.api` file processing -> `BatDetect2API.from_checkpoint(...)`
+  plus `process_file`, `process_files`, `process_audio`, or
+  `process_spectrogram`
+- legacy `cnn_feats`, `spec_features`, and `spec_slices` -> current output and
+  feature access patterns, with explicit notes where there is no direct
+  one-to-one replacement
+
+### Legacy content handling plan
+
+For each legacy page or legacy concept:
+
+1. Decide whether it should be preserved as-is, rewritten as a legacy page, or
+   replaced by the migration guide.
+2. Add a prominent warning banner saying it describes the previous workflow.
+3. Link forward to the current equivalent page when one exists.
+
+### Definition of done for legacy handling
+
+Legacy documentation work is done when:
+
+1. a reader can clearly distinguish legacy from current docs,
+2. old users can still find the previous workflow documentation,
+3. new users are consistently directed to the new docs,
+4. there is a practical migration guide covering the main CLI and Python API
+   transitions.
+
+## Main Gaps By User Story
+
+### 1. CLI inference
+
+Some coverage exists, but the happy path is not yet fully documented.
+
+Missing:
+
+- a full worked tutorial from input audio to saved outputs,
+- clear guidance on what outputs are written and how to inspect them,
+- stronger documentation for `predict dataset`,
+- a clearer story for default model vs custom checkpoint,
+- practical guidance for selecting output formats and thresholds.
+
+### 2. Python API inference
+
+This is currently the weakest major story.
+
+The code exposes much more than the docs explain, including:
+
+- `BatDetect2API.from_checkpoint` and `from_config`,
+- `process_file`, `process_files`, `process_directory`, `process_clips`,
+- `process_audio`, `process_spectrogram`,
+- `get_top_class_name`, `get_class_scores`, `get_detection_features`,
+- `save_predictions` and `load_predictions`.
+
+Missing docs:
+
+- an API-first tutorial with a simple path,
+- a how-to for file-by-file inspection and custom post-processing,
+- a how-to for batch API inference,
+- a reference page for `BatDetect2API`,
+- an explanation of what the feature vectors are and how users should think
+  about them.
+
+Important terminology note:
+
+- the old API/docs talk about `cnn_feats`, `spec_features`, and `spec_slices`,
+- the new API exposes per-detection `features`,
+- users interested in embeddings / downstream exploration will need a clear,
+  explicit doc that connects these ideas.
+
+### 3. Batch inference
+
+Batch prediction exists in both CLI and API workflows, but the docs do not yet
+explain the design space well.
+
+Missing:
+
+- when to use `directory` vs `file_list` vs `dataset`,
+- how clipping works during inference,
+- what `InferenceConfig` controls,
+- how batch size, workers, and output format choices affect runs,
+- how to organize large runs reproducibly.
+
+### 4. Training a custom model
+
+Supporting pages exist, but the end-to-end story is not yet there.
+
+Missing:
+
+- one complete tutorial from dataset config to checkpoints and sanity check,
+- a "minimum viable training setup" page,
+- clearer explanation of how model, targets, audio, training, inference,
+  outputs, and logging configs fit together,
+- a fine-tuning story versus training from scratch.
+
+### 5. Evaluation
+
+Evaluation is significantly under-documented relative to the code.
+
+Missing:
+
+- what evaluation tasks exist,
+- what metrics and plots are produced,
+- how predictions are matched to annotations,
+- how to interpret failures and trade-offs,
+- how to configure evaluation for different research questions.
+
+### 6. Understanding / concepts
+
+This is the best-developed section today, but it still needs expansion.
+
+Concepts that should be covered more fully:
+
+- what the model predicts,
+- what the raw and formatted outputs represent,
+- how to interpret detection scores and class scores,
+- what targets are and how they shape training and decoding,
+- how preprocessing choices affect model behavior,
+- what the extracted features represent and when they are useful,
+- what evaluation metrics actually measure,
+- why local validation is required before ecological inference.
+
+## Proposed Documentation Architecture
+
+## Target Table of Contents
+
+### Home
+
+- Home
+- Getting started
+- FAQ
+- Legacy docs
+
+### Tutorials
+
+These should be the default path for most users.
+
+- Tutorial: Run inference on a folder of audio
+- Tutorial: Explore predictions in Python for one file
+- Tutorial: Train a custom model
+- Tutorial: Evaluate a trained model
+
+### How-to Guides
+
+These cover practical tasks once the user is past the happy path.
+
+- How to choose an inference input mode
+- How to run batch predictions from a directory
+- How to run batch predictions from a file list
+- How to run predictions from a dataset config
+- How to tune detection thresholds
+- How to inspect class scores in Python
+- How to inspect detection features in Python
+- How to save predictions in different output formats
+- How to configure inference clipping
+- How to configure audio preprocessing
+- How to configure spectrogram preprocessing
+- How to configure target definitions
+- How to define target classes
+- How to configure ROI mapping
+- How to configure an AOEF dataset
+- How to import legacy BatDetect2 annotations
+- How to fine-tune from a checkpoint
+- How to choose and configure evaluation tasks
+- How to interpret evaluation outputs
+
+### Reference
+
+This should be the complete lookup layer.
+
+- CLI reference
+- CLI reference: base command and global options
+- CLI reference: predict
+- CLI reference: data
+- CLI reference: train
+- CLI reference: evaluate
+- CLI reference: legacy detect
+- API reference: `BatDetect2API`
+- Config reference: top-level app config
+- Config reference: inference config
+- Config reference: evaluation config
+- Config reference: outputs config
+- Config reference: output formats
+- Config reference: output transforms
+- Config reference: preprocessing config
+- Config reference: postprocess config
+- Config reference: targets config workflow
+- Reference: data sources
+- Reference: targets module
+
+### Understanding
+
+This is the conceptual layer and should carry the deeper Diataxis
+"understanding" material.
+
+- What BatDetect2 predicts
+- How the pipeline fits together
+- How to interpret detection scores and class scores
+- How to interpret formatted outputs
+- What extracted features / embeddings are and are not
+- Postprocessing and thresholds
+- Preprocessing consistency and domain shift
+- Target encoding and decoding
+- Evaluation concepts and matching behavior
+- Model output, validation, and ecological interpretation
+
+### Legacy
+
+This is a clearly signposted area for the previous workflow only.
+
+- Legacy overview
+- Legacy CLI workflow with `batdetect2 detect`
+- Legacy Python API with `batdetect2.api`
+- Legacy feature extraction outputs
+- Migration guide: legacy to current workflows
+
+### Tutorials
+
+Keep tutorials opinionated and minimal. Each one should show the default happy
+path with the fewest possible choices.
+
+Planned tutorial set:
+
+1. Run inference on a folder of audio.
+2. Explore predictions in Python for one file.
+3. Train a custom model.
+4. Evaluate a trained model.
+
+### How-to Guides
+
+Use how-to guides for branching tasks and customization.
+
+Planned additions or expansions:
+
+- Choose an inference input mode: directory, file list, or dataset.
+- Run large batch inference reproducibly.
+- Save predictions in different output formats.
+- Inspect class scores and features in Python.
+- Explore detection features / embeddings downstream.
+- Tune clipping and inference settings.
+- Fine-tune from a checkpoint.
+- Choose and configure evaluation tasks.
+- Interpret evaluation artifacts.
+
+### Reference
+
+Reference should become the complete map of all configurable surfaces.
+
+High-priority additions:
+
+- `BatDetect2API` reference.
+- `InferenceConfig` reference.
+- `EvaluationConfig` reference.
+- `OutputsConfig` and output format reference.
+- Output transform reference.
+- clearer config composition reference for the full app config.
+
+### Understanding
+
+This is where the deeper conceptual material should live.
+
+High-priority pages:
+
+1. What BatDetect2 predicts.
+2. How to interpret outputs, scores, and uncertainty.
+3. What extracted features / embeddings are and are not.
+4. Targets, labels, and decoded outputs.
+5. Preprocessing consistency and domain shift.
+6. Postprocessing, thresholds, and output density.
+7. How evaluation works and what the metrics mean.
+8. Why local validation is required before ecological interpretation.
+
+## Priority Order
+
+### Phase 1: Fix the primary user journey
+
+1. Expand the four scaffold tutorials into real end-to-end guides.
+2. Add a proper Python/API inference story.
+3. Document outputs and how to inspect them.
+4. Align `README.md` with the newer CLI/API documentation story.
+5. Create the legacy docs section and add clear signposting to it.
+
+### Phase 2: Cover the customization surface
+
+1. Add how-to guides for batch inference, output formats, and API inspection.
+2. Add reference pages for inference, outputs, evaluation, and API surfaces.
+3. Add fine-tuning and advanced training guidance.
+4. Write the migration guide from legacy to current workflows.
+
+### Phase 3: Deepen understanding
+
+1. Expand the conceptual section into a true understanding section.
+2. Add pages for output interpretation, features/embeddings, and evaluation
+   concepts.
+3. Reader-test the docs against realistic user questions.
+
+## Immediate Next Steps
+
+1. Decide whether to rename `explanation/` to `understanding/` or keep the
+   current directory name and just treat it as the Diataxis understanding
+   section.
+2. Draft the target table of contents for Tutorials, How-to, Reference, and
+   Understanding.
+3. Draft the legacy docs section and migration-guide table of contents.
+4. Rewrite the four scaffold tutorials first.
+5. Add the missing API, outputs, evaluation, and migration documentation
+   immediately after.
--- a/docs/source/documentation_plan.md
+++ b/docs/source/documentation_plan.md
@@ -0,0 +1,139 @@
+---
+orphan: true
+---
+
+# Documentation Architecture and Migration Plan (Phase 0)
+
+This page defines the Phase 0 documentation architecture and inventory for
+reorganizing `batdetect2` documentation using the Diataxis framework.
+
+## Scope and goals
+
+Phase 0 focuses on architecture and prioritization only. It does not attempt
+to write all new docs yet.
+
+Primary goals:
+
+1. Define a target docs architecture by Diataxis type.
+2. Map current pages to target documentation types.
+3. Identify what to keep, split, rewrite, or deprecate.
+4. Set priorities for implementation phases.
+
+## Audiences
+
+Two primary audiences are in scope.
+
+1. Ecologists who prefer minimal coding, focused on practical workflows:
+   run inference, inspect outputs, and possibly train with custom data.
+2. Ecologists or bioacousticians who are Python-savvy and want to customize
+   workflows, training, and analysis.
+
+## Target information architecture
+
+The target architecture uses four top-level documentation sections.
+
+1. Tutorials
+   - Learning-oriented, single-path, reproducible walkthroughs.
+2. How-to guides
+   - Task-oriented procedures for common real goals.
+3. Reference
+   - Factual descriptions of CLI, configs, APIs, and formats.
+4. Explanation
+   - Conceptual material that explains why design and workflow decisions
+     matter.
+
+Cross-cutting navigation conventions:
+
+- Every page starts with audience, prerequisites, and outcome.
+- Every page serves one Diataxis type only.
+- The beginner-first path is prioritized, with clear links to advanced pages.
+
+## Phase 0 inventory: current docs mapped to Diataxis
+
+Legend:
+
+- Keep: useful as-is with minor edits.
+- Split: contains mixed documentation types and should be separated.
+- Rewrite: major changes needed to fit target audience/type.
+- Move: content is valid but belongs under another section.
+
+| Current page | Current role | Target type | Audience | Action | Priority |
+| --- | --- | --- | --- | --- | --- |
+| `README.md` | Mixed quickstart + CLI + API + warning | Tutorial + How-to + Explanation (split) | 1 + 2 | Split | P0 |
+| `docs/source/index.md` | Sparse landing page | Navigation hub | 1 + 2 | Rewrite | P0 |
+| `docs/source/architecture.md` | Internal architecture deep dive | Explanation + developer reference | 2 | Move/trim | P2 |
+| `docs/source/postprocessing.md` | Concept + config + internals + usage | Explanation + How-to + Reference (split) | 1 + 2 | Split | P1 |
+| `docs/source/preprocessing/index.md` | Conceptual overview with some procedural flow | Explanation | 2 (and 1 optional) | Keep/trim | P2 |
+| `docs/source/preprocessing/audio.md` | Detailed configuration and behavior | Reference + How-to fragments | 2 | Split | P2 |
+| `docs/source/preprocessing/spectrogram.md` | Detailed configuration and behavior | Reference + How-to fragments | 2 | Split | P2 |
+| `docs/source/preprocessing/usage.md` | Usage patterns + concept | How-to + Explanation (split) | 2 | Split | P1 |
+| `docs/source/data/index.md` | Data-loading section index | Reference index | 2 | Keep/update | P2 |
+| `docs/source/data/aoef.md` | Config and examples | How-to + Reference (split) | 2 | Split | P1 |
+| `docs/source/data/legacy.md` | Legacy formats and config | How-to + Reference (split) | 2 | Split | P2 |
+| `docs/source/targets/index.md` | Long conceptual + process overview | Explanation + How-to (split) | 2 | Split | P2 |
+| `docs/source/targets/tags_and_terms.md` | Definitions + guidance | Explanation + Reference | 2 | Split | P2 |
+| `docs/source/targets/filtering.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
+| `docs/source/targets/transform.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
+| `docs/source/targets/classes.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
+| `docs/source/targets/rois.md` | Concept + mapping details | Explanation + Reference | 2 | Split | P2 |
+| `docs/source/targets/use.md` | Integration overview | Explanation | 2 | Keep/trim | P2 |
+| `docs/source/reference/index.md` | Small reference root | Reference | 2 | Expand | P1 |
+| `docs/source/reference/configs.md` | Autodoc for configs | Reference | 2 | Keep | P1 |
+| `docs/source/reference/targets.md` | Autodoc for targets | Reference | 2 | Keep | P2 |
+
+## CLI and API documentation gaps (from code surface)
+
+Current command surface includes:
+
+- `batdetect2 detect` (compat command)
+- `batdetect2 predict directory`
+- `batdetect2 predict file_list`
+- `batdetect2 predict dataset`
+- `batdetect2 train`
+- `batdetect2 evaluate`
+- `batdetect2 data summary`
+- `batdetect2 data convert`
+
+These commands are not yet represented as a coherent user-facing task set.
+
+Priority gap actions:
+
+1. Add CLI reference pages for command signatures and options.
+2. Add beginner how-to pages for practical command recipes.
+3. Add migration guidance from `detect` to `predict` workflows.
+
+## Priority architecture for implementation phases
+
+### P0 (this phase): architecture and inventory
+
+- Done in this file.
+- Define structure and classify existing material.
+
+### P1: user-critical docs for running the model
+
+1. Beginner tutorial: run inference on a folder of audio and inspect outputs.
+2. How-to guides for repeatable inference tasks and threshold tuning.
+3. Reference: complete CLI docs for prediction and outputs.
+4. Explanation: interpretation caveats and validation guidance.
+
+### P2: advanced customization and training
+
+1. How-to guides for custom dataset preparation and training.
+2. Reference for data formats, targets, and preprocessing configs.
+3. Explanation docs for target design and pipeline trade-offs.
+
+### P3: polish and contributor consistency
+
+1. Tight cross-linking across Diataxis boundaries.
+2. Consistent page templates and terminology.
+3. Reader testing with representative users from both audiences.
+
+## Definition of done for Phase 0
+
+Phase 0 is complete when:
+
+1. The target architecture is defined.
+2. Existing content is inventoried and classified.
+3. A prioritized migration path is agreed.
+
+This page satisfies these criteria and is the baseline for Phase 1 work.
--- a/docs/source/explanation/extracted-features-and-embeddings.md
+++ b/docs/source/explanation/extracted-features-and-embeddings.md
@@ -2,13 +2,11 @@

 The current API exposes a per-detection `features` vector.

-Older BatDetect2 workflows also exposed concepts such as `cnn_feats`,
-`spec_features`, and `spec_slices`.
+Older BatDetect2 workflows also exposed concepts such as `cnn_feats`, `spec_features`, and `spec_slices`.

 ## What the current feature vector is

-In the current stack, each retained detection can carry an internal feature
-representation produced by the model output pipeline.
+In the current stack, each retained detection can carry an internal feature representation produced by the model output pipeline.

 This is useful for downstream exploration, comparison, and custom analysis.

@@ -20,24 +18,19 @@ They are also not a substitute for careful validation.

 ## Why people refer to them as embeddings

-In practice, users often treat these feature vectors as embeddings because they
-can be used as dense learned representations of detections.
+In practice, users often treat these feature vectors as embeddings because they can be used as dense learned representations of detections.

-That usage is reasonable, but you should still treat them as model-derived
-internal representations whose meaning depends on the training setup.
+That usage is reasonable, but you should still treat them as model-derived internal representations whose meaning depends on the training setup.

 ## Legacy terminology versus current terminology

 - legacy `cnn_feats` referred to CNN feature outputs in the older workflow,
 - legacy `spec_features` referred to lower-level extracted call features,
- current `features` are the per-detection vectors attached to `Detection`
-  objects.
+- current `features` are the per-detection vectors attached to `Detection` objects.

 These are related ideas, but not necessarily one-to-one replacements.

 ## Related pages

- Inspect detection features in Python:
-  {doc}`../how_to/inspect-detection-features-in-python`
- Legacy migration guide:
-  {doc}`../legacy/migration-guide`
+- Inspect detection features in Python: {doc}`../how_to/inspect-detection-features-in-python`
+- Legacy feature extraction: {doc}`../legacy/feature-extraction`
--- a/docs/source/faq.md
+++ b/docs/source/faq.md
@@ -4,78 +4,83 @@

 ### Do I need Python knowledge to use batdetect2?

-Not much.
-If you only want to run the model on your own recordings, you can use the CLI and follow the steps in {doc}`getting_started`.
+Not much. If you only want to run the model on your own recordings, you can
+use the CLI and follow the steps in {doc}`getting_started`.

-Some command-line familiarity helps, but you do not need to write Python code for standard inference workflows.
+Some command-line familiarity helps, but you do not need to write Python code
+for standard inference workflows.

 ### Are there plans for an R version?

-Not currently.
-Output files are plain formats (for example CSV/JSON), so you can read and analyze them in R or other environments.
+Not currently. Output files are plain formats (for example CSV/JSON), so you
+can read and analyze them in R or other environments.

 ### I cannot get installation working. What should I do?

 First, re-check {doc}`getting_started` and confirm your environment is active.
-If it still fails, open an issue with your OS, install method, and full error output: [GitHub Issues](https://github.com/macaodha/batdetect2/issues).
+If it still fails, open an issue with your OS, install method, and full error
+output: [GitHub Issues](https://github.com/macaodha/batdetect2/issues).

 ## Model behavior and performance

 ### The model does not perform well on my data

-This usually means your data distribution differs from training data.
-The best next step is to validate on reviewed local data and then fine-tune/train on your own annotations if needed.
+This usually means your data distribution differs from training data. The best
+next step is to validate on reviewed local data and then fine-tune/train on
+your own annotations if needed.

 ### The model confuses insects/noise with bats

-This can happen, especially when recording conditions differ from training conditions.
-Threshold tuning and training with local annotations can improve results.
+This can happen, especially when recording conditions differ from training
+conditions. Threshold tuning and training with local annotations can improve
+results.

 See {doc}`how_to/tune-detection-threshold`.

 ### The model struggles with feeding buzzes or social calls

-This is a known limitation of available training data in some settings.
-If you have high-quality annotated examples, they are valuable for improving models.
+This is a known limitation of available training data in some settings. If you
+have high-quality annotated examples, they are valuable for improving models.

 ### Calls in the same sequence are predicted as different species

-Currently we do not do any sophisticated post processing on the results output by the model.
-We return a probability associated with each species for each call.
-You can use these predictions to clean up the noisy predictions for sequences of calls.
+batdetect2 returns per-call probabilities and does not apply heavy
+sequence-level smoothing by default. You can apply sequence-aware
+postprocessing in your own analysis workflow.

 ### Can I trust model outputs for biodiversity conclusions?

-The models developed and shared as part of this repository should be used with caution.
-While they have been evaluated on held out audio data, great care should be taken when using the model outputs for any form of biodiversity assessment.
-Your data may differ, and as a result it is very strongly recommended that you validate the model first using data with known species to ensure that the outputs can be trusted.
+Use caution. Always validate model behavior on local, reviewed data before
+using outputs for ecological inference or biodiversity assessment.

 ### The pipeline is slow

-Runtime depends on hardware and recording duration.
-GPU inference is often much faster than CPU.
+Runtime depends on hardware and recording duration. GPU inference is often much
+faster than CPU. If files are very long, splitting them into shorter clips can
+help throughput.
+
+If you need a clipping workflow, see the annotation GUI repository:
+[batdetect2_GUI](https://github.com/macaodha/batdetect2_GUI).

 ## Training and scope

 ### Can I train on my own species set?

-Yes.
-You can train/fine-tune with your own annotated data and species labels.
+Yes. You can train/fine-tune with your own annotated data and species labels.

 ### Does this work on frequency-division or zero-crossing recordings?

-Not directly.
-The workflow assumes audio can be converted to spectrograms from the raw waveform.
+Not directly. The workflow assumes audio can be converted to spectrograms from
+the raw waveform.

 ### Can this be used for non-bat bioacoustics (for example insects or birds)?

-Potentially yes, but expect retraining and configuration changes.
-Open an issue if you want guidance for a specific use case.
+Potentially yes, but expect retraining and configuration changes. Open an issue
+if you want guidance for a specific use case.

 ## Usage and licensing

 ### Can I use this for commercial purposes?

-No.
-This project is currently for non-commercial use.
-See the repository license for details.
+No. This project is currently for non-commercial use. See the repository
+license for details.
--- a/docs/source/getting_started.md
+++ b/docs/source/getting_started.md
@ -1,38 +1,52 @@
 # Getting started

-BatDetect2 can be used in two ways: through the `batdetect2` command line interface (CLI), or as the `batdetect2` Python package.
-The CLI route does not require coding.
-You run commands in the terminal and, in some cases, write configuration files.
-The Python route gives you more flexibility and lets you integrate the model into your own workflows or experiments.
-For most common use cases, both routes give you the same results.
+If you want to run BatDetect2 on your recordings, start with the command-line
+route below.

-## Try it out
+You do not need to write Python code for a standard first run.
+
+BatDetect2 also has a Python interface, but that is mainly for users writing
+their own analysis scripts.
+
+- Use the command-line route if you want to run an existing model or train your
+  own model by typing commands in a terminal window.
+- Use the Python route only if you already want to work in scripts or notebooks.
+
+```{note}
+If you are looking for the previous BatDetect2 workflow based on `batdetect2 detect` or `batdetect2.api`, go to {doc}`legacy/index`.
+New docs default to the current `process` CLI and `BatDetect2API` workflow.
+```

 If you want to try BatDetect2 before installing anything locally:

 - [Hugging Face demo (UK species)](https://huggingface.co/spaces/macaodha/batdetect2)
 - [Google Colab notebook](https://colab.research.google.com/github/macaodha/batdetect2/blob/master/batdetect2_notebook.ipynb)

-## Installation
+## The simplest route for most users

-To use `batdetect2` on your machine, you need to install it first.
-We recommend using `uv` for that.
-`uv` is a tool that helps manage Python software cleanly, without mixing it into the rest of your machine.
-Install `uv` first by following the [installation instructions](https://docs.astral.sh/uv/getting-started/installation/).
+1. Install BatDetect2.
+2. Use a model checkpoint.
+3. Run the first tutorial on a folder of recordings.

-### One-off usage
+If that is what you want, you can ignore the Python sections for now.

-If you are not ready to install `batdetect2` permanently, you can try it with:
+## Install BatDetect2

-```bash
-uvx batdetect2
-```
+We recommend `uv` for both workflows.

-This still downloads the code and dependencies and runs them on your machine, but the environment is temporary.
+`uv` is a tool that helps install Python software cleanly, without mixing it
+into the rest of your machine.

-### Install the CLI
+- Use `uv tool` to install the CLI.
+- Use `uv add` to add `batdetect2` as a dependency in a Python project.

-If you want the `batdetect2` CLI to always be available in your terminal, run:
+Install `uv` first by following their
+[installation instructions](https://docs.astral.sh/uv/getting-started/installation/).
+
+## Install the CLI
+
+The following installs `batdetect2` in its own small environment and makes the
+`batdetect2` command available on your machine.

 ```bash
 uv tool install batdetect2
@ -47,45 +61,65 @@ uv tool upgrade batdetect2
 Verify the CLI is available:

 ```bash
-batdetect2
+batdetect2 --help
 ```

-You can then run your first workflow.
-See {doc}`tutorials/run-inference-on-folder` for more details.
+Next, run your first workflow.

-### Add it to your Python project
+Go to {doc}`tutorials/run-inference-on-folder` for a complete first run.

-If you are using BatDetect2 from Python code and already manage your projects with `uv`, you can add it with:
+## Choose a model checkpoint
+
+The current command-line and Python workflows expect an explicit checkpoint
+path.
+
+A checkpoint is the saved model file that BatDetect2 will use for prediction.
+
+You can use:
+
+- a checkpoint you trained yourself, or
+- a checkpoint distributed with your installation or repository checkout.
+
+In this repository checkout, an example pretrained checkpoint is available at:
+
+```text
+src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
+```
+
+Use that path in the tutorial commands if you want a concrete starting point
+from this source tree.
+
+## Python route for users writing code
+
+If you are using BatDetect2 from Python code, add it to your Python project:

 ```bash
 uv add batdetect2
 ```

-If you want to upgrade it later:
+This keeps your project settings and installed packages in sync.

-```bash
-uv add -U batdetect2
-```
+### Alternative with `pip`

-#### Alternative with `pip`
-
-If you prefer `pip`, you can use:
-
-```bash
-pip install batdetect2
-```
-
-It is a good idea to create a separate virtual environment first so this does not interfere with other Python environments.
+If you prefer `pip`, create and activate a virtual environment first:

 ```bash
 python -m venv .venv
 source .venv/bin/activate
 ```

+Then install from PyPI:
+
+```bash
+pip install batdetect2
+```
+
 ## What's next

- Run your first workflow on a folder of recordings: {doc}`tutorials/run-inference-on-folder`
- If you write code and want the Python route: {doc}`tutorials/integrate-with-a-python-pipeline`
+- Run your first workflow on a folder of recordings:
+  {doc}`tutorials/run-inference-on-folder`
+- If you write code and want the Python route:
+  {doc}`tutorials/integrate-with-a-python-pipeline`
 - For common practical tasks, go to {doc}`how_to/index`
 - For detailed command help, go to {doc}`reference/cli/index`
- To understand the model and its outputs, go to {doc}`explanation/index`
+- To understand outputs and trade-offs, go to {doc}`explanation/index`
--- a/docs/source/how_to/choose-a-model.md
+++ b/docs/source/how_to/choose-a-model.md
@ -1,112 +0,0 @@
-# How to choose a model
-
-Use this guide when you want to choose which model checkpoint BatDetect2 loads.
-
-You can choose a model in both the CLI and the Python API.
-
-## Where you can choose the model
-
-In the CLI, use `--model` with commands that load a checkpoint, including:
-
- `batdetect2 process`
- `batdetect2 evaluate`
- `batdetect2 train`
- `batdetect2 finetune`
-
-In Python, pass the model source to `BatDetect2API.from_checkpoint(...)`.
-
-If you do not choose a model, BatDetect2 uses the built-in default UK model.
-
-## Use a local checkpoint path
-
-Use a local path when you already have a checkpoint file on disk.
-
-CLI example:
-
-```bash
-batdetect2 process directory \
-    path/to/audio \
-    path/to/outputs \
-    --model path/to/model.ckpt
-```
-
-Python example:
-
-```python
-from batdetect2.api_v2 import BatDetect2API
-
-api = BatDetect2API.from_checkpoint("path/to/model.ckpt")
-```
-
-## Use a bundled checkpoint alias
-
-BatDetect2 also supports bundled checkpoint aliases.
-
-The built-in UK model is available as `uk_same`.
-The alias `batdetect2_uk_same` also works.
-
-CLI example:
-
-```bash
-batdetect2 process directory \
-    path/to/audio \
-    path/to/outputs \
-    --model uk_same
-```
-
-Python example:
-
-```python
-from batdetect2.api_v2 import BatDetect2API
-
-api = BatDetect2API.from_checkpoint("uk_same")
-```
-
-## Use a Hugging Face URI
-
-You can also load a checkpoint from Hugging Face with a URI like:
-
-```text
-hf://owner/repo/path/to/model.ckpt
-```
-
-This needs the optional Hugging Face dependency to be installed.
-For example, install it with `pip install batdetect2[huggingface]`.
-
-CLI example:
-
-```bash
-batdetect2 process directory \
-    path/to/audio \
-    path/to/outputs \
-    --model hf://owner/repo/path/to/model.ckpt
-```
-
-Python example:
-
-```python
-from batdetect2.api_v2 import BatDetect2API
-
-api = BatDetect2API.from_checkpoint(
-    "hf://owner/repo/path/to/model.ckpt"
-)
-```
-
-## Choose the right source
-
- Use a local path when you already have a checkpoint file.
- Use an alias when you want one of the bundled models.
- Use a Hugging Face URI when the checkpoint lives in a Hugging Face repo.
-
-## Related pages
-
- Run inference on a folder:
-  {doc}`../tutorials/run-inference-on-folder`
- `BatDetect2API` reference:
-  {doc}`../reference/api`
- Process command reference:
-  {doc}`../reference/cli/predict`
- Train a custom model:
-  {doc}`../tutorials/train-a-custom-model`
- Fine-tune from a checkpoint:
-  {doc}`fine-tune-from-a-checkpoint`
--- a/docs/source/how_to/index.md
+++ b/docs/source/how_to/index.md
@ -1,15 +1,12 @@
 # How-to Guides

-How-to guides help you answer practical questions once you are past the first
-tutorial.
+How-to guides help you answer practical questions once you are past the first tutorial.

-Use this section when you already know the basic workflow and want help with one
-specific task.
+Use this section when you already know the basic workflow and want help with one specific task.

 ```{toctree}
 :maxdepth: 1

-choose-a-model
 choose-an-inference-input-mode
 run-batch-predictions
 tune-inference-clipping
--- a/docs/source/how_to/save-predictions-in-different-output-formats.md
+++ b/docs/source/how_to/save-predictions-in-different-output-formats.md
@ -14,7 +14,7 @@ Current built-in output formats include:
 - `soundevent`:
  prediction-set JSON for soundevent-style tooling,
 - `batdetect2`:
-  legacy-compatible per-recording JSON and CSV outputs.
+  legacy per-recording JSON output.

 ## Select a format from the CLI

@ -61,29 +61,7 @@ batdetect2 process directory \
 - Use `raw` if you want the richest output surface and easy round-tripping.
 - Use `parquet` if you want tabular analysis in Python or data-lake workflows.
 - Use `soundevent` if you want prediction-set JSON.
- Use `batdetect2` when you need legacy BatDetect2-style outputs.
-
-## Enable legacy CNN feature CSVs
-
-The `batdetect2` formatter can also write the legacy CNN feature sidecar CSVs.
-This is controlled through the outputs config.
-
-Example:
-
-```yaml
-format:
-  name: batdetect2
-  write_cnn_features_csv: true
-transform:
-  detection_transforms: []
-  clip_transforms: []
-```
-
-When enabled, BatDetect2 writes:
-
- one `.json` file per recording,
- one detection `.csv` file per recording,
- one `_cnn_features.csv` file per recording when detections are present.
+- Use `batdetect2` only when you need the legacy JSON shape.

 ## Related pages

--- a/docs/source/index.md
+++ b/docs/source/index.md
@ -4,42 +4,50 @@ Welcome to the BatDetect2 documentation.

 ## What is BatDetect2?

-`batdetect2` is a deep learning model and software package for detecting and
-classifying bat echolocation calls in high-frequency audio recordings.
+`batdetect2` detects bat echolocation calls in audio recordings.

-You can use it from the command line or from Python, depending on how much
-control you need.
+It can help you screen large collections of recordings, find files that need
+expert review, and support ecology and conservation work where manual review
+alone would be slow.

-In practice, BatDetect2 scans a recording, finds sounds that look like bat
-calls, and returns one result for each detected call.
-Each result can include where the call appears in the recording, shown as a box
-with start and end time and the lowest and highest frequency, how confident the
-model is that it found a call, and how strongly it matches the available
-classes.
+In practice, BatDetect2 takes recordings, looks for likely bat calls, draws a
+time-frequency box around each detected event, and scores how strongly that
+event matches each available class.

-The built-in default model is trained for 17 UK species.
-The package also supports custom training, fine-tuning, evaluation, and more
-advanced workflows from Python.
+The current default model is trained for 17 UK species.

-For more detail on the underlying approach, see the pre-print:
+The library also supports custom training, fine-tuning, evaluation, and more
+advanced use from Python.
+
+For details on the underlying approach, see the pre-print:
 [Towards a General Approach for Bat Echolocation Detection and Classification](https://www.biorxiv.org/content/10.1101/2022.12.14.520490v1)

-```{warning}
-Treat outputs as model predictions, not ground truth.
-Always validate on reviewed local data before using results for ecological inference.
-```
+## A good first use for BatDetect2

-## What can I do with it?
+BatDetect2 is a good fit when you want to:
+
+- scan many recordings for likely bat activity,
+- prioritize files for expert review,
+- compare outputs across projects with appropriate caution,
+- build reviewed local datasets for later model improvement.
+
+It is not a substitute for validation.
+
+## Main user journeys

 - I want to run the model on my recordings:
  {doc}`tutorials/run-inference-on-folder`
- I write code and want to use it from Python:
+- I write code and want to use Python:
  {doc}`tutorials/integrate-with-a-python-pipeline`
 - I want to train or fine-tune a custom model:
  {doc}`tutorials/train-a-custom-model`
 - I want to evaluate a trained model on held-out data:
  {doc}`tutorials/evaluate-on-a-test-set`

+```{warning}
+Treat outputs as model predictions, not ground truth.
+Always validate on reviewed local data before using results for ecological inference.
+```
+
 ```{note}
 Looking for the previous BatDetect2 workflow?
 See {doc}`legacy/index`.
@ -55,7 +63,7 @@ Then choose the section that matches what you need.
 If you are here mainly to run the model on recordings, start with Tutorials.

 | Section | Best for | Start here |
-| ------------- | --------------------------------------------- | ------------------------ |
+| --- | --- | --- |
 | Tutorials | Step-by-step routes for the most common tasks | {doc}`tutorials/index` |
 | How-to guides | Answers to specific practical questions | {doc}`how_to/index` |
 | Reference | Detailed command and settings help | {doc}`reference/index` |
@ -82,17 +90,6 @@ Mac Aodha, O., Martinez Balvanera, S., Damstra, E., et al.
 _Towards a General Approach for Bat Echolocation Detection and Classification_.
 bioRxiv.

-or the bibtex entry
-
-```bibtex
-@article{batdetect2_2022,
-  title         = {Towards a General Approach for Bat Echolocation Detection and Classification},
-  author        = {Mac Aodha, Oisin and Mart\'{i}nez Balvanera, Santiago and Damstra, Elise and Cooke, Martyn and Eichinski, Philip and Browning, Ella and Barataudm, Michel and Boughey, Katherine and Coles, Roger and Giacomini, Giada and MacSwiney G., M. Cristina and K. Obrist, Martin and Parsons, Stuart and Sattler, Thomas and Jones, Kate E.},
-  journal       = {bioRxiv},
-  year          = {2022}
-}
-```
-
 ```{toctree}
 :maxdepth: 1
 :caption: Get Started
--- a/docs/source/legacy/cli-detect.md
+++ b/docs/source/legacy/cli-detect.md
@ -1,50 +1,38 @@
-# CLI workflow: `batdetect2 detect`
+# Legacy CLI workflow: `batdetect2 detect`

 This page documents the previous CLI workflow based on `batdetect2 detect`.

 ```{warning}
-This is documentation for a previous version of batdetect2.
+This is legacy documentation.
 For new workflows, use `batdetect2 process directory` instead.
 If you are migrating, start with {doc}`migration-guide`.
 ```

-## Processing a folder of audio files
+## Legacy command shape

 ```bash
 batdetect2 detect AUDIO_DIR ANN_DIR DETECTION_THRESHOLD
 ```

-Example:
+Common legacy options included:
+
+- `--cnn_features`
+- `--spec_features`
+- `--time_expansion_factor`
+- `--save_preds_if_empty`
+- `--model_path`
+
+## Current replacement
+
+The closest current CLI entry point is:

 ```bash
-batdetect2 detect example_data/audio/ example_data/anns/ 0.3
+batdetect2 process directory \
+  path/to/model.ckpt \
+  path/to/audio_dir \
+  path/to/outputs
 ```

-This command scans a directory of audio files, runs the BatDetect2 detector on
-each file, and writes BatDetect2-style outputs into `ANN_DIR`.
-Those outputs usually include one JSON file and one CSV file per recording, and
-can optionally include extra feature CSVs.
-
-`AUDIO_DIR` is the folder containing the input `.wav` files.
-`ANN_DIR` is the folder where model outputs are written.
-
-`DETECTION_THRESHOLD` controls which detections are kept.
-Predictions below this score are discarded.
-Smaller values keep more detections, but usually also increase mistakes.
-
-Common options:
-
- `--cnn_features` Write extra CNN feature CSV files for each recording.
- `--spec_features` Extract and write traditional acoustic spectrogram feature
-  CSV files.
-  These are saved as `*_spec_features.csv` files.
- `--time_expansion_factor` Set the time expansion factor used for all files in
-  the run.
- `--save_preds_if_empty` Save output files even when no detections are found.
- `--model_path` Use a specific checkpoint instead of the included default
-  model.
-  If omitted, the command uses the default model trained on UK data.
-
 ## Related pages

 - Migration guide:
--- a/docs/source/legacy/feature-extraction.md
+++ b/docs/source/legacy/feature-extraction.md
@ -0,0 +1,34 @@
+# Legacy feature extraction outputs
+
+The previous BatDetect2 workflow exposed several output concepts that users may still rely on.
+
+These included:
+
+- `cnn_feats`
+- `spec_features`
+- `spec_slices`
+
+## Why this matters
+
+Users exploring older notebooks or downstream analysis code often encounter these names first.
+
+The current stack exposes a different surface centered on per-detection `features` plus configurable output formatters.
+
+## Migration note
+
+There is not always a strict one-to-one replacement.
+
+When migrating, validate which part of the old workflow you actually need:
+
+- low-level exported features,
+- spectrogram slices,
+- model-internal feature vectors,
+- legacy JSON output shape.
+
+Then map that need onto the current API and output format configuration.
+
+## Related pages
+
+- Migration guide: {doc}`migration-guide`
+- Current features explanation: {doc}`../explanation/extracted-features-and-embeddings`
+- Output formats reference: {doc}`../reference/output-formats`
--- a/docs/source/legacy/index.md
+++ b/docs/source/legacy/index.md
@ -1,8 +1,9 @@
-# BatDetect2 v1.0 documentation
+# Legacy documentation

-This section documents the BatDetect2 workflow for version 1.
+This section documents the previous BatDetect2 workflow.

-Use these pages if you need to keep working with the older `batdetect2 detect` command or the older `batdetect2.api` interface.
+Use these pages if you need to keep working with the older `batdetect2 detect`
+command or the older `batdetect2.api` interface.

 For new projects, we recommend the current workflow:

@ -24,5 +25,6 @@ New users should start with {doc}`../getting_started` and {doc}`../tutorials/ind

 cli-detect
 python-api
+feature-extraction
 migration-guide
 ```
--- a/docs/source/legacy/migration-guide.md
+++ b/docs/source/legacy/migration-guide.md
@ -1,123 +1,107 @@
-# BatDetect2 2.0 migration guide
+# Migration guide: legacy to current workflows

-Use this guide when moving from BatDetect2 1.x workflows to the CLI and API in
-2.x.
+Use this guide when moving from the previous BatDetect2 workflow to the current
+CLI and API.

-## Why migrate
+## Who should migrate now

-You get access to newer features.
-The codebase changed quite a bit and now gives you much more control over the
-workflow through config files, improved training and fine-tuning code, and a
-more flexible sound target definition system.
+You should migrate if:

-You can also run newer or improved models.
-That includes updated versions of the UK model, plus other models trained with
-the newer codebase.
+- you are starting a new workflow,
+- you want the current docs path,
+- you want the newer CLI and API surface,
+- you are maintaining code that does not depend on the exact legacy JSON or
+  feature outputs.

-We are no longer actively supporting version 1.
-No new enhancements are planned there, and only major bug fixes may still be
-considered.
-Future work is focused on version 2, including compatibility with newer Python
-versions.
+You may need the legacy workflow a bit longer if:

-## Deprecation plan
+- downstream tooling depends on the exact old output structure,
+- you rely on older notebooks built around `batdetect2.api`,
+- you depend on legacy feature extraction outputs without a validated
+  replacement yet.

-We have kept the `batdetect2.api` module and the `batdetect2 detect` CLI command
-in place for now.
-You can keep using them without changing your current workflow.
-However, many of the internal functions were relocated, removed or modified.
-If your code relied on anything outside of the `api` module, it may break.
-It is worth checking the new docs first, since there may already be a newer
-feature that covers your use case.
-If not, please open an issue.
-
-Because the old `api` and CLI command are now redundant with the newer stack, we
-plan to remove them in about a year.
-If you want to keep pipelines up to date and long-running, it is a good idea to
-migrate to version 2.
-
-## How to migrate
-
-If you are only using the `batdetect2 detect` CLI command or the
-`batdetect2.api` module, the migration should be fairly simple.
-This guide only covers these two entry points.
-
-### CLI mapping
+## CLI mapping

 - `batdetect2 detect AUDIO_DIR ANN_DIR DETECTION_THRESHOLD` -> `batdetect2
-  process directory AUDIO_DIR OUTPUT_PATH --detection-threshold
-  DETECTION_THRESHOLD ...`
+  process directory MODEL_PATH AUDIO_DIR OUTPUT_PATH --detection-threshold
+  DETECTION_THRESHOLD ...`

 Main changes:

- outputs can be written in different formats.
-  See the output format reference for the available options.
- the detection threshold is now an option instead of a required positional
-  argument.
- options like saving CNN features are now controlled through config rather than
-  command flags.
- there are separate subcommands for processing a directory, file list, or
-  dataset.
+- the model path is now a positional argument on the `process` subcommand,
+- the current workflow expects an explicit checkpoint path rather than silently
+  relying on the old default CLI behavior,
+- output formatting is configurable,
+- threshold override is an option rather than a required positional argument,
+- there are separate subcommands for directory, file-list, and dataset-driven
+  inference.

-### Python API mapping
+## Python API mapping

 - old:
  `import batdetect2.api as api`
 - current:
-  `from batdetect2 import BatDetect2API`
+  `from batdetect2.api_v2 import BatDetect2API`

 Typical migration shape:

 ```python
 from pathlib import Path

-from batdetect2 import BatDetect2API
+from batdetect2.api_v2 import BatDetect2API

-# If no checkpoint is provided, the default UK model is loaded
-api = BatDetect2API.from_checkpoint()
+api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
 prediction = api.process_file(Path("path/to/audio.wav"))
 ```

 Useful replacements:

- `batdetect2.api.process_file` -> current `BatDetect2API.process_file`
- `batdetect2.api.process_audio` -> current `BatDetect2API.process_audio`
- `batdetect2.api.process_spectrogram` -> current
-  `BatDetect2API.process_spectrogram`
- one-off batch loops -> `BatDetect2API.process_files` or CLI `process`
+- legacy `process_file` -> current `BatDetect2API.process_file`
+- legacy `process_audio` -> current `BatDetect2API.process_audio`
+- legacy `process_spectrogram` -> current `BatDetect2API.process_spectrogram`
+- legacy one-off batch loops -> current `BatDetect2API.process_files` or CLI
+  `process`

-### Model changes
+## Output and terminology changes

-The default checkpoint used by the new CLI `process` commands and by
-`BatDetect2API` is a newer model trained from scratch using the updated training
-code, but the same model architecture, training procedure, and data.
-Performance did not change substantially, but some differences are still
-expected.
+Legacy workflows often centered on:

-### Species names
+- BatDetect2-style JSON output,
+- `cnn_feats`,
+- `spec_features`,
+- `spec_slices`.

-For the default UK model there are two naming changes:
+Current workflows center on:

-1. The original model had a typo and instead of `Barbastella barbastellus` it
-   used `Barbastellus barbastellus`.
-   This has now been corrected.
-2. There has been a recent change in name for `Eptesicus serotinus` to
-   `Cnephaeus serotinus`.
+- `ClipDetections` and `Detection` objects,
+- per-detection `detection_score`,
+- per-detection `class_scores`,
+- per-detection `features`,
+- configurable output formatters.

-## Stay on version 1
+## What to validate after migration

-If you prefer not to migrate to version 2 yet, you can keep using version 1.
-In that case, it is a good idea to pin your dependency:
+Before replacing a legacy workflow in production or research analysis, validate:

-```bash
-pip install "batdetect2>=1.3.1,<2"
-```
+- that thresholds are still appropriate,
+- that outputs are being saved in the right format,
+- that downstream code reads the new outputs correctly,
+- that feature-related assumptions still hold,
+- that you treat evaluation results and ecological interpretation as unchanged
+  only where you have actually verified that.
+
+## Migration checklist
+
+1. Identify the old entry points you use.
+2. Replace them with the current CLI or `BatDetect2API` equivalents.
+3. Choose an output format explicitly.
+4. Re-run on a small reviewed subset.
+5. Compare outputs and downstream behavior.
+6. Update any notebooks or scripts that assume legacy field names.

 ## Related pages

- Getting started:
+- Current getting started:
  {doc}`../getting_started`
- Tutorials:
+- Current tutorials:
  {doc}`../tutorials/index`
- API reference:
+- Current API reference:
  {doc}`../reference/api`
--- a/docs/source/legacy/python-api.md
+++ b/docs/source/legacy/python-api.md
@ -3,52 +3,37 @@
 This page documents the previous Python API workflow based on `batdetect2.api`.

 ```{warning}
-This is documentation for a previous version of batdetect2.
-For new workflows, use `batdetect2.BatDetect2API`.
+This is legacy documentation.
+For new workflows, use `batdetect2.api_v2.BatDetect2API`.
 If you are migrating, start with {doc}`migration-guide`.
 ```

-## Using BatDetect2 in Python
+## Legacy entry points

-If you prefer to process data inside a Python script, you can use the `batdetect2.api` module.
+Common legacy functions included:

-This interface gives you a simple entry point for running the built-in BatDetect2 model and also exposes the default model and default configuration more directly than the current API.
+- `process_file`
+- `process_audio`
+- `process_spectrogram`
+- `load_audio`
+- `generate_spectrogram`
+- `postprocess`

-You can process a whole file in one step, or load audio, generate a spectrogram, and work with lower-level functions yourself.
+The legacy API also exposed the default model and default config more directly.

-Common functions:
+## Current replacement

- `process_file` Load an audio file, run the model, and return BatDetect2-style results for that recording.
- `process_audio` Run inference on an audio array that is already loaded in memory.
- `process_spectrogram` Run inference starting from a spectrogram tensor instead of raw audio.
- `load_audio` Load and resample audio using the legacy preprocessing path.
- `generate_spectrogram` Convert audio into the spectrogram representation expected by the model.
- `postprocess` Convert raw model outputs into detections and extracted features.
-
-Typical usage:
+The current Python path is:

 ```python
-import batdetect2.api as api
+from pathlib import Path

-AUDIO_FILE = "example_data/audio/20170701_213954-MYOMYS-LR_0_0.5.wav"
+from batdetect2.api_v2 import BatDetect2API

-# Process a whole file
-results = api.process_file(AUDIO_FILE)
-annotations = results["pred_dict"]["annotation"]
-
-# Or, load audio and compute spectrograms
-audio = api.load_audio(AUDIO_FILE)
-spec = api.generate_spectrogram(audio)
-
-# And process the audio or the spectrogram with the model
-detections, features, spec = api.process_audio(audio)
-detections, features = api.process_spectrogram(spec)
-
-# Integrate the detections or extracted features into your own analysis
+api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
+prediction = api.process_file(Path("path/to/audio.wav"))
 ```

-This interface is most useful when you want to work directly with detections, features, spectrograms, or intermediate arrays inside your own code.
-
 ## Related pages

 - Migration guide: {doc}`migration-guide`
--- a/docs/source/reference/configs.rst
+++ b/docs/source/reference/configs.rst
@ -5,9 +5,6 @@ BatDetect2 uses separate config objects for different workflow surfaces.

 Use the dedicated reference pages for each config family:

- model config
- training config
- logging config
 - inference config
 - evaluation config
 - outputs config
--- a/docs/source/reference/detections.md
+++ b/docs/source/reference/detections.md
@ -1,42 +0,0 @@
-# Detections reference
-
-These are the main prediction objects returned by BatDetect2 inference methods.
-
-Defined in `batdetect2.postprocess.types`.
-
-## `ClipDetections`
-
-`ClipDetections` represents the predictions for one clip or one full recording.
-
-Fields:
-
- `clip`
-  - the `soundevent` clip metadata for the processed audio.
- `detections`
-  - list of `Detection` objects for that clip.
-
-## `Detection`
-
-`Detection` represents one detected event.
-
-Fields:
-
- `geometry`
-  - time-frequency geometry for the detected event.
- `detection_score`
-  - confidence that there is an event at this location.
- `class_scores`
-  - class ranking scores for the detected event.
- `features`
-  - per-detection feature vector from the model.
-
-## Related pages
-
- Python tutorial:
-  {doc}`../tutorials/integrate-with-a-python-pipeline`
- API reference:
-  {doc}`api`
- What BatDetect2 predicts:
-  {doc}`../explanation/what-batdetect2-predicts`
- Features and embeddings:
-  {doc}`../explanation/extracted-features-and-embeddings`
--- a/docs/source/reference/index.md
+++ b/docs/source/reference/index.md
@ -10,10 +10,6 @@ details, or Python API entries.

 cli/index
 api
-detections
-model-config
-training-config
-logging-config
 inference-config
 evaluation-config
 outputs-config
--- a/docs/source/reference/logging-config.md
+++ b/docs/source/reference/logging-config.md
@ -1,46 +0,0 @@
-# Logging config reference
-
-`AppLoggingConfig` controls which logger backend BatDetect2 uses for training,
-evaluation, and inference.
-
-Defined in `batdetect2.logging`.
-
-## Top-level fields
-
- `train`
-  - logger config for training runs.
- `evaluation`
-  - logger config for evaluation runs.
- `inference`
-  - logger config for inference runs.
-
-## Built-in logger backends
-
-Current built-in logger backends are:
-
- `csv`
- `tensorboard`
- `mlflow`
- `dvclive`
-
-## Default behaviour
-
-By default:
-
- training uses `csv`,
- evaluation uses `csv`,
- inference uses `csv`.
-
-With the CSV logger, training writes a `metrics.csv` file in the log folder.
-
-Example files live under `example_data/configs/`, including
-`example_data/configs/logging.yaml`.
-
-## Related pages
-
- Train command reference:
-  {doc}`cli/train`
- Evaluate command reference:
-  {doc}`cli/evaluate`
- Run inference on a folder:
-  {doc}`../tutorials/run-inference-on-folder`
--- a/docs/source/reference/model-config.md
+++ b/docs/source/reference/model-config.md
@ -1,37 +0,0 @@
-# Model config reference
-
-`ModelConfig` defines the model stack used for training or fresh model
-construction.
-
-Defined in `batdetect2.models`.
-
-## Top-level fields
-
- `samplerate`
-  - expected input sample rate.
- `architecture`
-  - backbone network settings.
- `preprocess`
-  - spectrogram preprocessing settings.
- `postprocess`
-  - decoding and output filtering settings.
-
-## What this config controls
-
-Use `ModelConfig` when you want to change things like:
-
- the backbone architecture,
- the spectrogram settings used by the model,
- postprocessing settings stored with the model.
-
-Example files live under `example_data/configs/`, including
-`example_data/configs/model.yaml`.
-
-## Related pages
-
- Preprocessing config:
-  {doc}`preprocessing-config`
- Postprocess config:
-  {doc}`postprocess-config`
- Train command reference:
-  {doc}`cli/train`
--- a/docs/source/reference/output-formats.md
+++ b/docs/source/reference/output-formats.md
@ -47,29 +47,17 @@ Writes a prediction-set JSON file.

 Defined by `BatDetect2OutputConfig`.

-This is the legacy-compatible BatDetect2 formatter.
+This is the legacy BatDetect2-style JSON output.

 Key fields:

 - `event_name`
 - `annotation_note`
- `write_detection_csv`
- `write_cnn_features_csv`
- `save_if_empty`
- `preserve_audio_tree`
- `include_file_path`

-By default it writes one `.json` file and one detection `.csv` file per
-recording, preserving the input audio directory layout under the output root.
-
-It can also write legacy `_cnn_features.csv` sidecars when
-`write_cnn_features_csv` is enabled.
+Writes one `.json` file per recording.

 ## Related pages

- Outputs config:
-  {doc}`outputs-config`
- Save predictions in different output formats:
-  {doc}`../how_to/save-predictions-in-different-output-formats`
- Understanding formatted outputs:
-  {doc}`../explanation/interpreting-formatted-outputs`
+- Outputs config: {doc}`outputs-config`
+- Save predictions in different output formats: {doc}`../how_to/save-predictions-in-different-output-formats`
+- Understanding formatted outputs: {doc}`../explanation/interpreting-formatted-outputs`
--- a/docs/source/reference/outputs-config.md
+++ b/docs/source/reference/outputs-config.md
@ -24,18 +24,10 @@ The output workflow is:

 ## Default behavior

-By default, the current stack uses the raw output formatter unless you override
-it.
-
-For CLI processing commands, omitting `--format` now leaves format selection to
-the loaded outputs config.
-If no outputs config is provided, the CLI still uses its command defaults.
+By default, the current stack uses the raw output formatter unless you override it.

 ## Related pages

- Output formats:
-  {doc}`output-formats`
- Output transforms:
-  {doc}`output-transforms`
- Save predictions in different output formats:
-  {doc}`../how_to/save-predictions-in-different-output-formats`
+- Output formats: {doc}`output-formats`
+- Output transforms: {doc}`output-transforms`
+- Save predictions in different output formats: {doc}`../how_to/save-predictions-in-different-output-formats`
--- a/docs/source/reference/training-config.md
+++ b/docs/source/reference/training-config.md
@ -1,50 +0,0 @@
-# Training config reference
-
-`TrainingConfig` controls the training loop, optimisation, data loading, losses,
-and validation tasks.
-
-Defined in `batdetect2.train.config`.
-
-## Top-level fields
-
- `train_loader`
-  - training data loading and clipping settings.
- `val_loader`
-  - validation data loading and clipping settings.
- `optimizer`
-  - optimiser type and learning rate settings.
- `scheduler`
-  - learning-rate schedule settings.
- `loss`
-  - detection, classification, and size loss settings.
- `trainer`
-  - PyTorch Lightning trainer settings such as `max_epochs`.
- `labels`
-  - target label generation settings.
- `validation`
-  - evaluation tasks used during validation.
- `checkpoints`
-  - checkpoint saving settings.
-
-## What this config controls
-
-Use `TrainingConfig` when you want to change things like:
-
- batch size,
- augmentation,
- optimiser and scheduler settings,
- number of epochs,
- validation frequency,
- checkpoint behaviour.
-
-Example files live under `example_data/configs/`, including
-`example_data/configs/training.yaml`.
-
-## Related pages
-
- Evaluation config:
-  {doc}`evaluation-config`
- Train command reference:
-  {doc}`cli/train`
- Fine-tune from a checkpoint:
-  {doc}`../how_to/fine-tune-from-a-checkpoint`
--- a/docs/source/tutorials/evaluate-on-a-test-set.md
+++ b/docs/source/tutorials/evaluate-on-a-test-set.md
@ -1,133 +1,92 @@
-# Evaluate on a test set
+# Tutorial: Evaluate on a test set

 This tutorial shows how to evaluate a trained checkpoint on a held-out dataset
 and inspect the output metrics.

-Use it when you want to measure how a model performs on labelled data that was
-kept aside for testing.
+This tutorial is for advanced users who want to measure how one trained model
+performs on a separate test dataset.

 ## Before you start

-You need:
-
- a test dataset config,
- a trained checkpoint or model alias.
+- A trained model checkpoint.
+- A test dataset config file.
+- (Optional) Targets, audio, inference, and evaluation config overrides.

 ```{note}
 This page is for model evaluation.
-If you only want to run BatDetect2 on recordings, start with
-{doc}`run-inference-on-folder` instead.
+If you only want to run BatDetect2 on recordings,
+start with {doc}`run-inference-on-folder` instead.
 ```

-## What you will do
+## Outcome

 By the end of this tutorial you will have:

- prepared a test dataset config,
 - run `batdetect2 evaluate`,
 - written evaluation metrics and result files,
- identified the next pages for model choice and evaluation configuration.
+- understood what to inspect first,
+- identified the next pages for evaluation concepts and configuration.

-## 1. Create a test dataset config
-
-Evaluation needs a dataset config that points to the labelled data you want to
-use for testing.
-
-This is the same kind of dataset config used for training.
-It explicitly declares which data sources BatDetect2 should read, including the
-audio files and their annotations.
-
-For an example, see `example_data/dataset.yaml`.
-
-If you need help creating the dataset config, follow the dataset section in
-{doc}`train-a-custom-model`.
-For more detail on dataset source formats, see {doc}`../reference/data-sources`.
+## 1. Start with a held-out dataset

 Use a dataset that was not used for training or tuning.

+A held-out dataset is simply a separate dataset kept aside for evaluation.
+
+If you tune thresholds or configs on the same dataset that you report as the
+final evaluation, the reported metrics will be optimistically biased.
+
 ## 2. Run evaluation

-For a simple run, use:
-
-```bash
-batdetect2 evaluate \
-  path/to/test_dataset.yaml
-```
-
-If you do not pass `--model`, BatDetect2 uses the built-in default UK model.
-If you want to choose a different checkpoint, alias, or Hugging Face model, see
-{doc}`../how_to/choose-a-model`.
-
-If you want to save the results somewhere else, add `--output-dir`:
-
 ```bash
 batdetect2 evaluate \
  path/to/test_dataset.yaml \
  --model path/to/model.ckpt \
+  --base-dir path/to/project_root \
  --output-dir path/to/eval_outputs
 ```

-This command loads the model, runs prediction on the test dataset, applies the
-evaluation tasks, and writes the results to the output directory.
+This command loads the checkpoint, runs prediction on the test dataset, applies
+the chosen evaluation tasks, and writes metrics and result files to the output
+directory.

-## 3. Check the output files
+Use `--base-dir` whenever the dataset config contains relative paths.

-By default, the CLI writes evaluation outputs to `outputs/evaluation`.
+That is the common case for project-local dataset files.

-With the default evaluation config, a run will usually create a folder like
-this:
+## 3. Inspect the output directory

-```text
-outputs/evaluation/
-  version_0/
-    metrics.csv
-    hparams.yaml
-```
+Look for:

-The most important file is `metrics.csv`.
-It contains the metric values computed for the evaluation run.
+- summary metrics,
+- generated plots,
+- saved prediction files if they were enabled,
+- enough metadata to reproduce the run later.

-A file like this might start like:
+The exact set depends on the configured evaluation tasks and plots.

-```csv
-classification/average_precision/barbar,classification/average_precision/cneser,...,detection/average_precision
-0.898695170879364,0.9408193826675415,...,0.851219117641449
-```
+## 4. Interpret the results in context

-The exact columns depend on the evaluation tasks you run.
+Do not reduce evaluation to a single number.

-The `hparams.yaml` file records the config used for the evaluation run.
+Check:

-## 4. Expect extra plots and files when configs enable them
+- which task the metric belongs to,
+- which thresholding or matching assumptions were used,
+- whether class-level behavior matches your use case,
+- whether the failures are concentrated in specific taxa, sites, or recording
+  conditions.

-You may also see extra outputs such as plots and saved predictions.
+## 5. Record the evaluation setup

-For example, if you run evaluation with `example_data/configs/evaluation.yaml`,
-you should expect a richer output folder with:
+Keep the command, config files, checkpoint path, and dataset version together.

- `metrics.csv`
- `hparams.yaml`
- a `plots/` directory
- a `predictions/` directory
+That matters for reproducibility and for later model comparisons.

-That config enables more evaluation tasks and plots than the default setup.
+## What to do next

-So, depending on your evaluation config, you may see files such as:
-
- precision-recall plots,
- ROC curves,
- confusion matrices,
- example detection plots,
- saved prediction files.
-
-If you want to control which tasks run and which plots are generated, see
-{doc}`../reference/evaluation-config` and
-{doc}`../how_to/choose-and-configure-evaluation-tasks`.
-
-## Common next steps
-
- Choose a different model:
-  {doc}`../how_to/choose-a-model`
+- Compare thresholds on representative files:
+  {doc}`../how_to/tune-detection-threshold`
 - Configure evaluation tasks:
  {doc}`../how_to/choose-and-configure-evaluation-tasks`
 - Interpret evaluation artifacts:
--- a/docs/source/tutorials/index.md
+++ b/docs/source/tutorials/index.md
@ -1,14 +1,12 @@
 # Tutorials

-Welcome to the `batdetect2` tutorials.
+Tutorials are the default learning path.

-These tutorials walk you step by step through the most common use cases and
-workflows.
-They follow the simplest route and are a good place to start with `batdetect2`.
+Each tutorial follows one recommended route from start to finish.

-Use {doc}`../how_to/index` for focused guides on specific tasks, or
-{doc}`../explanation/index` if you want to understand the concepts in more
-depth.
+Use tutorials when you want the simplest route to a concrete outcome.
+
+Use {doc}`../how_to/index` when you need to customize a workflow.

 ```{toctree}
 :maxdepth: 1
--- a/docs/source/tutorials/integrate-with-a-python-pipeline.md
+++ b/docs/source/tutorials/integrate-with-a-python-pipeline.md
@ -1,50 +1,62 @@
-# Integrate with a Python pipeline
+# Tutorial: Integrate with a Python pipeline

-This tutorial shows a simple Python workflow for loading audio, running BatDetect2, and inspecting the detections.
+This tutorial shows a minimal Python workflow for loading audio, running
+BatDetect2, and collecting detections for downstream analysis.

-Use it when you want to work directly in Python rather than through the CLI.
+This tutorial is for people who already want to work in Python.

-If you mainly want to run the model on recordings, start with {doc}`run-inference-on-folder` instead.
+If you mainly want to run the model on recordings,
+start with {doc}`run-inference-on-folder` instead.

 ## Before you start

-You need:
+- BatDetect2 installed in your Python environment.
+- A model checkpoint.
+- At least one input audio file.

- BatDetect2 installed in your Python environment,
- at least one input audio file.
+```{note}
+This page is more technical than the standard first-run tutorial.
+You do not need this page for a normal first use of BatDetect2.
+```

-## What you will do
+If you are working from this repository checkout, you can start with:
+
+```text
+src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
+```
+
+## Outcome

 By the end of this tutorial you will have:

 - created a `BatDetect2API` object,
 - run inference on one file,
- inspected detections, scores, and features,
- used lower-level audio and spectrogram methods for more control,
- identified the next API workflows for batch processing, training, fine-tuning, and evaluation.
+- inspected the top class, class-score list, and detection score,
+- identified where to go next for feature extraction, saving predictions, and batch workflows.

 ## 1. Create the API instance

-For a first run, use the built-in default UK model:
+Load the checkpoint once and reuse the API object for multiple files.

 ```python
-from batdetect2 import BatDetect2API
+from pathlib import Path

-# If you don't specify a checkpoint the default model will be loaded
-api = BatDetect2API.from_checkpoint()
+from batdetect2.api_v2 import BatDetect2API
+
+api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
 ```

-If you want to use a different checkpoint later, see {doc}`../how_to/choose-a-model`.
-
 ## 2. Run inference on one file

 `process_file` is the simplest Python entry point when you want one prediction object per recording.

 ```python
-from batdetect2 import BatDetect2API
+from pathlib import Path

-api = BatDetect2API.from_checkpoint()
-prediction = api.process_file("path/to/audio.wav")
+from batdetect2.api_v2 import BatDetect2API
+
+api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
+prediction = api.process_file(Path("path/to/audio.wav"))

 for detection in prediction.detections:
    top_class = api.get_top_class_name(detection)
@ -52,34 +64,21 @@ for detection in prediction.detections:
    print(top_class, score)
 ```

-## 3. Understand the prediction objects
-
 `prediction` is a `ClipDetections` object.
-See {doc}`../reference/detections` for the full reference.

-Very briefly, `ClipDetections` represents all detections for one processed clip or recording.
-It includes:
+It contains:

 - the clip metadata,
- the list of detections for that clip.
+- a list of detections,
+- a box for each detected event,
+- one detection score per event,
+- a full list of class scores per event,
+- a feature vector per event.

-Each item in `prediction.detections` is a `Detection` object.
+## 3. Inspect class scores, not just the top class

-Each `Detection` includes:
-
- the time-frequency geometry of the event,
- a detection score,
- the class scores,
- a feature vector.
-
-## 4. Inspect detection score and class scores
-
-The detection score and the class scores answer different questions.
-
- `detection_score` is about whether the model thinks there is a call at that time-frequency location.
- `class_scores` are about which class the model prefers for that detected event.
-
-So a detection can have a fairly strong detection score, but still have a more uncertain class ranking.
+If you are exploring results,
+it is often useful to inspect the full ranked class-score list.

 ```python
 for detection in prediction.detections:
@ -90,71 +89,30 @@ for detection in prediction.detections:
        print(f"  {class_name}: {score:.3f}")
 ```

-If you want more detail on class-score inspection, see {doc}`../how_to/inspect-class-scores-in-python`.
+This helps separate two different questions:

-## 5. Inspect the detection features
+- "Did the model think there was a call here?"
+- "If there was a call, which class did it score highest?"

-Each detection also carries a `features` vector.
+## 4. Keep the first workflow small

-These are internal model features attached to the detection.
-They can be useful for things like:
+Before scaling up, run the API on a few representative files and inspect the results manually.

- exploratory visualisation,
- clustering similar detections,
- comparing detections across files,
- building downstream analysis pipelines.
+This catches path issues and obviously implausible outputs early.

-They are useful descriptors, but they are not direct ecological labels by themselves.
+## 5. Move to the right next workflow

-For more detail, see {doc}`../how_to/inspect-detection-features-in-python` and {doc}`../explanation/extracted-features-and-embeddings`.
+Once the single-file path is working, choose the next page based on what you need:

-## 6. Use lower-level audio and spectrogram methods for more control
+- save predictions to disk,
+- inspect class scores more carefully,
+- inspect detection features,
+- process many files in one run.

-If you want finer control over what gets processed and when, the API also lets you work step by step.
+## What to do next

-For example, you can load the audio yourself, inspect the waveform length, generate the spectrogram, and then run detection on that spectrogram:
-
-```python
-from batdetect2 import BatDetect2API
-
-api = BatDetect2API.from_checkpoint()
-
-audio = api.load_audio("path/to/audio.wav")
-print(audio.shape)
-
-spec = api.generate_spectrogram(audio)
-print(spec.shape)
-
-detections = api.process_spectrogram(spec)
-print(len(detections))
-```
-
-This is helpful when you want to:
-
- inspect the loaded audio before inference,
- inspect the generated spectrogram,
- control which audio segment is processed,
- run only part of the pipeline in custom code.
-
-You can also call `process_audio(audio)` directly if you already have the waveform array in memory.
-
-## 7. Use the wider API workflows
-
-The Python API is not only for single-file inference.
-It also exposes methods for batch processing, training, evaluation, and fine-tuning.
-
-Examples:
-
- `process_files(...)` for batch processing from Python,
- `train(...)` for training,
- `evaluate(...)` for evaluation,
- `finetune(...)` for fine-tuning.
-
-Useful next pages:
-
- Choose a different model: {doc}`../how_to/choose-a-model`
- Run batch predictions: {doc}`../how_to/run-batch-predictions`
- Train a custom model: {doc}`train-a-custom-model`
- Evaluate on a test set: {doc}`evaluate-on-a-test-set`
- Fine-tune from a checkpoint: {doc}`../how_to/fine-tune-from-a-checkpoint`
 - API reference: {doc}`../reference/api`
+- Inspect ranked class scores: {doc}`../how_to/inspect-class-scores-in-python`
+- Inspect detection features: {doc}`../how_to/inspect-detection-features-in-python`
+- Save predictions to disk: {doc}`../how_to/save-predictions-in-different-output-formats`
+- Learn the CLI happy path: {doc}`run-inference-on-folder`
--- a/docs/source/tutorials/run-inference-on-folder.md
+++ b/docs/source/tutorials/run-inference-on-folder.md
@ -1,217 +1,120 @@
-# Run BatDetect2 on a folder of audio files
+# Tutorial: Run BatDetect2 on a folder of audio files

-This tutorial shows how to run BatDetect2 on a folder of recordings from the command line.
+This tutorial walks through a first end-to-end inference run with the CLI.

-Use it when you want a first pass over a folder of audio recordings and want to see what BatDetect2 finds.
+It is the default starting point for new users.

-If you want to follow the tutorial exactly, you can use the example recordings that come with the repository.
+Use it when you want to run an existing model on a folder of recordings and
+quickly check what BatDetect2 found.

 ## Before you start

-You need:
+- BatDetect2 installed in your environment.
+- A folder containing `.wav` files.
+- A model checkpoint path.

- BatDetect2 installed.
- A folder containing supported audio files.
- A place to save the results.
+A checkpoint is the saved model file that BatDetect2 uses to make predictions.

-If you have not installed BatDetect2 yet, start with {doc}`../getting_started`.
+If you are working from this repository checkout, you can use:

-## Optional: use the repository example files
-
-If you want to follow the steps with the same paths shown here, clone the repository and move into it:
-
-```bash
-git clone https://github.com/macaodha/batdetect2.git
-cd batdetect2
+```text
+src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
 ```

-Then you can use these example paths from the repository root.
-
-## What you will do
+## Outcome

 By the end of this tutorial you will have:

 - run `batdetect2 process directory`,
 - saved predictions to disk,
- checked that BatDetect2 wrote the files you expected,
- tried a second run with a higher detection threshold,
- identified the next pages to use if you want to customise the run.
+- checked that BatDetect2 wrote output files,
+- identified the next pages to use for tuning or customization.

-## 1. Choose your input and output folders
+## 1. Choose your input and output paths

-Pick:
+Pick three paths:

- the folder containing your audio files,
- an output folder where BatDetect2 should save results.
+- the checkpoint to use,
+- the directory containing your audio files,
+- an output directory where BatDetect2 will save its results.

 Example layout:

 ```text
 project/
+  model.pth.tar
  audio/
    file_001.wav
    file_002.wav
  outputs/
 ```

-If `outputs/` does not exist yet, that is fine.
-BatDetect2 can create it.
+## 2. Run processing on the directory

-If you are using the repository example files, your layout already looks like this:
-
-```text
-batdetect2/
-  example_data/
-    audio/
-      20170701_213954-MYOMYS-LR_0_0.5.wav
-      20180530_213516-EPTSER-LR_0_0.5.wav
-      20180627_215323-RHIFER-LR_0_0.5.wav
-```
-
-## 2. Run BatDetect2 on the folder
-
-For a first run, use the built-in default UK model:
+Use this command when you want BatDetect2 to scan a folder of recordings
+automatically.

 ```bash
 batdetect2 process directory \
-  path/to/audio \
+  path/to/model.pth.tar \
+  path/to/audio_dir \
  path/to/outputs
 ```

-If you are using the repository example files, run:
-
-```bash
-batdetect2 process directory \
-  example_data/audio \
-  example_outputs/first_run
-```
-
 What this does:

- looks for supported audio files in `path/to/audio`,
- runs the model on each recording,
- saves the results in `path/to/outputs`.
+- loads the checkpoint,
+- finds audio files in `audio_dir`,
+- splits recordings into smaller pieces internally when needed,
+- saves result files to `outputs`.

-You do not need to choose a model for this first run.
-If you do nothing, BatDetect2 uses the built-in default UK model.
+## 3. Verify that outputs were written

-If you want to use a different model later, see {doc}`../how_to/choose-a-model`.
+After the command completes, inspect the output directory.

-## 3. Check the output files
+For a first run, the important check is simple:

-After the command finishes, look in your output folder.
+- did BatDetect2 create result files,
+- are they in the output directory you expected,
+- did it process the recordings you meant to analyze.

-By default, the CLI writes predictions in the `batdetect2` output format.
-This is a JSON-based format used for BatDetect2-style outputs.
+Different workflows can save results in different file formats.

-With the default settings, you will usually see one `.json` file and one `_detections.csv` file per recording.
+You do not need to learn those details for the first run.

-For the repository example run, that means files like:
+If you later need to choose a specific output format, go to
+{doc}`../how_to/save-predictions-in-different-output-formats`.

-```text
-example_outputs/first_run/
-  20170701_213954-MYOMYS-LR_0_0.5.wav.json
-  20170701_213954-MYOMYS-LR_0_0.5.wav_detections.csv
-  20180530_213516-EPTSER-LR_0_0.5.wav.json
-  20180530_213516-EPTSER-LR_0_0.5.wav_detections.csv
-  20180627_215323-RHIFER-LR_0_0.5.wav.json
-  20180627_215323-RHIFER-LR_0_0.5.wav_detections.csv
-```
+## 4. Inspect predictions

-One of the JSON files will look roughly like this:
+Start with a small subset of representative files.

-```json
-{
-  "annotated": false,
-  "annotation": [
-    {
-      "class": "Rhinolophus ferrumequinum",
-      "class_prob": 0.889,
-      "det_prob": 0.889,
-      "end_time": 0.0668,
-      "event": "Echolocation",
-      "high_freq": 84857,
-      "individual": "-1",
-      "low_freq": 67578,
-      "start_time": 0.0
-    }
-  ]
-}
-```
+Check:

-Very briefly:
+- whether detections were written for the expected recordings,
+- whether output counts are plausible,
+- whether the model is obviously too sensitive or too conservative,
+- whether the predicted classes look broadly reasonable for your data.

- `annotated: false` means this is a prediction file, not a reviewed annotation file.
- `annotation` holds the list of detections.
- Each detection includes a predicted class, detection score, class score, time bounds, and frequency bounds.
+Do not treat the first run as validated ecological output.

-For more detail, see {doc}`../explanation/interpreting-formatted-outputs`.
-If you want to save results in another format, see {doc}`../how_to/save-predictions-in-different-output-formats`.
+The first run is a workflow check.

-## 4. Run the same folder with a higher threshold
+Validation comes next.

-If you want, you can also run the same folder again with a higher detection threshold and save that run in a separate output folder.
+## 5. Tune only after you have a baseline

-```bash
-batdetect2 process directory \
-    path/to/audio \
-    path/to/outputs_threshold_05 \
-    --detection-threshold 0.5
-```
+If the first run is too noisy or misses obvious calls, tune thresholds on a
+reviewed subset rather than changing settings blindly across the full dataset.

-Concrete example:
+Use {doc}`../how_to/tune-detection-threshold` for that process.

-```bash
-batdetect2 process directory \
-    example_data/audio \
-    example_outputs/threshold_05 \
-    --detection-threshold 0.5
-```
+## What to do next

-Keeping this in a separate folder makes it easy to compare runs later.
-
-## 5. Run the model on a list of recordings
-
-If you only want to process selected recordings, use `file_list`.
-The list file should contain one recording path per line.
-
-Example `audio_files.txt`:
-
-```text
-path/to/audio/file_001.wav
-path/to/audio/file_002.wav
-path/to/audio/file_010.wav
-```
-
-Repository example:
-
-```text
-example_data/audio/20170701_213954-MYOMYS-LR_0_0.5.wav
-example_data/audio/20180530_213516-EPTSER-LR_0_0.5.wav
-```
-
-Then run:
-
-```bash
-batdetect2 process file_list \
-    path/to/audio_files.txt \
-    path/to/selected_outputs
-```
-
-Concrete example:
-
-```bash
-batdetect2 process file_list \
-    example_data/audio_files.txt \
-    example_outputs/selected_outputs
-```
-
-This is useful when your recordings are spread across folders, or when you only want to run a chosen subset.
-
-## Common next steps
-
- If your recordings are not all in one folder, or you want to compare input modes, see {doc}`../how_to/choose-an-inference-input-mode`.
- If you want to save results in another format, see {doc}`../how_to/save-predictions-in-different-output-formats`.
- If you want to choose a different model, see {doc}`../how_to/choose-a-model`.
- If you already write code and want more control from Python, see {doc}`integrate-with-a-python-pipeline`.
- If you want the full command reference, including `--model`, see {doc}`../reference/cli/predict`.
+- If you need a different input mode, use
+  {doc}`../how_to/choose-an-inference-input-mode`.
+- If you want to tune sensitivity, use
+  {doc}`../how_to/tune-detection-threshold`.
+- If you already write code and want more control from Python, use
+  {doc}`integrate-with-a-python-pipeline`.
+- If you need full command details, use {doc}`../reference/cli/predict`.
--- a/docs/source/tutorials/train-a-custom-model.md
+++ b/docs/source/tutorials/train-a-custom-model.md
@ -1,208 +1,85 @@
-# Train a custom model
+# Tutorial: Train a custom model

-This tutorial walks through a first custom training run using your own annotations.
+This tutorial walks through a first custom training run using your own
+annotations.

-Use it when you already have labelled recordings and want to train a model for your own data.
+This tutorial is for advanced users who already have dataset files and want to train a model on their own annotated data.

 ## Before you start

-You need:
-
 - BatDetect2 installed.
- labelled recordings and annotations.
+- A training dataset config file.
+- (Optional) A validation dataset config file.
+- A targets config file if you are not using the default target setup.
+- A model config file if you are not training from the built-in defaults.

 ```{note}
-This is not the first page to start with if you only want to run the existing
-model on recordings.
+This is not the first page to start with if you only want to run the existing model on recordings.
 Use {doc}`run-inference-on-folder` for that.
 ```

-## Optional: use the repository example files
-
-If you want to follow the steps with the same files shown here, clone the repository and move into it:
-
-```bash
-git clone https://github.com/macaodha/batdetect2.git
-cd batdetect2
-```
-
-## What you will do
+## Outcome

 By the end of this tutorial you will have:

- created a dataset config,
- defined a targets config,
 - started a training run,
- checked the checkpoint and log outputs,
- identified the next pages for evaluation and customisation.
+- written checkpoints and logs,
+- understood the minimum settings involved,
+- identified the next pages for fine-tuning and evaluation.

-## 1. Create a dataset config
+## 1. Gather the minimum required inputs

-The dataset config explicitly declares what data you want to use for training.
-It is a YAML file.
-If YAML is new to you, see [Learn YAML in Y Minutes](https://learnxinyminutes.com/yaml/).
+At minimum, a custom training run needs:

-In the dataset config, you list one or more data sources.
-Each source tells `batdetect2` where the audio recordings live and where the matching annotations are stored.
+- a training dataset config,
+- an optional validation dataset config,
+- either a model config for a fresh run or a checkpoint for continued training,
+- optional settings files for targets, audio, training, evaluation, inference, outputs, and logging.

-BatDetect2 can read annotations from different source formats.
-In this example, we use the example data in the `batdetect2` format.
+The most important point is that the dataset file, target definitions, and preprocessing choices need to agree with each other.

-Use `example_data/dataset.yaml` as a reference:
+## 2. Run a first training command

-```yaml
-name: example dataset
-description: Only for demonstration purposes
-sources:
-  - format: batdetect2
-    name: Example Data
-    description: Examples included for testing batdetect2
-    annotations_dir: example_data/anns
-    audio_dir: example_data/audio
-```
-
-For your own project, the main thing to change is the file paths.
-If you have several collections of recordings, you can add more than one source to the same dataset config.
-That lets you describe the full training data you want to use in one place.
-
-If you need more detail on dataset source formats, see {doc}`../reference/data-sources`.
-
-## 2. Define a targets config
-
-The targets config tells BatDetect2 how to turn your annotations into training targets.
-
-It defines two main things:
-
- what should count as a detection,
- which classes the model should learn to predict.
-
-In practice, this means the targets config maps the labels in your annotations to the detection and classification outputs used during training.
-
-Use `example_data/targets.yaml` as a reference:
-
-```yaml
-detection_target:
-  name: bat
-  match_if:
-    name: all_of
-    conditions:
-      - name: has_tag
-        tag: { key: event, value: Echolocation }
-      - name: not
-        condition:
-          name: has_tag
-          tag: { key: class, value: Unknown }
-  assign_tags:
-    - key: class
-      value: Bat
-
-classification_targets:
-  - name: myomys
-    tags:
-      - key: class
-        value: Myotis mystacinus
-  - name: pippip
-    tags:
-      - key: class
-        value: Pipistrellus pipistrellus
-```
-
-For your own project, update the matching rules and class definitions so they fit your labels.
-
-In this example:
-
-- `detection_target` says that echolocation calls should be treated as detections,
-- `classification_targets` define the classes the model should predict,
-
-It is worth taking a bit of time over this file, because your targets config decides what the model is actually being asked to learn.
-
-If you need help with that, see {doc}`../how_to/configure-target-definitions` and {doc}`../reference/targets-config-workflow`.
-
-## 3. Run a first training command
-
-For a first run, keep the command simple:
+Use a command like this for a fresh run:

 ```bash
 batdetect2 train \
  path/to/train_dataset.yaml \
  --val-dataset path/to/val_dataset.yaml \
-  --targets path/to/targets.yaml
+  --targets path/to/targets.yaml \
+  --model-config path/to/model.yaml \
+  --training-config path/to/training.yaml
 ```

-If you are using the repository example files, run:
+Use `--model` instead of `--model-config` when you want to continue from an existing checkpoint.

-```bash
-batdetect2 train \
-  example_data/dataset.yaml \
-  --val-dataset example_data/dataset.yaml \
-  --targets example_data/targets.yaml
-```
+## 3. Check that outputs are being written

-This uses the same dataset for training and validation only to keep the example simple.
-For real training runs, you usually want separate training and validation datasets.
+After the command starts, verify that:

-This uses the built-in default model and training settings.
-If you want to change the model architecture later, see {doc}`../reference/model-config`.
-If you want to change optimiser settings, batch size, epochs, or checkpoint behaviour, see {doc}`../reference/training-config`.
+- the run initializes without configuration errors,
+- checkpoints are written to the checkpoint directory,
+- logs are written to the log directory or configured logger backend,
+- the training and validation datasets load as expected.

-## 4. Check the training outputs
+## 4. Run a sanity inference pass after training

-After the run starts, `batdetect2` should write checkpoints and logs.
+Do not wait until full evaluation to confirm that the trained checkpoint behaves sensibly.

-By default, training logs are written with the CSV logger.
-That means you should see a log folder with a `metrics.csv` file.
+Take a small reviewed subset of recordings and run a quick prediction pass with the new checkpoint.

-A typical layout looks like this:
+That catches setup mismatches early, especially around targets and preprocessing.

-```text
-outputs/
-  checkpoints/
-    epoch=19-step=20.ckpt
-  logs/
-    version_0/
-      metrics.csv
-      hparams.yaml
-    training_artifacts/
-      train_dataset.yaml
-      val_dataset.yaml
-      targets.yaml
-      train_class_summary.csv
-      val_class_summary.csv
-```
+## 5. Evaluate on held-out data

-The checkpoint is the trained model you can use later for inference, evaluation, or sharing with someone else.
+Once the checkpoint looks sensible on a small sanity subset, run the formal evaluation workflow on a held-out test set.

-The files in `training_artifacts/` record which datasets and targets were used for the run.
-The `hparams.yaml` file records the full training setup, including the configs used for the model, training, and other parts of the run.
+That is where you should compare models, thresholds, and task-level performance metrics.

-The `metrics.csv` file stores one row per validation epoch.
-It includes training losses as well as validation losses and metrics such as:
-
-```csv
-classification/mean_average_precision,detection/average_precision,epoch,total_loss/val
-0.10041624307632446,0.3697187900543213,0,4070.3515625
-0.11328697204589844,0.346899151802063,1,3941.6455078125
-0.1388484090566635,0.36171725392341614,2,3776.323974609375
-```
-
-You may also see class-specific metrics in extra columns.
-
-The more detailed metrics are computed from the validation set.
-If you do not provide `--val-dataset`, those validation metrics will not appear.
-
-Other logger backends are also supported, including TensorBoard, MLflow, and DVCLive.
-See {doc}`../reference/logging-config` if you want to change that.
-
-## Use the trained model
-
-You can now use the trained checkpoint in BatDetect2, or share it with someone else to use in their own runs.
-If you want to load it for inference or evaluation, see {doc}`../how_to/choose-a-model`.
-
-## Common next steps
+## What to do next

 - Evaluate the trained checkpoint: {doc}`evaluate-on-a-test-set`
 - Fine-tune from a checkpoint: {doc}`../how_to/fine-tune-from-a-checkpoint`
-- Configure targets in more detail: {doc}`../how_to/configure-target-definitions`
-- Configure audio preprocessing: {doc}`../how_to/configure-audio-preprocessing`
-- Configure spectrogram preprocessing: {doc}`../how_to/configure-spectrogram-preprocessing`
+- Configure targets: {doc}`../how_to/configure-target-definitions`
+- Configure preprocessing: {doc}`../how_to/configure-audio-preprocessing`
 - Check full train options: {doc}`../reference/cli/train`
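Note on step 4 of the tutorial above ("Run a sanity inference pass after training"): a minimal sketch of such a pass, using only calls that appear elsewhere in this change set (`BatDetect2API.from_checkpoint`, `process_file`, `save_predictions`, `load_predictions`). The helper name `run_sanity_pass` and its arguments are hypothetical, and the sketch assumes the config arguments of `from_checkpoint` are optional; check the signatures against the installed version.

```python
from pathlib import Path


def run_sanity_pass(checkpoint: str, audio_files: list, output_dir: Path) -> int:
    """Predict on a small reviewed subset and return how many results load back.

    Hypothetical helper for illustration; it is not part of batdetect2.
    """
    # Imported lazily so the sketch can be read without batdetect2 installed.
    from batdetect2.api_v2 import BatDetect2API

    # Assumption: config arguments are optional when loading a checkpoint.
    api = BatDetect2API.from_checkpoint(checkpoint)

    # One prediction per reviewed recording.
    predictions = [api.process_file(audio_file) for audio_file in audio_files]

    # Write the results, then read them back to confirm the round trip.
    api.save_predictions(predictions, path=output_dir, format="batdetect2")
    loaded = api.load_predictions(output_dir, format="batdetect2")
    return len(loaded)
```

Comparing the returned count with the number of input recordings is a cheap first check before opening individual outputs.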
--- a/example_data/audio_files.txt
+++ b/example_data/audio_files.txt
@ -1,2 +0,0 @@
-example_data/audio/20170701_213954-MYOMYS-LR_0_0.5.wav
-example_data/audio/20180530_213516-EPTSER-LR_0_0.5.wav
--- a/pyproject.toml
+++ b/pyproject.toml
@ -37,10 +37,10 @@ classifiers = [
  "Intended Audience :: Science/Research",
  "Natural Language :: English",
  "Operating System :: OS Independent",
+  "Programming Language :: Python :: 3.9",
  "Programming Language :: Python :: 3.10",
  "Programming Language :: Python :: 3.11",
  "Programming Language :: Python :: 3.12",
-  "Programming Language :: Python :: 3.13",
  "Topic :: Scientific/Engineering :: Artificial Intelligence",
  "Topic :: Software Development :: Libraries :: Python Modules",
  "Topic :: Multimedia :: Sound/Audio :: Analysis",
--- a/run_batdetect.py
+++ b/run_batdetect.py
@ -0,0 +1,5 @@
+"""Run batdetect2.cli.detect() from the command line."""
+from batdetect2.cli import detect
+
+if __name__ == "__main__":
+    detect()
--- a/src/batdetect2/__init__.py
+++ b/src/batdetect2/__init__.py
@ -1,5 +1,4 @@
 import logging
-import warnings
 from typing import TYPE_CHECKING

 from loguru import logger
@ -7,18 +6,15 @@ from loguru import logger
 if TYPE_CHECKING:
    from batdetect2.api_v2 import BatDetect2API

-__all__ = ["BatDetect2API", "__version__"]
-__version__ = "1.1.1"
-
 logger.disable("batdetect2")

-# Silences the irrelevant warning
-warnings.filterwarnings("ignore", message="The pynvml package is deprecated")
-warnings.filterwarnings("ignore", message=".*isinstance(treespec, LeafSpec).*")

 numba_logger = logging.getLogger("numba")
 numba_logger.setLevel(logging.WARNING)

+__all__ = ["BatDetect2API", "__version__"]
+__version__ = "1.1.1"
+

 def __getattr__(name: str):
    if name == "BatDetect2API":
--- a/src/batdetect2/cli/inference.py
+++ b/src/batdetect2/cli/inference.py
@ -27,15 +27,6 @@ def process() -> None:
 def common_predict_options(func):
    """Attach options shared by all ``process`` subcommands."""

-    @click.option(
-        "--model",
-        "model_path",
-        type=str,
-        help=(
-            "Path to a checkpoint, checkpoint alias, or a Hugging Face "
-            "URI to fine-tune from. Defaults to uk_same"
-        ),
-    )
    @click.option(
        "--audio-config",
        type=click.Path(exists=True),
@ -86,8 +77,7 @@ def common_predict_options(func):
        type=str,
        help=(
            "Output format name used by the prediction writer. If omitted, "
-            "the loaded outputs config is used, or batdetect2 when no "
-            "outputs config is provided."
+            "the config default is used."
        ),
    )
    @click.option(
@ -107,7 +97,7 @@ def common_predict_options(func):


 def _build_api(
-    model_path: str | None,
+    model_path: str,
    audio_config: Path | None,
    inference_config: Path | None,
    outputs_config: Path | None,
@ -139,7 +129,7 @@ def _build_api(
    )

    api = BatDetect2API.from_checkpoint(
-        path=model_path,
+        model_path,
        audio_config=audio_conf,
        inference_config=inference_conf,
        outputs_config=outputs_conf,
@ -149,7 +139,7 @@ def _build_api(


 def _run_prediction(
-    model_path: str | None,
+    model_path: str,
    audio_files: list[Path],
    output_path: Path,
    audio_config: Path | None,
@ -160,7 +150,6 @@ def _run_prediction(
    num_workers: int,
    format_name: str | None,
    detection_threshold: float | None,
-    audio_dir: Path | None = None,
 ) -> None:
    logger.info("Initiating prediction process...")

@ -184,16 +173,11 @@ def _run_prediction(
        detection_threshold=detection_threshold,
    )

-    if audio_dir is None:
-        audio_dir = audio_files[0].parent if audio_files else None
-
-    if format_name is None and outputs_conf is None:
-        format_name = "batdetect2"
-
+    common_path = audio_files[0].parent if audio_files else None
    api.save_predictions(
        predictions,
        path=output_path,
-        audio_dir=audio_dir,
+        audio_dir=common_path,
        format=format_name,
    )

@ -206,11 +190,12 @@ def _run_prediction(
    name="directory",
    short_help="Process audio files in a directory.",
 )
+@click.argument("model_path", type=str)
 @click.argument("audio_dir", type=click.Path(exists=True))
 @click.argument("output_path", type=click.Path())
 @common_predict_options
 def predict_directory_command(
-    model_path: str | None,
+    model_path: str,
    audio_dir: Path,
    output_path: Path,
    audio_config: Path | None,
@ -242,7 +227,6 @@ def predict_directory_command(
        num_workers=num_workers,
        format_name=format_name,
        detection_threshold=detection_threshold,
-        audio_dir=audio_dir,
    )


@ -250,13 +234,14 @@ def predict_directory_command(
    name="file_list",
    short_help="Process paths listed in a text file.",
 )
+@click.argument("model_path", type=str)
 @click.argument("file_list", type=click.Path(exists=True))
 @click.argument("output_path", type=click.Path())
 @common_predict_options
 def predict_file_list_command(
+    model_path: str,
    file_list: Path,
    output_path: Path,
-    model_path: str | None,
    audio_config: Path | None,
    inference_config: Path | None,
    outputs_config: Path | None,
@ -297,13 +282,14 @@ def predict_file_list_command(
    name="dataset",
    short_help="Process recordings from a dataset config.",
 )
+@click.argument("model_path", type=str)
 @click.argument("dataset_path", type=click.Path(exists=True))
 @click.argument("output_path", type=click.Path())
 @common_predict_options
 def predict_dataset_command(
+    model_path: str,
    dataset_path: Path,
    output_path: Path,
-    model_path: str | None,
    audio_config: Path | None,
    inference_config: Path | None,
    outputs_config: Path | None,
--- a/src/batdetect2/logging.py
+++ b/src/batdetect2/logging.py
@ -104,7 +104,7 @@ LoggerConfig = Annotated[


 class AppLoggingConfig(BaseConfig):
-    train: LoggerConfig = Field(default_factory=CSVLoggerConfig)
+    train: LoggerConfig = Field(default_factory=TensorBoardLoggerConfig)
    evaluation: LoggerConfig = Field(default_factory=CSVLoggerConfig)
    inference: LoggerConfig = Field(default_factory=CSVLoggerConfig)

--- a/src/batdetect2/models/types.py
+++ b/src/batdetect2/models/types.py
@ -1,8 +1,7 @@
-from typing import TYPE_CHECKING, Any, NamedTuple, Protocol
+from typing import Any, NamedTuple, Protocol

 import torch

-if TYPE_CHECKING:
 from batdetect2.postprocess.types import PostprocessorProtocol
 from batdetect2.preprocess.types import PreprocessorProtocol

@ -117,8 +116,8 @@ class DetectorProtocol(ModuleProtocol, Protocol):

 class ModelProtocol(ModuleProtocol, Protocol):
    detector: DetectorProtocol
-    preprocessor: "PreprocessorProtocol"
-    postprocessor: "PostprocessorProtocol"
+    preprocessor: PreprocessorProtocol
+    postprocessor: PostprocessorProtocol
    class_names: list[str]
    dimension_names: list[str]

--- a/src/batdetect2/outputs/formats/base.py
+++ b/src/batdetect2/outputs/formats/base.py
@ -27,10 +27,6 @@ def make_path_relative(path: PathLike, audio_dir: PathLike) -> Path:

        return path.relative_to(audio_dir)

-    audio_parts = audio_dir.parts
-    if audio_parts and path.parts[: len(audio_parts)] == audio_parts:
-        return Path(*path.parts[len(audio_parts) :])
-
    return path


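Note on the `make_path_relative` hunk above: the manual prefix comparison on `parts` is removed, leaving `Path.relative_to` plus the unchanged-path fallback. The sketch below illustrates that remaining behaviour under one assumption: the condition guarding `relative_to` in the real function is not visible in this hunk, so a `try`/`except` stands in for it. The helper name `relative_or_unchanged` is hypothetical and this is not the library's implementation.

```python
from pathlib import Path


def relative_or_unchanged(path, audio_dir):
    """Return `path` relative to `audio_dir`, or `path` itself if it is not inside."""
    path, audio_dir = Path(path), Path(audio_dir)
    try:
        # relative_to raises ValueError when path does not start with audio_dir.
        return path.relative_to(audio_dir)
    except ValueError:
        # Paths outside audio_dir are returned unchanged.
        return path


inside = relative_or_unchanged("/data/audio/site1/a.wav", "/data/audio")
outside = relative_or_unchanged("/other/a.wav", "/data/audio")
```

Here `inside` keeps only the part below the audio directory, while `outside` is returned as given.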
--- a/src/batdetect2/outputs/formats/batdetect2.py
+++ b/src/batdetect2/outputs/formats/batdetect2.py
@ -1,9 +1,8 @@
 import json
 from pathlib import Path
-from typing import List, Literal, Sequence, TypedDict, cast
+from typing import List, Literal, Sequence, TypedDict

 import numpy as np
-import pandas as pd
 from soundevent import data
 from soundevent.geometry import compute_bounds

@ -14,6 +13,7 @@ from batdetect2.outputs.formats.base import (
 )
 from batdetect2.outputs.types import OutputFormatterProtocol
 from batdetect2.postprocess.types import ClipDetections, Detection
+from batdetect2.targets import terms
 from batdetect2.targets.types import TargetProtocol

 try:
@ -24,7 +24,7 @@ except ImportError:
 DictWithClass = TypedDict("DictWithClass", {"class": str})


-class Annotation(DictWithClass, total=False):
+class Annotation(DictWithClass):
    start_time: float
    end_time: float
    low_freq: float
@ -33,7 +33,6 @@ class Annotation(DictWithClass, total=False):
    det_prob: float
    individual: str
    event: str
-    cnn_features: NotRequired[list[float]]  # ty: ignore[invalid-type-form]


 class FileAnnotation(TypedDict):
@ -53,14 +52,6 @@ class BatDetect2OutputConfig(BaseConfig):

    event_name: str = "Echolocation"
    annotation_note: str = "Automatically generated."
-    class_label_mode: Literal["class_name", "decoded_tag"] = "decoded_tag"
-    decoded_label_key: str = "dwc:scientificName"
-    fallback_to_class_name: bool = True
-    write_detection_csv: bool = True
-    write_cnn_features_csv: bool = False
-    save_if_empty: bool = False
-    preserve_audio_tree: bool = True
-    include_file_path: bool = False


 class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
@ -69,26 +60,10 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
        targets: TargetProtocol,
        event_name: str,
        annotation_note: str,
-        class_label_mode: Literal["class_name", "decoded_tag"] = "decoded_tag",
-        decoded_label_key: str = "dwc:scientificName",
-        fallback_to_class_name: bool = True,
-        write_detection_csv: bool = True,
-        write_cnn_features_csv: bool = False,
-        save_if_empty: bool = False,
-        preserve_audio_tree: bool = True,
-        include_file_path: bool = False,
    ):
        self.targets = targets
        self.event_name = event_name
        self.annotation_note = annotation_note
-        self.class_label_mode = class_label_mode
-        self.decoded_label_key = decoded_label_key
-        self.fallback_to_class_name = fallback_to_class_name
-        self.write_detection_csv = write_detection_csv
-        self.write_cnn_features_csv = write_cnn_features_csv
-        self.save_if_empty = save_if_empty
-        self.preserve_audio_tree = preserve_audio_tree
-        self.include_file_path = include_file_path

    def format(
        self, predictions: Sequence[ClipDetections]
@ -109,57 +84,22 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
            path.mkdir(parents=True)

        for prediction in predictions:
-            annotations = prediction["annotation"]
+            pred_path = path / (prediction["id"] + ".json")

-            if not annotations and not self.save_if_empty:
-                continue
-
-            pred_path = self.get_output_path(prediction, path, audio_dir)
-            pred_path.parent.mkdir(parents=True, exist_ok=True)
-
-            # make a copy of the prediction
-            data = dict(prediction)
-
-            raw_file_path = data.get("file_path")
-            if audio_dir is not None and isinstance(raw_file_path, str):
-                data["file_path"] = str(
-                    make_path_relative(raw_file_path, audio_dir)
+            if audio_dir is not None and "file_path" in prediction:
+                prediction["file_path"] = str(
+                    make_path_relative(
+                        prediction["file_path"],
+                        audio_dir,
+                    )
                )

-            if not self.include_file_path:
-                data.pop("file_path", None)
-
-            annotations = cast(list[Annotation], data["annotation"])
-            data["annotation"] = [
-                {
-                    key: value
-                    for key, value in annotation.items()
-                    if key != "cnn_features"
-                }
-                for annotation in annotations
-            ]
-
-            pred_path.write_text(json.dumps(data, indent=2, sort_keys=True))
-
-            if self.write_detection_csv:
-                self.save_detection_csv(
-                    prediction,
-                    pred_path.with_suffix(".csv"),
-                )
-
-            if self.write_cnn_features_csv:
-                self.save_cnn_features_csv(
-                    prediction,
-                    pred_path.with_name(pred_path.stem + "_cnn_features.csv"),
-                )
+            pred_path.write_text(json.dumps(prediction))

    def load(self, path: data.PathLike) -> List[FileAnnotation]:
        path = Path(path)

-        if path.is_file():
-            files = [path] if path.suffix == ".json" else []
-        else:
-            files = sorted(path.rglob("*.json"))
+        files = list(path.glob("*.json"))

        if not files:
            return []
@ -168,121 +108,12 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
            json.loads(file.read_text()) for file in files if file.is_file()
        ]

-    def get_output_path(
-        self,
-        prediction: FileAnnotation,
-        output_dir: Path,
-        audio_dir: data.PathLike | None,
-    ) -> Path:
-        if (
-            self.preserve_audio_tree
-            and audio_dir is not None
-            and "file_path" in prediction
-        ):
-            relative_path = make_path_relative(
-                prediction["file_path"],
-                audio_dir,
-            )
-            return (
-                output_dir / relative_path.parent / f"{prediction['id']}.json"
-            )
-
-        return output_dir / f"{prediction['id']}.json"
-
-    def save_detection_csv(
-        self,
-        prediction: FileAnnotation,
-        path: Path,
-    ) -> None:
-        annotations = prediction["annotation"]
+    def get_recording_class(self, annotations: List[Annotation]) -> str:
        if not annotations:
-            return
+            return ""

-        preds_df = pd.DataFrame(annotations)[
-            [
-                "det_prob",
-                "start_time",
-                "end_time",
-                "high_freq",
-                "low_freq",
-                "class",
-                "class_prob",
-            ]
-        ]
-        preds_df.to_csv(path, sep=",")
-
-    def save_cnn_features_csv(
-        self, prediction: FileAnnotation, path: Path
-    ) -> None:
-        annotations = prediction["annotation"]
-
-        if not annotations:
-            return
-
-        cnn_features = [
-            annotation["cnn_features"]
-            for annotation in annotations
-            if "cnn_features" in annotation
-        ]
-
-        if not cnn_features:
-            return
-
-        cnn_feats_df = pd.DataFrame(
-            cnn_features,
-            columns=[str(ii) for ii in range(len(cnn_features[0]))],
-        )
-
-        cnn_feats_df.to_csv(
-            path,
-            sep=",",
-            index=False,
-            float_format="%.5f",
-        )
-
-    def get_class_name(self, class_index: int) -> str:
-        class_name = self.targets.class_names[class_index]
-
-        if self.class_label_mode == "class_name":
-            return class_name
-
-        tags = self.targets.decode_class(class_name)
-        default = class_name if self.fallback_to_class_name else None
-        decoded = data.find_tag_value(
-            tags,
-            key=self.decoded_label_key,
-            default=default,
-        )
-
-        if decoded is None:
-            raise ValueError(
-                "Could not decode class label using key "
-                f"{self.decoded_label_key!r} for class {class_name!r}."
-            )
-
-        return decoded
-
-    def get_recording_class(self, detections: Sequence[Detection]) -> str:
-        if not detections:
-            return "None"
-
-        class_scores = np.stack(
-            [detection.class_scores for detection in detections],
-            axis=1,
-        )
-        detection_scores = np.array(
-            [detection.detection_score for detection in detections],
-            dtype=np.float32,
-        )
-        weighted_scores = (class_scores * detection_scores).sum(axis=1)
-
-        total = weighted_scores.sum()
-
-        if total <= 0:
-            return "None"
-
-        top_class_index = int(np.argmax(weighted_scores / total))
-        return self.get_class_name(top_class_index)
+        highest_scoring = max(annotations, key=lambda x: x["class_prob"])
+        return highest_scoring["class"]

    def format_prediction(self, prediction: ClipDetections) -> FileAnnotation:
        recording = prediction.clip.recording
@ -292,19 +123,26 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
            for pred in prediction.detections
        ]

-        file_annotation = FileAnnotation(
+        return FileAnnotation(
            id=recording.path.name,
+            file_path=str(recording.path),
            annotated=False,
-            duration=round(float(recording.duration), 4),
+            duration=recording.duration,
            issues=False,
            time_exp=recording.time_expansion,
-            class_name=self.get_recording_class(prediction.detections),
+            class_name=self.get_recording_class(annotations),
            notes=self.annotation_note,
            annotation=annotations,
-            file_path=str(recording.path),
        )

-        return file_annotation
+    def get_class_name(self, class_index: int) -> str:
+        class_name = self.targets.class_names[class_index]
+        tags = self.targets.decode_class(class_name)
+        return data.find_tag_value(
+            tags,
+            term=terms.generic_class,
+            default=class_name,
+        )  # type: ignore

    def format_sound_event_prediction(
        self, prediction: Detection
@ -317,20 +155,16 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
        top_class_score = float(prediction.class_scores[top_class_index])
        top_class = self.get_class_name(top_class_index)
        annotation: Annotation = {
-            "start_time": round(float(start_time), 4),
-            "end_time": round(float(end_time), 4),
-            "low_freq": int(low_freq),
-            "high_freq": int(high_freq),
-            "class_prob": round(top_class_score, 3),
-            "det_prob": round(float(prediction.detection_score), 3),
-            "individual": "-1",
+            "start_time": start_time,
+            "end_time": end_time,
+            "low_freq": low_freq,
+            "high_freq": high_freq,
+            "class_prob": top_class_score,
+            "det_prob": float(prediction.detection_score),
+            "individual": "",
            "event": self.event_name,
            "class": top_class,
        }
-
-        if self.write_cnn_features_csv:
-            annotation["cnn_features"] = prediction.features.tolist()  # type: ignore[index]
-
        return annotation

    @output_formatters.register(BatDetect2OutputConfig)
@ -340,12 +174,4 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
            targets,
            event_name=config.event_name,
            annotation_note=config.annotation_note,
-            class_label_mode=config.class_label_mode,
-            decoded_label_key=config.decoded_label_key,
-            fallback_to_class_name=config.fallback_to_class_name,
-            write_detection_csv=config.write_detection_csv,
-            write_cnn_features_csv=config.write_cnn_features_csv,
-            save_if_empty=config.save_if_empty,
-            preserve_audio_tree=config.preserve_audio_tree,
-            include_file_path=config.include_file_path,
        )
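Note on `get_recording_class` in the diff above: the recording-level label changes from a detection-score-weighted vote over class scores to the class of the single annotation with the highest `class_prob`. The sketch below contrasts the two on plain dictionaries. `top_annotation_class` mirrors the added code; `weighted_vote_class` is a simplified stand-in for the removed logic (it uses only each annotation's top class rather than the full class-score vectors), and both function names and the sample values are hypothetical.

```python
def top_annotation_class(annotations):
    """New behaviour: class of the annotation with the highest class_prob."""
    if not annotations:
        return ""
    return max(annotations, key=lambda a: a["class_prob"])["class"]


def weighted_vote_class(annotations):
    """Simplified version of the removed behaviour: sum det_prob * class_prob per class."""
    totals = {}
    for a in annotations:
        weight = a["det_prob"] * a["class_prob"]
        totals[a["class"]] = totals.get(a["class"], 0.0) + weight
    if not totals or sum(totals.values()) <= 0:
        return "None"
    return max(totals, key=totals.get)


# Made-up values: one confident class score on a weak detection,
# two moderate class scores on strong detections.
annotations = [
    {"class": "Myotis mystacinus", "class_prob": 0.9, "det_prob": 0.2},
    {"class": "Pipistrellus pipistrellus", "class_prob": 0.6, "det_prob": 0.9},
    {"class": "Pipistrellus pipistrellus", "class_prob": 0.5, "det_prob": 0.8},
]

new_label = top_annotation_class(annotations)
old_style_label = weighted_vote_class(annotations)
```

With these values the two strategies disagree, which is worth keeping in mind when comparing recording-level `class_name` values across versions. The empty-recording label also changes from `"None"` to `""`.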
--- a/src/batdetect2/train/train.py
+++ b/src/batdetect2/train/train.py
@ -10,9 +10,9 @@ from soundevent import data
 from batdetect2.audio import AudioConfig, AudioLoader, build_audio_loader
 from batdetect2.evaluate import EvaluatorProtocol, build_evaluator
 from batdetect2.logging import (
-    CSVLoggerConfig,
    LoggerConfig,
    LoggingCallback,
+    TensorBoardLoggerConfig,
    build_logger,
 )
 from batdetect2.models import ModelConfig, build_model
@ -165,7 +165,7 @@ def run_train(
    )

    train_logger = build_logger(
-        logger_config or CSVLoggerConfig(),
+        logger_config or TensorBoardLoggerConfig(),
        log_dir=log_dir,
        experiment_name=experiment_name,
        run_name=run_name,
--- a/tests/test_api_v2/test_outputs_io.py
+++ b/tests/test_api_v2/test_outputs_io.py
@ -1,11 +1,8 @@
 from pathlib import Path
 from typing import cast
-from unittest.mock import Mock

 import numpy as np
-import pandas as pd
 import pytest
-from soundevent import data as soundevent_data

 from batdetect2.api_v2 import BatDetect2API
 from batdetect2.outputs import build_output_formatter
@ -13,7 +10,6 @@ from batdetect2.outputs.formats import (
    BatDetect2OutputConfig,
    SoundEventOutputConfig,
 )
-from batdetect2.outputs.formats.batdetect2 import BatDetect2Formatter
 from batdetect2.postprocess.types import ClipDetections


@ -82,82 +78,6 @@ def test_save_predictions_with_batdetect2_override(
    assert len(loaded[0]["annotation"]) == len(file_prediction.detections)


-def test_batdetect2_formatter_can_use_raw_class_names(
-    api_v2: BatDetect2API,
-    file_prediction,
-    tmp_path: Path,
-) -> None:
-    output_dir = tmp_path / "batdetect2_raw_class_names"
-    api_v2.save_predictions(
-        [file_prediction],
-        path=output_dir,
-        config=BatDetect2OutputConfig(class_label_mode="class_name"),
-    )
-
-    loaded = cast(
-        list[dict], api_v2.load_predictions(output_dir, format="batdetect2")
-    )
-    first_annotation = loaded[0]["annotation"][0]
-
-    assert first_annotation["class"] in api_v2.targets.class_names
-
-
-def test_batdetect2_formatter_can_use_decoded_species_tag() -> None:
-    targets = Mock()
-    targets.class_names = ["myodau"]
-    targets.decode_class.return_value = [
-        soundevent_data.Tag(
-            key="dwc:scientificName",
-            value="Myotis daubentonii",
-        )
-    ]
-
-    formatter = BatDetect2Formatter(
-        targets=targets,
-        event_name="Echolocation",
-        annotation_note="Automatically generated.",
-    )
-
-    assert formatter.get_class_name(0) == "Myotis daubentonii"
-
-
-def test_batdetect2_formatter_can_fallback_to_class_name_when_key_missing() -> (
-    None
-):
-    targets = Mock()
-    targets.class_names = ["myodau"]
-    targets.decode_class.return_value = []
-
-    formatter = BatDetect2Formatter(
-        targets=targets,
-        event_name="Echolocation",
-        annotation_note="Automatically generated.",
-        decoded_label_key="dwc:scientificName",
-        fallback_to_class_name=True,
-    )
-
-    assert formatter.get_class_name(0) == "myodau"
-
-
-def test_batdetect2_formatter_rejects_missing_decoded_key_without_fallback() -> (
-    None
-):
-    targets = Mock()
-    targets.class_names = ["myodau"]
-    targets.decode_class.return_value = []
-
-    formatter = BatDetect2Formatter(
-        targets=targets,
-        event_name="Echolocation",
-        annotation_note="Automatically generated.",
-        decoded_label_key="dwc:scientificName",
-        fallback_to_class_name=False,
-    )
-
-    with pytest.raises(ValueError, match="Could not decode class label"):
-        formatter.get_class_name(0)
-
-
 def test_load_predictions_with_format_override(
    api_v2: BatDetect2API,
    file_prediction,
@ -178,47 +98,6 @@ def test_load_predictions_with_format_override(
    assert "annotation" in loaded_item


-def test_load_predictions_with_batdetect2_nested_layout(
-    api_v2: BatDetect2API,
-    example_audio_files: list[Path],
-    tmp_path: Path,
-) -> None:
-    output_dir = tmp_path / "batdetect2_nested"
-    predictions = [
-        api_v2.process_file(audio_file) for audio_file in example_audio_files
-    ]
-
-    api_v2.save_predictions(
-        predictions,
-        path=output_dir,
-        format="batdetect2",
-        audio_dir=example_audio_files[0].parent,
-    )
-
-    loaded = api_v2.load_predictions(output_dir, format="batdetect2")
-
-    assert len(loaded) == len(example_audio_files)
-
-
-def test_save_predictions_with_batdetect2_writes_cnn_feature_csv(
-    api_v2: BatDetect2API,
-    file_prediction,
-    tmp_path: Path,
-) -> None:
-    output_dir = tmp_path / "batdetect2_cnn"
-    api_v2.save_predictions(
-        [file_prediction],
-        path=output_dir,
-        config=BatDetect2OutputConfig(write_cnn_features_csv=True),
-    )
-
-    cnn_csvs = list(output_dir.rglob("*_cnn_features.csv"))
-    assert len(cnn_csvs) == 1
-
-    loaded_df = pd.read_csv(cnn_csvs[0])
-    assert not loaded_df.empty
-
-
 def test_save_predictions_with_soundevent_override(
    api_v2: BatDetect2API,
    file_prediction,
--- a/tests/test_cli/test_predict.py
+++ b/tests/test_cli/test_predict.py
@ -1,16 +1,12 @@
 """Behavior tests for process CLI workflows."""

-import json
 from pathlib import Path

-import pandas as pd
 import pytest
 from click.testing import CliRunner
 from soundevent import data, io

 from batdetect2.cli import cli
-from batdetect2.outputs import OutputsConfig
-from batdetect2.outputs.formats import BatDetect2OutputConfig


 def test_cli_process_help() -> None:
@ -39,7 +35,6 @@ def test_cli_process_directory_runs_on_real_audio(
        [
            "process",
            "directory",
-            "--model",
            str(tiny_checkpoint_path),
            str(single_audio_dir),
            str(output_path),
@ -57,190 +52,6 @@ def test_cli_process_directory_runs_on_real_audio(
    assert len(list(output_path.glob("*.json"))) == 1


-@pytest.mark.slow
-def test_cli_process_directory_runs_on_example_audio_data(
-    tmp_path: Path,
-    tiny_checkpoint_path: Path,
-    example_audio_dir: Path,
-    example_audio_files: list[Path],
-) -> None:
-    """User story: process the bundled example audio directory."""
-
-    output_path = tmp_path / "predictions"
-
-    result = CliRunner().invoke(
-        cli,
-        [
-            "process",
-            "directory",
-            "--model",
-            str(tiny_checkpoint_path),
-            str(example_audio_dir),
-            str(output_path),
-            "--batch-size",
-            "1",
-            "--workers",
-            "0",
-            "--format",
-            "batdetect2",
-        ],
-    )
-
-    assert result.exit_code == 0
-    assert output_path.exists()
-    assert len(list(output_path.glob("*.json"))) == len(example_audio_files)
-
-
-@pytest.mark.slow
-def test_cli_process_directory_batdetect2_matches_legacy_artifacts(
-    tmp_path: Path,
-    tiny_checkpoint_path: Path,
-    example_audio_dir: Path,
-    example_audio_files: list[Path],
-    example_anns_dir: Path,
-) -> None:
-    """User story: process batdetect2 output matches legacy-style files."""
-
-    output_path = tmp_path / "predictions"
-
-    result = CliRunner().invoke(
-        cli,
-        [
-            "process",
-            "directory",
-            "--model",
-            str(tiny_checkpoint_path),
-            str(example_audio_dir),
-            str(output_path),
-            "--batch-size",
-            "1",
-            "--workers",
-            "0",
-            "--format",
-            "batdetect2",
-        ],
-    )
-
-    assert result.exit_code == 0
-
-    json_files = sorted(output_path.rglob("*.json"))
-    csv_files = sorted(output_path.rglob("*.csv"))
-
-    assert len(json_files) == len(example_audio_files)
-    assert len(csv_files) == len(example_audio_files)
-
-    expected_names = sorted(
-        audio_file.name for audio_file in example_audio_files
-    )
-    assert sorted(path.stem for path in json_files) == expected_names
-    assert sorted(path.stem for path in csv_files) == expected_names
-
-    first_output = json.loads(json_files[0].read_text())
-    assert "file_path" not in first_output
-    assert isinstance(first_output["class_name"], str)
-    assert first_output["class_name"]
-
-    first_annotation = first_output["annotation"][0]
-    assert first_annotation["individual"] == "-1"
-    assert isinstance(first_annotation["high_freq"], int)
-    assert isinstance(first_annotation["low_freq"], int)
-
-    expected_json = json.loads(
-        (example_anns_dir / json_files[0].name).read_text()
-    )
-    assert first_output["id"] == expected_json["id"]
-    assert first_output["time_exp"] == expected_json["time_exp"]
-
-    first_csv = pd.read_csv(csv_files[0], index_col=0)
-    assert list(first_csv.columns) == [
-        "det_prob",
-        "start_time",
-        "end_time",
-        "high_freq",
-        "low_freq",
-        "class",
-        "class_prob",
-    ]
-
-
-@pytest.mark.slow
-def test_cli_process_directory_batdetect2_writes_cnn_features_csv_when_enabled(
-    tmp_path: Path,
-    tiny_checkpoint_path: Path,
-    example_audio_dir: Path,
-) -> None:
-    """User story: request legacy CNN feature CSV sidecars via config."""
-
-    output_path = tmp_path / "predictions"
-    outputs_config_path = tmp_path / "outputs.yaml"
-    outputs_config_path.write_text(
-        OutputsConfig(
-            format=BatDetect2OutputConfig(write_cnn_features_csv=True)
-        ).to_yaml_string()
-    )
-
-    result = CliRunner().invoke(
-        cli,
-        [
-            "process",
-            "directory",
-            "--model",
-            str(tiny_checkpoint_path),
-            str(example_audio_dir),
-            str(output_path),
-            "--batch-size",
-            "1",
-            "--workers",
-            "0",
-            "--outputs-config",
-            str(outputs_config_path),
-        ],
-    )
-
-    assert result.exit_code == 0
-
-    cnn_csvs = sorted(output_path.rglob("*_cnn_features.csv"))
-    assert len(cnn_csvs) == 3
-
-    first_df = pd.read_csv(cnn_csvs[0])
-    assert not first_df.empty
-    assert list(first_df.columns) == [
-        str(ii) for ii in range(len(first_df.columns))
-    ]
-
-
-def test_cli_process_directory_defaults_to_batdetect2_without_output_options(
-    tmp_path: Path,
-    tiny_checkpoint_path: Path,
-    single_audio_dir: Path,
-) -> None:
-    """User story: default process output stays batdetect2 for CLI users."""
-
-    output_path = tmp_path / "predictions"
-
-    result = CliRunner().invoke(
-        cli,
-        [
-            "process",
-            "directory",
-            "--model",
-            str(tiny_checkpoint_path),
-            str(single_audio_dir),
-            str(output_path),
-            "--batch-size",
-            "1",
-            "--workers",
-            "0",
-        ],
-    )
-
-    assert result.exit_code == 0
-    assert output_path.exists()
-    assert len(list(output_path.glob("*.json"))) == 1
-    assert len(list(output_path.glob("*.csv"))) == 1
-    assert len(list(output_path.glob("*.nc"))) == 0
-
-
 def test_cli_process_file_list_runs_on_real_audio(
    tmp_path: Path,
    tiny_checkpoint_path: Path,
@ -259,7 +70,6 @@ def test_cli_process_file_list_runs_on_real_audio(
        [
            "process",
            "file_list",
-            "--model",
            str(tiny_checkpoint_path),
            str(file_list),
            str(output_path),
@ -307,7 +117,6 @@ def test_cli_process_dataset_runs_on_aoef_metadata(
        [
            "process",
            "dataset",
-            "--model",
            str(tiny_checkpoint_path),
            str(dataset_path),
            str(output_path),
@ -350,7 +159,6 @@ def test_cli_process_directory_supports_output_format_override(
        [
            "process",
            "directory",
-            "--model",
            str(tiny_checkpoint_path),
            str(single_audio_dir),
            str(output_path),
@ -409,7 +217,6 @@ def test_cli_process_dataset_deduplicates_recordings(
        [
            "process",
            "dataset",
-            "--model",
            str(tiny_checkpoint_path),
            str(dataset_path),
            str(output_path),
@ -440,7 +247,6 @@ def test_cli_process_rejects_unknown_output_format(
        [
            "process",
            "directory",
-            "--model",
            str(tiny_checkpoint_path),
            str(single_audio_dir),
            str(output_path),
--- a/tests/test_outputs/test_base.py
+++ b/tests/test_outputs/test_base.py
@ -1,21 +0,0 @@
-from pathlib import Path
-
-from batdetect2.outputs.formats.base import make_path_relative
-
-
-def test_make_path_relative_strips_shared_relative_prefix() -> None:
-    audio_dir = Path("example_data/audio")
-    path = Path("example_data/audio/subdir/clip.wav")
-
-    relative = make_path_relative(path, audio_dir)
-
-    assert relative == Path("subdir/clip.wav")
-
-
-def test_make_path_relative_returns_dot_for_matching_relative_dir() -> None:
-    audio_dir = Path("example_data/audio")
-    path = Path("example_data/audio")
-
-    relative = make_path_relative(path, audio_dir)
-
-    assert relative == Path(".")