Compare commits


No commits in common. "7cdb6221dc37e709ce50c0d4ee18561e9481c315" and "b0f85b96e3f62d5b4d520b4ccee18de04ca10f93" have entirely different histories.

43 changed files with 1162 additions and 1743 deletions


@@ -36,7 +36,7 @@ jobs:
uv.lock
- name: Install dependencies
run: uv sync --all-extras --all-groups
run: just install-dev
- name: Run formatting, lint, and type checks
run: just check
@@ -73,7 +73,7 @@ jobs:
uv.lock
- name: Install dependencies
run: uv sync --all-extras --all-groups
run: just install-dev
- name: Run test suite
run: just test


@@ -42,7 +42,7 @@ jobs:
uv.lock
- name: Install dependencies
run: uv sync --all-extras --all-groups
run: just install-dev
- name: Build docs
run: just check-docs


@@ -6,19 +6,14 @@ Code for detecting and classifying bat echolocation calls in high-frequency
audio recordings.
> [!WARNING]
> `batdetect2` 2.0.0b1 is out.
> This is a beta release and we are gathering user feedback.
> If you run into issues or have feedback on the new workflows, please use the
> GitHub issues page to let us know.
>
> `batdetect2` 2.0.1 is out.
> There are many changes and new recommended workflows.
> We have left the previous `batdetect2.api` module intact, but if you run
> into issues or want to upgrade, see the
> [migration guide](docs/source/legacy/migration-guide.md) in the docs site.
>
> This update also ships with a refreshed default model.
> It was trained in the same way and on the same data as before, but you should
> still expect small output differences in some cases.
> It was trained in the same way and on the same data as before, but you should still expect small output differences in some cases.
## What is BatDetect2
@@ -36,10 +31,6 @@ You can use the tool from the command line (terminal) or from Python as needed.
We have [extensive documentation](docs/source/index.md) on how to use
`batdetect2`.
The docs site is still being built and will be live soon.
If you want a quick peek for now, see the `docs/` folder in this repository.
See our [getting started](docs/source/getting_started.md) guide and then jump
into any of our tutorials:
@@ -144,7 +135,7 @@ which you can find
```
@article{batdetect2_2022,
title = {Towards a General Approach for Bat Echolocation Detection and Classification},
author = {Mac Aodha, Oisin and Mart\'{i}nez Balvanera, Santiago and Damstra, Elise and Cooke, Martyn and Eichinski, Philip and Browning, Ella and Barataud, Michel and Boughey, Katherine and Coles, Roger and Giacomini, Giada and MacSwiney G., M. Cristina and K. Obrist, Martin and Parsons, Stuart and Sattler, Thomas and Jones, Kate E.},
author = {Mac Aodha, Oisin and Mart\'{i}nez Balvanera, Santiago and Damstra, Elise and Cooke, Martyn and Eichinski, Philip and Browning, Ella and Barataudm, Michel and Boughey, Katherine and Coles, Roger and Giacomini, Giada and MacSwiney G., M. Cristina and K. Obrist, Martin and Parsons, Stuart and Sattler, Thomas and Jones, Kate E.},
journal = {bioRxiv},
year = {2022}
}

docs/plan.md (new file)

@@ -0,0 +1,441 @@
# Documentation Plan
## Goal
Build documentation around the main user stories:
1. Run inference with the CLI on one folder of audio.
2. Use the Python API for inference with fine-grained control over outputs,
including per-file workflows, class scores, features, and batch processing.
3. Train or fine-tune a custom model.
4. Evaluate a model and understand what the metrics mean.
5. Understand the concepts needed to use BatDetect2 correctly.
The docs should provide:
- a simple happy path in tutorials,
- richer task-oriented guidance in how-to guides,
- complete lookup material in reference,
- deep conceptual coverage in understanding.
Note: the current docs tree uses `explanation/`. For Diataxis consistency, this
plan uses `understanding/` as the target name for that conceptual section.
## Current State Review
### Looks reasonably complete
- `docs/source/index.md`: good top-level orientation and navigation.
- `docs/source/getting_started.md`: solid install and entry-point guidance.
- `docs/source/explanation/*.md`: the conceptual pages are currently the
strongest part of the docs, especially pipeline overview, thresholds,
preprocessing consistency, and targets.
- `docs/source/how_to/configure-*.md` and related target/data pages: practical
support docs for preprocessing, targets, ROI mapping, and dataset formats are
in decent shape.
- `docs/source/reference/cli/*.rst`: CLI reference wiring exists and should
render useful option-level documentation from the Click commands.
### Partially complete
- `docs/source/how_to/run-batch-predictions.md`: useful, but thin.
- `docs/source/how_to/tune-detection-threshold.md`: useful, but too brief for
a key workflow.
- `docs/source/reference/preprocessing-config.md`
- `docs/source/reference/postprocess-config.md`
- `docs/source/reference/targets-config-workflow.md`
These are good summaries, but they do not yet feel like complete references for
all the customization surfaces available in the code.
### Clearly incomplete or scaffolded
- `docs/source/tutorials/run-inference-on-folder.md`
- `docs/source/tutorials/integrate-with-a-python-pipeline.md`
- `docs/source/tutorials/train-a-custom-model.md`
- `docs/source/tutorials/evaluate-on-a-test-set.md`
All four main tutorials are still starter scaffolds. This is the biggest gap in
the current user story.
### Major mismatch to resolve
- `README.md` still tells an older story built around `batdetect2 detect` and
`batdetect2.api`.
- The docs site tells the newer story built around `batdetect2 predict` and
`batdetect2.api_v2`.
This creates avoidable confusion for users and should be treated as a priority
documentation alignment issue.
### Legacy documentation is not yet placed clearly
The repo still contains meaningful legacy documentation material, but it is not
yet presented as a clearly marked legacy path inside the docs.
Users need two things:
- a clear message that these docs exist for the previous BatDetect2 workflow,
- a clear recommendation that new users should prefer the newer CLI/API
workflows and migrate where possible.
## Legacy Documentation Plan
### Goals
1. Preserve access to the old workflow documentation.
2. Prevent new users from accidentally following legacy guidance.
3. Give current users a clear migration path from legacy to current workflows.
### Proposed location
Add a dedicated legacy area inside the docs, for example:
- `docs/source/legacy/index.md`
- `docs/source/legacy/cli-detect.md`
- `docs/source/legacy/python-api.md`
- `docs/source/legacy/feature-extraction.md`
- `docs/source/legacy/migration-guide.md`
This keeps the material available without mixing it into the main happy-path
docs.
### User-facing messaging
Add clear notices in all relevant navigation entry points.
Suggested message pattern:
"If you want to use the previous version of BatDetect2, see the legacy
documentation. For new workflows, we recommend using the current `predict`
CLI and `BatDetect2API` interfaces."
Places that should link to the legacy docs:
- `docs/source/index.md`
- `docs/source/getting_started.md`
- `README.md`
- tutorial landing pages where users may be coming from older workflows
- any page that mentions the old `detect` command or old Python API
### Migration guide plan
Add a dedicated migration guide that explains:
1. who should migrate now and who may need to stay on the legacy workflow,
2. the mapping from old CLI commands to new CLI commands,
3. the mapping from old Python API calls to new `api_v2` / `BatDetect2API`
patterns,
4. what changed in outputs, terminology, and configuration,
5. how legacy feature extraction concepts map to the new API surfaces,
6. what behavior differences users should validate before switching,
7. a short migration checklist.
High-priority migration mappings to document:
- `batdetect2 detect` -> `batdetect2 predict directory`
- old `batdetect2.api` file processing -> `BatDetect2API.from_checkpoint(...)`
plus `process_file`, `process_files`, `process_audio`, or
`process_spectrogram`
- legacy `cnn_feats`, `spec_features`, and `spec_slices` -> current output and
feature access patterns, with explicit notes where there is no direct
one-to-one replacement
### Legacy content handling plan
For each legacy page or legacy concept:
1. Decide whether it should be preserved as-is, rewritten as a legacy page, or
replaced by the migration guide.
2. Add a prominent warning banner saying it describes the previous workflow.
3. Link forward to the current equivalent page when one exists.
### Definition of done for legacy handling
Legacy documentation work is done when:
1. a reader can clearly distinguish legacy from current docs,
2. old users can still find the previous workflow documentation,
3. new users are consistently directed to the new docs,
4. there is a practical migration guide covering the main CLI and Python API
transitions.
## Main Gaps By User Story
### 1. CLI inference
Current coverage exists, but the happy path is not truly documented yet.
Missing:
- a full worked tutorial from input audio to saved outputs,
- clear guidance on what outputs are written and how to inspect them,
- stronger documentation for `predict dataset`,
- a clearer story for default model vs custom checkpoint,
- practical guidance for selecting output formats and thresholds.
### 2. Python API inference
This is currently the weakest major story.
The code exposes much more than the docs explain, including:
- `BatDetect2API.from_checkpoint` and `from_config`,
- `process_file`, `process_files`, `process_directory`, `process_clips`,
- `process_audio`, `process_spectrogram`,
- `get_top_class_name`, `get_class_scores`, `get_detection_features`,
- `save_predictions` and `load_predictions`.
Missing docs:
- an API-first tutorial with a simple path,
- a how-to for file-by-file inspection and custom post-processing,
- a how-to for batch API inference,
- a reference page for `BatDetect2API`,
- an explanation of what the feature vectors are and how users should think
about them.
Important terminology note:
- the old API/docs talk about `cnn_feats`, `spec_features`, and `spec_slices`,
- the new API exposes per-detection `features`,
- users interested in embeddings / downstream exploration will need a clear,
explicit doc that connects these ideas.
### 3. Batch inference
Batch prediction exists in both CLI and API workflows, but the docs do not yet
explain the design space well.
Missing:
- when to use `directory` vs `file_list` vs `dataset`,
- how clipping works during inference,
- what `InferenceConfig` controls,
- how batch size, workers, and output format choices affect runs,
- how to organize large runs reproducibly.
### 4. Training a custom model
Supporting pages exist, but the end-to-end story is not yet there.
Missing:
- one complete tutorial from dataset config to checkpoints and sanity check,
- a "minimum viable training setup" page,
- clearer explanation of how model, targets, audio, training, inference,
outputs, and logging configs fit together,
- a fine-tuning story versus training from scratch.
### 5. Evaluation
Evaluation is significantly under-documented relative to the code.
Missing:
- what evaluation tasks exist,
- what metrics and plots are produced,
- how predictions are matched to annotations,
- how to interpret failures and trade-offs,
- how to configure evaluation for different research questions.
### 6. Understanding / concepts
This is the best-developed section today, but it still needs expansion.
Concepts that should be covered more fully:
- what the model predicts,
- what the raw and formatted outputs represent,
- how to interpret detection scores and class scores,
- what targets are and how they shape training and decoding,
- how preprocessing choices affect model behavior,
- what the extracted features represent and when they are useful,
- what evaluation metrics actually measure,
- why local validation is required before ecological inference.
## Proposed Documentation Architecture
## Target Table of Contents
### Home
- Home
- Getting started
- FAQ
- Legacy docs
### Tutorials
These should be the default path for most users.
- Tutorial: Run inference on a folder of audio
- Tutorial: Explore predictions in Python for one file
- Tutorial: Train a custom model
- Tutorial: Evaluate a trained model
### How-to Guides
These cover practical tasks once the user is past the happy path.
- How to choose an inference input mode
- How to run batch predictions from a directory
- How to run batch predictions from a file list
- How to run predictions from a dataset config
- How to tune detection thresholds
- How to inspect class scores in Python
- How to inspect detection features in Python
- How to save predictions in different output formats
- How to configure inference clipping
- How to configure audio preprocessing
- How to configure spectrogram preprocessing
- How to configure target definitions
- How to define target classes
- How to configure ROI mapping
- How to configure an AOEF dataset
- How to import legacy BatDetect2 annotations
- How to fine-tune from a checkpoint
- How to choose and configure evaluation tasks
- How to interpret evaluation outputs
### Reference
This should be the complete lookup layer.
- CLI reference
- CLI reference: base command and global options
- CLI reference: predict
- CLI reference: data
- CLI reference: train
- CLI reference: evaluate
- CLI reference: legacy detect
- API reference: `BatDetect2API`
- Config reference: top-level app config
- Config reference: inference config
- Config reference: evaluation config
- Config reference: outputs config
- Config reference: output formats
- Config reference: output transforms
- Config reference: preprocessing config
- Config reference: postprocess config
- Config reference: targets config workflow
- Reference: data sources
- Reference: targets module
### Understanding
This is the conceptual layer and should carry the deeper Diataxis
"understanding" material.
- What BatDetect2 predicts
- How the pipeline fits together
- How to interpret detection scores and class scores
- How to interpret formatted outputs
- What extracted features / embeddings are and are not
- Postprocessing and thresholds
- Preprocessing consistency and domain shift
- Target encoding and decoding
- Evaluation concepts and matching behavior
- Model output, validation, and ecological interpretation
### Legacy
This is a clearly signposted area for the previous workflow only.
- Legacy overview
- Legacy CLI workflow with `batdetect2 detect`
- Legacy Python API with `batdetect2.api`
- Legacy feature extraction outputs
- Migration guide: legacy to current workflows
### Tutorials
Keep tutorials opinionated and minimal. Each one should show the default happy
path with the fewest possible choices.
Planned tutorial set:
1. Run inference on a folder of audio.
2. Explore predictions in Python for one file.
3. Train a custom model.
4. Evaluate a trained model.
### How-to Guides
Use how-to guides for branching tasks and customization.
Planned additions or expansions:
- Choose an inference input mode: directory, file list, or dataset.
- Run large batch inference reproducibly.
- Save predictions in different output formats.
- Inspect class scores and features in Python.
- Explore detection features / embeddings downstream.
- Tune clipping and inference settings.
- Fine-tune from a checkpoint.
- Choose and configure evaluation tasks.
- Interpret evaluation artifacts.
### Reference
Reference should become the complete map of all configurable surfaces.
High-priority additions:
- `BatDetect2API` reference.
- `InferenceConfig` reference.
- `EvaluationConfig` reference.
- `OutputsConfig` and output format reference.
- Output transform reference.
- clearer config composition reference for the full app config.
### Understanding
This is where the deeper conceptual material should live.
High-priority pages:
1. What BatDetect2 predicts.
2. How to interpret outputs, scores, and uncertainty.
3. What extracted features / embeddings are and are not.
4. Targets, labels, and decoded outputs.
5. Preprocessing consistency and domain shift.
6. Postprocessing, thresholds, and output density.
7. How evaluation works and what the metrics mean.
8. Why local validation is required before ecological interpretation.
## Priority Order
### Phase 1: Fix the primary user journey
1. Expand the four scaffold tutorials into real end-to-end guides.
2. Add a proper Python/API inference story.
3. Document outputs and how to inspect them.
4. Align `README.md` with the newer CLI/API documentation story.
5. Create the legacy docs section and add clear signposting to it.
### Phase 2: Cover the customization surface
1. Add how-to guides for batch inference, output formats, and API inspection.
2. Add reference pages for inference, outputs, evaluation, and API surfaces.
3. Add fine-tuning and advanced training guidance.
4. Write the migration guide from legacy to current workflows.
### Phase 3: Deepen understanding
1. Expand the conceptual section into a true understanding section.
2. Add pages for output interpretation, features/embeddings, and evaluation
concepts.
3. Reader-test the docs against realistic user questions.
## Immediate Next Steps
1. Decide whether to rename `explanation/` to `understanding/` or keep the
current directory name and just treat it as the Diataxis understanding
section.
2. Draft the target table of contents for Tutorials, How-to, Reference, and
Understanding.
3. Draft the legacy docs section and migration-guide table of contents.
4. Rewrite the four scaffold tutorials first.
5. Add the missing API, outputs, evaluation, and migration documentation
immediately after.


@@ -0,0 +1,139 @@
---
orphan: true
---
# Documentation Architecture and Migration Plan (Phase 0)
This page defines the Phase 0 documentation architecture and inventory for
reorganizing `batdetect2` documentation using the Diataxis framework.
## Scope and goals
Phase 0 focuses on architecture and prioritization only. It does not attempt
to write all new docs yet.
Primary goals:
1. Define a target docs architecture by Diataxis type.
2. Map current pages to target documentation types.
3. Identify what to keep, split, rewrite, or deprecate.
4. Set priorities for implementation phases.
## Audiences
Two primary audiences are in scope.
1. Ecologists who prefer minimal coding, focused on practical workflows:
run inference, inspect outputs, and possibly train with custom data.
2. Ecologists or bioacousticians who are Python-savvy and want to customize
workflows, training, and analysis.
## Target information architecture
The target architecture uses four top-level documentation sections.
1. Tutorials
- Learning-oriented, single-path, reproducible walkthroughs.
2. How-to guides
- Task-oriented procedures for common real goals.
3. Reference
- Factual descriptions of CLI, configs, APIs, and formats.
4. Explanation
- Conceptual material that explains why design and workflow decisions
matter.
Cross-cutting navigation conventions:
- Every page starts with audience, prerequisites, and outcome.
- Every page serves one Diataxis type only.
- Beginner-first path is prioritized, with clear links to advanced pages.
## Phase 0 inventory: current docs mapped to Diataxis
Legend:
- Keep: useful as-is with minor edits.
- Split: contains mixed documentation types and should be separated.
- Rewrite: major changes needed to fit target audience/type.
- Move: content is valid but belongs under another section.
| Current page | Current role | Target type | Audience | Action | Priority |
| --- | --- | --- | --- | --- | --- |
| `README.md` | Mixed quickstart + CLI + API + warning | Tutorial + How-to + Explanation (split) | 1 + 2 | Split | P0 |
| `docs/source/index.md` | Sparse landing page | Navigation hub | 1 + 2 | Rewrite | P0 |
| `docs/source/architecture.md` | Internal architecture deep dive | Explanation + developer reference | 2 | Move/trim | P2 |
| `docs/source/postprocessing.md` | Concept + config + internals + usage | Explanation + How-to + Reference (split) | 1 + 2 | Split | P1 |
| `docs/source/preprocessing/index.md` | Conceptual overview with some procedural flow | Explanation | 2 (and 1 optional) | Keep/trim | P2 |
| `docs/source/preprocessing/audio.md` | Detailed configuration and behavior | Reference + How-to fragments | 2 | Split | P2 |
| `docs/source/preprocessing/spectrogram.md` | Detailed configuration and behavior | Reference + How-to fragments | 2 | Split | P2 |
| `docs/source/preprocessing/usage.md` | Usage patterns + concept | How-to + Explanation (split) | 2 | Split | P1 |
| `docs/source/data/index.md` | Data-loading section index | Reference index | 2 | Keep/update | P2 |
| `docs/source/data/aoef.md` | Config and examples | How-to + Reference (split) | 2 | Split | P1 |
| `docs/source/data/legacy.md` | Legacy formats and config | How-to + Reference (split) | 2 | Split | P2 |
| `docs/source/targets/index.md` | Long conceptual + process overview | Explanation + How-to (split) | 2 | Split | P2 |
| `docs/source/targets/tags_and_terms.md` | Definitions + guidance | Explanation + Reference | 2 | Split | P2 |
| `docs/source/targets/filtering.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
| `docs/source/targets/transform.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
| `docs/source/targets/classes.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
| `docs/source/targets/rois.md` | Concept + mapping details | Explanation + Reference | 2 | Split | P2 |
| `docs/source/targets/use.md` | Integration overview | Explanation | 2 | Keep/trim | P2 |
| `docs/source/reference/index.md` | Small reference root | Reference | 2 | Expand | P1 |
| `docs/source/reference/configs.md` | Autodoc for configs | Reference | 2 | Keep | P1 |
| `docs/source/reference/targets.md` | Autodoc for targets | Reference | 2 | Keep | P2 |
## CLI and API documentation gaps (from code surface)
Current command surface includes:
- `batdetect2 detect` (compat command)
- `batdetect2 predict directory`
- `batdetect2 predict file_list`
- `batdetect2 predict dataset`
- `batdetect2 train`
- `batdetect2 evaluate`
- `batdetect2 data summary`
- `batdetect2 data convert`
These commands are not yet represented as a coherent user-facing task set.
Priority gap actions:
1. Add CLI reference pages for command signatures and options.
2. Add beginner how-to pages for practical command recipes.
3. Add migration guidance from `detect` to `predict` workflows.
## Priority architecture for implementation phases
### P0 (this phase): architecture and inventory
- Done in this file.
- Define structure and classify existing material.
### P1: user-critical docs for running the model
1. Beginner tutorial: run inference on a folder of audio and inspect outputs.
2. How-to guides for repeatable inference tasks and threshold tuning.
3. Reference: complete CLI docs for prediction and outputs.
4. Explanation: interpretation caveats and validation guidance.
### P2: advanced customization and training
1. How-to guides for custom dataset preparation and training.
2. Reference for data formats, targets, and preprocessing configs.
3. Explanation docs for target design and pipeline trade-offs.
### P3: polish and contributor consistency
1. Tight cross-linking across Diataxis boundaries.
2. Consistent page templates and terminology.
3. Reader testing with representative users from both audiences.
## Definition of done for Phase 0
Phase 0 is complete when:
1. The target architecture is defined.
2. Existing content is inventoried and classified.
3. A prioritized migration path is agreed.
This page satisfies these criteria and is the baseline for Phase 1 work.


@@ -2,13 +2,11 @@
The current API exposes a per-detection `features` vector.
Older BatDetect2 workflows also exposed concepts such as `cnn_feats`,
`spec_features`, and `spec_slices`.
Older BatDetect2 workflows also exposed concepts such as `cnn_feats`, `spec_features`, and `spec_slices`.
## What the current feature vector is
In the current stack, each retained detection can carry an internal feature
representation produced by the model output pipeline.
In the current stack, each retained detection can carry an internal feature representation produced by the model output pipeline.
This is useful for downstream exploration, comparison, and custom analysis.
@@ -20,24 +18,19 @@ They are also not a substitute for careful validation.
## Why people refer to them as embeddings
In practice, users often treat these feature vectors as embeddings because they
can be used as dense learned representations of detections.
In practice, users often treat these feature vectors as embeddings because they can be used as dense learned representations of detections.
That usage is reasonable, but you should still treat them as model-derived
internal representations whose meaning depends on the training setup.
That usage is reasonable, but you should still treat them as model-derived internal representations whose meaning depends on the training setup.
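One common way to use such vectors as embeddings is to compare two detections by cosine similarity. The sketch below is a minimal, library-free illustration; the short lists stand in for real feature vectors and are made up for this example, and nothing here relies on the `batdetect2` API itself.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for the feature vectors of three detections.
features_a = [0.2, 0.9, 0.1]
features_b = [0.4, 1.8, 0.2]  # same direction as features_a
features_c = [0.9, -0.2, 0.0]  # orthogonal to features_a

similar = cosine_similarity(features_a, features_b)
dissimilar = cosine_similarity(features_a, features_c)
```

A high similarity only says that the model represents two detections in a similar way. It is not evidence on its own that they come from the same species.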
## Legacy terminology versus current terminology
- legacy `cnn_feats` referred to CNN feature outputs in the older workflow,
- legacy `spec_features` referred to lower-level extracted call features,
- current `features` are the per-detection vectors attached to `Detection`
objects.
- current `features` are the per-detection vectors attached to `Detection` objects.
These are related ideas, but not necessarily one-to-one replacements.
## Related pages
- Inspect detection features in Python:
{doc}`../how_to/inspect-detection-features-in-python`
- Legacy migration guide:
{doc}`../legacy/migration-guide`
- Inspect detection features in Python: {doc}`../how_to/inspect-detection-features-in-python`
- Legacy feature extraction: {doc}`../legacy/feature-extraction`


@@ -4,78 +4,83 @@
### Do I need Python knowledge to use batdetect2?
Not much.
If you only want to run the model on your own recordings, you can use the CLI and follow the steps in {doc}`getting_started`.
Not much. If you only want to run the model on your own recordings, you can
use the CLI and follow the steps in {doc}`getting_started`.
Some command-line familiarity helps, but you do not need to write Python code for standard inference workflows.
Some command-line familiarity helps, but you do not need to write Python code
for standard inference workflows.
### Are there plans for an R version?
Not currently.
Output files are plain formats (for example CSV/JSON), so you can read and analyze them in R or other environments.
Not currently. Output files are plain formats (for example CSV/JSON), so you
can read and analyze them in R or other environments.
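As a rough illustration in Python (the same idea applies in R), the sketch below counts detections per class from CSV text using only the standard library. The column names `class` and `det_prob`, the example rows, and the 0.5 cutoff are assumptions made for this example; check the header of your own output files before reusing it.

```python
import csv
import io
from collections import Counter


def count_detections_per_class(
    csv_text, class_column="class", score_column="det_prob", min_score=0.5
):
    """Count rows per class label, keeping rows whose score is >= min_score."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row[score_column]) >= min_score:
            counts[row[class_column]] += 1
    return counts


# Hypothetical file content, for illustration only.
example_csv = """start_time,end_time,det_prob,class
0.10,0.12,0.91,Myotis daubentonii
0.35,0.37,0.42,Myotis daubentonii
0.80,0.83,0.77,Pipistrellus pipistrellus
"""

counts = count_detections_per_class(example_csv)
```

To read a real file, pass its contents (for example `Path("results.csv").read_text()`, with a path of your choosing) instead of the inline string.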
### I cannot get installation working. What should I do?
First, re-check {doc}`getting_started` and confirm your environment is active.
If it still fails, open an issue with your OS, install method, and full error output: [GitHub Issues](https://github.com/macaodha/batdetect2/issues).
If it still fails, open an issue with your OS, install method, and full error
output: [GitHub Issues](https://github.com/macaodha/batdetect2/issues).
## Model behavior and performance
### The model does not perform well on my data
This usually means your data distribution differs from training data.
The best next step is to validate on reviewed local data and then fine-tune/train on your own annotations if needed.
This usually means your data distribution differs from training data. The best
next step is to validate on reviewed local data and then fine-tune/train on
your own annotations if needed.
### The model confuses insects/noise with bats
This can happen, especially when recording conditions differ from training conditions.
Threshold tuning and training with local annotations can improve results.
This can happen, especially when recording conditions differ from training
conditions. Threshold tuning and training with local annotations can improve
results.
See {doc}`how_to/tune-detection-threshold`.
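To make the idea of threshold tuning concrete, here is a minimal sketch that sweeps a detection threshold over reviewed detections and reports precision and recall. The scores, the true/false labels, and the count of annotated calls are invented for illustration, and the function is not part of `batdetect2`.

```python
def precision_recall_at_threshold(scored_detections, n_true_calls, threshold):
    """Compute (precision, recall) for detections with score >= threshold.

    scored_detections is a list of (score, is_true_positive) pairs obtained by
    comparing model detections against manually reviewed annotations.
    """
    kept = [is_tp for score, is_tp in scored_detections if score >= threshold]
    true_positives = sum(kept)
    # Convention used here: precision is 1.0 when nothing is kept.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / n_true_calls if n_true_calls else 0.0
    return precision, recall


# Hypothetical reviewed detections: (detection score, matched a real call?).
reviewed = [(0.95, True), (0.80, True), (0.60, False), (0.40, True), (0.30, False)]
n_true_calls = 4  # one annotated call was never detected

sweep = {
    threshold: precision_recall_at_threshold(reviewed, n_true_calls, threshold)
    for threshold in (0.3, 0.5, 0.7)
}
```

Raising the threshold in this toy data removes false positives such as insects or noise, at the cost of missing quieter real calls. Which trade-off is acceptable depends on your study.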
### The model struggles with feeding buzzes or social calls
This is a known limitation of available training data in some settings.
If you have high-quality annotated examples, they are valuable for improving models.
This is a known limitation of available training data in some settings. If you
have high-quality annotated examples, they are valuable for improving models.
### Calls in the same sequence are predicted as different species
Currently we do not do any sophisticated post processing on the results output by the model.
We return a probability associated with each species for each call.
You can use these predictions to clean up the noisy predictions for sequences of calls.
batdetect2 returns per-call probabilities and does not apply heavy
sequence-level smoothing by default. You can apply sequence-aware
postprocessing in your own analysis workflow.
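One simple form of sequence-aware postprocessing is to average the per-call class probabilities across a sequence and assign the sequence the class with the highest mean. The sketch below assumes you have already grouped calls into sequences and extracted each call's probabilities as a dictionary; the species names and numbers are hypothetical, and averaging is only one of several reasonable choices.

```python
from collections import defaultdict


def sequence_level_class(call_probabilities):
    """Average per-call class probabilities and return (best_class, mean_probs).

    call_probabilities is a list with one {class_name: probability} dict per call.
    """
    if not call_probabilities:
        raise ValueError("need at least one call")
    totals = defaultdict(float)
    for probs in call_probabilities:
        for name, probability in probs.items():
            totals[name] += probability
    n_calls = len(call_probabilities)
    mean_probs = {name: total / n_calls for name, total in totals.items()}
    best_class = max(mean_probs, key=mean_probs.get)
    return best_class, mean_probs


# Hypothetical sequence of three calls; the middle call disagrees on its own.
sequence = [
    {"Myotis daubentonii": 0.60, "Myotis nattereri": 0.40},
    {"Myotis daubentonii": 0.45, "Myotis nattereri": 0.55},
    {"Myotis daubentonii": 0.70, "Myotis nattereri": 0.30},
]

best_class, mean_probs = sequence_level_class(sequence)
```

This assumes every call in the sequence comes from the same individual, which will not hold when several bats call at once.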
### Can I trust model outputs for biodiversity conclusions?
The models developed and shared as part of this repository should be used with caution.
While they have been evaluated on held out audio data, great care should be taken when using the model outputs for any form of biodiversity assessment.
Your data may differ, and as a result it is very strongly recommended that you validate the model first using data with known species to ensure that the outputs can be trusted.
Use caution. Always validate model behavior on local, reviewed data before
using outputs for ecological inference or biodiversity assessment.
### The pipeline is slow
Runtime depends on hardware and recording duration.
GPU inference is often much faster than CPU.
Runtime depends on hardware and recording duration. GPU inference is often much
faster than CPU. If files are very long, splitting them into shorter clips can
help throughput.
If you need a clipping workflow, see the annotation GUI repository:
[batdetect2_GUI](https://github.com/macaodha/batdetect2_GUI).
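If you prefer to plan the split yourself, the following sketch computes clip start and end times for a long recording. It only does the arithmetic; it does not read or write audio, and the function name, clip length, and overlap are illustrative assumptions rather than `batdetect2` settings.

```python
def clip_boundaries(duration_s, clip_s, overlap_s=0.0):
    """Return (start, end) times in seconds covering a recording of duration_s."""
    if clip_s <= 0 or not 0 <= overlap_s < clip_s:
        raise ValueError("need clip_s > 0 and 0 <= overlap_s < clip_s")
    step = clip_s - overlap_s
    clips = []
    index = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while index * step < duration_s:
        start = index * step
        clips.append((start, min(start + clip_s, duration_s)))
        index += 1
    return clips


# A hypothetical 10-second recording split into 4-second clips.
clips = clip_boundaries(10.0, 4.0)
```

A small overlap between clips reduces the chance of cutting a call in half at a boundary, but detections in the overlapping region then need de-duplicating.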
## Training and scope
### Can I train on my own species set?
Yes.
You can train/fine-tune with your own annotated data and species labels.
Yes. You can train/fine-tune with your own annotated data and species labels.
### Does this work on frequency-division or zero-crossing recordings?
Not directly.
The workflow assumes audio can be converted to spectrograms from the raw waveform.
Not directly. The workflow assumes audio can be converted to spectrograms from
the raw waveform.
### Can this be used for non-bat bioacoustics (for example insects or birds)?
Potentially yes, but expect retraining and configuration changes.
Open an issue if you want guidance for a specific use case.
Potentially yes, but expect retraining and configuration changes. Open an issue
if you want guidance for a specific use case.
## Usage and licensing
### Can I use this for commercial purposes?
No.
This project is currently for non-commercial use.
See the repository license for details.
No. This project is currently for non-commercial use. See the repository
license for details.


@@ -1,38 +1,52 @@
# Getting started
BatDetect2 can be used in two ways: through the `batdetect2` command line interface (CLI), or as the `batdetect2` Python package.
The CLI route does not require coding.
You run commands in the terminal and, in some cases, write configuration files.
The Python route gives you more flexibility and lets you integrate the model into your own workflows or experiments.
For most common use cases, both routes give you the same results.
If you want to run BatDetect2 on your recordings, start with the command-line
route below.
## Try it out
You do not need to write Python code for a standard first run.
BatDetect2 also has a Python interface, but that is mainly for users writing
their own analysis scripts.
- Use the command-line route if you want to run an existing model or train your
own model by typing commands in a terminal window.
- Use the Python route only if you already want to work in scripts or notebooks.
```{note}
If you are looking for the previous BatDetect2 workflow based on `batdetect2 detect` or `batdetect2.api`, go to {doc}`legacy/index`.
New docs default to the current `process` CLI and `BatDetect2API` workflow.
```
If you want to try BatDetect2 before installing anything locally:
- [Hugging Face demo (UK species)](https://huggingface.co/spaces/macaodha/batdetect2)
- [Google Colab notebook](https://colab.research.google.com/github/macaodha/batdetect2/blob/master/batdetect2_notebook.ipynb)
## Installation
## The simplest route for most users
To use `batdetect2` on your machine, you need to install it first.
We recommend using `uv` for that.
`uv` is a tool that helps manage Python software cleanly, without mixing it into the rest of your machine.
Install `uv` first by following the [installation instructions](https://docs.astral.sh/uv/getting-started/installation/).
1. Install BatDetect2.
2. Use a model checkpoint.
3. Run the first tutorial on a folder of recordings.
### One-off usage
If that is what you want, you can ignore the Python sections for now.
If you are not ready to install `batdetect2` permanently, you can try it with:
## Install BatDetect2
```bash
uvx batdetect2
```
We recommend `uv` for both workflows.
This still downloads the code and dependencies and runs them on your machine, but the environment is temporary.
`uv` is a tool that helps install Python software cleanly, without mixing it
into the rest of your machine.
### Install the CLI
- Use `uv tool` to install the CLI.
- Use `uv add` to add `batdetect2` as a dependency in a Python project.
If you want the `batdetect2` CLI to always be available in your terminal, run:
Install `uv` first by following their
[installation instructions](https://docs.astral.sh/uv/getting-started/installation/).
## Install the CLI
The following installs `batdetect2` in its own small environment and makes the
`batdetect2` command available on your machine.
```bash
uv tool install batdetect2
@ -47,45 +61,65 @@ uv tool upgrade batdetect2
Verify the CLI is available:
```bash
batdetect2
batdetect2 --help
```
You can then run your first workflow.
See {doc}`tutorials/run-inference-on-folder` for more details.
Run your first workflow:
### Add it to your Python project
Go to {doc}`tutorials/run-inference-on-folder` for a complete first run.
If you are using BatDetect2 from Python code and already manage your projects with `uv`, you can add it with:
## Choose a model checkpoint
The current command-line and Python workflows expect an explicit checkpoint
path.
A checkpoint is the saved model file that BatDetect2 will use for prediction.
You can use:
- a checkpoint you trained yourself, or
- a checkpoint distributed with your installation or repository checkout.
In this repository checkout, an example pretrained checkpoint is available at:
```text
src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
```
Use that path in the tutorial commands if you want a concrete starting point
from this source tree.
## Python route for users writing code
If you are using BatDetect2 from Python code, add it to your Python project:
```bash
uv add batdetect2
```
If you want to upgrade it later:
This keeps your project settings and installed packages in sync.
```bash
uv add -U batdetect2
```
### Alternative with `pip`
#### Alternative with `pip`
If you prefer `pip`, you can use:
```bash
pip install batdetect2
```
It is a good idea to create a separate virtual environment first so this does not interfere with other Python environments.
If you prefer `pip`, create and activate a virtual environment first:
```bash
python -m venv .venv
source .venv/bin/activate
```
Then install from PyPI:
```bash
pip install batdetect2
```
## What's next
- Run your first workflow on a folder of recordings: {doc}`tutorials/run-inference-on-folder`
- If you write code and want the Python route: {doc}`tutorials/integrate-with-a-python-pipeline`
- Run your first workflow on a folder of recordings:
{doc}`tutorials/run-inference-on-folder`
- If you write code and want the Python route:
{doc}`tutorials/integrate-with-a-python-pipeline`
- For common practical tasks, go to {doc}`how_to/index`
- For detailed command help, go to {doc}`reference/cli/index`
- To understand the model and its outputs, go to {doc}`explanation/index`
- To understand outputs and trade-offs, go to {doc}`explanation/index`

View File

@ -1,112 +0,0 @@
# How to choose a model
Use this guide when you want to choose which model checkpoint BatDetect2 loads.
You can choose a model in both the CLI and the Python API.
## Where you can choose the model
In the CLI, use `--model` with commands that load a checkpoint, including:
- `batdetect2 process`
- `batdetect2 evaluate`
- `batdetect2 train`
- `batdetect2 finetune`
In Python, pass the model source to `BatDetect2API.from_checkpoint(...)`.
If you do not choose a model, BatDetect2 uses the built-in default UK model.
## Use a local checkpoint path
Use a local path when you already have a checkpoint file on disk.
CLI example:
```bash
batdetect2 process directory \
path/to/audio \
path/to/outputs \
--model path/to/model.ckpt
```
Python example:
```python
from batdetect2.api_v2 import BatDetect2API
api = BatDetect2API.from_checkpoint("path/to/model.ckpt")
```
## Use a bundled checkpoint alias
BatDetect2 also supports bundled checkpoint aliases.
The built-in UK model is available as `uk_same`.
The alias `batdetect2_uk_same` also works.
CLI example:
```bash
batdetect2 process directory \
path/to/audio \
path/to/outputs \
--model uk_same
```
Python example:
```python
from batdetect2.api_v2 import BatDetect2API
api = BatDetect2API.from_checkpoint("uk_same")
```
## Use a Hugging Face URI
You can also load a checkpoint from Hugging Face with a URI like:
```text
hf://owner/repo/path/to/model.ckpt
```
This needs the optional Hugging Face dependency to be installed.
For example, install it with `pip install batdetect2[huggingface]`.
CLI example:
```bash
batdetect2 process directory \
path/to/audio \
path/to/outputs \
--model hf://owner/repo/path/to/model.ckpt
```
Python example:
```python
from batdetect2.api_v2 import BatDetect2API
api = BatDetect2API.from_checkpoint(
"hf://owner/repo/path/to/model.ckpt"
)
```
## Choose the right source
- Use a local path when you already have a checkpoint file.
- Use an alias when you want one of the bundled models.
- Use a Hugging Face URI when the checkpoint lives in a Hugging Face repo.
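
The three source types above can be told apart from the shape of the string alone. The following is a rough sketch only: `classify_model_source`, `split_hf_uri`, and the returned labels are hypothetical helpers written for illustration, not part of `batdetect2`, and the library's own resolution logic may differ.

```python
# Hypothetical helpers for illustration; not part of batdetect2.
KNOWN_ALIASES = {"uk_same", "batdetect2_uk_same"}  # aliases named on this page


def classify_model_source(source: str) -> str:
    """Return 'huggingface', 'alias', or 'local' for a model source string."""
    if source.startswith("hf://"):
        return "huggingface"
    if source in KNOWN_ALIASES:
        return "alias"
    # Assumption: anything else is treated as a path on disk.
    return "local"


def split_hf_uri(uri: str) -> tuple[str, str]:
    """Split 'hf://owner/repo/path/to/file' into (repo_id, file_path)."""
    if not uri.startswith("hf://"):
        raise ValueError(f"not a Hugging Face URI: {uri!r}")
    parts = uri[len("hf://"):].split("/", 2)
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected hf://owner/repo/path, got {uri!r}")
    owner, repo, file_path = parts
    return f"{owner}/{repo}", file_path
```

A check like this can be useful in a wrapper script that wants to fail early, for example when a Hugging Face URI is given but the optional dependency is not installed.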
## Related pages
- Run inference on a folder:
{doc}`../tutorials/run-inference-on-folder`
- `BatDetect2API` reference:
{doc}`../reference/api`
- Process command reference:
{doc}`../reference/cli/predict`
- Train a custom model:
{doc}`../tutorials/train-a-custom-model`
- Fine-tune from a checkpoint:
{doc}`fine-tune-from-a-checkpoint`

View File

@ -1,15 +1,12 @@
# How-to Guides
How-to guides help you answer practical questions once you are past the first
tutorial.
How-to guides help you answer practical questions once you are past the first tutorial.
Use this section when you already know the basic workflow and want help with one
specific task.
Use this section when you already know the basic workflow and want help with one specific task.
```{toctree}
:maxdepth: 1
choose-a-model
choose-an-inference-input-mode
run-batch-predictions
tune-inference-clipping

View File

@ -14,7 +14,7 @@ Current built-in output formats include:
- `soundevent`:
prediction-set JSON for soundevent-style tooling,
- `batdetect2`:
legacy-compatible per-recording JSON and CSV outputs.
legacy per-recording JSON output.
## Select a format from the CLI
@ -61,29 +61,7 @@ batdetect2 process directory \
- Use `raw` if you want the richest output surface and easy round-tripping.
- Use `parquet` if you want tabular analysis in Python or data-lake workflows.
- Use `soundevent` if you want prediction-set JSON.
- Use `batdetect2` when you need legacy BatDetect2-style outputs.
## Enable legacy CNN feature CSVs
The `batdetect2` formatter can also write the legacy CNN feature sidecar CSVs.
This is controlled through the outputs config.
Example:
```yaml
format:
name: batdetect2
write_cnn_features_csv: true
transform:
detection_transforms: []
clip_transforms: []
```
When enabled, BatDetect2 writes:
- one `.json` file per recording,
- one detection `.csv` file per recording,
- one `_cnn_features.csv` file per recording when detections are present.
- Use `batdetect2` only when you need the legacy JSON shape.
## Related pages

View File

@ -4,42 +4,50 @@ Welcome to the BatDetect2 documentation.
## What is BatDetect2?
`batdetect2` is a deep learning model and software package for detecting and
classifying bat echolocation calls in high-frequency audio recordings.
`batdetect2` detects bat echolocation calls in audio recordings.
You can use it from the command line or from Python, depending on how much
control you need.
It can help you screen large collections of recordings, find files that need
expert review, and support ecology and conservation work where manual review
alone would be slow.
In practice, BatDetect2 scans a recording, finds sounds that look like bat
calls, and returns one result for each detected call.
Each result can include where the call appears in the recording, shown as a box
with start and end time and the lowest and highest frequency, how confident the
model is that it found a call, and how strongly it matches the available
classes.
In practice, BatDetect2 takes recordings, looks for likely bat calls, draws a
box around each detected event, and scores the most likely class for that event.
The built-in default model is trained for 17 UK species.
The package also supports custom training, fine-tuning, evaluation, and more
advanced workflows from Python.
The current default model is trained for 17 UK species.
For more detail on the underlying approach, see the pre-print:
The library also supports custom training, fine-tuning, evaluation, and more
advanced use from Python.
For details on the underlying approach, see the pre-print:
[Towards a General Approach for Bat Echolocation Detection and Classification](https://www.biorxiv.org/content/10.1101/2022.12.14.520490v1)
```{warning}
Treat outputs as model predictions, not ground truth.
Always validate on reviewed local data before using results for ecological inference.
```
## A good first use for BatDetect2
## What can I do with it?
BatDetect2 is a good fit when you want to:
- scan many recordings for likely bat activity,
- prioritize files for expert review,
- compare outputs across projects with appropriate caution,
- build reviewed local datasets for later model improvement.
It is not a substitute for validation.
## Main user journeys
- I want to run the model on my recordings:
{doc}`tutorials/run-inference-on-folder`
- I write code and want to use it from Python:
- I write code and want to use Python:
{doc}`tutorials/integrate-with-a-python-pipeline`
- I want to train or fine-tune a custom model:
{doc}`tutorials/train-a-custom-model`
- I want to evaluate a trained model on held-out data:
{doc}`tutorials/evaluate-on-a-test-set`
```{warning}
Treat outputs as model predictions, not ground truth.
Always validate on reviewed local data before using results for ecological inference.
```
```{note}
Looking for the previous BatDetect2 workflow?
See {doc}`legacy/index`.
@ -55,7 +63,7 @@ Then choose the section that matches what you need.
If you are here mainly to run the model on recordings, start with Tutorials.
| Section | Best for | Start here |
| ------------- | --------------------------------------------- | ------------------------ |
| --- | --- | --- |
| Tutorials | Step-by-step routes for the most common tasks | {doc}`tutorials/index` |
| How-to guides | Answers to specific practical questions | {doc}`how_to/index` |
| Reference | Detailed command and settings help | {doc}`reference/index` |
@ -82,17 +90,6 @@ Mac Aodha, O., Martinez Balvanera, S., Damstra, E., et al.
_Towards a General Approach for Bat Echolocation Detection and Classification_.
bioRxiv.
or the bibtex entry
```bibtex
@article{batdetect2_2022,
title = {Towards a General Approach for Bat Echolocation Detection and Classification},
author = {Mac Aodha, Oisin and Mart\'{i}nez Balvanera, Santiago and Damstra, Elise and Cooke, Martyn and Eichinski, Philip and Browning, Ella and Barataud, Michel and Boughey, Katherine and Coles, Roger and Giacomini, Giada and MacSwiney G., M. Cristina and Obrist, Martin K. and Parsons, Stuart and Sattler, Thomas and Jones, Kate E.},
journal = {bioRxiv},
year = {2022}
}
```
```{toctree}
:maxdepth: 1
:caption: Get Started

View File

@ -1,50 +1,38 @@
# CLI workflow: `batdetect2 detect`
# Legacy CLI workflow: `batdetect2 detect`
This page documents the previous CLI workflow based on `batdetect2 detect`.
```{warning}
This is documentation for a previous version of batdetect2.
This is legacy documentation.
For new workflows, use `batdetect2 process directory` instead.
If you are migrating, start with {doc}`migration-guide`.
```
## Processing a folder of audio files
## Legacy command shape
```bash
batdetect2 detect AUDIO_DIR ANN_DIR DETECTION_THRESHOLD
```
Example:
Common legacy options included:
- `--cnn_features`
- `--spec_features`
- `--time_expansion_factor`
- `--save_preds_if_empty`
- `--model_path`
## Current replacement
The closest current CLI entry point is:
```bash
batdetect2 detect example_data/audio/ example_data/anns/ 0.3
batdetect2 process directory \
path/to/model.ckpt \
path/to/audio_dir \
path/to/outputs
```
This command scans a directory of audio files, runs the BatDetect2 detector on
each file, and writes BatDetect2-style outputs into `ANN_DIR`.
Those outputs usually include one JSON file and one CSV file per recording, and
can optionally include extra feature CSVs.
`AUDIO_DIR` is the folder containing the input `.wav` files.
`ANN_DIR` is the folder where model outputs are written.
`DETECTION_THRESHOLD` controls which detections are kept.
Predictions below this score are discarded.
Smaller values keep more detections, but usually also increase mistakes.
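
The effect of `DETECTION_THRESHOLD` can be illustrated with plain numbers. This is a minimal sketch, not the detector's actual code: `keep_above_threshold` is a hypothetical helper, the scores are made up, and keeping a score exactly equal to the threshold is an assumption.

```python
def keep_above_threshold(scores, threshold):
    """Keep scores at or above the threshold; lower scores are discarded."""
    return [score for score in scores if score >= threshold]


# Made-up detection scores for one recording.
scores = [0.12, 0.31, 0.58, 0.29, 0.90]

kept_low = keep_above_threshold(scores, 0.3)   # permissive threshold
kept_high = keep_above_threshold(scores, 0.5)  # stricter threshold

print(len(kept_low), len(kept_high))
```

Lowering the threshold from 0.5 to 0.3 keeps one more detection here, which is the trade-off described above: more detections survive, including weaker ones that are more likely to be mistakes.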
Common options:
- `--cnn_features` Write extra CNN feature CSV files for each recording.
- `--spec_features` Extract and write traditional acoustic spectrogram feature
CSV files.
These are saved as `*_spec_features.csv` files.
- `--time_expansion_factor` Set the time expansion factor used for all files in
the run.
- `--save_preds_if_empty` Save output files even when no detections are found.
- `--model_path` Use a specific checkpoint instead of the included default
model.
If omitted, the command uses the default model trained on UK data.
## Related pages
- Migration guide:

View File

@ -0,0 +1,34 @@
# Legacy feature extraction outputs
The previous BatDetect2 workflow exposed several output concepts that users may still rely on.
These included:
- `cnn_feats`
- `spec_features`
- `spec_slices`
## Why this matters
Users exploring older notebooks or downstream analysis code often encounter these names first.
The current stack exposes a different surface centered on per-detection `features` plus configurable output formatters.
## Migration note
There is not always a strict one-to-one replacement.
When migrating, validate which part of the old workflow you actually need:
- low-level exported features,
- spectrogram slices,
- model-internal feature vectors,
- legacy JSON output shape.
Then map that need onto the current API and output format configuration.
## Related pages
- Migration guide: {doc}`migration-guide`
- Current features explanation: {doc}`../explanation/extracted-features-and-embeddings`
- Output formats reference: {doc}`../reference/output-formats`

View File

@ -1,8 +1,9 @@
# BatDetect2 v1.0 documentation
# Legacy documentation
This section documents the BatDetect2 workflow for version 1.
This section documents the previous BatDetect2 workflow.
Use these pages if you need to keep working with the older `batdetect2 detect` command or the older `batdetect2.api` interface.
Use these pages if you need to keep working with the older `batdetect2 detect`
command or the older `batdetect2.api` interface.
For new projects, we recommend the current workflow:
@ -24,5 +25,6 @@ New users should start with {doc}`../getting_started` and {doc}`../tutorials/ind
cli-detect
python-api
feature-extraction
migration-guide
```

View File

@ -1,123 +1,107 @@
# BatDetect2 2.0 migration guide
# Migration guide: legacy to current workflows
Use this guide when moving from BatDetect2 1.x workflows to the CLI and API in
2.x.
Use this guide when moving from the previous BatDetect2 workflow to the current
CLI and API.
## Why migrate
## Who should migrate now
You get access to newer features.
The codebase changed quite a bit and now gives you much more control over the
workflow through config files, improved training and fine-tuning code, and a
more flexible sound target definition system.
You should migrate if:
You can also run newer or improved models.
That includes updated versions of the UK model, plus other models trained with
the newer codebase.
- you are starting a new workflow,
- you want the current docs path,
- you want the newer CLI and API surface,
- you are maintaining code that does not depend on the exact legacy JSON or
feature outputs.
We are no longer actively supporting version 1.
No new enhancements are planned there, and only major bug fixes may still be
considered.
Future work is focused on version 2, including compatibility with newer Python
versions.
You may need the legacy workflow a bit longer if:
## Deprecation plan
- downstream tooling depends on the exact old output structure,
- you rely on older notebooks built around `batdetect2.api`,
- you depend on legacy feature extraction outputs without a validated
replacement yet.
We have kept the `batdetect2.api` module and the `batdetect2 detect` CLI command
in place for now.
You can keep using them without changing your current workflow.
However, many of the internal functions were relocated, removed or modified.
If your code relied on anything outside of the `api` module, it may break.
It is worth checking the new docs first, since there may already be a newer
feature that covers your use case.
If not, please open an issue.
Because the old `api` and CLI command are now redundant with the newer stack, we
plan to remove them in about a year.
If you want to keep pipelines up to date and long-running, it is a good idea to
migrate to version 2.
## How to migrate
If you are only using the `batdetect2 detect` CLI command or the
`batdetect2.api` module, the migration should be fairly simple.
This guide only covers these two entry points.
### CLI mapping
## CLI mapping
- `batdetect2 detect AUDIO_DIR ANN_DIR DETECTION_THRESHOLD` -> `batdetect2
process directory AUDIO_DIR OUTPUT_PATH --detection-threshold
DETECTION_THRESHOLD ...`
process directory MODEL_PATH AUDIO_DIR OUTPUT_PATH --detection-threshold ...`
Main changes:
- outputs can be written in different formats.
See the output format reference for the available options.
- the detection threshold is now an option instead of a required positional
argument.
- options like saving CNN features are now controlled through config rather than
command flags.
- there are separate subcommands for processing a directory, file list, or
dataset.
- the model path is now a positional argument on the `process` subcommand,
- the current workflow expects an explicit checkpoint path rather than silently
relying on the old default CLI behavior,
- output formatting is configurable,
- threshold override is an option rather than a required positional argument,
- there are separate subcommands for directory, file-list, and dataset-driven
inference.
### Python API mapping
## Python API mapping
- old:
`import batdetect2.api as api`
- current:
`from batdetect2 import BatDetect2API`
`from batdetect2.api_v2 import BatDetect2API`
Typical migration shape:
```python
from pathlib import Path
from batdetect2 import BatDetect2API
from batdetect2.api_v2 import BatDetect2API
# If no checkpoint is provided, the default UK model is loaded
api = BatDetect2API.from_checkpoint()
api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
prediction = api.process_file(Path("path/to/audio.wav"))
```
Useful replacements:
- `batdetect2.api.process_file` -> current `BatDetect2API.process_file`
- `batdetect2.api.process_audio` -> current `BatDetect2API.process_audio`
- `batdetect2.api.process_spectrogram` -> current
`BatDetect2API.process_spectrogram`
- one-off batch loops -> `BatDetect2API.process_files` or CLI `process`
- legacy `process_file` -> current `BatDetect2API.process_file`
- legacy `process_audio` -> current `BatDetect2API.process_audio`
- legacy `process_spectrogram` -> current `BatDetect2API.process_spectrogram`
- legacy one-off batch loops -> current `process_files` or CLI `process`
### Model changes
## Output and terminology changes
The default checkpoint used by the new CLI `process` commands and by
`BatDetect2API` is a newer model trained from scratch using the updated training
code, but the same model architecture, training procedure, and data.
Performance did not change substantially, but some differences are still
expected.
Legacy workflows often centered on:
### Species names
- BatDetect2-style JSON output,
- `cnn_feats`,
- `spec_features`,
- `spec_slices`.
For the default UK model there are two naming changes:
Current workflows center on:
1. The original model misspelled `Barbastella barbastellus` as
`Barbastellus barbastellus`.
This has now been corrected.
2. `Eptesicus serotinus` has recently been renamed to `Cnephaeus serotinus`.
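
If downstream code or older result files still use the previous names, a small lookup can bring them in line. This is a sketch under the assumption that species names appear as plain strings in your data; `RENAMED_SPECIES` and `update_species_name` are illustrative names, not part of `batdetect2`.

```python
# Name changes for the default UK model, as listed in this guide.
RENAMED_SPECIES = {
    "Barbastellus barbastellus": "Barbastella barbastellus",  # typo corrected
    "Eptesicus serotinus": "Cnephaeus serotinus",  # taxonomic rename
}


def update_species_name(name: str) -> str:
    """Return the current name for a species, leaving other names unchanged."""
    return RENAMED_SPECIES.get(name, name)
```

Applying the same mapping to both old and new outputs before comparing them avoids counting a rename as a disagreement between models.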
- `ClipDetections` and `Detection` objects,
- per-detection `detection_score`,
- per-detection `class_scores`,
- per-detection `features`,
- configurable output formatters.
## Stay on version 1
## What to validate after migration
If you prefer not to migrate to version 2 yet, you can keep using version 1.
In that case, it is a good idea to pin your dependency:
Before replacing a legacy workflow in production or research analysis, validate:
```bash
pip install "batdetect2>=1.3.1,<2"
```
- that thresholds are still appropriate,
- that outputs are being saved in the right format,
- that downstream code reads the new outputs correctly,
- that feature-related assumptions still hold,
- that you treat evaluation results and ecological interpretation as unchanged
only where you have actually verified them.
## Migration checklist
1. Identify the old entry points you use.
2. Replace them with the current CLI or `BatDetect2API` equivalents.
3. Choose an output format explicitly.
4. Re-run on a small reviewed subset.
5. Compare outputs and downstream behavior.
6. Update any notebooks or scripts that assume legacy field names.
## Related pages
- Getting started:
- Current getting started:
{doc}`../getting_started`
- Tutorials:
- Current tutorials:
{doc}`../tutorials/index`
- API reference:
- Current API reference:
{doc}`../reference/api`

View File

@ -3,52 +3,37 @@
This page documents the previous Python API workflow based on `batdetect2.api`.
```{warning}
This is documentation for a previous version of batdetect2.
For new workflows, use `batdetect2.BatDetect2API`.
This is legacy documentation.
For new workflows, use `batdetect2.api_v2.BatDetect2API`.
If you are migrating, start with {doc}`migration-guide`.
```
## Using BatDetect2 in Python
## Legacy entry points
If you prefer to process data inside a Python script, you can use the `batdetect2.api` module.
Common legacy functions included:
This interface gives you a simple entry point for running the built-in BatDetect2 model and also exposes the default model and default configuration more directly than the current API.
- `process_file`
- `process_audio`
- `process_spectrogram`
- `load_audio`
- `generate_spectrogram`
- `postprocess`
You can process a whole file in one step, or load audio, generate a spectrogram, and work with lower-level functions yourself.
The legacy API also exposed the default model and default config more directly.
Common functions:
## Current replacement
- `process_file` Load an audio file, run the model, and return BatDetect2-style results for that recording.
- `process_audio` Run inference on an audio array that is already loaded in memory.
- `process_spectrogram` Run inference starting from a spectrogram tensor instead of raw audio.
- `load_audio` Load and resample audio using the legacy preprocessing path.
- `generate_spectrogram` Convert audio into the spectrogram representation expected by the model.
- `postprocess` Convert raw model outputs into detections and extracted features.
Typical usage:
The current Python path is:
```python
import batdetect2.api as api
from pathlib import Path
AUDIO_FILE = "example_data/audio/20170701_213954-MYOMYS-LR_0_0.5.wav"
from batdetect2.api_v2 import BatDetect2API
# Process a whole file
results = api.process_file(AUDIO_FILE)
annotations = results["pred_dict"]["annotation"]
# Or, load audio and compute spectrograms
audio = api.load_audio(AUDIO_FILE)
spec = api.generate_spectrogram(audio)
# And process the audio or the spectrogram with the model
detections, features, spec = api.process_audio(audio)
detections, features = api.process_spectrogram(spec)
# Integrate the detections or extracted features into your own analysis
api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
prediction = api.process_file(Path("path/to/audio.wav"))
```
This interface is most useful when you want to work directly with detections, features, spectrograms, or intermediate arrays inside your own code.
## Related pages
- Migration guide: {doc}`migration-guide`

View File

@ -5,9 +5,6 @@ BatDetect2 uses separate config objects for different workflow surfaces.
Use the dedicated reference pages for each config family:
- model config
- training config
- logging config
- inference config
- evaluation config
- outputs config

View File

@ -1,42 +0,0 @@
# Detections reference
These are the main prediction objects returned by BatDetect2 inference methods.
Defined in `batdetect2.postprocess.types`.
## `ClipDetections`
`ClipDetections` represents the predictions for one clip or one full recording.
Fields:
- `clip`
- the `soundevent` clip metadata for the processed audio.
- `detections`
- list of `Detection` objects for that clip.
## `Detection`
`Detection` represents one detected event.
Fields:
- `geometry`
- time-frequency geometry for the detected event.
- `detection_score`
- confidence that there is an event at this location.
- `class_scores`
- class ranking scores for the detected event.
- `features`
- per-detection feature vector from the model.
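
To show how these fields might be used together, here is a sketch built on a stand-in class rather than the real `Detection`. The stand-in mirrors only the field names listed above and omits `geometry`; the field types, the separate `class_names` list, and the helper functions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DetectionStandIn:
    """Illustrative stand-in mirroring some field names of `Detection`."""

    detection_score: float
    class_scores: list[float]  # assumed: one score per class, in a fixed order
    features: list[float] = field(default_factory=list)


def top_class(detection, class_names):
    """Return the (name, score) pair with the highest class score."""
    scores = detection.class_scores
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best], scores[best]


def confident(detections, min_score):
    """Keep detections whose detection_score is at least min_score."""
    return [d for d in detections if d.detection_score >= min_score]


class_names = ["Myotis daubentonii", "Pipistrellus pipistrellus"]
detections = [
    DetectionStandIn(detection_score=0.82, class_scores=[0.10, 0.75]),
    DetectionStandIn(detection_score=0.21, class_scores=[0.40, 0.05]),
]
kept = confident(detections, min_score=0.5)
```

Note that `detection_score` and `class_scores` answer different questions: the first filters on whether an event is present, the second ranks classes for an event that was kept.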
## Related pages
- Python tutorial:
{doc}`../tutorials/integrate-with-a-python-pipeline`
- API reference:
{doc}`api`
- What BatDetect2 predicts:
{doc}`../explanation/what-batdetect2-predicts`
- Features and embeddings:
{doc}`../explanation/extracted-features-and-embeddings`

View File

@ -10,10 +10,6 @@ details, or Python API entries.
cli/index
api
detections
model-config
training-config
logging-config
inference-config
evaluation-config
outputs-config

View File

@ -1,46 +0,0 @@
# Logging config reference
`AppLoggingConfig` controls which logger backend BatDetect2 uses for training,
evaluation, and inference.
Defined in `batdetect2.logging`.
## Top-level fields
- `train`
- logger config for training runs.
- `evaluation`
- logger config for evaluation runs.
- `inference`
- logger config for inference runs.
## Built-in logger backends
Current built-in logger backends are:
- `csv`
- `tensorboard`
- `mlflow`
- `dvclive`
## Default behaviour
By default:
- training uses `csv`,
- evaluation uses `csv`,
- inference uses `csv`.
With the CSV logger, training writes a `metrics.csv` file in the log folder.
Example files live under `example_data/configs/`, including
`example_data/configs/logging.yaml`.
## Related pages
- Train command reference:
{doc}`cli/train`
- Evaluate command reference:
{doc}`cli/evaluate`
- Run inference on a folder:
{doc}`../tutorials/run-inference-on-folder`

View File

@ -1,37 +0,0 @@
# Model config reference
`ModelConfig` defines the model stack used for training or fresh model
construction.
Defined in `batdetect2.models`.
## Top-level fields
- `samplerate`
- expected input sample rate.
- `architecture`
- backbone network settings.
- `preprocess`
- spectrogram preprocessing settings.
- `postprocess`
- decoding and output filtering settings.
## What this config controls
Use `ModelConfig` when you want to change things like:
- the backbone architecture,
- the spectrogram settings used by the model,
- postprocessing settings stored with the model.
Example files live under `example_data/configs/`, including
`example_data/configs/model.yaml`.
## Related pages
- Preprocessing config:
{doc}`preprocessing-config`
- Postprocess config:
{doc}`postprocess-config`
- Train command reference:
{doc}`cli/train`

View File

@ -47,29 +47,17 @@ Writes a prediction-set JSON file.
Defined by `BatDetect2OutputConfig`.
This is the legacy-compatible BatDetect2 formatter.
This is the legacy BatDetect2-style JSON output.
Key fields:
- `event_name`
- `annotation_note`
- `write_detection_csv`
- `write_cnn_features_csv`
- `save_if_empty`
- `preserve_audio_tree`
- `include_file_path`
By default it writes one `.json` file and one detection `.csv` file per
recording, preserving the input audio directory layout under the output root.
It can also write legacy `_cnn_features.csv` sidecars when
`write_cnn_features_csv` is enabled.
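
The idea of preserving the input audio directory layout under the output root can be sketched with `pathlib`. This is only an illustration: `legacy_output_paths` is a hypothetical helper, and the exact file names the formatter produces (for example whether the audio extension is kept in the name) are an assumption here, not confirmed behaviour.

```python
from pathlib import Path


def legacy_output_paths(audio_path, audio_root, output_root, cnn_features=False):
    """Mirror an audio file's relative location under the output root."""
    relative = Path(audio_path).relative_to(audio_root)
    base = Path(output_root) / relative
    # Assumed naming: swap the audio extension for each output extension.
    paths = [base.with_suffix(".json"), base.with_suffix(".csv")]
    if cnn_features:
        paths.append(base.with_name(base.stem + "_cnn_features.csv"))
    return paths


outputs = legacy_output_paths(
    "recordings/site_a/night1.wav",
    audio_root="recordings",
    output_root="outputs",
    cnn_features=True,
)
```

Keeping the relative sub-folders means two recordings with the same file name in different site folders do not overwrite each other's outputs.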
Writes one `.json` file per recording.
## Related pages
- Outputs config:
{doc}`outputs-config`
- Save predictions in different output formats:
{doc}`../how_to/save-predictions-in-different-output-formats`
- Understanding formatted outputs:
{doc}`../explanation/interpreting-formatted-outputs`
- Outputs config: {doc}`outputs-config`
- Save predictions in different output formats: {doc}`../how_to/save-predictions-in-different-output-formats`
- Understanding formatted outputs: {doc}`../explanation/interpreting-formatted-outputs`

View File

@ -24,18 +24,10 @@ The output workflow is:
## Default behavior
By default, the current stack uses the raw output formatter unless you override
it.
For CLI processing commands, omitting `--format` now leaves format selection to
the loaded outputs config.
If no outputs config is provided, the CLI still uses its command defaults.
By default, the current stack uses the raw output formatter unless you override it.
## Related pages
- Output formats:
{doc}`output-formats`
- Output transforms:
{doc}`output-transforms`
- Save predictions in different output formats:
{doc}`../how_to/save-predictions-in-different-output-formats`
- Output formats: {doc}`output-formats`
- Output transforms: {doc}`output-transforms`
- Save predictions in different output formats: {doc}`../how_to/save-predictions-in-different-output-formats`

View File

@ -1,50 +0,0 @@
# Training config reference
`TrainingConfig` controls the training loop, optimisation, data loading, losses,
and validation tasks.
Defined in `batdetect2.train.config`.
## Top-level fields
- `train_loader`
- training data loading and clipping settings.
- `val_loader`
- validation data loading and clipping settings.
- `optimizer`
- optimiser type and learning rate settings.
- `scheduler`
- learning-rate schedule settings.
- `loss`
- detection, classification, and size loss settings.
- `trainer`
- PyTorch Lightning trainer settings such as `max_epochs`.
- `labels`
- target label generation settings.
- `validation`
- evaluation tasks used during validation.
- `checkpoints`
- checkpoint saving settings.
## What this config controls
Use `TrainingConfig` when you want to change things like:
- batch size,
- augmentation,
- optimiser and scheduler settings,
- number of epochs,
- validation frequency,
- checkpoint behaviour.
Example files live under `example_data/configs/`, including
`example_data/configs/training.yaml`.
## Related pages
- Evaluation config:
{doc}`evaluation-config`
- Train command reference:
{doc}`cli/train`
- Fine-tune from a checkpoint:
{doc}`../how_to/fine-tune-from-a-checkpoint`

View File

@ -1,133 +1,92 @@
# Evaluate on a test set
# Tutorial: Evaluate on a test set
This tutorial shows how to evaluate a trained checkpoint on a held-out dataset
and inspect the output metrics.
Use it when you want to measure how a model performs on labelled data that was
kept aside for testing.
This tutorial is for advanced users who want to compare one trained model
against a separate test dataset.
## Before you start
You need:
- a test dataset config,
- a trained checkpoint or model alias.
- A trained model checkpoint.
- A test dataset config file.
- (Optional) Targets, audio, inference, and evaluation config overrides.
```{note}
This page is for model evaluation.
If you only want to run BatDetect2 on recordings, start with
{doc}`run-inference-on-folder` instead.
If you only want to run BatDetect2 on recordings,
start with {doc}`run-inference-on-folder` instead.
```
## What you will do
## Outcome
By the end of this tutorial you will have:
- prepared a test dataset config,
- run `batdetect2 evaluate`,
- written evaluation metrics and result files,
- identified the next pages for model choice and evaluation configuration.
- understood what to inspect first,
- identified the next pages for evaluation concepts and configuration.
## 1. Create a test dataset config
Evaluation needs a dataset config that points to the labelled data you want to
use for testing.
This is the same kind of dataset config used for training.
It explicitly declares which data sources BatDetect2 should read, including the
audio files and their annotations.
For an example, see `example_data/dataset.yaml`.
If you need help creating the dataset config, follow the dataset section in
{doc}`train-a-custom-model`.
For more detail on dataset source formats, see {doc}`../reference/data-sources`.
## 1. Start with a held-out dataset
Use a dataset that was not used for training or tuning.
A held-out dataset is simply a separate dataset kept aside for evaluation.
If you tune thresholds or configs on the same dataset that you report as final
evaluation, the results will be optimistic.
## 2. Run evaluation
For a simple run, use:
```bash
batdetect2 evaluate \
path/to/test_dataset.yaml
```
If you do not pass `--model`, BatDetect2 uses the built-in default UK model.
If you want to choose a different checkpoint, alias, or Hugging Face model, see
{doc}`../how_to/choose-a-model`.
If you want to save the results somewhere else, add `--output-dir`:
```bash
batdetect2 evaluate \
path/to/test_dataset.yaml \
--model path/to/model.ckpt \
--base-dir path/to/project_root \
--output-dir path/to/eval_outputs
```
This command loads the model, runs prediction on the test dataset, applies the
evaluation tasks, and writes the results to the output directory.
This command loads the checkpoint, runs prediction on the test dataset, applies
the chosen evaluation tasks, and writes metrics and result files to the output
directory.
## 3. Check the output files
Use `--base-dir` whenever the dataset config contains relative paths.
By default, the CLI writes evaluation outputs to `outputs/evaluation`.
That is the common case for project-local dataset files.
With the default evaluation config, a run will usually create a folder like
this:
## 3. Inspect the output directory
```text
outputs/evaluation/
version_0/
metrics.csv
hparams.yaml
```
Look for:
The most important file is `metrics.csv`.
It contains the metric values computed for the evaluation run.
- summary metrics,
- generated plots,
- saved prediction files if they were enabled,
- enough metadata to reproduce the run later.
A file like this might start like:
The exact set depends on the configured evaluation tasks and plots.
```csv
classification/average_precision/barbar,classification/average_precision/cneser,...,detection/average_precision
0.898695170879364,0.9408193826675415,...,0.851219117641449
```
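If you want to read these metrics from Python, the standard library is enough. The following is a minimal sketch rather than part of BatDetect2: `load_metrics` is a hypothetical helper, and the sample text is made up to match the shape of the excerpt above.

```python
import csv
import io


def load_metrics(csv_text: str) -> dict[str, float]:
    """Return the first data row of a metrics CSV as {column: value}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    row = next(reader)
    # Skip empty cells, which may appear when a metric was not computed.
    return {name: float(value) for name, value in row.items() if value}


# Made-up sample with the same layout as the excerpt above.
sample = (
    "classification/average_precision/barbar,detection/average_precision\n"
    "0.8987,0.8512\n"
)

metrics = load_metrics(sample)
for name, value in sorted(metrics.items()):
    print(f"{name}: {value:.3f}")
```

For a real run you would pass the file contents instead, for example `load_metrics(Path("outputs/evaluation/version_0/metrics.csv").read_text())`, assuming the default output location described above.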
## 4. Interpret the results in context
The exact columns depend on the evaluation tasks you run.
Do not reduce evaluation to a single number.
The `hparams.yaml` file records the config used for the evaluation run.
Check:
## 4. Expect extra plots and files when configs enable them
- which task the metric belongs to,
- which thresholding or matching assumptions were used,
- whether class-level behavior matches your use case,
- whether the failures are concentrated in specific taxa, sites, or recording
conditions.
You may also see extra outputs such as plots and saved predictions.
## 5. Record the evaluation setup
For example, if you run evaluation with `example_data/configs/evaluation.yaml`,
you should expect a richer output folder with:
Keep the command, config files, checkpoint path, and dataset version together.
- `metrics.csv`
- `hparams.yaml`
- a `plots/` directory
- a `predictions/` directory
That matters for reproducibility and for later model comparisons.
That config enables more evaluation tasks and plots than the default setup.
## What to do next
So, depending on your evaluation config, you may see files such as:
- precision-recall plots,
- ROC curves,
- confusion matrices,
- example detection plots,
- saved prediction files.
If you want to control which tasks run and which plots are generated, see
{doc}`../reference/evaluation-config` and
{doc}`../how_to/choose-and-configure-evaluation-tasks`.
## Common next steps
- Choose a different model:
{doc}`../how_to/choose-a-model`
- Compare thresholds on representative files:
{doc}`../how_to/tune-detection-threshold`
- Configure evaluation tasks:
{doc}`../how_to/choose-and-configure-evaluation-tasks`
- Interpret evaluation artifacts:

View File

@ -1,14 +1,12 @@
# Tutorials
Welcome to the `batdetect2` tutorials.
Tutorials are the default learning path.
These tutorials walk you step by step through the most common use cases and
workflows.
They follow the simplest route and are a good place to start with `batdetect2`.
Each tutorial follows one recommended route from start to finish.
Use {doc}`../how_to/index` for focused guides on specific tasks, or
{doc}`../explanation/index` if you want to understand the concepts in more
depth.
Use tutorials when you want the simplest route to a concrete outcome.
Use {doc}`../how_to/index` when you need to customize a workflow.
```{toctree}
:maxdepth: 1

View File

@ -1,50 +1,62 @@
# Integrate with a Python pipeline
# Tutorial: Integrate with a Python pipeline
This tutorial shows a simple Python workflow for loading audio, running BatDetect2, and inspecting the detections.
This tutorial shows a minimal Python workflow for loading audio, running
batdetect2, and collecting detections for downstream analysis.
Use it when you want to work directly in Python rather than through the CLI.
This tutorial is for people who already want to work in Python.
If you mainly want to run the model on recordings, start with {doc}`run-inference-on-folder` instead.
If you mainly want to run the model on recordings,
start with {doc}`run-inference-on-folder` instead.
## Before you start
You need:
- BatDetect2 installed in your Python environment.
- A model checkpoint.
- At least one input audio file.
- BatDetect2 installed in your Python environment,
- at least one input audio file.
```{note}
This page is more technical than the standard first-run tutorial.
You do not need this page for a normal first use of BatDetect2.
```
## What you will do
If you are working from this repository checkout, you can start with:
```text
src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
```
## Outcome
By the end of this tutorial you will have:
- created a `BatDetect2API` object,
- run inference on one file,
- inspected detections, scores, and features,
- used lower-level audio and spectrogram methods for more control,
- identified the next API workflows for batch processing, training, fine-tuning, and evaluation.
- inspected the top class, class-score list, and detection score,
- identified where to go next for feature extraction, saving predictions, and batch workflows.
## 1. Create the API instance
For a first run, use the built-in default UK model:
Load the checkpoint once and reuse the API object for multiple files.
```python
from batdetect2 import BatDetect2API
from pathlib import Path
# If you don't specify a checkpoint the default model will be loaded
api = BatDetect2API.from_checkpoint()
from batdetect2.api_v2 import BatDetect2API
api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
```
If you want to use a different checkpoint later, see {doc}`../how_to/choose-a-model`.
## 2. Run inference on one file
`process_file` is the simplest Python entry point when you want one prediction object per recording.
```python
from batdetect2 import BatDetect2API
from pathlib import Path
api = BatDetect2API.from_checkpoint()
prediction = api.process_file("path/to/audio.wav")
from batdetect2.api_v2 import BatDetect2API
api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
prediction = api.process_file(Path("path/to/audio.wav"))
for detection in prediction.detections:
top_class = api.get_top_class_name(detection)
@ -52,34 +64,21 @@ for detection in prediction.detections:
print(top_class, score)
```
## 3. Understand the prediction objects
`prediction` is a `ClipDetections` object.
See {doc}`../reference/detections` for the full reference.
Very briefly, `ClipDetections` represents all detections for one processed clip or recording.
It includes:
It contains:
- the clip metadata,
- the list of detections for that clip.
- a list of detections,
- a box for each detected event,
- one detection score per event,
- a full list of class scores per event,
- a feature vector per event.
Each item in `prediction.detections` is a `Detection` object.
## 3. Inspect class scores, not just the top class
Each `Detection` includes:
- the time-frequency geometry of the event,
- a detection score,
- the class scores,
- a feature vector.
## 4. Inspect detection score and class scores
The detection score and the class scores answer different questions.
- `detection_score` is about whether the model thinks there is a call at that time-frequency location.
- `class_scores` are about which class the model prefers for that detected event.
So a detection can have a fairly strong detection score, but still have a more uncertain class ranking.
If you are exploring results,
it is often useful to inspect the full ranked class-score list.
```python
for detection in prediction.detections:
@ -90,71 +89,30 @@ for detection in prediction.detections:
print(f" {class_name}: {score:.3f}")
```
If you want more detail on class-score inspection, see {doc}`../how_to/inspect-class-scores-in-python`.
This helps separate two different questions:
## 5. Inspect the detection features
- "Did the model think there was a call here?"
- "If there was a call, which class did it score highest?"
Each detection also carries a `features` vector.
## 4. Keep the first workflow small
These are internal model features attached to the detection.
They can be useful for things like:
Before scaling up, run the API on a few representative files and inspect the results manually.
- exploratory visualisation,
- clustering similar detections,
- comparing detections across files,
- building downstream analysis pipelines.
This catches path issues and obviously implausible outputs early.
They are useful descriptors, but they are not direct ecological labels by themselves.
## 5. Move to the right next workflow
For more detail, see {doc}`../how_to/inspect-detection-features-in-python` and {doc}`../explanation/extracted-features-and-embeddings`.
Once the single-file path is working, choose the next page based on what you need:
## 6. Use lower-level audio and spectrogram methods for more control
- save predictions to disk,
- inspect class scores more carefully,
- inspect detection features,
- process many files in one run.
If you want finer control over what gets processed and when, the API also lets you work step by step.
## What to do next
For example, you can load the audio yourself, inspect the waveform length, generate the spectrogram, and then run detection on that spectrogram:
```python
from batdetect2 import BatDetect2API
api = BatDetect2API.from_checkpoint()
audio = api.load_audio("path/to/audio.wav")
print(audio.shape)
spec = api.generate_spectrogram(audio)
print(spec.shape)
detections = api.process_spectrogram(spec)
print(len(detections))
```
This is helpful when you want to:
- inspect the loaded audio before inference,
- inspect the generated spectrogram,
- control which audio segment is processed,
- run only part of the pipeline in custom code.
You can also call `process_audio(audio)` directly if you already have the waveform array in memory.
## 7. Use the wider API workflows
The Python API is not only for single-file inference.
It also exposes methods for batch processing, training, evaluation, and fine-tuning.
Examples:
- `process_files(...)` for batch processing from Python,
- `train(...)` for training,
- `evaluate(...)` for evaluation,
- `finetune(...)` for fine-tuning.
Useful next pages:
- Choose a different model: {doc}`../how_to/choose-a-model`
- Run batch predictions: {doc}`../how_to/run-batch-predictions`
- Train a custom model: {doc}`train-a-custom-model`
- Evaluate on a test set: {doc}`evaluate-on-a-test-set`
- Fine-tune from a checkpoint: {doc}`../how_to/fine-tune-from-a-checkpoint`
- API reference: {doc}`../reference/api`
- Inspect ranked class scores: {doc}`../how_to/inspect-class-scores-in-python`
- Inspect detection features: {doc}`../how_to/inspect-detection-features-in-python`
- Save predictions to disk: {doc}`../how_to/save-predictions-in-different-output-formats`
- Learn the CLI happy path: {doc}`run-inference-on-folder`

View File

@ -1,217 +1,120 @@
# Run BatDetect2 on a folder of audio files
# Tutorial: Run BatDetect2 on a folder of audio files
This tutorial shows how to run BatDetect2 on a folder of recordings from the command line.
This tutorial walks through a first end-to-end inference run with the CLI.
Use it when you want a first pass over a folder of audio recordings and want to see what BatDetect2 finds.
It is the default starting point for new users.
If you want to follow the tutorial exactly, you can use the example recordings that come with the repository.
Use it when you want to run an existing model on a folder of recordings and
quickly check what BatDetect2 found.
## Before you start
You need:
- BatDetect2 installed in your environment.
- A folder containing `.wav` files.
- A model checkpoint path.
- BatDetect2 installed.
- A folder containing supported audio files.
- A place to save the results.
A checkpoint is the saved model file that BatDetect2 uses to make predictions.
If you have not installed BatDetect2 yet, start with {doc}`../getting_started`.
If you are working from this repository checkout, you can use:
## Optional: use the repository example files
If you want to follow the steps with the same paths shown here, clone the repository and move into it:
```bash
git clone https://github.com/macaodha/batdetect2.git
cd batdetect2
```text
src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
```
Then you can use these example paths from the repository root.
## What you will do
## Outcome
By the end of this tutorial you will have:
- run `batdetect2 process directory`,
- saved predictions to disk,
- checked that BatDetect2 wrote the files you expected,
- tried a second run with a higher detection threshold,
- identified the next pages to use if you want to customise the run.
- checked that BatDetect2 wrote output files,
- identified the next pages to use for tuning or customization.
## 1. Choose your input and output folders
## 1. Choose your input and output paths
Pick:
Pick three paths:
- the folder containing your audio files,
- an output folder where BatDetect2 should save results.
- the checkpoint to use,
- the directory containing your audio files,
- an output directory where BatDetect2 will save its results.
Example layout:
```text
project/
model.pth.tar
audio/
file_001.wav
file_002.wav
outputs/
```
If `outputs/` does not exist yet, that is fine.
BatDetect2 can create it.
## 2. Run processing on the directory
If you are using the repository example files, your layout already looks like this:
```text
batdetect2/
example_data/
audio/
20170701_213954-MYOMYS-LR_0_0.5.wav
20180530_213516-EPTSER-LR_0_0.5.wav
20180627_215323-RHIFER-LR_0_0.5.wav
```
## 2. Run BatDetect2 on the folder
For a first run, use the built-in default UK model:
Use this command when you want BatDetect2 to scan a folder of recordings
automatically.
```bash
batdetect2 process directory \
path/to/audio \
path/to/model.pth.tar \
path/to/audio_dir \
path/to/outputs
```
If you are using the repository example files, run:
```bash
batdetect2 process directory \
example_data/audio \
example_outputs/first_run
```
What this does:
- looks for supported audio files in `path/to/audio`,
- runs the model on each recording,
- saves the results in `path/to/outputs`.
- loads the checkpoint,
- finds audio files in `audio_dir`,
- splits recordings into smaller pieces internally when needed,
- saves result files to `outputs`.
You do not need to choose a model for this first run.
If you do nothing, BatDetect2 uses the built-in default UK model.
## 3. Verify that outputs were written
If you want to use a different model later, see {doc}`../how_to/choose-a-model`.
After the command completes, inspect the output directory.
## 3. Check the output files
For a first run, the important check is simple:
After the command finishes, look in your output folder.
- did BatDetect2 create result files,
- are they in the output directory you expected,
- did it process the recordings you meant to analyze.
By default, the CLI writes predictions in the `batdetect2` output format.
This is a JSON-based format used for BatDetect2-style outputs.
Different workflows can save results in different file formats.
With the default settings, you will usually see one `.json` file and one `_detections.csv` file per recording.
You do not need to learn those details for the first run.
For the repository example run, that means files like:
If you later need to choose a specific output format, go to
{doc}`../how_to/save-predictions-in-different-output-formats`.
```text
example_outputs/first_run/
20170701_213954-MYOMYS-LR_0_0.5.wav.json
20170701_213954-MYOMYS-LR_0_0.5.wav_detections.csv
20180530_213516-EPTSER-LR_0_0.5.wav.json
20180530_213516-EPTSER-LR_0_0.5.wav_detections.csv
20180627_215323-RHIFER-LR_0_0.5.wav.json
20180627_215323-RHIFER-LR_0_0.5.wav_detections.csv
```
## 4. Inspect predictions
One of the JSON files will look roughly like this:
Start with a small subset of representative files.
```json
{
"annotated": false,
"annotation": [
{
"class": "Rhinolophus ferrumequinum",
"class_prob": 0.889,
"det_prob": 0.889,
"end_time": 0.0668,
"event": "Echolocation",
"high_freq": 84857,
"individual": "-1",
"low_freq": 67578,
"start_time": 0.0
}
]
}
```
Check:
Very briefly:
- whether detections were written for the expected recordings,
- whether output counts are plausible,
- whether the model is obviously too sensitive or too conservative,
- whether the predicted classes look broadly reasonable for your data.
- `annotated: false` means this is a prediction file, not a reviewed annotation file.
- `annotation` holds the list of detections.
- Each detection includes a predicted class, detection score, class score, time bounds, and frequency bounds.
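Because the file is plain JSON, you can inspect it with the standard library. This is a minimal sketch that assumes only the fields shown in the excerpt above; `confident_detections` is a hypothetical helper, not a BatDetect2 function, and the `0.5` cut-off is an arbitrary illustration.

```python
import json


def confident_detections(json_text: str, min_det_prob: float = 0.5) -> list[dict]:
    """Return the detections whose `det_prob` is at least `min_det_prob`."""
    prediction = json.loads(json_text)
    return [
        detection
        for detection in prediction.get("annotation", [])
        if detection["det_prob"] >= min_det_prob
    ]


# Sample copied from the excerpt above, trimmed to the fields used here.
sample = """
{
  "annotated": false,
  "annotation": [
    {
      "class": "Rhinolophus ferrumequinum",
      "det_prob": 0.889,
      "start_time": 0.0,
      "end_time": 0.0668
    }
  ]
}
"""

for detection in confident_detections(sample):
    duration_ms = 1000 * (detection["end_time"] - detection["start_time"])
    print(f'{detection["class"]}: {detection["det_prob"]:.3f} ({duration_ms:.1f} ms)')
```

To apply it to a real output file, read the text first, for example with `Path("path/to/outputs/file_001.wav.json").read_text()`.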
Do not treat the first run as validated ecological output.
For more detail, see {doc}`../explanation/interpreting-formatted-outputs`.
If you want to save results in another format, see {doc}`../how_to/save-predictions-in-different-output-formats`.
The first run is a workflow check.
## 4. Run the same folder with a higher threshold
Validation comes next.
If you want, you can also run the same folder again with a higher detection threshold and save that run in a separate output folder.
## 5. Tune only after you have a baseline
```bash
batdetect2 process directory \
path/to/audio \
path/to/outputs_threshold_05 \
--detection-threshold 0.5
```
If the first run is too noisy or misses obvious calls, tune thresholds on a
reviewed subset rather than changing settings blindly across the full dataset.
Concrete example:
Use {doc}`../how_to/tune-detection-threshold` for that process.
```bash
batdetect2 process directory \
example_data/audio \
example_outputs/threshold_05 \
--detection-threshold 0.5
```
## What to do next
Keeping this in a separate folder makes it easy to compare runs later.
## 5. Run the model on a list of recordings
If you only want to process selected recordings, use `file_list`.
The list file should contain one recording path per line.
Example `audio_files.txt`:
```text
path/to/audio/file_001.wav
path/to/audio/file_002.wav
path/to/audio/file_010.wav
```
Repository example:
```text
example_data/audio/20170701_213954-MYOMYS-LR_0_0.5.wav
example_data/audio/20180530_213516-EPTSER-LR_0_0.5.wav
```
Then run:
```bash
batdetect2 process file_list \
path/to/audio_files.txt \
path/to/selected_outputs
```
Concrete example:
```bash
batdetect2 process file_list \
example_data/audio_files.txt \
example_outputs/selected_outputs
```
This is useful when your recordings are spread across folders, or when you only want to run a chosen subset.
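If you would rather build the list file from Python than write it by hand, a sketch along these lines may help. `file_list_text` is a hypothetical helper, and filtering on a `.wav` suffix is an assumption made for illustration; the only thing taken from the text above is that the list holds one recording path per line.

```python
from pathlib import Path


def file_list_text(paths, suffix: str = ".wav") -> str:
    """Return one path per line, keeping only files with the given suffix."""
    selected = sorted(
        str(path) for path in map(Path, paths) if path.suffix.lower() == suffix
    )
    return "".join(f"{path}\n" for path in selected)


# Made-up paths spread across two folders, plus one non-audio file.
listing = file_list_text(
    [
        "site_b/file_010.wav",
        "site_a/file_001.wav",
        "site_a/notes.txt",
    ]
)
print(listing, end="")
```

To cover a real folder tree, you could pass `Path("path/to/audio").rglob("*")` as `paths` and save the result with `Path("path/to/audio_files.txt").write_text(...)`.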
## Common next steps
- If your recordings are not all in one folder, or you want to compare input modes, see {doc}`../how_to/choose-an-inference-input-mode`.
- If you want to save results in another format, see {doc}`../how_to/save-predictions-in-different-output-formats`.
- If you want to choose a different model, see {doc}`../how_to/choose-a-model`.
- If you already write code and want more control from Python, see {doc}`integrate-with-a-python-pipeline`.
- If you want the full command reference, including `--model`, see {doc}`../reference/cli/predict`.
- If you need a different input mode, use
{doc}`../how_to/choose-an-inference-input-mode`.
- If you want to tune sensitivity, use
{doc}`../how_to/tune-detection-threshold`.
- If you already write code and want more control from Python, use
{doc}`integrate-with-a-python-pipeline`.
- If you need full command details, use {doc}`../reference/cli/predict`.

View File

@ -1,208 +1,85 @@
# Train a custom model
# Tutorial: Train a custom model
This tutorial walks through a first custom training run using your own annotations.
This tutorial walks through a first custom training run using your own
annotations.
Use it when you already have labelled recordings and want to train a model for your own data.
This tutorial is for advanced users who already have dataset files and want to train a model on their own annotated data.
## Before you start
You need:
- BatDetect2 installed.
- labelled recordings and annotations.
- A training dataset config file.
- (Optional) A validation dataset config file.
- A targets config file if you are not using the default target setup.
- A model config file if you are not training from the built-in defaults.
```{note}
This is not the first page to start with if you only want to run the existing
model on recordings.
This is not the first page to start with if you only want to run the existing model on recordings.
Use {doc}`run-inference-on-folder` for that.
```
## Optional: use the repository example files
If you want to follow the steps with the same files shown here, clone the repository and move into it:
```bash
git clone https://github.com/macaodha/batdetect2.git
cd batdetect2
```
## What you will do
## Outcome
By the end of this tutorial you will have:
- created a dataset config,
- defined a targets config,
- started a training run,
- checked the checkpoint and log outputs,
- identified the next pages for evaluation and customisation.
- written checkpoints and logs,
- understood the minimum settings involved,
- identified the next pages for fine-tuning and evaluation.
## 1. Create a dataset config
## 1. Gather the minimum required inputs
The dataset config explicitly declares what data you want to use for training.
It is a YAML file.
If YAML is new to you, see [Learn YAML in Y Minutes](https://learnxinyminutes.com/yaml/).
At minimum, a custom training run needs:
In the dataset config, you list one or more data sources.
Each source tells `batdetect2` where the audio recordings live and where the matching annotations are stored.
- a training dataset config,
- optional validation dataset config,
- either a model config for a fresh run or a checkpoint for continued training,
- optional settings files for targets, audio, training, evaluation, inference, outputs, and logging.
BatDetect2 can read annotations from different source formats.
In this example, we use the example data in the `batdetect2` format.
The most important point is that the dataset file, target definitions, and preprocessing choices need to agree with each other.
Use `example_data/dataset.yaml` as a reference:
## 2. Run a first training command
```yaml
name: example dataset
description: Only for demonstration purposes
sources:
- format: batdetect2
name: Example Data
description: Examples included for testing batdetect2
annotations_dir: example_data/anns
audio_dir: example_data/audio
```
For your own project, the main thing to change is the file paths.
If you have several collections of recordings, you can add more than one source to the same dataset config.
That lets you describe the full training data you want to use in one place.
If you need more detail on dataset source formats, see {doc}`../reference/data-sources`.
## 2. Define a targets config
The targets config tells BatDetect2 how to turn your annotations into training targets.
It defines two main things:
- what should count as a detection,
- which classes the model should learn to predict.
In practice, this means the targets config maps the labels in your annotations to the detection and classification outputs used during training.
Use `example_data/targets.yaml` as a reference:
```yaml
detection_target:
name: bat
match_if:
name: all_of
conditions:
- name: has_tag
tag: { key: event, value: Echolocation }
- name: not
condition:
name: has_tag
tag: { key: class, value: Unknown }
assign_tags:
- key: class
value: Bat
classification_targets:
- name: myomys
tags:
- key: class
value: Myotis mystacinus
- name: pippip
tags:
- key: class
value: Pipistrellus pipistrellus
```
For your own project, update the matching rules and class definitions so they fit your labels.
In this example:
- `detection_target` says that echolocation calls should be treated as detections, unless they are tagged with the class `Unknown`,
- `classification_targets` define the classes the model should predict.
It is worth taking a bit of time over this file, because your targets config decides what the model is actually being asked to learn.
If you need help with that, see {doc}`../how_to/configure-target-definitions` and {doc}`../reference/targets-config-workflow`.
## 3. Run a first training command
For a first run, keep the command simple:
Use a command like this for a fresh run:
```bash
batdetect2 train \
path/to/train_dataset.yaml \
--val-dataset path/to/val_dataset.yaml \
--targets path/to/targets.yaml
--targets path/to/targets.yaml \
--model-config path/to/model.yaml \
--training-config path/to/training.yaml
```
If you are using the repository example files, run:
Use `--model` instead of `--model-config` when you want to continue from an existing checkpoint.
```bash
batdetect2 train \
example_data/dataset.yaml \
--val-dataset example_data/dataset.yaml \
--targets example_data/targets.yaml
```
## 3. Check that outputs are being written
This uses the same dataset for training and validation only to keep the example simple.
For real training runs, you usually want separate training and validation datasets.
After the command starts, verify that:
This uses the built-in default model and training settings.
If you want to change the model architecture later, see {doc}`../reference/model-config`.
If you want to change optimiser settings, batch size, epochs, or checkpoint behaviour, see {doc}`../reference/training-config`.
- the run initializes without configuration errors,
- checkpoints are written to the checkpoint directory,
- logs are written to the log directory or configured logger backend,
- the training and validation datasets load as expected.
## 4. Check the training outputs
## 4. Run a sanity inference pass after training
After the run starts, `batdetect2` should write checkpoints and logs.
Do not wait until full evaluation to confirm that the trained checkpoint behaves sensibly.
By default, training logs are written with the CSV logger.
That means you should see a log folder with a `metrics.csv` file.
Take a small reviewed subset of recordings and run a quick prediction pass with the new checkpoint.
A typical layout looks like this:
That catches setup mismatches early, especially around targets and preprocessing.
```text
outputs/
checkpoints/
epoch=19-step=20.ckpt
logs/
version_0/
metrics.csv
hparams.yaml
training_artifacts/
train_dataset.yaml
val_dataset.yaml
targets.yaml
train_class_summary.csv
val_class_summary.csv
```
## 5. Evaluate on held-out data
The checkpoint is the trained model you can use later for inference, evaluation, or sharing with someone else.
Once the checkpoint looks sensible on a small sanity subset, run the formal evaluation workflow on a held-out test set.
The files in `training_artifacts/` record which datasets and targets were used for the run.
The `hparams.yaml` file records the full training setup, including the configs used for the model, training, and other parts of the run.
That is where you should compare models, thresholds, and task-level performance metrics.
The `metrics.csv` file stores one row per validation epoch.
It includes training losses as well as validation losses and metrics such as:
```csv
classification/mean_average_precision,detection/average_precision,epoch,total_loss/val
0.10041624307632446,0.3697187900543213,0,4070.3515625
0.11328697204589844,0.346899151802063,1,3941.6455078125
0.1388484090566635,0.36171725392341614,2,3776.323974609375
```
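A quick way to summarise a log like this is to find the epoch with the highest value of one metric. The sketch below uses only the standard library and the rows from the excerpt above; `best_epoch` is a hypothetical helper, not a BatDetect2 function.

```python
import csv
import io


def best_epoch(csv_text: str, metric: str) -> tuple[int, float]:
    """Return (epoch, value) for the row with the highest value of `metric`."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Ignore rows where the metric cell is empty.
    scored = [
        (float(row[metric]), int(row["epoch"])) for row in rows if row.get(metric)
    ]
    value, epoch = max(scored)
    return epoch, value


# Rows copied from the excerpt above.
sample = """\
classification/mean_average_precision,detection/average_precision,epoch,total_loss/val
0.10041624307632446,0.3697187900543213,0,4070.3515625
0.11328697204589844,0.346899151802063,1,3941.6455078125
0.1388484090566635,0.36171725392341614,2,3776.323974609375
"""

epoch, value = best_epoch(sample, "detection/average_precision")
print(f"best detection AP: {value:.3f} at epoch {epoch}")
```

This picks the maximum, so it suits precision-style metrics. For a loss column such as `total_loss/val`, lower is better and you would want the minimum instead.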
You may also see class-specific metrics in extra columns.
The more detailed metrics are computed from the validation set.
If you do not provide `--val-dataset`, those validation metrics will not appear.
Other logger backends are also supported, including TensorBoard, MLflow, and DVCLive.
See {doc}`../reference/logging-config` if you want to change that.
## Use the trained model
You can now use the trained checkpoint in BatDetect2, or share it with someone else to use in their own runs.
If you want to load it for inference or evaluation, see {doc}`../how_to/choose-a-model`.
## Common next steps
## What to do next
- Evaluate the trained checkpoint: {doc}`evaluate-on-a-test-set`
- Fine-tune from a checkpoint: {doc}`../how_to/fine-tune-from-a-checkpoint`
- Configure targets in more detail: {doc}`../how_to/configure-target-definitions`
- Configure audio preprocessing: {doc}`../how_to/configure-audio-preprocessing`
- Configure spectrogram preprocessing: {doc}`../how_to/configure-spectrogram-preprocessing`
- Configure targets: {doc}`../how_to/configure-target-definitions`
- Configure preprocessing: {doc}`../how_to/configure-audio-preprocessing`
- Check full train options: {doc}`../reference/cli/train`

View File

@ -1,2 +0,0 @@
example_data/audio/20170701_213954-MYOMYS-LR_0_0.5.wav
example_data/audio/20180530_213516-EPTSER-LR_0_0.5.wav

View File

@ -37,10 +37,10 @@ classifiers = [
"Intended Audience :: Science/Research",
"Natural Language :: English",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
"Topic :: Software Development :: Libraries :: Python Modules",
"Topic :: Multimedia :: Sound/Audio :: Analysis",

5
run_batdetect.py Normal file
View File

@ -0,0 +1,5 @@
"""Run batdetect2.command.main() from the command line."""
from batdetect2.cli import detect
if __name__ == "__main__":
detect()

View File

@ -1,5 +1,4 @@
import logging
import warnings
from typing import TYPE_CHECKING
from loguru import logger
@ -7,18 +6,15 @@ from loguru import logger
if TYPE_CHECKING:
from batdetect2.api_v2 import BatDetect2API
__all__ = ["BatDetect2API", "__version__"]
__version__ = "1.1.1"
logger.disable("batdetect2")
# Silences the irrelevant warning
warnings.filterwarnings("ignore", message="The pynvml package is deprecated")
warnings.filterwarnings("ignore", message=".*isinstance(treespec, LeafSpec).*")
numba_logger = logging.getLogger("numba")
numba_logger.setLevel(logging.WARNING)
__all__ = ["BatDetect2API", "__version__"]
__version__ = "1.1.1"
def __getattr__(name: str):
if name == "BatDetect2API":

View File

@ -27,15 +27,6 @@ def process() -> None:
def common_predict_options(func):
"""Attach options shared by all ``process`` subcommands."""
@click.option(
"--model",
"model_path",
type=str,
help=(
"Path to a checkpoint, checkpoint alias, or a Hugging Face "
"URI to fine-tune from. Defaults to uk_same"
),
)
@click.option(
"--audio-config",
type=click.Path(exists=True),
@ -86,8 +77,7 @@ def common_predict_options(func):
type=str,
help=(
"Output format name used by the prediction writer. If omitted, "
"the loaded outputs config is used, or batdetect2 when no "
"outputs config is provided."
"the config default is used."
),
)
@click.option(
@ -107,7 +97,7 @@ def common_predict_options(func):
def _build_api(
model_path: str | None,
model_path: str,
audio_config: Path | None,
inference_config: Path | None,
outputs_config: Path | None,
@ -139,7 +129,7 @@ def _build_api(
)
api = BatDetect2API.from_checkpoint(
path=model_path,
model_path,
audio_config=audio_conf,
inference_config=inference_conf,
outputs_config=outputs_conf,
@ -149,7 +139,7 @@ def _build_api(
def _run_prediction(
model_path: str | None,
model_path: str,
audio_files: list[Path],
output_path: Path,
audio_config: Path | None,
@ -160,7 +150,6 @@ def _run_prediction(
num_workers: int,
format_name: str | None,
detection_threshold: float | None,
audio_dir: Path | None = None,
) -> None:
logger.info("Initiating prediction process...")
@ -184,16 +173,11 @@ def _run_prediction(
detection_threshold=detection_threshold,
)
if audio_dir is None:
audio_dir = audio_files[0].parent if audio_files else None
if format_name is None and outputs_conf is None:
format_name = "batdetect2"
common_path = audio_files[0].parent if audio_files else None
api.save_predictions(
predictions,
path=output_path,
audio_dir=audio_dir,
audio_dir=common_path,
format=format_name,
)
@ -206,11 +190,12 @@ def _run_prediction(
name="directory",
short_help="Process audio files in a directory.",
)
@click.argument("model_path", type=str)
@click.argument("audio_dir", type=click.Path(exists=True))
@click.argument("output_path", type=click.Path())
@common_predict_options
def predict_directory_command(
model_path: str | None,
model_path: str,
audio_dir: Path,
output_path: Path,
audio_config: Path | None,
@ -242,7 +227,6 @@ def predict_directory_command(
num_workers=num_workers,
format_name=format_name,
detection_threshold=detection_threshold,
audio_dir=audio_dir,
)
@ -250,13 +234,14 @@ def predict_directory_command(
name="file_list",
short_help="Process paths listed in a text file.",
)
@click.argument("model_path", type=str)
@click.argument("file_list", type=click.Path(exists=True))
@click.argument("output_path", type=click.Path())
@common_predict_options
def predict_file_list_command(
model_path: str,
file_list: Path,
output_path: Path,
model_path: str | None,
audio_config: Path | None,
inference_config: Path | None,
outputs_config: Path | None,
@ -297,13 +282,14 @@ def predict_file_list_command(
name="dataset",
short_help="Process recordings from a dataset config.",
)
@click.argument("model_path", type=str)
@click.argument("dataset_path", type=click.Path(exists=True))
@click.argument("output_path", type=click.Path())
@common_predict_options
def predict_dataset_command(
model_path: str,
dataset_path: Path,
output_path: Path,
model_path: str | None,
audio_config: Path | None,
inference_config: Path | None,
outputs_config: Path | None,

View File

@ -104,7 +104,7 @@ LoggerConfig = Annotated[
class AppLoggingConfig(BaseConfig):
train: LoggerConfig = Field(default_factory=CSVLoggerConfig)
train: LoggerConfig = Field(default_factory=TensorBoardLoggerConfig)
evaluation: LoggerConfig = Field(default_factory=CSVLoggerConfig)
inference: LoggerConfig = Field(default_factory=CSVLoggerConfig)

View File

@ -1,8 +1,7 @@
from typing import TYPE_CHECKING, Any, NamedTuple, Protocol
from typing import Any, NamedTuple, Protocol
import torch
if TYPE_CHECKING:
from batdetect2.postprocess.types import PostprocessorProtocol
from batdetect2.preprocess.types import PreprocessorProtocol
@ -117,8 +116,8 @@ class DetectorProtocol(ModuleProtocol, Protocol):
class ModelProtocol(ModuleProtocol, Protocol):
detector: DetectorProtocol
preprocessor: "PreprocessorProtocol"
postprocessor: "PostprocessorProtocol"
preprocessor: PreprocessorProtocol
postprocessor: PostprocessorProtocol
class_names: list[str]
dimension_names: list[str]

View File

@ -27,10 +27,6 @@ def make_path_relative(path: PathLike, audio_dir: PathLike) -> Path:
return path.relative_to(audio_dir)
audio_parts = audio_dir.parts
if audio_parts and path.parts[: len(audio_parts)] == audio_parts:
return Path(*path.parts[len(audio_parts) :])
return path
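
The prefix fallback touched by this hunk can be read in isolation. The sketch below is a hypothetical stand-alone helper (`strip_shared_prefix` is not a name from the repository) that reproduces only the lines visible in the hunk. It assumes both arguments are already `Path` objects, and it leaves out whatever the real `make_path_relative` does before reaching this point.

```python
from pathlib import Path


def strip_shared_prefix(path: Path, audio_dir: Path) -> Path:
    """Drop the leading parts of ``path`` that match ``audio_dir``.

    Hypothetical helper mirroring the fallback shown in the hunk above.
    """
    audio_parts = audio_dir.parts
    # Compare part by part so that "audio" never matches "audio_backup".
    if audio_parts and path.parts[: len(audio_parts)] == audio_parts:
        # With no remaining parts, Path() evaluates to Path(".").
        return Path(*path.parts[len(audio_parts):])
    # No shared prefix: return the path unchanged.
    return path


nested = strip_shared_prefix(
    Path("example_data/audio/subdir/clip.wav"),
    Path("example_data/audio"),
)
same_dir = strip_shared_prefix(
    Path("example_data/audio"),
    Path("example_data/audio"),
)
unrelated = strip_shared_prefix(
    Path("other/clip.wav"),
    Path("example_data/audio"),
)
```

The first two calls correspond to the cases covered by the `make_path_relative` tests elsewhere in this comparison. The third shows the unchanged path returned when there is no shared prefix.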

View File

@ -1,9 +1,8 @@
import json
from pathlib import Path
from typing import List, Literal, Sequence, TypedDict, cast
from typing import List, Literal, Sequence, TypedDict
import numpy as np
import pandas as pd
from soundevent import data
from soundevent.geometry import compute_bounds
@ -14,6 +13,7 @@ from batdetect2.outputs.formats.base import (
)
from batdetect2.outputs.types import OutputFormatterProtocol
from batdetect2.postprocess.types import ClipDetections, Detection
from batdetect2.targets import terms
from batdetect2.targets.types import TargetProtocol
try:
@ -24,7 +24,7 @@ except ImportError:
DictWithClass = TypedDict("DictWithClass", {"class": str})
class Annotation(DictWithClass, total=False):
class Annotation(DictWithClass):
start_time: float
end_time: float
low_freq: float
@ -33,7 +33,6 @@ class Annotation(DictWithClass, total=False):
det_prob: float
individual: str
event: str
cnn_features: NotRequired[list[float]] # ty: ignore[invalid-type-form]
class FileAnnotation(TypedDict):
@ -53,14 +52,6 @@ class BatDetect2OutputConfig(BaseConfig):
event_name: str = "Echolocation"
annotation_note: str = "Automatically generated."
class_label_mode: Literal["class_name", "decoded_tag"] = "decoded_tag"
decoded_label_key: str = "dwc:scientificName"
fallback_to_class_name: bool = True
write_detection_csv: bool = True
write_cnn_features_csv: bool = False
save_if_empty: bool = False
preserve_audio_tree: bool = True
include_file_path: bool = False
class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
@ -69,26 +60,10 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
targets: TargetProtocol,
event_name: str,
annotation_note: str,
class_label_mode: Literal["class_name", "decoded_tag"] = "decoded_tag",
decoded_label_key: str = "dwc:scientificName",
fallback_to_class_name: bool = True,
write_detection_csv: bool = True,
write_cnn_features_csv: bool = False,
save_if_empty: bool = False,
preserve_audio_tree: bool = True,
include_file_path: bool = False,
):
self.targets = targets
self.event_name = event_name
self.annotation_note = annotation_note
self.class_label_mode = class_label_mode
self.decoded_label_key = decoded_label_key
self.fallback_to_class_name = fallback_to_class_name
self.write_detection_csv = write_detection_csv
self.write_cnn_features_csv = write_cnn_features_csv
self.save_if_empty = save_if_empty
self.preserve_audio_tree = preserve_audio_tree
self.include_file_path = include_file_path
def format(
self, predictions: Sequence[ClipDetections]
@ -109,57 +84,22 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
path.mkdir(parents=True)
for prediction in predictions:
annotations = prediction["annotation"]
pred_path = path / (prediction["id"] + ".json")
if not annotations and not self.save_if_empty:
continue
pred_path = self.get_output_path(prediction, path, audio_dir)
pred_path.parent.mkdir(parents=True, exist_ok=True)
# make a copy of the prediction
data = dict(prediction)
raw_file_path = data.get("file_path")
if audio_dir is not None and isinstance(raw_file_path, str):
data["file_path"] = str(
make_path_relative(raw_file_path, audio_dir)
if audio_dir is not None and "file_path" in prediction:
prediction["file_path"] = str(
make_path_relative(
prediction["file_path"],
audio_dir,
)
)
if not self.include_file_path:
data.pop("file_path", None)
annotations = cast(list[Annotation], data["annotation"])
data["annotation"] = [
{
key: value
for key, value in annotation.items()
if key != "cnn_features"
}
for annotation in annotations
]
pred_path.write_text(json.dumps(data, indent=2, sort_keys=True))
if self.write_detection_csv:
self.save_detection_csv(
prediction,
pred_path.with_suffix(".csv"),
)
if self.write_cnn_features_csv:
self.save_cnn_features_csv(
prediction,
pred_path.with_name(pred_path.stem + "_cnn_features.csv"),
)
pred_path.write_text(json.dumps(prediction))
def load(self, path: data.PathLike) -> List[FileAnnotation]:
path = Path(path)
if path.is_file():
files = [path] if path.suffix == ".json" else []
else:
files = sorted(path.rglob("*.json"))
files = list(path.glob("*.json"))
if not files:
return []
@ -168,121 +108,12 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
json.loads(file.read_text()) for file in files if file.is_file()
]
def get_output_path(
self,
prediction: FileAnnotation,
output_dir: Path,
audio_dir: data.PathLike | None,
) -> Path:
if (
self.preserve_audio_tree
and audio_dir is not None
and "file_path" in prediction
):
relative_path = make_path_relative(
prediction["file_path"],
audio_dir,
)
return (
output_dir / relative_path.parent / f"{prediction['id']}.json"
)
return output_dir / f"{prediction['id']}.json"
def save_detection_csv(
self,
prediction: FileAnnotation,
path: Path,
) -> None:
annotations = prediction["annotation"]
def get_recording_class(self, annotations: List[Annotation]) -> str:
if not annotations:
return
return ""
preds_df = pd.DataFrame(annotations)[
[
"det_prob",
"start_time",
"end_time",
"high_freq",
"low_freq",
"class",
"class_prob",
]
]
preds_df.to_csv(path, sep=",")
def save_cnn_features_csv(
self, prediction: FileAnnotation, path: Path
) -> None:
annotations = prediction["annotation"]
if not annotations:
return
cnn_features = [
annotation["cnn_features"]
for annotation in annotations
if "cnn_features" in annotation
]
if not cnn_features:
return
cnn_feats_df = pd.DataFrame(
cnn_features,
columns=[str(ii) for ii in range(len(cnn_features[0]))],
)
cnn_feats_df.to_csv(
path,
sep=",",
index=False,
float_format="%.5f",
)
def get_class_name(self, class_index: int) -> str:
class_name = self.targets.class_names[class_index]
if self.class_label_mode == "class_name":
return class_name
tags = self.targets.decode_class(class_name)
default = class_name if self.fallback_to_class_name else None
decoded = data.find_tag_value(
tags,
key=self.decoded_label_key,
default=default,
)
if decoded is None:
raise ValueError(
"Could not decode class label using key "
f"{self.decoded_label_key!r} for class {class_name!r}."
)
return decoded
def get_recording_class(self, detections: Sequence[Detection]) -> str:
if not detections:
return "None"
class_scores = np.stack(
[detection.class_scores for detection in detections],
axis=1,
)
detection_scores = np.array(
[detection.detection_score for detection in detections],
dtype=np.float32,
)
weighted_scores = (class_scores * detection_scores).sum(axis=1)
total = weighted_scores.sum()
if total <= 0:
return "None"
top_class_index = int(np.argmax(weighted_scores / total))
return self.get_class_name(top_class_index)
highest_scoring = max(annotations, key=lambda x: x["class_prob"])
return highest_scoring["class"]
def format_prediction(self, prediction: ClipDetections) -> FileAnnotation:
recording = prediction.clip.recording
@ -292,19 +123,26 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
for pred in prediction.detections
]
file_annotation = FileAnnotation(
return FileAnnotation(
id=recording.path.name,
file_path=str(recording.path),
annotated=False,
duration=round(float(recording.duration), 4),
duration=recording.duration,
issues=False,
time_exp=recording.time_expansion,
class_name=self.get_recording_class(prediction.detections),
class_name=self.get_recording_class(annotations),
notes=self.annotation_note,
annotation=annotations,
file_path=str(recording.path),
)
return file_annotation
def get_class_name(self, class_index: int) -> str:
class_name = self.targets.class_names[class_index]
tags = self.targets.decode_class(class_name)
return data.find_tag_value(
tags,
term=terms.generic_class,
default=class_name,
) # type: ignore
def format_sound_event_prediction(
self, prediction: Detection
@ -317,20 +155,16 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
top_class_score = float(prediction.class_scores[top_class_index])
top_class = self.get_class_name(top_class_index)
annotation: Annotation = {
"start_time": round(float(start_time), 4),
"end_time": round(float(end_time), 4),
"low_freq": int(low_freq),
"high_freq": int(high_freq),
"class_prob": round(top_class_score, 3),
"det_prob": round(float(prediction.detection_score), 3),
"individual": "-1",
"start_time": start_time,
"end_time": end_time,
"low_freq": low_freq,
"high_freq": high_freq,
"class_prob": top_class_score,
"det_prob": float(prediction.detection_score),
"individual": "",
"event": self.event_name,
"class": top_class,
}
if self.write_cnn_features_csv:
annotation["cnn_features"] = prediction.features.tolist() # type: ignore[index]
return annotation
@output_formatters.register(BatDetect2OutputConfig)
@ -340,12 +174,4 @@ class BatDetect2Formatter(OutputFormatterProtocol[FileAnnotation]):
targets,
event_name=config.event_name,
annotation_note=config.annotation_note,
class_label_mode=config.class_label_mode,
decoded_label_key=config.decoded_label_key,
fallback_to_class_name=config.fallback_to_class_name,
write_detection_csv=config.write_detection_csv,
write_cnn_features_csv=config.write_cnn_features_csv,
save_if_empty=config.save_if_empty,
preserve_audio_tree=config.preserve_audio_tree,
include_file_path=config.include_file_path,
)
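
The two variants of `get_recording_class` in this file's diff can disagree on the same recording. One picks the class of the single detection with the highest `class_prob`. The other sums per-class scores weighted by each detection's score. The sketch below is a simplified pure-Python illustration, not the repository code: the function names, class names, and scores are made up, and the decoding step done by `get_class_name` is omitted.

```python
def class_from_best_detection(annotations):
    """Return the class of the detection with the highest class_prob."""
    if not annotations:
        return ""
    return max(annotations, key=lambda a: a["class_prob"])["class"]


def class_from_weighted_scores(class_names, class_scores, detection_scores):
    """Return the class with the largest detection-weighted score sum.

    ``class_scores`` holds one list of per-class scores per detection.
    """
    if not class_scores:
        return "None"
    weighted = [
        sum(scores[i] * det for scores, det in zip(class_scores, detection_scores))
        for i in range(len(class_names))
    ]
    if sum(weighted) <= 0:
        return "None"
    best = max(range(len(class_names)), key=lambda i: weighted[i])
    return class_names[best]


# Hypothetical recording: two confident detections that lean towards
# species_a, plus one weak detection strongly classified as species_b.
CLASS_NAMES = ["species_a", "species_b"]
CLASS_SCORES = [[0.60, 0.40], [0.55, 0.45], [0.05, 0.95]]
DETECTION_SCORES = [0.9, 0.9, 0.2]
ANNOTATIONS = [
    {"class": "species_a", "class_prob": 0.60},
    {"class": "species_a", "class_prob": 0.55},
    {"class": "species_b", "class_prob": 0.95},
]

by_best = class_from_best_detection(ANNOTATIONS)
by_weight = class_from_weighted_scores(CLASS_NAMES, CLASS_SCORES, DETECTION_SCORES)
```

With these made-up numbers, the single-detection rule follows the weak but sharply classified call. The weighted rule follows the two confident detections.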

View File

@ -10,9 +10,9 @@ from soundevent import data
from batdetect2.audio import AudioConfig, AudioLoader, build_audio_loader
from batdetect2.evaluate import EvaluatorProtocol, build_evaluator
from batdetect2.logging import (
CSVLoggerConfig,
LoggerConfig,
LoggingCallback,
TensorBoardLoggerConfig,
build_logger,
)
from batdetect2.models import ModelConfig, build_model
@ -165,7 +165,7 @@ def run_train(
)
train_logger = build_logger(
logger_config or CSVLoggerConfig(),
logger_config or TensorBoardLoggerConfig(),
log_dir=log_dir,
experiment_name=experiment_name,
run_name=run_name,

View File

@ -1,11 +1,8 @@
from pathlib import Path
from typing import cast
from unittest.mock import Mock
import numpy as np
import pandas as pd
import pytest
from soundevent import data as soundevent_data
from batdetect2.api_v2 import BatDetect2API
from batdetect2.outputs import build_output_formatter
@ -13,7 +10,6 @@ from batdetect2.outputs.formats import (
BatDetect2OutputConfig,
SoundEventOutputConfig,
)
from batdetect2.outputs.formats.batdetect2 import BatDetect2Formatter
from batdetect2.postprocess.types import ClipDetections
@ -82,82 +78,6 @@ def test_save_predictions_with_batdetect2_override(
assert len(loaded[0]["annotation"]) == len(file_prediction.detections)
def test_batdetect2_formatter_can_use_raw_class_names(
api_v2: BatDetect2API,
file_prediction,
tmp_path: Path,
) -> None:
output_dir = tmp_path / "batdetect2_raw_class_names"
api_v2.save_predictions(
[file_prediction],
path=output_dir,
config=BatDetect2OutputConfig(class_label_mode="class_name"),
)
loaded = cast(
list[dict], api_v2.load_predictions(output_dir, format="batdetect2")
)
first_annotation = loaded[0]["annotation"][0]
assert first_annotation["class"] in api_v2.targets.class_names
def test_batdetect2_formatter_can_use_decoded_species_tag() -> None:
targets = Mock()
targets.class_names = ["myodau"]
targets.decode_class.return_value = [
soundevent_data.Tag(
key="dwc:scientificName",
value="Myotis daubentonii",
)
]
formatter = BatDetect2Formatter(
targets=targets,
event_name="Echolocation",
annotation_note="Automatically generated.",
)
assert formatter.get_class_name(0) == "Myotis daubentonii"
def test_batdetect2_formatter_can_fallback_to_class_name_when_key_missing() -> (
None
):
targets = Mock()
targets.class_names = ["myodau"]
targets.decode_class.return_value = []
formatter = BatDetect2Formatter(
targets=targets,
event_name="Echolocation",
annotation_note="Automatically generated.",
decoded_label_key="dwc:scientificName",
fallback_to_class_name=True,
)
assert formatter.get_class_name(0) == "myodau"
def test_batdetect2_formatter_rejects_missing_decoded_key_without_fallback() -> (
None
):
targets = Mock()
targets.class_names = ["myodau"]
targets.decode_class.return_value = []
formatter = BatDetect2Formatter(
targets=targets,
event_name="Echolocation",
annotation_note="Automatically generated.",
decoded_label_key="dwc:scientificName",
fallback_to_class_name=False,
)
with pytest.raises(ValueError, match="Could not decode class label"):
formatter.get_class_name(0)
def test_load_predictions_with_format_override(
api_v2: BatDetect2API,
file_prediction,
@ -178,47 +98,6 @@ def test_load_predictions_with_format_override(
assert "annotation" in loaded_item
def test_load_predictions_with_batdetect2_nested_layout(
api_v2: BatDetect2API,
example_audio_files: list[Path],
tmp_path: Path,
) -> None:
output_dir = tmp_path / "batdetect2_nested"
predictions = [
api_v2.process_file(audio_file) for audio_file in example_audio_files
]
api_v2.save_predictions(
predictions,
path=output_dir,
format="batdetect2",
audio_dir=example_audio_files[0].parent,
)
loaded = api_v2.load_predictions(output_dir, format="batdetect2")
assert len(loaded) == len(example_audio_files)
def test_save_predictions_with_batdetect2_writes_cnn_feature_csv(
api_v2: BatDetect2API,
file_prediction,
tmp_path: Path,
) -> None:
output_dir = tmp_path / "batdetect2_cnn"
api_v2.save_predictions(
[file_prediction],
path=output_dir,
config=BatDetect2OutputConfig(write_cnn_features_csv=True),
)
cnn_csvs = list(output_dir.rglob("*_cnn_features.csv"))
assert len(cnn_csvs) == 1
loaded_df = pd.read_csv(cnn_csvs[0])
assert not loaded_df.empty
def test_save_predictions_with_soundevent_override(
api_v2: BatDetect2API,
file_prediction,
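
The nested-layout test in this file depends on how `load` discovers JSON files. The formatter diff shows both `path.rglob("*.json")` and `path.glob("*.json")`. The sketch below illustrates the difference with the standard library only. The helper names and the directory layout are hypothetical and are not taken from the repository.

```python
import tempfile
from pathlib import Path


def find_json_files(root: Path, recursive: bool) -> list[Path]:
    """List JSON files under ``root``, optionally descending into subfolders."""
    matches = root.rglob("*.json") if recursive else root.glob("*.json")
    return sorted(matches)


def demo_nested_layout() -> tuple[int, int]:
    """Create one top-level and one nested JSON file, then count both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "site_a").mkdir()
        (root / "top.wav.json").write_text("{}")
        (root / "site_a" / "nested.wav.json").write_text("{}")
        flat = find_json_files(root, recursive=False)
        deep = find_json_files(root, recursive=True)
        return len(flat), len(deep)


flat_count, deep_count = demo_nested_layout()
```

Only the recursive search sees files written into subfolders, which matters whenever outputs mirror the audio directory tree.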

View File

@ -1,16 +1,12 @@
"""Behavior tests for process CLI workflows."""
import json
from pathlib import Path
import pandas as pd
import pytest
from click.testing import CliRunner
from soundevent import data, io
from batdetect2.cli import cli
from batdetect2.outputs import OutputsConfig
from batdetect2.outputs.formats import BatDetect2OutputConfig
def test_cli_process_help() -> None:
@ -39,7 +35,6 @@ def test_cli_process_directory_runs_on_real_audio(
[
"process",
"directory",
"--model",
str(tiny_checkpoint_path),
str(single_audio_dir),
str(output_path),
@ -57,190 +52,6 @@ def test_cli_process_directory_runs_on_real_audio(
assert len(list(output_path.glob("*.json"))) == 1
@pytest.mark.slow
def test_cli_process_directory_runs_on_example_audio_data(
tmp_path: Path,
tiny_checkpoint_path: Path,
example_audio_dir: Path,
example_audio_files: list[Path],
) -> None:
"""User story: process the bundled example audio directory."""
output_path = tmp_path / "predictions"
result = CliRunner().invoke(
cli,
[
"process",
"directory",
"--model",
str(tiny_checkpoint_path),
str(example_audio_dir),
str(output_path),
"--batch-size",
"1",
"--workers",
"0",
"--format",
"batdetect2",
],
)
assert result.exit_code == 0
assert output_path.exists()
assert len(list(output_path.glob("*.json"))) == len(example_audio_files)
@pytest.mark.slow
def test_cli_process_directory_batdetect2_matches_legacy_artifacts(
tmp_path: Path,
tiny_checkpoint_path: Path,
example_audio_dir: Path,
example_audio_files: list[Path],
example_anns_dir: Path,
) -> None:
"""User story: process batdetect2 output matches legacy-style files."""
output_path = tmp_path / "predictions"
result = CliRunner().invoke(
cli,
[
"process",
"directory",
"--model",
str(tiny_checkpoint_path),
str(example_audio_dir),
str(output_path),
"--batch-size",
"1",
"--workers",
"0",
"--format",
"batdetect2",
],
)
assert result.exit_code == 0
json_files = sorted(output_path.rglob("*.json"))
csv_files = sorted(output_path.rglob("*.csv"))
assert len(json_files) == len(example_audio_files)
assert len(csv_files) == len(example_audio_files)
expected_names = sorted(
audio_file.name for audio_file in example_audio_files
)
assert sorted(path.stem for path in json_files) == expected_names
assert sorted(path.stem for path in csv_files) == expected_names
first_output = json.loads(json_files[0].read_text())
assert "file_path" not in first_output
assert isinstance(first_output["class_name"], str)
assert first_output["class_name"]
first_annotation = first_output["annotation"][0]
assert first_annotation["individual"] == "-1"
assert isinstance(first_annotation["high_freq"], int)
assert isinstance(first_annotation["low_freq"], int)
expected_json = json.loads(
(example_anns_dir / json_files[0].name).read_text()
)
assert first_output["id"] == expected_json["id"]
assert first_output["time_exp"] == expected_json["time_exp"]
first_csv = pd.read_csv(csv_files[0], index_col=0)
assert list(first_csv.columns) == [
"det_prob",
"start_time",
"end_time",
"high_freq",
"low_freq",
"class",
"class_prob",
]
@pytest.mark.slow
def test_cli_process_directory_batdetect2_writes_cnn_features_csv_when_enabled(
tmp_path: Path,
tiny_checkpoint_path: Path,
example_audio_dir: Path,
) -> None:
"""User story: request legacy CNN feature CSV sidecars via config."""
output_path = tmp_path / "predictions"
outputs_config_path = tmp_path / "outputs.yaml"
outputs_config_path.write_text(
OutputsConfig(
format=BatDetect2OutputConfig(write_cnn_features_csv=True)
).to_yaml_string()
)
result = CliRunner().invoke(
cli,
[
"process",
"directory",
"--model",
str(tiny_checkpoint_path),
str(example_audio_dir),
str(output_path),
"--batch-size",
"1",
"--workers",
"0",
"--outputs-config",
str(outputs_config_path),
],
)
assert result.exit_code == 0
cnn_csvs = sorted(output_path.rglob("*_cnn_features.csv"))
assert len(cnn_csvs) == 3
first_df = pd.read_csv(cnn_csvs[0])
assert not first_df.empty
assert list(first_df.columns) == [
str(ii) for ii in range(len(first_df.columns))
]
def test_cli_process_directory_defaults_to_batdetect2_without_output_options(
tmp_path: Path,
tiny_checkpoint_path: Path,
single_audio_dir: Path,
) -> None:
"""User story: default process output stays batdetect2 for CLI users."""
output_path = tmp_path / "predictions"
result = CliRunner().invoke(
cli,
[
"process",
"directory",
"--model",
str(tiny_checkpoint_path),
str(single_audio_dir),
str(output_path),
"--batch-size",
"1",
"--workers",
"0",
],
)
assert result.exit_code == 0
assert output_path.exists()
assert len(list(output_path.glob("*.json"))) == 1
assert len(list(output_path.glob("*.csv"))) == 1
assert len(list(output_path.glob("*.nc"))) == 0
def test_cli_process_file_list_runs_on_real_audio(
tmp_path: Path,
tiny_checkpoint_path: Path,
@ -259,7 +70,6 @@ def test_cli_process_file_list_runs_on_real_audio(
[
"process",
"file_list",
"--model",
str(tiny_checkpoint_path),
str(file_list),
str(output_path),
@ -307,7 +117,6 @@ def test_cli_process_dataset_runs_on_aoef_metadata(
[
"process",
"dataset",
"--model",
str(tiny_checkpoint_path),
str(dataset_path),
str(output_path),
@ -350,7 +159,6 @@ def test_cli_process_directory_supports_output_format_override(
[
"process",
"directory",
"--model",
str(tiny_checkpoint_path),
str(single_audio_dir),
str(output_path),
@ -409,7 +217,6 @@ def test_cli_process_dataset_deduplicates_recordings(
[
"process",
"dataset",
"--model",
str(tiny_checkpoint_path),
str(dataset_path),
str(output_path),
@ -440,7 +247,6 @@ def test_cli_process_rejects_unknown_output_format(
[
"process",
"directory",
"--model",
str(tiny_checkpoint_path),
str(single_audio_dir),
str(output_path),

View File

@ -1,21 +0,0 @@
from pathlib import Path
from batdetect2.outputs.formats.base import make_path_relative
def test_make_path_relative_strips_shared_relative_prefix() -> None:
audio_dir = Path("example_data/audio")
path = Path("example_data/audio/subdir/clip.wav")
relative = make_path_relative(path, audio_dir)
assert relative == Path("subdir/clip.wav")
def test_make_path_relative_returns_dot_for_matching_relative_dir() -> None:
audio_dir = Path("example_data/audio")
path = Path("example_data/audio")
relative = make_path_relative(path, audio_dir)
assert relative == Path(".")