mirror of
https://github.com/macaodha/batdetect2.git
synced 2026-05-22 22:32:18 +02:00
docs: expand core user workflow tutorials
parent 9635a858bd
commit 9dec35b1ce
@@ -3,33 +3,89 @@
This tutorial shows how to evaluate a trained checkpoint on a held-out dataset
and inspect the output metrics.

This tutorial is for advanced users who want to evaluate one trained model on a separate test dataset.

## Before you start

- A trained model checkpoint.
- A test dataset config file.
- (Optional) Targets, audio, inference, and evaluation config overrides.

## Tutorial steps
```{note}
This page is for model evaluation.
If you only want to run BatDetect2 on recordings,
start with {doc}`run-inference-on-folder` instead.
```

1. Select a checkpoint and a test dataset.
2. Run `batdetect2 evaluate`.
3. Inspect output metrics and prediction artifacts.
4. Record evaluation settings for reproducibility.
## Outcome

## Example command
By the end of this tutorial you will have:

- run `batdetect2 evaluate`,
- written evaluation metrics and result files,
- understood what to inspect first,
- identified the next pages for evaluation concepts and configuration.

## 1. Start with a held-out dataset

Use a dataset that was not used for training or tuning.

A held-out dataset is simply a separate dataset kept aside for evaluation.

If you tune thresholds or configs on the same dataset that you report as final evaluation, the reported results will be optimistically biased.

## 2. Run evaluation

```bash
batdetect2 evaluate \
  path/to/model.ckpt \
  path/to/test_dataset.yaml \
  --base-dir path/to/project_root \
  --output-dir path/to/eval_outputs
```

This command loads the checkpoint,
runs prediction on the test dataset,
applies the chosen evaluation tasks,
and writes metrics and result files to the output directory.

Use `--base-dir` whenever the dataset config contains relative paths.

That is the common case for project-local dataset files.
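
If you launch evaluation from a script rather than a shell, the same command can be assembled as an argument list. The following is a minimal sketch, not part of BatDetect2: `build_evaluate_command` is a hypothetical helper, and it uses only the two positional arguments and the `--base-dir` and `--output-dir` options shown in the command above.

```python
def build_evaluate_command(checkpoint, dataset_config, base_dir, output_dir):
    """Assemble the `batdetect2 evaluate` call as an argv list.

    Hypothetical helper for illustration; it is not part of BatDetect2.
    """
    return [
        "batdetect2",
        "evaluate",
        str(checkpoint),
        str(dataset_config),
        "--base-dir",
        str(base_dir),
        "--output-dir",
        str(output_dir),
    ]


command = build_evaluate_command(
    "path/to/model.ckpt",
    "path/to/test_dataset.yaml",
    "path/to/project_root",
    "path/to/eval_outputs",
)
print(" ".join(command))
# To run it for real: import subprocess; subprocess.run(command, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when a path contains spaces.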

## 3. Inspect the output directory

Look for:

- summary metrics,
- generated plots,
- saved prediction files if they were enabled,
- enough metadata to reproduce the run later.

The exact set depends on the configured evaluation tasks and plots.
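
Because the exact contents vary, a quick inventory of the output directory is a reasonable first check. This is a sketch using only the Python standard library. `summarize_output_dir` is a hypothetical helper, and the file names in the demo are made up; they are not the names BatDetect2 writes.

```python
import tempfile
from collections import Counter
from pathlib import Path


def summarize_output_dir(output_dir):
    """Count files under output_dir (recursively), grouped by extension."""
    counts = Counter()
    for path in Path(output_dir).rglob("*"):
        if path.is_file():
            counts[path.suffix or "(no extension)"] += 1
    return dict(counts)


# Demo on a throwaway directory. The file names are placeholders for
# illustration only; real evaluation outputs will differ.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "plots").mkdir()
    for name in ["metrics.json", "plots/a.png", "plots/b.png"]:
        (root / name).write_text("placeholder")
    summary = summarize_output_dir(root)

for suffix, count in sorted(summary.items()):
    print(f"{suffix}: {count}")
```

In practice you would call `summarize_output_dir("path/to/eval_outputs")` after the run finishes.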

## 4. Interpret the results in context

Do not reduce evaluation to a single number.

Check:

- which task the metric belongs to,
- which thresholding or matching assumptions were used,
- whether class-level behavior matches your use case,
- whether the failures are concentrated in specific taxa, sites, or recording conditions.

## 5. Record the evaluation setup

Keep the command, config files, checkpoint path, and dataset version together.

That matters for reproducibility and for later model comparisons.
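
One lightweight way to keep those pieces together is a small manifest file stored next to the evaluation outputs. The sketch below is an assumption about how you might do this, not a BatDetect2 feature: `write_run_manifest`, the manifest file name, and the `dataset_version` label are all hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_run_manifest(manifest_path, command, checkpoint, config_files, dataset_version):
    """Write the evaluation setup to a JSON file and return it as a dict."""
    manifest = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "command": command,
        "checkpoint": str(checkpoint),
        "config_files": [str(p) for p in config_files],
        "dataset_version": dataset_version,
    }
    manifest_path = Path(manifest_path)
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest


# Demo in a throwaway directory; in practice, write next to your outputs.
with tempfile.TemporaryDirectory() as tmp:
    manifest_path = Path(tmp) / "eval_outputs" / "run_manifest.json"
    write_run_manifest(
        manifest_path,
        command="batdetect2 evaluate path/to/model.ckpt path/to/test_dataset.yaml",
        checkpoint="path/to/model.ckpt",
        config_files=["path/to/test_dataset.yaml"],
        dataset_version="test-set-v1",  # made-up label for illustration
    )
    loaded = json.loads(manifest_path.read_text(encoding="utf-8"))

print(loaded["checkpoint"])
```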

## What to do next

- Compare thresholds on representative files:
  {doc}`../how_to/tune-detection-threshold`
- Configure evaluation tasks: {doc}`../how_to/choose-and-configure-evaluation-tasks`
- Interpret evaluation artifacts: {doc}`../how_to/interpret-evaluation-outputs`
- Learn the evaluation concepts: {doc}`../explanation/evaluation-concepts-and-matching`
- Check full evaluate options: {doc}`../reference/cli/evaluate`

This page is a starter scaffold and will be expanded with a full worked
example.

@@ -3,21 +3,52 @@
This tutorial shows a minimal Python workflow for loading audio, running
batdetect2, and collecting detections for downstream analysis.

This tutorial is for people who already want to work in Python.

If you mainly want to run the model on recordings,
start with {doc}`run-inference-on-folder` instead.

## Before you start

- BatDetect2 installed in your Python environment.
- A model checkpoint.
- At least one input audio file.

## Tutorial steps
```{note}
This page is more technical than the standard first-run tutorial.
You do not need this page for a normal first use of BatDetect2.
```

1. Load BatDetect2 in Python.
2. Create an API instance from a checkpoint.
3. Run `process_file` on one audio file.
4. Read detection fields and class scores.
5. Save or pass detections to your downstream pipeline.
If you are working from this repository checkout, you can start with:

## Example code
```text
src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
```

## Outcome

By the end of this tutorial you will have:

- created a `BatDetect2API` object,
- run inference on one file,
- inspected the top class, class-score list, and detection score,
- identified where to go next for feature extraction, saving predictions, and batch workflows.

## 1. Create the API instance

Load the checkpoint once and reuse the API object for multiple files.

```python
from pathlib import Path

from batdetect2.api_v2 import BatDetect2API

api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
```

## 2. Run inference on one file

`process_file` is the simplest Python entry point when you want one prediction object per recording.

```python
from pathlib import Path
@@ -33,10 +64,55 @@ for detection in prediction.detections:
    print(top_class, score)
```

`prediction` is a `ClipDetections` object.

It contains:

- the clip metadata,
- a list of detections,
- a box for each detected event,
- one detection score per event,
- a full list of class scores per event,
- a feature vector per event.

## 3. Inspect class scores, not just the top class

If you are exploring results,
it is often useful to inspect the full ranked class-score list.

```python
for detection in prediction.detections:
    print("top class:", api.get_top_class_name(detection))
    print("detection score:", detection.detection_score)
    print("class scores:")
    for class_name, score in api.get_class_scores(detection):
        print(f" {class_name}: {score:.3f}")
```

This helps separate two different questions:

- "Did the model think there was a call here?"
- "If there was a call, which class did it score highest?"
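
To make that separation concrete, the sketch below answers the two questions independently for one detection. It works on plain Python values rather than BatDetect2 objects: it assumes class scores arrive as `(class_name, score)` pairs, as in the loop above. `rank_class_scores`, `summarize_detection`, the `0.5` threshold, and the example numbers are all made up for illustration.

```python
def rank_class_scores(class_scores, top_k=3):
    """Return the top_k (class_name, score) pairs, highest score first."""
    return sorted(class_scores, key=lambda pair: pair[1], reverse=True)[:top_k]


def summarize_detection(detection_score, class_scores, detection_threshold=0.5):
    """Answer "was there a call?" and "which class?" separately.

    The default threshold is an arbitrary placeholder, not a recommendation.
    """
    ranked = rank_class_scores(class_scores)
    top_name, top_score = ranked[0]
    return {
        "is_call": detection_score >= detection_threshold,
        "top_class": top_name,
        "top_class_score": top_score,
        "runner_up": ranked[1][0] if len(ranked) > 1 else None,
    }


# Made-up scores for illustration only.
example = summarize_detection(
    0.82,
    [
        ("Myotis daubentonii", 0.35),
        ("Pipistrellus pipistrellus", 0.55),
        ("Nyctalus noctula", 0.10),
    ],
)
print(example)
```

A confident detection with a narrow margin between the top class and the runner-up is a different situation from a weak detection with one dominant class, which is why the two values are kept apart here.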

## 4. Keep the first workflow small

Before scaling up, run the API on a few representative files and inspect the results manually.

This catches path issues and obviously implausible outputs early.

## 5. Move to the right next workflow

Once the single-file path is working, choose the next page based on what you need:

- save predictions to disk,
- inspect class scores more carefully,
- inspect detection features,
- process many files in one run.

## What to do next

- See API/config references: {doc}`../reference/index`
- Learn practical CLI alternatives: {doc}`run-inference-on-folder`

This page is a starter scaffold and will be expanded with a full worked
example.
- API reference: {doc}`../reference/api`
- Inspect ranked class scores: {doc}`../how_to/inspect-class-scores-in-python`
- Inspect detection features: {doc}`../how_to/inspect-detection-features-in-python`
- Save predictions to disk: {doc}`../how_to/save-predictions-in-different-output-formats`
- Learn the CLI happy path: {doc}`run-inference-on-folder`

@@ -1,33 +1,115 @@
# Tutorial: Run inference on a folder of audio files
# Tutorial: Run BatDetect2 on a folder of audio files

This tutorial walks through a first end-to-end inference run with the CLI.

It is the default starting point for new users.

Use it when you want to run an existing model on a folder of recordings and quickly check what BatDetect2 found.

## Before you start

- BatDetect2 installed in your environment.
- A folder containing `.wav` files.
- A model checkpoint path.

## Tutorial steps
A checkpoint is the saved model file that BatDetect2 uses to make predictions.

1. Choose your input and output directories.
2. Run prediction with the CLI.
3. Verify output files were written.
4. Inspect predictions and confidence scores.
If you are working from this repository checkout, you can use:

## Example command
```text
src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
```

## Outcome

By the end of this tutorial you will have:

- run `batdetect2 predict directory`,
- saved predictions to disk,
- checked that BatDetect2 wrote output files,
- identified the next pages to use for tuning or customization.

## 1. Choose your input and output paths

Pick three paths:

- the checkpoint to use,
- the directory containing your audio files,
- an output directory where BatDetect2 will save its results.

Example layout:

```text
project/
  model.pth.tar
  audio/
    file_001.wav
    file_002.wav
  outputs/
```
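
Before running anything, it can help to confirm that the three paths are what you think they are. The following is a small pre-flight sketch using only the standard library. `check_inputs` is a hypothetical helper, and it looks for `.wav` files only at the top level of the audio directory, which is an assumption about your layout rather than a statement about how BatDetect2 scans folders.

```python
import tempfile
from pathlib import Path


def check_inputs(checkpoint, audio_dir, output_dir):
    """Return a list of problems; an empty list means the paths look usable."""
    checkpoint, audio_dir, output_dir = Path(checkpoint), Path(audio_dir), Path(output_dir)
    problems = []
    if not checkpoint.is_file():
        problems.append(f"checkpoint not found: {checkpoint}")
    if not audio_dir.is_dir():
        problems.append(f"audio directory not found: {audio_dir}")
    elif not any(p.suffix.lower() == ".wav" for p in audio_dir.iterdir()):
        problems.append(f"no .wav files in: {audio_dir}")
    if output_dir.exists() and not output_dir.is_dir():
        problems.append(f"output path exists but is not a directory: {output_dir}")
    return problems


# Demo: recreate the example layout in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    (project / "audio").mkdir(parents=True)
    (project / "model.pth.tar").write_bytes(b"")
    (project / "audio" / "file_001.wav").write_bytes(b"")
    ok = check_inputs(project / "model.pth.tar", project / "audio", project / "outputs")
    missing = check_inputs(project / "missing.pth.tar", project / "audio", project / "outputs")

print(ok)
print(missing)
```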

## 2. Run prediction on the directory

Use this command when you want BatDetect2 to scan a folder of recordings automatically.

```bash
batdetect2 predict directory \
  path/to/model.ckpt \
  path/to/model.pth.tar \
  path/to/audio_dir \
  path/to/outputs
```

What this does:

- loads the checkpoint,
- finds audio files in `audio_dir`,
- splits recordings into smaller pieces internally when needed,
- saves result files to `outputs`.

## 3. Verify that outputs were written

After the command completes, inspect the output directory.

For a first run,
the important checks are simple:

- Did BatDetect2 create result files?
- Are they in the output directory you expected?
- Did it process the recordings you meant to analyze?

Different workflows can save results in different file formats.

You do not need to learn those details for the first run.

If you later need to choose a specific output format,
go to {doc}`../how_to/save-predictions-in-different-output-formats`.

## 4. Inspect predictions

Start with a small subset of representative files.

Check:

- whether detections were written for the expected recordings,
- whether output counts are plausible,
- whether the model is obviously too sensitive or too conservative,
- whether the predicted classes look broadly reasonable for your data.

Do not treat the first run as validated ecological output.

The first run is a workflow check.

Validation comes next.

## 5. Tune only after you have a baseline

If the first run is too noisy or misses obvious calls, tune thresholds on a reviewed subset rather than changing settings blindly across the full dataset.

Use {doc}`../how_to/tune-detection-threshold` for that process.
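
As a rough illustration of what tuning on a reviewed subset involves, the sketch below sweeps a few thresholds over detections that a person has already marked as real calls or false positives. It is plain Python with made-up numbers, not the procedure from the how-to page. `sweep_thresholds` is a hypothetical helper, and the recall it reports covers only calls the model detected at some score; calls it missed entirely do not appear in the input.

```python
def sweep_thresholds(reviewed, thresholds):
    """Summarize kept detections at each threshold.

    reviewed: list of (detection_score, is_real_call) pairs from manual review.
    Returns rows of (threshold, n_kept, precision, recall); a ratio is None
    when its denominator is zero.
    """
    total_real = sum(1 for _, is_real in reviewed if is_real)
    rows = []
    for threshold in thresholds:
        kept = [is_real for score, is_real in reviewed if score >= threshold]
        true_positives = sum(kept)
        precision = true_positives / len(kept) if kept else None
        recall = true_positives / total_real if total_real else None
        rows.append((threshold, len(kept), precision, recall))
    return rows


# Made-up review results for illustration only.
reviewed = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
rows = sweep_thresholds(reviewed, [0.3, 0.5, 0.7])

for threshold, n_kept, precision, recall in rows:
    print(threshold, n_kept, precision, recall)
```

Raising the threshold removes false positives but can also drop real calls, so compare the rows rather than picking the highest precision.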

## What to do next

- Use {doc}`../how_to/tune-detection-threshold` to tune sensitivity.
- Use {doc}`../reference/cli/index` for full command options.

This page is a starter scaffold and will be expanded with a full worked
example.
- If you need a different input mode, use {doc}`../how_to/choose-an-inference-input-mode`.
- If you want to tune sensitivity, use {doc}`../how_to/tune-detection-threshold`.
- If you already write code and want more control from Python, use {doc}`integrate-with-a-python-pipeline`.
- If you need full command details, use {doc}`../reference/cli/predict`.

@@ -3,21 +3,44 @@
This tutorial walks through a first custom training run using your own
annotations.

This tutorial is for advanced users who already have dataset files and want to train a model on their own annotated data.

## Before you start

- BatDetect2 installed.
- A training dataset config file.
- (Optional) A validation dataset config file.
- A targets config file if you are not using the default target setup.
- A model config file if you are not training from the built-in defaults.

## Tutorial steps
```{note}
This is not the first page to start with if you only want to run the existing model on recordings.
Use {doc}`run-inference-on-folder` for that.
```

1. Prepare training and validation dataset config files.
2. Choose target definitions and model/training config files.
3. Run `batdetect2 train`.
4. Check that checkpoints and logs are written.
5. Run a quick sanity inference on a small audio subset.
## Outcome

## Example command
By the end of this tutorial you will have:

- started a training run,
- written checkpoints and logs,
- understood the minimum settings involved,
- identified the next pages for fine-tuning and evaluation.

## 1. Gather the minimum required inputs

At minimum, a custom training run needs:

- a training dataset config,
- an optional validation dataset config,
- either a model config for a fresh run or a checkpoint for continued training,
- optional settings files for targets, audio, training, evaluation, inference, outputs, and logging.

The most important point is that the dataset file, target definitions, and preprocessing choices need to agree with each other.

## 2. Run a first training command

Use a command like this for a fresh run:

```bash
batdetect2 train \
@@ -28,10 +51,35 @@ batdetect2 train \
  --training-config path/to/training.yaml
```

Use `--model` instead of `--model-config` when you want to continue from an existing checkpoint.

## 3. Check that outputs are being written

After the command starts, verify that:

- the run initializes without configuration errors,
- checkpoints are written to the checkpoint directory,
- logs are written to the log directory or configured logger backend,
- the training and validation datasets load as expected.
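
A quick way to confirm that checkpoints are appearing is to look for the most recently modified file in the checkpoint directory. The sketch below uses only the standard library. `newest_file` is a hypothetical helper, and the `*.ckpt` pattern is an assumption based on the `path/to/model.ckpt` examples in these tutorials; adjust it to whatever your run actually writes.

```python
import os
import tempfile
from pathlib import Path


def newest_file(directory, pattern="*.ckpt"):
    """Return the most recently modified file matching pattern, or None."""
    candidates = [p for p in Path(directory).rglob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# Demo with placeholder files and explicit modification times.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, mtime in [("first.ckpt", 1_000), ("second.ckpt", 2_000)]:
        path = root / name
        path.write_bytes(b"")
        os.utime(path, (mtime, mtime))
    latest_name = newest_file(root).name
    nothing = newest_file(root, pattern="*.log")

print(latest_name)
```

If the newest file stops changing while training is still running, check the configured checkpoint directory and the training logs.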

## 4. Run a sanity inference pass after training

Do not wait until full evaluation to confirm that the trained checkpoint behaves sensibly.

Take a small reviewed subset of recordings and run a quick prediction pass with the new checkpoint.

That catches setup mismatches early, especially around targets and preprocessing.

## 5. Evaluate on held-out data

Once the checkpoint looks sensible on a small sanity subset, run the formal evaluation workflow on a held-out test set.

That is where you should compare models, thresholds, and task-level performance metrics.

## What to do next

- Evaluate the trained checkpoint: {doc}`evaluate-on-a-test-set`
- Fine-tune from a checkpoint: {doc}`../how_to/fine-tune-from-a-checkpoint`
- Configure targets: {doc}`../how_to/configure-target-definitions`
- Configure preprocessing: {doc}`../how_to/configure-audio-preprocessing`
- Check full train options: {doc}`../reference/cli/train`

This page is a starter scaffold and will be expanded with a full worked
example.