diff --git a/docs/source/tutorials/evaluate-on-a-test-set.md b/docs/source/tutorials/evaluate-on-a-test-set.md
index 2e94c2e..a4bce6a 100644
--- a/docs/source/tutorials/evaluate-on-a-test-set.md
+++ b/docs/source/tutorials/evaluate-on-a-test-set.md
@@ -3,33 +3,89 @@
 This tutorial shows how to evaluate a trained checkpoint on a held-out dataset
 and inspect the output metrics.
 
+This tutorial is for advanced users who want to compare one trained model against a separate test dataset.
+
 ## Before you start
 
 - A trained model checkpoint.
 - A test dataset config file.
 - (Optional) Targets, audio, inference, and evaluation config overrides.
 
-## Tutorial steps
+```{note}
+This page is for model evaluation.
+If you only want to run BatDetect2 on recordings,
+start with {doc}`run-inference-on-folder` instead.
+```
 
-1. Select a checkpoint and a test dataset.
-2. Run `batdetect2 evaluate`.
-3. Inspect output metrics and prediction artifacts.
-4. Record evaluation settings for reproducibility.
+## Outcome
 
-## Example command
+By the end of this tutorial you will have:
+
+- run `batdetect2 evaluate`,
+- written evaluation metrics and result files,
+- understood what to inspect first,
+- identified the next pages for evaluation concepts and configuration.
+
+## 1. Start with a held-out dataset
+
+Use a dataset that was not used for training or tuning.
+
+A held-out dataset is simply a separate dataset kept aside for evaluation.
+
+If you tune thresholds or configs on the same dataset that you report as final evaluation, the results will be optimistic.
+
+## 2. Run evaluation
 
 ```bash
 batdetect2 evaluate \
   path/to/model.ckpt \
   path/to/test_dataset.yaml \
+  --base-dir path/to/project_root \
   --output-dir path/to/eval_outputs
 ```
 
+This command loads the checkpoint,
+runs prediction on the test dataset,
+applies the chosen evaluation tasks,
+and writes metrics and result files to the output directory.
+
+Use `--base-dir` whenever the dataset config contains relative paths.
+
+That is the common case for project-local dataset files.
+
+## 3. Inspect the output directory
+
+Look for:
+
+- summary metrics,
+- generated plots,
+- saved prediction files if they were enabled,
+- enough metadata to reproduce the run later.
+
+The exact set depends on the configured evaluation tasks and plots.
+
+## 4. Interpret the results in context
+
+Do not reduce evaluation to a single number.
+
+Check:
+
+- which task the metric belongs to,
+- which thresholding or matching assumptions were used,
+- whether class-level behavior matches your use case,
+- whether the failures are concentrated in specific taxa, sites, or recording conditions.
+
+## 5. Record the evaluation setup
+
+Keep the command, config files, checkpoint path, and dataset version together.
+
+That matters for reproducibility and for later model comparisons.
+
 ## What to do next
 
 - Compare thresholds on representative files:
   {doc}`../how_to/tune-detection-threshold`
+- Configure evaluation tasks: {doc}`../how_to/choose-and-configure-evaluation-tasks`
+- Interpret evaluation artifacts: {doc}`../how_to/interpret-evaluation-outputs`
+- Learn the evaluation concepts: {doc}`../explanation/evaluation-concepts-and-matching`
 - Check full evaluate options: {doc}`../reference/cli/evaluate`
-
-This page is a starter scaffold and will be expanded with a full worked
-example.
diff --git a/docs/source/tutorials/integrate-with-a-python-pipeline.md b/docs/source/tutorials/integrate-with-a-python-pipeline.md
index 0c4fffd..1c62390 100644
--- a/docs/source/tutorials/integrate-with-a-python-pipeline.md
+++ b/docs/source/tutorials/integrate-with-a-python-pipeline.md
@@ -3,21 +3,52 @@
 This tutorial shows a minimal Python workflow for loading audio, running
 batdetect2, and collecting detections for downstream analysis.
 
+This tutorial is for people who already want to work in Python.
+
+If you mainly want to run the model on recordings,
+start with {doc}`run-inference-on-folder` instead.
+
 ## Before you start
 
 - BatDetect2 installed in your Python environment.
 - A model checkpoint.
 - At least one input audio file.
 
-## Tutorial steps
+```{note}
+This page is more technical than the standard first-run tutorial.
+You do not need this page for a normal first use of BatDetect2.
+```
 
-1. Load BatDetect2 in Python.
-2. Create an API instance from a checkpoint.
-3. Run `process_file` on one audio file.
-4. Read detection fields and class scores.
-5. Save or pass detections to your downstream pipeline.
+If you are working from this repository checkout, you can start with:
 
-## Example code
+```text
+src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
+```
+
+## Outcome
+
+By the end of this tutorial you will have:
+
+- created a `BatDetect2API` object,
+- run inference on one file,
+- inspected the top class, class-score list, and detection score,
+- identified where to go next for feature extraction, saving predictions, and batch workflows.
+
+## 1. Create the API instance
+
+Load the checkpoint once and reuse the API object for multiple files.
+
+```python
+from pathlib import Path
+
+from batdetect2.api_v2 import BatDetect2API
+
+api = BatDetect2API.from_checkpoint(Path("path/to/model.ckpt"))
+```
+
+## 2. Run inference on one file
+
+`process_file` is the simplest Python entry point when you want one prediction object per recording.
 
 ```python
 from pathlib import Path
@@ -33,10 +64,55 @@ for detection in prediction.detections:
     print(top_class, score)
 ```
 
+`prediction` is a `ClipDetections` object.
+
+It contains:
+
+- the clip metadata,
+- a list of detections,
+- a box for each detected event,
+- one detection score per event,
+- a full list of class scores per event,
+- a feature vector per event.
+
+## 3. Inspect class scores, not just the top class
+
+If you are exploring results,
+it is often useful to inspect the full ranked class-score list.
+
+```python
+for detection in prediction.detections:
+    print("top class:", api.get_top_class_name(detection))
+    print("detection score:", detection.detection_score)
+    print("class scores:")
+    for class_name, score in api.get_class_scores(detection):
+        print(f" {class_name}: {score:.3f}")
+```
+
+This helps separate two different questions:
+
+- "Did the model think there was a call here?"
+- "If there was a call, which class did it score highest?"
+
+## 4. Keep the first workflow small
+
+Before scaling up, run the API on a few representative files and inspect the results manually.
+
+This catches path issues and obviously implausible outputs early.
+
+## 5. Move to the right next workflow
+
+Once the single-file path is working, choose the next page based on what you need:
+
+- save predictions to disk,
+- inspect class scores more carefully,
+- inspect detection features,
+- process many files in one run.
+
 ## What to do next
 
-- See API/config references: {doc}`../reference/index`
-- Learn practical CLI alternatives: {doc}`run-inference-on-folder`
-
-This page is a starter scaffold and will be expanded with a full worked
-example.
+- API reference: {doc}`../reference/api`
+- Inspect ranked class scores: {doc}`../how_to/inspect-class-scores-in-python`
+- Inspect detection features: {doc}`../how_to/inspect-detection-features-in-python`
+- Save predictions to disk: {doc}`../how_to/save-predictions-in-different-output-formats`
+- Learn the CLI happy path: {doc}`run-inference-on-folder`
diff --git a/docs/source/tutorials/run-inference-on-folder.md b/docs/source/tutorials/run-inference-on-folder.md
index ca7f5bd..cff78b7 100644
--- a/docs/source/tutorials/run-inference-on-folder.md
+++ b/docs/source/tutorials/run-inference-on-folder.md
@@ -1,33 +1,115 @@
-# Tutorial: Run inference on a folder of audio files
+# Tutorial: Run BatDetect2 on a folder of audio files
 
 This tutorial walks through a first end-to-end inference run with the CLI.
 
+It is the default starting point for new users.
+
+Use it when you want to run an existing model on a folder of recordings and quickly check what BatDetect2 found.
+
 ## Before you start
 
 - BatDetect2 installed in your environment.
 - A folder containing `.wav` files.
 - A model checkpoint path.
 
-## Tutorial steps
+A checkpoint is the saved model file that BatDetect2 uses to make predictions.
 
-1. Choose your input and output directories.
-2. Run prediction with the CLI.
-3. Verify output files were written.
-4. Inspect predictions and confidence scores.
+If you are working from this repository checkout, you can use:
 
-## Example command
+```text
+src/batdetect2/models/checkpoints/Net2DFast_UK_same.pth.tar
+```
+
+## Outcome
+
+By the end of this tutorial you will have:
+
+- run `batdetect2 predict directory`,
+- saved predictions to disk,
+- checked that BatDetect2 wrote output files,
+- identified the next pages to use for tuning or customization.
+
+## 1. Choose your input and output paths
+
+Pick three paths:
+
+- the checkpoint to use,
+- the directory containing your audio files,
+- an output directory where BatDetect2 will save its results.
+
+Example layout:
+
+```text
+project/
+  model.pth.tar
+  audio/
+    file_001.wav
+    file_002.wav
+  outputs/
+```
+
+## 2. Run prediction on the directory
+
+Use this command when you want BatDetect2 to scan a folder of recordings automatically.
 
 ```bash
 batdetect2 predict directory \
-  path/to/model.ckpt \
+  path/to/model.pth.tar \
   path/to/audio_dir \
   path/to/outputs
 ```
 
+What this does:
+
+- loads the checkpoint,
+- finds audio files in `audio_dir`,
+- splits recordings into smaller pieces internally when needed,
+- saves result files to `outputs`.
+
+## 3. Verify that outputs were written
+
+After the command completes, inspect the output directory.
+
+For a first run,
+the important check is simple:
+
+- did BatDetect2 create result files,
+- are they in the output directory you expected,
+- did it process the recordings you meant to analyze.
+
+Different workflows can save results in different file formats.
+
+You do not need to learn those details for the first run.
+
+If you later need to choose a specific output format,
+go to {doc}`../how_to/save-predictions-in-different-output-formats`.
+
+## 4. Inspect predictions
+
+Start with a small subset of representative files.
+
+Check:
+
+- whether detections were written for the expected recordings,
+- whether output counts are plausible,
+- whether the model is obviously too sensitive or too conservative,
+- whether the predicted classes look broadly reasonable for your data.
+
+Do not treat the first run as validated ecological output.
+
+The first run is a workflow check.
+
+Validation comes next.
+
+## 5. Tune only after you have a baseline
+
+If the first run is too noisy or misses obvious calls, tune thresholds on a reviewed subset rather than changing settings blindly across the full dataset.
+
+Use {doc}`../how_to/tune-detection-threshold` for that process.
+
 ## What to do next
 
-- Use {doc}`../how_to/tune-detection-threshold` to tune sensitivity.
-- Use {doc}`../reference/cli/index` for full command options.
-
-This page is a starter scaffold and will be expanded with a full worked
-example.
+- If you need a different input mode, use {doc}`../how_to/choose-an-inference-input-mode`.
+- If you want to tune sensitivity, use {doc}`../how_to/tune-detection-threshold`.
+- If you already write code and want more control from Python, use {doc}`integrate-with-a-python-pipeline`.
+- If you need full command details, use {doc}`../reference/cli/predict`.
diff --git a/docs/source/tutorials/train-a-custom-model.md b/docs/source/tutorials/train-a-custom-model.md
index 269e6a3..3a1ff82 100644
--- a/docs/source/tutorials/train-a-custom-model.md
+++ b/docs/source/tutorials/train-a-custom-model.md
@@ -3,21 +3,44 @@
 This tutorial walks through a first custom training run using your own
 annotations.
 
+This tutorial is for advanced users who already have dataset files and want to train a model on their own annotated data.
+
 ## Before you start
 
 - BatDetect2 installed.
 - A training dataset config file.
 - (Optional) A validation dataset config file.
+- A targets config file if you are not using the default target setup.
+- A model config file if you are not training from the built-in defaults.
 
-## Tutorial steps
+```{note}
+This is not the first page to start with if you only want to run the existing model on recordings.
+Use {doc}`run-inference-on-folder` for that.
+```
 
-1. Prepare training and validation dataset config files.
-2. Choose target definitions and model/training config files.
-3. Run `batdetect2 train`.
-4. Check that checkpoints and logs are written.
-5. Run a quick sanity inference on a small audio subset.
+## Outcome
 
-## Example command
+By the end of this tutorial you will have:
+
+- started a training run,
+- written checkpoints and logs,
+- understood the minimum settings involved,
+- identified the next pages for fine-tuning and evaluation.
+
+## 1. Gather the minimum required inputs
+
+At minimum, a custom training run needs:
+
+- a training dataset config,
+- optional validation dataset config,
+- either a model config for a fresh run or a checkpoint for continued training,
+- optional settings files for targets, audio, training, evaluation, inference, outputs, and logging.
+
+The most important point is that the dataset file, target definitions, and preprocessing choices need to agree with each other.
+
+## 2. Run a first training command
+
+Use a command like this for a fresh run:
 
 ```bash
 batdetect2 train \
@@ -28,10 +51,35 @@ batdetect2 train \
   --training-config path/to/training.yaml
 ```
 
+Use `--model` instead of `--model-config` when you want to continue from an existing checkpoint.
+
+## 3. Check that outputs are being written
+
+After the command starts, verify that:
+
+- the run initializes without configuration errors,
+- checkpoints are written to the checkpoint directory,
+- logs are written to the log directory or configured logger backend,
+- the training and validation datasets load as expected.
+
+## 4. Run a sanity inference pass after training
+
+Do not wait until full evaluation to confirm that the trained checkpoint behaves sensibly.
+
+Take a small reviewed subset of recordings and run a quick prediction pass with the new checkpoint.
+
+That catches setup mismatches early, especially around targets and preprocessing.
+
+## 5. Evaluate on held-out data
+
+Once the checkpoint looks sensible on a small sanity subset, run the formal evaluation workflow on a held-out test set.
+
+That is where you should compare models, thresholds, and task-level performance metrics.
+
 ## What to do next
 
 - Evaluate the trained checkpoint: {doc}`evaluate-on-a-test-set`
+- Fine-tune from a checkpoint: {doc}`../how_to/fine-tune-from-a-checkpoint`
+- Configure targets: {doc}`../how_to/configure-target-definitions`
+- Configure preprocessing: {doc}`../how_to/configure-audio-preprocessing`
 - Check full train options: {doc}`../reference/cli/train`
-
-This page is a starter scaffold and will be expanded with a full worked
-example.