mirror of https://github.com/macaodha/batdetect2.git synced 2026-05-22 22:32:18 +02:00

mbsantiago ce6975770e ci: add GitHub workflows and release helpers

2026-05-06 17:22:18 +01:00

How to choose and configure evaluation tasks

Use this guide when the default evaluation tasks do not match the question you want to answer.

Know the default first

By default, BatDetect2 evaluation starts with:

sound event detection,
sound event classification.

Those are good defaults for many projects, but not for all of them.

Choose the task that matches the question

Common built-in task families include:

sound_event_detection
sound_event_classification
top_class_detection
clip_detection
clip_classification

Match the task to what you want to measure:

Use sound-event tasks when you care about individual call events.
Use clip tasks when you care about clip-level presence or clip-level class evidence.
Use top-class detection when you want matching based on the highest-scoring class per detection.
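
To make the distinction between event-level, top-class, and clip-level views concrete, here is a minimal sketch in plain Python. The data layout, the class names, and the use of a maximum over events as clip-level evidence are all assumptions made for illustration. This is not how BatDetect2 represents predictions or computes its metrics.

```python
# Hypothetical predictions for one clip: each detection carries per-class scores.
# The dict layout and class names are placeholders, not BatDetect2 types.
detections = [
    {"start_time": 0.10, "class_scores": {"Myotis": 0.7, "Pipistrellus": 0.2}},
    {"start_time": 0.45, "class_scores": {"Myotis": 0.1, "Pipistrellus": 0.6}},
]


def top_class(class_scores):
    """Return the (class name, score) pair with the highest score."""
    return max(class_scores.items(), key=lambda item: item[1])


def clip_class_evidence(detections):
    """Collapse per-event scores into one score per class for the whole clip.

    Taking the maximum over events is one simple choice, assumed here
    only to illustrate the idea of clip-level evidence.
    """
    evidence = {}
    for det in detections:
        for name, score in det["class_scores"].items():
            evidence[name] = max(score, evidence.get(name, 0.0))
    return evidence


# Event-level view: one top class per detection.
event_level = [top_class(d["class_scores"]) for d in detections]

# Clip-level view: one score per class for the clip as a whole.
clip_level = clip_class_evidence(detections)
```

The event-level view keeps one answer per call, while the clip-level view discards timing and keeps only what the clip contains. That is the trade-off to weigh when picking a task family.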

Configure tasks in `EvaluationConfig`

Example:

tasks:
  - name: sound_event_detection
    prefix: detection
    affinity_threshold: 0.0
    strict_match: true
  - name: clip_classification
    prefix: clip_classification
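
Before running an evaluation, it can help to sanity-check the task list. The sketch below works on the `tasks` list as it would look after YAML parsing. It checks names only against the five listed in this guide, and the real set of built-in tasks may be larger. It also flags duplicate prefixes, on the assumption that prefixes namespace metric names. `check_tasks` is a hypothetical helper and not part of BatDetect2.

```python
# Task names taken from the list in this guide; the real set may be larger.
KNOWN_TASKS = {
    "sound_event_detection",
    "sound_event_classification",
    "top_class_detection",
    "clip_detection",
    "clip_classification",
}


def check_tasks(tasks):
    """Return a list of human-readable problems found in a parsed `tasks` list."""
    problems = []
    seen_prefixes = set()
    for i, task in enumerate(tasks):
        name = task.get("name")
        if name not in KNOWN_TASKS:
            problems.append(f"task {i}: unknown name {name!r}")
        prefix = task.get("prefix")
        if prefix is not None:
            # Assumption: two tasks sharing a prefix would produce clashing metric names.
            if prefix in seen_prefixes:
                problems.append(f"task {i}: duplicate prefix {prefix!r}")
            seen_prefixes.add(prefix)
    return problems


# The example config above, written as the equivalent Python data.
tasks = [
    {
        "name": "sound_event_detection",
        "prefix": "detection",
        "affinity_threshold": 0.0,
        "strict_match": True,
    },
    {"name": "clip_classification", "prefix": "clip_classification"},
]

problems = check_tasks(tasks)
```

An empty `problems` list means the names and prefixes look consistent. It says nothing about whether the per-task options are valid, which is covered by the evaluation config reference.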

Pass the config with:

batdetect2 evaluate \
  path/to/test_dataset.yaml \
  --model path/to/model.ckpt \
  --base-dir path/to/project_root \
  --evaluation-config path/to/evaluation.yaml

Include `--base-dir` when the dataset config resolves recordings through relative paths.
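
If you launch evaluations from a script, one option is to assemble the same command as an argument list. The sketch below uses only the subcommand and flags shown above. `build_evaluate_command` is a hypothetical helper, and it treats `--base-dir` as optional, as described in the previous sentence.

```python
import shlex


def build_evaluate_command(dataset, model, evaluation_config, base_dir=None):
    """Assemble the argument list for the `batdetect2 evaluate` call shown above."""
    cmd = ["batdetect2", "evaluate", str(dataset), "--model", str(model)]
    if base_dir is not None:
        # Only needed when the dataset config uses relative recording paths.
        cmd += ["--base-dir", str(base_dir)]
    cmd += ["--evaluation-config", str(evaluation_config)]
    return cmd


cmd = build_evaluate_command(
    "path/to/test_dataset.yaml",
    "path/to/model.ckpt",
    "path/to/evaluation.yaml",
    base_dir="path/to/project_root",
)

# A shell-quoted string, handy for logging exactly what was run.
command_line = shlex.join(cmd)
```

The list form can be passed to `subprocess.run(cmd, check=True)`. Logging `command_line` alongside the results keeps a record of which config produced which metrics.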

Change one thing at a time

When comparing models or settings, avoid changing task definitions, thresholds, matching behavior, and datasets all at once.

If several of these change together, you cannot tell which one caused a metric to move.
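
One way to enforce this is to diff the settings of two runs before comparing their metrics. The sketch below compares two nested dictionaries and lists the settings that differ. `changed_settings` is a hypothetical helper, and the threshold value `0.5` is an arbitrary example rather than a recommended setting.

```python
_MISSING = object()  # Marks a key that is absent from one side.


def changed_settings(baseline, candidate, path=""):
    """Return dotted paths of settings that differ between two nested dicts.

    Assumes string keys, as in a parsed YAML mapping.
    """
    changed = []
    for key in sorted(set(baseline) | set(candidate)):
        here = f"{path}.{key}" if path else str(key)
        old = baseline.get(key, _MISSING)
        new = candidate.get(key, _MISSING)
        if isinstance(old, dict) and isinstance(new, dict):
            # Recurse so nested differences are reported individually.
            changed += changed_settings(old, new, here)
        elif old != new:
            changed.append(here)
    return changed


baseline = {
    "name": "sound_event_detection",
    "affinity_threshold": 0.0,
    "strict_match": True,
}
candidate = {**baseline, "affinity_threshold": 0.5}

diff = changed_settings(baseline, candidate)
```

If `diff` contains more than one entry, split the comparison into separate runs so that each run changes a single setting.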

Evaluation tutorial: {doc}`../tutorials/evaluate-on-a-test-set`
Evaluation config reference: {doc}`../reference/evaluation-config`
Evaluation concepts: {doc}`../explanation/evaluation-concepts-and-matching`