batdetect2/docs/source/how_to/choose-and-configure-evaluation-tasks.md
2026-05-06 17:22:18 +01:00

How to choose and configure evaluation tasks

Use this guide when the default evaluation tasks do not match the question you want to answer.

Know the default first

By default, BatDetect2 evaluation starts with:

  • sound event detection,
  • sound event classification.

Those are good defaults for many projects, but not for all of them.

Choose the task that matches the question

Common built-in task families include:

  • sound_event_detection
  • sound_event_classification
  • top_class_detection
  • clip_detection
  • clip_classification

Pick the task family by the unit you want to score:

  • Use sound-event tasks when you care about individual call events.
  • Use clip tasks when you care about clip-level presence or clip-level class evidence.
  • Use top-class detection when you want matching based on the highest-scoring class per detection.

Configure tasks in EvaluationConfig

Example:

tasks:
  - name: sound_event_detection
    prefix: detection
    affinity_threshold: 0.0
    strict_match: true
  - name: clip_classification
    prefix: clip_classification

Pass the config with:

batdetect2 evaluate \
  path/to/test_dataset.yaml \
  --model path/to/model.ckpt \
  --base-dir path/to/project_root \
  --evaluation-config path/to/evaluation.yaml

Include --base-dir when the dataset config resolves recordings through relative paths.
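
A typo in a task name is easy to miss until the run fails or a metric is silently absent. The sketch below is one possible pre-flight check, not part of BatDetect2: `check_tasks` and `KNOWN_TASKS` are hypothetical names, the allowed task names are copied from the list in this guide rather than read from the library, and the rule that prefixes should be unique is an assumption about how you want to tell metrics apart. The task list is written as plain Python data that mirrors the YAML example.

```python
# Task names copied from this guide; NOT read from BatDetect2 itself.
KNOWN_TASKS = {
    "sound_event_detection",
    "sound_event_classification",
    "top_class_detection",
    "clip_detection",
    "clip_classification",
}


def check_tasks(tasks):
    """Return a list of human-readable problems found in a task list."""
    problems = []
    seen_prefixes = set()
    for index, task in enumerate(tasks):
        name = task.get("name")
        if name not in KNOWN_TASKS:
            problems.append(f"task {index}: unknown name {name!r}")
        prefix = task.get("prefix")
        if prefix is not None:
            # Assumption: each task should report under its own prefix.
            if prefix in seen_prefixes:
                problems.append(f"task {index}: duplicate prefix {prefix!r}")
            seen_prefixes.add(prefix)
    return problems


# Same tasks as the YAML example, written as plain Python data.
tasks = [
    {
        "name": "sound_event_detection",
        "prefix": "detection",
        "affinity_threshold": 0.0,
        "strict_match": True,
    },
    {"name": "clip_classification", "prefix": "clip_classification"},
]

print(check_tasks(tasks))  # prints []
```

If you keep the tasks in a YAML file, load it with your YAML library of choice and pass the resulting list to the same function.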

Change one thing at a time

When comparing models or settings, change only one of these at a time: task definitions, thresholds, matching behavior, or dataset.

Otherwise it becomes hard to explain why a metric changed.
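
One way to hold yourself to this is to diff the two task lists before running the comparison. The following is a minimal sketch under stated assumptions: `diff_tasks` is a hypothetical helper, not a BatDetect2 function; it assumes each task name appears at most once per config; and the `0.5` threshold is an illustrative value, not a recommendation.

```python
def diff_tasks(baseline, candidate):
    """Map task name -> {field: (old, new)} for every field that differs."""
    # Assumption: task names are unique within each config.
    old_by_name = {task["name"]: task for task in baseline}
    new_by_name = {task["name"]: task for task in candidate}
    changes = {}
    for name in sorted(set(old_by_name) | set(new_by_name)):
        old = old_by_name.get(name, {})
        new = new_by_name.get(name, {})
        fields = {
            key: (old.get(key), new.get(key))
            for key in sorted(set(old) | set(new))
            if old.get(key) != new.get(key)
        }
        if fields:
            changes[name] = fields
    return changes


baseline = [
    {
        "name": "sound_event_detection",
        "prefix": "detection",
        "affinity_threshold": 0.0,
        "strict_match": True,
    },
]
# Identical to the baseline except for one (illustrative) threshold value.
candidate = [
    {
        "name": "sound_event_detection",
        "prefix": "detection",
        "affinity_threshold": 0.5,
        "strict_match": True,
    },
]

print(diff_tasks(baseline, candidate))
# prints {'sound_event_detection': {'affinity_threshold': (0.0, 0.5)}}
```

If the result contains more than one task or more than one field, the two runs differ in several ways and a change in the metric cannot be attributed to any single setting.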

Related pages

  • Evaluation tutorial: {doc}../tutorials/evaluate-on-a-test-set
  • Evaluation config reference: {doc}../reference/evaluation-config
  • Evaluation concepts: {doc}../explanation/evaluation-concepts-and-matching