Filtering Sound Events for Training
Purpose
When preparing your annotated audio data for training a batdetect2
model, you often want to select only specific sound events.
For example, you might want to:
- Focus only on echolocation calls and ignore social calls or noise.
- Exclude annotations that were marked as low quality.
- Train only on specific species or groups of species.
This filtering module allows you to define rules based on the tags associated with each sound event annotation. Only the events that pass all your defined rules will be kept for further processing and training.
How it Works: Rules
Filtering is controlled by a list of rules. Each rule defines a condition based on the tags attached to a sound event. An event must satisfy all the rules you define in your configuration to be included. If an event fails even one rule, it is discarded.
Defining Rules in Configuration
You define these rules within your main configuration file (usually a .yaml file) under a specific section (the exact name might depend on the main training config, but let's assume it's called filtering).
The configuration consists of a list named rules. Each item in this list is a single filter rule.
Each rule has two parts:
- match_type: Specifies the kind of check to perform.
- tags: A list of specific tags (each with a key and value) that the rule applies to.
# Example structure in your configuration file
filtering:
  rules:
    - match_type: <TYPE_OF_CHECK_1>
      tags:
        - key: <tag_key_1a>
          value: <tag_value_1a>
        - key: <tag_key_1b>
          value: <tag_value_1b>
    - match_type: <TYPE_OF_CHECK_2>
      tags:
        - key: <tag_key_2a>
          value: <tag_value_2a>
    # ... add more rules as needed
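To make the structure above concrete, here is a minimal Python sketch of what the filtering section looks like once a YAML loader has read it, and how it could be turned into typed rule objects. The names Tag, Rule, parse_rules and VALID_MATCH_TYPES are hypothetical and chosen for illustration; batdetect2's own classes and loading code may differ.

```python
from dataclasses import dataclass

# Hypothetical names for illustration; batdetect2's own classes may differ.
VALID_MATCH_TYPES = {"any", "all", "exclude", "equal"}


@dataclass(frozen=True)
class Tag:
    key: str
    value: str


@dataclass(frozen=True)
class Rule:
    match_type: str
    tags: tuple  # tuple of Tag objects


def parse_rules(config: dict) -> list:
    """Turn the loaded `filtering` section into a list of Rule objects."""
    rules = []
    for index, raw in enumerate(config.get("rules", [])):
        match_type = raw["match_type"]
        if match_type not in VALID_MATCH_TYPES:
            raise ValueError(f"rule {index}: unknown match_type {match_type!r}")
        tags = tuple(Tag(t["key"], t["value"]) for t in raw["tags"])
        rules.append(Rule(match_type, tags))
    return rules


# The nested dict/list shape a YAML loader typically produces for the
# `filtering` section, filled in with made-up example values.
filtering = {
    "rules": [
        {
            "match_type": "any",
            "tags": [
                {"key": "Species", "value": "Pipistrellus pipistrellus"},
                {"key": "Species", "value": "Pipistrellus pygmaeus"},
            ],
        },
        {
            "match_type": "exclude",
            "tags": [{"key": "Quality", "value": "Poor"}],
        },
    ]
}

rules = parse_rules(filtering)
print(len(rules), rules[0].match_type, rules[1].match_type)  # 2 any exclude
```

Validating match_type while parsing means a typo in the configuration is reported up front, before any events are filtered.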
Understanding match_type
This determines how the list of tags in the rule is used to check a sound event.
There are four types:
- any: (Keep if at least one tag matches)
  - The sound event passes this rule if it has at least one of the tags listed in the tags section of the rule.
  - Think of it as an OR condition.
  - Example Use Case: Keep events if they are tagged as Species: Pip Pip OR Species: Pip Pyg.
- all: (Keep only if all tags match)
  - The sound event passes this rule only if it has all of the tags listed in the tags section. The event can have other tags as well, but it must contain all the ones specified here.
  - Think of it as an AND condition.
  - Example Use Case: Keep events only if they are tagged with Sound Type: Echolocation AND Quality: Good.
- exclude: (Discard if any tag matches)
  - The sound event passes this rule only if it does not have any of the tags listed in the tags section. If it matches even one tag in the list, the event is discarded.
  - Example Use Case: Discard events if they are tagged Quality: Poor OR Noise Source: Insect.
- equal: (Keep only if tags match exactly)
  - The sound event passes this rule only if its set of tags is exactly identical to the list of tags provided in the rule (no more, no less).
  - Note: This is very strict and usually less useful than all or any.
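If it helps to think in code, the four match types map directly onto set operations. The sketch below is an illustration only: it models each tag as a (key, value) tuple, which is an assumption made for this example and not necessarily how batdetect2 represents tags internally, and rule_passes is a hypothetical helper name.

```python
def rule_passes(match_type: str, rule_tags: set, event_tags: set) -> bool:
    """Return True if an event's tags satisfy a single rule.

    Tags are modelled as (key, value) tuples. This is an illustrative
    assumption, not batdetect2's internal representation.
    """
    if match_type == "any":
        return bool(rule_tags & event_tags)      # at least one shared tag
    if match_type == "all":
        return rule_tags <= event_tags           # every rule tag is present
    if match_type == "exclude":
        return rule_tags.isdisjoint(event_tags)  # no rule tag is present
    if match_type == "equal":
        return rule_tags == event_tags           # identical tag sets
    raise ValueError(f"unknown match_type: {match_type!r}")


# A made-up event carrying two tags.
event = {("Sound Type", "Echolocation"), ("Quality", "Good")}

print(rule_passes("any", {("Quality", "Good"), ("Quality", "Poor")}, event))
# True: the event has one of the two listed tags

print(rule_passes("all", {("Sound Type", "Echolocation"), ("Quality", "Good")}, event))
# True: the event has both listed tags

print(rule_passes("exclude", {("Quality", "Poor"), ("Noise Source", "Insect")}, event))
# True: the event has neither listed tag

print(rule_passes("equal", {("Sound Type", "Echolocation")}, event))
# False: the event carries an extra tag (Quality: Good)
```

Note that when a rule lists a single tag, any and all give the same result; they only differ once two or more tags are listed.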
Combining Rules
Remember: A sound event must pass every single rule defined in the rules list to be kept.
The rules are checked one by one, and if an event fails any rule, it's immediately excluded from further consideration.
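The "check one by one, stop at the first failure" behaviour can be sketched as a simple loop. As before, this is an illustration under stated assumptions: tags are (key, value) tuples, a rule is a (match_type, tags) pair, and CHECKS, first_failed_rule and keep_event are hypothetical names, not part of batdetect2. The two rules used here mirror Example 1 in the Examples section.

```python
# Hypothetical helpers; tags are (key, value) tuples for illustration.
CHECKS = {
    "any": lambda rule, event: bool(rule & event),
    "all": lambda rule, event: rule <= event,
    "exclude": lambda rule, event: rule.isdisjoint(event),
    "equal": lambda rule, event: rule == event,
}


def first_failed_rule(rules, event_tags):
    """Return the index of the first rule the event fails, or None."""
    for index, (match_type, rule_tags) in enumerate(rules):
        if not CHECKS[match_type](rule_tags, event_tags):
            return index  # stop here: later rules are not evaluated
    return None


def keep_event(rules, event_tags):
    """An event is kept only if no rule fails."""
    return first_failed_rule(rules, event_tags) is None


rules = [
    ("any", {("Sound Type", "Echolocation")}),
    ("exclude", {("Quality", "Poor")}),
]

# Made-up events for demonstration.
good_call = {("Sound Type", "Echolocation"), ("Quality", "Good")}
poor_call = {("Sound Type", "Echolocation"), ("Quality", "Poor")}
social_call = {("Sound Type", "Social"), ("Quality", "Good")}

print(first_failed_rule(rules, good_call))    # None: passes both rules
print(first_failed_rule(rules, poor_call))    # 1: fails the exclude rule
print(first_failed_rule(rules, social_call))  # 0: fails the first rule
```

Returning the index of the failing rule, instead of a bare True or False, is a design choice for this sketch: it makes it easy to see which rule removed an event when a filter discards more than you expected.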
Examples
Example 1: Keep good quality echolocation calls
filtering:
  rules:
    # Rule 1: Must have the 'Echolocation' tag
    - match_type: any # With a single tag listed, 'any' and 'all' are equivalent
      tags:
        - key: Sound Type
          value: Echolocation
    # Rule 2: Must NOT have the 'Poor' quality tag
    - match_type: exclude
      tags:
        - key: Quality
          value: Poor
Explanation: An event is kept only if it passes BOTH rules.
It must have the Sound Type: Echolocation tag AND it must NOT have the Quality: Poor tag.
Example 2: Keep calls from Pipistrellus species recorded in a specific project, excluding uncertain IDs
filtering:
  rules:
    # Rule 1: Must be either Pip pip or Pip pyg
    - match_type: any
      tags:
        - key: Species
          value: Pipistrellus pipistrellus
        - key: Species
          value: Pipistrellus pygmaeus
    # Rule 2: Must belong to 'Project Alpha'
    - match_type: any # With a single tag listed, 'any' and 'all' are equivalent
      tags:
        - key: Project ID
          value: Project Alpha
    # Rule 3: Exclude if ID Certainty is 'Low' or 'Maybe'
    - match_type: exclude
      tags:
        - key: ID Certainty
          value: Low
        - key: ID Certainty
          value: Maybe
Explanation: An event is kept only if it passes ALL three rules:
- It has a Species tag that is either Pipistrellus pipistrellus OR Pipistrellus pygmaeus.
- It has the Project ID: Project Alpha tag.
- It does not have an ID Certainty: Low tag AND it does not have an ID Certainty: Maybe tag.
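Written out by hand, the three rules of Example 2 amount to a single boolean expression. The sketch below spells that expression out in Python and applies it to a few made-up events, so you can check your reading of the rules against concrete cases. The keep function and the sample events are hypothetical, and tags are again modelled as (key, value) tuples for illustration only.

```python
def keep(event_tags: set) -> bool:
    """Hand-written equivalent of the three rules in Example 2."""
    # Rule 1 (any): one of the two Pipistrellus species tags is present.
    is_pipistrelle = (
        ("Species", "Pipistrellus pipistrellus") in event_tags
        or ("Species", "Pipistrellus pygmaeus") in event_tags
    )
    # Rule 2 (any, single tag): the project tag is present.
    in_project = ("Project ID", "Project Alpha") in event_tags
    # Rule 3 (exclude): neither uncertainty tag may be present.
    uncertain = (
        ("ID Certainty", "Low") in event_tags
        or ("ID Certainty", "Maybe") in event_tags
    )
    return is_pipistrelle and in_project and not uncertain


# Made-up events for demonstration only.
events = [
    {"id": "a", "tags": {("Species", "Pipistrellus pipistrellus"),
                         ("Project ID", "Project Alpha"),
                         ("ID Certainty", "High")}},
    {"id": "b", "tags": {("Species", "Pipistrellus pygmaeus"),
                         ("Project ID", "Project Alpha"),
                         ("ID Certainty", "Maybe")}},
    {"id": "c", "tags": {("Species", "Myotis daubentonii"),
                         ("Project ID", "Project Alpha")}},
    {"id": "d", "tags": {("Species", "Pipistrellus pipistrellus"),
                         ("Project ID", "Project Beta")}},
    {"id": "e", "tags": {("Species", "Pipistrellus pygmaeus"),
                         ("Project ID", "Project Alpha")}},
]

kept = [event["id"] for event in events if keep(event["tags"])]
print(kept)  # ['a', 'e']
```

Event "e" is kept even though it has no ID Certainty tag at all: an exclude rule only discards events that carry one of the listed tags, so a missing tag never causes a failure.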
Usage
You will typically specify the path to the configuration file containing these filtering rules when you set up your data processing or training pipeline in batdetect2.
The tool will then automatically load these rules and apply them to your annotated sound events.