From 2da6a9504c0f9efbe58049741239871fb429b58b Mon Sep 17 00:00:00 2001
From: mbsantiago
Date: Wed, 6 May 2026 21:18:25 +0100
Subject: [PATCH] docs: remove obsolete legacy planning pages

---
 docs/source/documentation_plan.md        | 139 ------------------
 .../extracted-features-and-embeddings.md |  21 ++-
 docs/source/legacy/feature-extraction.md |  34 -----
 docs/source/legacy/index.md              |   8 +-
 4 files changed, 17 insertions(+), 185 deletions(-)
 delete mode 100644 docs/source/documentation_plan.md
 delete mode 100644 docs/source/legacy/feature-extraction.md

diff --git a/docs/source/documentation_plan.md b/docs/source/documentation_plan.md
deleted file mode 100644
index 50a4310..0000000
--- a/docs/source/documentation_plan.md
+++ /dev/null
@@ -1,139 +0,0 @@
----
-orphan: true
----
-
-# Documentation Architecture and Migration Plan (Phase 0)
-
-This page defines the Phase 0 documentation architecture and inventory for
-reorganizing `batdetect2` documentation using the Diataxis framework.
-
-## Scope and goals
-
-Phase 0 focuses on architecture and prioritization only. It does not attempt
-to write all new docs yet.
-
-Primary goals:
-
-1. Define a target docs architecture by Diataxis type.
-2. Map current pages to target documentation types.
-3. Identify what to keep, split, rewrite, or deprecate.
-4. Set priorities for implementation phases.
-
-## Audiences
-
-Two primary audiences are in scope.
-
-1. Ecologists who prefer minimal coding, focused on practical workflows:
-   run inference, inspect outputs, and possibly train with custom data.
-2. Ecologists or bioacousticians who are Python-savvy and want to customize
-   workflows, training, and analysis.
-
-## Target information architecture
-
-The target architecture uses four top-level documentation sections.
-
-1. Tutorials
-   - Learning-oriented, single-path, reproducible walkthroughs.
-2. How-to guides
-   - Task-oriented procedures for common real goals.
-3. Reference
-   - Factual descriptions of CLI, configs, APIs, and formats.
-4. Explanation
-   - Conceptual material that explains why design and workflow decisions
-     matter.
-
-Cross-cutting navigation conventions:
-
-- Every page starts with audience, prerequisites, and outcome.
-- Every page serves one Diataxis type only.
-- Beginner-first path is prioritized, with clear links to advanced pages.
-
-## Phase 0 inventory: current docs mapped to Diataxis
-
-Legend:
-
-- Keep: useful as-is with minor edits.
-- Split: contains mixed documentation types and should be separated.
-- Rewrite: major changes needed to fit target audience/type.
-- Move: content is valid but belongs under another section.
-
-| Current page | Current role | Target type | Audience | Action | Priority |
-| --- | --- | --- | --- | --- | --- |
-| `README.md` | Mixed quickstart + CLI + API + warning | Tutorial + How-to + Explanation (split) | 1 + 2 | Split | P0 |
-| `docs/source/index.md` | Sparse landing page | Navigation hub | 1 + 2 | Rewrite | P0 |
-| `docs/source/architecture.md` | Internal architecture deep dive | Explanation + developer reference | 2 | Move/trim | P2 |
-| `docs/source/postprocessing.md` | Concept + config + internals + usage | Explanation + How-to + Reference (split) | 1 + 2 | Split | P1 |
-| `docs/source/preprocessing/index.md` | Conceptual overview with some procedural flow | Explanation | 2 (and 1 optional) | Keep/trim | P2 |
-| `docs/source/preprocessing/audio.md` | Detailed configuration and behavior | Reference + How-to fragments | 2 | Split | P2 |
-| `docs/source/preprocessing/spectrogram.md` | Detailed configuration and behavior | Reference + How-to fragments | 2 | Split | P2 |
-| `docs/source/preprocessing/usage.md` | Usage patterns + concept | How-to + Explanation (split) | 2 | Split | P1 |
-| `docs/source/data/index.md` | Data-loading section index | Reference index | 2 | Keep/update | P2 |
-| `docs/source/data/aoef.md` | Config and examples | How-to + Reference (split) | 2 | Split | P1 |
-| `docs/source/data/legacy.md` | Legacy formats and config | How-to + Reference (split) | 2 | Split | P2 |
-| `docs/source/targets/index.md` | Long conceptual + process overview | Explanation + How-to (split) | 2 | Split | P2 |
-| `docs/source/targets/tags_and_terms.md` | Definitions + guidance | Explanation + Reference | 2 | Split | P2 |
-| `docs/source/targets/filtering.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
-| `docs/source/targets/transform.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
-| `docs/source/targets/classes.md` | Procedure + config | How-to + Reference | 2 | Split | P2 |
-| `docs/source/targets/rois.md` | Concept + mapping details | Explanation + Reference | 2 | Split | P2 |
-| `docs/source/targets/use.md` | Integration overview | Explanation | 2 | Keep/trim | P2 |
-| `docs/source/reference/index.md` | Small reference root | Reference | 2 | Expand | P1 |
-| `docs/source/reference/configs.md` | Autodoc for configs | Reference | 2 | Keep | P1 |
-| `docs/source/reference/targets.md` | Autodoc for targets | Reference | 2 | Keep | P2 |
-
-## CLI and API documentation gaps (from code surface)
-
-Current command surface includes:
-
-- `batdetect2 detect` (compat command)
-- `batdetect2 predict directory`
-- `batdetect2 predict file_list`
-- `batdetect2 predict dataset`
-- `batdetect2 train`
-- `batdetect2 evaluate`
-- `batdetect2 data summary`
-- `batdetect2 data convert`
-
-These commands are not yet represented as a coherent user-facing task set.
-
-Priority gap actions:
-
-1. Add CLI reference pages for command signatures and options.
-2. Add beginner how-to pages for practical command recipes.
-3. Add migration guidance from `detect` to `predict` workflows.
-
-## Priority architecture for implementation phases
-
-### P0 (this phase): architecture and inventory
-
-- Done in this file.
-- Define structure and classify existing material.
-
-### P1: user-critical docs for running the model
-
-1. Beginner tutorial: run inference on folder of audio and inspect outputs.
-2. How-to guides for repeatable inference tasks and threshold tuning.
-3. Reference: complete CLI docs for prediction and outputs.
-4. Explanation: interpretation caveats and validation guidance.
-
-### P2: advanced customization and training
-
-1. How-to guides for custom dataset preparation and training.
-2. Reference for data formats, targets, and preprocessing configs.
-3. Explanation docs for target design and pipeline trade-offs.
-
-### P3: polish and contributor consistency
-
-1. Tight cross-linking across Diataxis boundaries.
-2. Consistent page templates and terminology.
-3. Reader testing with representative users from both audiences.
-
-## Definition of done for Phase 0
-
-Phase 0 is complete when:
-
-1. The target architecture is defined.
-2. Existing content is inventoried and classified.
-3. Prioritized migration path is agreed.
-
-This page satisfies these criteria and is the baseline for Phase 1 work.
diff --git a/docs/source/explanation/extracted-features-and-embeddings.md b/docs/source/explanation/extracted-features-and-embeddings.md
index d2ea44b..01d2837 100644
--- a/docs/source/explanation/extracted-features-and-embeddings.md
+++ b/docs/source/explanation/extracted-features-and-embeddings.md
@@ -2,11 +2,13 @@
 
 The current API exposes a per-detection `features` vector.
 
-Older BatDetect2 workflows also exposed concepts such as `cnn_feats`, `spec_features`, and `spec_slices`.
+Older BatDetect2 workflows also exposed concepts such as `cnn_feats`,
+`spec_features`, and `spec_slices`.
 
 ## What the current feature vector is
 
-In the current stack, each retained detection can carry an internal feature representation produced by the model output pipeline.
+In the current stack, each retained detection can carry an internal feature
+representation produced by the model output pipeline.
 
 This is useful for downstream exploration, comparison, and custom analysis.
 
@@ -18,19 +20,24 @@ They are also not a substitute for careful validation.
 
 ## Why people refer to them as embeddings
 
-In practice, users often treat these feature vectors as embeddings because they can be used as dense learned representations of detections.
+In practice, users often treat these feature vectors as embeddings because they
+can be used as dense learned representations of detections.
 
-That usage is reasonable, but you should still treat them as model-derived internal representations whose meaning depends on the training setup.
+That usage is reasonable, but you should still treat them as model-derived
+internal representations whose meaning depends on the training setup.
 
 ## Legacy terminology versus current terminology
 
 - legacy `cnn_feats` referred to CNN feature outputs in the older workflow,
 - legacy `spec_features` referred to lower-level extracted call features,
-- current `features` are the per-detection vectors attached to `Detection` objects.
+- current `features` are the per-detection vectors attached to `Detection`
+  objects.
 
 These are related ideas, but not necessarily one-to-one replacements.
 
 ## Related pages
 
-- Inspect detection features in Python: {doc}`../how_to/inspect-detection-features-in-python`
-- Legacy feature extraction: {doc}`../legacy/feature-extraction`
+- Inspect detection features in Python:
+  {doc}`../how_to/inspect-detection-features-in-python`
+- Legacy migration guide:
+  {doc}`../legacy/migration-guide`
diff --git a/docs/source/legacy/feature-extraction.md b/docs/source/legacy/feature-extraction.md
deleted file mode 100644
index f14be19..0000000
--- a/docs/source/legacy/feature-extraction.md
+++ /dev/null
@@ -1,34 +0,0 @@
-# Legacy feature extraction outputs
-
-The previous BatDetect2 workflow exposed several output concepts that users may still rely on.
-
-These included:
-
-- `cnn_feats`
-- `spec_features`
-- `spec_slices`
-
-## Why this matters
-
-Users exploring older notebooks or downstream analysis code often encounter these names first.
-
-The current stack exposes a different surface centered on per-detection `features` plus configurable output formatters.
-
-## Migration note
-
-There is not always a strict one-to-one replacement.
-
-When migrating, validate which part of the old workflow you actually need:
-
-- low-level exported features,
-- spectrogram slices,
-- model-internal feature vectors,
-- legacy JSON output shape.
-
-Then map that need onto the current API and output format configuration.
-
-## Related pages
-
-- Migration guide: {doc}`migration-guide`
-- Current features explanation: {doc}`../explanation/extracted-features-and-embeddings`
-- Output formats reference: {doc}`../reference/output-formats`
diff --git a/docs/source/legacy/index.md b/docs/source/legacy/index.md
index 226adf1..a9b2c56 100644
--- a/docs/source/legacy/index.md
+++ b/docs/source/legacy/index.md
@@ -1,9 +1,8 @@
-# Legacy documentation
+# BatDetect2 v1.0 documentation
 
-This section documents the previous BatDetect2 workflow.
+This section documents the BatDetect2 workflow for version 1.
 
-Use these pages if you need to keep working with the older `batdetect2 detect`
-command or the older `batdetect2.api` interface.
+Use these pages if you need to keep working with the older `batdetect2 detect` command or the older `batdetect2.api` interface.
 
 For new projects, we recommend the current workflow:
 
@@ -25,6 +24,5 @@ New users should start with {doc}`../getting_started` and {doc}`../tutorials/ind
 
 cli-detect
 python-api
-feature-extraction
 migration-guide
 ```