BirdNet Custom Training Pipeline Architecture

Overview

The BirdNet Custom Training pipeline processes large acoustic datasets, trains a custom BirdNet classifier, and stores the resulting model in Azure Blob Storage.

Key Components

  1. Datastore:
    • customaudiodata: Stores both the raw training data and the output custom model.
  2. Pipeline Steps:
    • Training Step:
      • Runs the script train.py, located in birdnet_scripts/.
      • Reads the acoustic training data from BirdNET_training_datasets/Stage1/Training/ in the customaudiodata datastore.
      • Saves the trained model to BirdNet_custom_models/Stage1/ in the same datastore.
  3. Pipeline Details:
    • Pipeline Name: BirdNet-Custom-Training-Pipeline
    • Training Script: train.py (in birdnet_scripts/)
    • Compute Resources: The pipeline runs on a GPU cluster so that the BirdNet model trains efficiently.
    • Environment: birdnet-training-env, built from the provided Dockerfile (environments/birdnet-training-env).
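
As a rough illustration, the components listed above can be gathered into a single configuration object. This is only a sketch under stated assumptions: the class PipelineConfig, its field names, and the --input-dir / --output-dir flags are hypothetical and are not taken from the pipeline's actual code; only the string values come from this document. It uses the Python standard library alone and does not call any Azure SDK.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List


@dataclass(frozen=True)
class PipelineConfig:
    """Illustrative container; values are copied from the component list."""
    pipeline_name: str = "BirdNet-Custom-Training-Pipeline"
    datastore: str = "customaudiodata"
    script_dir: str = "birdnet_scripts"
    script_name: str = "train.py"
    input_path: str = "BirdNET_training_datasets/Stage1/Training/"
    output_path: str = "BirdNet_custom_models/Stage1/"
    environment: str = "birdnet-training-env"

    def script_path(self) -> str:
        # Join the script directory and file name with forward slashes.
        return str(PurePosixPath(self.script_dir) / self.script_name)

    def command(self) -> List[str]:
        # Hypothetical flags: the real train.py may name its arguments differently.
        return [
            "python", self.script_path(),
            "--input-dir", self.input_path,
            "--output-dir", self.output_path,
        ]


config = PipelineConfig()
print(" ".join(config.command()))
```

Keeping every name and path in one frozen object means a change of stage (for example, a different input folder) is made in one place rather than scattered across the pipeline definition.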

Model Training Parameters

  • Epochs: 100
  • Batch Size: 32
  • Learning Rate: 0.01
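
One way a training script could expose these parameters is through command-line arguments whose defaults match the values listed above. The following is a minimal sketch, not the actual interface of train.py: the flag names --epochs, --batch-size, and --learning-rate and the helper build_parser are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical argument parser; defaults mirror the documented parameters."""
    parser = argparse.ArgumentParser(description="Sketch of training parameters")
    parser.add_argument("--epochs", type=int, default=100,
                        help="number of passes over the training data")
    parser.add_argument("--batch-size", type=int, default=32,
                        help="number of samples per gradient update")
    parser.add_argument("--learning-rate", type=float, default=0.01,
                        help="step size used by the optimizer")
    return parser


# With no arguments, the documented defaults apply.
defaults = build_parser().parse_args([])

# A single value can be overridden without touching the others.
override = build_parser().parse_args(["--epochs", "50"])

print(defaults.epochs, defaults.batch_size, defaults.learning_rate)
print(override.epochs, override.batch_size, override.learning_rate)
```

Exposing the values as arguments lets a pipeline run change one parameter at a time while the defaults stay documented in a single place.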

The pipeline is designed to scale and to be customized: the model, the training data, and the compute resources can each be changed.