Developer Guidelines for BirdNet Custom Training Pipeline
Current Configuration
Training Script: The script used for training is train.py.
Model Output: The trained model is stored in BirdNet_custom_models/Stage1/ in the customaudiodata datastore.
Environment: Training runs in the Docker environment birdnet-training-env, which is built from the Dockerfile at environments/birdnet-training-env.
Modifying the Pipeline
Changing the Training Data
Upload new training data to the customaudiodata datastore.
Update the input path in the pipeline configuration to point to the new dataset.
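As an illustration of the second step, the sketch below builds an input path for a folder in the customaudiodata datastore. It assumes the pipeline accepts Azure ML datastore URIs of the form azureml://datastores/<datastore>/paths/<path>; the helper name datastore_uri and the folder BirdNet_training_data/v2 are hypothetical, so substitute the folder you actually uploaded to.

```python
def datastore_uri(datastore: str, relative_path: str) -> str:
    """Build an azureml:// URI for a folder inside a named datastore."""
    # Strip stray slashes so the result has exactly one separator per segment.
    cleaned = relative_path.strip("/")
    return f"azureml://datastores/{datastore}/paths/{cleaned}/"


# "BirdNet_training_data/v2" is a hypothetical folder name, not taken from this guide.
new_input_path = datastore_uri("customaudiodata", "BirdNet_training_data/v2")
print(new_input_path)
```

The resulting string is what would replace the existing input path in the pipeline configuration, if the pipeline is configured with URIs rather than mounted paths.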
Changing Model Hyperparameters
To adjust the training process (e.g., changing epochs, batch size, or learning rate), modify the pipeline arguments (JSON):
{
  "--epochs": 100,
  "--batch_size": 32,
  "--learning_rate": 0.01
}
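To show how these arguments might reach the training script, here is a minimal sketch. It assumes train.py reads its hyperparameters with argparse using the same flag names; the actual parser in train.py may differ, and the helper names build_parser and to_cli_args are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags in the pipeline arguments."""
    parser = argparse.ArgumentParser(description="BirdNet custom training (sketch)")
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--learning_rate", type=float, default=0.01)
    return parser


def to_cli_args(pipeline_args: dict) -> list:
    """Flatten {"--flag": value} pairs into a command-line argument list."""
    cli = []
    for flag, value in pipeline_args.items():
        cli.extend([flag, str(value)])
    return cli


# Example override: values chosen only for illustration.
pipeline_args = {"--epochs": 50, "--batch_size": 16, "--learning_rate": 0.001}
args = build_parser().parse_args(to_cli_args(pipeline_args))
print(args.epochs, args.batch_size, args.learning_rate)
```

Declaring a type for each flag means a malformed value (for example, a non-numeric epoch count) fails at parse time rather than partway through training.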
Modifying Compute Resources
The pipeline currently uses a GPU cluster for training. To use a different compute target, change the compute configuration in the pipeline settings in AML Studio.
Testing Changes
After modifying the pipeline, run it in a development environment first. Confirm that training completes and that a model is written to the expected output location before deploying the updated pipeline.
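One simple check after a development run is to confirm that the model output folder is not empty. The sketch below is an assumption about how such a check could look, not part of the existing pipeline: the function has_model_artifacts and the file name model.bin are hypothetical, and a temporary directory stands in for a local copy of BirdNet_custom_models/Stage1/.

```python
import tempfile
from pathlib import Path


def has_model_artifacts(output_dir: Path) -> bool:
    """Return True if output_dir exists and contains at least one file."""
    return output_dir.is_dir() and any(p.is_file() for p in output_dir.rglob("*"))


# A temporary directory stands in for a downloaded or mounted output folder.
with tempfile.TemporaryDirectory() as tmp:
    stage_dir = Path(tmp) / "BirdNet_custom_models" / "Stage1"
    stage_dir.mkdir(parents=True)
    empty_result = has_model_artifacts(stage_dir)   # nothing written yet

    # "model.bin" is a placeholder name; the real artifact name may differ.
    (stage_dir / "model.bin").write_bytes(b"placeholder")
    filled_result = has_model_artifacts(stage_dir)  # one file present

print(empty_result, filled_result)
```

A check like this only confirms that something was written; it does not validate model quality, so it complements rather than replaces a review of the training metrics.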