BirdNet Pipeline Setup Instructions
Prerequisites
- Access to the shared Azure Machine Learning (AML) workspace in the same subscription.
- The following datastores should be configured:
  - landing_kutuma_hashed: Contains raw audio recordings.
  - ml_public_models: Stores the BirdNet model.
  - bronze_audio_data: For processed and intermediate results.
Setting Up the Pipeline in Azure Machine Learning Studio
1. Log into Azure Machine Learning Studio
Go to Azure Machine Learning Studio and select the workspace in your subscription.
2. Verify Datastores
Ensure that the necessary datastores are set up:
- landing_kutuma_hashed
- ml_public_models
- bronze_audio_data
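The datastore check in this step can also be scripted. The following is a minimal sketch, not part of the pipeline itself: it assumes the `azure-ai-ml` and `azure-identity` packages are installed and that a workspace `config.json` is available locally. The helper names `missing_datastores` and `list_workspace_datastores` are illustrative.

```python
from typing import Iterable, List

# Datastore names taken from the prerequisites in this document.
REQUIRED_DATASTORES = (
    "landing_kutuma_hashed",
    "ml_public_models",
    "bronze_audio_data",
)


def missing_datastores(
    available: Iterable[str],
    required: Iterable[str] = REQUIRED_DATASTORES,
) -> List[str]:
    """Return the required datastore names that are absent from `available`."""
    present = set(available)
    return [name for name in required if name not in present]


def list_workspace_datastores() -> List[str]:
    """List datastore names registered in the workspace.

    Assumption: azure-ai-ml and azure-identity are installed and a
    config.json describing the workspace can be found from the working directory.
    """
    # Imported lazily so missing_datastores() works without the Azure SDK.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient.from_config(credential=DefaultAzureCredential())
    return [datastore.name for datastore in ml_client.datastores.list()]
```

Calling `missing_datastores(list_workspace_datastores())` returns an empty list when all three datastores are registered, and otherwise the names that still need to be configured.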
3. Verify Pipeline Configuration
The current pipeline is named BirdNet-Natural-State-RBP-Pipeline and processes audio recordings in three phases (Phases 1-3), which together handle ingestion, analysis, post-processing, and archiving of audio data.
4. Running the Pipeline
- Open the pipeline in AML Studio.
- Verify the input data directory under landing_kutuma_hashed.
- Set the output locations in bronze_audio_data.
- Click Submit to start the pipeline.
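If the input and output locations are ever set programmatically rather than in the Studio UI, Azure Machine Learning addresses datastore paths with `azureml://` URIs. The sketch below only builds such URIs. The folder names `recordings/` and `birdnet/` are hypothetical placeholders, since the actual directory layout inside each datastore is not described here.

```python
def datastore_uri(datastore: str, path: str = "") -> str:
    """Build an azureml:// URI for a path inside a registered datastore."""
    if not datastore:
        raise ValueError("datastore name must not be empty")
    # Strip leading slashes so the URI never contains a double slash.
    return f"azureml://datastores/{datastore}/paths/{path.lstrip('/')}"


# Hypothetical folders: replace with the directories verified in AML Studio.
input_uri = datastore_uri("landing_kutuma_hashed", "recordings/")
output_uri = datastore_uri("bronze_audio_data", "birdnet/")
```

These strings can be passed as the `path` of a pipeline input or output in the Azure ML Python SDK v2, assuming the pipeline exposes its data locations as parameters.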
5. Monitor Execution
Once submitted, monitor the pipeline’s progress under Experiments. Detailed logs will be available for each step of the pipeline.
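Monitoring can also be done from a script instead of the Studio UI. The sketch below is one possible polling loop, not the project's own tooling. It takes any callable that returns the current job status, so it does not depend on a particular SDK version. The function name `wait_for_job` and the default intervals are assumptions.

```python
import time
from typing import Callable

# Terminal job states reported by Azure Machine Learning.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 3600,
) -> str:
    """Poll `get_status` until a terminal state is reached or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job still in state {status!r} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

With the Azure ML Python SDK v2, the status callable could be `lambda: ml_client.jobs.get(job_name).status`, where `ml_client` is an authenticated `MLClient` and `job_name` is the name of the submitted run (both placeholders here). `ml_client.jobs.stream(job_name)` is an alternative that blocks and streams the logs until the job finishes.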
Pipeline Deployment
The BirdNet pipeline is deployed and managed primarily via a scheduled endpoint. Documentation on how to deploy this endpoint using Terraform is available in the NIP-Lakehouse-Infra repository.
For further details on setting up the infrastructure required for the BirdNet pipeline, consult the Terraform configuration.