MegaDetector Pipeline Setup Instructions
Prerequisites
- You must have access to the shared Azure Machine Learning (AML) workspace in the same subscription (currently ns-ii-tech-stg-ml-workspace).
- The following datastores must be configured:
  - landing_kutuma_hashed: Contains raw images.
  - ml_public_models: Stores the MegaDetector model (currently md_v5b.0.0.pt).
  - bronze_camera_trap: For processed results.
  - bronze_megadetector: For detection results.
Setting Up the Pipeline in Azure Machine Learning Studio
1. Log into Azure Machine Learning Studio
Go to Azure Machine Learning Studio and select the workspace in your subscription.
2. Verify Datastores
Ensure that the necessary datastores are set up:
- landing_kutuma_hashed
- ml_public_models
- bronze_camera_trap
- bronze_megadetector
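
The datastore check above can also be scripted. The following is a minimal sketch, assuming the Azure ML Python SDK v2 (`azure-ai-ml`) and `azure-identity` are installed; the helper names `missing_datastores` and `list_workspace_datastores` are hypothetical and not part of the pipeline itself.

```python
from typing import Iterable, List

# Datastore names taken from the list in this document.
REQUIRED_DATASTORES = (
    "landing_kutuma_hashed",
    "ml_public_models",
    "bronze_camera_trap",
    "bronze_megadetector",
)


def missing_datastores(available: Iterable[str]) -> List[str]:
    """Return the required datastore names that are absent from `available`."""
    present = set(available)
    return [name for name in REQUIRED_DATASTORES if name not in present]


def list_workspace_datastores(
    subscription_id: str, resource_group: str, workspace: str
) -> List[str]:
    """List datastore names registered in an AML workspace."""
    # Imported lazily so the pure check above works without the Azure SDK.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    client = MLClient(
        DefaultAzureCredential(), subscription_id, resource_group, workspace
    )
    return [datastore.name for datastore in client.datastores.list()]
```

Passing the result of `list_workspace_datastores(...)` to `missing_datastores(...)` should return an empty list when all four datastores are present; the subscription ID and resource group are values you supply yourself.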
3. Verify Pipeline Configuration
- The current pipeline is named MegaDetector-NaturalState-RBP-Pipeline and processes images in phases.
- Phases 1-3 handle ingestion, detection, and post-processing of camera trap images.
4. Running the Pipeline
- Open the pipeline in AML Studio.
- Verify the input data directory under landing_kutuma_hashed.
- Set the output locations in bronze_megadetector and bronze_camera_trap.
- Click Submit to start the pipeline.
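
When checking input and output locations, it can help to see how AML addresses a folder inside a datastore. The sketch below builds short-form datastore URIs of the form `azureml://datastores/<name>/paths/<path>`. The helper name, the dictionary keys, and the folder names (`images/`, `detections/`, `processed/`) are illustrative assumptions, not the pipeline's actual configuration.

```python
def datastore_uri(datastore: str, path: str = "") -> str:
    """Build an AML short-form URI for a path inside a datastore."""
    # Strip leading slashes so the result never contains "paths//".
    return f"azureml://datastores/{datastore}/paths/{path.lstrip('/')}"


# Hypothetical folder layout; replace with the paths used in your workspace.
PIPELINE_LOCATIONS = {
    "input_images": datastore_uri("landing_kutuma_hashed", "images/"),
    "detections": datastore_uri("bronze_megadetector", "detections/"),
    "processed": datastore_uri("bronze_camera_trap", "processed/"),
}

for role, uri in PIPELINE_LOCATIONS.items():
    print(f"{role}: {uri}")
```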
5. Monitor Execution
Once submitted, monitor the pipeline’s progress under Experiments. Detailed logs will be available for each step of the pipeline.
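
Monitoring can also be done from a script rather than the Studio UI. The following is a rough sketch of a polling loop, assuming that `Completed`, `Failed`, and `Canceled` are the terminal job statuses reported by AML; `wait_for_job` is a hypothetical helper, and the status source is passed in as a callable so the loop does not depend on any particular SDK version.

```python
import time
from typing import Callable

# Assumed terminal statuses for an AML job.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
) -> str:
    """Poll `get_status` until it returns a terminal status, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job not finished after {max_polls} polls")
```

With the Azure ML Python SDK v2, a call might look like `wait_for_job(lambda: client.jobs.get(job_name).status)`, where `client` is an `MLClient` for the workspace and `job_name` is the name of the submitted pipeline job.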
Pipeline Deployment
The MegaDetector pipeline is deployed and managed primarily via a scheduled endpoint. Documentation on how to deploy this endpoint using Terraform is available in the NIP-Lakehouse-Infra repository.
For further details on setting up the infrastructure required for the MegaDetector pipeline, consult the Terraform configuration.