Course Outline
Foundations of Audio Classification
- Types of sound events: environmental, mechanical, and human-generated
- Overview of use cases: surveillance, monitoring, and automation
- Differences between audio classification, detection, and segmentation
Audio Data and Feature Extraction
- Types of audio files and common formats
- Considerations for sampling rate, windowing, and frame size
- Extracting MFCCs, chroma features, and mel-spectrograms
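The interaction between sampling rate, window length, and hop size can be made concrete with a little arithmetic. The sketch below is illustrative only: the 16 kHz rate, 25 ms window, and 10 ms hop are assumed values chosen for the example, not settings prescribed by the course, and `frame_count` is a hypothetical helper name.

```python
def frame_count(num_samples: int, frame_length: int, hop_length: int) -> int:
    """Number of full frames that fit in the signal when no padding is applied."""
    if num_samples < frame_length:
        return 0
    return 1 + (num_samples - frame_length) // hop_length


# Assumed example settings (not taken from the course material).
sample_rate = 16_000                      # samples per second
frame_length = sample_rate * 25 // 1000   # 25 ms window, in samples
hop_length = sample_rate * 10 // 1000     # 10 ms hop, in samples
num_samples = sample_rate                 # one second of audio

frames = frame_count(num_samples, frame_length, hop_length)
print(frame_length, hop_length, frames)
```

A shorter hop yields more frames per second and finer time resolution, at the cost of more computation per clip.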
Data Preparation and Annotation
- Utilising datasets such as UrbanSound8K, ESC-50, and custom datasets
- Labelling sound events and defining temporal boundaries
- Strategies for balancing datasets and augmenting audio data
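One common way to compensate for class imbalance is to weight each class inversely to its frequency. The following is a minimal sketch of that idea, not a method specified by the course; the label names and the helper `inverse_frequency_weights` are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class so that rarer classes receive larger weights.

    Uses weight = total / (num_classes * class_count), so a perfectly
    balanced dataset gives every class a weight of 1.0.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {label: total / (num_classes * count) for label, count in counts.items()}


# Hypothetical annotation labels for four clips.
labels = ["dog_bark", "dog_bark", "dog_bark", "siren"]
weights = inverse_frequency_weights(labels)
print(weights)
```

Weights like these are typically passed to a weighted loss function or used to drive oversampling of the rarer classes.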
Building Audio Classification Models
- Applying convolutional neural networks (CNNs) to audio data
- Model inputs: raw waveforms versus extracted features
- Selecting loss functions, evaluation metrics, and mitigating overfitting
Event Detection and Temporal Localisation
- Strategies for frame-based and segment-based detection
- Refining detections through thresholds and smoothing techniques
- Visualising predictions on audio timelines
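Thresholding and smoothing of frame-level predictions can be sketched as follows. This is one simple approach under stated assumptions (a median filter with edge replication, then a fixed threshold); the probability values, the 0.5 threshold, and the function names are all made up for illustration.

```python
from statistics import median


def median_smooth(probs, width=3):
    """Median-filter a sequence of frame probabilities (odd width, edges replicated)."""
    probs = list(probs)
    half = width // 2
    padded = [probs[0]] * half + probs + [probs[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(probs))]


def probs_to_events(probs, threshold=0.5):
    """Group consecutive frames at or above the threshold into (start, end) frame pairs.

    The end index is exclusive, so multiplying both by the hop duration
    gives the event boundaries in seconds.
    """
    events = []
    start = None
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(probs)))
    return events


# Hypothetical per-frame probabilities for a single sound class.
raw = [0.1, 0.9, 0.1, 0.8, 0.9, 0.7, 0.2, 0.1]
raw_events = probs_to_events(raw)
smooth_events = probs_to_events(median_smooth(raw))
print(raw_events, smooth_events)
```

In this example the isolated spike at frame 1 is removed by the median filter, leaving a single event spanning frames 2 to 5.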
Advanced Topics and Real-Time Processing
- Employing transfer learning for scenarios with limited data
- Deploying models using TensorFlow Lite or ONNX
- Handling streaming audio processing and latency considerations
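For streaming input, audio usually arrives in small chunks that must be regrouped into overlapping analysis windows before classification. The sketch below shows one way to do this with a buffer; the chunk contents, window size, and hop size are toy values, and `sliding_windows` is a hypothetical helper rather than an API from TensorFlow Lite or ONNX.

```python
from collections import deque
from itertools import islice


def sliding_windows(chunks, window_size, hop_size):
    """Yield overlapping windows of samples from an iterable of incoming chunks."""
    buffer = deque()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit every full window currently available, then advance by one hop.
        while len(buffer) >= window_size:
            yield list(islice(buffer, window_size))
            for _ in range(hop_size):
                buffer.popleft()


# Toy stream: integers stand in for audio samples.
stream = [[0, 1, 2], [3, 4, 5], [6, 7]]
windows = list(sliding_windows(stream, window_size=4, hop_size=2))
print(windows)
```

A window cannot be classified until all of its samples have arrived, so the window duration (window size divided by sampling rate) sets a lower bound on latency before inference time is added.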
Project Development and Application Scenarios
- Designing a comprehensive pipeline from data ingestion to classification
- Developing a proof-of-concept for surveillance, quality control, or monitoring
- Implementing logging, alerting, and integration with dashboards or APIs
Summary and Next Steps
Requirements
- A solid understanding of machine learning concepts and model training processes
- Practical experience with Python programming and data pre-processing
- Familiarity with the fundamentals of digital audio
Target Audience
- Data scientists
- Machine learning engineers
- Researchers and developers specialising in audio signal processing
Duration
21 Hours