Solutions

We provide machine learning solutions for several multimodal application domains:

ML for video analysis

We provide solutions that apply machine learning to all aspects of video (sound, image, and accompanying text) to support several video analysis applications:

Video summarization
Video classification and retrieval
Multimodal content-based video / movie recommendation and profiling
Human recognition and human activity recognition

Music information retrieval

We have years of experience building models that analyze musical content across subdomains such as:

Musical genre classification
Recognition of music mood and emotion
Content-based music retrieval and recommendation

Speech analytics

We combine audio and text information from speech signals to recognize what people say (and how they say it) when interacting with each other, for applications such as:

Health and Wellbeing: cognitive decline detection, mental health monitoring, depression estimation
Mood and emotion classification
Speaking style and public speaking quality analysis

Audio (non-speech) recognition

Mouse vocalization analysis
Audio event detection
Urban soundscape analysis
Audio context classification

Environment & Bioacoustics

Urban soundscape quality assessment and monitoring
Wildlife and bioacoustic signal analysis
Environmental sound event detection

Defence & Security

Acoustic threat detection using few-shot and self-supervised learning
Adversarial robustness evaluation of audio AI systems
Frugal AI for resource-constrained defence applications

Image retrieval and classification

Semantic image classification and segmentation
Photography aesthetics recognition