Audio Analysis V6 Classifier

The Cyanite API exposes a variety of classifiers for your music.

Mood

The mood multi-label classifier provides the following labels:

aggressive, calm, chilled, dark, energetic, epic, happy, romantic, sad, scary, sexy, ethereal, uplifting

Each label has a score ranging from 0 to 1, where 0 (0%) indicates that the track is unlikely to represent a given mood and 1 (100%) indicates a high probability that the track represents a given mood.

Since the mood of a track cannot always be described properly by a single tag, the mood classifier can predict multiple moods for a given song instead of only one. A track could, for example, be classified as dark (score: 0.9) while also being classified as aggressive (score: 0.8).

The mood can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution. In addition to the scores, the API also exposes a list of the most likely moods, or the term ambiguous in case the audio does not properly reflect any of our mood tags.
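
As a rough sketch of how these values could be fetched, the snippet below posts a GraphQL query for the track-level mood scores and the moodTags list of an already analysed library track. The endpoint URL, the token handling, and the field names (libraryTrack, audioAnalysisV6, AudioAnalysisV6Finished, mood, moodTags) are assumptions derived from the labels above; consult the GraphQL schema of your account for the authoritative names.

```python
# Minimal sketch, assuming a valid API token and an already analysed
# library track. Endpoint and field names are assumptions based on the
# label names above -- verify them against the GraphQL schema.
import requests

API_URL = "https://api.cyanite.ai/graphql"   # assumed endpoint
API_TOKEN = "<your-access-token>"
TRACK_ID = "<your-library-track-id>"

QUERY = """
query MoodOfTrack($id: ID!) {
  libraryTrack(id: $id) {
    ... on LibraryTrack {
      audioAnalysisV6 {
        ... on AudioAnalysisV6Finished {
          result {
            mood {
              aggressive calm chilled dark energetic epic happy
              romantic sad scary sexy ethereal uplifting
            }
            moodTags
          }
        }
      }
    }
  }
}
"""

response = requests.post(
    API_URL,
    json={"query": QUERY, "variables": {"id": TRACK_ID}},
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```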

Genre

The genre multi-label classifier provides the following labels:

ambient, blues, classical, country, electronicDance, folk, indieAlternative, jazz, latin, metal, pop, punk, rapHipHop, reggae, rnb, rock, singerSongwriter

Each label has a score ranging from 0 to 1, where 0 (0%) indicates that the track is unlikely to represent a given genre and 1 (100%) indicates a high probability that the track represents a given genre.

Since music can cross genre borders, the genre classifier can predict multiple genres for a given song instead of only one. A track could, for example, be classified as rapHipHop (score: 0.9) and also as reggae (score: 0.8).

The genre can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution. In addition to the scores, the API also exposes a list of the most likely genres.
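
A minimal sketch of how such multi-label scores could be reduced to a tag list on the client side; the scores and the 0.5 cut-off are made up for illustration and are not the decision rule Cyanite uses for its own list of most likely genres.

```python
# Turn multi-label genre scores into a ranked tag list.
# Scores and the 0.5 threshold are illustrative only.
genre_scores = {
    "rapHipHop": 0.9,
    "reggae": 0.8,
    "pop": 0.12,
    "ambient": 0.03,
}

THRESHOLD = 0.5  # assumed cut-off for "likely" genres

likely_genres = sorted(
    (label for label, score in genre_scores.items() if score >= THRESHOLD),
    key=genre_scores.get,
    reverse=True,
)
print(likely_genres)  # ['rapHipHop', 'reggae']
```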

EDM Sub-Genre

If a track's genre is classified as electronicDance, the EDM sub-genre classifier provides a deeper analysis layer with the following labels for EDM sub-genres:

breakbeatDrumAndBass, deepHouse, electro, house, minimal, techHouse, techno, trance

Each label has a score ranging from 0 to 1, where 0 (0%) indicates that the track is unlikely to represent a given sub-genre and 1 (100%) indicates a high probability that the track represents a given sub-genre.

The EDM sub-genre can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution. In addition to the scores, the API also exposes a list of the most likely EDM sub-genres.
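
Since the sub-genre labels only apply to electronic dance music, a client would typically inspect them only when electronicDance is among the likely genres. A small sketch of that guard, with made-up scores and an assumed 0.5 threshold:

```python
# Only inspect EDM sub-genre scores when the genre classifier already
# points to electronicDance; scores and the 0.5 threshold are made up.
genre_scores = {"electronicDance": 0.85, "pop": 0.2}
edm_subgenre_scores = {
    "breakbeatDrumAndBass": 0.1,
    "deepHouse": 0.7,
    "electro": 0.2,
    "house": 0.4,
    "minimal": 0.1,
    "techHouse": 0.6,
    "techno": 0.3,
    "trance": 0.1,
}

THRESHOLD = 0.5

if genre_scores.get("electronicDance", 0.0) >= THRESHOLD:
    likely_subgenres = [
        label for label, score in edm_subgenre_scores.items() if score >= THRESHOLD
    ]
    print("Likely EDM sub-genres:", likely_subgenres)   # ['deepHouse', 'techHouse']
else:
    print("Track is not classified as electronicDance; sub-genres not applicable.")
```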

Voice

The voice classifier categorizes the audio as containing a female or male singing voice, or as instrumental (non-vocal).

Each label has a score ranging from 0 to 1, where 0 (0%) indicates that the track is unlikely to have the given voice elements and 1 (100%) indicates a high probability that the track contains the given voice elements.

The voice classifier results can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution.
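
Segment-wise results arrive as one score per 15-second window, so segment i covers roughly seconds 15·i to 15·(i+1). The sketch below maps such segment scores to time ranges and picks the strongest voice label per window; the data layout (three parallel score lists) is an assumption for the example, not the exact response format.

```python
# Illustrative mapping of segment-wise voice scores (one value per 15 s
# window) to time ranges. The data layout is assumed for the example.
SEGMENT_SECONDS = 15

female = [0.1, 0.8, 0.9, 0.2]
male = [0.1, 0.1, 0.05, 0.1]
instrumental = [0.9, 0.1, 0.05, 0.8]

for i, scores in enumerate(zip(female, male, instrumental)):
    start, end = i * SEGMENT_SECONDS, (i + 1) * SEGMENT_SECONDS
    label = ("female", "male", "instrumental")[scores.index(max(scores))]
    print(f"{start}s-{end}s: {label} {scores}")
```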

Voice Presence Profile

This label describes the amount of singing voice throughout the full duration of the track and may be none, low, medium or high.

Predominant Voice Gender

This label indicates whether the predominant singing voice more likely holds female or male characteristics. It is none if no singing voice is detected.

Instruments

The instrument classifier currently only predicts the presence of percussive instruments, such as drums, drum machines, or similar. The result is exposed under the label percussion.

The label has a score ranging from 0 to 1, where 0 (0%) indicates that the track is unlikely to contain a given instrument and 1 (100%) indicates a high probability that the track contains a given instrument.

The instrument classifier result can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution.
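
As a quick illustration of how the segment-wise percussion score might be used, the sketch below flags the 15-second windows in which percussion is likely present; the scores and the 0.5 cut-off are made up.

```python
# Flag 15 s windows in which percussion is likely present.
SEGMENT_SECONDS = 15
THRESHOLD = 0.5  # arbitrary example cut-off

percussion = [0.05, 0.1, 0.7, 0.9, 0.85, 0.2]  # made-up segment scores

for i, score in enumerate(percussion):
    if score >= THRESHOLD:
        start, end = i * SEGMENT_SECONDS, (i + 1) * SEGMENT_SECONDS
        print(f"Percussion likely between {start}s and {end}s (score {score:.2f})")
```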

Valence / Arousal

The valence / arousal regression model predicts the degree of valence and arousal of a track.

Each label has a score ranging from -1 to 1, where -1 indicates the lowest degree (negative valence, negative arousal) and 1 indicates the highest degree (positive valence, positive arousal).

The valence / arousal results can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution.
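
One common way to read the two values together is as coordinates on a valence/arousal plane, where each quadrant corresponds to a broad emotional character. The quadrant descriptions in this sketch are an illustrative interpretation, not labels returned by the API.

```python
# Map track-level valence/arousal (both in [-1, 1]) to a rough quadrant.
# The quadrant names are only an illustrative reading of the plane.
def quadrant(valence: float, arousal: float) -> str:
    if valence >= 0 and arousal >= 0:
        return "positive & energetic (e.g. happy, excited)"
    if valence >= 0 and arousal < 0:
        return "positive & calm (e.g. relaxed, peaceful)"
    if valence < 0 and arousal >= 0:
        return "negative & energetic (e.g. tense, angry)"
    return "negative & calm (e.g. sad, melancholic)"

print(quadrant(0.6, -0.4))  # positive & calm (e.g. relaxed, peaceful)
```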

Energy Level

The Energy Level is a label for the intensity of an analysed track and can be variable, medium, high, or low.

A low Energy Level indicates a calm overall character of a track, while a high one stands for strong and powerful characteristics. A track with a variable Energy Level shows continual changes in its intensity profile.

Energy Dynamics

Energy Dynamics describes the progression of the Energy Level throughout the duration of the music piece, where the value low represents a stable trend and high indicates strong variance between low and high energy levels. A high Energy Dynamics value corresponds to the variable Energy Level (see above).

Musical Era

The musical era classifier describes the era in which the audio was likely produced, or which its production sound suggests.

Keywords (Experimental)

An experimental taxonomy of keywords that can be associated with the audio. The data is experimental and expected to change. Access must be requested from the Cyanite sales team.

Example keywords:

uplifting, edm, friendly, motivating, pleasant, happy, energetic, joy, bliss, gladness, auspicious, pleasure, forceful, determined, confident, positive, optimistic, agile, animated, journey, party, driving, kicking, impelling, upbeat