
Audio Analysis V6 Classifier

The Cyanite API exposes a variety of classifiers for your music.

BPM

The BPM classifier provides the tempo of the track in beats per minute (BPM).

Key

The Key classifier provides the predicted musical key of the track.

Mood

The mood multi-label classifier provides the following labels:

aggressive, calm, chilled, dark, energetic, epic, happy, romantic, sad, scary, sexy, ethereal, uplifting

Each label has a score ranging from 0-1, where 0 (0%) indicates that the track is unlikely to represent a given mood and 1 (100%) indicates a high probability that the track represents a given mood.

Since the mood of a track might not always be properly described by a single tag, the mood classifier is able to predict multiple moods for a given song instead of only one. A track could be classified with dark (Score: 0.9), while also being classified with aggressive (Score: 0.8).

The mood can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution. In addition to the scores, the API also exposes a list of the most likely moods, or the term ambiguous in case the audio does not reflect any of our mood tags properly.
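As an illustration of how the multi-label scores relate to the tag list, the selection could be sketched as follows. The flat score map and the 0.5 threshold are assumptions made for this example, not the API's actual response shape or cut-off:

```python
# Pick the most likely mood tags from a score map, best first.
# The 0.5 threshold is an illustrative assumption, not the API's cut-off.
def mood_tags(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return moods scoring at or above the threshold, highest first,
    or ["ambiguous"] when no mood passes the threshold."""
    tags = sorted((m for m, s in scores.items() if s >= threshold),
                  key=lambda m: scores[m], reverse=True)
    return tags or ["ambiguous"]

print(mood_tags({"dark": 0.9, "aggressive": 0.8, "happy": 0.1}))
# ['dark', 'aggressive']
print(mood_tags({"happy": 0.2, "sad": 0.1}))
# ['ambiguous']
```

Note how the example mirrors the dark/aggressive case above: both labels pass the threshold, so both appear in the tag list.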

Genre

The genre multi-label classifier provides the following labels:

ambient, blues, classical, country, electronicDance, folk, indieAlternative, jazz, latin, metal, pop, punk, rapHipHop, reggae, rnb, rock, singerSongwriter

Each label has a score ranging from 0-1, where 0 (0%) indicates that the track is unlikely to represent a given genre and 1 (100%) indicates a high probability that the track represents a given genre.

Since music can cross genre borders, the genre classifier can predict multiple genres for a given song instead of only one. A track could be classified as rapHipHop (Score: 0.9) but also reggae (Score: 0.8).

The genre can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution. In addition to the score, the API also exposes a list that includes the most likely genres.
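To illustrate how the segment-wise and track-averaged views relate, the sketch below averages per-segment genre scores into one track-level score per genre. The API computes its own averages, so this is only an illustration of the concept, and the flat score maps are an assumed shape:

```python
# Average 15-second segment scores into a single track-level score per
# genre. Illustrative only; the API exposes its own averaged values.
def track_average(segments: list[dict[str, float]]) -> dict[str, float]:
    totals: dict[str, float] = {}
    for seg in segments:
        for genre, score in seg.items():
            totals[genre] = totals.get(genre, 0.0) + score
    return {g: t / len(segments) for g, t in totals.items()}

segments = [{"rapHipHop": 0.75, "reggae": 0.5},
            {"rapHipHop": 0.25, "reggae": 1.0}]
print(track_average(segments))  # {'rapHipHop': 0.5, 'reggae': 0.75}
```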

Sub-genre

For some tracks an additional sub-genre can be predicted. Possible sub-genres include:

bluesRock, folkRock, hardRock, indieAlternative, psychedelicProgressiveRock, punk, rockAndRoll, popSoftRock, abstractIDMLeftfield, breakbeatDnB, deepHouse, electro, house, minimal, synthPop, techHouse, techno, trance, contemporaryRnB, gangsta, jazzyHipHop, popRap, trap, blackMetal, deathMetal, doomMetal, heavyMetal, metalcore, nuMetal, disco, funk, gospel, neoSoul, soul, bigBandSwing, bebop, contemporaryJazz, easyListening, fusion, latinJazz, smoothJazz, country, folk

Each label has a score ranging from 0-1, where 0 (0%) indicates that the track is unlikely to represent a given sub-genre and 1 (100%) indicates a high probability that the track represents a given sub-genre.

The sub-genre can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution. In addition to the score, the API also exposes a list that includes the most likely sub-genres.

caution

Some tracks don't have any sub-genre. In this case, the sub-genre tag list is an empty array and averaged segment values are unavailable.
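Client code should therefore handle the empty-array case explicitly. A minimal defensive sketch (the function name and string formatting are hypothetical, not part of the API):

```python
# Defensive handling of tracks without a sub-genre: the tag list comes
# back as an empty array, so fall back to a placeholder string.
def describe_subgenre(sub_genre_tags: list[str]) -> str:
    if not sub_genre_tags:
        return "no sub-genre detected"
    return ", ".join(sub_genre_tags)

print(describe_subgenre([]))                     # no sub-genre detected
print(describe_subgenre(["bluesRock", "folk"]))  # bluesRock, folk
```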

Voice

The voice classifier categorizes the audio as female singing voice, male singing voice, or instrumental (non-vocal).

Each label has a score ranging from 0-1, where 0 (0%) indicates that the track is unlikely to have the given voice elements and 1 (100%) indicates a high probability that the track contains the given voice elements.

The voice classifier results can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution.

Voice Presence Profile

This label describes the amount of singing voice throughout the full duration of the track and may be none, low, medium or high.

Predominant Voice Gender

This label indicates whether the predominant singing voice is more likely to hold female or male characteristics. It is none if no singing voice is detected.
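One way to picture how segment-wise voice scores could feed a presence profile is to bucket the share of vocal segments into the four profile values. The bucket boundaries and the per-segment argmax rule are illustrative assumptions; only the none/low/medium/high labels come from the documentation:

```python
# Sketch: derive a voice presence profile from segment-wise scores.
# A segment counts as vocal when a singing-voice label has the highest
# score. The 0.33/0.66 boundaries are assumptions for illustration.
def presence_profile(segments: list[dict[str, float]]) -> str:
    vocal = sum(1 for s in segments
                if max(s, key=s.get) in ("female", "male"))
    share = vocal / len(segments) if segments else 0.0
    if share == 0.0:
        return "none"
    if share < 0.33:
        return "low"
    if share < 0.66:
        return "medium"
    return "high"

segments = [{"female": 0.9, "male": 0.05, "instrumental": 0.05},
            {"female": 0.1, "male": 0.1, "instrumental": 0.8}]
print(presence_profile(segments))  # medium
```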

Instruments

The instrument classifier predicts the presence of the following instruments: percussion, synth, piano, acousticGuitar, electricGuitar, strings, bass, bassGuitar and brassWoodwinds.

It is possible to retrieve the presence of each instrument for each track segment, a list of the dominant instruments, and a taxonomy that describes the presence of each instrument over the complete track.

The segment instrument score ranges from 0-1, where 0 (0%) indicates that the segment is unlikely to contain a given instrument and 1 (100%) indicates a high probability that the track segment contains a given instrument.

The taxonomy values absent, partially, frequently and throughout describe the presence of each instrument:

absent: Instrument has not been detected.
throughout: Instrument is detected throughout the full duration of the track.
frequently: Instrument is detected in major parts of the track.
partially: Instrument is detected in minor parts of the track.
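A possible way to relate the per-segment scores to the four taxonomy values is to look at the share of segments in which the instrument is detected. The detection threshold and share boundaries below are assumptions made for this sketch; only the four taxonomy labels come from the documentation:

```python
# Map the share of segments in which an instrument is detected to the
# documented taxonomy (absent / partially / frequently / throughout).
# The 0.5 detection threshold and share boundaries are assumptions.
def instrument_taxonomy(segment_scores: list[float],
                        detected: float = 0.5) -> str:
    if not segment_scores:
        return "absent"
    share = sum(s >= detected for s in segment_scores) / len(segment_scores)
    if share == 0.0:
        return "absent"
    if share == 1.0:
        return "throughout"
    return "frequently" if share >= 0.5 else "partially"

print(instrument_taxonomy([0.9, 0.8, 0.7, 0.9]))  # throughout
print(instrument_taxonomy([0.9, 0.1, 0.1, 0.1]))  # partially
```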

Valence / Arousal

The valence / arousal regression model predicts the degree of valence or arousal of a track.

Each label has a score ranging from -1 to 1 where -1 indicates the lowest degree (negative valence, negative arousal) and 1 indicates the highest degree (positive valence, positive arousal).

The valence / arousal results can be retrieved both averaged over the whole track and segment-wise over time with 15s temporal resolution.
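A common way to interpret a (valence, arousal) pair in the -1 to 1 range is by its quadrant in the classic circumplex model of affect. The quadrant names below are that common interpretation, not labels returned by the API:

```python
# Interpret a (valence, arousal) pair in the -1..1 range as one of the
# four circumplex quadrants. Quadrant names are a common convention,
# not API output.
def va_quadrant(valence: float, arousal: float) -> str:
    if valence >= 0:
        return "happy / excited" if arousal >= 0 else "calm / content"
    return "angry / tense" if arousal >= 0 else "sad / depressed"

print(va_quadrant(0.7, 0.8))    # happy / excited
print(va_quadrant(-0.6, -0.4))  # sad / depressed
```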

Energy Level

The Energy Level describes the intensity of an analysed track and can be low, medium, high, or variable.

A low Energy Level indicates a calm overall appearance of a track, while a high one stands for stronger, more powerful characteristics. A track with a variable energy level shows frequent changes in its intensity profile.

Energy Dynamics

Energy Dynamics describes the progression of the Energy Level throughout the duration of the music piece: low represents a stable intensity trend, while high indicates strong variance between low and high energy levels. A high Energy Dynamics value corresponds to the variable Energy Level described above.
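The relationship between the two energy labels can be pictured with a small sketch over hypothetical per-segment intensities: a large spread between segments suggests high dynamics (and thus a variable level), while a stable trend lets the mean pick low/medium/high. All thresholds here are assumptions for illustration only:

```python
# Illustrative sketch of the Energy Level / Energy Dynamics relationship.
# Thresholds (0.25 spread, 0.33/0.66 level boundaries) are assumptions.
from statistics import mean, pstdev

def energy_labels(intensities: list[float]) -> tuple[str, str]:
    spread = pstdev(intensities)
    dynamics = "high" if spread > 0.25 else "low"
    if dynamics == "high":
        # Strong variance between segments: the level itself is variable.
        return "variable", dynamics
    avg = mean(intensities)
    level = "low" if avg < 0.33 else "medium" if avg < 0.66 else "high"
    return level, dynamics

print(energy_labels([0.8, 0.85, 0.9]))      # ('high', 'low')
print(energy_labels([0.1, 0.9, 0.1, 0.9]))  # ('variable', 'high')
```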

Musical Era

The musical era classifier describes the era the audio was likely produced in, or the era its production style suggests.

Keywords (Experimental)

An experimental set of keywords that can be associated with the audio. The data is experimental and expected to change. Access must be requested from the Cyanite sales team.

Example keywords:

uplifting, edm, friendly, motivating, pleasant, happy, energetic, joy, bliss, gladness, auspicious, pleasure, forceful, determined, confident, positive, optimistic, agile, animated, journey, party, driving, kicking, impelling, upbeat