ElevenLabs releases Music v2 with section-by-section composition, sound-effects integration, and up to 50% price cuts

May 26, 2026 (published Jun 7, 2026)

ElevenLabs on May 26, 2026, released Music v2, the next version of its AI music generation model. The update adds genre-transition support, embedded sound effects, an improved multilingual engine, and a major pricing reduction that cuts Music v1 and Music v2 costs by up to 50 percent for ElevenAPI developers and up to 40 percent for ElevenCreative self-serve users.

What's new

Music v2 introduces capabilities that were not available in the previous version:

Genre flexibility: A single track can move between radically different styles — from opera to heavy metal — while maintaining musical coherence throughout.
Sound-effects integration: Non-musical sound effects can now be embedded directly within a generated track without breaking its structure.
Dense vocal support: The model handles fast rap and dense lyrical delivery more reliably.
Advanced inpainting: Composers can regenerate specific sections — a bridge, a verse, an outro — without altering the rest of the track.
Improved multilingual support: More reliable lyric and vocal generation across languages.
Section-by-section composition: Full songs can be built incrementally (intro, verse, chorus, outro), giving creators finer structural control.
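
One way to picture how section-by-section composition and inpainting fit together is as an ordered list of sections in which a single entry can be swapped out while the rest stay untouched. The sketch below is a minimal illustration under that assumption only. It is not the ElevenLabs API: the `Section` class, the `add_section` and `regenerate` helpers, and the prompt strings are all hypothetical, and no audio is generated.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Section:
    """Hypothetical record for one part of a song (not an ElevenLabs type)."""
    name: str      # e.g. "intro", "verse", "chorus", "outro"
    prompt: str    # text description that would drive generation
    take: int = 1  # incremented each time the section is regenerated


def add_section(track, name, prompt):
    """Return a new track with one more section appended (incremental build)."""
    return track + (Section(name, prompt),)


def regenerate(track, name, new_prompt):
    """Return a new track in which only the named section is replaced."""
    if not any(s.name == name for s in track):
        raise KeyError(f"no section named {name!r}")
    return tuple(
        replace(s, prompt=new_prompt, take=s.take + 1) if s.name == name else s
        for s in track
    )


# Build the song one section at a time (illustrative prompts).
track = ()
for name, prompt in [
    ("intro", "solo operatic soprano"),
    ("verse", "fast rap over strings"),
    ("chorus", "heavy metal riff with a thunder sound effect"),
    ("outro", "quiet piano"),
]:
    track = add_section(track, name, prompt)

# Inpainting-style edit: redo the verse, leave every other section as it was.
edited = regenerate(track, "verse", "slower rap, sparse drums")
```

Keeping the track immutable means the original version survives the edit, which mirrors the non-destructive behaviour the release describes.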

Pricing was cut alongside the model launch. Music v1 and v2 are now up to 50 percent cheaper on ElevenAPI and up to 40 percent cheaper on ElevenCreative for self-serve customers. The model is trained exclusively on licensed data and cleared for commercial use with no sync fees or clearance restrictions. Music v2 is available now on ElevenMusic and ElevenCreative; ElevenAPI access is coming soon.

Context

ElevenLabs built its reputation on voice synthesis and text-to-speech but has expanded steadily into music and dubbing through its ElevenMusic, ElevenCreative, and ElevenProductions platforms. The company crossed $500M in annualized recurring revenue in May 2026 and has drawn investment from BlackRock, NVIDIA, and others.

AI music generation has grown into a competitive space that includes Suno, Udio, and Google's MusicFX. ElevenLabs has differentiated itself by emphasizing licensed training data and commercial clearance, a meaningful distinction as copyright questions have clouded competitors.

Music v2 arrived within days of ElevenLabs' May 28 release of Dubbing v2, which preserves speaker emotion across 90-plus languages. Together, the two releases suggest a broader push to stake out leadership across audio AI product categories.

Why it matters

The ElevenAPI price cut of up to 50 percent lowers the barrier for developers building audio applications on top of ElevenLabs infrastructure. For indie developers and small studios, the cost reduction alone makes production-quality AI music generation materially more accessible.
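
To make the arithmetic concrete, here is a minimal sketch of what the maximum cuts would mean for a given budget. The baseline figure is a hypothetical placeholder, not an ElevenLabs price, and because the announcement says "up to" 50 and 40 percent, the results are best-case costs rather than guaranteed ones.

```python
def discounted_cost(base_cost: float, cut_percent: float) -> float:
    """Cost after applying a percentage price cut."""
    if not 0 <= cut_percent <= 100:
        raise ValueError("cut_percent must be between 0 and 100")
    return base_cost * (1 - cut_percent / 100)


# Hypothetical monthly spend before the cut (assumption, for illustration only).
BASE_MONTHLY_COST = 200.0

# Maximum cuts stated in the announcement.
api_best_case = discounted_cost(BASE_MONTHLY_COST, 50)       # ElevenAPI
creative_best_case = discounted_cost(BASE_MONTHLY_COST, 40)  # ElevenCreative self-serve

print(f"ElevenAPI best case: {api_best_case:.2f}")
print(f"ElevenCreative best case: {creative_best_case:.2f}")
```

At the full 50 percent cut the same budget buys twice as much generation; at 40 percent it buys about 1.67 times as much.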

The technical advances around genre transitions and embedded sound effects move AI music generation closer to genuine compositional control. Rather than generating a single style-consistent audio clip, composers can now direct the arc of a piece — shifting genres mid-track, embedding ambient sound effects, and editing sections non-destructively. That kind of granular control matters for narrative audio experiences, game soundtracks, and branded content workflows where uniformity is a limitation rather than a feature.

The commercial-use clearance removes a persistent obstacle for professional use cases. Brands and studios have been cautious about AI music tools whose training data provenance is unclear. A model trained exclusively on licensed data with no sync fees or clearance restrictions sidesteps the legal uncertainty that has slowed enterprise adoption elsewhere.

Corroborating sources

ElevenLabs
https://elevenlabs.io/blog/introducing-music-v2
“A single song can move from opera to heavy metal and back, sustain fast rap and dense lyrical delivery, and embed non-musical sound effects directly within the track, all without breaking musical coherence.”
