SEED-MUSIC

Affiliation: TikTok/ByteDance

Role: Senior AI Research Scientist. Core architect of the Audio Tokenizer/Vector Quantization algorithm.

Links:
SEED-MUSIC Homepage 
Technical Paper


SEED-MUSIC is a suite of music generation and editing systems designed to produce high-quality music with fine-grained style control. It is a unified framework that leverages both auto-regressive language modeling and diffusion approaches to support two key music creation workflows: controlled music generation and post-production editing.
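
One way to picture the generation half of this framework: an autoregressive language model emits intermediate music tokens, and a diffusion model renders those tokens into audio. The sketch below is purely illustrative; every name in it (lm_generate_tokens, diffusion_render), the 25-tokens-per-second rate, and the 8192-entry codebook are hypothetical stand-ins, not part of any released SEED-MUSIC API.

```python
# Illustrative two-stage pipeline: an autoregressive LM proposes intermediate
# music tokens, then a diffusion model renders them to a waveform.
# All names and constants here are hypothetical, not the SEED-MUSIC API.
import numpy as np


def lm_generate_tokens(prompt_tokens: list[int], n_tokens: int) -> list[int]:
    """Stand-in for the autoregressive token LM (random ids for illustration)."""
    rng = np.random.default_rng(seed=0)
    return prompt_tokens + rng.integers(0, 8192, size=n_tokens).tolist()


def diffusion_render(tokens: list[int], sample_rate: int = 44100,
                     tokens_per_second: int = 25) -> np.ndarray:
    """Stand-in for the diffusion renderer (returns silence of matching length)."""
    n_samples = int(len(tokens) / tokens_per_second * sample_rate)
    return np.zeros(n_samples, dtype=np.float32)


def generate_music(style_prompt: list[int], seconds: int = 30) -> np.ndarray:
    """Generate roughly `seconds` of audio conditioned on a token prompt."""
    tokens = lm_generate_tokens(style_prompt, n_tokens=25 * seconds)
    return diffusion_render(tokens)
```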

SEED-MUSIC supports a wide range of music generation tasks, and readers are encouraged to listen to the demos on the official project page. At a high level, the system supports:

Shortform audio generation: Generate 30-60 second tracks with expressive vocals in multiple languages and across a range of styles. Generation follows the provided lyrics exactly.

Longform audio generation: Generate full-length tracks that maintain melodic, rhythmic, and genre coherence across 3-5 minutes.

Audio Prompting: Users can provide a reference audio track to control the style of the generated audio, or to directly “continue” the reference track.

Instrumental Music Generation: Instrumental generation is handled as a sub-task of vocal music generation, with the vocal and lyric components simply omitted.

Leadsheet Token Generation: Leadsheet tokens are musically interpretable units that encode note pitch, note duration, lyric phonemes, and instrumentation. Like MIDI, they can be interpreted and directly edited by a musician, and they have been optimized for use in LLM and diffusion audio workflows. Instead of generating directly from language descriptions, SEED-MUSIC can generate music defined explicitly by leadsheet tokens (a sketch of what such a token might contain appears after this list).

Non-destructive Audio Post-Processing and Editing: A full mix can be post-processed in two ways: changing the lyrics without altering the melody, and changing the melody without altering the lyrics. All other elements of the mix are left untouched (see the token-level editing sketch after this list).
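
To make the leadsheet-token idea concrete, here is a sketch of the kind of record such a token might bundle. This is a hypothetical encoding for illustration only; the actual SEED-MUSIC token format is not public.

```python
# Hypothetical sketch of the attributes a leadsheet token carries
# (pitch, duration, lyric phoneme, instrumentation). Not the real format.
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeadsheetToken:
    pitch: int              # MIDI note number, e.g. 60 = middle C
    duration: float         # note length in beats
    phoneme: Optional[str]  # lyric phoneme sung on this note; None if instrumental
    instrument: str         # e.g. "vocal", "piano"


# A two-note vocal fragment carrying the lyric "hello":
melody = [
    LeadsheetToken(pitch=64, duration=1.0, phoneme="HH EH", instrument="vocal"),
    LeadsheetToken(pitch=62, duration=1.0, phoneme="L OW", instrument="vocal"),
]
```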
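
The two non-destructive edit modes can likewise be pictured at the token level, continuing the LeadsheetToken sketch just above: a lyric edit swaps phonemes while preserving each note's pitch and duration, and a melody edit swaps pitches while preserving the phonemes. The function names are again hypothetical; the real system applies these edits to the rendered mix, not to raw token lists.

```python
# Hypothetical token-level view of the two edit modes; builds on the
# LeadsheetToken sketch above. Function names are illustrative only.
from dataclasses import replace


def edit_lyrics(notes: list[LeadsheetToken],
                new_phonemes: list[str]) -> list[LeadsheetToken]:
    """Change what is sung; every note's pitch and duration stay fixed."""
    return [replace(tok, phoneme=p) for tok, p in zip(notes, new_phonemes)]


def edit_melody(notes: list[LeadsheetToken],
                new_pitches: list[int]) -> list[LeadsheetToken]:
    """Change the tune; the lyric phonemes and durations stay fixed."""
    return [replace(tok, pitch=p) for tok, p in zip(notes, new_pitches)]


relyriced = edit_lyrics(melody, ["G UH D", "B AY"])  # same tune, new words
remelodied = edit_melody(melody, [67, 65])           # same words, new tune
```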