Mawf Audio

Affiliation: TikTok/ByteDance
Role: C++ Architect Co-Lead + AI Research Scientist


Mawf is a free plugin that brings real-time neural audio synthesis to your DAW. The plugin can transform audio inputs such as singing and other everyday sounds into the sound of musical instruments. It is the first pro-audio plugin that can synthesize audio from a machine learning system in real time. Moreover, the output audio is rendered at a full-bandwidth 48 kHz sample rate (slightly above the 44.1 kHz of CD audio), compared to the 16 kHz found in other audio ML products.

What sets Mawf apart from “traditional” synthesizers is the neural network at the heart of the synthesis engine. We used machine learning (ML) to analyze recordings of professionals playing musical instruments (e.g. a flute). The ML model extracts and learns the expressive changes in the performance at different pitches and amplitudes (e.g. brighter sounds at higher volume). It then uses this information to re-render other sounds as though they were being played on that instrument. For example, a vocal performance can be re-rendered as an equivalent performance on the flute.


Learn the under-the-hood technical details in our Audio Developer Conference (ADC 2022) talk “Realtime Interactive Synthesis with ML” (due to COVID, I could not give the presentation in person). In short, the jump from an offline, server-based, GPU-based system to a streaming, real-time, on-device, CPU-based one required our engineering team to solve many new problems. For example, we developed methods to ensure that an ML model trained at one fixed audio buffer size on GPU is compatible with variable audio buffer sizes on CPU.
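
To give a feel for the buffer-size problem, here is a minimal sketch of one generic way to bridge the two worlds: collect incoming host audio until a full model-sized frame is available, process it, and hand back exactly as many samples as the host supplied. This is not the method used in Mawf (the talk covers that); the class name `FixedFrameAdapter` and the `process_frame` callback are hypothetical stand-ins, and the sketch is written in Python for brevity, whereas a real plugin would do this in C++ without allocating on the audio thread.

```python
class FixedFrameAdapter:
    """Feeds a fixed-frame-size processor from host buffers of any size.

    Illustrative only: `process_frame` is a hypothetical stand-in for
    model inference that maps one frame of samples to one frame of output.
    """

    def __init__(self, frame_size, process_frame):
        self.frame_size = frame_size
        self.process_frame = process_frame
        self._pending = []  # input samples not yet forming a full frame
        # Prime the output with one frame of silence. This fixed latency
        # guarantees there are always enough samples to return.
        self._ready = [0.0] * frame_size

    def process(self, host_buffer):
        """Accept any number of samples and return the same number."""
        self._pending.extend(host_buffer)
        # Run the fixed-size processor once per complete frame.
        while len(self._pending) >= self.frame_size:
            frame = self._pending[:self.frame_size]
            del self._pending[:self.frame_size]
            self._ready.extend(self.process_frame(frame))
        n = len(host_buffer)
        out = self._ready[:n]
        del self._ready[:n]
        return out
```

The cost of this approach is a constant delay of one frame, since the first frame of output is silence. The host can call `process` with 3 samples, then 5, then 1, and always receives a buffer of matching length.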


I used Mawf as part of my winning entry to the AI Song Contest in 2022. Because Mawf’s underlying ML model analyzes audio signals directly as frequencies (Hz) instead of MIDI notes, I could use a variety of tuning systems with the plugin. This enabled me to craft new soundscapes based on tunings from Thai classical music.
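
The difference between working in Hz and working in MIDI notes can be sketched with a little arithmetic. Standard MIDI note numbers imply twelve equal steps per octave, whereas a frequency in Hz can sit anywhere. The example below assumes a seven-equal-step octave, a common textbook approximation of Thai classical tuning, and a 440 Hz reference; both are assumptions for illustration, not the tunings used in the contest entry, and the function names are hypothetical.

```python
import math


def midi_to_hz(note, a4_hz=440.0):
    """Standard 12-tone equal temperament: MIDI note 69 is A4."""
    return a4_hz * 2.0 ** ((note - 69) / 12)


def equal_division_hz(base_hz, divisions, step):
    """Frequency of `step` in an octave split into `divisions` equal parts."""
    return base_hz * 2.0 ** (step / divisions)


def cents(f_low, f_high):
    """Interval between two frequencies in cents (1200 per octave)."""
    return 1200.0 * math.log2(f_high / f_low)


# One octave of a 7-equal-step scale starting at 440 Hz (assumed reference).
scale = [equal_division_hz(440.0, 7, k) for k in range(8)]
step_cents = cents(scale[0], scale[1])  # about 171.4, versus 100 in 12-tone
```

Each step of the seven-step scale falls between the semitone and the whole tone of the twelve-step grid, so most of its pitches have no exact MIDI note number. A model that tracks pitch in Hz can follow them directly.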


