Min-Jae Hwang
I am a research scientist at Meta AI. My area of expertise lies in high-quality and expressive speech generation, which has applications in Text-to-Speech (TTS) and Speech-to-Speech Translation (S2ST). In this field, I’ve contributed to various projects like Clova Dubbing and Seamless Communication.
Prior to Meta AI, I was a research scientist at Naver Corporation. I received my Ph.D. degree in Electrical and Electronic Engineering from Yonsei University. During my Ph.D. studies, I was fortunate to have research internships at Microsoft Research Asia and Naver Corporation.
I’m always open to learning new things and enjoy applying them to solve real-world problems. If you are interested in my work, feel free to contact me.
Download my CV
NEWS!
5/2024 : One paper has been accepted to Findings of ACL 2024.
5/2024 : I joined Meta AI in Seattle, USA as a Research Scientist!
2/2024 : I gave a guest lecture at the BishBash 2024 event on the topic of expressive S2ST.
11/2023 : We launched Seamless, a new family of AI translation models that preserve expression and deliver near-real-time streaming translations.
11/2023 : SeamlessM4T was recognized by TIME magazine among the best inventions of 2023!
8/2023 : We launched SeamlessM4T, a foundational multilingual and multitask model that seamlessly translates and transcribes across speech and text.
Research Interests
- Speech-to-speech translation (S2ST)
  - Expressive S2ST systems
- Text-to-speech (TTS) synthesis
  - High-quality and real-time waveform generation methods
  - Expressive and emotional TTS systems
Recent Publications
Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation
Min-Jae Hwang, Ilia Kulikov, Benjamin Peloquin, Hongyu Gong, Peng-Jen Chen, and Ann Lee
Accepted to Findings of ACL 2024

Seamless: Multilingual Expressive and Streaming Speech Translation
Seamless Communication

SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
Seamless Communication

HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis
Sang-Hoon Lee, Seung-Bin Kim, Ji-Hyun Lee, Eunwoo Song, Min-Jae Hwang, Seong-Whan Lee
Published in NeurIPS 2022

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems
Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim, Min-Jae Hwang
Published in Interspeech 2022