Publications
(* denotes equal contribution)
Preprints
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference
Yejin Lee, Anna Sun, Basil Hosmer, Bilge Acun, Can Balioglu, Changhan Wang, Charles David Hernandez, Christian Puhrsch, Daniel Haziza, Driss Guessous, Francisco Massa, Jacob Kahn, Jeffrey Wan, Jeremy Reizenstein, Jiaqi Zhai, Joe Isaacson, Joel Schlosser, Juan Pino, Kaushik Ram Sadagopan, Leonid Shamis, Linjian Ma, Min-Jae Hwang, Mingda Chen, Mostafa Elhoushi, Pedro Rodriguez, Ram Pasunuru, Scott Yih, Sravya Popuri, Xing Liu, Carole-Jean Wu
Submitted to HPCA 2025 Industry Track
Seamless: Multilingual Expressive and Streaming Speech Translation
Seamless Communication Team
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
Seamless Communication Team
2025
Joint speech and text machine translation for up to 100 languages
Seamless Communication Team
Published in Nature
2024
Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation
Min-Jae Hwang, Ilia Kulikov, Benjamin Peloquin, Hongyu Gong, Peng-Jen Chen, and Ann Lee
Accepted to Findings of ACL 2024
2022
HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis
Sang-Hoon Lee, Seung-Bin Kim, Ji-Hyun Lee, Eunwoo Song, Min-Jae Hwang, Seong-Whan Lee
Published in NeurIPS 2022
Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems
Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim, Min-Jae Hwang
Published in Interspeech 2022
TTS-by-TTS 2: Data-selective Augmentation for Neural Speech Synthesis using Ranking Support Vector Machine with Variational Autoencoder
Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim
Published in Interspeech 2022
Linear Prediction-based Parallel WaveGAN Speech Synthesis
Min-Jae Hwang, Hyun-Wook Yoon, Chan-Ho Song, Jin-Seob Kim, Jae-Min Kim, Eunwoo Song
Published in ICEIC 2022
Effective Data Augmentation Methods for Neural Text-to-Speech Systems
Suhyeon Oh, Ohsung Kwon, Min-Jae Hwang, Jae-Min Kim, Eunwoo Song
Published in ICEIC 2022
2021
High-Fidelity Parallel WaveGAN with Multi-Band Harmonic-Plus-Noise Model
Min-Jae Hwang*, Ryuichi Yamamoto*, Eunwoo Song, Jae-Min Kim
Published in Interspeech 2021 [video]
LiteTTS: A Lightweight Mel-Spectrogram-Free Text-to-Wave Synthesizer Based on Generative Adversarial Networks
Huu-Kim Nguyen, Kihyuk Jeong, Seyun Um, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang
Published in Interspeech 2021
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-quality Speech Synthesis
Min-Jae Hwang, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
Published in ICASSP 2021
Parallel Waveform Synthesis based on Generative Adversarial Networks with Voicing-aware Conditional Discriminators
Ryuichi Yamamoto, Eunwoo Song, Min-Jae Hwang, Jae-Min Kim
Published in ICASSP 2021
Improved Parallel WaveGAN Vocoder with Perceptually Weighted Spectrogram Loss
Eunwoo Song, Ryuichi Yamamoto, Min-Jae Hwang, Jin-Seob Kim, Ohsung Kwon, Jae-Min Kim
Published in IEEE SLT Workshop 2021
2020
ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis
Suhyeon Oh, Hyungseob Lim, Kyungguen Byun, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang
Published in APSIPA ASC 2020
LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis
Min-Jae Hwang, Frank Soong, Eunwoo Song, Xi Wang, Hyeonjoo Kang, Hong-Goo Kang
Published in APSIPA ASC 2020
Neural Text-to-Speech with a Modeling-by-Generation Excitation Vocoder
Eunwoo Song, Min-Jae Hwang, Ryuichi Yamamoto, Jin-Seob Kim, Ohsung Kwon, Jae-Min Kim
Published in Interspeech 2020
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network
Min-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto, Frank Soong, Hong-Goo Kang
Published in ICASSP 2020
2019
Parameter Enhancement for MELP Speech Codec in Noisy Communication Environment
Min-Jae Hwang, Hong-Goo Kang
Published in Interspeech 2019
2018
A Unified Framework for the Generation of Glottal Signals in Deep Learning-based Parametric Speech Synthesis Systems
Min-Jae Hwang, Eunwoo Song, Jin-Seob Kim, Hong-Goo Kang
Published in Interspeech 2018
Modeling-by-Generation-structured Noise Compensation Algorithm for Glottal Vocoding Speech Synthesis System
Min-Jae Hwang, Eunwoo Song, Kyungguen Byun, Hong-Goo Kang
Published in ICASSP 2018
2017
SVD-based Adaptive QIM Watermarking on Stereo Audio Signals
Min-Jae Hwang, JeeSok Lee, MiSuk Lee, Hong-Goo Kang
Published in IEEE Transactions on Multimedia (impact factor 3.977 as of 2017)