Publications
(* denotes equal contribution)
Preprints
Seamless: Multilingual Expressive and Streaming Speech Translation
Seamless Communication
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
Seamless Communication
2024
Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation
Min-Jae Hwang, Ilia Kulikov, Benjamin Peloquin, Hongyu Gong, Peng-Jen Chen, and Ann Lee
Accepted to Findings of ACL 2024
2022
HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis
Sang-Hoon Lee, Seung-Bin Kim, Ji-Hyun Lee, Eunwoo Song, Min-Jae Hwang, Seong-Whan Lee
Published in NeurIPS 2022
Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems
Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim, Min-Jae Hwang
Published in Interspeech 2022
TTS-by-TTS 2: Data-selective Augmentation for Neural Speech Synthesis using Ranking Support Vector Machine with Variational Autoencoder
Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim
Published in Interspeech 2022
Linear Prediction-based Parallel WaveGAN Speech Synthesis
Min-Jae Hwang, Hyun-Wook Yoon, Chan-Ho Song, Jin-Seob Kim, Jae-Min Kim, Eunwoo Song
Published in ICEIC 2022
Effective Data Augmentation Methods for Neural Text-to-Speech Systems
Suhyeon Oh, Ohsung Kwon, Min-Jae Hwang, Jae-Min Kim, Eunwoo Song
Published in ICEIC 2022
2021
High-Fidelity Parallel WaveGAN with Multi-Band Harmonic-Plus-Noise Model
Min-Jae Hwang*, Ryuichi Yamamoto*, Eunwoo Song, Jae-Min Kim
Published in Interspeech 2021 [video]
LiteTTS: A Lightweight Mel-Spectrogram-Free Text-to-Wave Synthesizer Based on Generative Adversarial Networks
Huu-Kim Nguyen, Kihyuk Jeong, Seyun Um, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang
Published in Interspeech 2021
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-quality Speech Synthesis
Min-Jae Hwang, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
Published in ICASSP 2021
Parallel Waveform Synthesis based on Generative Adversarial Networks with Voicing-aware Conditional Discriminators
Ryuichi Yamamoto, Eunwoo Song, Min-Jae Hwang, Jae-Min Kim
Published in ICASSP 2021
Improved Parallel WaveGAN Vocoder with Perceptually Weighted Spectrogram Loss
Eunwoo Song, Ryuichi Yamamoto, Min-Jae Hwang, Jin-Seob Kim, Ohsung Kwon, Jae-Min Kim
Published in IEEE SLT Workshop 2021
2020
ExcitGlow: Improving a WaveGlow-based Neural Vocoder with Linear Prediction Analysis
Suhyeon Oh, Hyungseob Lim, Kyungguen Byun, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang
Published in APSIPA ASC 2020
LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis
Min-Jae Hwang, Frank Soong, Eunwoo Song, Xi Wang, Hyeonjoo Kang, Hong-Goo Kang
Published in APSIPA ASC 2020
Neural Text-to-Speech with a Modeling-by-Generation Excitation Vocoder
Eunwoo Song, Min-Jae Hwang, Ryuichi Yamamoto, Jin-Seob Kim, Ohsung Kwon, Jae-Min Kim
Published in Interspeech 2020
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network
Min-Jae Hwang, Eunwoo Song, Ryuichi Yamamoto, Frank Soong, Hong-Goo Kang
Published in ICASSP 2020
2019
Parameter Enhancement for MELP Speech Codec in Noisy Communication Environment
Min-Jae Hwang, Hong-Goo Kang
Published in Interspeech 2019
2018
A Unified Framework for the Generation of Glottal Signals in Deep Learning-based Parametric Speech Synthesis Systems
Min-Jae Hwang, Eunwoo Song, Jin-Seob Kim, Hong-Goo Kang
Published in Interspeech 2018
Modeling-by-Generation-structured Noise Compensation Algorithm for Glottal Vocoding Speech Synthesis System
Min-Jae Hwang, Eunwoo Song, Kyungguen Byun, Hong-Goo Kang
Published in ICASSP 2018
2017
SVD-based Adaptive QIM Watermarking on Stereo Audio Signals
Min-Jae Hwang, JeeSok Lee, MiSuk Lee, Hong-Goo Kang
Published in IEEE Transactions on Multimedia (impact factor 3.977 as of 2017)