Hieu-Thi Luong

Email: contact (at) hieuthi.com

CV: pdf (Last updated: 2023-02-01)

I received my Ph.D. degree in Multidisciplinary Science from SOKENDAI, Japan, in 2020 and am currently a Research Fellow at Nanyang Technological University, Singapore. My work focuses on researching and developing novel solutions for speech and language processing systems, including automatic speech recognition, speech synthesis, and fake speech detection. More broadly, I'm interested in speech processing, machine learning, and natural language processing.

I also do programming and drawing as hobbies. More below.

For inquiries about research, technology, education, or something else, you can contact me via the email listed above.

Blog posts [read more]

Articles written in English

⅓ espresso [read more]

Fleeting notes written in Vietnamese

Selected publications

Controlling Multi-Class Human Vocalization Generation via a Simple Segment-based Labeling Scheme

Hieu-Thi Luong, Junichi Yamagishi

Interspeech 2023

Samples Paper



LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example

Hieu-Thi Luong, Junichi Yamagishi

arXiv manuscript

Samples Code Preprint

Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance

Hieu-Thi Luong, Junichi Yamagishi

Speech Synthesis Workshop 2021 (SSW11)

Samples Slide Preprint


Latent linguistic embedding for cross-lingual text-to-speech and voice conversion

Hieu-Thi Luong, Junichi Yamagishi

VCC2020 Workshop

Samples Preprint

Deep learning based voice cloning framework for a unified system of text-to-speech and voice conversion (Ph.D. thesis)

Hieu-Thi Luong

Ph.D. thesis, 2020

Preprint Thesis

NAUTILUS: a Versatile Voice Cloning System

Hieu-Thi Luong, Junichi Yamagishi

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Samples Paper


Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech

Hieu-Thi Luong, Junichi Yamagishi

ASRU 2019

Samples Poster Preprint

A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation

Hieu-Thi Luong, Junichi Yamagishi

arXiv manuscript

Samples Preprint

Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora

Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa

Interspeech 2019

Samples Preprint


Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems

Hieu-Thi Luong, Junichi Yamagishi

SLT 2018

Samples Poster Preprint

Multimodal Speech Synthesis Architecture for Unsupervised Speaker Adaptation

Hieu-Thi Luong, Junichi Yamagishi

Interspeech 2018

Samples Poster Preprint

Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects

Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa

Interspeech 2018



Adapting and Controlling DNN-based Speech Synthesis using Input Codes

Hieu-Thi Luong, Shinji Takaki, Gustav Eje Henter, Junichi Yamagishi


Samples Preprint


A non-expert Kaldi recipe for Vietnamese Speech Recognition System

Hieu-Thi Luong, Hai-Quan Vu


Corpus Paper




Learning device, learning method, voice synthesis device, voice synthesis method and program

Inventors: Hieu-Thi Luong, Junichi Yamagishi

P7109071 · Issued Jul 29, 2022



I draw and sketch in my free time. Find me on Instagram.

Side projects


Pixels Touch

Android drawing application optimized for pixel art and touch screen devices




A utility that extracts vocabulary from text for learning English and presents it in a printer-friendly format.


ABC Notation Editor

An online editor for the ABC notation format for editing, playing, and printing sheet music.