Andrew Gibiansky's Blog

Recent posts

Streaming Audio Synthesis

The naive approach to streaming audio synthesis using deep neural networks is to break up the input into chunks and then run synthesis on each chunk. Unfortunately, this introduces wasted computation and discontinuities. In this blog post, I present a simple and robust alternative.

Full Story

Brainstorming: Neural Transducers for Speech Synthesis

Neural transducers are commonly used for automatic speech recognition (ASR), often achieving state-of-the-art results for quality and inference speed; for instance, they power Google's offline ASR engine. In this post, I'd like to

Full Story

PQMF: Sub-band Coding for Neural Vocoders (Part 2)

This is a continuation of Part 1 of this two-part series. In this post, I'll try to go over the implementation of PQMF filters in sufficient detail such that you'll be able to

Full Story

PQMF: Sub-band Coding for Neural Vocoders (Part 1)

In the past year or so, there have been several papers that investigate using sub-band coding with neural vocoders to model audio and accelerate inference: FFTNet with sub-band coding, WaveNet with sub-band coding, DurIAN TTS System

Full Story

Facebook's Knowledge-Assisted NLP

A deep dive into several Facebook publications about knowledge-augmented language tasks, such as question answering and entity linking.

Full Story

DiffWave and WaveGrad: Theory (Part 2)

In this post, I'll derive the equations for DiffWave and WaveGrad using diffusion probabilistic processes.

Full Story