InfoMasker: Preventing Eavesdropping Using Phoneme-Based Noise

InfoMasker is an anti-eavesdropping system that jams nearby microphones with ultrasonic noise that humans cannot hear. The noise that InfoMasker induces in recordings effectively misleads state-of-the-art (SOTA) automatic speech recognition (ASR) systems. The noise is also robust against various speech enhancement methods, such as noise reduction and speech separation.

Audio Examples

Tips:
★ We recommend not listening to the original audio samples and not reading the transcript before listening to the noisy audio samples. Prior knowledge of the content weakens the jamming effect of the noise on human listeners.

★ We recommend using headphones instead of external speakers to listen to these audio examples.

★ The denoised noisy audio samples demonstrate how robust each noise is against SOTA speech enhancement methods.

Digital Domain Example 1

Noise Type	Noisy Audio (SNR=0)	Denoised Noisy Audio
Our Noise
White Noise
1 Series Speech Noise
2 Series Speech Noise
3 Series Speech Noise

Original Audio
(2300-131720-0010 in LibriSpeech test-clean)

The following figure shows the spectrograms of the original audio, the denoised noisy audio jammed by our noise, and the denoised noisy audio jammed by the 3-series speech noise. Compared with the speech noise, more of our noise survives denoising, which indicates that our noise is more robust.

[Figure: spectrograms. Panels: Our Noise After Denoising; Original Audio; 3 Series Speech Noise After Denoising]

Digital Domain Example 2

Noise Type	Noisy Audio (SNR=0)	Denoised Noisy Audio
Our Noise
White Noise
1 Series Speech Noise
2 Series Speech Noise
3 Series Speech Noise

Original Audio
(4992-41806-0013 in LibriSpeech test-clean)

Digital Domain Example 3

To better illustrate the effectiveness and robustness of our noise, we compare it with speech noise at different SNRs. The following samples are denoised noisy audio jammed by different types of noise. Our noise performs much better than the others at lower SNRs.
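For readers who want to reproduce a comparison at a chosen SNR, the following is a minimal sketch of how a noise signal can be scaled and added to speech so that the mixture has a target SNR. It is not the InfoMasker implementation. It assumes the SNR values on this page are in dB and are defined as the ratio of mean speech power to mean noise power; the function names `mix_at_snr` and `measured_snr_db` are hypothetical, and a sine tone stands in for a real speech recording.

```python
import math
import random


def power(signal):
    """Mean power (average squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db` (in dB),
    then add it to `speech` sample by sample. Hypothetical helper."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise must not be silent")
    # Target noise power is p_speech / 10^(snr_db / 10); gain is an amplitude factor.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB of `noisy`, given the clean `speech` it was built from."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    sample_rate = 16000  # assumed sampling rate for the stand-in signal
    # One second of a 440 Hz tone stands in for a speech recording.
    speech = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
    # Gaussian white noise stands in for the jamming noise.
    noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
    for snr_db in (0, -5, -10):
        noisy = mix_at_snr(speech, noise, snr_db)
        print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

Under this definition, SNR=0 means the noise and the speech have equal mean power, and each 10 dB decrease multiplies the noise power by ten relative to the speech.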