Just released:
Real-time Speech Enhancement on Raw Signals with Deep State-space Modeling
Yan Ru Pei∗, Ritik Shrivastava†, Sidharth‡
Brainchip Inc., Laguna Hills, USA
[email protected]∗, [email protected]†, [email protected]‡
arXiv:2409.03377v1 [cs.SD] 5 Sep 2024
Abstract—We present aTENNuate, a simple deep state-space autoencoder configured for efficient online raw speech enhancement in an end-to-end fashion. The network’s performance is primarily evaluated on raw speech denoising, with additional assessments on tasks such as super-resolution and de-quantization. We benchmark aTENNuate on the VoiceBank + DEMAND and the Microsoft DNS1 synthetic test sets. The network outperforms previous real-time denoising models in terms of PESQ score, parameter count, MACs, and latency. Even as a raw waveform processing model, the model maintains high fidelity to the clean signal with minimal audible artifacts. In addition, the model remains performant even when the noisy input is compressed down to 4000Hz and 4 bits, suggesting general speech enhancement capabilities in low-resource environments.
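For anyone wondering what "compressed down to 4000Hz and 4 bits" means in practice, here is a rough sketch of that kind of degradation. It is my own illustration and not the authors' pipeline: the 16 kHz starting rate, the block-averaging downsampler, the uniform quantizer, and the function name degrade are all assumptions made for the example.

```python
import numpy as np

def degrade(signal, sample_rate, target_rate=4000, bits=4):
    """Crudely downsample and quantize a waveform with values in [-1, 1].

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if sample_rate % target_rate != 0:
        raise ValueError("this sketch assumes an integer decimation factor")
    factor = sample_rate // target_rate
    # Average each block of `factor` samples: a simple low-pass plus decimation.
    usable = len(signal) - len(signal) % factor
    low_rate = signal[:usable].reshape(-1, factor).mean(axis=1)
    # Uniform quantization to 2**bits levels spanning [-1, 1].
    levels = 2 ** bits
    codes = np.round((np.clip(low_rate, -1.0, 1.0) + 1.0) / 2.0 * (levels - 1))
    quantized = codes / (levels - 1) * 2.0 - 1.0
    return quantized, codes.astype(np.int64)

# One second of a 220 Hz tone at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.8 * np.sin(2 * np.pi * 220.0 * t)

degraded, codes = degrade(clean, sample_rate)
print(len(clean), "->", len(degraded), "samples;",
      len(np.unique(codes)), "distinct levels used")
```

With 4 bits there are only 16 possible amplitude values per sample, and at 4000 Hz nothing above 2000 Hz survives, so this is the sort of heavily degraded input the abstract says the model can still work with.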
VI. CONCLUSION
We introduced a lightweight deep state-space autoencoder, aTENNuate, that can perform raw audio denoising, super-resolution, and de-quantization. Compared to previous works, the key features of this network are:
1) consisting of state-space layers that can be efficiently trained and configured for inference,
2) allowing for real-time inference with low latency,
3) architecturally simple and light in parameters and MACs,
4) capable of processing raw audio waveforms directly without requiring pre/post-processing, and
5) highly competitive with other speech enhancement solutions.
VII. ACKNOWLEDGEMENT
We thank Temi Mohandespour and Keith Johnson for contributing to the early stages of the project. We also thank M. Anthony Lewis, Douglas McLelland, Kristofor Carlson, and Chris Jones for providing useful feedback on the manuscript.
A significant development with widespread applications across many industries.
My opinion only. DYOR.
Fact Finder
aTENNuate for raw speech denoising in real time