End-to-end speech enhancement based on ultra-lightweight channel attention [基于超轻量通道注意力的端对端语音增强方法]

Hong Y.; Sun C.*; Leng Y.

doi:10.11959/j.issn.2096-6652.202136

Abstract

The fully convolutional time-domain audio separation network (Conv-TasNet) is a recently proposed, state-of-the-art end-to-end speech separation model. Conv-TasNet uses dilated convolution to expand the receptive field and fuse more speech features spatially, which greatly improves the network's separation performance, but it ignores the importance of information across different convolution channels. This paper proposes an end-to-end speech enhancement method based on ultra-lightweight channel attention, which combines Conv-TasNet with channel attention. In addition, a group of filters is added to the Conv-TasNet codec to improve the network's ability to extract speech features. The method lets the convolutional neural network combine spatial and channel information more effectively and thereby improves speech enhancement. Experiments show that the proposed model improves speech enhancement performance while increasing model capacity by only about 0.02%.
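The abstract does not say what form the ultra-lightweight channel attention takes. As a rough illustration only, the sketch below assumes an ECA-style design: each channel is averaged over time, a small 1-D convolution mixes neighbouring channel descriptors, and a sigmoid turns the result into per-channel gates. The function name `channel_attention`, the kernel weights, and the toy input are all hypothetical and are not taken from the paper.

```python
import math


def channel_attention(features, kernel_weights):
    """Gate each channel by a weight computed from its neighbours.

    features: list of C channels, each a list of T samples.
    kernel_weights: odd-length list of k weights (the only parameters).
    Returns (scaled_features, gates).
    """
    k = len(kernel_weights)
    if k % 2 != 1:
        raise ValueError("kernel size must be odd")
    pad = k // 2

    # Squeeze: one descriptor per channel (global average over time).
    means = [sum(ch) / len(ch) for ch in features]

    # Mix neighbouring channel descriptors with a zero-padded 1-D convolution.
    padded = [0.0] * pad + means + [0.0] * pad
    gates = []
    for c in range(len(means)):
        z = sum(w * padded[c + j] for j, w in enumerate(kernel_weights))
        gates.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid gate in (0, 1)

    # Excite: rescale every sample of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Toy input: 3 channels, 2 time steps; identity kernel for easy checking.
features = [[1.0, 3.0], [0.0, 0.0], [-2.0, -2.0]]
scaled, gates = channel_attention(features, [0.0, 1.0, 0.0])
```

Under this assumption the attention adds only `k` weights per use, regardless of the number of channels, which is in the spirit of the small capacity increase the abstract reports.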

Affiliation
Nanchang Hangkong University


