Abstract
Identifying cell phones from recorded speech has become an active topic in multimedia forensics in recent years. However, most existing studies focus on clean speech or on speech with artificially added noise. This paper considers speech with background noise and presents a source cell-phone identification method based on low-dimensional deep features. First, logarithmic Mel-filter bank coefficients are extracted as the main acoustic features and fed to a temporal convolutional network, which is trained to extract deep features of the recording device. Then, linear discriminant analysis is used to reduce the dimensionality of the high-dimensional deep features and remove redundancy. Finally, the low-dimensional deep features are used as input to a support vector machine classifier. Experimental results on 47 mobile phone models and 37,600 speech samples with background noise show that the proposed method achieves better recognition performance and adapts better to different brands, different models of the same brand, different sampling lengths, different dataset sizes, and different sampling rates. © 2021, Chinese Institute of Electronics. All rights reserved.
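The first stage of the pipeline, logarithmic Mel-filter bank extraction, can be illustrated with a minimal sketch for a single speech frame. The abstract does not give the frame length, window, number of filters, or Mel-scale formula, so the values below (256-sample frame, Hamming window, 20 triangular filters, the common 2595·log10(1 + f/700) mapping) are assumptions chosen only for illustration, and all function names are hypothetical. The temporal convolutional network, linear discriminant analysis, and support vector machine stages are not shown.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common Mel-scale mapping (an assumption; the abstract does not name one).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters equally spaced on the Mel scale, over n_fft // 2 + 1 bins."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points: left edge, filter centres, right edge.
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (fine for short frames)."""
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def log_mel_frame(frame, sample_rate, n_filters=20, eps=1e-10):
    """Log Mel-filter bank coefficients for one frame; eps avoids log(0)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    return [math.log(sum(w * p for w, p in zip(row, spectrum)) + eps) for row in bank]


if __name__ == "__main__":
    sr, n = 8000, 256
    # Synthetic 1 kHz tone standing in for a speech frame.
    tone = [math.sin(2.0 * math.pi * 1000.0 * i / sr) for i in range(n)]
    feats = log_mel_frame(tone, sr)
    print(len(feats), [round(v, 2) for v in feats])
```

In a full system, such coefficients would be computed for every frame of an utterance and stacked into a time-by-filter matrix before being passed to the network; a signal-processing library would normally replace the naive DFT used here.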
- Affiliation