Spatial Smoothing Regularization for Bi-direction Long Short-term Memory Model [双向长短时记忆模型训练中的空间平滑正则化方法研究]

Li W.; Ge F.; Zhang P.*; Yan Y.

doi:10.11999/JEITdzyxxxb-41-3-544

Abstract

The bidirectional long short-term memory (BLSTM) model has recently been widely used in large-scale acoustic modeling, where it outperforms many other neural networks in both performance and stability. A likely reason is that its more complex structure and computation, built on memory cells and gates, let it take more context and temporal dependence into account during training. However, one of the biggest problems with BLSTM is overfitting. Common remedies include multitask learning and L2 model regularization. This paper proposes a spatial smoothing method for the BLSTM model to relieve overfitting. First, the activations of a hidden layer are reorganized into a 2-D grid; then a filter transform is applied to induce smoothness over the grid; finally, the resulting smoothness term is added to the objective function used to train the BLSTM network. Experimental results show that the proposed spatial smoothing method achieves a 4% relative reduction in word error rate (WER), and that combining it with L2 regularization yields a relative WER reduction of 8.6%.
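
The three steps named in the abstract (reshape to a 2-D grid, filter, add to the objective) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the abstract does not state which filter is used, so a 3x3 discrete Laplacian stands in as a high-pass filter, and the names `spatial_smoothing_penalty`, `regularized_loss`, `grid_shape`, and `smooth_weight` are hypothetical.

```python
import numpy as np

# Assumed high-pass kernel (discrete Laplacian). The abstract does not specify
# the filter, so this choice is purely illustrative.
HIGH_PASS = np.array([[0.0, -1.0, 0.0],
                      [-1.0, 4.0, -1.0],
                      [0.0, -1.0, 0.0]])


def spatial_smoothing_penalty(activations, grid_shape, kernel=HIGH_PASS):
    """Mean squared high-pass response of hidden activations on a 2-D grid.

    activations: array of shape (batch, hidden_size)
    grid_shape:  (rows, cols) with rows * cols == hidden_size
    """
    rows, cols = grid_shape
    if rows * cols != activations.shape[1]:
        raise ValueError("grid_shape must cover exactly hidden_size units")
    # Step 1: reorganize each activation vector into a 2-D grid.
    grid = activations.reshape(-1, rows, cols)
    # Step 2: apply the 3x3 filter; borders are padded by edge replication.
    padded = np.pad(grid, ((0, 0), (1, 1), (1, 1)), mode="edge")
    response = np.zeros_like(grid, dtype=float)
    for di in range(3):
        for dj in range(3):
            response += kernel[di, dj] * padded[:, di:di + rows, dj:dj + cols]
    # A large high-pass response means neighbouring units differ sharply.
    return float(np.mean(response ** 2))


def regularized_loss(task_loss, activations, grid_shape, smooth_weight=0.01):
    # Step 3: add the weighted smoothness term to the training objective.
    penalty = spatial_smoothing_penalty(activations, grid_shape)
    return task_loss + smooth_weight * penalty
```

Under these assumptions, a grid of identical activations incurs zero penalty, while a checkerboard pattern is penalized heavily, so minimizing the combined loss pushes neighbouring units on the grid toward similar activations. In an actual training setup the penalty would be written in an automatic differentiation framework so that its gradient reaches the BLSTM weights.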

Affiliation
University of Chinese Academy of Sciences

Last updated: 2023-06-02 09:06
