Result Details
Residual Memory Networks: Feed-forward approach to learn long-term temporal dependencies
Karafiát Martin, Ing., Ph.D., DCGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Veselý Karel, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Grézl František, Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Automatic speech recognition, LSTM, RNN, Residual memory networks.
Training deep recurrent neural network (RNN) architectures is complicated due to the increased network complexity. This disrupts the learning of higher-order abstractions using deep RNNs. In the case of feed-forward networks, training deep structures is simpler and faster, while learning long-term temporal information is not possible. In this paper we propose a residual memory neural network (RMN) architecture to model short-time dependencies using deep feed-forward layers having residual and time-delayed connections. The residual connection paves the way to construct deeper networks by enabling unhindered flow of gradients, and the time-delay units capture temporal information with shared weights. The number of layers in RMN signifies both the hierarchical processing depth and temporal depth. The computational complexity in training RMN is significantly less when compared to deep recurrent networks. RMN is further extended as bi-directional RMN (BRMN) to capture both past and future information. Experimental analysis is done on the AMI corpus to substantiate the capability of RMN in learning long-term information and hierarchical information. Recognition performance of RMN trained with 300 hours of the Switchboard corpus is compared with various state-of-the-art LVCSR systems. The results indicate that RMN and BRMN gain 6 % and 3.8 % relative improvement over LSTM and BLSTM networks, respectively.
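The idea that layer count sets both hierarchical and temporal depth can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the layer equations, so the form below (a ReLU over the current frame plus a one-frame-delayed frame, with one delay matrix `M` shared by all layers and an identity skip connection) is an assumption, and the names `rmn_layer` and `rmn_forward` are hypothetical.

```python
# Minimal pure-Python sketch of a residual memory style stack.
# ASSUMPTIONS (not from the paper): one-frame delay per layer, ReLU
# activation, identity skip connection, delay weights M shared by all layers.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def rmn_layer(frames, W, M, delay=1):
    """Apply one hypothetical memory layer to a sequence of frames."""
    zero = [0.0] * len(frames[0])
    out = []
    for t, x in enumerate(frames):
        # Time-delayed connection: the layer input `delay` frames back,
        # or zeros before the sequence starts.
        past = frames[t - delay] if t - delay >= 0 else zero
        hidden = relu(add(matvec(W, x), matvec(M, past)))
        # Residual connection: add the layer input back to its output.
        out.append(add(x, hidden))
    return out

def rmn_forward(frames, layer_weights, M):
    """Stack layers; each added layer reaches one more frame into the past."""
    for W in layer_weights:
        frames = rmn_layer(frames, W, M)
    return frames

# Usage: two layers over six 2-dimensional frames.
half_identity = [[0.5, 0.0], [0.0, 0.5]]
sequence = [[1.0, 1.0] for _ in range(6)]
outputs = rmn_forward(sequence, [half_identity, half_identity], half_identity)
```

Under these assumptions, a stack of L layers lets frame t depend on inputs back to frame t - L, which is one way to read the abstract's statement that depth also determines temporal context. A bi-directional variant would add a second delayed connection looking forward in time.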
@inproceedings{BUT144448,
  author="Murali Karthick {Baskar} and Martin {Karafiát} and Lukáš {Burget} and Karel {Veselý} and František {Grézl} and Jan {Černocký}",
  title="Residual Memory Networks: Feed-forward approach to learn long-term temporal dependencies",
  booktitle="Proceedings of ICASSP 2017",
  year="2017",
  pages="4810--4814",
  publisher="IEEE Signal Processing Society",
  address="New Orleans",
  doi="10.1109/ICASSP.2017.7953070",
  isbn="978-1-5090-4117-6",
  url="https://www.fit.vut.cz/research/publication/11467/"
}
IT4Innovations excellence in science, MŠMT, National Sustainability Programme II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed