[1909.03011] RNN Architecture Learning with Sparse Regularization

Neural models for NLP typically use large numbers of parameters to reach state-of-the-art performance, which can lead to excessive memory usage and increased runtime. We present a structure learning method for learning sparse, parameter-efficient NLP models. Our method applies group lasso to rational RNNs (Peng et al., 2018), a family of models that is closely connected to weighted finite-state automata (WFSAs). We take advantage of rational RNNs' natural grouping of the weights, so the group lasso penalty directly removes WFSA states, substantially reducing the number of parameters in the model. Our experiments on a number of sentiment analysis datasets, using both GloVe and BERT embeddings, show that our approach learns neural structures which have fewer parameters without sacrificing performance relative to parameter-rich baselines. Our method also highlights the interpretable properties of rational RNNs. We show that sparsifying such models makes them easier to visualize, and we pr
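As a rough illustration of the idea in the abstract, the sketch below computes a group lasso penalty over per-state weight groups and applies group soft-thresholding, which sets whole groups exactly to zero. It is a minimal sketch, not the authors' implementation: the toy weights, the threshold value, the function names, and the use of a proximal (soft-thresholding) step are all assumptions made for illustration, and each inner list simply stands in for the weights tied to one WFSA state.

```python
import math


def group_lasso_penalty(groups, lam):
    """Return lam times the sum of the L2 norms of the weight groups."""
    return lam * sum(math.sqrt(sum(w * w for w in g)) for g in groups)


def prox_group_lasso(groups, threshold):
    """Group soft-thresholding.

    Each group's L2 norm is reduced by `threshold`; any group whose norm
    is at most `threshold` becomes exactly zero, which is what lets a
    whole group (here, a stand-in for one WFSA state) be removed.
    """
    shrunk = []
    for g in groups:
        norm = math.sqrt(sum(w * w for w in g))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        shrunk.append([scale * w for w in g])
    return shrunk


# Hypothetical toy weights: one inner list per (assumed) WFSA state.
state_weights = [[3.0, 4.0], [0.1, -0.2], [0.0, 1.0]]

penalty = group_lasso_penalty(state_weights, lam=1.0)
shrunk = prox_group_lasso(state_weights, threshold=0.5)

# Indices of groups that still have a nonzero weight after shrinkage.
kept = [i for i, g in enumerate(shrunk) if any(w != 0.0 for w in g)]
print("penalty:", penalty)
print("surviving groups:", kept)
```

In this toy run the second group has a norm below the threshold, so all of its weights are zeroed together, while the other two groups are only shrunk. That all-or-nothing behavior per group is what distinguishes group lasso from an ordinary L1 penalty, which zeroes individual weights.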

2 mentions: @nlpnoah, @royschwartz02
Keywords: rnn
Date: 2019/09/10 20:17

Referring Tweets

@nlpnoah the latest on rational recurrences: use group lasso to regularize while learning and get a compact neural model equivalent to a tiny number of WFSAs. t.co/hfyS9sveO9 to appear at EMNLP, work by @JesseDodge, @royschwartz02, Hao Peng, @nlpnoah
@royschwartz02 Rational recurrences not only give us a better understanding of neural models, but also allow us to use classic tools like group lasso to learn a smaller, more efficient RNN. #greenai #emnlp2019 t.co/PRNcmXGHt6 @JesseDodge, @royschwartz02, Hao Peng, @nlpnoah t.co/LtPrfxNQfy t.co/WhkdIq92X4
