Posted to commits@mxnet.apache.org by zh...@apache.org on 2017/12/21 01:09:06 UTC

[incubator-mxnet] branch master updated: Add Wikitext for gluon/rnnlm (#9090)

This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
     new 5c3acff  Add Wikitext for gluon/rnnlm (#9090)
5c3acff is described below

commit 5c3acff3b7bdb177a4731094faa724e31387715d
Author: Zihao Zheng <zi...@gmail.com>
AuthorDate: Thu Dec 21 09:09:02 2017 +0800

    Add Wikitext for gluon/rnnlm (#9090)
    
    * Add wikitext-2 data for rnnlm example in gluon.
    
    * Add Wikitext2 for rnnlm.
    
    * Add performance data in WikiText-2.
---
 example/gluon/word_language_model/README.md        | 22 +++++++++++--
 .../word_language_model/get_wikitext2_data.sh      | 36 ++++++++++++++++++++++
 example/gluon/word_language_model/train.py         |  2 +-
 3 files changed, 57 insertions(+), 3 deletions(-)

diff --git a/example/gluon/word_language_model/README.md b/example/gluon/word_language_model/README.md
index f200c16..ff8ea56 100644
--- a/example/gluon/word_language_model/README.md
+++ b/example/gluon/word_language_model/README.md
@@ -3,6 +3,7 @@
 This example trains a multi-layer RNN (Elman, GRU, or LSTM) on Penn Treebank (PTB) language modeling benchmark.
 
 The model obtains the state-of-the-art result on PTB using LSTM, getting a test perplexity of ~72.
+On WikiText-2 it reaches a test perplexity of ~97, outperforming a basic LSTM (99.3) and approaching a Variational LSTM (96.3).
 
 The following techniques have been adopted for SOTA results: 
 - [LSTM for LM](https://arxiv.org/pdf/1409.2329.pdf)
@@ -10,20 +11,37 @@ The following techniques have been adopted for SOTA results:
 
 ## Data
 
+### PTB
+
 The PTB data is the processed version from [(Mikolov et al, 2010)](http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf):
 
 ```bash
+bash get_ptb_data.sh
 python data.py
 ```
 
+### WikiText-2
+
+The WikiText-2 data is downloaded from [the WikiText long-term dependency language modeling dataset](https://www.salesforce.com/products/einstein/ai-research/the-wikitext-dependency-language-modeling-dataset/):
+
+```bash
+bash get_wikitext2_data.sh
+```
+
+
 ## Usage
 
 Example runs and the results:
 
 ```
-python train.py --cuda --tied --nhid 650 --emsize 650 --dropout 0.5        # Test ppl of 75.3
-python train.py --cuda --tied --nhid 1500 --emsize 1500 --dropout 0.65      # Test ppl of 72.0
+python train.py --data ./data/ptb. --cuda --tied --nhid 650 --emsize 650 --dropout 0.5       # Test ppl of 75.3 on PTB
+python train.py --data ./data/ptb. --cuda --tied --nhid 1500 --emsize 1500 --dropout 0.65    # Test ppl of 72.0 on PTB
+```
+
 ```
+python train.py --data ./data/wikitext-2/wiki. --cuda --tied --nhid 256 --emsize 256         # Test ppl of 97.07 on WikiText-2
+```
+
 
 <br>
 
diff --git a/example/gluon/word_language_model/get_wikitext2_data.sh b/example/gluon/word_language_model/get_wikitext2_data.sh
new file mode 100755
index 0000000..e9b8461
--- /dev/null
+++ b/example/gluon/word_language_model/get_wikitext2_data.sh
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+RNN_DIR="$(cd "$(dirname "$0")" && pwd)"
+DATA_DIR="${RNN_DIR}/data"
+
+if [[ ! -d "${DATA_DIR}" ]]; then
+  echo "${DATA_DIR} doesn't exist, will create one";
+  mkdir -p "${DATA_DIR}"
+fi
+
+wget -P "${DATA_DIR}" https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip
+cd "${DATA_DIR}"
+unzip wikitext-2-v1.zip
+
+# rename *.tokens to *.txt, as expected by the example's data loader
+mv "${DATA_DIR}/wikitext-2/wiki.test.tokens" "${DATA_DIR}/wikitext-2/wiki.test.txt"
+mv "${DATA_DIR}/wikitext-2/wiki.valid.tokens" "${DATA_DIR}/wikitext-2/wiki.valid.txt"
+mv "${DATA_DIR}/wikitext-2/wiki.train.tokens" "${DATA_DIR}/wikitext-2/wiki.train.txt"
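The rename step at the end of the script matters: the example's loader looks for `.txt` files, while the WikiText-2 archive ships `.tokens` files. A dry run of the same pattern on a throwaway directory, with no download involved (names here are illustrative only):

```python
import os
import tempfile

# Mirror get_wikitext2_data.sh's .tokens -> .txt rename on a temp dir.
demo_dir = os.path.join(tempfile.mkdtemp(), "wikitext-2")
os.makedirs(demo_dir)
for split in ("train", "valid", "test"):
    # Create an empty stand-in for the downloaded .tokens file...
    open(os.path.join(demo_dir, f"wiki.{split}.tokens"), "w").close()
    # ...then rename it the way the script does.
    os.rename(os.path.join(demo_dir, f"wiki.{split}.tokens"),
              os.path.join(demo_dir, f"wiki.{split}.txt"))

print(sorted(os.listdir(demo_dir)))
# -> ['wiki.test.txt', 'wiki.train.txt', 'wiki.valid.txt']
```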
diff --git a/example/gluon/word_language_model/train.py b/example/gluon/word_language_model/train.py
index b419277..eb584b8 100644
--- a/example/gluon/word_language_model/train.py
+++ b/example/gluon/word_language_model/train.py
@@ -24,7 +24,7 @@ import model
 import data
 
 parser = argparse.ArgumentParser(description='MXNet Autograd PennTreeBank RNN/LSTM Language Model')
-parser.add_argument('--data', type=str, default='./data/ptb.',
+parser.add_argument('--data', type=str, default='./data/wikitext-2/wiki.',
                     help='location of the data corpus')
 parser.add_argument('--model', type=str, default='lstm',
                     help='type of recurrent net (rnn_tanh, rnn_relu, lstm, gru)')
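As the README's command lines suggest, `--data` is a filename *prefix* rather than a directory: the corpus loader appends the split name plus `.txt`, so the new default `./data/wikitext-2/wiki.` resolves to `wiki.train.txt` and friends. A sketch of that convention (`split_paths` is a hypothetical helper for illustration, not part of the example):

```python
def split_paths(prefix):
    """Map a --data prefix to the three split files the example expects."""
    return {split: f"{prefix}{split}.txt" for split in ("train", "valid", "test")}

print(split_paths("./data/wikitext-2/wiki.")["train"])
# -> ./data/wikitext-2/wiki.train.txt
```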
