You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/12/21 01:09:06 UTC
[GitHub] szha closed pull request #9090: Add Wikitext for gluon/rnnlm

szha closed pull request #9090: Add Wikitext for gluon/rnnlm
URL: https://github.com/apache/incubator-mxnet/pull/9090
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/example/gluon/word_language_model/README.md b/example/gluon/word_language_model/README.md
index f200c164a7..ff8ea56b20 100644
--- a/example/gluon/word_language_model/README.md
+++ b/example/gluon/word_language_model/README.md
@@ -3,6 +3,7 @@
 This example trains a multi-layer RNN (Elman, GRU, or LSTM) on Penn Treebank (PTB) language modeling benchmark.
 
 The model obtains the state-of-the-art result on PTB using LSTM, getting a test perplexity of ~72.
+And ~97 ppl in WikiText-2, outperform than basic LSTM(99.3) and reach Variational LSTM(96.3).
 
 The following techniques have been adopted for SOTA results: 
 - [LSTM for LM](https://arxiv.org/pdf/1409.2329.pdf)
@@ -10,20 +11,37 @@ The following techniques have been adopted for SOTA results:
 
 ## Data
 
+### PTB
+
 The PTB data is the processed version from [(Mikolov et al, 2010)](http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf):
 
 ```bash
+bash get_ptb_data.sh
 python data.py
 ```
 
+### Wiki Text
+
+The wikitext-2 data is downloaded from [(The wikitext long term dependency language modeling dataset)](https://www.salesforce.com/products/einstein/ai-research/the-wikitext-dependency-language-modeling-dataset/):
+
+```bash
+bash get_wikitext2_data.sh
+```
+
+
 ## Usage
 
 Example runs and the results:
 
 ```
-python train.py --cuda --tied --nhid 650 --emsize 650 --dropout 0.5        # Test ppl of 75.3
-python train.py --cuda --tied --nhid 1500 --emsize 1500 --dropout 0.65      # Test ppl of 72.0
+python train.py -data ./data/ptb. --cuda --tied --nhid 650 --emsize 650 --dropout 0.5        # Test ppl of 75.3 in ptb
+python train.py -data ./data/ptb. --cuda --tied --nhid 1500 --emsize 1500 --dropout 0.65      # Test ppl of 72.0 in ptb
+```
+
 ```
+python train.py -data ./data/wikitext-2/wiki. --cuda --tied --nhid 256 --emsize 256          # Test ppl of 97.07 in wikitext-2 
+```
+
 
 <br>
 
diff --git a/example/gluon/word_language_model/get_wikitext2_data.sh b/example/gluon/word_language_model/get_wikitext2_data.sh
new file mode 100755
index 0000000000..e9b8461c40
--- /dev/null
+++ b/example/gluon/word_language_model/get_wikitext2_data.sh
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+RNN_DIR=$(cd `dirname $0`; pwd)
+DATA_DIR="${RNN_DIR}/data/"
+
+if [[ ! -d "${DATA_DIR}" ]]; then
+  echo "${DATA_DIR} doesn't exist, will create one";
+  mkdir -p ${DATA_DIR}
+fi
+
+wget -P ${DATA_DIR} https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip
+cd ${DATA_DIR}
+unzip wikitext-2-v1.zip
+
+# rename
+mv ${DATA_DIR}/wikitext-2/wiki.test.tokens ${DATA_DIR}/wikitext-2/wiki.test.txt
+mv ${DATA_DIR}/wikitext-2/wiki.valid.tokens ${DATA_DIR}/wikitext-2/wiki.valid.txt
+mv ${DATA_DIR}/wikitext-2/wiki.train.tokens ${DATA_DIR}/wikitext-2/wiki.train.txt
diff --git a/example/gluon/word_language_model/train.py b/example/gluon/word_language_model/train.py
index b419277dcf..eb584b822a 100644
--- a/example/gluon/word_language_model/train.py
+++ b/example/gluon/word_language_model/train.py
@@ -24,7 +24,7 @@
 import data
 
 parser = argparse.ArgumentParser(description='MXNet Autograd PennTreeBank RNN/LSTM Language Model')
-parser.add_argument('--data', type=str, default='./data/ptb.',
+parser.add_argument('--data', type=str, default='./data/wikitext-2/wiki.',
                     help='location of the data corpus')
 parser.add_argument('--model', type=str, default='lstm',
                     help='type of recurrent net (rnn_tanh, rnn_relu, lstm, gru)')


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services