Posted to commits@mxnet.apache.org by jx...@apache.org on 2017/12/08 19:25:15 UTC
[incubator-mxnet] branch master updated: Some fixes for
example/reinforcement-learning/parallel_actor_critic (#8991)
This is an automated email from the ASF dual-hosted git repository.
jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git
The following commit(s) were added to refs/heads/master by this push:
new f154101 Some fixes for example/reinforcement-learning/parallel_actor_critic (#8991)
f154101 is described below
commit f154101db0f50a78866bab474fa85f6e86dad763
Author: mbaijal <30...@users.noreply.github.com>
AuthorDate: Fri Dec 8 11:25:11 2017 -0800
Some fixes for example/reinforcement-learning/parallel_actor_critic (#8991)
* Fix some errors
* Update the README
---
example/reinforcement-learning/parallel_actor_critic/README.md | 8 ++++++++
example/reinforcement-learning/parallel_actor_critic/model.py | 2 +-
example/reinforcement-learning/parallel_actor_critic/train.py | 2 +-
3 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/example/reinforcement-learning/parallel_actor_critic/README.md b/example/reinforcement-learning/parallel_actor_critic/README.md
index d734ceb..d328849 100644
--- a/example/reinforcement-learning/parallel_actor_critic/README.md
+++ b/example/reinforcement-learning/parallel_actor_critic/README.md
@@ -10,6 +10,14 @@ Please see the accompanying [tutorial](https://minpy.readthedocs.io/en/latest/tu
Author: Sean Welleck ([@wellecks](https://github.com/wellecks)), Reed Lee ([@loofahcus](https://github.com/loofahcus))
+
+## Prerequisites
+ - Install scikit-learn: `python -m pip install --user scikit-learn`
+ - Install SciPy: `python -m pip install --user scipy`
+ - Install the required OpenAI environments. For example, install Atari: `pip install gym[atari]`
+
+For more details, refer to https://github.com/openai/gym
+
## Training
#### Atari Pong
diff --git a/example/reinforcement-learning/parallel_actor_critic/model.py b/example/reinforcement-learning/parallel_actor_critic/model.py
index b90af67..384f48c 100644
--- a/example/reinforcement-learning/parallel_actor_critic/model.py
+++ b/example/reinforcement-learning/parallel_actor_critic/model.py
@@ -88,7 +88,7 @@ class Agent(object):
# Compute discounted rewards and advantages.
advs = []
gamma, lambda_ = self.config.gamma, self.config.lambda_
- for i in xrange(len(env_vs)):
+ for i in range(len(env_vs)):
# Compute advantages using Generalized Advantage Estimation;
# see eqn. (16) of [Schulman 2016].
delta_t = (env_rs[i] + gamma*np.array(env_vs[i][1:]) -
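The hunk above only switches the loop from `xrange` to `range`; the surrounding computation is Generalized Advantage Estimation. As a standalone NumPy sketch of that computation (not the example's exact code; `gae_advantages` is a hypothetical helper name, and `values` is assumed to carry one extra bootstrap entry V(s_T)):

```python
import numpy as np

def gae_advantages(rewards, values, gamma=0.99, lambda_=1.0):
    """Generalized Advantage Estimation; see eqn. (16) of [Schulman 2016].

    `rewards` has length T; `values` has length T+1 (the last entry is
    the bootstrap value estimate for the final state).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    # TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    deltas = rewards + gamma * values[1:] - values[:-1]
    advs = np.zeros_like(deltas)
    acc = 0.0
    # Advantage is the discounted (by gamma * lambda) sum of future
    # residuals, accumulated backwards in time.
    for t in reversed(range(len(deltas))):
        acc = deltas[t] + gamma * lambda_ * acc
        advs[t] = acc
    return advs
```

With `lambda_=1.0` this reduces to the full-return advantage; with `lambda_=0.0` it reduces to the one-step TD residual.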
diff --git a/example/reinforcement-learning/parallel_actor_critic/train.py b/example/reinforcement-learning/parallel_actor_critic/train.py
index 128a550..7b78d72 100644
--- a/example/reinforcement-learning/parallel_actor_critic/train.py
+++ b/example/reinforcement-learning/parallel_actor_critic/train.py
@@ -125,7 +125,7 @@ if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--num-envs', type=int, default=16)
parser.add_argument('--t-max', type=int, default=50)
- parser.add_argument('--env-type', default='PongDeterministic-v3')
+ parser.add_argument('--env-type', default='PongDeterministic-v4')
parser.add_argument('--render', action='store_true')
parser.add_argument('--save-pre', default='checkpoints')
parser.add_argument('--save-every', type=int, default=0)
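The env-id bump matters because newer gym releases register the Atari environments only under the `-v0`/`-v4` suffixes, so the old `PongDeterministic-v3` default no longer resolves. A minimal sketch of the updated parser defaults, mirroring the flags in the hunk above:

```python
import argparse

# Mirror of the updated flags from train.py's argument parser.
parser = argparse.ArgumentParser()
parser.add_argument('--num-envs', type=int, default=16)
parser.add_argument('--t-max', type=int, default=50)
parser.add_argument('--env-type', default='PongDeterministic-v4')

# Parsing an empty argv yields the defaults; note argparse exposes
# '--num-envs' as the attribute 'num_envs'.
args = parser.parse_args([])
print(args.env_type)  # PongDeterministic-v4
```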
--
To stop receiving notification emails like this one, please contact
"commits@mxnet.apache.org" <co...@mxnet.apache.org>.