Posted to user@spark.apache.org by benlaird <be...@capitalone.com> on 2014/04/22 21:43:44 UTC

Re: java.net.SocketException on reduceByKey() in pyspark

I was getting this error after upgrading my nodes to Python 2.7. I suspected
the problem was due to conflicting Python versions, but my 2.7 install
seemed correct on my nodes.

I set the PYSPARK_PYTHON variable to point at my 2.7 install (I still had
2.6 installed and linked to the 'python' executable, with 'python2.7' the
name of my new install).
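
A quick sanity check (just an illustrative snippet, not specific to my
setup) is to print the variable from the driver before creating the
context:

    import os
    # Should show 'python2.7' if the conf script took effect; None or
    # 'python' means the variable was lost or overwritten somewhere.
    print(os.environ.get('PYSPARK_PYTHON'))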

I'm still figuring out why this was happening, but even though I was
defining the PYSPARK_PYTHON environment variable in my
../conf/spark-shell.sh script, it was being overwritten. I eventually
realized I should look at where the Python executable is actually set, in
/pyspark/context.py.
 
sc.pythonExec (where sc is my SparkContext) was returning 'python' instead
of 'python2.7', even though I had 'python2.7' in my config script.
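
You can see what the context actually picked up from the pyspark shell,
where sc is already defined, for example:

    # Which executable Spark will launch for Python workers
    print(sc.pythonExec)   # prints 'python' here, not 'python2.7'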
 
Setting os.environ['PYSPARK_PYTHON'] = 'python2.7' directly in my script
before creating the SparkContext object solved the problem.
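
For reference, here is a minimal sketch of the fix as I applied it (the
master URL and app name below are placeholders for whatever your script
uses):

    import os
    # Must be set before the SparkContext is constructed, since
    # pyspark/context.py reads PYSPARK_PYTHON at that point.
    os.environ['PYSPARK_PYTHON'] = 'python2.7'

    from pyspark import SparkContext
    sc = SparkContext('local', 'example-app')  # workers now launch python2.7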




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/java-net-SocketException-on-reduceByKey-in-pyspark-tp2184p4612.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.