Posted to user@spark.apache.org by hassan <He...@gmail.com> on 2014/07/03 08:21:25 UTC

Spark S3 LZO input files

I'm trying to read input files from S3. The files are compressed with LZO.
For example, from the spark-shell:

sc.textFile("s3n://path/xx.lzo").first returns 'String = �LZO?'

Spark does not decompress the data in the file. I am using Cloudera
Manager 5 with CDH 5.0.2. I've already installed the 'GPLEXTRAS' parcel and
have included '/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/hadoop-lzo.jar'
and '/opt/cloudera/parcels/GPLEXTRAS/lib/hadoop/lib/native/' in
SPARK_CLASSPATH. What am I missing?
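For what it's worth, sc.textFile goes through TextInputFormat, which only decompresses formats registered with the Hadoop codec factory; one approach I've seen suggested (a sketch only, not verified against CDH 5.0.2) is to read the file explicitly through the LZO-aware input format that ships in hadoop-lzo, com.hadoop.mapreduce.LzoTextInputFormat, from the same spark-shell:

```scala
import org.apache.hadoop.io.{LongWritable, Text}
// LzoTextInputFormat comes from the hadoop-lzo jar in the GPLEXTRAS parcel.
import com.hadoop.mapreduce.LzoTextInputFormat

// Read via the LZO-aware input format instead of plain textFile,
// then keep only the decompressed text values.
val lines = sc.newAPIHadoopFile[LongWritable, Text, LzoTextInputFormat](
  "s3n://path/xx.lzo"
).map(_._2.toString)

lines.first // should return the first decompressed line, not raw LZO bytes
```

This assumes the hadoop-lzo jar is on the driver and executor classpath and the native library directory is visible to the JVM (e.g. via java.library.path); if the native liblzo2/gplcompression libraries aren't loadable on the executors, the job will still fail even with the jar present.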



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-S3-LZO-input-files-tp8706.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.