Posted to user@spark.apache.org by Yogesh Vyas <in...@gmail.com> on 2017/05/23 05:02:26 UTC

streaming of binary files in PySpark

Hi,

I want to use Spark Streaming to read binary files from HDFS. The
documentation says to use binaryRecordsStream(directory, recordLength),
but I don't understand what the record length means. Is it the size of
the binary file, or something else?


Regards,
Yogesh
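
For context on the question above: as far as the Spark documentation describes it, binaryRecordsStream reads each new file as a flat binary file made up of fixed-length records, so recordLength would be the size in bytes of one record, not the size of the whole file. The sketch below illustrates that idea in plain Python without Spark. The 12-byte record layout and the names RECORD_FORMAT, RECORD_LENGTH, and split_records are assumptions made up for illustration, not part of any Spark API.

```python
import struct

# Hypothetical record layout (an assumption for illustration only):
# one 4-byte little-endian int followed by one 8-byte double.
RECORD_FORMAT = "<id"
RECORD_LENGTH = struct.calcsize(RECORD_FORMAT)  # bytes per record


def split_records(data, record_length):
    """Split a flat byte string into consecutive fixed-length records."""
    if len(data) % record_length != 0:
        raise ValueError("data length is not a multiple of record_length")
    return [
        data[i:i + record_length]
        for i in range(0, len(data), record_length)
    ]


# Simulate the contents of one binary file that holds three records.
file_bytes = b"".join(
    struct.pack(RECORD_FORMAT, i, i * 1.5) for i in range(3)
)

# Each element is the raw bytes of exactly one record.
records = split_records(file_bytes, RECORD_LENGTH)

# Decode every record back into its (int, double) fields.
decoded = [struct.unpack(RECORD_FORMAT, r) for r in records]
```

Under the same assumed layout, the Spark Streaming call would presumably look like ssc.binaryRecordsStream(directory, RECORD_LENGTH), with each element of the resulting DStream being the bytes of one record that can then be decoded with struct.unpack. This only fits files whose records all have the same length.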