Posted to user@spark.apache.org by Bernardo Vecchia Stein <be...@gmail.com> on 2015/08/18 23:07:13 UTC

Spark scala addFile retrieving file with incorrect size

Hi all,

I'm trying to run a Spark job (written in Scala) that uses addFile to
download some small files to each node. However, one of the downloaded
files has an incorrect size (the others are fine), which causes an error
when the code uses it.

I looked into the issue further and hexdump'ed both the original and the
Spark-retrieved files. The two files start out identical, but the
Spark-retrieved one is truncated partway through. The cut-off point looked
random at first, but I noticed that it falls at exactly half the size of
the original file. I'm not sure whether that is a coincidence.
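
For anyone who wants to reproduce this kind of comparison, here is a
minimal sketch in plain Scala. It is only an illustration, not the code
from the job: the object name CompareFiles and its methods are made up,
and the two paths are passed on the command line. On a worker, the second
path would presumably be whatever SparkFiles.get returns for a file that
was registered with addFile.

```scala
import java.nio.file.{Files, Paths}

object CompareFiles {
  // Number of leading bytes that are identical in both arrays.
  def commonPrefixLength(a: Array[Byte], b: Array[Byte]): Int = {
    val limit = math.min(a.length, b.length)
    var i = 0
    while (i < limit && a(i) == b(i)) i += 1
    i
  }

  // Summarise how the retrieved copy relates to the original.
  def describe(original: Array[Byte], retrieved: Array[Byte]): String = {
    val prefix = commonPrefixLength(original, retrieved)
    if (prefix == original.length && prefix == retrieved.length)
      s"identical (${original.length} bytes)"
    else if (prefix == retrieved.length)
      // Every retrieved byte matches, but the copy ends early.
      s"truncated: ${retrieved.length} of ${original.length} bytes"
    else
      s"differs at offset $prefix"
  }

  def main(args: Array[String]): Unit = {
    // args(0): path to the original file
    // args(1): path to the copy fetched on the node (hypothetical input)
    val original = Files.readAllBytes(Paths.get(args(0)))
    val retrieved = Files.readAllBytes(Paths.get(args(1)))
    println(describe(original, retrieved))
  }
}
```

Reading both files fully into memory is fine here only because the files
are tiny; for large files a streaming comparison would be a better fit.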

The original file is 296 bytes (the others are a little bigger, around
13 KB each).

I'm fairly new to Spark, so I'm stuck at this point trying to figure out
what is going wrong. Does anyone have an idea of what might be causing
this?

Thank you,
Bernardo