Posted to user@spark.apache.org by Marco Mistroni <mm...@gmail.com> on 2018/04/24 21:28:46 UTC

Problem in persisting file in S3 using Spark: xxx file does not exist Exception

Hi all,
I am using the following code to persist data to S3 (the AWS keys are
already stored in environment variables):

dataFrame.coalesce(1).write.format("com.databricks.spark.csv").save(fileName)


However, I keep getting an exception saying that the file does not exist.

Here is what the logs show:

18/04/24 22:15:32 INFO Persiste: Persisting data to text file:
s3://ec2-bucket-mm-spark/form4-results-2404.results
Exception in thread "main" java.io.IOException: /form4-results-2404.results
doesn't exist

It seems that Spark expects the file to exist before writing, which
seems bizarre.

I have even tried removing the coalesce, but I still got the same exception.
Could anyone help, please?
Kind regards,
Marco

Re: Problem in persisting file in S3 using Spark: xxx file does not exist Exception

Posted by Marco Mistroni <mm...@gmail.com>.
Hi,
Sorted. I just replaced s3 with s3a in the output path. I think I recall
similar issues in the past with the AWS libraries.
Thanks anyway for getting back.
Kind regards
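
For anyone finding this thread later, here is a rough sketch of that fix in Scala. The S3Paths object and its toS3a helper are hypothetical names made up for illustration, not part of Spark or Hadoop; the helper only swaps the URI scheme and leaves every other path alone. It assumes the hadoop-aws module (which provides the s3a connector) is on the classpath.

```scala
object S3Paths {
  // Hypothetical helper: rewrite a leading "s3://" or "s3n://" to "s3a://".
  // Any path that does not start with one of those schemes is returned unchanged.
  def toS3a(path: String): String =
    path.replaceFirst("^s3n?://", "s3a://")
}

// Usage with the write call from the original post (needs a running Spark job,
// so it is left as a comment here):
//   dataFrame.coalesce(1)
//     .write.format("com.databricks.spark.csv")
//     .save(S3Paths.toS3a(fileName))
```

Simply changing the fileName string by hand works just as well; the helper is only useful if the paths come from configuration that cannot be edited easily.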

On Wed, May 2, 2018, 4:57 PM Paul Tremblay <pa...@gmail.com> wrote:

> I would like to see the full error. However, S3 can give misleading
> messages if you don't have the correct permissions.

Re: Problem in persisting file in S3 using Spark: xxx file does not exist Exception

Posted by Paul Tremblay <pa...@gmail.com>.
I would like to see the full error. However, S3 can give misleading
messages if you don't have the correct permissions.
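
One possible way to capture the full error, sketched in Scala: walk the exception's cause chain and print every level, since the top-level message alone can hide the underlying reason. ErrorReport, causeChain and describe are hypothetical names used only for this illustration.

```scala
import scala.annotation.tailrec

object ErrorReport {
  // Collect the exception and each nested cause, outermost first.
  // Stops at the end of the chain or if a cause refers back to itself.
  def causeChain(t: Throwable): List[Throwable] = {
    @tailrec
    def loop(cur: Throwable, acc: List[Throwable]): List[Throwable] =
      if (cur == null || acc.exists(_ eq cur)) acc.reverse
      else loop(cur.getCause, cur :: acc)
    loop(t, Nil)
  }

  // Render the chain as "Class: message" lines joined by "Caused by: ".
  def describe(t: Throwable): String =
    causeChain(t)
      .map(e => s"${e.getClass.getName}: ${e.getMessage}")
      .mkString("\nCaused by: ")
}

// Usage around the failing write (needs a running Spark job, so commented out):
//   try {
//     dataFrame.coalesce(1).write.format("com.databricks.spark.csv").save(fileName)
//   } catch {
//     case e: java.io.IOException =>
//       println(ErrorReport.describe(e))
//       throw e
//   }
```

Calling e.printStackTrace() gives the same information with stack frames included; the sketch above just keeps the output short enough to paste into a mail.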
