Posted to user@hive.apache.org by Neal Richter <nr...@gmail.com> on 2009/07/31 03:11:52 UTC

Errors pushing to output S3

Below is the output of Hive for an INSERT-SELECT from one 'EXTERNAL' table
to another.  This is running in EC2, and the external tables have partitions
registered as path keys in S3.  The final upload of data to S3 fails.  This
does not always happen, but when it does the only option seems to be to
re-run the job.

Is there another way?  Perhaps a way to tell Hive to retry data uploads to
S3 if a failure occurs? - Neal


[snip]
 map = 100%,  reduce =52%
 map = 100%,  reduce =88%
 map = 100%,  reduce =93%
 map = 100%,  reduce =100%
 map = 100%,  reduce =49%
 map = 100%,  reduce =100%
 map = 100%,  reduce =45%
 map = 100%,  reduce =86%
 map = 100%,  reduce =100%
 map = 100%,  reduce =67%
 map = 100%,  reduce =12%
 map = 100%,  reduce =14%
 map = 100%,  reduce =0%
 map = 100%,  reduce =17%
 map = 100%,  reduce =0%
 map = 100%,  reduce =45%
Ended Job = job_200907281406_0315
Job Commit failed with exception
'org.apache.hadoop.hive.ql.metadata.HiveException(org.apache.hadoop.fs.s3.S3Exception:
org.jets3t.service.S3ServiceException: Encountered too many S3 Internal
Server errors (6), aborting request.)'
FAILED: Execution Error, return code 3 from
org.apache.hadoop.hive.ql.exec.ExecDriver
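
From the stack trace it looks like jets3t is giving up after hitting its
internal retry limit on 500 responses.  If there is a knob for this, my guess
is it's something like the following in jets3t.properties (property name taken
from the jets3t docs; I haven't verified it against the jets3t version our
Hive/Hadoop build bundles):

```properties
# jets3t.properties -- raise the ceiling on retried S3 "Internal Server
# Error" (500) responses before jets3t aborts the request.
# The default is reportedly 5, which would match the "(6), aborting
# request" in the error above.
s3service.internal-error-retry-max=10
```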

Re: Errors pushing to output S3

Posted by Neal Richter <nr...@gmail.com>.
Not sure, I will look into it.

My prediction is that we'll go with HDFS rather than S3-linked destination
tables soon anyway.  In the meantime we're going to instrument the code to
just retry failed queries.
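
A rough sketch of the wrapper, assuming we drive Hive from a shell script
(the `hive -e` invocation in the comment is just an illustration, not our
actual query):

```shell
#!/bin/sh
# Retry a command up to $1 times, with a short linear backoff between
# attempts.  Returns 0 on the first success, 1 if every attempt fails.
retry() {
  max=$1; shift
  n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$n"   # back off a little more each time
  done
}

# Hypothetical usage -- substitute the real INSERT-SELECT:
# retry 3 hive -e "INSERT OVERWRITE TABLE dest PARTITION (dt='2009-07-30') SELECT * FROM src"
```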

- Neal

On Thu, Aug 6, 2009 at 4:03 PM, Zheng Shao <zs...@gmail.com> wrote:

> Hi Neal,
>
> It seems that the exception is thrown at:
> org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException
>
> Is there any parameter to adjust the S3 file system?
>
> Zheng
>
> On Thu, Aug 6, 2009 at 7:32 AM, Neal Richter<nr...@gmail.com> wrote:
> > Any response on this?  We had some hadoop output to S3 failures last week
> as
> > well.  Should I post to hive-dev list?
> >
> > Thanks! - Neal
> >
>
>
>
> --
> Yours,
> Zheng
>

Re: Errors pushing to output S3

Posted by Zheng Shao <zs...@gmail.com>.
Hi Neal,

It seems that the exception is thrown at:
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException

Is there any parameter to adjust the S3 file system?
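
For example, the Hadoop S3 filesystem layer has retry settings along these
lines (names from the Hadoop source; I have not checked which release
introduced them or whether the job-commit path honors them):

```xml
<!-- hadoop-site.xml / Hive overrides: retry transient S3 failures
     at the Hadoop FileSystem layer before giving up. -->
<property>
  <name>fs.s3.maxRetries</name>
  <value>10</value> <!-- default is 4 -->
</property>
<property>
  <name>fs.s3.sleepTimeSeconds</name>
  <value>10</value> <!-- pause between retries -->
</property>
```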

Zheng

On Thu, Aug 6, 2009 at 7:32 AM, Neal Richter<nr...@gmail.com> wrote:
> Any response on this?  We had some hadoop output to S3 failures last week as
> well.  Should I post to hive-dev list?
>
> Thanks! - Neal
>
>



-- 
Yours,
Zheng

Re: Errors pushing to output S3

Posted by Neal Richter <nr...@gmail.com>.
Any response on this?  We had some hadoop output to S3 failures last week as
well.  Should I post to hive-dev list?

Thanks! - Neal
