Posted to dev@kudu.apache.org by sky <x_...@163.com> on 2017/08/10 11:37:39 UTC

kudu insert errors

Hi all,
    I am using the Kudu Python API to insert data into a Kudu cluster, but with a large amount of data and many columns the inserts fail with errors such as:
    kudu.errors.KuduBadStatus: Incomplete: not enough mutation buffer space remaining for operation: required additional 1146 when 7339290 of 7340032 already used
   or
   write RPC to ip:port  timed out after 5.000s (SENT)
   Why?
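
The figures in the first error can be read as a simple capacity check. The sketch below only redoes the arithmetic from the message; it assumes the three numbers are byte counts, which the message itself does not state.

```python
# Figures copied verbatim from the error message in the post above.
# Assumption: all three values are sizes in bytes.
required = 1146        # space the rejected operation needed
used = 7339290         # space already taken by buffered operations
capacity = 7340032     # total mutation buffer space of the session

remaining = capacity - used          # space still free in the buffer
shortfall = required - remaining     # how much the operation overshoots

print(capacity == 7 * 1024 * 1024)   # True: the buffer is exactly 7 MiB
print(remaining)                     # 742
print(shortfall)                     # 404
```

In other words, the buffer was almost full of operations that had not been flushed yet, and the next row no longer fit.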




Re: Re: kudu insert errors

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Aug 15, 2017 at 2:00 AM, sky <x_...@163.com> wrote:

> Hi Todd,
> Where is the buffer size configured? Is this the size of the submission?
>

It's set through the client API (KuduSession::SetMutationBufferSpace in the
C++ client). I'm not sure whether it's exposed in the Python API.


> 'MANUAL_FLUSH' only submitted a few thousand each time, the speed is too
> slow.
>

I think using AUTO_FLUSH_BACKGROUND should be your best bet. Are you
getting the buffer errors while configured for AUTO_FLUSH_BACKGROUND?





-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Re: kudu insert errors

Posted by sky <x_...@163.com>.
Hi Todd,
Where is the buffer size configured? Is it the size of each submission?
With 'MANUAL_FLUSH' I can only submit a few thousand at a time, which is too slow.








At 2017-08-15 03:56:10, "Todd Lipcon" <to...@cloudera.com> wrote:
>Hi Sky,
>
>It sounds like you are using 'MANUAL_FLUSH' mode for your KuduSession. You
>should switch to AUTO_FLUSH_BACKGROUND mode, or else you need to call Flush
>more frequently to ensure that you don't overrun your configured buffer
>size.
>
>-Todd
>
>On Thu, Aug 10, 2017 at 4:37 AM, sky <x_...@163.com> wrote:
>
>> Hi,all
>>     I am using kudu python API to insert  data to kudu cluster, but a lot
>> of data and too many columns would lead to the errors:
>>  kudu.errors.KuduBadStatus: Incomplete: not enough mutation buffer space
>> remaining for operation: required additional 1146 when 7339290 of 7340032
>> already used
>>    or
>>    write RPC to ip:port  timed out after 5.000s (SENT)
>>    Why?
>>
>>
>>
>>
>
>
>-- 
>Todd Lipcon
>Software Engineer, Cloudera

Re: kudu insert errors

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Sky,

It sounds like you are using 'MANUAL_FLUSH' mode for your KuduSession. You
should switch to AUTO_FLUSH_BACKGROUND mode, or else you need to call Flush
more frequently to ensure that you don't overrun your configured buffer
size.

-Todd

On Thu, Aug 10, 2017 at 4:37 AM, sky <x_...@163.com> wrote:

> Hi,all
>     I am using kudu python API to insert  data to kudu cluster, but a lot
> of data and too many columns would lead to the errors:
>  kudu.errors.KuduBadStatus: Incomplete: not enough mutation buffer space
> remaining for operation: required additional 1146 when 7339290 of 7340032
> already used
>    or
>    write RPC to ip:port  timed out after 5.000s (SENT)
>    Why?
>
>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera