Posted to user@cassandra.apache.org by Jin Lei <je...@gmail.com> on 2012/08/10 04:39:22 UTC

Problem inserting a large number of columns

Hello everyone,
I'm new to Cassandra and recently ran into a problem.
I want to insert over 50k columns into Cassandra at one time; their total
size doesn't exceed 16 MB, but the database returns the following exception.

[E 120809 15:37:31 service:1251] error in write to database
    Traceback (most recent call last):
      File "/home/stoneiii/mycode/src/user/service.py", line 1248, in flush_mutator
        self.mutator.send()
      File "/home/stoneiii/mycode/pylib/pycassa/batch.py", line 127, in send
        conn.batch_mutate(mutations, write_consistency_level)
      File "/home/stoneiii/gaia2/pylib/pycassa/pool.py", line 145, in new_f
        return new_f(self, *args, **kwargs)
      File "/home/stoneiii/mycode/pylib/pycassa/pool.py", line 145, in new_f
        return new_f(self, *args, **kwargs)
      File "/home/stoneiii/mycode/pylib/pycassa/pool.py", line 145, in new_f
        return new_f(self, *args, **kwargs)
      File "/home/stoneiii/mycode/pylib/pycassa/pool.py", line 145, in new_f
        return new_f(self, *args, **kwargs)
      File "/home/stoneiii/mycode/pylib/pycassa/pool.py", line 145, in new_f
        return new_f(self, *args, **kwargs)
      File "/home/stoneiii/mycode/pylib/pycassa/pool.py", line 140, in new_f
        (self._retry_count, exc.__class__.__name__, exc))
    MaximumRetryException: Retried 6 times. Last failure was error: [Errno 104] Connection reset by peer
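
For reference, the write path that produces this looks roughly like the
following (simplified; the keyspace and column family names are placeholders
for the real ones used in service.py):

    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    # Placeholder keyspace/column family names.
    pool = ConnectionPool('MyKeyspace', server_list=['localhost:9160'])
    cf = ColumnFamily(pool, 'MyColumnFamily')

    # Queue everything in one Mutator, so that send() issues a single
    # batch_mutate call carrying all ~50k inserts (~16 MB of data).
    batch = cf.batch(queue_size=100000)  # large enough to prevent auto-flush
    for i in xrange(50000):
        batch.insert('row_%d' % i, {'col': 'value_%d' % i})
    batch.send()  # this is the call that dies with the retry exception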

Since Cassandra supports up to 2 billion columns in a single row, why can't
I insert 50k columns this way? What settings should I adjust to get past
this limit?
Thanks in advance for any hints!

Re: Problem inserting a large number of columns

Posted by Tyler Hobbs <ty...@datastax.com>.
There is a fair amount of overhead in the Thrift structures for columns and
mutations, so that's a pretty large mutation. With ~16 MB of raw data plus
that per-column overhead, the request very likely exceeds the server's
Thrift message limits (thrift_framed_transport_size_in_mb and
thrift_max_message_length_mb in cassandra.yaml), at which point the node
drops the connection and the client sees "Connection reset by peer".

In general, you'll see better performance inserting many small batch
mutations in parallel.
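
For example, something like the following (a rough sketch, not your exact
code: the pool/column family names and the chunk size are made up) keeps
each batch_mutate request small:

    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool('MyKeyspace', server_list=['localhost:9160'])
    cf = ColumnFamily(pool, 'MyColumnFamily')

    CHUNK = 500  # rows per batch; tune so each request stays well under the limit

    rows = [('row_%d' % i, {'col': 'value_%d' % i}) for i in xrange(50000)]
    for start in xrange(0, len(rows), CHUNK):
        batch = cf.batch()
        for key, columns in rows[start:start + CHUNK]:
            batch.insert(key, columns)
        batch.send()  # one small batch_mutate per chunk

To parallelize, run several of these chunk loops concurrently (e.g. from a
few worker threads), giving each worker its own batch object; the
ConnectionPool itself can be shared between threads.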

On Fri, Aug 10, 2012 at 2:04 AM, Jin Lei <je...@gmail.com> wrote:

> Sorry, something was wrong with my previous problem description. In fact,
> Cassandra denies my requests when I try to insert 50k rows (rather than
> 50k columns) into a column family at one time, each row with one column.


-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Problem inserting a large number of columns

Posted by Jin Lei <je...@gmail.com>.
Sorry, something was wrong with my previous problem description. In fact,
Cassandra denies my requests when I try to insert 50k rows (rather than 50k
columns) into a column family at one time, each row with one column.
