Posted to user@cassandra.apache.org by Vikas Jaiman <er...@gmail.com> on 2016/10/20 18:58:23 UTC

Does anyone store larger values in Cassandra E.g. 500 KB?

Hi,

Normally people store smaller values in Cassandra. Is anyone using it to
store larger values (e.g. 500 KB or more), and if so, what issues are you
facing? I would also like to know which tweaks you are considering.

Thanks,
Vikas

Re: Does anyone store larger values in Cassandra E.g. 500 KB?

Posted by Vikas Jaiman <er...@gmail.com>.
Thanks. I will have a look into that.


Vikas


Re: Does anyone store larger values in Cassandra E.g. 500 KB?

Posted by Jens Rantil <je...@tink.se>.
If I were to do this, I would have two tables, `file_chunks` and `chunks`:

CREATE TABLE file_chunks (
  filename text,
  chunk int,
  size int, // optional, if you want to query the total size of a file
  PRIMARY KEY (filename, chunk)
);

CREATE TABLE chunks (
  filename text,
  chunk int,
  data blob,
  PRIMARY KEY ((filename, chunk))
);

By keeping the chunk data in a separate table, you make sure the data is
spread more evenly across the cluster, since each chunk gets its own
partition. If file sizes vary a lot, this is a much better approach. Also,
using `(filename, chunk)` as the primary key of the `chunks` table makes it
possible to have a background process that deletes rows in `chunks` that no
longer exist in `file_chunks`.
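The cleanup process described above can be sketched in Python. This is a
minimal illustration with a hypothetical helper name; it assumes the
`(filename, chunk)` keys have already been paged out of both tables with
your driver of choice:

```python
def find_orphan_chunks(file_chunk_keys, chunk_keys):
    """Return the (filename, chunk) keys present in `chunks` but missing
    from `file_chunks` -- candidates for a background DELETE."""
    return sorted(set(chunk_keys) - set(file_chunk_keys))

# Example: "b.bin" was removed from file_chunks but its chunks remain.
file_chunks = [("a.bin", 0), ("a.bin", 1)]
chunks = [("a.bin", 0), ("a.bin", 1), ("b.bin", 0), ("b.bin", 1)]
print(find_orphan_chunks(file_chunks, chunks))
# -> [('b.bin', 0), ('b.bin', 1)]
```

In practice you would page through both tables with token-range scans
rather than loading all keys into memory at once.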

Jens


-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


Re: Does anyone store larger values in Cassandra E.g. 500 KB?

Posted by jason zhao yang <zh...@gmail.com>.
1. Usually the object is serialized before it is stored, so you know its
size at write time.
2. Add a "chunk id" as the last clustering key.
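The two points above can be sketched as follows. This is a minimal
illustration, not code from the thread; pickle and the 64 KB chunk size are
my own choices:

```python
import pickle

CHUNK_SIZE = 64 * 1024  # bytes per chunk; tune to your row-size budget

def to_chunks(obj):
    """Serialize an object and split it into (chunk_id, bytes) pairs.
    Serializing first (point 1) is what tells us the total size;
    chunk_id (point 2) becomes the last clustering key."""
    blob = pickle.dumps(obj)
    return len(blob), [(i, blob[off:off + CHUNK_SIZE])
                       for i, off in enumerate(range(0, len(blob), CHUNK_SIZE))]

def from_chunks(chunks):
    """Reassemble: read the chunks ordered by chunk_id and concatenate."""
    return pickle.loads(b"".join(data for _, data in sorted(chunks)))

size, parts = to_chunks({"key": "x" * 200_000})
assert from_chunks(parts) == {"key": "x" * 200_000}
```

The read path is then a single-partition query: select all rows for the
partition key, which the clustering key returns in chunk order, and join
the blobs before deserializing.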


Re: Does anyone store larger values in Cassandra E.g. 500 KB?

Posted by Vikas Jaiman <er...@gmail.com>.
Thanks for your answer, but I am just curious about:

i) How do you identify the size of the object you are going to chunk?

ii) When reading or updating, how does it read all those chunks?

Vikas


Re: Does anyone store larger values in Cassandra E.g. 500 KB?

Posted by Justin Cameron <ju...@instaclustr.com>.
You can, but it is not really very efficient or cost-effective. You may
encounter issues with streaming, repairs and compaction if you have very
large blobs (100MB+), so try to keep them under 10MB if possible.

I'd suggest storing blobs in something like Amazon S3 and keeping just the
bucket name & blob id in Cassandra.
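One way to act on this advice is to route writes by size at the application
layer. A hypothetical sketch (the 10 MB cutoff comes from the advice above;
the store names are placeholders, not real APIs):

```python
INLINE_LIMIT = 10 * 1024 * 1024  # ~10 MB, per the advice above

def choose_store(blob_size):
    """Decide where a value should live based on its serialized size."""
    if blob_size <= INLINE_LIMIT:
        return "cassandra"     # small enough to store as a blob column
    return "object-store"      # e.g. S3; keep only (bucket, key) in Cassandra

print(choose_store(500 * 1024))         # -> cassandra
print(choose_store(100 * 1024 * 1024))  # -> object-store
```

For the external case, the Cassandra row then holds just the pointer
(bucket name and object key) plus whatever metadata you need to query.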

-- 

Justin Cameron

Senior Software Engineer | Instaclustr




This email has been sent on behalf of Instaclustr Pty Ltd (Australia) and
Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally
privileged information.  If you are not the intended recipient, do not copy
or disclose its content, but please reply to this email immediately and
highlight the error to the sender and then immediately delete the message.

Re: Does anyone store larger values in Cassandra E.g. 500 KB?

Posted by Harikrishnan Pillai <HP...@walmartlabs.com>.
We use Cassandra to store images. Any data above 2 MB we chunk before storing, and it works perfectly.

Sent from my iPhone
