Posted to user@cassandra.apache.org by "Alaa Zubaidi (PDF)" <al...@pdf.com> on 2018/09/14 00:46:43 UTC

cold vs hot data

Hi,

We are using Apache Cassandra 3.11.2 on Red Hat 7.
The data can grow to more than 100 TB; however, the hot data will in most
cases be less than 10 TB, and we still need to keep the rest of the data
accessible.
Has anyone had this problem?
What is the best way to make the cluster more efficient?
Is there a way to automatically move the old data to different
storage (rack, DC, etc.)?
Any ideas?

Regards,

-- 

Alaa

-- 
This message may contain confidential and privileged information. If it has 
been sent to you in error, please reply to advise the sender of the error 
and then immediately permanently delete it and all attachments to it from 
your systems. If you are not the intended recipient, do not read, copy, 
disclose or otherwise use this message or any attachments to it. The sender 
disclaims any liability for such unauthorized use.  PLEASE NOTE that all 
incoming e-mails sent to PDF e-mail accounts will be archived and may be 
scanned by us and/or by external service providers to detect and prevent 
threats to our systems, investigate illegal or inappropriate behavior, 
and/or eliminate unsolicited promotional e-mails (“spam”).  If you have any 
concerns about this process, please contact us at legal.department@pdf.com 
<ma...@pdf.com>.

Re: cold vs hot data

Posted by "Alaa Zubaidi (PDF)" <al...@pdf.com>.
Let me check lvmcache. Thanks.

On Thu, Sep 13, 2018 at 11:39 PM, Mateusz <ma...@ant.gliwice.pl>
wrote:

> We solved it using lvmcache.
>
> --
> Mateusz
> (...) I have a brother - serious, a homebody, a miser, a hypocrite,
>         sanctimonious, in short - a pillar of society."
>                 Nikos Kazantzakis - "Zorba the Greek"
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: user-help@cassandra.apache.org
>
>


-- 

Alaa Zubaidi
PDF Solutions, Inc.
333 West San Carlos Street, Suite 1000
San Jose, CA 95110  USA
Tel: 408-283-5639
fax: 408-938-6479
email: alaa.zubaidi@pdf.com


Re: [EXTERNAL] Re: cold vs hot data

Posted by DuyHai Doan <do...@gmail.com>.
Also, for the record, I remember DataStax having something called Tiered
Storage that moves data around (folders/disk volumes) based on data age.
To be checked.

On Mon, Sep 17, 2018 at 10:23 PM, DuyHai Doan <do...@gmail.com> wrote:

> Sean
>
> Without transactions à la SQL, how can you guarantee atomicity between
> both tables for upserts? I mean, one write could succeed on the hot table
> and fail on the cold table.
>
> The only solution I see is using a logged batch, with a huge overhead and
> performance hit on the writes.
>
> On Mon, Sep 17, 2018 at 8:28 PM, Durity, Sean R <
> SEAN_R_DURITY@homedepot.com> wrote:
>
>> An idea:
>>
>> On initial insert, insert into 2 tables:
>> Hot with short TTL
>> Cold/archive with a longer (or no) TTL
>> Then your hot data is always in the same table, but being expired. And
>> you can access the archive table only for the more rare circumstances. Then
>> you could have the HOT table on a different volume of faster storage. If
>> the hot/cold tables are in different keyspaces, then you could also have
>> different replication (a HOT DC and an archive DC, for example)
>>
>>
>> Sean Durity
>> ________________________________
>>
>> The information in this Internet Email is confidential and may be legally
>> privileged. It is intended solely for the addressee. Access to this Email
>> by anyone else is unauthorized. If you are not the intended recipient, any
>> disclosure, copying, distribution or any action taken or omitted to be
>> taken in reliance on it, is prohibited and may be unlawful. When addressed
>> to our clients any opinions or advice contained in this Email are subject
>> to the terms and conditions expressed in any applicable governing The Home
>> Depot terms of business or client engagement letter. The Home Depot
>> disclaims all responsibility and liability for the accuracy and content of
>> this attachment and for any damages or losses arising from any
>> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
>> items of a destructive nature, which may be contained in this attachment
>> and shall not be liable for direct, indirect, consequential or special
>> damages in connection with this e-mail message or its attachment.

Re: [EXTERNAL] Re: cold vs hot data

Posted by "Alaa Zubaidi (PDF)" <al...@pdf.com>.
This is one of the options that we are thinking of, but this will require
more storage, which is something that we are trying to avoid.
We will test the performance for the batch inserts.
Thanks

On Tue, Sep 18, 2018 at 6:35 AM, Durity, Sean R <SEAN_R_DURITY@homedepot.com> wrote:

> Wouldn’t you have the same problem with two similar tables with different
> primary keys (e.g., UserByID and UserByName)? This is a very common pattern
> in Cassandra – inserting into multiple tables. That’s what batches are for
> – atomicity.
>
> I don’t understand the additional concern here.
>
> Sean Durity



-- 

Alaa Zubaidi
PDF Solutions, Inc.
333 West San Carlos Street, Suite 1000
San Jose, CA 95110  USA
Tel: 408-283-5639
fax: 408-938-6479
email: alaa.zubaidi@pdf.com


RE: [EXTERNAL] Re: cold vs hot data

Posted by "Durity, Sean R" <SE...@homedepot.com>.
Wouldn’t you have the same problem with two similar tables with different primary keys (e.g., UserByID and UserByName)? This is a very common pattern in Cassandra – inserting into multiple tables. That’s what batches are for – atomicity.
I don’t understand the additional concern here.



Sean Durity


Re: [EXTERNAL] Re: cold vs hot data

Posted by DuyHai Doan <do...@gmail.com>.
Sean

Without transactions à la SQL, how can you guarantee atomicity between both
tables for upserts? I mean, one write could succeed on the hot table and
fail on the cold table.

The only solution I see is using a logged batch, with a huge overhead and
performance hit on the writes.
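
To make the logged-batch option concrete, here is a minimal sketch that assembles the CQL text for one. The keyspace, table and column names are hypothetical, as is the 30-day TTL; the snippet only builds a string and does not talk to a cluster. In CQL, BEGIN BATCH without the UNLOGGED keyword is a logged batch.

```python
# Sketch only: keyspace, table and column names are hypothetical.

def logged_batch(statements):
    """Wrap CQL statements in a logged batch (BEGIN BATCH is logged by default)."""
    body = "\n".join(f"  {s.rstrip(';')};" for s in statements)
    return f"BEGIN BATCH\n{body}\nAPPLY BATCH;"


# Hot copy expires after an assumed 30 days (2592000 seconds).
hot_insert = (
    "INSERT INTO hot_ks.readings (sensor_id, ts, payload) "
    "VALUES (?, ?, ?) USING TTL 2592000"
)
# Cold/archive copy is written without a TTL.
cold_insert = (
    "INSERT INTO cold_ks.readings (sensor_id, ts, payload) "
    "VALUES (?, ?, ?)"
)

batch_cql = logged_batch([hot_insert, cold_insert])
print(batch_cql)
```

The overhead comes from the coordinator first writing the batch to the batchlog on other nodes before applying it, so it is worth measuring against plain, unbatched writes.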

On Mon, Sep 17, 2018 at 8:28 PM, Durity, Sean R <SEAN_R_DURITY@homedepot.com> wrote:

> An idea:
>
> On initial insert, insert into 2 tables:
> Hot with short TTL
> Cold/archive with a longer (or no) TTL
> Then your hot data is always in the same table, but being expired. And you
> can access the archive table only for the more rare circumstances. Then you
> could have the HOT table on a different volume of faster storage. If the
> hot/cold tables are in different keyspaces, then you could also have
> different replication (a HOT DC and an archive DC, for example)
>
>
> Sean Durity

RE: [EXTERNAL] Re: cold vs hot data

Posted by "Durity, Sean R" <SE...@homedepot.com>.
An idea:

On initial insert, write to two tables:
- Hot, with a short TTL
- Cold/archive, with a longer (or no) TTL

Your hot data is then always in the same table and simply expires, and you access the archive table only in the rarer circumstances. You could then put the HOT table on a different volume of faster storage. If the hot/cold tables are in different keyspaces, you could also use different replication (a HOT DC and an archive DC, for example).


Sean Durity
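
As a rough sketch of the two-table idea above (not a tested design): the Python snippet below only builds CQL strings, so it runs without a cluster. The keyspace names (hot_ks, cold_ks), data-center names (dc_hot, dc_archive), the readings table with its columns, the replica counts and the 30-day hot window are all assumptions made up for illustration. default_time_to_live is the table-level TTL option; 0 means rows do not expire.

```python
# Sketch only: keyspace, table, column and data-center names are hypothetical.

HOT_TTL_SECONDS = 30 * 24 * 3600  # assumed 30-day "hot" window


def keyspace_ddl(name, dc_replicas):
    """CREATE KEYSPACE using NetworkTopologyStrategy with per-DC replica counts."""
    opts = ", ".join(f"'{dc}': {n}" for dc, n in dc_replicas.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {opts}}};"
    )


def table_ddl(keyspace, table, default_ttl):
    """CREATE TABLE with a table-level default TTL (0 means rows never expire)."""
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} ("
        "sensor_id text, ts timestamp, payload blob, "
        "PRIMARY KEY (sensor_id, ts)) "
        f"WITH default_time_to_live = {default_ttl};"
    )


statements = [
    keyspace_ddl("hot_ks", {"dc_hot": 3}),       # hot keyspace lives in the hot DC
    keyspace_ddl("cold_ks", {"dc_archive": 2}),  # archive keyspace lives in the archive DC
    table_ddl("hot_ks", "readings", HOT_TTL_SECONDS),
    table_ddl("cold_ks", "readings", 0),
]

for stmt in statements:
    print(stmt)
```

With this layout each row is stored in both tables until the hot copy expires, so the hot window directly sets how much extra storage is needed.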



Re: cold vs hot data

Posted by Mateusz <ma...@ant.gliwice.pl>.
On Friday, 14 September 2018 02:46:43 CEST Alaa Zubaidi (PDF) wrote:
> The data can grow to +100TB however the hot data will be in most cases less
> than 10TB but we still need to keep the rest of data accessible.
> Anyone has this problem?
> What is the best way to make the cluster more efficient?
> Is there a way to somehow automatically move the old data to different
> storage (rack, dc, etc)?
> Any ideas?

We solved it using lvmcache.

-- 
Mateusz 
(...) I have a brother - serious, a homebody, a miser, a hypocrite,
	sanctimonious, in short - a pillar of society."
		Nikos Kazantzakis - "Zorba the Greek"
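
For readers who have not used lvmcache: it puts a fast device (typically an SSD) in front of a slower logical volume as a block-level cache. The sketch below only prints a typical cache-pool command sequence so it can be reviewed; it does not execute anything. The volume group, logical volume and device names, the cache size and the cache mode are assumptions, not details of the setup described above.

```python
# Sketch only: volume group, logical volume and device names are hypothetical.
import shlex


def lvmcache_commands(vg, origin_lv, ssd_device, pool_name, pool_size, mode):
    """Return the LVM commands that attach an SSD cache pool to an existing LV."""
    return [
        # Make the SSD a physical volume and add it to the existing volume group.
        ["pvcreate", ssd_device],
        ["vgextend", vg, ssd_device],
        # Create the cache pool on the SSD only.
        ["lvcreate", "--type", "cache-pool", "-L", pool_size,
         "-n", pool_name, vg, ssd_device],
        # Attach the cache pool to the slow origin LV holding the data.
        ["lvconvert", "--type", "cache", "--cachemode", mode,
         "--cachepool", f"{vg}/{pool_name}", f"{vg}/{origin_lv}"],
    ]


commands = lvmcache_commands(
    vg="vg_cassandra", origin_lv="data", ssd_device="/dev/nvme0n1",
    pool_name="data_cache", pool_size="500G", mode="writethrough",
)
for argv in commands:
    print(shlex.join(argv))  # printed for review, not executed
```

Because the cache works on blocks rather than tables, it needs no schema or application changes; whether it holds the right blocks depends on the access pattern.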






Re: cold vs hot data

Posted by Jens Rantil <je...@tink.se>.
I guess the OS-level page cache will also help implicitly, making sure
reads of your frequently used pages aren't touching disk.
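
A back-of-the-envelope way to judge how far the page cache can go. Every number here (30 nodes, replication factor 3, 256 GB RAM per node, 32 GB reserved for the JVM heap and other processes) is a made-up assumption; only the 10 TB hot-set figure comes from the original question. The estimate also ignores compression and uneven access patterns.

```python
# Sketch only: node count, replication factor and RAM figures are hypothetical.

def hot_data_per_node_tb(hot_tb, replication_factor, nodes):
    """Hot data each node holds, assuming replicas are spread evenly."""
    return hot_tb * replication_factor / nodes


def cacheable_fraction(per_node_hot_tb, ram_gb, reserved_gb):
    """Rough share of a node's hot data that could sit in the page cache."""
    free_for_cache_tb = (ram_gb - reserved_gb) / 1024
    return min(1.0, free_for_cache_tb / per_node_hot_tb)


per_node = hot_data_per_node_tb(hot_tb=10, replication_factor=3, nodes=30)
fraction = cacheable_fraction(per_node, ram_gb=256, reserved_gb=32)
print(f"hot data per node: {per_node:.2f} TB, cacheable: {fraction:.0%}")
```

If the fraction comes out well below 1, the page cache alone will not keep the hot set off disk, and a faster storage tier for the hot data matters more.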

On Fri, Sep 14, 2018 at 2:46 AM Alaa Zubaidi (PDF) <al...@pdf.com>
wrote:

> Hi,
>
> We are using Apache Cassandra 3.11.2 on RedHat 7
> The data can grow to +100TB however the hot data will be in most cases
> less than 10TB but we still need to keep the rest of data accessible.
> Anyone has this problem?
> What is the best way to make the cluster more efficient?
> Is there a way to somehow automatically move the old data to different
> storage (rack, dc, etc)?
> Any ideas?
>
> Regards,
>
> --
>
> Alaa
>



-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


Re: cold vs hot data

Posted by "Alaa Zubaidi (PDF)" <al...@pdf.com>.
Thanks Ben, I will try this on 4.0 when it's available.


On Thu, Sep 13, 2018 at 7:06 PM, Ben Slater <be...@instaclustr.com>
wrote:

> Not quite a solution but you will probably be interested in the discussion
> on this ticket: https://issues.apache.org/jira/browse/CASSANDRA-8460
>
> --
>
>
> *Ben Slater*
>
> *Chief Product Officer <https://www.instaclustr.com/>*
>
> Read our latest technical blog posts here
> <https://www.instaclustr.com/blog/>.
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>



-- 

Alaa Zubaidi
PDF Solutions, Inc.
333 West San Carlos Street, Suite 1000
San Jose, CA 95110  USA
Tel: 408-283-5639
fax: 408-938-6479
email: alaa.zubaidi@pdf.com


Re: cold vs hot data

Posted by Ben Slater <be...@instaclustr.com>.
Not quite a solution but you will probably be interested in the discussion
on this ticket: https://issues.apache.org/jira/browse/CASSANDRA-8460

On Fri, 14 Sep 2018 at 10:46 Alaa Zubaidi (PDF) <al...@pdf.com>
wrote:

> Hi,
>
> We are using Apache Cassandra 3.11.2 on RedHat 7
> The data can grow to +100TB however the hot data will be in most cases
> less than 10TB but we still need to keep the rest of data accessible.
> Anyone has this problem?
> What is the best way to make the cluster more efficient?
> Is there a way to somehow automatically move the old data to different
> storage (rack, dc, etc)?
> Any ideas?
>
> Regards,
>
> --
>
> Alaa
>

-- 


*Ben Slater*

*Chief Product Officer <https://www.instaclustr.com/>*


Read our latest technical blog posts here
<https://www.instaclustr.com/blog/>.
