Posted to user@cassandra.apache.org by Venkata Hari Krishna Nukala <n....@gmail.com> on 2018/04/09 20:00:53 UTC

Re: Can "data_file_directories" make use of multiple disks?

I spent some time in the code (trunk) to understand it better. If I understood
it correctly, the DiskBoundaryManager.getDiskBoundaries() method does the
partitioning, and it has nothing to do with the compaction strategy. Is that
correct?

cassandra.yaml states that "Directories where Cassandra should store data
on disk. Cassandra will spread data evenly across them, *subject to the
granularity of the configured compaction strategy.*". I feel it is not
correct anymore.  Is it worth updating the doc?
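
To make the idea of partitioning by token range concrete, here is a minimal
sketch. It is not Cassandra's code: it assumes the Murmur3Partitioner token
range (-2**63 to 2**63 - 1) and splits the whole range evenly, whereas a real
node splits only the token ranges it owns. The function names are hypothetical.

```python
from bisect import bisect_left

# Token range of Murmur3Partitioner (assumed here as the partitioner in use).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def disk_boundaries(num_disks):
    """Return the inclusive upper token bound for each disk.

    Simplification: the full token range is split into equal parts.
    """
    span = MAX_TOKEN - MIN_TOKEN + 1
    return [MIN_TOKEN + span * (i + 1) // num_disks - 1 for i in range(num_disks)]


def disk_for_token(token, bounds):
    """Return the index of the first disk whose upper bound is >= token."""
    return bisect_left(bounds, token)


bounds = disk_boundaries(2)
print(bounds)                       # upper bound per data directory
print(disk_for_token(-42, bounds))  # index of the directory for token -42
print(disk_for_token(42, bounds))   # index of the directory for token 42
```

In this sketch the choice of disk depends only on the token and the number of
data directories, not on the compaction strategy.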



On Tue, Mar 27, 2018 at 9:59 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> In Cassandra 3.2 and later, data is partitioned by token range, which
> should give you even distribution of data.
>
> If you're going to go into 3.x, please use the latest 3.11, which at this
> time is 3.11.2.
>
>
> On Tue, Mar 27, 2018 at 8:05 AM Venkata Hari Krishna Nukala <
> n.v.harikrishna.apache@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to replace machines that have HDDs with slightly more powerful
>> machines that have SSDs in production. Each node holds around 300 GB of data,
>> but the newer machines have 2 x 200 GB SSDs instead of a single disk.
>>
>> "data_file_directories" looks like a multi-valued config which I can use.
>> Am I looking at the right config?
>>
>> How is the data distributed evenly? Leveled Compaction Strategy is
>> used for the tables.
>>
>> Thanks!
>>
>
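
For the two-SSD setup described in the original question, the setting takes one
entry per disk. A minimal cassandra.yaml fragment, assuming the SSDs are mounted
at the hypothetical paths /mnt/ssd1 and /mnt/ssd2:

```yaml
# Hypothetical mount points; replace with the actual SSD mount paths.
data_file_directories:
    - /mnt/ssd1/cassandra/data
    - /mnt/ssd2/cassandra/data
```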

Re: Can "data_file_directories" make use of multiple disks?

Posted by Venkata Hari Krishna Nukala <n....@gmail.com>.
Paulo, thanks for the confirmation. I have raised a ticket for this:

https://issues.apache.org/jira/browse/CASSANDRA-14372



On Tue, Apr 10, 2018 at 2:37 AM, Paulo Motta <pa...@gmail.com>
wrote:

> > cassandra.yaml states that "Directories where Cassandra should store
> > data on disk. Cassandra will spread data evenly across them, subject to the
> > granularity of the configured compaction strategy.". I feel it is not
> > correct anymore.  Is it worth updating the doc?
>
> In fact this changed after CASSANDRA-6696, but the comment in
> cassandra.yaml (from which the docs are generated) was never updated.
> Would you mind opening a ticket to fix this comment? Thanks.

Re: Can "data_file_directories" make use of multiple disks?

Posted by Paulo Motta <pa...@gmail.com>.
> cassandra.yaml states that "Directories where Cassandra should store data on disk. Cassandra will spread data evenly across them, subject to the granularity of the configured compaction strategy.". I feel it is not correct anymore.  Is it worth updating the doc?

In fact this changed after CASSANDRA-6696, but the comment in
cassandra.yaml (from which the docs are generated) was never updated.
Would you mind opening a ticket to fix this comment? Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org