Posted to solr-user@lucene.apache.org by Salman Akram <sa...@northbaysolutions.net> on 2014/03/18 14:12:11 UTC

Best SSD block size for large SOLR indexes

All,

Is there a rule of thumb for the ideal block size on SSDs that hold large
indexes (hundreds of GB)? Read performance is our top priority, and we can
sacrifice a little space.

This is the drive we just got, and we wanted to see whether any test results
are out there:
http://www.storagereview.com/micron_p420m_enterprise_pcie_ssd_review

-- 
Regards,

Salman Akram

Re: Best SSD block size for large SOLR indexes

Posted by Salman Akram <sa...@northbaysolutions.net>.
We do have a couple of commodity SSDs already and they perform well. However,
our user queries are very complex and quite a few of them take over a minute,
so we really had to do something about it.

Compared with putting the whole index in RAM, this beast still seemed the
better option. Also, we are already using some top-notch servers.


On Wed, Mar 19, 2014 at 1:52 AM, Toke Eskildsen <te...@statsbiblioteket.dk> wrote:

> Salman Akram [salman.akram@northbaysolutions.net] wrote:
>
> [Hundreds of GB index]
>
> > http://www.storagereview.com/micron_p420m_enterprise_pcie_ssd_review
>
> May I ask why you have chosen a drive with such a high speed and matching
> cost?
>
> We have some years of experience with using SSDs for search at work and it
> is our experience that commodity SSDs perform very well (one test showed
> something like 80% of RAM speed, YMMV). It seems to me that more servers
> with commodity SSDs could very well be cheaper and give better throughput
> than the beast(s) you're using. Are you trying to minimize latency "at all
> cost"?
>
> Regards,
> Toke Eskildsen




-- 
Regards,

Salman Akram

RE: Best SSD block size for large SOLR indexes

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
Salman Akram [salman.akram@northbaysolutions.net] wrote:

[Hundreds of GB index]

> http://www.storagereview.com/micron_p420m_enterprise_pcie_ssd_review

May I ask why you have chosen a drive with such a high speed and matching cost?

We have some years of experience with using SSDs for search at work and it is our experience that commodity SSDs perform very well (one test showed something like 80% of RAM speed, YMMV). It seems to me that more servers with commodity SSDs could very well be cheaper and give better throughput than the beast(s) you're using. Are you trying to minimize latency "at all cost"?

Regards,
Toke Eskildsen

Re: Best SSD block size for large SOLR indexes

Posted by Salman Akram <sa...@northbaysolutions.net>.
For now I am going with 64KB and the results seem good. Thanks for the useful
feedback.


On Wed, Mar 19, 2014 at 9:30 PM, Shawn Heisey <so...@elyograg.org> wrote:

> On 3/19/2014 12:09 AM, Salman Akram wrote:
>
>> Thanks for the info. The articles were really useful, but it still seems I
>> have to do my own testing to find the right page size? I thought for large
>> indexes there would already be some tests done in the Solr community.
>>
>> Side note: We are heavily using Microsoft technology (.NET etc) for
>> development, so after looking at all the pros/cons we decided to stick with
>> Windows. It wasn't rude ;)
>>
>
> Assuming you are only going to be putting Solr data on it, or anything
> else you put on it will also consist of large files, I would probably go
> with a cluster size of at least 64KB for an NTFS volume, and I might
> consider 128KB or 256KB.  There *ARE* a few small files in a Solr index,
> but not enough of them for the wasted space to become a problem.
>
> The easiest way to configure Solr to use a different location than the
> program directory is to change the solr home.
>
> Thanks,
> Shawn
>
>


-- 
Regards,

Salman Akram

Re: Best SSD block size for large SOLR indexes

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/19/2014 12:09 AM, Salman Akram wrote:
> Thanks for the info. The articles were really useful, but it still seems I
> have to do my own testing to find the right page size? I thought for large
> indexes there would already be some tests done in the Solr community.
>
> Side note: We are heavily using Microsoft technology (.NET etc) for
> development, so after looking at all the pros/cons we decided to stick with
> Windows. It wasn't rude ;)

Assuming you are only going to be putting Solr data on it, or anything
else you put on it will also consist of large files, I would probably go
with a cluster size of at least 64KB for an NTFS volume, and I might
consider 128KB or 256KB.  There *ARE* a few small files in a Solr index,
but not enough of them for the wasted space to become a problem.
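
To put a rough number on that wasted space, here is a minimal sketch that
rounds each file up to a whole number of clusters and sums the slack. The
file sizes are hypothetical, not measurements from a real index, and the
model ignores NTFS details such as very small files stored inside the MFT.

```python
def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Space a file occupies when rounded up to whole clusters (simplified model)."""
    # Integer ceiling division avoids floating-point error on very large files.
    return -(-file_size // cluster_size) * cluster_size


def slack_bytes(file_sizes, cluster_size: int) -> int:
    """Total bytes allocated but not used across all files."""
    return sum(allocated_bytes(size, cluster_size) - size for size in file_sizes)


# Hypothetical index: a few tiny metadata files plus two large segment files.
sizes = [100, 2_000, 50_000, 5 * 1024**3, 20 * 1024**3]

for kb in (4, 64, 128, 256):
    cluster = kb * 1024
    print(f"{kb:>3} KB clusters: {slack_bytes(sizes, cluster):>9} bytes of slack")
```

With only a handful of small files, the slack stays at a few hundred KB
even with 256KB clusters, which is negligible next to an index of hundreds
of GB.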

The easiest way to configure Solr to use a different location than the 
program directory is to change the solr home.
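
One common way to change the solr home is the `solr.solr.home` Java system
property. The sketch below only builds the launch command; it assumes a
setup that starts Solr through a `start.jar` launcher, and the path
`D:\solr` is a hypothetical location on the SSD volume, so adjust both to
your installation.

```python
def solr_start_command(solr_home: str, start_jar: str = "start.jar") -> list:
    """Build a java command line that points Solr at a custom solr home.

    Assumption: Solr is launched via a start.jar file in the current
    directory; solr_home is wherever the index should live.
    """
    return ["java", f"-Dsolr.solr.home={solr_home}", "-jar", start_jar]


# Hypothetical solr home on the SSD volume.
print(" ".join(solr_start_command(r"D:\solr")))
```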

Thanks,
Shawn


Re: Best SSD block size for large SOLR indexes

Posted by Salman Akram <sa...@northbaysolutions.net>.
Thanks for the info. The articles were really useful, but it still seems I
have to do my own testing to find the right page size? I thought for large
indexes there would already be some tests done in the Solr community.

Side note: We are heavily using Microsoft technology (.NET etc) for
development, so after looking at all the pros/cons we decided to stick with
Windows. It wasn't rude ;)


On Tue, Mar 18, 2014 at 7:22 PM, Shawn Heisey <so...@elyograg.org> wrote:

> On 3/18/2014 7:39 AM, Salman Akram wrote:
> > This SSD's default size seems to be 4K, not 16K (as can be seen below).
> >
> > Bytes Per Sector  :               512
> > Bytes Per Physical Sector :       4096
> > Bytes Per Cluster :               4096
> > Bytes Per FileRecord Segment    : 1024
>
> The *sector* size on a typical SSD is 4KB, but the *page* size is a
> lower level detail, and is more likely to be 16KB, especially on a very
> large SSD.
>
> The Micron P420m is actually mentioned specifically in the SSD article I
> linked, and a table in part 2 states that its page size is 16KB, with a
> block size of 8MB.
>
> Possibly rude side note: Windows? Really?
>
> Thanks,
> Shawn
>
>


-- 
Regards,

Salman Akram

Re: Best SSD block size for large SOLR indexes

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/18/2014 7:39 AM, Salman Akram wrote:
> This SSD's default size seems to be 4K, not 16K (as can be seen below).
> 
> Bytes Per Sector  :               512
> Bytes Per Physical Sector :       4096
> Bytes Per Cluster :               4096
> Bytes Per FileRecord Segment    : 1024

The *sector* size on a typical SSD is 4KB, but the *page* size is a
lower level detail, and is more likely to be 16KB, especially on a very
large SSD.

The Micron P420m is actually mentioned specifically in the SSD article I
linked, and a table in part 2 states that its page size is 16KB, with a
block size of 8MB.
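
To make the relationship between these sizes concrete, here is a small
sketch that parses the volume report quoted above and compares the cluster
size with the page and block sizes from the article. The 16KB and 8MB
figures are taken from the article's table, not measured; `parse_volume_report`
is an illustrative helper, not part of any tool.

```python
def parse_volume_report(text: str) -> dict:
    """Parse 'Name : value' lines with integer values into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value.strip())
    return values


REPORT = """\
Bytes Per Sector  :               512
Bytes Per Physical Sector :       4096
Bytes Per Cluster :               4096
Bytes Per FileRecord Segment    : 1024
"""

PAGE_SIZE = 16 * 1024          # SSD page size, per the article's table
BLOCK_SIZE = 8 * 1024 * 1024   # SSD block size, per the article's table

info = parse_volume_report(REPORT)
cluster = info["Bytes Per Cluster"]

print("clusters per SSD page:", PAGE_SIZE // cluster)      # 4
print("SSD pages per SSD block:", BLOCK_SIZE // PAGE_SIZE)  # 512
```

With the default 4KB cluster, four filesystem clusters share one SSD page,
which is why a cluster size that is a multiple of the page size is worth
considering.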

Possibly rude side note: Windows? Really?

Thanks,
Shawn


Re: Best SSD block size for large SOLR indexes

Posted by Salman Akram <sa...@northbaysolutions.net>.
This SSD's default size seems to be 4K, not 16K (as can be seen below).

Bytes Per Sector  :               512
Bytes Per Physical Sector :       4096
Bytes Per Cluster :               4096
Bytes Per FileRecord Segment    : 1024

I will go through the articles you sent. Thanks


On Tue, Mar 18, 2014 at 6:31 PM, Shawn Heisey <so...@elyograg.org> wrote:

> On 3/18/2014 7:12 AM, Salman Akram wrote:
> > Is there a rule of thumb for ideal block size for SSDs for large indexes
> > (in hundreds of GBs)? Read performance is of top importance for us and we
> > can sacrifice the space a little...
> >
> > This is the one we just got and wanted to see if there are any test results
> > out there
> > http://www.storagereview.com/micron_p420m_enterprise_pcie_ssd_review
>
> The best filesystem block size to use for SSDs is dictated more by the
> characteristics of the SSD itself than by the data you put on it.
>
> Here's an awesome series of articles about SSDs that I heard about from
> Shalin Shekhar Mangar:
>
>
> http://codecapsule.com/2014/02/12/coding-for-ssds-part-1-introduction-and-table-of-contents/
>
> With the page size of most large SSDs at 16KB, you might want to go with
> a multiple of that, like 64KB, and learn about the proper use of parted
> to align partition boundaries.
>
> As for whether there are Solr settings that can improve the I/O
> characteristics when reading/writing, that I do not know.
>
> Thanks,
> Shawn
>
>


-- 
Regards,

Salman Akram

Re: Best SSD block size for large SOLR indexes

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/18/2014 7:12 AM, Salman Akram wrote:
> Is there a rule of thumb for ideal block size for SSDs for large indexes
> (in hundreds of GBs)? Read performance is of top importance for us and we
> can sacrifice the space a little...
> 
> This is the one we just got and wanted to see if there are any test results
> out there
> http://www.storagereview.com/micron_p420m_enterprise_pcie_ssd_review

The best filesystem block size to use for SSDs is dictated more by the
characteristics of the SSD itself than by the data you put on it.

Here's an awesome series of articles about SSDs that I heard about from
Shalin Shekhar Mangar:

http://codecapsule.com/2014/02/12/coding-for-ssds-part-1-introduction-and-table-of-contents/

With the page size of most large SSDs at 16KB, you might want to go with
a multiple of that, like 64KB, and learn about the proper use of parted
to align partition boundaries.
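
As a minimal sketch of what "a multiple of the page size" and "aligned"
mean here, the helpers below check a candidate filesystem block size and a
partition start offset against an assumed 16KB page. The function names and
example offsets are illustrative only; check the real page size of your
drive before relying on any of this.

```python
PAGE_SIZE = 16 * 1024  # assumed SSD page size in bytes


def is_multiple_of_page(block_size: int, page_size: int = PAGE_SIZE) -> bool:
    """True if the filesystem block size is a whole number of SSD pages."""
    return block_size % page_size == 0


def is_aligned(partition_offset: int, unit: int = PAGE_SIZE) -> bool:
    """True if the partition starts on a boundary of the given unit (bytes)."""
    return partition_offset % unit == 0


def next_aligned_offset(offset: int, unit: int = PAGE_SIZE) -> int:
    """Smallest offset >= the given one that sits on a unit boundary."""
    return -(-offset // unit) * unit


for kb in (4, 16, 64, 128):
    print(f"{kb:>3} KB block is a page multiple: {is_multiple_of_page(kb * 1024)}")

legacy_start = 63 * 512      # old-style partition start at sector 63
modern_start = 1024 * 1024   # partition start at 1 MiB

print("sector-63 start aligned:", is_aligned(legacy_start))
print("1 MiB start aligned:", is_aligned(modern_start))
print("next aligned offset after sector 63:", next_aligned_offset(legacy_start))
```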

As for whether there are Solr settings that can improve the I/O
characteristics when reading/writing, that I do not know.

Thanks,
Shawn