Posted to solr-user@lucene.apache.org by Sabine Forkel <Sa...@de.ibm.com> on 2017/05/11 08:56:46 UTC

How to Reduce Block Size in Solr 6.3.0 Running on HDFS

Hi,

Since my files are small, I set the HDFS block size to 16 MB in
hdfs-site.xml:

  <property>
    <name>dfs.blocksize</name>
    <value>16m</value>
  </property>

After restarting the HDFS daemons and the SolrCloud nodes, new files are
still created with a 128 MB block size.
Could this be caused by the block cache slab size, which is also 128 MB?
How can the HDFS block size be reduced?
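
For reference, dfs.blocksize is applied by the HDFS client that writes
the file, so the hdfs-site.xml visible to that client (here, the Solr
process) is the one that matters. Below is a minimal, untested Java
sketch (the class name and path argument are placeholders) that prints
the block size a client configuration resolves to, and the block size
recorded for an existing file:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class BlockSizeCheck {
      public static void main(String[] args) throws Exception {
          // Reads core-site.xml/hdfs-site.xml from the client's classpath;
          // dfs.blocksize is applied by the writing client, not the datanodes.
          Configuration conf = new Configuration();
          System.out.println("client dfs.blocksize = "
                  + conf.getLongBytes("dfs.blocksize", 128L * 1024 * 1024));

          // Block size actually recorded for an existing file:
          FileSystem fs = FileSystem.get(conf);
          Path p = new Path(args[0]);  // e.g. a file under the Solr index dir
          System.out.println(p + ": "
                  + fs.getFileStatus(p).getBlockSize() + " bytes");
      }
  }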

Thanks,
Sabine

Re: How to Reduce Block Size in Solr 6.3.0 Running on HDFS

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/11/2017 2:56 AM, Sabine Forkel wrote:
> Since my files are small, I set the HDFS block size to 16 MB in
> hdfs-site.xml:
>
>   <property>
>     <name>dfs.blocksize</name>
>     <value>16m</value>
>   </property>
>
> After restarting the HDFS daemons and the SolrCloud nodes, new files are
> still created with a 128 MB block size.
> Could this be caused by the block cache slab size, which is also 128 MB?
> How can the HDFS block size be reduced?

Since HDFS is a foreign world to me, I don't know the answer to the
question you have asked ... but I think the question indicates a
fundamental misunderstanding of exactly what Solr is going to put in
your HDFS.  I'm really curious to find out what files you are talking
about that are small.

The files in a Solr (Lucene) index may start out small, but they will
ultimately be combined into bigger files as you index. The size of the
documents that you are indexing has no influence at all on the ultimate
size of the files that Lucene creates by merging.
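
As an aside, the eventual segment size is governed by the merge policy,
not by document size.  If smaller index files were actually the goal,
the relevant knob would be the merge policy in solrconfig.xml.  A sketch
with an arbitrary example value (512 MB is illustrative, not a
recommendation):

  <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
    <double name="maxMergedSegmentMB">512</double>
  </mergePolicyFactory>

Note that this cap applies to normal background merging; an explicit
optimize (forceMerge) can still produce segments larger than this.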

If your index is the kind where running an occasional optimize is
warranted, then an optimize will combine the entire Lucene index into a
single segment consisting of maybe a dozen files.  Most of those files
will be large.
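
To make "maybe a dozen files" concrete, a hypothetical single-segment
index directory might look something like the listing below; the exact
set of extensions depends on the codec and on which features (term
vectors, docValues, compound file format) are in use:

  segments_2
  write.lock
  _4.si
  _4.fnm
  _4.fdt
  _4.fdx
  _4.tim
  _4.tip
  _4.doc
  _4.pos
  _4.nvd
  _4.nvm
  _4.dvd
  _4.dvm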

Thanks,
Shawn