Posted to dev@lucene.apache.org by "Mike Drob (JIRA)" <ji...@apache.org> on 2014/12/31 17:53:13 UTC

[jira] [Comment Edited] (LUCENE-6153) randomize stored fields/vectors index blocksize

    [ https://issues.apache.org/jira/browse/LUCENE-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262297#comment-14262297 ] 

Mike Drob edited comment on LUCENE-6153 at 12/31/14 4:52 PM:
-------------------------------------------------------------

{code:title=CompressingStoredFieldsFormat.java}
+    if (blockSize < 1) {
{code}
{code:title=CompressingStoredFieldsIndexWriter.java}
+    if (blockSize <= 0) {
{code}
{code:title=CompressingTermVectorsFormat.java}
+    if (blockSize < 1) {
{code}

It would be nice for these to be consistent.
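
One way to get there might be a single shared check that all three classes call. This is only a sketch: `BlockSizeChecks` and `checkBlockSize` are hypothetical names, not part of the patch or of Lucene. For an int, `blockSize < 1` and `blockSize <= 0` reject exactly the same values, so the difference is about readability rather than behavior.

```java
// Hypothetical helper, not part of the patch: one place for the block size check.
final class BlockSizeChecks {
  private BlockSizeChecks() {}

  /** Returns blockSize unchanged if it is at least 1, otherwise throws. */
  static int checkBlockSize(int blockSize) {
    if (blockSize < 1) {
      throw new IllegalArgumentException("blockSize must be >= 1, got " + blockSize);
    }
    return blockSize;
  }
}
```

Each constructor could then start with something like `this.blockSize = BlockSizeChecks.checkBlockSize(blockSize);` so the condition and the error message only exist once.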

{code:title=Lucene50StoredFieldsFormat.java}
-        return new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", CompressionMode.FAST, 1 << 14, 128);
+        return new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", CompressionMode.FAST, 1 << 14, 128, 1024);
{code}
Can we have a constant for the default block size (1024)? We might as well also have constants for whatever 1 << 14 and 128 are, but that can be a follow-on issue.
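
A rough sketch of what I mean. The class and constant names are made up for illustration, and I am assuming the two existing arguments are the chunk size and the max docs per chunk; that should be checked against the constructor before naming anything.

```java
// Hypothetical constants holder; names are illustrative, not from the patch.
final class FastModeDefaults {
  private FastModeDefaults() {}

  // Assumed meaning: chunk size in bytes (1 << 14 == 16384).
  static final int CHUNK_SIZE = 1 << 14;

  // Assumed meaning: maximum number of documents per chunk.
  static final int MAX_DOCS_PER_CHUNK = 128;

  // Default index block size, the new fifth argument in the patch.
  static final int BLOCK_SIZE = 1024;

  // The call in Lucene50StoredFieldsFormat would then read roughly:
  //   new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", CompressionMode.FAST,
  //       CHUNK_SIZE, MAX_DOCS_PER_CHUNK, BLOCK_SIZE);
}
```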


> randomize stored fields/vectors index blocksize
> -----------------------------------------------
>
>                 Key: LUCENE-6153
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6153
>             Project: Lucene - Core
>          Issue Type: Test
>            Reporter: Robert Muir
>         Attachments: LUCENE-6153.patch
>
>
> The Compressing impls compress documents into chunks. We then record index data for every N chunks, which is binary searched to find the start of the chunk. Today N is always 1024.
> This means to test the stored fields index well, we need to index thousands and thousands of documents. But if we randomize the parameter, we can test it more effectively by setting it to very low values (e.g. 5) in tests.
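
To illustrate the point in the description, here is a toy model of an index that keeps one entry per N chunks and binary searches it. It is not Lucene's actual index format; `ToyChunkIndex` and its methods are invented for this sketch, and it assumes chunk start doc IDs are given in ascending order.

```java
import java.util.Arrays;

// Toy model only: one index entry per blockSize chunks, looked up by binary search.
final class ToyChunkIndex {
  // First doc ID covered by each block of chunks, in ascending order.
  private final int[] blockStartDocs;

  /** chunkStartDocs: ascending first doc ID of every chunk; blockSize: chunks per index entry. */
  ToyChunkIndex(int[] chunkStartDocs, int blockSize) {
    if (blockSize < 1) {
      throw new IllegalArgumentException("blockSize must be >= 1, got " + blockSize);
    }
    int numBlocks = (chunkStartDocs.length + blockSize - 1) / blockSize; // ceiling division
    blockStartDocs = new int[numBlocks];
    for (int b = 0; b < numBlocks; b++) {
      blockStartDocs[b] = chunkStartDocs[b * blockSize];
    }
  }

  int numBlocks() {
    return blockStartDocs.length;
  }

  /** Returns the position of the block whose doc range contains docID. */
  int findBlock(int docID) {
    int pos = Arrays.binarySearch(blockStartDocs, docID);
    // On a miss, binarySearch returns -(insertionPoint) - 1; we want the entry just before it.
    int block = pos >= 0 ? pos : -pos - 2;
    if (block < 0) {
      throw new IllegalArgumentException("docID " + docID + " precedes the first block");
    }
    return block;
  }
}
```

With 20 chunks, a block size of 1024 yields a single index entry, so the binary search never has to choose between entries. A block size of 5 yields 4 entries from the same 20 chunks, which is the kind of coverage the description is asking for without indexing thousands of documents.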


