Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2008/01/02 18:31:33 UTC

[jira] Created: (LUCENE-1115) Some small fixes to contrib/benchmark

Some small fixes to contrib/benchmark
-------------------------------------

                 Key: LUCENE-1115
                 URL: https://issues.apache.org/jira/browse/LUCENE-1115
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 2.3
            Reporter: Michael McCandless
            Assignee: Michael McCandless
            Priority: Minor
             Fix For: 2.3


I've fixed a few small issues I've hit in contrib/benchmark.

First, this alg was only doing work on the first round.  All
subsequent rounds immediately finished:

{code}
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
work.dir = /lucene/work
docs.file=work/reuters.lines.txt
doc.maker.forever=false
directory=FSDirectory
doc.add.log.step=3000

{ "Rounds"
  ResetSystemErase
  CreateIndex
  { "AddDocs" AddDoc > : *
  CloseIndex
  NewRound
} : 3
{code}

I think this is because we are failing to reset "exhausted" to false
in PerfTask.doLogic(), so I added that.  Plus I had to re-open the
file in LineDocMaker.
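
A minimal sketch of that failure mode, assuming a much simplified model of
the task loop. FiniteSource, ExhaustibleLoop and ExhaustedResetSketch are
hypothetical names for illustration, not benchmark classes, and which
component triggers the reset in the real code is glossed over here. With a
stale "exhausted" flag only the first round does work; clearing the flag and
re-opening the input on entry lets every round run:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a doc maker that runs dry (doc.maker.forever=false).
class FiniteSource {
    private final List<String> docs;
    private Iterator<String> it;

    FiniteSource(List<String> docs) {
        this.docs = docs;
        this.it = docs.iterator();
    }

    // Returns the next doc, or null once the input is used up.
    String next() {
        return it.hasNext() ? it.next() : null;
    }

    // Analogous to re-opening the line file: start reading from the top again.
    void reset() {
        it = docs.iterator();
    }
}

// Hypothetical stand-in for a "repeat until exhausted" sequence ({ AddDoc } : *).
class ExhaustibleLoop {
    private final FiniteSource source;
    private final boolean resetOnEntry;
    private boolean exhausted = false;

    ExhaustibleLoop(FiniteSource source, boolean resetOnEntry) {
        this.source = source;
        this.resetOnEntry = resetOnEntry;
    }

    // Runs one round and returns how many docs were consumed.
    int runRound() {
        if (resetOnEntry) {
            exhausted = false; // clear the flag left over from the previous round
            source.reset();    // and rewind the input
        }
        int count = 0;
        while (!exhausted) {
            if (source.next() == null) {
                exhausted = true;
            } else {
                count++;
            }
        }
        return count;
    }
}

class ExhaustedResetSketch {
    // Runs the same loop for several rounds and records the per-round doc count.
    static int[] runRounds(boolean resetOnEntry, int rounds) {
        FiniteSource source = new FiniteSource(Arrays.asList("a", "b", "c"));
        ExhaustibleLoop loop = new ExhaustibleLoop(source, resetOnEntry);
        int[] counts = new int[rounds];
        for (int i = 0; i < rounds; i++) {
            counts[i] = loop.runRound();
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runRounds(false, 3))); // stale flag
        System.out.println(Arrays.toString(runRounds(true, 3)));  // flag and input reset
    }
}
```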

Second, I made a small optimization: updateExhausted is now called only
when at least one child task is a TaskSequence or ResetInputsTask
(which I compute up-front).
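
The idea can be sketched roughly as follows. This is not the patch itself:
SketchTask, PlainTask, NestedSequence, ResetInputs and SequenceSketch are
hypothetical stand-ins, and the exhaustionChecks counter exists only to make
the effect visible. The constructor scans the children once, and the
per-task check is skipped entirely when no child could ever change the
exhausted state:

```java
// Hypothetical task types, for illustration only.
interface SketchTask {
    void run();
}

class PlainTask implements SketchTask {
    public void run() { /* ordinary work, can never exhaust the inputs */ }
}

class NestedSequence implements SketchTask {
    public void run() { /* a nested sequence may become exhausted */ }
}

class ResetInputs implements SketchTask {
    public void run() { /* resetting inputs changes the exhausted state */ }
}

class SequenceSketch {
    private final SketchTask[] children;
    // Computed once up-front: can any child affect the exhausted state?
    private final boolean anyExhaustableTasks;
    // Counts how often the (assumed costly) check runs; demo only.
    int exhaustionChecks = 0;

    SequenceSketch(SketchTask... children) {
        this.children = children;
        boolean any = false;
        for (SketchTask t : children) {
            if (t instanceof NestedSequence || t instanceof ResetInputs) {
                any = true;
                break;
            }
        }
        this.anyExhaustableTasks = any;
    }

    // Runs every child once, checking exhaustion only when it can matter.
    void runOnce() {
        for (SketchTask t : children) {
            t.run();
            if (anyExhaustableTasks) {
                updateExhausted(t);
            }
        }
    }

    private void updateExhausted(SketchTask t) {
        exhaustionChecks++;
    }
}
```

A sequence made only of PlainTask children never calls updateExhausted in
this sketch, while a sequence containing a single ResetInputs calls it after
every child.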

Finally, we were not allowing flushing by both RAM and doc count, so I
fixed the logic in Create/OpenIndexTask to set both RAMBufferSizeMB
and MaxBufferedDocs.
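
A rough sketch of the difference, with loud assumptions: FlushSettings and
FlushConfigSketch are hypothetical stand-ins rather than Lucene classes, the
DISABLED sentinel is invented for this example, and the either/or branch is
only a guess at the general shape of the old behavior, which this report
does not show. The point is that the second variant applies both triggers
instead of picking one:

```java
// Hypothetical holder for the two flush triggers; not a Lucene class.
class FlushSettings {
    static final int DISABLED = -1; // invented sentinel meaning "trigger off"
    double ramBufferSizeMB = DISABLED;
    int maxBufferedDocs = DISABLED;
}

class FlushConfigSketch {
    // Assumed shape of the old logic: one trigger wins, the other is never set.
    static FlushSettings eitherOr(double ramMB, int maxDocs) {
        FlushSettings s = new FlushSettings();
        if (ramMB > 0) {
            s.ramBufferSizeMB = ramMB;
        } else {
            s.maxBufferedDocs = maxDocs;
        }
        return s;
    }

    // Shape of the fix as described above: always apply both values.
    static FlushSettings both(double ramMB, int maxDocs) {
        FlushSettings s = new FlushSettings();
        s.ramBufferSizeMB = ramMB;
        s.maxBufferedDocs = maxDocs;
        return s;
    }
}
```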


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1115) Some small fixes to contrib/benchmark

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555453#action_12555453 ] 

Michael McCandless commented on LUCENE-1115:
--------------------------------------------

Awesome, I will add that test case.  Thanks Doron!



[jira] Resolved: (LUCENE-1115) Some small fixes to contrib/benchmark

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1115.
----------------------------------------

    Resolution: Fixed



[jira] Commented: (LUCENE-1115) Some small fixes to contrib/benchmark

Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555441#action_12555441 ] 

Doron Cohen commented on LUCENE-1115:
-------------------------------------

Definitely a bug.
Patch looks good, and I like the optimization. Thanks for fixing this, Mike.

Perhaps rename *anyExhaustedTasks* in TaskSequence to *anyExhaustableTasks*?

Also, this new test (belongs in TestPerfTaskLogic) passes with the fix but fails without it:
{code}
  /**
   * Test that exhaust in loop works as expected (LUCENE-1115).
   */
  public void testExhaustedLooped() throws Exception {
    // 1. alg definition (required in every "logic" test)
    String algLines[] = {
        "# ----- properties ",
        "doc.maker="+Reuters20DocMaker.class.getName(),
        "doc.add.log.step=3",
        "doc.term.vector=false",
        "doc.maker.forever=false",
        "directory=RAMDirectory",
        "doc.stored=false",
        "doc.tokenized=false",
        "debug.level=1",
        "# ----- alg ",
        "{ \"Rounds\"",
        "  ResetSystemErase",
        "  CreateIndex",
        "  { \"AddDocs\"  AddDoc > : * ",
        "  CloseIndex",
        "} : 2",
    };
    
    // 2. execute the algorithm  (required in every "logic" test)
    Benchmark benchmark = execBenchmark(algLines);

    // 3. test number of docs in the index
    IndexReader ir = IndexReader.open(benchmark.getRunData().getDirectory());
    int ndocsExpected = 20; // Reuters20DocMaker exhausts after 20 docs.
    assertEquals("wrong number of docs in the index!", ndocsExpected, ir.numDocs());
    ir.close();
  }
{code}

Cheers,
Doron



[jira] Updated: (LUCENE-1115) Some small fixes to contrib/benchmark

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1115:
---------------------------------------

    Attachment: LUCENE-1115.patch

Attached patch.  All tests pass.  I plan to commit in a day or so.
