Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2008/01/02 18:31:33 UTC
[jira] Created: (LUCENE-1115) Some small fixes to contrib/benchmark
Some small fixes to contrib/benchmark
-------------------------------------
Key: LUCENE-1115
URL: https://issues.apache.org/jira/browse/LUCENE-1115
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 2.3
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
Fix For: 2.3
I've fixed a few small issues I've hit in contrib/benchmark.
First, this alg was only doing work on the first round. All
subsequent rounds immediately finished:
{code}
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
work.dir = /lucene/work
docs.file=work/reuters.lines.txt
doc.maker.forever=false
directory=FSDirectory
doc.add.log.step=3000
{ "Rounds"
ResetSystemErase
CreateIndex
{ "AddDocs" AddDoc > : *
CloseIndex
NewRound
} : 3
{code}
I think this is because we are failing to reset "exhausted" to false
in PerfTask.doLogic(), so I added that. Plus I had to re-open the
file in LineDocMaker.
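To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the actual PerfTask or LineDocMaker code: FiniteSource and AddUntilExhausted are hypothetical stand-ins, and the resetFlagOnEntry switch exists only to compare the two behaviors side by side.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ExhaustedResetSketch {

    /** Hypothetical finite source; reset() plays the role of re-opening the line file. */
    static class FiniteSource {
        private final List<String> docs;
        private Iterator<String> it;

        FiniteSource(List<String> docs) {
            this.docs = docs;
            this.it = docs.iterator();
        }

        /** Returns the next doc, or null once the source has run out. */
        String next() {
            return it.hasNext() ? it.next() : null;
        }

        void reset() {
            it = docs.iterator();
        }
    }

    /** Hypothetical task that adds docs until the source runs out, like "AddDoc > : *". */
    static class AddUntilExhausted {
        private final FiniteSource source;
        private final boolean resetFlagOnEntry;
        private boolean exhausted = false;

        AddUntilExhausted(FiniteSource source, boolean resetFlagOnEntry) {
            this.source = source;
            this.resetFlagOnEntry = resetFlagOnEntry;
        }

        /** Returns the number of docs added in this round. */
        int doLogic() {
            if (resetFlagOnEntry) {
                exhausted = false; // without this, the flag from the previous round sticks
            }
            int added = 0;
            while (!exhausted) {
                if (source.next() == null) {
                    exhausted = true;
                } else {
                    added++;
                }
            }
            return added;
        }
    }

    /** Runs the task for several rounds, resetting the source before each round. */
    static int[] runRounds(int rounds, boolean resetFlagOnEntry) {
        FiniteSource source = new FiniteSource(Arrays.asList("doc1", "doc2", "doc3"));
        AddUntilExhausted task = new AddUntilExhausted(source, resetFlagOnEntry);
        int[] perRound = new int[rounds];
        for (int i = 0; i < rounds; i++) {
            source.reset();
            perRound[i] = task.doLogic();
        }
        return perRound;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runRounds(3, false))); // prints [3, 0, 0]
        System.out.println(Arrays.toString(runRounds(3, true)));  // prints [3, 3, 3]
    }
}
```

With the stale flag, only the first round does any work even though the source was reset, which matches the symptom described above.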
Second, I made a small optimization to not call updateExhausted unless
at least one of the child tasks is a TaskSequence or ResetInputsTask
(which I compute up-front).
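The idea can be sketched roughly as follows. This is an assumption-laden illustration, not the patch itself: Task, PlainTask, ResetInputs, Sequence, the anyExhaustibleTasks field and the exhaustChecks counter are all hypothetical names, and updateExhausted here only counts how often it is called.

```java
import java.util.ArrayList;
import java.util.List;

public class ExhaustCheckSketch {

    interface Task {
        void run();
    }

    /** Hypothetical task that can never exhaust an input. */
    static class PlainTask implements Task {
        public void run() { }
    }

    /** Hypothetical stand-in for a task that resets inputs. */
    static class ResetInputs implements Task {
        public void run() { }
    }

    /** Hypothetical sequence that decides up-front whether exhaustion checks are needed. */
    static class Sequence implements Task {
        private final List<Task> tasks = new ArrayList<>();
        private boolean anyExhaustibleTasks = false; // computed once, as children are added
        int exhaustChecks = 0;                       // instrumentation for this sketch only

        void addTask(Task t) {
            tasks.add(t);
            if (t instanceof Sequence || t instanceof ResetInputs) {
                anyExhaustibleTasks = true;
            }
        }

        public void run() {
            for (Task t : tasks) {
                t.run();
                if (anyExhaustibleTasks) {
                    updateExhausted(t); // skipped entirely when no child can matter
                }
            }
        }

        private void updateExhausted(Task t) {
            exhaustChecks++;
        }
    }

    /** Builds a sequence from the given children, runs it, and reports the check count. */
    static int countChecks(Task... children) {
        Sequence seq = new Sequence();
        for (Task t : children) {
            seq.addTask(t);
        }
        seq.run();
        return seq.exhaustChecks;
    }

    public static void main(String[] args) {
        System.out.println(countChecks(new PlainTask(), new PlainTask()));   // prints 0
        System.out.println(countChecks(new PlainTask(), new ResetInputs())); // prints 2
    }
}
```

The per-child type test happens once when the sequence is built, so the inner loop pays only for a boolean check.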
Finally, we were not allowing flushing by both RAM usage and doc count
at the same time, so I fixed the logic in Create/OpenIndexTask to set
both RAMBufferSizeMB and MaxBufferedDocs.
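A rough sketch of "set both" rather than "set one or the other" is below. It deliberately avoids the real IndexWriter so that it runs standalone: WriterSettings is a hypothetical stand-in whose setter names mirror the two settings named above, and the property keys ram.flush.mb and max.buffered are assumed for illustration.

```java
import java.util.Properties;

public class FlushConfigSketch {

    /** Hypothetical stand-in for a writer that exposes both flush triggers. */
    static class WriterSettings {
        double ramBufferSizeMB = -1; // -1 means "not configured" in this sketch
        int maxBufferedDocs = -1;    // -1 means "not configured" in this sketch

        void setRAMBufferSizeMB(double mb) {
            ramBufferSizeMB = mb;
        }

        void setMaxBufferedDocs(int n) {
            maxBufferedDocs = n;
        }
    }

    /** Applies each setting independently, so configuring one never hides the other. */
    static WriterSettings configure(Properties props) {
        WriterSettings writer = new WriterSettings();
        String ram = props.getProperty("ram.flush.mb");  // assumed key
        String docs = props.getProperty("max.buffered"); // assumed key
        if (ram != null) {
            writer.setRAMBufferSizeMB(Double.parseDouble(ram));
        }
        if (docs != null) {
            writer.setMaxBufferedDocs(Integer.parseInt(docs));
        }
        return writer;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("ram.flush.mb", "16");
        props.setProperty("max.buffered", "1000");
        WriterSettings w = configure(props);
        System.out.println(w.ramBufferSizeMB + " MB, " + w.maxBufferedDocs + " docs");
    }
}
```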
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
[jira] Commented: (LUCENE-1115) Some small fixes to contrib/benchmark
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555453#action_12555453 ]
Michael McCandless commented on LUCENE-1115:
--------------------------------------------
Awesome, I will add that test case. Thanks Doron!
[jira] Resolved: (LUCENE-1115) Some small fixes to contrib/benchmark
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless resolved LUCENE-1115.
----------------------------------------
Resolution: Fixed
[jira] Commented: (LUCENE-1115) Some small fixes to contrib/benchmark
Posted by "Doron Cohen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555441#action_12555441 ]
Doron Cohen commented on LUCENE-1115:
-------------------------------------
Definitely a bug.
Patch looks good, and I like the optimization, thanks for fixing this Mike.
Perhaps rename in TaskSequence from *anyExhaustedTasks* to *anyExhaustableTasks*?
Also, this new test (belongs in TestPerfTaskLogic) passes with the fix but fails without it:
{code}
/**
 * Test that exhaust in loop works as expected (LUCENE-1115).
 */
public void testExhaustedLooped() throws Exception {
  // 1. alg definition (required in every "logic" test)
  String[] algLines = {
      "# ----- properties ",
      "doc.maker=" + Reuters20DocMaker.class.getName(),
      "doc.add.log.step=3",
      "doc.term.vector=false",
      "doc.maker.forever=false",
      "directory=RAMDirectory",
      "doc.stored=false",
      "doc.tokenized=false",
      "debug.level=1",
      "# ----- alg ",
      "{ \"Rounds\"",
      "  ResetSystemErase",
      "  CreateIndex",
      "  { \"AddDocs\" AddDoc > : * ",
      "  CloseIndex",
      "} : 2",
  };

  // 2. execute the algorithm (required in every "logic" test)
  Benchmark benchmark = execBenchmark(algLines);

  // 3. test number of docs in the index
  IndexReader ir = IndexReader.open(benchmark.getRunData().getDirectory());
  int ndocsExpected = 20; // Reuters20DocMaker exhausts after 20 docs.
  assertEquals("wrong number of docs in the index!", ndocsExpected, ir.numDocs());
  ir.close();
}
{code}
Cheers,
Doron
[jira] Updated: (LUCENE-1115) Some small fixes to contrib/benchmark
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1115:
---------------------------------------
Attachment: LUCENE-1115.patch
Attached patch. All tests pass. I plan to commit in a day or so.