Posted to dev@lucene.apache.org by Michael McCandless <lu...@mikemccandless.com> on 2008/01/16 12:25:49 UTC

counting sub tasks in contrib/benchmark

I'd like to run an alg like this:

   ResetSystemErase
   { "BuildIndex"
     CreateIndex
     { "AddDocs" AddDoc > : 200000
     CloseIndex
   }

   RepSumByPrefRound BuildIndex

But in the report, for rec/s, I'd like to see the total BuildIndex
time divided by 200,000, i.e., the net time per document to create this
index, including the time taken to create and to close the index.

Instead I see the total time divided by 200,002, because CreateIndex
and CloseIndex each increment the counter by 1.
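
To make the arithmetic concrete, here is a minimal sketch. The elapsed time of 100 seconds is a made-up figure, and RecPerSecDemo and recPerSec are hypothetical names, not part of contrib/benchmark; only the counts 200,000 and 200,002 come from the alg above.

```java
// Illustration only: how two extra counted tasks shift the per-document figures.
// The class, the method and the elapsed time are assumptions for this example.
final class RecPerSecDemo {
    /** Records per second for a given count and elapsed time in seconds. */
    static double recPerSec(long count, double elapsedSec) {
        return count / elapsedSec;
    }

    public static void main(String[] args) {
        double elapsedSec = 100.0;      // hypothetical total BuildIndex time
        long addDocs = 200_000;         // count contributed by the AddDoc repetitions
        long reported = addDocs + 2;    // CreateIndex and CloseIndex each add 1

        System.out.println("wanted   rec/s: " + recPerSec(addDocs, elapsedSec));
        System.out.println("reported rec/s: " + recPerSec(reported, elapsedSec));
        System.out.println("wanted   sec/doc: " + elapsedSec / addDocs);
        System.out.println("reported sec/doc: " + elapsedSec / reported);
    }
}
```

The difference is tiny at this scale, but it grows when the counted setup tasks are a larger share of the total.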

I think it's not possible to express this in the contrib/benchmark
scripting language now?

Really I want a way to run a given task without counting its counter.
E.g., if a task is prefixed with "-", then its returned count is not
included in the report?
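
As a rough sketch of the idea, the report count could be summed like this. This is not the contrib/benchmark parser or its task classes: CountFilterDemo, Step and reportedCount are hypothetical names, and the "-" handling only models the proposal above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the "-" prefix proposal; not actual contrib/benchmark code.
final class CountFilterDemo {
    /** Stand-in for a task: its name as written in the alg and the count it returns. */
    static final class Step {
        final String name;
        final int returnedCount;

        Step(String name, int returnedCount) {
            this.name = name;
            this.returnedCount = returnedCount;
        }
    }

    /** Sums the returned counts, skipping steps whose name starts with "-". */
    static int reportedCount(List<Step> steps) {
        int total = 0;
        for (Step s : steps) {
            if (!s.name.startsWith("-")) {
                total += s.returnedCount;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Step> steps = Arrays.asList(
            new Step("-CreateIndex", 1),   // runs, but its count is ignored
            new Step("AddDocs", 200_000),
            new Step("-CloseIndex", 1));   // runs, but its count is ignored
        System.out.println(reportedCount(steps)); // prints 200000
    }
}
```

The elapsed time would still cover all three steps; only the divisor changes.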

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: counting sub tasks in contrib/benchmark

Posted by Grant Ingersoll <gs...@apache.org>.
On Jan 16, 2008, at 8:33 AM, Doron Cohen wrote:

> On Jan 16, 2008 3:29 PM, Grant Ingersoll <gs...@apache.org> wrote:
>
>> I think you can do:
>>
>> RepSumByPref AddDocs
>>
>> And it will report on just that. For instance, in the standard.alg,
>> this is done inside the round to report out info on that round's
>> AddDocs.
>>
>> I think you could even do it outside the round, just by substituting
>> BuildIndex for "AddDocs".
>>
>> In general, I think you can report on any named task.
>>
>
> This is correct, but this way the statistics would not
> include the index creation and close time, which I thought
> Mike wanted included?

Oops, I read it again.  I think you are right. 



Re: counting sub tasks in contrib/benchmark

Posted by Doron Cohen <cd...@gmail.com>.
On Jan 16, 2008 3:29 PM, Grant Ingersoll <gs...@apache.org> wrote:

> I think you can do:
>
> RepSumByPref AddDocs
>
> And it will report on just that. For instance, in the standard.alg,
> this is done inside the round to report out info on that round's AddDocs.
>
> I think you could even do it outside the round, just by substituting
> BuildIndex for "AddDocs".
>
> In general, I think you can report on any named task.
>

This is correct, but this way the statistics would not
include the index creation and close time, which I thought
Mike wanted included?

Re: counting sub tasks in contrib/benchmark

Posted by Grant Ingersoll <gs...@apache.org>.
I think you can do:

RepSumByPref AddDocs

And it will report on just that. For instance, in the standard.alg,
this is done inside the round to report out info on that round's AddDocs.

I think you could even do it outside the round, just by substituting  
BuildIndex for "AddDocs".

In general, I think you can report on any named task.

-Grant


On Jan 16, 2008, at 6:25 AM, Michael McCandless wrote:

> I'd like to run an alg like this:
>
>  ResetSystemErase
>  { "BuildIndex"
>    CreateIndex
>    { "AddDocs" AddDoc > : 200000
>    CloseIndex
>  }
>
>  RepSumByPrefRound BuildIndex
>
> But in the report, for rec/s, I'd like to see the total BuildIndex
> time divided by 200,000, ie, the net time per document to create this
> index, including the time taken to create and to close the index.
>
> Instead I see the total time divided by 200,002, because each of the
> CreateIndex & CloseIndex increment the counter by 1.
>
> I think it's not possible to express this in the contrib/benchmark
> scripting language now?
>
> Really I want a way to run a given task but NOT counting its counter.
> Eg, if a task is prefixed with "-" then do not include its returned
> count in the report?
>
> Mike
>


Re: counting sub tasks in contrib/benchmark

Posted by Michael McCandless <lu...@mikemccandless.com>.
Doron Cohen wrote:

> On Jan 16, 2008 1:25 PM, Michael McCandless <lu...@mikemccandless.com>
> wrote:
>
>> I'd like to run an alg like this:
>>
>>   ResetSystemErase
>>   { "BuildIndex"
>>     CreateIndex
>>     { "AddDocs" AddDoc > : 200000
>>     CloseIndex
>>   }
>>
>>   RepSumByPrefRound BuildIndex
>>
>> But in the report, for rec/s, I'd like to see the total BuildIndex
>> time divided by 200,000, ie, the net time per document to create this
>> index, including the time taken to create and to close the index.
>>
>> Instead I see the total time divided by 200,002, because each of the
>> CreateIndex & CloseIndex increment the counter by 1.
>>
>> I think it's not possible to express this in the contrib/benchmark
>> scripting language now?
>>
>
> This is true. One way to do this is to extend CreateIndexTask by say
> CreateIndex0Task which would call super.doLogic() and return 0.

Right, but then we'd need to do it for each such task...

>
>> Really I want a way to run a given task but NOT counting its counter.
>> Eg, if a task is prefixed with "-" then do not include its returned
>> count in the report?
>
>
> This is a more general feature, I like it, shouldn't be that hard to add
> either.
> New JIRA issue?

Good, I'll open a Jira!  I don't yet have a patch ... just an idea at  
this point :)

Mike



Re: counting sub tasks in contrib/benchmark

Posted by Doron Cohen <cd...@gmail.com>.
On Jan 16, 2008 1:25 PM, Michael McCandless <lu...@mikemccandless.com>
wrote:

> I'd like to run an alg like this:
>
>   ResetSystemErase
>   { "BuildIndex"
>     CreateIndex
>     { "AddDocs" AddDoc > : 200000
>     CloseIndex
>   }
>
>   RepSumByPrefRound BuildIndex
>
> But in the report, for rec/s, I'd like to see the total BuildIndex
> time divided by 200,000, ie, the net time per document to create this
> index, including the time taken to create and to close the index.
>
> Instead I see the total time divided by 200,002, because each of the
> CreateIndex & CloseIndex increment the counter by 1.
>
> I think it's not possible to express this in the contrib/benchmark
> scripting language now?
>

This is true. One way to do this is to extend CreateIndexTask with, say,
a CreateIndex0Task that calls super.doLogic() and returns 0.
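
A minimal, self-contained sketch of that subclass approach follows. StubCreateIndexTask is a hypothetical stand-in so the example runs without Lucene on the classpath; in practice the subclass would extend the real CreateIndexTask, whose constructor and exception signature are not reproduced here.

```java
// Sketch only: StubCreateIndexTask is a made-up stand-in for CreateIndexTask.
class StubCreateIndexTask {
    boolean created = false;

    /** Pretends to create the index and reports a count of 1, as the thread describes. */
    public int doLogic() {
        created = true;
        return 1;
    }
}

/** Does the same work as its parent but contributes 0 to the report count. */
class CreateIndex0Task extends StubCreateIndexTask {
    @Override
    public int doLogic() {
        super.doLogic(); // still perform the parent's work
        return 0;        // but report nothing toward the count
    }
}

final class CreateIndex0Demo {
    public static void main(String[] args) {
        CreateIndex0Task task = new CreateIndex0Task();
        int count = task.doLogic();
        System.out.println("created=" + task.created + " count=" + count);
    }
}
```

The drawback is that every task whose count should be ignored would need its own such subclass.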


> Really I want a way to run a given task but NOT counting its counter.
> Eg, if a task is prefixed with "-" then do not include its returned
> count in the report?


This is a more general feature and I like it; it shouldn't be that hard
to add either.
New JIRA issue?
