Posted to java-user@lucene.apache.org by Ankit Murarka <an...@rancoretech.com> on 2013/12/03 08:18:02 UTC

Number of Times 1 Field has occurred in a document within a Given TimeRange

Hello.

This might be a long mail, but I have described everything in detail
so that I can get the assistance I need.

Indexing:
I have a use case in which I am indexing two fields.

Field 1: Value, e.g. 1, 2, 3, 4, 5, etc.

Field 2: Time in long format, e.g. 20131203010005, 20131203132332, etc.

Both field values are extracted from a number of documents. Each
document contains N such entries.

Indexing is not a problem. I am able to index both fields properly.

Search:

Construction of Query:
When searching, given a value for field 1 (e.g. 2), I want the count of
occurrences of 2 for each hour in the index, i.e. from 20131203000000
to 20131203005959, then from 20131203010000 to 20131203015959, and so
on.

I put the first field in a TermQuery, and for the second I used
NumericRangeQuery, creating one query for each of the 24 time slots in
a day.

I then created a BooleanQuery with the TermQuery and the
NumericRangeQuery as two MUST clauses and executed it.
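
The 24 hourly ranges described above can be generated rather than
written out by hand. The following is a minimal sketch in plain Java,
assuming the time field holds longs of the form yyyyMMddHHmmss as in the
examples. The class and method names are hypothetical and are not part
of Lucene; each [start, end] pair is meant to serve as the inclusive
lower and upper bound of one range query.

```java
// Hypothetical helper (not a Lucene class): builds the 24 inclusive hourly
// [start, end] bounds for one day, assuming timestamps are longs of the
// form yyyyMMddHHmmss.
final class HourlySlots {

    /** @param day the date as yyyyMMdd, e.g. 20131203 */
    static long[][] forDay(long day) {
        long[][] slots = new long[24][2];
        long base = day * 1_000_000L;                  // yyyyMMdd000000
        for (int hour = 0; hour < 24; hour++) {
            long start = base + hour * 10_000L;        // yyyyMMddHH0000
            slots[hour][0] = start;
            slots[hour][1] = start + 5_959L;           // yyyyMMddHH5959
        }
        return slots;
    }

    public static void main(String[] args) {
        // Prints one "start-end" pair per hour of 2013-12-03.
        for (long[] slot : forDay(20131203L)) {
            System.out.println(slot[0] + "-" + slot[1]);
        }
    }
}
```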

Execution/Result:

The query returns the documents in which both the value 2 and the given
range are present. With the current implementation, I then have to
iterate over each matching document, read all of its value fields
(those matching the input value 2), apply an IF condition for the
range, and increment a counter every time the condition holds.
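
The counting step described above can be done in a single pass by
deriving the hour directly from each timestamp instead of testing every
entry against each range. This is only a sketch under assumptions: the
entries of the matching documents are taken to be already loaded into
two parallel arrays, timestamps are assumed to be well-formed
yyyyMMddHHmmss longs, and the class and method names are hypothetical.

```java
// Hypothetical helper: counts, per hour of one day, how often a wanted
// value occurs among (value, time) entries. Timestamps are assumed to be
// well-formed longs of the form yyyyMMddHHmmss.
final class HourlyCounter {

    static int[] countPerHour(String[] values, long[] times,
                              String wanted, long day) {
        int[] counts = new int[24];
        for (int i = 0; i < values.length; i++) {
            if (!wanted.equals(values[i])) {
                continue;                       // different value
            }
            if (times[i] / 1_000_000L != day) {
                continue;                       // different date
            }
            // Drop mmss, then keep the last two digits (HH).
            int hour = (int) (times[i] / 10_000L % 100L);
            counts[hour]++;
        }
        return counts;
    }
}
```

For example, calling countPerHour with wanted = "2" and day = 20131203L
returns an array of 24 counters, one per hourly slot of that day.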

This works, but I am looking for a shorter method.

A. Is it possible to get the count of occurrences directly from the
first query? As far as I can tell, .search always returns the number of
documents.
B. If that is not possible, can I, having obtained a document in which
the given input might be present, execute another query on that
document itself and find the occurrences of the given input within the
given time range?
C. I tried storing the count of occurrences of the given value in the
index itself during the indexing phase. But since a TIME CROSSOVER can
also happen inside the same file, the count stored during indexing is
not correct. Hence I don't think I can store the count of occurrences
during the indexing phase.

Please assist. Let me know if any point is unclear and I will clarify
it.

--
Regards

Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within us"

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Number of Times 1 Field has occurred in a document within a Given TimeRange

Posted by Michael Sokolov <ms...@safaribooksonline.com>.

Have you read about numeric range faceting? 
http://blog.mikemccandless.com/2013/05/dynamic-faceting-with-lucene.html
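
The idea behind range faceting is to count, in one pass over the
matching values, how many fall into each of a set of predefined ranges.
The following plain-Java illustration shows that idea only. It does not
use Lucene's facet module, and the class and method names are
hypothetical.

```java
// Illustration only (not Lucene's facet API): counts how many values fall
// into each inclusive [mins[i], maxs[i]] range.
final class RangeCounts {

    /** Returns counts where counts[i] = number of v with mins[i] <= v <= maxs[i]. */
    static int[] count(long[] values, long[] mins, long[] maxs) {
        int[] counts = new int[mins.length];
        for (long v : values) {
            for (int i = 0; i < mins.length; i++) {
                if (v >= mins[i] && v <= maxs[i]) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }
}
```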

On 12/6/2013 5:34 AM, Ankit Murarka wrote:
> It is a bit strange: this is the first time I have not received any
> reply to a question, even after sending it again.
>
> It would be very helpful if someone could shed some light on the problem.

Re: Number of Times 1 Field has occurred in a document within a Given TimeRange

Posted by Ankit Murarka <an...@rancoretech.com>.

It is a bit strange: this is the first time I have not received any
reply to a question, even after sending it again.

It would be very helpful if someone could shed some light on the problem.



-- 
Regards

Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within us"


Re: Number of Times 1 Field has occurred in a document within a Given TimeRange

Posted by Ankit Murarka <an...@rancoretech.com>.

Hello.

I would really appreciate it if someone could guide me on the issue
described in my original message.



-- 
Regards


"What lies behind us and what lies before us are tiny matters compared with what lies within us"
