Posted to java-user@lucene.apache.org by Joe MarkAnthony <mr...@comcast.net> on 2009/01/08 08:51:22 UTC

Calculated terms during a query

Greetings,
I would like to search for items based on 'calculated' terms.
Specifically, say I am using Lucene to search a collection of tasks, with
fields "start_date" and "end_date", among others.

The question to solve is:
"Find all tasks that took longer than 100 days".

So the easy answer is to create a third field "task_duration", and populate
it during indexing by subtracting start_date from end_date.  OK, this
works fine (using NumberTools, and so forth).
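For concreteness, here is a rough sketch of the index-time calculation I mean. It is plain Java with no Lucene dependency; the class and method names are made up for illustration, and the zero-padded string is only a stand-in for whatever sortable encoding (NumberTools or similar) is actually used. The width of 6 digits is an arbitrary assumption.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Lucene: derives the value that would be
// stored in a "task_duration" field at indexing time.
class TaskDuration {

    // Whole days from start to end (negative if end precedes start).
    static long days(LocalDate start, LocalDate end) {
        return ChronoUnit.DAYS.between(start, end);
    }

    // Fixed-width, zero-padded form so that string order matches numeric
    // order for non-negative values. Assumption: this merely imitates what
    // a sortable encoder such as NumberTools provides; 6 digits is arbitrary.
    static String encode(long days) {
        if (days < 0 || days > 999_999) {
            throw new IllegalArgumentException("out of range: " + days);
        }
        return String.format("%06d", days);
    }
}
```

With the value stored this way, "took longer than 100 days" becomes an ordinary range query on task_duration with a lower bound of encode(100).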

However, this solution doesn't work well when you start adding more fields.
For example, in my scenario, there actually are four fields:
"planned_start_date", "actual_start_date", "planned_end_date",
"actual_end_date".

Now, to support any such reasonable calculated query ("How many tasks were
more than 10 days late" or "How many tasks started on time", etc.) you now
have to store 5 'calculated' terms for each document.  This can get out of
hand.
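
To show how quickly this grows, here is a sketch of what the indexing code ends up computing for the four date fields. The five derived names below are just one plausible set chosen for illustration, not a definitive list, and the class is hypothetical plain Java with no Lucene calls.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: computes one possible set of derived values (in days)
// that would each need their own indexed field.
class DerivedTaskFields {

    static Map<String, Long> derive(LocalDate plannedStart, LocalDate actualStart,
                                    LocalDate plannedEnd, LocalDate actualEnd) {
        long plannedDuration = ChronoUnit.DAYS.between(plannedStart, plannedEnd);
        long actualDuration = ChronoUnit.DAYS.between(actualStart, actualEnd);

        Map<String, Long> fields = new LinkedHashMap<>();
        fields.put("planned_duration", plannedDuration);
        fields.put("actual_duration", actualDuration);
        // Positive slip means later than planned, zero means on time.
        fields.put("start_slip", ChronoUnit.DAYS.between(plannedStart, actualStart));
        fields.put("end_slip", ChronoUnit.DAYS.between(plannedEnd, actualEnd));
        fields.put("duration_overrun", actualDuration - plannedDuration);
        return fields;
    }
}
```

Under these assumed names, "more than 10 days late" maps to end_slip > 10 and "started on time" to start_slip == 0. Note that slips can be negative, so a sortable encoding has to cope with signed values, and every new kind of question means another field and a reindex.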

So, is there a better way to do this in Lucene?
I know some people will say this is an example of where a relational
database is best, and perhaps such concepts do not fit within text
indexing...ok, understood.

But perhaps there is a better way - has any thought gone into this for
Lucene?

Thanks in advance,
J


