Posted to java-user@lucene.apache.org by Furash Gary <fu...@mcao.maricopa.gov> on 2006/08/29 16:38:13 UTC

Documents that know more?

I'm sure this is just a design point that I'm missing, but is there a
way to have my document objects know more about themselves?

At the time I create my document, I know a bit about how information is
being stored in it (e.g., this field represents a SOUNDEX copy, etc.),
yet the logic for that kind of stuff is kept in separate, unrelated
classes that have to be used properly in the document creation/retrieval
programs.

Would it make sense and, if so, is there a way to embed more logic in my
document objects (or some kind of subclass)?

Gary Furash, MBA, PMP
Applications Manager, Maricopa County Attorney's Office

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Documents that know more?

Posted by Furash Gary <fu...@mcao.maricopa.gov>.
Thanks.  What I had in mind is that field N of document X was built
with tokenizer/analyzer N.  When I search an index of document Xs, the
same tokenizer/analyzer N should be applied to that field without my
having to "know" that it was built that way.
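
A minimal sketch of that idea, assuming a hypothetical FieldAnalysisRegistry
class that is not part of Lucene: the field-to-analysis mapping is recorded
once, and both the indexing and the search code ask the registry instead of
"knowing" how each field was built. The two analysis functions are
illustrative stand-ins, not real Lucene analyzers. Lucene's own
PerFieldAnalyzerWrapper covers a similar per-field mapping for real analyzers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Lucene: one place that records which
// analysis step each field was built with, shared by indexing and search code.
final class FieldAnalysisRegistry {
    private final Map<String, Function<String, List<String>>> byField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    FieldAnalysisRegistry(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Record the analysis step for a field once, when the schema is defined.
    void register(String field, Function<String, List<String>> analysis) {
        byField.put(field, analysis);
    }

    // Both the indexer and the query builder call this, so neither has to
    // "know" how a given field was built. Unregistered fields use the fallback.
    List<String> analyze(String field, String text) {
        return byField.getOrDefault(field, fallback).apply(text);
    }

    // Stand-in analysis step (illustration only): lowercase, split on whitespace.
    static List<String> lowercaseWords(String text) {
        List<String> out = new ArrayList<>();
        for (String w : text.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                out.add(w.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Stand-in analysis step (illustration only): keep the value as one token.
    static List<String> wholeValue(String text) {
        List<String> out = new ArrayList<>();
        out.add(text);
        return out;
    }
}
```

With a registry built as new FieldAnalysisRegistry(FieldAnalysisRegistry::lowercaseWords)
and a call to register("case_id", FieldAnalysisRegistry::wholeValue), the same
analyze(field, text) call serves both document creation and query building.
The field name "case_id" is made up for the example.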

-----Original Message-----
From: Steven Rowe [mailto:sarowe@syr.edu] 
Sent: Tuesday, August 29, 2006 8:04 AM
To: java-user@lucene.apache.org
Subject: Re: Documents that know more?

There has been a long-running thread on the java-dev list about how to
allow application-specific "extra stuff" to be placed in the index, at
multiple levels of granularity.  Some of this conversation is captured
on the Wiki at:

http://wiki.apache.org/jakarta-lucene/FlexibleIndexing

Maybe you could modify the above page to include your requirements?

Steve



Re: Documents that know more?

Posted by Steven Rowe <sa...@syr.edu>.
There has been a long-running thread on the java-dev list about how to
allow application-specific "extra stuff" to be placed in the index, at
multiple levels of granularity.  Some of this conversation is captured
on the Wiki at:

http://wiki.apache.org/jakarta-lucene/FlexibleIndexing

Maybe you could modify the above page to include your requirements?

Steve

Furash Gary wrote:
> I'm sure this is just a design point that I'm missing, but is there a
> way to have my document objects know more about themselves?
> 
> At the time I create my document, I know a bit about how information is
> being stored in it (e.g., this field represents a SOUNDEX copy, etc.),
> yet the logic for that kind of stuff is kept in separate, unrelated
> classes that have to be used properly in the document creation/retrieval
> programs.
> 
> Would it make sense and, if so, is there a way to embed more logic in my
> document objects (or some kind of subclass)?
> 


Re: Documents that know more?

Posted by "Michael D. Curtin" <mi...@curtin.com>.
Furash Gary wrote:

> I'm sure this is just a design point that I'm missing, but is there a
> way to have my document objects know more about themselves?
> 
> At the time I create my document, I know a bit about how information is
> being stored in it (e.g., this field represents a SOUNDEX copy, etc.),
> yet the logic for that kind of stuff is kept in separate, unrelated
> classes that have to be used properly in the document creation/retrieval
> programs.
> 
> Would it make sense and, if so, is there a way to embed more logic in my
> document objects (or some kind of subclass)?

Whether it makes sense for your application is something you'll have to
decide for yourself.  From your description, it sounds like it might be
useful to have some of that additional information around later.

If so, how about adding a field or two to your Lucene index to store the
additional facts?  You can store them without indexing them if you don't
intend to search on them.  With a compact encoding scheme, such fields
might not add much to your disk usage either.
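
One possible compact encoding, sketched under assumptions: the class name
FieldFactsCodec, the "field=fact;field=fact" layout, and the example field
names are all hypothetical and not part of Lucene. The encoded string is what
would go into a single stored, unindexed field on each document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: packs per-field facts into one
// short string suitable for a single stored, unindexed field.
final class FieldFactsCodec {
    // Encode as "field=fact;field=fact". Assumes names and facts contain
    // neither '=' nor ';' (a real scheme would need escaping).
    static String encode(Map<String, String> facts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : facts.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Decode the stored string back into a map, preserving the stored order.
    static Map<String, String> decode(String stored) {
        Map<String, String> facts = new LinkedHashMap<>();
        if (stored.isEmpty()) {
            return facts;
        }
        for (String pair : stored.split(";")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("malformed pair: " + pair);
            }
            facts.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return facts;
    }
}
```

At retrieval time, decode() on the stored value tells the search code that,
say, "name_sdx" holds a SOUNDEX copy, without that fact living in a separate
class.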

Good luck!

--MDC
