Posted to java-user@lucene.apache.org by Reza Ghaffaripour <re...@gmail.com> on 2005/12/07 09:49:45 UTC
repeating fields
Hi all,
I'm new to Lucene. I have an XML file with repeating tags, something like:
<a>
<p>x</p>
<p>xx</p>
<p>xxx</p>
<p>xxxx</p>
</a>
I add the "p" field as follows:
myDocument.add(Field.Text("p", "x"));
myDocument.add(Field.Text("p", "xx"));
But when I search for "x" it returns only the first hit.
What should I do? I want to search for "x" and get all 4 hits.
--
Reza Ghaffaripour
www.rezaghp.com
Re: repeating fields
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Dec 7, 2005, at 8:48 AM, Reza Ghaffaripour wrote:
> I think having different documents will not be a good idea.
> For me each XML file is an ebook, and "p" means paragraph.
> I have hundreds of paragraphs in every ebook, and I think I should
> keep each ebook in a single document. Am I right?
How you design your index requires consideration of all you're trying
to do with it. It's an art form, in fact. So while we can offer
some ideas, ultimately you have to find what fits. The granularity
of what you index as a Document is the granularity of what you get
back from searches as Hits.
There are blended approaches - an index does not have to be
homogeneous in Document design. You could have documents that
represent the entire e-book, and documents that represent each
paragraph. You can put a field on each document, say "type", to
distinguish them and filter in a search.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
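The blended layout Erik describes can be sketched roughly as follows. This is a toy in plain Java, not Lucene's API: the class BlendedIndexSketch, its methods, the "id" and "text" field names, and the "book"/"paragraph" type values are all hypothetical, chosen only to show one whole-book entry and one entry per paragraph living in the same index and being told apart by a "type" field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BlendedIndexSketch {

    /** Toy document: a flat map from field name to value (hypothetical, not Lucene's Document). */
    static Map<String, String> doc(String id, String type, String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", id);
        fields.put("type", type);   // assumed values: "book" or "paragraph"
        fields.put("text", text);
        return fields;
    }

    /** One "book" document for the whole text plus one "paragraph" document per paragraph. */
    static List<Map<String, String>> buildIndex(String bookId, List<String> paragraphs) {
        List<Map<String, String>> index = new ArrayList<>();
        index.add(doc(bookId, "book", String.join(" ", paragraphs)));
        for (int i = 0; i < paragraphs.size(); i++) {
            index.add(doc(bookId + "#p" + (i + 1), "paragraph", paragraphs.get(i)));
        }
        return index;
    }

    /** Ids of documents of the requested type whose text contains the word. */
    static List<String> search(List<Map<String, String>> index, String type, String word) {
        List<String> hits = new ArrayList<>();
        for (Map<String, String> d : index) {
            boolean typeMatches = type.equals(d.get("type"));
            boolean textMatches = Arrays.asList(d.get("text").split("\\s+")).contains(word);
            if (typeMatches && textMatches) {
                hits.add(d.get("id"));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index =
                buildIndex("ebook1", Arrays.asList("x one", "x two", "x three", "x four"));
        System.out.println(search(index, "book", "x"));       // one entry: the whole e-book
        System.out.println(search(index, "paragraph", "x"));  // four entries: one per paragraph
    }
}
```

With a real Lucene index the same idea would presumably be a query on the text combined with a required term on the "type" field; the exact classes depend on the Lucene version in use.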
Re: repeating fields
Posted by Malcolm <ma...@btinternet.com>.
That's what I have: loads of different <p> tags, <abs> (abstract) tags, etc.
in each XML document, so a Lucene document for each is okay.
malcolm
Re: repeating fields
Posted by Reza Ghaffaripour <re...@gmail.com>.
I think having different documents will not be a good idea.
For me each XML file is an ebook, and "p" means paragraph.
I have hundreds of paragraphs in every ebook, and I think I should keep
each ebook in a single document. Am I right?
On 12/7/05, Malcolm <ma...@btinternet.com> wrote:
>
>
> Firstly, you should obtain Luke and check that everything is laid out correctly
> in your index.
> Secondly, maybe try a wildcard/prefix query or a TermQuery. For example (TermQuery):
>
> TermQuery heTerm = new TermQuery(new Term("p", "x"));
> TermQuery sheTerm = new TermQuery(new Term("p", "xx"));
> TermQuery theyTerm = new TermQuery(new Term("p", "xxx"));
>
> I'm sure the folks on here will be able to come up with a more efficient
> method. Try obtaining Lucene in Action or look at the examples at
> http://lucenebook.com/
> cheers,
> Malcolm Clark
--
Reza Ghaffaripour
www.rezaghp.com
Re: repeating fields
Posted by Malcolm <ma...@btinternet.com>.
Firstly, you should obtain Luke and check that everything is laid out correctly
in your index.
Secondly, maybe try a wildcard/prefix query or a TermQuery. For example (TermQuery):
TermQuery heTerm = new TermQuery(new Term("p", "x"));
TermQuery sheTerm = new TermQuery(new Term("p", "xx"));
TermQuery theyTerm = new TermQuery(new Term("p", "xxx"));
I'm sure the folks on here will be able to come up with a more efficient
method. Try obtaining Lucene in Action or look at the examples at
http://lucenebook.com/
cheers,
Malcolm Clark
Re: repeating fields
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Dec 7, 2005, at 3:49 AM, Reza Ghaffaripour wrote:
> Hi all,
> I'm new to Lucene. I have an XML file with repeating tags, something like:
> <a>
> <p>x</p>
> <p>xx</p>
> <p>xxx</p>
> <p>xxxx</p>
> </a>
>
> I add the "p" field as follows:
> myDocument.add(Field.Text("p", "x"));
> myDocument.add(Field.Text("p", "xx"));
>
> But when I search for "x" it returns only the first hit.
> What should I do? I want to search for "x" and get all 4 hits.
Hits return Documents. You indexed only a single document, not 4.
If you would like each <p> element to be a separate hit then index
each as a separate Document.
Erik
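To make Erik's point concrete, here is a small model of the two designs. It is plain Java rather than Lucene's API: GranularitySketch, Doc, and the two indexing methods are hypothetical names, and matching is a simple whole-word test on lowercased text. It only illustrates that a search yields one hit per matching document, however many values of the repeated "p" field match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GranularitySketch {

    /** Toy stand-in for a Lucene Document: an id plus every value added under field "p". */
    static final class Doc {
        final String id;
        final List<String> paragraphs = new ArrayList<>();

        Doc(String id, String... paragraphs) {
            this.id = id;
            this.paragraphs.addAll(Arrays.asList(paragraphs));
        }

        /** True if any "p" value contains the (lowercase) term as a whitespace-separated word. */
        boolean matches(String term) {
            for (String p : paragraphs) {
                if (Arrays.asList(p.toLowerCase().split("\\s+")).contains(term)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** One hit per matching document, no matter how many of its "p" values match. */
    static List<String> search(List<Doc> index, String term) {
        List<String> hits = new ArrayList<>();
        for (Doc doc : index) {
            if (doc.matches(term)) {
                hits.add(doc.id);
            }
        }
        return hits;
    }

    /** Design A: the whole book is a single document with a repeated "p" field. */
    static List<Doc> indexWholeBook(String bookId, String[] paragraphs) {
        List<Doc> index = new ArrayList<>();
        index.add(new Doc(bookId, paragraphs));
        return index;
    }

    /** Design B: every paragraph becomes its own document. */
    static List<Doc> indexPerParagraph(String bookId, String[] paragraphs) {
        List<Doc> index = new ArrayList<>();
        for (int i = 0; i < paragraphs.length; i++) {
            index.add(new Doc(bookId + "#p" + (i + 1), paragraphs[i]));
        }
        return index;
    }

    public static void main(String[] args) {
        String[] paragraphs = {"x one", "x two", "x three", "x four"};
        System.out.println(search(indexWholeBook("book", paragraphs), "x").size());    // prints 1
        System.out.println(search(indexPerParagraph("book", paragraphs), "x").size()); // prints 4
    }
}
```

Design A corresponds to the original posting (one document, several "p" fields), Design B to indexing each <p> element separately.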