Posted to java-user@lucene.apache.org by aslam bari <ia...@yahoo.co.in> on 2006/12/29 11:49:04 UTC

NewBie:- Which Analyzer is best for Text and Number Indexing

Hello All,
 I'm new to Lucene and want to know which analyzer to use for indexing both text and numbers. But here is the problem.
I have an XML file whose values contain text and numbers. I want to index the values of some (special) nodes as both text and numbers, but I don't need every node to go through number indexing; the rest should use the default text indexing.

<City>
    <Garden>
        <Plot name="Rjmahal 765 Nodia" place="raghu 8988 ami"/>
        <Joy name="Hum 700 fdfd"/>
    </Garden>
</City>

In this example, I need to index only the Plot\name value "Rjmahal 765 Nodia" with text indexing as well as number indexing. All other values should use text indexing only, because I need to search for numbers such as 765 only in Plot\name. I don't need to search for 8988, 700, etc.

Any help would be appreciated.

Thanks...


Re: NewBie:- Which Analyzer is best for Text and Number Indexing

Posted by Erick Erickson <er...@gmail.com>.
Assuming that you are indexing these in different fields in your Lucene
document, you can use a PerFieldAnalyzerWrapper to use different analyzers
for each field. Be sure you carefully coordinate the analyzer you use for
indexing with the one you use for searching or your results will not be what
you expect....
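
To illustrate why the analyzer choice per field matters here, below is a
minimal plain-Java sketch. It does not use any Lucene classes; it only mimics
two tokenizing behaviours: splitting on whitespace (what Lucene's
WhitespaceAnalyzer does) and splitting on every non-letter (what
SimpleAnalyzer does, which discards digits). The class name, the field names
"Plot/name", "Plot/place" and "Joy/name", and the lowercasing in both paths
are assumptions made for the example. In Lucene itself the dispatch role of
analyze() would be played by PerFieldAnalyzerWrapper.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Plain-Java illustration only; no Lucene classes are used here.
class PerFieldAnalysisSketch {

    // Whitespace splitting keeps digit runs such as "765" as tokens.
    static List<String> whitespaceTokens(String text) {
        return split(text, "\\s+");
    }

    // Splitting on every non-letter discards digits entirely.
    static List<String> letterTokens(String text) {
        return split(text, "[^\\p{L}]+");
    }

    private static List<String> split(String text, String separatorRegex) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split(separatorRegex)) {
            if (!t.isEmpty()) {
                // Lowercasing is a choice made for this sketch.
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Hypothetical field-name -> tokenizer table; unlisted fields use the default.
    private static final Map<String, Function<String, List<String>>> PER_FIELD =
            new HashMap<>();
    static {
        PER_FIELD.put("Plot/name", PerFieldAnalysisSketch::whitespaceTokens);
    }

    // Picks the tokenizer registered for the field, else the letters-only default.
    static List<String> analyze(String field, String text) {
        Function<String, List<String>> tokenizer =
                PER_FIELD.getOrDefault(field, PerFieldAnalysisSketch::letterTokens);
        return tokenizer.apply(text);
    }

    public static void main(String[] args) {
        System.out.println(analyze("Plot/name", "Rjmahal 765 Nodia"));
        System.out.println(analyze("Plot/place", "raghu 8988 ami"));
        System.out.println(analyze("Joy/name", "Hum 700 fdfd"));
    }
}
```

With this setup only "Plot/name" yields the token 765; 8988 and 700 never
reach the index, so they cannot be matched. The same tokenizing has to be
applied to the query text for the field, otherwise the terms will not line up.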

You could also break the input stream up yourself as you read it, and use,
say, a WhitespaceAnalyzer to just break on whitespace (SimpleAnalyzer splits
on non-letters, so it would drop the numbers). In effect this is writing your
own custom analyzer that cracks your XML document and returns selected data
from specific fields.
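
As a rough sketch of the "crack the XML yourself" step, the following uses
only the JDK's DOM parser to turn each attribute into a named field value
that could then be added to a document under its own field. The class name,
the "Element/attribute" field-naming scheme and the self-closed <Plot/> and
<Joy/> tags in the sample are assumptions for the example (the XML as posted
leaves those tags unclosed, which a parser would reject).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlFieldExtractorSketch {

    // Returns a map of hypothetical field names ("Element/attribute") to values.
    // If the same element/attribute pair occurs twice, the later value wins.
    static Map<String, String> extractFields(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList elements = doc.getElementsByTagName("*"); // every element
            for (int i = 0; i < elements.getLength(); i++) {
                Element el = (Element) elements.item(i);
                NamedNodeMap attrs = el.getAttributes();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Node attr = attrs.item(j);
                    fields.put(el.getTagName() + "/" + attr.getNodeName(),
                               attr.getNodeValue());
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<City><Garden>"
                + "<Plot name=\"Rjmahal 765 Nodia\" place=\"raghu 8988 ami\"/>"
                + "<Joy name=\"Hum 700 fdfd\"/>"
                + "</Garden></City>";
        // Each entry would become one field in the indexed document.
        extractFields(xml).forEach((field, value) ->
                System.out.println(field + " = " + value));
    }
}
```

Keeping each attribute in its own field is what makes it possible to give
only the Plot name field an analyzer that preserves numbers.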

Assuming your index structure allows you to use a different analyzer for
different fields in your Lucene document, I'd recommend the
PerFieldAnalyzerWrapper....

Best
Erick
