Posted to java-user@lucene.apache.org by louiebagz <lo...@yahoo.com> on 2007/03/19 17:10:34 UTC

how to index XML elements with the same name using Lucene

hello guys, 

I need some help. I'm working on an XML file and trying to index each
element with Lucene. My XML file has repeating elements with different
values. When I run Lucene, it indexes only one of the elements.
Both files are attached for your reference.

Hoping for your favorable response. 

Thank you!

Louie 

http://www.nabble.com/file/7271/keywords.xml keywords.xml 
http://www.nabble.com/file/7272/IndexKeyWords.java IndexKeyWords.java 


Re: how to index XML elements with the same name using Lucene

Posted by Cheolgoo Kang <ap...@gmail.com>.
KeyWords.setKeyWord(String) needs to accumulate all the keywords the
digester sets, rather than keeping a single value. So the
setKeyWord(String) method should be written as below, using java.util.List:

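    // Needs these imports at the top of the enclosing file:
    //   import java.util.LinkedList;
    //   import java.util.List;
    public static class KeyWords
    {
        private String lineNum;
        // Every keyword the digester passes in, in document order.
        private List<String> kw = new LinkedList<String>();

        public void setLineNum(String newLineNum)
        {
            lineNum = newLineNum;
        }
        public String getLineNum()
        {
            return lineNum;
        }

        // Called once per repeated element, so append instead of overwriting.
        public void setKeyWord(String newKW)
        {
            kw.add( newKW );
        }
        public List<String> getKeyWordList()
        {
            return kw;
        }
    }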

You then have to change the IndexKeyWords.addKeywords(Keywords) method
to handle the list of keywords read from the XML file. That will solve the
problem of reading several elements with the same name.
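
A minimal sketch of what that change might look like, assuming the bean
above. The toFields helper and the "diabetes" value are made up for
illustration, and the name/value pairs only stand in for Lucene fields so
the example runs without Lucene on the classpath. In the real addKeywords
method each pair would become a field added to the same Document; Lucene
allows several fields with the same name in one document.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class KeyWordsSketch {

    // Same shape as the KeyWords bean above: every setKeyWord call appends.
    static class KeyWords {
        private String lineNum;
        private final List<String> kw = new LinkedList<String>();

        public void setLineNum(String newLineNum) { lineNum = newLineNum; }
        public String getLineNum() { return lineNum; }
        public void setKeyWord(String newKW) { kw.add(newKW); }
        public List<String> getKeyWordList() { return kw; }
    }

    // Hypothetical helper: builds one {name, value} pair per keyword.
    // Each pair stands in for a Lucene field; all keyword pairs share
    // the same field name.
    static List<String[]> toFields(KeyWords keyWords) {
        List<String[]> fields = new ArrayList<String[]>();
        fields.add(new String[] {"lineNum", keyWords.getLineNum()});
        for (String keyword : keyWords.getKeyWordList()) {
            fields.add(new String[] {"keyword", keyword});
        }
        return fields;
    }

    public static void main(String[] args) {
        KeyWords keyWords = new KeyWords();
        keyWords.setLineNum("1");
        // The digester would make one call per repeated XML element.
        keyWords.setKeyWord("controlled hypertension");
        keyWords.setKeyWord("diabetes"); // invented sample value

        for (String[] field : toFields(keyWords)) {
            System.out.println(field[0] + " = " + field[1]);
        }
    }
}
```

The point is only the loop: one field per list entry, all under the same
name, instead of a single field built from a single String.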

Also take care with keywords that consist of two or more words.
StandardAnalyzer splits a keyword into separate terms wherever its words
are separated by whitespace, so you'll have to use PhraseQuery or span
queries to search for an exact phrase like "controlled hypertension".
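
To see why a plain single-term lookup misses a multi-word keyword, here is
a rough stand-in written without Lucene. It is not StandardAnalyzer: the
tokenize method below only lowercases and splits on whitespace, and all
the method names are made up for this sketch. The phrase check mirrors the
idea behind an exact phrase query, which is that the terms must sit at
consecutive positions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class PhraseSketch {

    // Simplified analysis step: lowercase, then split on whitespace.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // True only if the whole query string was indexed as one term.
    static boolean matchesAsSingleTerm(List<String> indexed, String query) {
        return indexed.contains(query.toLowerCase(Locale.ROOT));
    }

    // True if the query's tokens appear in order at consecutive positions.
    static boolean matchesAsPhrase(List<String> indexed, String query) {
        List<String> wanted = tokenize(query);
        if (wanted.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(indexed, wanted) >= 0;
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize("controlled hypertension");
        System.out.println(indexed);
        System.out.println(matchesAsSingleTerm(indexed, "controlled hypertension"));
        System.out.println(matchesAsPhrase(indexed, "controlled hypertension"));
        System.out.println(matchesAsPhrase(indexed, "hypertension controlled"));
    }
}
```

The keyword ends up as two terms, so the single-term check fails while the
in-order phrase check succeeds and the reversed phrase does not.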

HTH


-- 
Cheolgoo
