Posted to solr-user@lucene.apache.org by Shridhar Venkatraman <Sh...@NeemTree.com> on 2007/03/27 14:08:20 UTC

Reposting unABLE to match

Solr <http://localhost:8084/Genie/>


  Solr Admin (GENIE)

ShridharVAIO:8084
cwd=C:\Program Files\netbeans-5.5\enterprise3\apache-tomcat-5.5.17\bin
SolrHome=c:\Documents and Settings\Shridhar\Desktop\Public\Sana\KN\Genie\GenieConf/


    Field Analysis

*Field name*
*Field value (Index)*   "unABLE TO CONNECT"   (options: verbose output, highlight matches)
*Field value (Query)*   "unABLE TO CONNECT"   (option: verbose output)


      Index Analyzer


        org.apache.solr.analysis.HTMLStripWhitespaceTokenizerFactory {}

term position 	1	2	3
term text 	"unABLE	TO	CONNECT"
term type 	word	word	word
source start,end 	0,7	8,10	11,19


        org.apache.solr.analysis.SynonymFilterFactory
        {synonyms=synonyms.txt, expand=true, ignoreCase=true}

term position 	1	2	3
term text 	"unABLE	TO	CONNECT"
term type 	word	word	word
source start,end 	0,7	8,10	11,19


        org.apache.solr.analysis.StandardFilterFactory {}

term position 	1	2	3
term text 	"unABLE	TO	CONNECT"
term type 	word	word	word
source start,end 	0,7	8,10	11,19


        org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
        ignoreCase=true}

term position 	1	2
term text 	"unABLE	CONNECT"
term type 	word	word
source start,end 	0,7	11,19


        org.apache.solr.analysis.WordDelimiterFilterFactory
        {generateNumberParts=1, catenateWords=1, generateWordParts=1,
        catenateAll=1, catenateNumbers=1}

term position     1            2            3
term text         un           ABLE         CONNECT
                               unABLE
term type         word         word         word
                               word
source start,end  1,3          3,7          11,18
                               1,7


        org.apache.solr.analysis.LowerCaseFilterFactory {}

term position     1            2            3
term text         un           able         connect
                               unable
term type         word         word         word
                               word
source start,end  1,3          3,7          11,18
                               1,7


        org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position     1            2            3
term text         un           able         connect
                               unable
term type         word         word         word
                               word
source start,end  1,3          3,7          11,18
                               1,7


      Query Analyzer


        org.apache.solr.analysis.HTMLStripStandardTokenizerFactory {}

term position 	1	2	3
term text 	unABLE	TO	CONNECT
term type 	<ALPHANUM>	<ALPHANUM>	<ALPHANUM>
source start,end 	1,7	8,10	11,18


        org.apache.solr.analysis.SynonymFilterFactory
        {synonyms=synonyms.txt, expand=true, ignoreCase=true}

term position 	1	2	3
term text 	unABLE	TO	CONNECT
term type 	<ALPHANUM>	<ALPHANUM>	<ALPHANUM>
source start,end 	1,7	8,10	11,18


        org.apache.solr.analysis.StandardFilterFactory {}

term position 	1	2	3
term text 	unABLE	TO	CONNECT
term type 	<ALPHANUM>	<ALPHANUM>	<ALPHANUM>
source start,end 	1,7	8,10	11,18


        org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
        ignoreCase=true}

term position 	1	2
term text 	unABLE	CONNECT
term type 	<ALPHANUM>	<ALPHANUM>
source start,end 	1,7	11,18


        org.apache.solr.analysis.WordDelimiterFilterFactory
        {generateNumberParts=1, catenateWords=1, generateWordParts=1,
        catenateAll=1, catenateNumbers=1}

term position     1            2            3
term text         un           ABLE         CONNECT
                               unABLE
term type         <ALPHANUM>   <ALPHANUM>   <ALPHANUM>
                               <ALPHANUM>
source start,end  1,3          3,7          11,18
                               1,7


        org.apache.solr.analysis.LowerCaseFilterFactory {}

term position     1            2            3
term text         un           able         connect
                               unable
term type         <ALPHANUM>   <ALPHANUM>   <ALPHANUM>
                               <ALPHANUM>
source start,end  1,3          3,7          11,18
                               1,7


        org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position     1            2            3
term text         un           able         connect
                               unable
term type         <ALPHANUM>   <ALPHANUM>   <ALPHANUM>
                               <ALPHANUM>
source start,end  1,3          3,7          11,18
                               1,7




Re: Reposting unABLE to match

Posted by Chris Hostetter <ho...@fucit.org>.
:     Sorry for this multiple postings...
:     My email text did not get posted along with the attachment, don't know why ?
:     Here it is again.

In general: don't use attachments. Paste text directly into the body of
your email; the attachment may have had something to do with your problem.


-Hoss


Re: Reposting unABLE to match

Posted by Bertrand Delacretaz <bd...@apache.org>.
On 3/27/07, Shridhar Venkatraman <Sh...@neemtree.com> wrote:

...Reposting unABLE to match

No need to repost if your message made it to the list.

If it hasn't been answered yet, it either means that no one knows the
answer or that no one has had the time to answer yet. We're all
volunteers here.

-Bertrand

Re: Reposting unABLE to match

Posted by Ma...@ibsbe.be.
What exactly is the problem?

It seems you end up with the same term text in both the query and index
analyzers, so you should have found a match.









Re: Reposting unABLE to match

Posted by Yonik Seeley <yo...@apache.org>.
On 3/27/07, Shridhar Venkatraman <Sh...@neemtree.com> wrote:
>  The phrase "unABLE TO CONNECT" does not match in my system. However, any
>  combination of case is OK as long as the first letter 'U' is in
>  uppercase.
>
>      Bad-> uNABLE, unABLE, unaBLE....
>      Good-> Unable, UNable, UNAble...
>
>  Any ideas?

The cause is WordDelimiterFilter, which splits on some case transitions:

lowercase to uppercase transition => split
uppercase to lowercase => no split  (so capitalized words, and words
like IBMs, won't cause a split).

Either configure WordDelimiterFilter differently (use catenation but
not generation), or remove it altogether.
Don't forget to re-index after you have made changes.

-Yonik
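
The split rule described above can be illustrated with a minimal sketch. This is not Solr's implementation: the function names and the generate_parts / catenate flags are hypothetical stand-ins for the filter's generateWordParts and catenateWords options, and the sketch models only the case-transition rule (the real filter also handles delimiters, digits, and token positions). Lowercasing is folded in to mimic the LowerCaseFilter that follows in the chain.

```python
def split_on_case_change(token):
    """Split a token wherever a lowercase letter is followed by an uppercase one.

    Uppercase-to-lowercase transitions (as in "Unable" or "IBMs") do not split.
    """
    parts = []
    start = 0
    for i in range(1, len(token)):
        if token[i - 1].islower() and token[i].isupper():
            parts.append(token[start:i])
            start = i
    parts.append(token[start:])
    return parts


def index_terms(token, generate_parts=True, catenate=True):
    """Hypothetical stand-in for WordDelimiterFilter followed by LowerCaseFilter."""
    parts = split_on_case_change(token)
    if len(parts) == 1:
        # No split point: the token passes through and is only lowercased.
        return [token.lower()]
    terms = []
    if generate_parts:
        # "Generation": emit each sub-word as its own term.
        terms.extend(part.lower() for part in parts)
    if catenate:
        # "Catenation": emit the sub-words joined back together.
        terms.append("".join(parts).lower())
    return terms


if __name__ == "__main__":
    for word in ["unABLE", "uNABLE", "unaBLE", "Unable", "UNable", "UNAble", "IBMs"]:
        print(word, split_on_case_change(word), index_terms(word))
    # Catenation without generation keeps a single term per word.
    print(index_terms("unABLE", generate_parts=False))
```

With both flags on, "unABLE" yields the terms un, able and unable, as in the analysis output, while "Unable" stays a single term. With generate_parts=False, "unABLE" yields only unable, the same single term that "Unable" produces, which is the effect of using catenation without generation.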

Re: Reposting unABLE to match

Posted by Ma...@ibsbe.be.
The only thing I can think of is that in the index analysis the
term type is "word", while in the query analysis it is "<ALPHANUM>".

If that doesn't matter, you should be getting a match: you get exactly
the same term texts.




