Posted to solr-user@lucene.apache.org by Andreas Owen <ao...@conx.ch> on 2014/03/12 14:44:17 UTC
Re[2]: NOT SOLVED searches for single char tokens instead of from 3 uppwards
Yes, that is exactly what happened in the analyzer. The term I searched for was listed on both sides (index & query).
Here's the rest:
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
enablePositionIncrements=true ensures that a 'gap' is left to
allow for accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
-----Original Message-----
> From: "Jack Krupansky" <ja...@basetechnology.com>
> To: solr-user@lucene.apache.org
> Date: 12/03/2014 13:25
> Subject: Re: NOT SOLVED searches for single char tokens instead of from 3 uppwards
>
> You didn't show the new index analyzer - it's tricky to assure that index
> and query are compatible, but the Admin UI Analysis page can help.
>
> Generally, using pure defaults for WDF is not what you want, especially for
> query time. Usually there needs to be a slight asymmetry between index and
> query for WDF - index generates more terms than query.
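The asymmetry described above can be sketched with a toy model. This is a simplified Python stand-in, not Solr's WordDelimiterFilter: the function `word_delimiter` and its `catenate_words` flag are hypothetical names that only mimic the effect of `catenateWords`, and the split rule (break on any non-alphanumeric character) is an assumption.

```python
import re

def word_delimiter(token, catenate_words):
    """Toy word-delimiter split: break on non-alphanumerics, optionally add the joined form."""
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", token) if p]
    out = list(parts)
    if catenate_words and len(parts) > 1:
        out.append("".join(parts))  # e.g. "yh" + "cug" -> "yhcug"
    return out

# Index side generates more terms than the query side.
index_terms = word_delimiter("yh_cug", catenate_words=True)
query_terms = word_delimiter("yh_cug", catenate_words=False)

print(index_terms)  # ['yh', 'cug', 'yhcug']
print(query_terms)  # ['yh', 'cug']

# A user who types the term without the underscore still hits the indexed catenated form.
print(word_delimiter("yhcug", catenate_words=False))  # ['yhcug']
```

In this model every query term is also an index term, which is the property the index/query pair needs in order to match.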
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Andreas Owen
> Sent: Wednesday, March 12, 2014 6:20 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NOT SOLVED searches for single char tokens instead of from 3
> uppwards
>
> I now have the following:
>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.WordDelimiterFilterFactory" types="at-under-alpha.txt"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="lang/stopwords_de.txt" format="snowball"
> enablePositionIncrements="true"/> <!-- remove common words -->
> <filter class="solr.GermanNormalizationFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="German"/>
> </analyzer>
>
> The GUI analysis shows me that WDF no longer splits on the underscore, but the
> query still returns 0 results?
>
> Output:
>
> <lst name="debug">
> <str name="rawquerystring">yh_cug</str>
> <str name="querystring">yh_cug</str>
> <str name="parsedquery">(+DisjunctionMaxQuery((tags:yh_cug^10.0 |
> links:yh_cug^5.0 | thema:yh_cug^15.0 | plain_text:yh_cug^10.0 |
> url:yh_cug^5.0 | h_*:yh_cug^14.0 | inhaltstyp:yh_cug^6.0 |
> breadcrumb:yh_cug^6.0 | contentmanager:yh_cug^5.0 | title:yh_cug^20.0 |
> editorschoice:yh_cug^200.0 | doctype:yh_cug^10.0))
> ((expiration:[1394619501862 TO *]
> (+MatchAllDocsQuery(*:*) -expiration:*))^6.0)
> FunctionQuery((div(int(clicks),max(int(displays),const(1))))^8.0))/no_coord</str>
> <str name="parsedquery_toString">+(tags:yh_cug^10.0 | links:yh_cug^5.0 |
> thema:yh_cug^15.0 | plain_text:yh_cug^10.0 | url:yh_cug^5.0 |
> h_*:yh_cug^14.0 | inhaltstyp:yh_cug^6.0 | breadcrumb:yh_cug^6.0 |
> contentmanager:yh_cug^5.0 | title:yh_cug^20.0 | editorschoice:yh_cug^200.0 |
> doctype:yh_cug^10.0) ((expiration:[1394619501862 TO *]
> (+*:* -expiration:*))^6.0)
> (div(int(clicks),max(int(displays),const(1))))^8.0</str>
> <lst name="explain"/>
> <arr name="expandedSynonyms">
> <str>yh_cug</str>
> </arr>
> <lst name="reasonForNotExpandingSynonyms">
> <str name="name">DidntFindAnySynonyms</str>
> <str name="explanation">No synonyms found for this query. Check your
> synonyms file.</str>
> </lst>
> <lst name="mainQueryParser">
> <str name="QParser">ExtendedDismaxQParser</str>
> <null name="altquerystring"/>
> <arr name="boost_queries">
> <str>(expiration:[NOW TO *] OR (*:* -expiration:*))^6</str>
> </arr>
> <arr name="parsed_boost_queries">
> <str>(expiration:[1394619501862 TO *]
> (+MatchAllDocsQuery(*:*) -expiration:*))^6.0</str>
> </arr>
> <arr name="boostfuncs">
> <str>div(clicks,max(displays,1))^8</str>
> </arr>
> </lst>
> <lst name="synonymQueryParser">
> <str name="QParser">ExtendedDismaxQParser</str>
> <null name="altquerystring"/>
> <arr name="boostfuncs">
> <str>div(clicks,max(displays,1))^8</str>
> </arr>
> </lst>
> <lst name="timing">
>
>
>
>
> -----Original Message-----
> From: Jack Krupansky [mailto:jack@basetechnology.com]
> Sent: Tuesday, March 11, 2014 14:25
> To: solr-user@lucene.apache.org
> Subject: Re: NOT SOLVED searches for single char tokens instead of from 3
> uppwards
>
> The usual use of an ngram filter is at index time and not at query time.
> What exactly are you trying to achieve by using ngram filtering at query
> time as well as index time?
>
> Generally, it is inappropriate to combine the word delimiter filter with the
> standard tokenizer - the latter removes the punctuation that normally
> influences how WDF treats the parts of a token. Use the white space
> tokenizer if you intend to use WDF.
>
> Which query parser are you using? What fields are being queried?
>
> Please post the parsed query string from the debug output - it will show the
> precise generated query.
>
> I think what you are seeing is that the ngram filter is generating tokens
> like "h_cugtest" and then the WDF is removing the underscore and then "h"
> gets generated as a separate token.
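The effect described above (n-grams first, delimiter splitting afterwards) can be reproduced with a small toy model. This is a simplified Python sketch, not Solr's NGramFilter or WordDelimiterFilter: `ngrams` and `split_on_delimiters` are hypothetical helpers, and treating every non-alphanumeric character as a split point is an assumption.

```python
import re

def ngrams(token, min_size=3, max_size=15):
    """All substrings of token whose length lies in [min_size, max_size]."""
    out = []
    for size in range(min_size, max_size + 1):
        for start in range(0, len(token) - size + 1):
            out.append(token[start:start + size])
    return out

def split_on_delimiters(token):
    """Toy word-delimiter split: break on non-alphanumerics (underscore included)."""
    return [part for part in re.split(r"[^0-9A-Za-z]+", token) if part]

# Grams such as "h_c" are 3 characters long, but splitting them afterwards
# leaves the fragments "h" and "c".
grams = ngrams("yh_cugtest")
tokens = {part for gram in grams for part in split_on_delimiters(gram)}
short = sorted((t for t in tokens if len(t) < 3), key=lambda t: (len(t), t))
print(short)  # ['c', 'h', 'cu', 'yh']
```

The minimum gram size only bounds what the n-gram step emits. Any filter that runs after it can still shorten the tokens, so in this model the splitting step would have to come before the n-gram step for the 3-character minimum to hold.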
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Andreas Owen
> Sent: Tuesday, March 11, 2014 5:09 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NOT SOLVED searches for single char tokens instead of from 3
> uppwards
>
> I got it right the first time, and here is my request handler. The field
> "plain_text" is searched correctly and has the same field type as "title" ->
> "text_de".
>
> <queryParser name="synonym_edismax"
> class="solr.SynonymExpandingExtendedDismaxQParserPlugin">
> <lst name="synonymAnalyzers">
> <lst name="myCoolAnalyzer">
> <lst name="tokenizer">
> <str name="class">standard</str>
> </lst>
> <lst name="filter">
> <str name="class">shingle</str>
> <str name="outputUnigramsIfNoShingles">true</str>
> <str name="outputUnigrams">true</str>
> <str name="minShingleSize">2</str>
> <str name="maxShingleSize">4</str>
> </lst>
> <lst name="filter">
> <str name="class">synonym</str>
> <str name="tokenizerFactory">solr.KeywordTokenizerFactory</str>
> <str name="synonyms">synonyms.txt</str>
> <str name="expand">true</str>
> <str name="ignoreCase">true</str>
> </lst>
> </lst>
> </lst>
> </queryParser>
>
> <requestHandler name="/select2" class="solr.SearchHandler">
> <lst name="defaults">
> <str name="echoParams">explicit</str>
> <int name="rows">10</int>
> <str name="defType">synonym_edismax</str>
> <str name="synonyms">true</str>
> <str name="qf">plain_text^10 editorschoice^200
> title^20 h_*^14
> tags^10 thema^15 inhaltstyp^6 breadcrumb^6 doctype^10
> contentmanager^5 links^5
> last_modified^5 url^5
> </str>
>
> <str name="fq">{!q.op=OR} (*:* -organisations:["" TO *] -roles:["" TO *])
> (+organisations:($org) +roles:($r)) (-organisations:["" TO *] +roles:($r))
> (+organisations:($org) -roles:["" TO *])</str>
> <str name="bq">(expiration:[NOW TO *] OR (*:* -expiration:*))^6</str>
> <!-- tested: now or newer or empty gets small boost -->
> <str name="bf">div(clicks,max(displays,1))^8</str> <!-- tested -->
>
> <str name="df">text</str>
> <str name="fl">*,path,score</str>
> <str name="wt">json</str>
> <str name="q.op">AND</str>
>
> <!-- Highlighting defaults -->
> <str name="hl">on</str>
> <str name="hl.fl">plain_text,title</str>
> <str name="hl.fragSize">200</str>
> <str name="hl.simple.pre">&lt;b&gt;</str>
> <str name="hl.simple.post">&lt;/b&gt;</str>
>
> <!-- <lst name="invariants"> -->
> <str name="facet">on</str>
> <str name="facet.mincount">1</str>
> <str name="facet.field">{!ex=inhaltstyp_s}inhaltstyp_s</str>
> <str name="f.inhaltstyp_s.facet.sort">index</str>
> <str name="facet.field">{!ex=doctype}doctype</str>
> <str name="f.doctype.facet.sort">index</str>
> <str name="facet.field">{!ex=thema_f}thema_f</str>
> <str name="f.thema_f.facet.sort">index</str>
> <str name="facet.field">{!ex=author_s}author_s</str>
> <str name="f.author_s.facet.sort">index</str>
> <str name="facet.field">{!ex=sachverstaendiger_s}sachverstaendiger_s</str>
> <str name="f.sachverstaendiger_s.facet.sort">index</str>
> <str name="facet.field">{!ex=veranstaltung_s}veranstaltung_s</str>
> <str name="f.veranstaltung_s.facet.sort">index</str>
> <str name="facet.date">{!ex=last_modified}last_modified</str>
> <str name="facet.date.gap">+1MONTH</str>
> <str name="facet.date.end">NOW/MONTH+1MONTH</str>
> <str name="facet.date.start">NOW/MONTH-36MONTHS</str>
> <str name="facet.date.other">after</str>
>
> </lst>
> </requestHandler>
>
>
>
> I have a field with the following type:
>
> <fieldType name="text_de" class="solr.TextField"
> positionIncrementGap="100">
> <analyzer>
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="lang/stopwords_de.txt" format="snowball"
> enablePositionIncrements="true"/> <!-- remove common words -->
> <filter class="solr.GermanNormalizationFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory"
> language="German"/>
> <filter class="solr.NGramFilterFactory" minGramSize="3"
> maxGramSize="15"/>
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> </analyzer>
> </fieldType>
>
>
> Shouldn't this produce tokens from 3 to 15 characters long, and not from 1? Here is a
> query report of 2 results:
>
> > <lst name="responseHeader"> <int name="status">0</int> <int
> > name="QTime">125</int> <lst name="params"> <str
> > name="debugQuery">true</str> <str
> > name="fl">title,roles,organisations,id</str> <str
> > name="indent">true</str> <str name="q">yh_cugtest</str> <str
> > name="_">1394522589347</str> <str name="wt">xml</str> <str
> > name="fq">organisations:* roles:*</str> </lst></lst> <result
> > name="response" numFound="5" start="0">
> > ..........
> > <str name="dms:2681">
> > 1.6365329 = (MATCH) sum of: 1.6346203 = (MATCH) max of:
> > 0.14759353 = (MATCH) product of: 0.28596246 = (MATCH) sum of:
> > 0.01528686 = (MATCH) weight(plain_text:cug in 0) [DefaultSimilarity],
> > result of: 0.01528686 = score(doc=0,freq=1.0 = termFreq=1.0
> > ), product of: 0.035319194 = queryWeight, product of:
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.0063751927 =
> > queryNorm 0.43282017 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.0119499 = (MATCH) weight(plain_text:ugt in
> > 0) [DefaultSimilarity], result of: 0.0119499 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ),
> product of: 0.031227252 = queryWeight, product of:
> > 4.8982444 = idf(docFreq=18, maxDocs=937) 0.0063751927 =
> > queryNorm 0.38267535 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 4.8982444 = idf(docFreq=18, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:yhc
> > in 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.078125 = fieldNorm(doc=0)
> 0.019351374 = (MATCH)
> > weight(plain_text:hcu in 0) [DefaultSimilarity], result of:
> > 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.01528686 = (MATCH) weight(plain_text:cug in
> > 0) [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 =
> queryNorm 0.43282017 = fieldWeight in 0, product of:
> 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:cugt in
> > 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of: 0.03973814
> > = queryWeight, product of: 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.0063751927 = queryNorm 0.4869723
> > = fieldWeight in 0, product of: 1.0 = tf(freq=1.0), with
> > freq of: 1.0 = termFreq=1.0 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.078125 = fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:yhcu in 0)
> > [DefaultSimilarity], result of:
> 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> 0.03973814 =
> > queryWeight, product of: 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.01528686 = (MATCH) weight(plain_text:cug in
> > 0) [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 = queryNorm
> > 0.43282017 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:hcug in 0)
> > [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:cugt
> > in 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937)
> 0.0063751927 = queryNorm 0.4869723 = fieldWeight in 0,
> product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 =
> > termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.078125 = fieldNorm(doc=0) 0.01528686 = (MATCH)
> > weight(plain_text:cug in 0) [DefaultSimilarity], result of:
> > 0.01528686 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 = queryNorm
> > 0.43282017 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:yhcug
> > in 0) [DefaultSimilarity], result
> of: 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ), product
> of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:cugt
> > in 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 =
> termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:hcugt
> > in 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:cugt
> > in 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> 0.4869723 =
> > fieldWeight in 0, product of: 1.0 = tf(freq=1.0), with
> > freq of: 1.0 = termFreq=1.0 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.078125 = fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:yhcugt in 0)
> > [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.516129 = coord(16/31) 1.6346203 =
> (MATCH) sum of: 0.08372684 = (MATCH) weight(title:yh in 0)
> [DefaultSimilarity], result of:
> > 0.08372684 = score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0),
> with freq of: 13.0 = termFreq=13.0 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.08993706 = (MATCH) weight(title:c in 0) [DefaultSimilarity], result
> > of: 0.08993706 = score(doc=0,freq=15.0 = termFreq=15.0 ),
> > product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 1.1316214 = fieldWeight in 0, product of:
> > 3.8729835 = tf(freq=15.0), with freq of: 15.0 =
> > termFreq=15.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.023221647 = (MATCH)
> > weight(title:hc in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.040221076 = (MATCH) weight(title:cu in 0)
> > [DefaultSimilarity], result of: 0.040221076 =
> > score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cug in 0)
> > [DefaultSimilarity], result
> of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:ugt in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> 0.08372684 = (MATCH)
> > weight(title:yh in 0) [DefaultSimilarity], result of:
> > 0.08372684 = score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08993706 = (MATCH) weight(title:c in 0)
> > [DefaultSimilarity], result of: 0.08993706 =
> > score(doc=0,freq=15.0 = termFreq=15.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.1316214 = fieldWeight
> in 0, product of: 3.8729835 = tf(freq=15.0), with freq of:
> > 15.0 = termFreq=15.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.046875 = fieldNorm(doc=0) 0.023221647
> > = (MATCH) weight(title:yhc in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 =
> queryWeight, product of: 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.012750385 = queryNorm 1.053482 = fieldWeight in 0, product
> > of: 3.6055512 = tf(freq=13.0), with freq of:
> > 13.0 = termFreq=13.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.046875 = fieldNorm(doc=0) 0.040221076
> > = (MATCH) weight(title:cu in 0) [DefaultSimilarity], result of:
> > 0.040221076 = score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH)
> weight(title:hcu in 0) [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 = tf(freq=1.0),
> > with freq of: 1.0 = termFreq=1.0 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cug in 0) [DefaultSimilarity], result
> > of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product
> > of: 0.07947628 = queryWeight, product of: 6.2332454
> > = idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 = tf(freq=4.0),
> > with freq of:
> 4.0 = termFreq=4.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.046443295 = (MATCH)
> > weight(title:cugt in 0) [DefaultSimilarity], result of:
> > 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:yh in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937)
> 0.012750385 = queryNorm 1.053482 = fieldWeight in 0, product of:
> > 3.6055512 = tf(freq=13.0), with freq of: 13.0 =
> > termFreq=13.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.040221076 = (MATCH)
> > weight(title:cu in 0) [DefaultSimilarity], result of:
> > 0.040221076 = score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:yhcu in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 =
> termFreq=1.0 ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 0.29218337 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875
> = fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cug in 0)
> [DefaultSimilarity],
> > result of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0
> > ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 0.58436674 = fieldWeight in 0, product of:
> > 2.0 = tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:hcug in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cugt in 0) [DefaultSimilarity],
> > result of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0
> > ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 0.58436674 = fieldWeight in 0, product of:
> > 2.0 = tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:yh in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> 1.053482 =
> > fieldWeight in 0, product of: 3.6055512 = tf(freq=13.0),
> > with freq of: 13.0 = termFreq=13.0 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cug in 0) [DefaultSimilarity], result
> > of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product
> > of: 0.07947628 = queryWeight, product of: 6.2332454
> > = idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 = tf(freq=4.0),
> > with freq of: 4.0 = termFreq=4.0 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.023221647 = (MATCH) weight(title:yhcug in 0) [DefaultSimilarity],
> > result
> > of:
> 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> 0.046443295 = (MATCH)
> > weight(title:cugt in 0) [DefaultSimilarity], result of:
> > 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:hcugt in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight
> in 0, product of: 1.0 = tf(freq=1.0), with freq of:
> 1.0
> > = termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:yh
> > in 0) [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cugt in 0)
> > [DefaultSimilarity], result of: 0.046443295 =
> > score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 =
> queryWeight, product of: 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.012750385 = queryNorm 0.58436674 = fieldWeight in 0,
> > product of: 2.0 = tf(freq=4.0), with freq of:
> > 4.0 = termFreq=4.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.023221647 = (MATCH)
> > weight(title:yhcugt in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 = tf(freq=1.0),
> > with freq of: 1.0 = termFreq=1.0 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.0019125579 = (MATCH) product of:
> 0.0038251157 = (MATCH) sum of: 0.0038251157 = (MATCH) sum of:
> 0.0038251157 =
> > (MATCH) MatchAllDocsQuery, product of: 0.0038251157 =
> > queryNorm 0.5 = coord(1/2) 0.0 = (MATCH)
> > FunctionQuery(div(int(clicks),max(int(displays),const(1)))), product
> > of: 0.0 = div(int(clicks)=0,max(int(displays)=0,const(1))) 8.0
> > = boost 6.375193E-4 = queryNorm </str>
> >
> >
> > <str name="dms:216">0.00420471 = (MATCH) sum of: 0.0022921523 =
> > (MATCH) max of: 0.0022921523 = (MATCH) product of: 0.03552836 =
> > (MATCH) sum of: 0.01776418 = (MATCH) weight(plain_text:c in 699)
> > [DefaultSimilarity], result of: 0.01776418 =
> > score(doc=699,freq=10.0 = termFreq=10.0 ), product of:
> > 0.025590291 = queryWeight, product of: 4.014042 =
> > idf(docFreq=45, maxDocs=937) 0.0063751927 = queryNorm
> > 0.6941766 = fieldWeight in 699, product of: 3.1622777 =
> > tf(freq=10.0), with freq of: 10.0 = termFreq=10.0
> > 4.014042 = idf(docFreq=45, maxDocs=937) 0.0546875 =
> > fieldNorm(doc=699) 0.01776418 = (MATCH) weight(plain_text:c in
> > 699) [DefaultSimilarity], result of:
> 0.01776418 = score(doc=699,freq=10.0 = termFreq=10.0 ), product of:
> > 0.025590291 = queryWeight, product of: 4.014042 =
> > idf(docFreq=45, maxDocs=937) 0.0063751927 = queryNorm
> > 0.6941766 = fieldWeight in 699, product of: 3.1622777 =
> > tf(freq=10.0), with freq of: 10.0 = termFreq=10.0
> > 4.014042 = idf(docFreq=45, maxDocs=937) 0.0546875 =
> > fieldNorm(doc=699) 0.06451613 = coord(2/31) 0.0019125579 = (MATCH)
> > product of: 0.0038251157 = (MATCH) sum of: 0.0038251157 =
> > (MATCH) sum of: 0.0038251157 = (MATCH) MatchAllDocsQuery, product
> > of: 0.0038251157 = queryNorm 0.5 = coord(1/2) 0.0 =
> > (MATCH) FunctionQuery(div(int(clicks),max(int(displays),const(1)))),
> > product of: 0.0 = div(int(clicks)=0,max(int(displays)=152,const(1)))
> > 8.0 = boost 6.375193E-4 =
> queryNorm
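For anyone checking the explain output above by hand: the numbers appear to follow Lucene's classic TF-IDF formulas (DefaultSimilarity), where tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). The sketch below recomputes the title:yh clause (freq=13, docFreq=4, maxDocs=937). The queryNorm and fieldNorm values are copied from the output rather than derived, and the function names are only illustrative.

```python
import math


def tf(freq):
    # Classic Lucene tf: square root of the raw term frequency.
    return math.sqrt(freq)


def idf(doc_freq, max_docs):
    # Classic Lucene idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as listed in the explain output.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Values taken from the weight(title:yh in 0) clause of the explain output.
score = term_score(freq=13.0, doc_freq=4, max_docs=937,
                   query_norm=0.012750385, field_norm=0.046875)
print(score)
```

The result should land very close to the 0.08372684 reported for that clause. Small differences are expected because Lucene computes in single precision.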
RE: Re[2]: NOT SOLVED searches for single char tokens instead of from 3 uppwards
Posted by Andreas Owen <ao...@conx.ch>.
I have gotten nearly everything to work. There are two queries where I don't get back what I want.
"avaloq frage 1" -> only returns results if I set minGramSize=1 while indexing
"yh_cug" -> the query parser doesn't remove "_" but the indexer does (WDF), so there is no match
Is there a way to also query the whole term "avaloq frage 1" without tokenizing it?
Fieldtype:
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" types="at-under-alpha.txt"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/> <!-- remove common words -->
<filter class="solr.GermanNormalizationFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="German"/> <!-- remove noun/adjective inflections like plural endings -->
<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhiteSpaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" types="at-under-alpha.txt"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/> <!-- remove common words -->
<filter class="solr.GermanNormalizationFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="German"/>
</analyzer>
</fieldType>
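A possible explanation for the "avaloq frage 1" case, sketched in Python rather than Solr: with minGramSize=3 the index side emits no gram at all for the one-character token "1", and since the request handler in this thread uses q.op=AND, every query term has to match. This is a simplified model, not Solr's implementation. It assumes whitespace tokenization and lowercasing only, ignores stemming and stop words, and assumes the n-gram filter drops tokens shorter than minGramSize. All function names are made up for illustration.

```python
def ngrams(token, min_gram, max_gram):
    # All character n-grams of `token` with min_gram <= length <= max_gram.
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def index_terms(text, min_gram, max_gram):
    # Index side: lowercase, split on whitespace, then n-gram every token.
    terms = set()
    for token in text.lower().split():
        terms.update(ngrams(token, min_gram, max_gram))
    return terms


def matches_all(query, terms):
    # Query side with AND semantics: every whole query token must be indexed.
    return all(token in terms for token in query.lower().split())


doc = "avaloq frage 1"
print(matches_all(doc, index_terms(doc, 3, 15)))  # "1" yields no gram of size >= 3
print(matches_all(doc, index_terms(doc, 1, 15)))  # with minGramSize=1 the "1" is indexed
```

Under these assumptions the query only succeeds with minGramSize=1, which matches the behaviour described above.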
-----Original Message-----
From: Andreas Owen [mailto:ao@conx.ch]
Sent: Wednesday, 12 March 2014 18:39
To: solr-user@lucene.apache.org
Subject: RE: Re[2]: NOT SOLVED searches for single char tokens instead of from 3 uppwards
Hi Jack,
Do you know how I can use local parameters in my solrconfig? The params are visible in the debugQuery output, but Solr doesn't parse them.
<lst name="invariants">
<str name="fq">{!q.op=OR} (*:* -organisations:["" TO *] -roles:["" TO *]) (+organisations:($org) +roles:($r)) (-organisations:["" TO *] +roles:($r)) (+organisations:($org) -roles:["" TO *])</str>
</lst>
-----Original Message-----
From: Andreas Owen [mailto:ao@conx.ch]
Sent: Wednesday, 12 March 2014 14:44
To: solr-user@lucene.apache.org
Subject: Re[2]: NOT SOLVED searches for single char tokens instead of from 3 uppwards
Yes, that is exactly what happened in the analyzer: the term I searched for was listed on both sides (index & query).
Here's the rest:
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
enablePositionIncrements=true ensures that a 'gap' is left to
allow for accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
-----Original Message-----
> From: "Jack Krupansky" <ja...@basetechnology.com>
> To: solr-user@lucene.apache.org
> Date: 12/03/2014 13:25
> Subject: Re: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> You didn't show the new index analyzer - it's tricky to assure that
> index and query are compatible, but the Admin UI Analysis page can help.
>
> Generally, using pure defaults for WDF is not what you want,
> especially for query time. Usually there needs to be a slight
> asymmetry between index and query for WDF - index generates more terms than query.
>
> -- Jack Krupansky
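The asymmetry described above can be illustrated with a small Python model (not Solr code): the index side emits the word parts plus their concatenation, roughly like catenateWords="1", while the query side emits only the parts. The splitting rule (break on anything that is not a lowercase letter or digit) and the function names are assumptions made for this sketch.

```python
import re


def word_parts(token):
    # Split on every run of characters that is not a lowercase letter or digit.
    return [part for part in re.split(r"[^0-9a-z]+", token.lower()) if part]


def index_terms(token):
    # Index side: the parts plus the concatenated form.
    parts = word_parts(token)
    terms = set(parts)
    if len(parts) > 1:
        terms.add("".join(parts))
    return terms


def query_terms(token):
    # Query side: only the parts, no concatenation.
    return set(word_parts(token))


def matches(query, indexed):
    # AND semantics: every query term must exist among the indexed terms.
    return query_terms(query) <= index_terms(indexed)


print(sorted(index_terms("yh_cug")))
print(matches("yh_cug", "yh_cug"), matches("yhcug", "yh_cug"))
```

In this model both "yh_cug" and "yhcug" find a document containing "yh_cug", because the index carries the extra concatenated term while the query stays minimal.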
>
> -----Original Message-----
> From: Andreas Owen
> Sent: Wednesday, March 12, 2014 6:20 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> I now have the following:
>
> <analyzer type="query">
> <tokenizer class="solr.WhiteSpaceTokenizerFactory"/>
> <filter class="solr.WordDelimiterFilterFactory" types="at-under-alpha.txt"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/> <!-- remove common words -->
> <filter class="solr.GermanNormalizationFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="German"/>
> </analyzer>
>
> The Admin UI analysis page shows me that WDF no longer removes the
> underscore, but the query still returns 0 results.
>
> Output:
>
> <lst name="debug">
> <str name="rawquerystring">yh_cug</str>
> <str name="querystring">yh_cug</str>
> <str name="parsedquery">(+DisjunctionMaxQuery((tags:yh_cug^10.0 |
> links:yh_cug^5.0 | thema:yh_cug^15.0 | plain_text:yh_cug^10.0 |
> url:yh_cug^5.0 | h_*:yh_cug^14.0 | inhaltstyp:yh_cug^6.0 |
> breadcrumb:yh_cug^6.0 | contentmanager:yh_cug^5.0 | title:yh_cug^20.0 |
> editorschoice:yh_cug^200.0 | doctype:yh_cug^10.0))
> ((expiration:[1394619501862 TO *]
> (+MatchAllDocsQuery(*:*) -expiration:*))^6.0)
> FunctionQuery((div(int(clicks),max(int(displays),const(1))))^8.0))/no_coord</str>
> <str name="parsedquery_toString">+(tags:yh_cug^10.0 |
> links:yh_cug^5.0 |
> thema:yh_cug^15.0 | plain_text:yh_cug^10.0 | url:yh_cug^5.0 |
> h_*:yh_cug^14.0 | inhaltstyp:yh_cug^6.0 | breadcrumb:yh_cug^6.0 |
> contentmanager:yh_cug^5.0 | title:yh_cug^20.0 |
> editorschoice:yh_cug^200.0 |
> doctype:yh_cug^10.0) ((expiration:[1394619501862 TO *]
> (+*:* -expiration:*))^6.0)
> (div(int(clicks),max(int(displays),const(1))))^8.0</str>
> <lst name="explain"/>
> <arr name="expandedSynonyms">
> <str>yh_cug</str>
> </arr>
> <lst name="reasonForNotExpandingSynonyms">
> <str name="name">DidntFindAnySynonyms</str>
> <str name="explanation">No synonyms found for this query. Check
> your synonyms file.</str>
> </lst>
> <lst name="mainQueryParser">
> <str name="QParser">ExtendedDismaxQParser</str>
> <null name="altquerystring"/>
> <arr name="boost_queries">
> <str>(expiration:[NOW TO *] OR (*:* -expiration:*))^6</str>
> </arr>
> <arr name="parsed_boost_queries">
> <str>(expiration:[1394619501862 TO *]
> (+MatchAllDocsQuery(*:*) -expiration:*))^6.0</str>
> </arr>
> <arr name="boostfuncs">
> <str>div(clicks,max(displays,1))^8</str>
> </arr>
> </lst>
> <lst name="synonymQueryParser">
> <str name="QParser">ExtendedDismaxQParser</str>
> <null name="altquerystring"/>
> <arr name="boostfuncs">
> <str>div(clicks,max(displays,1))^8</str>
> </arr>
> </lst>
> <lst name="timing">
>
>
>
>
> -----Original Message-----
> From: Jack Krupansky [mailto:jack@basetechnology.com]
> Sent: Tuesday, 11 March 2014 14:25
> To: solr-user@lucene.apache.org
> Subject: Re: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> The usual use of an ngram filter is at index time and not at query time.
> What exactly are you trying to achieve by using ngram filtering at
> query time as well as index time?
>
> Generally, it is inappropriate to combine the word delimiter filter
> with the standard tokenizer - the latter removes the punctuation that
> normally influences how WDF treats the parts of a token. Use the white
> space tokenizer if you intend to use WDF.
>
> Which query parser are you using? What fields are being queried?
>
> Please post the parsed query string from the debug output - it will
> show the precise generated query.
>
> I think what you are seeing is that the ngram filter is generating
> tokens like "h_cugtest" and then the WDF is removing the underscore and then "h"
> gets generated as a separate token.
>
> -- Jack Krupansky
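The effect described above can be reproduced with a rough Python model (not Solr code): n-gram the raw token first with sizes 3 to 15, then split each gram on the underscore the way a word delimiter filter with catenateWords="1" roughly would. The splitting rule and all function names are assumptions for this sketch. The second function applies the two steps in the opposite order for comparison only; it is not meant as a recommended configuration.

```python
import re


def ngrams(token, min_gram, max_gram):
    # All character n-grams of `token` with min_gram <= length <= max_gram.
    return [token[start:start + size]
            for size in range(min_gram, max_gram + 1)
            for start in range(len(token) - size + 1)]


def split_on_delimiters(token):
    # Rough WDF stand-in: split on non-alphanumerics, also emit the joined parts.
    parts = [part for part in re.split(r"[^0-9a-z]+", token) if part]
    if len(parts) > 1:
        parts.append("".join(parts))
    return parts


def ngram_then_wdf(token, min_gram=3, max_gram=15):
    # Order used in the field type from this thread: n-gram first, then split.
    terms = set()
    for gram in ngrams(token.lower(), min_gram, max_gram):
        terms.update(split_on_delimiters(gram))
    return terms


def wdf_then_ngram(token, min_gram=3, max_gram=15):
    # Opposite order, for comparison: split first, then n-gram each part.
    terms = set()
    for part in split_on_delimiters(token.lower()):
        terms.update(ngrams(part, min_gram, max_gram))
    return terms


short_terms = sorted(t for t in ngram_then_wdf("yh_cugtest") if len(t) < 3)
print(short_terms)
```

Grams such as "h_c" are three characters long, but splitting them afterwards leaves pieces like "h" and "c". That is consistent with the title:h, title:c, title:hc and title:cu clauses in the explain output quoted in this thread.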
>
> -----Original Message-----
> From: Andreas Owen
> Sent: Tuesday, March 11, 2014 5:09 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> I got it right the first time, and here is my request handler. The field
> "plain_text" is searched correctly and has the same field type as
> "title" -> "text_de"
>
> <queryParser name="synonym_edismax"
> class="solr.SynonymExpandingExtendedDismaxQParserPlugin">
> <lst name="synonymAnalyzers">
> <lst name="myCoolAnalyzer">
> <lst name="tokenizer">
> <str name="class">standard</str>
> </lst>
> <lst name="filter">
> <str name="class">shingle</str>
> <str name="outputUnigramsIfNoShingles">true</str>
> <str name="outputUnigrams">true</str>
> <str name="minShingleSize">2</str>
> <str name="maxShingleSize">4</str>
> </lst>
> <lst name="filter">
> <str name="class">synonym</str>
> <str name="tokenizerFactory">solr.KeywordTokenizerFactory</str>
> <str name="synonyms">synonyms.txt</str>
> <str name="expand">true</str>
> <str name="ignoreCase">true</str>
> </lst>
> </lst>
> </lst>
> </queryParser>
>
> <requestHandler name="/select2" class="solr.SearchHandler">
> <lst name="defaults">
> <str name="echoParams">explicit</str>
> <int name="rows">10</int>
> <str name="defType">synonym_edismax</str>
> <str name="synonyms">true</str>
> <str name="qf">plain_text^10 editorschoice^200
> title^20 h_*^14
> tags^10 thema^15 inhaltstyp^6 breadcrumb^6 doctype^10
> contentmanager^5 links^5
> last_modified^5 url^5
> </str>
>
> <str name="fq">{!q.op=OR} (*:* -organisations:["" TO *] -roles:["" TO *]) (+organisations:($org) +roles:($r)) (-organisations:["" TO *] +roles:($r)) (+organisations:($org) -roles:["" TO *])</str>
> <str name="bq">(expiration:[NOW TO *] OR (*:* -expiration:*))^6</str>
> <!-- tested: now or newer or empty gets small boost -->
> <str name="bf">div(clicks,max(displays,1))^8</str> <!-- tested -->
>
> <str name="df">text</str>
> <str name="fl">*,path,score</str>
> <str name="wt">json</str>
> <str name="q.op">AND</str>
>
> <!-- Highlighting defaults -->
> <str name="hl">on</str>
> <str name="hl.fl">plain_text,title</str>
> <str name="hl.fragSize">200</str>
> <str name="hl.simple.pre">&lt;b&gt;</str>
> <str name="hl.simple.post">&lt;/b&gt;</str>
>
> <!-- <lst name="invariants"> -->
> <str name="facet">on</str>
> <str name="facet.mincount">1</str>
> <str name="facet.field">{!ex=inhaltstyp_s}inhaltstyp_s</str>
> <str name="f.inhaltstyp_s.facet.sort">index</str>
> <str name="facet.field">{!ex=doctype}doctype</str>
> <str name="f.doctype.facet.sort">index</str>
> <str name="facet.field">{!ex=thema_f}thema_f</str>
> <str name="f.thema_f.facet.sort">index</str>
> <str name="facet.field">{!ex=author_s}author_s</str>
> <str name="f.author_s.facet.sort">index</str>
> <str name="facet.field">{!ex=sachverstaendiger_s}sachverstaendiger_s</str>
> <str name="f.sachverstaendiger_s.facet.sort">index</str>
> <str name="facet.field">{!ex=veranstaltung_s}veranstaltung_s</str>
> <str name="f.veranstaltung_s.facet.sort">index</str>
> <str name="facet.date">{!ex=last_modified}last_modified</str>
> <str name="facet.date.gap">+1MONTH</str>
> <str name="facet.date.end">NOW/MONTH+1MONTH</str>
> <str name="facet.date.start">NOW/MONTH-36MONTHS</str>
> <str name="facet.date.other">after</str>
>
> </lst>
> </requestHandler>
>
>
>
> I have a field with the following type:
>
> <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
> <analyzer>
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/> <!-- remove common words -->
> <filter class="solr.GermanNormalizationFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="German"/>
> <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> </analyzer>
> </fieldType>
>
>
> Shouldn't this produce tokens from 3 to 15 characters in length, and not from 1?
> Here is a query report for 2 results:
>
> > <lst name="responseHeader"> <int name="status">0</int> <int
> > name="QTime">125</int> <lst name="params"> <str
> > name="debugQuery">true</str> <str
> > name="fl">title,roles,organisations,id</str> <str
> > name="indent">true</str> <str name="q">yh_cugtest</str> <str
> > name="_">1394522589347</str> <str name="wt">xml</str> <str
> > name="fq">organisations:* roles:*</str> </lst></lst> <result
> > name="response" numFound="5" start="0">
> > ..........
> > <str name="dms:2681">
> > 1.6365329 = (MATCH) sum of: 1.6346203 = (MATCH) max of:
> > 0.14759353 = (MATCH) product of: 0.28596246 = (MATCH) sum of:
> > 0.01528686 = (MATCH) weight(plain_text:cug in 0)
> > [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of: 0.035319194 = queryWeight, product of:
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.0063751927 =
> > queryNorm 0.43282017 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.0119499 = (MATCH) weight(plain_text:ugt
> > in
> > 0) [DefaultSimilarity], result of: 0.0119499 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ),
> product of: 0.031227252 = queryWeight, product of:
> > 4.8982444 = idf(docFreq=18, maxDocs=937) 0.0063751927
> > = queryNorm 0.38267535 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 4.8982444 = idf(docFreq=18, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:yhc
> > in 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.078125 = fieldNorm(doc=0)
> 0.019351374 = (MATCH)
> > weight(plain_text:hcu in 0) [DefaultSimilarity], result of:
> > 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.01528686 = (MATCH) weight(plain_text:cug
> > in
> > 0) [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 =
> queryNorm 0.43282017 = fieldWeight in 0, product of:
> 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in
> > 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4,
> > maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:yhcu in 0)
> > [DefaultSimilarity], result of:
> 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> 0.03973814 =
> > queryWeight, product of: 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.01528686 = (MATCH) weight(plain_text:cug
> > in
> > 0) [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 = queryNorm
> > 0.43282017 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> 5.540098 = idf(docFreq=9, maxDocs=937)
> 0.078125 =
> fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:hcug in 0)
> > [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937)
> 0.0063751927 = queryNorm 0.4869723 = fieldWeight in
> 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 =
> > termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.078125 = fieldNorm(doc=0) 0.01528686 = (MATCH)
> > weight(plain_text:cug in 0) [DefaultSimilarity], result of:
> > 0.01528686 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 = queryNorm
> > 0.43282017 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:yhcug in 0) [DefaultSimilarity], result
> of: 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ),
> product
> of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 =
> termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:hcugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> 0.4869723 =
> > fieldWeight in 0, product of: 1.0 = tf(freq=1.0), with
> > freq of: 1.0 = termFreq=1.0 6.2332454
> > = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:yhcugt in 0)
> > [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.516129 = coord(16/31) 1.6346203 =
> (MATCH) sum of: 0.08372684 = (MATCH) weight(title:yh in 0)
> [DefaultSimilarity], result of:
> > 0.08372684 = score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0),
> with freq of: 13.0 = termFreq=13.0 6.2332454
> =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.08993706 = (MATCH) weight(title:c in 0) [DefaultSimilarity],
> > result
> > of: 0.08993706 = score(doc=0,freq=15.0 = termFreq=15.0 ),
> > product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 1.1316214 = fieldWeight in 0, product of:
> > 3.8729835 = tf(freq=15.0), with freq of: 15.0 =
> > termFreq=15.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.023221647 = (MATCH)
> > weight(title:hc in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.040221076 = (MATCH) weight(title:cu in 0)
> > [DefaultSimilarity], result of: 0.040221076 =
> > score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cug in 0)
> > [DefaultSimilarity], result
> of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:ugt in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> 0.08372684 = (MATCH)
> > weight(title:yh in 0) [DefaultSimilarity], result of:
> > 0.08372684 = score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08993706 = (MATCH) weight(title:c in 0)
> > [DefaultSimilarity], result of: 0.08993706 =
> > score(doc=0,freq=15.0 = termFreq=15.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.1316214 = fieldWeight
> in 0, product of: 3.8729835 = tf(freq=15.0), with freq of:
> > 15.0 = termFreq=15.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.023221647 = (MATCH) weight(title:yhc in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 =
> queryWeight, product of: 6.2332454 = idf(docFreq=4,
> maxDocs=937)
> > 0.012750385 = queryNorm 1.053482 = fieldWeight in 0,
> > product
> > of: 3.6055512 = tf(freq=13.0), with freq of:
> > 13.0 = termFreq=13.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.040221076 = (MATCH) weight(title:cu in 0) [DefaultSimilarity], result of:
> > 0.040221076 = score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH)
> weight(title:hcu in 0) [DefaultSimilarity], result of:
> 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cug in 0) [DefaultSimilarity],
> > result
> > of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ),
> > product
> > of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of:
> 4.0 = termFreq=4.0 6.2332454 = idf(docFreq=4,
> maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.046443295 = (MATCH)
> > weight(title:cugt in 0) [DefaultSimilarity], result of:
> > 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:yh in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937)
> 0.012750385 = queryNorm 1.053482 = fieldWeight in 0, product of:
> > 3.6055512 = tf(freq=13.0), with freq of: 13.0 =
> > termFreq=13.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.040221076 = (MATCH)
> > weight(title:cu in 0) [DefaultSimilarity], result of:
> > 0.040221076 = score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:yhcu in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 =
> termFreq=1.0 ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385
> > = queryNorm 0.29218337 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875
> = fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cug in 0)
> [DefaultSimilarity],
> > result of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0
> > ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 0.58436674 = fieldWeight in 0, product of:
> > 2.0 = tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:hcug in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cugt in 0) [DefaultSimilarity],
> > result of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0
> > ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 0.58436674 = fieldWeight in 0, product of:
> > 2.0 = tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:yh in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> 1.053482 =
> > fieldWeight in 0, product of: 3.6055512 = tf(freq=13.0),
> > with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cug in 0) [DefaultSimilarity],
> > result
> > of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ),
> > product
> > of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.023221647 = (MATCH) weight(title:yhcug in 0) [DefaultSimilarity],
> > result
> > of:
> 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> 0.046443295 = (MATCH)
> > weight(title:cugt in 0) [DefaultSimilarity], result of:
> > 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:hcugt in
> > 0) [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight
> in 0, product of: 1.0 = tf(freq=1.0), with freq of:
> 1.0
> > = termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.08372684 = (MATCH)
> > weight(title:yh in 0) [DefaultSimilarity], result of:
> > 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cugt in 0)
> > [DefaultSimilarity], result of: 0.046443295 =
> > score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 =
> queryWeight, product of: 6.2332454 = idf(docFreq=4,
> maxDocs=937)
> > 0.012750385 = queryNorm 0.58436674 = fieldWeight in 0,
> > product of: 2.0 = tf(freq=4.0), with freq of:
> > 4.0 = termFreq=4.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.023221647 = (MATCH)
> > weight(title:yhcugt in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.0019125579 = (MATCH) product of:
> 0.0038251157 = (MATCH) sum of: 0.0038251157 = (MATCH) sum of:
> 0.0038251157 =
> > (MATCH) MatchAllDocsQuery, product of: 0.0038251157 =
> > queryNorm 0.5 = coord(1/2) 0.0 = (MATCH)
> > FunctionQuery(div(int(clicks),max(int(displays),const(1)))), product
> > of: 0.0 = div(int(clicks)=0,max(int(displays)=0,const(1)))
> > 8.0 = boost 6.375193E-4 = queryNorm </str>
> >
> >
> > <str name="dms:216">0.00420471 = (MATCH) sum of: 0.0022921523
> > =
> > (MATCH) max of: 0.0022921523 = (MATCH) product of:
> > 0.03552836 =
> > (MATCH) sum of: 0.01776418 = (MATCH) weight(plain_text:c in
> > 699) [DefaultSimilarity], result of: 0.01776418 =
> > score(doc=699,freq=10.0 = termFreq=10.0 ), product of:
> > 0.025590291 = queryWeight, product of: 4.014042 =
> > idf(docFreq=45, maxDocs=937) 0.0063751927 = queryNorm
> > 0.6941766 = fieldWeight in 699, product of: 3.1622777
> > = tf(freq=10.0), with freq of: 10.0 = termFreq=10.0
> > 4.014042 = idf(docFreq=45, maxDocs=937) 0.0546875 =
> > fieldNorm(doc=699) 0.01776418 = (MATCH) weight(plain_text:c
> > in
> > 699) [DefaultSimilarity], result of:
> 0.01776418 = score(doc=699,freq=10.0 = termFreq=10.0 ), product of:
> > 0.025590291 = queryWeight, product of: 4.014042 =
> > idf(docFreq=45, maxDocs=937) 0.0063751927 = queryNorm
> > 0.6941766 = fieldWeight in 699, product of: 3.1622777
> > = tf(freq=10.0), with freq of: 10.0 = termFreq=10.0
> > 4.014042 = idf(docFreq=45, maxDocs=937) 0.0546875 =
> > fieldNorm(doc=699) 0.06451613 = coord(2/31) 0.0019125579 =
> > (MATCH) product of: 0.0038251157 = (MATCH) sum of:
> > 0.0038251157 =
> > (MATCH) sum of: 0.0038251157 = (MATCH) MatchAllDocsQuery,
> > product
> > of: 0.0038251157 = queryNorm 0.5 = coord(1/2) 0.0 =
> > (MATCH) FunctionQuery(div(int(clicks),max(int(displays),const(1)))),
> > product of: 0.0 =
> > div(int(clicks)=0,max(int(displays)=152,const(1)))
> > 8.0 = boost 6.375193E-4 =
> queryNorm
RE: Re[2]: NOT SOLVED searches for single char tokens instead of from 3 uppwards
Posted by Andreas Owen <ao...@conx.ch>.
Hi Jack,
Do you know how I can use local parameters in my solrconfig? The params are visible in the debugQuery output, but Solr doesn't parse them.
<lst name="invariants">
<str name="fq">{!q.op=OR} (*:* -organisations:["" TO *] -roles:["" TO *]) (+organisations:($org) +roles:($r)) (-organisations:["" TO *] +roles:($r)) (+organisations:($org) -roles:["" TO *])</str>
</lst>
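A possible direction, offered as an untested sketch rather than a confirmed fix: as far as I know, Solr only dereferences `$param` when it appears as a local-parameter value (for example `v=$org`), not when it appears inside the query text itself, which would explain why `organisations:($org)` is shown but never substituted. Under that assumption, each reference can be wrapped in a nested query. The request parameters `org` and `r` are taken from the original filter and must still be supplied by the client.

```xml
<lst name="invariants">
  <!-- Sketch only: $org and $r are resolved because they are local-param values (v=...). -->
  <!-- Assumes the client always sends org and r; behaviour when they are missing is not covered here. -->
  <str name="fq">{!lucene q.op=OR}(*:* -organisations:["" TO *] -roles:["" TO *]) (+_query_:"{!lucene df=organisations v=$org}" +_query_:"{!lucene df=roles v=$r}") (-organisations:["" TO *] +_query_:"{!lucene df=roles v=$r}") (+_query_:"{!lucene df=organisations v=$org}" -roles:["" TO *])</str>
</lst>
```

The parsed filter in the debugQuery output should then show the actual organisation and role values instead of the literal `$org` and `$r`.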
-----Original Message-----
From: Andreas Owen [mailto:ao@conx.ch]
Sent: Wednesday, 12 March 2014 14:44
To: solr-user@lucene.apache.org
Subject: Re[2]: NOT SOLVED searches for single char tokens instead of from 3 uppwards
Yes, that is exactly what happened in the analyzer. The term I searched for was listed on both sides (index & query).
Here's the rest:
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<!-- Case insensitive stop word removal.
enablePositionIncrements=true ensures that a 'gap' is left to
allow for accurate phrase queries.
-->
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
-----Original Message-----
> From: "Jack Krupansky" <ja...@basetechnology.com>
> To: solr-user@lucene.apache.org
> Date: 12/03/2014 13:25
> Subject: Re: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> You didn't show the new index analyzer - it's tricky to assure that
> index and query are compatible, but the Admin UI Analysis page can help.
>
> Generally, using pure defaults for WDF is not what you want,
> especially for query time. Usually there needs to be a slight
> asymmetry between index and query for WDF - index generates more terms than query.
>
> -- Jack Krupansky
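To make the index/query asymmetry concrete, here is a minimal sketch of a field type in which the index-side WDF also emits catenated forms while the query side emits only the parts. The name `text_wdf_sketch` is hypothetical, and the attribute values follow the index analyzer quoted earlier in this thread; they are a starting point to check on the Analysis page, not a verified fix for this index.

```xml
<fieldType name="text_wdf_sketch" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- index side: word/number parts plus the catenated form -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- query side: parts only, no catenation, so fewer terms than at index time -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Changing an index analyzer only affects documents indexed afterwards, so a reindex is needed before comparing results.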
>
> -----Original Message-----
> From: Andreas Owen
> Sent: Wednesday, March 12, 2014 6:20 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> I now have the following:
>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.WordDelimiterFilterFactory" types="at-under-alpha.txt"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <!-- remove common words -->
> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/>
> <filter class="solr.GermanNormalizationFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="German"/>
> </analyzer>
>
> The Analysis page in the admin GUI shows that WDF no longer splits on
> the underscore, but the query still returns 0 results.
>
> Output:
>
> <lst name="debug">
> <str name="rawquerystring">yh_cug</str>
> <str name="querystring">yh_cug</str>
> <str name="parsedquery">(+DisjunctionMaxQuery((tags:yh_cug^10.0 |
> links:yh_cug^5.0 | thema:yh_cug^15.0 | plain_text:yh_cug^10.0 |
> url:yh_cug^5.0 | h_*:yh_cug^14.0 | inhaltstyp:yh_cug^6.0 |
> breadcrumb:yh_cug^6.0 | contentmanager:yh_cug^5.0 | title:yh_cug^20.0
> |
> editorschoice:yh_cug^200.0 | doctype:yh_cug^10.0))
> ((expiration:[1394619501862 TO *]
> (+MatchAllDocsQuery(*:*) -expiration:*))^6.0)
> FunctionQuery((div(int(clicks),max(int(displays),const(1))))^8.0))/no_coord</str>
> <str name="parsedquery_toString">+(tags:yh_cug^10.0 |
> links:yh_cug^5.0 |
> thema:yh_cug^15.0 | plain_text:yh_cug^10.0 | url:yh_cug^5.0 |
> h_*:yh_cug^14.0 | inhaltstyp:yh_cug^6.0 | breadcrumb:yh_cug^6.0 |
> contentmanager:yh_cug^5.0 | title:yh_cug^20.0 |
> editorschoice:yh_cug^200.0 |
> doctype:yh_cug^10.0) ((expiration:[1394619501862 TO *]
> (+*:* -expiration:*))^6.0)
> (div(int(clicks),max(int(displays),const(1))))^8.0</str>
> <lst name="explain"/>
> <arr name="expandedSynonyms">
> <str>yh_cug</str>
> </arr>
> <lst name="reasonForNotExpandingSynonyms">
> <str name="name">DidntFindAnySynonyms</str>
> <str name="explanation">No synonyms found for this query. Check
> your synonyms file.</str>
> </lst>
> <lst name="mainQueryParser">
> <str name="QParser">ExtendedDismaxQParser</str>
> <null name="altquerystring"/>
> <arr name="boost_queries">
> <str>(expiration:[NOW TO *] OR (*:* -expiration:*))^6</str>
> </arr>
> <arr name="parsed_boost_queries">
> <str>(expiration:[1394619501862 TO *]
> (+MatchAllDocsQuery(*:*) -expiration:*))^6.0</str>
> </arr>
> <arr name="boostfuncs">
> <str>div(clicks,max(displays,1))^8</str>
> </arr>
> </lst>
> <lst name="synonymQueryParser">
> <str name="QParser">ExtendedDismaxQParser</str>
> <null name="altquerystring"/>
> <arr name="boostfuncs">
> <str>div(clicks,max(displays,1))^8</str>
> </arr>
> </lst>
> <lst name="timing">
>
>
>
>
> -----Original Message-----
> From: Jack Krupansky [mailto:jack@basetechnology.com]
> Sent: Tuesday, 11 March 2014 14:25
> To: solr-user@lucene.apache.org
> Subject: Re: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> The usual use of an ngram filter is at index time and not at query time.
> What exactly are you trying to achieve by using ngram filtering at
> query time as well as index time?
>
> Generally, it is inappropriate to combine the word delimiter filter
> with the standard tokenizer - the latter removes the punctuation that
> normally influences how WDF treats the parts of a token. Use the white
> space tokenizer if you intend to use WDF.
>
> Which query parser are you using? What fields are being queried?
>
> Please post the parsed query string from the debug output - it will
> show the precise generated query.
>
> I think what you are seeing is that the ngram filter is generating
> tokens like "h_cugtest" and then the WDF is removing the underscore and then "h"
> gets generated as a separate token.
>
> -- Jack Krupansky
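The advice above (whitespace tokenizer with WDF, n-grams at index time only) could be sketched as the field type below. The name `text_de_ngram_sketch` is hypothetical, and the German stop word, normalization and stemming filters from the original `text_de` type are left out to keep the example short. WDF is placed before the n-gram filter so that it sees whole tokens such as `yh_cugtest` rather than grams such as `h_cugtest`.

```xml
<fieldType name="text_de_ngram_sketch" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- split on delimiters first, while tokens are still whole -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- n-grams are generated at index time only -->
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- no n-gram filter here: the query term is matched against the indexed grams -->
  </analyzer>
</fieldType>
```

How query parts shorter than `minGramSize` (for example `yh`) behave against such an index is worth checking on the Admin UI Analysis page, since this sketch does not address it.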
>
> -----Original Message-----
> From: Andreas Owen
> Sent: Tuesday, March 11, 2014 5:09 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NOT SOLVED searches for single char tokens instead of
> from 3 uppwards
>
> I got it right the first time, and here is my request handler. The field
> "plain_text" is searched correctly and has the same field type as
> "title" -> "text_de"
>
> <queryParser name="synonym_edismax"
> class="solr.SynonymExpandingExtendedDismaxQParserPlugin">
> <lst name="synonymAnalyzers">
> <lst name="myCoolAnalyzer">
> <lst name="tokenizer">
> <str name="class">standard</str>
> </lst>
> <lst name="filter">
> <str name="class">shingle</str>
> <str name="outputUnigramsIfNoShingles">true</str>
> <str name="outputUnigrams">true</str>
> <str name="minShingleSize">2</str>
> <str name="maxShingleSize">4</str>
> </lst>
> <lst name="filter">
> <str name="class">synonym</str>
> <str name="tokenizerFactory">solr.KeywordTokenizerFactory</str>
> <str name="synonyms">synonyms.txt</str>
> <str name="expand">true</str>
> <str name="ignoreCase">true</str>
> </lst>
> </lst>
> </lst>
> </queryParser>
>
> <requestHandler name="/select2" class="solr.SearchHandler">
> <lst name="defaults">
> <str name="echoParams">explicit</str>
> <int name="rows">10</int>
> <str name="defType">synonym_edismax</str>
> <str name="synonyms">true</str>
> <str name="qf">plain_text^10 editorschoice^200
> title^20 h_*^14
> tags^10 thema^15 inhaltstyp^6 breadcrumb^6 doctype^10
> contentmanager^5 links^5
> last_modified^5 url^5
> </str>
>
> <str name="fq">{!q.op=OR} (*:* -organisations:["" TO *] -roles:["" TO
> *])
> (+organisations:($org) +roles:($r)) (-organisations:["" TO *]
> +roles:($r))
> (+organisations:($org) -roles:["" TO *])</str>
> <str name="bq">(expiration:[NOW TO *] OR (*:*
> -expiration:*))^6</str>
> <!-- tested: now or newer or empty gets small boost -->
> <str name="bf">div(clicks,max(displays,1))^8</str> <!-- tested -->
>
> <str name="df">text</str>
> <str name="fl">*,path,score</str>
> <str name="wt">json</str>
> <str name="q.op">AND</str>
>
> <!-- Highlighting defaults -->
> <str name="hl">on</str>
> <str name="hl.fl">plain_text,title</str>
> <str name="hl.fragSize">200</str>
> <str name="hl.simple.pre">&lt;b&gt;</str>
> <str name="hl.simple.post">&lt;/b&gt;</str>
>
> <!-- <lst name="invariants"> -->
> <str name="facet">on</str>
> <str name="facet.mincount">1</str>
> <str name="facet.field">{!ex=inhaltstyp_s}inhaltstyp_s</str>
> <str name="f.inhaltstyp_s.facet.sort">index</str>
> <str name="facet.field">{!ex=doctype}doctype</str>
> <str name="f.doctype.facet.sort">index</str>
> <str name="facet.field">{!ex=thema_f}thema_f</str>
> <str name="f.thema_f.facet.sort">index</str>
> <str name="facet.field">{!ex=author_s}author_s</str>
> <str name="f.author_s.facet.sort">index</str>
> <str
> name="facet.field">{!ex=sachverstaendiger_s}sachverstaendiger_s</str>
> <str name="f.sachverstaendiger_s.facet.sort">index</str>
> <str name="facet.field">{!ex=veranstaltung_s}veranstaltung_s</str>
> <str name="f.veranstaltung_s.facet.sort">index</str>
> <str name="facet.date">{!ex=last_modified}last_modified</str>
> <str name="facet.date.gap">+1MONTH</str>
> <str name="facet.date.end">NOW/MONTH+1MONTH</str>
> <str name="facet.date.start">NOW/MONTH-36MONTHS</str>
> <str name="facet.date.other">after</str>
>
> </lst>
> </requestHandler>
>
>
>
> I have a field with the following type:
>
> <fieldType name="text_de" class="solr.TextField"
> positionIncrementGap="100">
> <analyzer>
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="lang/stopwords_de.txt" format="snowball"
> enablePositionIncrements="true"/> <!-- remove common words -->
> <filter
> class="solr.GermanNormalizationFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory"
> language="German"/>
> <filter class="solr.NGramFilterFactory" minGramSize="3"
> maxGramSize="15"/>
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> </analyzer>
> </fieldType>
>
>
> Shouldn't this produce tokens from 3 to 15 characters long, and not from 1?
> Here is a query report of 2 results:
>
> > <lst name="responseHeader"> <int name="status">0</int> <int
> > name="QTime">125</int> <lst name="params"> <str
> > name="debugQuery">true</str> <str
> > name="fl">title,roles,organisations,id</str> <str
> > name="indent">true</str> <str name="q">yh_cugtest</str> <str
> > name="_">1394522589347</str> <str name="wt">xml</str> <str
> > name="fq">organisations:* roles:*</str> </lst></lst> <result
> > name="response" numFound="5" start="0">
> > ..........
> > <str name="dms:2681">
> > 1.6365329 = (MATCH) sum of: 1.6346203 = (MATCH) max of:
> > 0.14759353 = (MATCH) product of: 0.28596246 = (MATCH) sum of:
> > 0.01528686 = (MATCH) weight(plain_text:cug in 0)
> > [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of: 0.035319194 = queryWeight, product of:
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.0063751927 =
> > queryNorm 0.43282017 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.0119499 = (MATCH) weight(plain_text:ugt
> > in
> > 0) [DefaultSimilarity], result of: 0.0119499 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ),
> product of: 0.031227252 = queryWeight, product of:
> > 4.8982444 = idf(docFreq=18, maxDocs=937) 0.0063751927
> > = queryNorm 0.38267535 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 4.8982444 = idf(docFreq=18, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH) weight(plain_text:yhc
> > in 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.078125 = fieldNorm(doc=0)
> 0.019351374 = (MATCH)
> > weight(plain_text:hcu in 0) [DefaultSimilarity], result of:
> > 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.01528686 = (MATCH) weight(plain_text:cug
> > in
> > 0) [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 =
> queryNorm 0.43282017 = fieldWeight in 0, product of:
> 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in
> > 0) [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4,
> > maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:yhcu in 0)
> > [DefaultSimilarity], result of:
> 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> 0.03973814 =
> > queryWeight, product of: 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.01528686 = (MATCH) weight(plain_text:cug
> > in
> > 0) [DefaultSimilarity], result of: 0.01528686 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 = queryNorm
> > 0.43282017 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> 5.540098 = idf(docFreq=9, maxDocs=937)
> 0.078125 =
> fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:hcug in 0)
> > [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937)
> 0.0063751927 = queryNorm 0.4869723 = fieldWeight in
> 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 =
> > termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.078125 = fieldNorm(doc=0) 0.01528686 = (MATCH)
> > weight(plain_text:cug in 0) [DefaultSimilarity], result of:
> > 0.01528686 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.035319194 = queryWeight, product of: 5.540098 =
> > idf(docFreq=9, maxDocs=937) 0.0063751927 = queryNorm
> > 0.43282017 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 5.540098 = idf(docFreq=9, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:yhcug in 0) [DefaultSimilarity], result
> of: 0.019351374 = score(doc=0,freq=1.0 = termFreq=1.0 ),
> product
> of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 =
> termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:hcugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.019351374 = (MATCH)
> > weight(plain_text:cugt in 0) [DefaultSimilarity], result of:
> > 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> 0.4869723 =
> > fieldWeight in 0, product of: 1.0 = tf(freq=1.0), with
> > freq of: 1.0 = termFreq=1.0 6.2332454
> > = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0)
> > 0.019351374 = (MATCH) weight(plain_text:yhcugt in 0)
> > [DefaultSimilarity], result of: 0.019351374 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.03973814 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.0063751927 = queryNorm
> > 0.4869723 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.078125 =
> > fieldNorm(doc=0) 0.516129 = coord(16/31) 1.6346203 =
> (MATCH) sum of: 0.08372684 = (MATCH) weight(title:yh in 0)
> [DefaultSimilarity], result of:
> > 0.08372684 = score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0),
> with freq of: 13.0 = termFreq=13.0 6.2332454
> =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.08993706 = (MATCH) weight(title:c in 0) [DefaultSimilarity],
> > result
> > of: 0.08993706 = score(doc=0,freq=15.0 = termFreq=15.0 ),
> > product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 1.1316214 = fieldWeight in 0, product of:
> > 3.8729835 = tf(freq=15.0), with freq of: 15.0 =
> > termFreq=15.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.023221647 = (MATCH)
> > weight(title:hc in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.040221076 = (MATCH) weight(title:cu in 0)
> > [DefaultSimilarity], result of: 0.040221076 =
> > score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cug in 0)
> > [DefaultSimilarity], result
> of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:ugt in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 =
> idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> 0.08372684 = (MATCH)
> > weight(title:yh in 0) [DefaultSimilarity], result of:
> > 0.08372684 = score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08993706 = (MATCH) weight(title:c in 0)
> > [DefaultSimilarity], result of: 0.08993706 =
> > score(doc=0,freq=15.0 = termFreq=15.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.1316214 = fieldWeight
> in 0, product of: 3.8729835 = tf(freq=15.0), with freq of:
> > 15.0 = termFreq=15.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.023221647 = (MATCH) weight(title:yhc in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 =
> queryWeight, product of: 6.2332454 = idf(docFreq=4,
> maxDocs=937)
> > 0.012750385 = queryNorm 1.053482 = fieldWeight in 0,
> > product
> > of: 3.6055512 = tf(freq=13.0), with freq of:
> > 13.0 = termFreq=13.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.040221076 = (MATCH) weight(title:cu in 0) [DefaultSimilarity], result of:
> > 0.040221076 = score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH)
> weight(title:hcu in 0) [DefaultSimilarity], result of:
> 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cug in 0) [DefaultSimilarity],
> > result
> > of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ),
> > product
> > of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of:
> 4.0 = termFreq=4.0 6.2332454 = idf(docFreq=4,
> maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.046443295 = (MATCH)
> > weight(title:cugt in 0) [DefaultSimilarity], result of:
> > 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:yh in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937)
> 0.012750385 = queryNorm 1.053482 = fieldWeight in 0, product of:
> > 3.6055512 = tf(freq=13.0), with freq of: 13.0 =
> > termFreq=13.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.040221076 = (MATCH)
> > weight(title:cu in 0) [DefaultSimilarity], result of:
> > 0.040221076 = score(doc=0,freq=3.0 = termFreq=3.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.50607646 = fieldWeight in 0, product of: 1.7320508 =
> > tf(freq=3.0), with freq of: 3.0 = termFreq=3.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:yhcu in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 =
> termFreq=1.0 ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385
> > = queryNorm 0.29218337 = fieldWeight in 0, product of:
> > 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875
> > = fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cug in 0)
> > [DefaultSimilarity],
> > result of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0
> > ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 0.58436674 = fieldWeight in 0, product of:
> > 2.0 = tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:hcug in 0)
> > [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cugt in 0) [DefaultSimilarity],
> > result of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0
> > ), product of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm 0.58436674 = fieldWeight in 0, product of:
> > 2.0 = tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:yh in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 =
> > fieldWeight in 0, product of: 3.6055512 = tf(freq=13.0),
> > with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.046443295 = (MATCH) weight(title:cug in 0) [DefaultSimilarity],
> > result
> > of: 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ),
> > product
> > of: 0.07947628 = queryWeight, product of:
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.012750385 =
> > queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.023221647 = (MATCH) weight(title:yhcug in 0) [DefaultSimilarity],
> > result
> > of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.08372684 = (MATCH) weight(title:h in 0)
> > [DefaultSimilarity], result of: 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.046875 = fieldNorm(doc=0)
> > 0.046443295 = (MATCH)
> > weight(title:cugt in 0) [DefaultSimilarity], result of:
> > 0.046443295 = score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.58436674 = fieldWeight in 0, product of: 2.0 =
> > tf(freq=4.0), with freq of: 4.0 = termFreq=4.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.023221647 = (MATCH) weight(title:hcugt in
> > 0) [DefaultSimilarity], result of: 0.023221647 =
> > score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight
> > in 0, product of: 1.0 = tf(freq=1.0), with freq of:
> > 1.0
> > = termFreq=1.0 6.2332454 = idf(docFreq=4, maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.08372684 = (MATCH)
> > weight(title:yh in 0) [DefaultSimilarity], result of:
> > 0.08372684 =
> > score(doc=0,freq=13.0 = termFreq=13.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 1.053482 = fieldWeight in 0, product of: 3.6055512 =
> > tf(freq=13.0), with freq of: 13.0 = termFreq=13.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0) 0.046443295 = (MATCH) weight(title:cugt in 0)
> > [DefaultSimilarity], result of: 0.046443295 =
> > score(doc=0,freq=4.0 = termFreq=4.0 ), product of:
> > 0.07947628 =
> > queryWeight, product of: 6.2332454 = idf(docFreq=4,
> > maxDocs=937)
> > 0.012750385 = queryNorm 0.58436674 = fieldWeight in 0,
> > product of: 2.0 = tf(freq=4.0), with freq of:
> > 4.0 = termFreq=4.0 6.2332454 = idf(docFreq=4,
> > maxDocs=937)
> > 0.046875 = fieldNorm(doc=0) 0.023221647 = (MATCH)
> > weight(title:yhcugt in 0) [DefaultSimilarity], result of:
> > 0.023221647 = score(doc=0,freq=1.0 = termFreq=1.0 ), product of:
> > 0.07947628 = queryWeight, product of: 6.2332454 =
> > idf(docFreq=4, maxDocs=937) 0.012750385 = queryNorm
> > 0.29218337 = fieldWeight in 0, product of: 1.0 =
> > tf(freq=1.0), with freq of: 1.0 = termFreq=1.0
> > 6.2332454 = idf(docFreq=4, maxDocs=937) 0.046875 =
> > fieldNorm(doc=0)
> > 0.0019125579 = (MATCH) product of:
> > 0.0038251157 = (MATCH) sum of: 0.0038251157 = (MATCH) sum of:
> > 0.0038251157 =
> > (MATCH) MatchAllDocsQuery, product of: 0.0038251157 =
> > queryNorm 0.5 = coord(1/2) 0.0 = (MATCH)
> > FunctionQuery(div(int(clicks),max(int(displays),const(1)))), product
> > of: 0.0 = div(int(clicks)=0,max(int(displays)=0,const(1)))
> > 8.0 = boost 6.375193E-4 = queryNorm </str>
> >
> >
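For anyone trying to read the explain output above: each per-term line is just a product of a few factors. Below is a minimal sketch in Python, assuming the classic TF-IDF formulas that Lucene's DefaultSimilarity uses (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are made up for illustration and are not Solr or Lucene APIs; the input numbers are copied from the title:cugt entry in the quoted output.

```python
import math


def idf(doc_freq, max_docs):
    # Classic Lucene TF-IDF: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Classic Lucene TF-IDF: tf = sqrt(termFreq)
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, field_norm, query_norm):
    # Hypothetical helper: rebuilds one "weight(field:term in doc)" entry.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm             # "queryWeight, product of"
    field_weight = tf(freq) * term_idf * field_norm  # "fieldWeight, product of"
    return query_weight * field_weight


# Values taken from the weight(title:cugt in 0) entry of the explain output.
score = term_score(freq=4.0, doc_freq=4, max_docs=937,
                   field_norm=0.046875, query_norm=0.012750385)
print(score)  # should agree with the reported 0.046443295 to about 6 decimals
```

The thing to notice is that idf is identical (6.2332454) for every title term in the dump, single-character ones like title:h included, so the per-term scores differ only through termFreq. Python works in double precision while Lucene uses floats, so the last digits may differ slightly.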
> > <str name="dms:216">0.00420471 = (MATCH) sum of: 0.0022921523
> > =
> > (MATCH) max of: 0.0022921523 = (MATCH) product of:
> > 0.03552836 =
> > (MATCH) sum of: 0.01776418 = (MATCH) weight(plain_text:c in
> > 699) [DefaultSimilarity], result of: 0.01776418 =
> > score(doc=699,freq=10.0 = termFreq=10.0 ), product of:
> > 0.025590291 = queryWeight, product of: 4.014042 =
> > idf(docFreq=45, maxDocs=937) 0.0063751927 = queryNorm
> > 0.6941766 = fieldWeight in 699, product of: 3.1622777
> > = tf(freq=10.0), with freq of: 10.0 = termFreq=10.0
> > 4.014042 = idf(docFreq=45, maxDocs=937) 0.0546875 =
> > fieldNorm(doc=699) 0.01776418 = (MATCH) weight(plain_text:c
> > in
> > 699) [DefaultSimilarity], result of:
> > 0.01776418 = score(doc=699,freq=10.0 = termFreq=10.0 ), product of:
> > 0.025590291 = queryWeight, product of: 4.014042 =
> > idf(docFreq=45, maxDocs=937) 0.0063751927 = queryNorm
> > 0.6941766 = fieldWeight in 699, product of: 3.1622777
> > = tf(freq=10.0), with freq of: 10.0 = termFreq=10.0
> > 4.014042 = idf(docFreq=45, maxDocs=937) 0.0546875 =
> > fieldNorm(doc=699) 0.06451613 = coord(2/31) 0.0019125579 =
> > (MATCH) product of: 0.0038251157 = (MATCH) sum of:
> > 0.0038251157 =
> > (MATCH) sum of: 0.0038251157 = (MATCH) MatchAllDocsQuery,
> > product
> > of: 0.0038251157 = queryNorm 0.5 = coord(1/2) 0.0 =
> > (MATCH) FunctionQuery(div(int(clicks),max(int(displays),const(1)))),
> > product of: 0.0 =
> > div(int(clicks)=0,max(int(displays)=152,const(1)))
> > 8.0 = boost 6.375193E-4 =
> > queryNorm