Posted to solr-user@lucene.apache.org by Lord Khan Han <kh...@gmail.com> on 2011/10/04 15:26:44 UTC
Re: Shingle and Query Performance

We figured out that if we use a shingle-only field, not combined with output
unigrams, performance gets better. If we use outputUnigrams, it is worse than
the normal indexed field. So we decided to make a separate field containing
only shingles and use this field to support the main queries.
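
A minimal sketch of what such a setup could look like. The names sh_only and
content_sh are hypothetical and not taken from our actual schema; the rest of
the analysis chain (stop words, word delimiter, synonyms) is left out for
brevity. The key difference from the sh_text type quoted below is
outputUnigrams="false":

```xml
<!-- Hypothetical names: sh_only and content_sh are illustrative only. -->
<fieldType name="sh_only" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Emit only two-word shingles; unigrams stay in the normal field. -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
            outputUnigrams="false"/>
  </analyzer>
</fieldType>

<!-- Shingle-only copy of the content, used for the phrase-heavy queries. -->
<field name="content_sh" type="sh_only" indexed="true" stored="false"/>
<copyField source="content" dest="content_sh"/>
```

With outputUnigrams="false", a single-word query presumably produces no
tokens on this field, so single-word terms would still need to go against the
normal field.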

On Wed, Aug 31, 2011 at 1:01 PM, Lord Khan Han <kh...@gmail.com>wrote:

> Thanks Erick. If I figure something out I will let you know as well. Nobody
> replied except you; I thought there might be more people involved here.
>
> Thanks
>
>
> On Wed, Aug 31, 2011 at 3:47 AM, Erick Erickson <er...@gmail.com>wrote:
>
>> OK, I'll have to defer because this makes no sense.
>> 4+ seconds in the debug component?
>>
>> Sorry I can't be more help here, but nothing really
>> jumps out.
>> Erick
>>
>> On Tue, Aug 30, 2011 at 12:45 PM, Lord Khan Han <kh...@gmail.com>
>> wrote:
>> > Below is the output of the debug. I am measuring pure Solr QTime, which is
>> > shown in the QTime field of the Solr XML response.
>> >
>> > <arr name="parsed_filter_queries">
>> > <str>mrank:[0 TO 100]</str>
>> > </arr>
>> > <lst name="timing">
>> > <double name="time">8584.0</double>
>> > <lst name="prepare">
>> > <double name="time">12.0</double>
>> > <lst name="org.apache.solr.handler.component.QueryComponent">
>> > <double name="time">12.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.FacetComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.HighlightComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.StatsComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.SpellCheckComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.DebugComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > </lst>
>> > <lst name="process">
>> > <double name="time">8572.0</double>
>> > <lst name="org.apache.solr.handler.component.QueryComponent">
>> > <double name="time">4480.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.FacetComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.HighlightComponent">
>> > <double name="time">41.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.StatsComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.SpellCheckComponent">
>> > <double name="time">0.0</double>
>> > </lst>
>> > <lst name="org.apache.solr.handler.component.DebugComponent">
>> > <double name="time">4051.0</double>
>> > </lst>
>> >
>> > On Tue, Aug 30, 2011 at 5:38 PM, Erick Erickson <
>> erickerickson@gmail.com>wrote:
>> >
>> >> Can we see the output if you specify both
>> >> &debugQuery=on&debug=true
>> >>
>> >> the debug=true will show the time taken up with various
>> >> components, which is sometimes surprising...
>> >>
>> >> Second, we never asked the most basic question, what are
>> >> you measuring? Is this the QTime of the returned response?
>> >> (which is the time actually spent searching) or the time until
>> >> the response gets back to the client, which may involve lots besides
>> >> searching...
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Tue, Aug 30, 2011 at 7:59 AM, Lord Khan Han <
>> khanuniverse1@gmail.com>
>> >> wrote:
>> >> > Hi Erick,
>> >> >
>> >> > Fields are lazy loading, content is stored in Solr, and the machine has
>> >> > 32 gig of RAM. Solr has a 20 gig heap. There is no swapping.
>> >> >
>> >> > As you can see, we have many phrases in the same query. I couldn't find
>> >> > a way to drop QTime to sub-second. Surprisingly, the non-shingled test
>> >> > has a better QTime!
>> >> >
>> >> >
>> >> > On Mon, Aug 29, 2011 at 3:10 PM, Erick Erickson <
>> erickerickson@gmail.com
>> >> >wrote:
>> >> >
>> >> >> Oh, one other thing: have you profiled your machine
>> >> >> to see if you're swapping? How much memory are
>> >> >> you giving your JVM? What is the underlying
>> >> >> hardware setup?
>> >> >>
>> >> >> Best
>> >> >> Erick
>> >> >>
>> >> >> On Mon, Aug 29, 2011 at 8:09 AM, Erick Erickson <
>> >> erickerickson@gmail.com>
>> >> >> wrote:
>> >> >> > 200K docs and 36G index? It sounds like you're storing
>> >> >> > your documents in the Solr index. In and of itself, that
>> >> >> > shouldn't hurt your query times, *unless* you have
>> >> >> > lazy field loading turned off, have you checked that
>> >> >> > lazy field loading is enabled?
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Best
>> >> >> > Erick
>> >> >> >
>> >> >> > On Sun, Aug 28, 2011 at 5:30 AM, Lord Khan Han <
>> >> khanuniverse1@gmail.com>
>> >> >> wrote:
>> >> >> >> Another interesting thing: all queries of one or more words,
>> >> >> >> including phrase queries such as "barack obama", are slower in the
>> >> >> >> shingle configuration. What am I doing wrong? Without shingles,
>> >> >> >> "barack obama" has a query time of 300 ms; with shingles, 780 ms.
>> >> >> >>
>> >> >> >>
>> >> >> >> On Sat, Aug 27, 2011 at 7:58 PM, Lord Khan Han <
>> >> khanuniverse1@gmail.com
>> >> >> >wrote:
>> >> >> >>
>> >> >> >>> Hi,
>> >> >> >>>
>> >> >> >>> What is the difference between Solr 3.3 and the trunk?
>> >> >> >>> I will try 3.3 and let you know the results.
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> Here is the search handler:
>> >> >> >>>
>> >> >> >>> <requestHandler name="search" class="solr.SearchHandler" default="true">
>> >> >> >>>      <lst name="defaults">
>> >> >> >>>        <str name="echoParams">explicit</str>
>> >> >> >>>        <int name="rows">10</int>
>> >> >> >>>        <!--<str name="fq">category:vv</str>-->
>> >> >> >>>        <str name="fq">mrank:[0 TO 100]</str>
>> >> >> >>>        <str name="defType">edismax</str>
>> >> >> >>>        <!--<str name="qf">title^0.05 url^1.2 content^1.7 m_title^10.0</str>-->
>> >> >> >>>        <str name="qf">title^1.05 url^1.2 content^1.7 m_title^10.0</str>
>> >> >> >>>        <!-- <str name="bf">recip(ee_score,-0.85,1,0.2)</str> -->
>> >> >> >>>        <str name="pf">content^18.0 m_title^5.0</str>
>> >> >> >>>        <int name="ps">1</int>
>> >> >> >>>        <int name="qs">0</int>
>> >> >> >>>        <str name="mm">2&lt;-25%</str>
>> >> >> >>>        <str name="spellcheck">true</str>
>> >> >> >>>        <!--<str name="spellcheck.collate">true</str>   -->
>> >> >> >>>        <str name="spellcheck.count">5</str>
>> >> >> >>>        <str name="spellcheck.dictionary">subobjective</str>
>> >> >> >>>        <str name="spellcheck.onlyMorePopular">false</str>
>> >> >> >>>        <str name="hl.tag.pre">&lt;b&gt;</str>
>> >> >> >>>        <str name="hl.tag.post">&lt;/b&gt;</str>
>> >> >> >>>        <str name="hl.useFastVectorHighlighter">true</str>
>> >> >> >>>      </lst>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> On Sat, Aug 27, 2011 at 5:31 PM, Erik Hatcher <
>> >> erik.hatcher@gmail.com
>> >> >> >wrote:
>> >> >> >>>
>> >> >> >>>> I'm not sure what the issue could be at this point. I see you've got
>> >> >> >>>> qt=search - what's the definition of that request handler?
>> >> >> >>>>
>> >> >> >>>> What is the parsed query (from the debugQuery response)?
>> >> >> >>>>
>> >> >> >>>> Have you tried this with Solr 3.3 to see if there's any appreciable
>> >> >> >>>> difference?
>> >> >> >>>>
>> >> >> >>>>        Erik
>> >> >> >>>>
>> >> >> >>>> On Aug 27, 2011, at 09:34 , Lord Khan Han wrote:
>> >> >> >>>>
>> >> >> >>>> > With grouping off, the query time drops from 3567 ms to 1912 ms.
>> >> >> >>>> > Grouping increases the query time and makes the cache useless. But
>> >> >> >>>> > the same config is still faster without shingles.
>> >> >> >>>> >
>> >> >> >>>> > We have a head-to-head test this Wednesday against this commercial
>> >> >> >>>> > search engine, so I am looking for all suggestions.
>> >> >> >>>> >
>> >> >> >>>> >
>> >> >> >>>> >
>> >> >> >>>> > On Sat, Aug 27, 2011 at 3:37 PM, Erik Hatcher <
>> >> >> erik.hatcher@gmail.com
>> >> >> >>>> >wrote:
>> >> >> >>>> >
>> >> >> >>>> >> Please confirm whether this is caused by grouping. Turn grouping
>> >> >> >>>> >> off; what's the query time like?
>> >> >> >>>> >>
>> >> >> >>>> >>
>> >> >> >>>> >> On Aug 27, 2011, at 07:27 , Lord Khan Han wrote:
>> >> >> >>>> >>
>> >> >> >>>> >>> On the other hand, we couldn't use the cache for the types of
>> >> >> >>>> >>> queries below. I think that is caused by grouping. Anyway, we need
>> >> >> >>>> >>> to be sub-second without the cache.
>> >> >> >>>> >>>
>> >> >> >>>> >>>
>> >> >> >>>> >>>
>> >> >> >>>> >>> On Sat, Aug 27, 2011 at 2:18 PM, Lord Khan Han <
>> >> >> >>>> khanuniverse1@gmail.com
>> >> >> >>>> >>> wrote:
>> >> >> >>>> >>>
>> >> >> >>>> >>>> Hi,
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> Thanks for the reply.
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> Here is the Solr log capture:
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> ******
>> >> >> >>>> >>>>
>> >> >> >>>> >>>>
>> >> >> >>>> >>
>> >> >> >>>>
>> >> >>
>> >>
>> hl.fragsize=100&spellcheck=true&spellcheck.q=XXXXX&group.limit=5&hl.simple.pre=<b>&hl.fl=content&spellcheck.collate=true&wt=javabin&hl=true&rows=20&version=2&fl=score,approved,domain,host,id,lang,mimetype,title,tstamp,url,category&hl.snippets=3&start=0&q=%2BXXXX+-"XXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXXX"+-XXX+-"XXXXX"+-XXXX+-XXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXX+-"XXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXX"+-"XXXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXXX"+-XXXX+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXX"+-"XXXXX"+-"XXXXX"+-"XXXXX"+-XXXXX+-"XXXXXX"+-"XXXXXX"+-XXXXXX+-XXXXX+-"XXXXX"+"XXXXX"+"XXXXX"+"XXXXXX"++&group.field=host&hl.simple.post=</b>&group=true&qt=search&fq=mrank:[0+TO+100]&fq=word_count:[70+TO+*]
>> >> >> >>>> >>>> ******
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> XXXX are the words. All phrases "xxxxx" have two words inside.
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> The timing from the DebugQuery:
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> <lst name="timing">
>> >> >> >>>> >>>> <double name="time">8654.0</double>
>> >> >> >>>> >>>> <lst name="prepare">
>> >> >> >>>> >>>> <double name="time">16.0</double>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
>> >> >> >>>> >>>> <double name="time">16.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="process">
>> >> >> >>>> >>>> <double name="time">8638.0</double>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.QueryComponent">
>> >> >> >>>> >>>> <double name="time">4473.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.FacetComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.HighlightComponent">
>> >> >> >>>> >>>> <double name="time">42.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.StatsComponent">
>> >> >> >>>> >>>> <double name="time">0.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.SpellCheckComponent">
>> >> >> >>>> >>>> <double name="time">1.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>> <lst name="org.apache.solr.handler.component.DebugComponent">
>> >> >> >>>> >>>> <double name="time">4122.0</double>
>> >> >> >>>> >>>> </lst>
>> >> >> >>>> >>>>
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> The funny thing is that if I remove the ShingleFilter from the
>> >> >> >>>> >>>> "sh_text" field below and index normally, the query time is half
>> >> >> >>>> >>>> that of the current shingled one! Shouldn't a shingled index be
>> >> >> >>>> >>>> better for such heavy two-word phrase searches? I am confused.
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> On the other hand, an off-the-shelf search engine from one of the
>> >> >> >>>> >>>> big FAT companies runs the same query on the same machine in 0.7
>> >> >> >>>> >>>> to 0.8 secs without cache. I am confident we can do better in Solr
>> >> >> >>>> >>>> but couldn't find the way at the moment.
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> thanks for helping..
>> >> >> >>>> >>>>
>> >> >> >>>> >>>>
>> >> >> >>>> >>>>
>> >> >> >>>> >>>>
>> >> >> >>>> >>>> On Sat, Aug 27, 2011 at 2:46 AM, Erik Hatcher <
>> >> >> >>>> erik.hatcher@gmail.com
>> >> >> >>>> >>> wrote:
>> >> >> >>>> >>>>
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>> On Aug 26, 2011, at 17:49 , Lord Khan Han wrote:
>> >> >> >>>> >>>>>> We are indexing news documents from various sites. Currently we
>> >> >> >>>> >>>>>> have 200K docs indexed. Total index size is 36 gig. There are
>> >> >> >>>> >>>>>> also attachments to the news (PDFs, docs, etc.), so document
>> >> >> >>>> >>>>>> size could be high (e.g. 10 MB).
>> >> >> >>>> >>>>>>
>> >> >> >>>> >>>>>> We are using some complex queries which include around 30-40
>> >> >> >>>> >>>>>> terms per query. 70% of these terms are two-word phrases. We
>> >> >> >>>> >>>>>> combine them with + and - to pinpoint the exact result.
>> >> >> >>>> >>>>>> There is also grouping, dismax and boosting, and term-vector
>> >> >> >>>> >>>>>> highlighting.
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>> You're using a lot of componentry there, and have complex
>> >> >> >>>> >>>>> queries. We need more details.
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>> Turn on debugQuery=true... what do the timings say for each
>> >> >> >>>> >>>>> component?
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>>> Our problem is query times. Currently they are around 6-7 secs.
>> >> >> >>>> >>>>>> I know our query is a little bit heavy, but we want to improve
>> >> >> >>>> >>>>>> query performance. I believe we can make it sub-second, but no
>> >> >> >>>> >>>>>> success at the moment.
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>> Please provide an example query or two (perhaps a full line
>> >> >> >>>> >>>>> logged from Solr itself), and then let's see what debugQuery
>> >> >> >>>> >>>>> says about your query being parsed.
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>>> We tried to use shingles (two-word tokens), but that decreases
>> >> >> >>>> >>>>>> the query performance!! We assumed it would help speed up
>> >> >> >>>> >>>>>> phrase searches.
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>> Again, we'd need to see a parsed query to understand this deeper.
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>> Lots of synonym expansion?  A parsed query will tell us.
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>>> (using the latest Solr trunk; the HW is pretty good: 32 cores
>> >> >> >>>> >>>>>> with 32 gig RAM)
>> >> >> >>>> >>>>>>
>> >> >> >>>> >>>>>> Here is the field def:
>> >> >> >>>> >>>>>>
>> >> >> >>>> >>>>>> <fieldType name="sh_text" class="solr.TextField"
>> >> >> >>>> >>>>>>            positionIncrementGap="100"
>> >> >> >>>> >>>>>>            autoGeneratePhraseQueries="true">
>> >> >> >>>> >>>>>>    <analyzer type="index">
>> >> >> >>>> >>>>>>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >> >> >>>> >>>>>>      <filter class="solr.StopFilterFactory" ignoreCase="true"
>> >> >> >>>> >>>>>>              words="stopwords.txt" enablePositionIncrements="true" />
>> >> >> >>>> >>>>>>      <filter class="solr.WordDelimiterFilterFactory"
>> >> >> >>>> >>>>>>              generateWordParts="1" generateNumberParts="1"
>> >> >> >>>> >>>>>>              catenateWords="1" catenateNumbers="1"
>> >> >> >>>> >>>>>>              catenateAll="0" splitOnCaseChange="1"/>
>> >> >> >>>> >>>>>>      <!--<filter class="solr.LowerCaseFilterFactory"/>-->
>> >> >> >>>> >>>>>>      <filter class="solr.KeywordMarkerFilterFactory"
>> >> >> >>>> >>>>>>              protected="protwords.txt"/>
>> >> >> >>>> >>>>>>      <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
>> >> >> >>>> >>>>>>              outputUnigrams="true"/>
>> >> >> >>>> >>>>>>    </analyzer>
>> >> >> >>>> >>>>>>    <analyzer type="query">
>> >> >> >>>> >>>>>>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >> >> >>>> >>>>>>      <filter class="solr.SynonymFilterFactory"
>> >> >> >>>> >>>>>>              synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>> >> >> >>>> >>>>>>      <filter class="solr.StopFilterFactory" ignoreCase="true"
>> >> >> >>>> >>>>>>              words="stopwords.txt" enablePositionIncrements="true" />
>> >> >> >>>> >>>>>>      <filter class="solr.WordDelimiterFilterFactory"
>> >> >> >>>> >>>>>>              generateWordParts="1" generateNumberParts="1"
>> >> >> >>>> >>>>>>              catenateWords="0" catenateNumbers="0"
>> >> >> >>>> >>>>>>              catenateAll="0" splitOnCaseChange="1"/>
>> >> >> >>>> >>>>>>      <!--<filter class="solr.LowerCaseFilterFactory"/>-->
>> >> >> >>>> >>>>>>      <filter class="solr.KeywordMarkerFilterFactory"
>> >> >> >>>> >>>>>>              protected="protwords.txt"/>
>> >> >> >>>> >>>>>>      <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
>> >> >> >>>> >>>>>>              outputUnigrams="true"/>
>> >> >> >>>> >>>>>>    </analyzer>
>> >> >> >>>> >>>>>>  </fieldType>
>> >> >> >>>> >>>>>>
>> >> >> >>>> >>>>>> and
>> >> >> >>>> >>>>>>
>> >> >> >>>> >>>>>> <field name="content" type="sh_text" stored="true" indexed="true"
>> >> >> >>>> >>>>>>        termVectors="true" termPositions="true" termOffsets="true"/>
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>>
>> >> >> >>>> >>>>
>> >> >> >>>> >>
>> >> >> >>>> >>
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
>