Posted to solr-user@lucene.apache.org by Darx Oman <da...@gmail.com> on 2014/09/21 11:23:41 UTC

UIMA Dictionary Annotator

Hi there,
I am trying to use the UIMA Dictionary Annotator with Solr 4.10.0.

I did the following:


1) Added a field to the schema:
<field name="uimaKeyWords" type="string" indexed="true" stored="true"
multiValued="true" />



2) Modified solrconfig.xml as follows:
<requestHandler name="/update" class="solr.UpdateRequestHandler">
       <lst name="defaults">
         <str name="update.chain">uima</str>
       </lst>
  </requestHandler>

  <updateRequestProcessorChain name="uima" default="true">
      <processor
class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
          <lst name="uimaConfig">
              <lst name="runtimeParameters"> </lst>
              <str name="analysisEngine">DictionaryAnnotator.xml</str>

              <lst name="analyzeFields">
                <bool name="merge">false</bool>
                <arr name="fields">
                  <str>text</str>
                </arr>
              </lst>

              <lst name="fieldMappings">
                <lst name="type">
                  <str name="name">org.apache.uima.DictionaryEntry</str>
                  <lst name="mapping">
                    <str name="feature">tokenType</str>
                    <str name="field">uimaKeyWords</str>
                  </lst>
                </lst>
              </lst>

          </lst>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory" />
      <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>


3) Copied the following jar files to the solr\collection1\lib folder:
     lucene-analyzers-uima-4.10.0.jar
     solr-uima-4.10.0.jar
     uima-an-dictionary.jar
     uimaj-core-2.3.1.jar
     WhitespaceTokenizer-2.3.1.jar
     xmlbeans-2.4.0.jar


4) Added some entries to dictionary.xml:
      <entry>
                <key>iPod</key>
       </entry>
      <entry>
                <key>samsung</key>
      </entry>
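
For context, here is a minimal sketch of a complete dictionary.xml around these entries, assuming the dictionary file layout used by the Apache UIMA Dictionary Annotator add-on. The namespace, the dictionaryMetaData attributes and the language ID are assumptions that are not shown in this post; the type name is the one used in fieldMappings above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: namespace, metadata attributes and languageId are assumptions -->
<dictionary xmlns="http://incubator.apache.org/uima">
  <typeCollection>
    <!-- assumed settings: case-insensitive matching, multi-word keys split on spaces -->
    <dictionaryMetaData caseNormalization="true" multiWordEntries="true"
                        multiWordSeparator=" "/>
    <languageId>en</languageId>
    <typeDescription>
      <!-- annotation type created for every match; same name as in fieldMappings -->
      <typeName>org.apache.uima.DictionaryEntry</typeName>
    </typeDescription>
    <entries>
      <entry>
        <key>iPod</key>
      </entry>
      <entry>
        <key>samsung</key>
      </entry>
    </entries>
  </typeCollection>
</dictionary>
```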

But when I indexed the XML documents from example\exampledocs, no annotation happened.

This is the XML response I got:

<doc>
    <str name="id">IW-02</str>
    <str name="name">iPod &amp; iPod Mini USB 2.0 Cable</str>
    <str name="manu">Belkin</str>
    <str name="manu_id_s">belkin</str>
    <arr name="cat">
      <str>electronics</str>
      <str>connector</str>
    </arr>
    <arr name="features">
      <str>car power adapter for iPod, white</str>
    </arr>
    <float name="weight">2.0</float>
    <float name="price">11.5</float>
    <str name="price_c">11.50,USD</str>
    <int name="popularity">1</int>
    <bool name="inStock">false</bool>
    <str name="store">37.7752,-122.4232</str>
    <date name="manufacturedate_dt">2006-02-14T23:55:59Z</date>
    <long name="_version_">1479845823070076928</long></doc>
  <doc>


What might have gone wrong?

Am I missing something?

Re: UIMA Dictionary Annotator

Posted by Darx Oman <da...@gmail.com>.
It was due to some configuration errors.

This is the new configuration:


  <updateRequestProcessorChain name="uima" default="true">
      <processor
class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
          <lst name="uimaConfig">
              <lst name="runtimeParameters"> </lst>
              <str name="analysisEngine">AggregateAE.xml</str>

              <lst name="analyzeFields">
                <bool name="merge">false</bool>
                <arr name="fields">
                  <str>name</str>
                </arr>
              </lst>

              <lst name="fieldMappings">
                <lst name="type">
                  <str name="name">org.apache.uima.DictionaryEntry</str>
                  <lst name="mapping">
                    <str name="feature">coveredText</str>
                    <str name="field">uimaKeyWords</str>
                  </lst>
                </lst>
              </lst>

          </lst>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory" />
      <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

It works fine now.
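
The contents of AggregateAE.xml are not shown in this thread. The following is only a sketch of what such an aggregate descriptor might look like, assuming it runs the WhitespaceTokenizer before the Dictionary Annotator (as far as I know, the Dictionary Annotator matches its entries against token annotations, so a tokenizer has to run first). The delegate keys and the import locations are hypothetical and would need to match the actual descriptor files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: delegate keys and import locations are hypothetical -->
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <!-- false marks this descriptor as an aggregate of other engines -->
  <primitive>false</primitive>
  <delegateAnalysisEngineSpecifiers>
    <delegateAnalysisEngine key="WhitespaceTokenizer">
      <import location="WhitespaceTokenizer.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="DictionaryAnnotator">
      <import location="DictionaryAnnotator.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
  <analysisEngineMetaData>
    <name>AggregateAE</name>
    <!-- fixed order: tokenize first, then look up tokens in the dictionary -->
    <flowConstraints>
      <fixedFlow>
        <node>WhitespaceTokenizer</node>
        <node>DictionaryAnnotator</node>
      </fixedFlow>
    </flowConstraints>
    <operationalProperties>
      <modifiesCas>true</modifiesCas>
      <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
      <outputsNewCASes>false</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
</analysisEngineDescription>
```

Compared with the first attempt, the working chain above also analyzes the name field instead of text and maps the coveredText feature instead of tokenType.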
