Posted to solr-user@lucene.apache.org by aruninfo100 <ar...@gmail.com> on 2017/05/25 21:24:50 UTC

Unable to enrich UIMA annotated results to Solr fields

Hi All,

I am trying to integrate OpenNLP-UIMA with Solr. I have installed the PEAR
package generated by building the opennlp-uima source.
I analyzed the text files in the *CAS Visual Debugger* by loading the
respective AE, and the tokens are annotated as expected.

*Solrconfig:*

<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
      </lst>
      <str name="analysisEngine">D:/solr-6.1.0/server/solr/star/conf/AnalyzerEngineMain.xml</str>
      <bool name="ignoreErrors">true</bool>
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
          <str>content</str>
        </arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name">opennlp.uima.Sentence</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">sentence_mxf</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">opennlp.uima.Money</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">money_mxf</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">opennlp.uima.Organization</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">organization_mxf</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">opennlp.uima.Percentage</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">percentage_mxf</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">opennlp.uima.Time</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">time_mxf</str>
          </lst>
        </lst>
        <lst name="type">
          <str name="name">opennlp.uima.Person</str>
          <lst name="mapping">
            <str name="feature">coveredText</str>
            <str name="field">person_mxf</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">uima</str>
  </lst>
</requestHandler>

*schema:*

<dynamicField name="*_mxf" type="text_general" multiValued="true" indexed="true" stored="true"/>

When I index the documents and query them, I get only the sentence, money,
and percentage fields for each document. The NameFinder annotations (person,
location, date, organization), which were extracted as expected in the *CAS
Visual Debugger*, are not added as Solr fields for any document.
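
One way to see exactly which mapped fields are missing per document is to compare each returned document against the field names in the fieldMappings section. The following is only a minimal sketch: the helper name `missing_mapped_fields` and the sample document are hypothetical, and it assumes the result has already been parsed from Solr's JSON response into a Python dict.

```python
# Field names taken from the fieldMappings section of the solrconfig above.
MAPPED_FIELDS = [
    "sentence_mxf",
    "money_mxf",
    "organization_mxf",
    "percentage_mxf",
    "time_mxf",
    "person_mxf",
]


def missing_mapped_fields(doc, expected=MAPPED_FIELDS):
    """Return the expected fields that are absent or empty in one result document."""
    return [name for name in expected if not doc.get(name)]


# Hypothetical document, shaped like one entry of a parsed Solr JSON result.
# The values are made up for illustration only.
sample_doc = {
    "id": "doc1",
    "sentence_mxf": ["the company paid $5 million, up 10 percent."],
    "money_mxf": ["$5 million"],
    "percentage_mxf": ["10 percent"],
}

print(missing_mapped_fields(sample_doc))
```

Running this over every returned document shows whether the same annotation types are missing everywhere or only for some inputs.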

*AnalyzerEngineMain.xml*:

<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
	<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
	<primitive>false</primitive>

	<delegateAnalysisEngineSpecifiers>
		<delegateAnalysisEngine key="PEAR">
			<import location="file:///D:/solr-6.1.0/server/solr/star/conf/opennlp.uima.OpenNlpTextAnalyzer/opennlp.uima.OpenNlpTextAnalyzer_pear.xml"/>
		</delegateAnalysisEngine>
		
	</delegateAnalysisEngineSpecifiers>

	<analysisEngineMetaData>
		<name>OpenNlpTextAnalyzer</name>
		<description />
		<version>1.0</version>
		<vendor>Apache Software Foundation</vendor>
		<configurationParameters />
		<configurationParameterSettings />
		<flowConstraints>
			<fixedFlow>
				<node>PEAR</node>
			</fixedFlow>
		</flowConstraints>
		<capabilities>
			<capability>
				<inputs />
				<outputs />
				<languagesSupported>
					<language>en</language>
				</languagesSupported>
			</capability>
		</capabilities>
		<operationalProperties>
			<modifiesCas>true</modifiesCas>
			<multipleDeploymentAllowed>false</multipleDeploymentAllowed>
			<outputsNewCASes>false</outputsNewCASes>
		</operationalProperties>
	</analysisEngineMetaData>

	<resourceManagerConfiguration>
	</resourceManagerConfiguration>
</analysisEngineDescription>

The descriptor files are the same as those found at:
https://svn.apache.org/repos/asf/opennlp/trunk/opennlp-uima/descriptors/

Thanks and Regards,
Arun




--
View this message in context: http://lucene.472066.n3.nabble.com/Unable-to-enrich-UIMA-annotated-results-to-Solr-fields-tp4337349.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Unable to enrich UIMA annotated results to Solr fields

Posted by aruninfo100 <ar...@gmail.com>.
I was able to resolve the issue. I had been converting the extracted text
content of each document to lowercase before passing it to Solr for indexing
(I did this for a different use case). When I indexed the original content,
without converting it to lowercase, the annotated entities were added to
their respective Solr fields.
I noticed this while analyzing the texts in CVD.
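
The fix can be sketched as follows: keep the original-case text in the `content` field that the UIMA chain analyzes, and put the lowercased copy in a separate field if it is still needed elsewhere. This is a minimal sketch, not the exact code I used. The helper name `build_solr_doc` and the field `content_lc` are hypothetical, and `content_lc` would need a matching field definition in the schema. Lowercasing most likely interferes because the name finder models rely on capitalization cues, but I have not verified that.

```python
def build_solr_doc(doc_id, extracted_text):
    """Build a document dict that leaves the UIMA-analyzed field in its original case."""
    return {
        "id": doc_id,
        # Original case: this is the field listed under analyzeFields.
        "content": extracted_text,
        # Hypothetical extra field holding the lowercased copy for other uses.
        "content_lc": extracted_text.lower(),
    }


doc = build_solr_doc("doc1", "Arun Works At Apache")
print(doc["content"])
print(doc["content_lc"])
```

If the lowercased copy is only needed for case-insensitive search, a field type with a lowercase filter in its analyzer may make the extra field unnecessary.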

Thanks and Regards,
Arun



