Posted to solr-user@lucene.apache.org by aruninfo100 <ar...@gmail.com> on 2017/05/25 21:24:50 UTC
Unable to enrich UIMA annotated results to Solr fields
Hi All,
I am trying to integrate OpenNLP-UIMA with Solr. I have installed the PEAR
package generated by building the opennlp-uima source.
I have analyzed the text files using *CAS Visual Debugger* by loading the
respective AE, and the tokens are annotated as expected.
*Solrconfig:*
<updateRequestProcessorChain name="uima">
<processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
<lst name="uimaConfig">
<lst name="runtimeParameters">
</lst>
<str name="analysisEngine">D:/solr-6.1.0/server/solr/star/conf/AnalyzerEngineMain.xml</str>
<bool name="ignoreErrors">true</bool>
<lst name="analyzeFields">
<bool name="merge">false</bool>
<arr name="fields">
<str>content</str>
</arr>
</lst>
<lst name="fieldMappings">
<lst name="type">
<str name="name">opennlp.uima.Sentence</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">sentence_mxf</str>
</lst>
</lst>
<lst name="type">
<str name="name">opennlp.uima.Money</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">money_mxf</str>
</lst>
</lst>
<lst name="type">
<str name="name">opennlp.uima.Organization</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">organization_mxf</str>
</lst>
</lst>
<lst name="type">
<str name="name">opennlp.uima.Percentage</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">percentage_mxf</str>
</lst>
</lst>
<lst name="type">
<str name="name">opennlp.uima.Time</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">time_mxf</str>
</lst>
</lst>
<lst name="type">
<str name="name">opennlp.uima.Person</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">person_mxf</str>
</lst>
</lst>
</lst>
</lst>
</processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
<requestHandler name="/update" class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">uima</str>
</lst>
</requestHandler>
*schema:*
<dynamicField name="*_mxf" type="text_general" multiValued="true" indexed="true" stored="true"/>
When I index the documents and query them, I get only the sentence, money
and percentage fields for each document. The NameFinder annotations such as
person, location, date and organization, which were extracted as expected in
*CAS Visual Debugger*, are not added as Solr fields to the documents.
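One way to confirm which mapped fields actually reach the index is to compare
each returned document against the field names from the fieldMappings above.
The snippet below is only a minimal sketch: it assumes the query results are
already available as Python dicts (for example, parsed from a JSON response),
and the helper name missing_fields and the sample document are hypothetical.

```python
# Field names taken from the fieldMappings in the solrconfig above.
EXPECTED_FIELDS = [
    "sentence_mxf",
    "money_mxf",
    "organization_mxf",
    "percentage_mxf",
    "time_mxf",
    "person_mxf",
]


def missing_fields(doc, expected=EXPECTED_FIELDS):
    """Return the expected field names that are absent or empty in one result document."""
    return [name for name in expected if not doc.get(name)]


# Hypothetical result document showing the symptom described above:
# only the sentence, money and percentage fields are populated.
sample = {
    "id": "doc-1",
    "sentence_mxf": ["Acme paid $5 to John Smith."],
    "money_mxf": ["$5"],
    "percentage_mxf": ["10%"],
}

print(missing_fields(sample))
```

Running this over every document in a result set shows quickly whether the
gap affects all documents or only some of them.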
*AnalyzerEngineMain.xml*:
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
<primitive>false</primitive>
<delegateAnalysisEngineSpecifiers>
<delegateAnalysisEngine key="PEAR">
<import location="file:///D:/solr-6.1.0/server/solr/star/conf/opennlp.uima.OpenNlpTextAnalyzer/opennlp.uima.OpenNlpTextAnalyzer_pear.xml"/>
</delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>
<analysisEngineMetaData>
<name>OpenNlpTextAnalyzer</name>
<description />
<version>1.0</version>
<vendor>Apache Software Foundation</vendor>
<configurationParameters />
<configurationParameterSettings />
<flowConstraints>
<fixedFlow>
<node>PEAR</node>
</fixedFlow>
</flowConstraints>
<capabilities>
<capability>
<inputs />
<outputs />
<languagesSupported>
<language>en</language>
</languagesSupported>
</capability>
</capabilities>
<operationalProperties>
<modifiesCas>true</modifiesCas>
<multipleDeploymentAllowed>false</multipleDeploymentAllowed>
<outputsNewCASes>false</outputsNewCASes>
</operationalProperties>
</analysisEngineMetaData>
<resourceManagerConfiguration>
</resourceManagerConfiguration>
</analysisEngineDescription>
The descriptor files are the same as those found at:
https://svn.apache.org/repos/asf/opennlp/trunk/opennlp-uima/descriptors/
Thanks and Regards,
Arun
--
View this message in context: http://lucene.472066.n3.nabble.com/Unable-to-enrich-UIMA-annotated-results-to-Solr-fields-tp4337349.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Unable to enrich UIMA annotated results to Solr fields
Posted by aruninfo100 <ar...@gmail.com>.
I was able to resolve the issue. I was converting the extracted text content
of each document to lowercase before passing it to Solr for indexing (I did
this for a different purpose). When I indexed the original content (without
converting it to lowercase), the annotated entities were added to their
respective Solr fields.
I noticed this while analyzing the texts using CVD.
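For anyone who needs both forms of the text, a minimal sketch of one possible
approach is to keep the original-case text in the field that UIMA analyzes
and put the lowercased copy in a separate field. This presumably matters
because the name finders depend on capitalization. The helper name
build_solr_doc and the field name content_lower are hypothetical (the latter
would have to be defined in the schema); only content comes from the
configuration above.

```python
def build_solr_doc(doc_id, extracted_text):
    """Build one document for indexing without altering the text that UIMA analyzes."""
    return {
        "id": doc_id,
        # Original case is preserved here; "content" is the field listed
        # under analyzeFields in the solrconfig above.
        "content": extracted_text,
        # Hypothetical extra field holding the lowercased copy for the other use.
        "content_lower": extracted_text.lower(),
    }


doc = build_solr_doc("doc-1", "John Smith joined Acme Corp in May.")
print(doc["content"])
print(doc["content_lower"])
```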
Thanks and Regards,
Arun
--
View this message in context: http://lucene.472066.n3.nabble.com/Unable-to-enrich-UIMA-annotated-results-to-Solr-fields-tp4337349p4337942.html