Posted to solr-user@lucene.apache.org by chris3001 <ch...@hotmail.com> on 2012/03/28 03:57:03 UTC

Solr with UIMA

I am having a hard time integrating UIMA with Solr. I have downloaded the
Solr 3.5 dist and have it successfully running with Nutch and Tika on
Windows 7 using SolrCell and curl via Cygwin. To begin, I copied the 6 jars
from solr/contrib/uima/lib to the working /lib in Solr. Next, I read the
readme.txt file in solr/contrib/uima/lib and edited both my solrconfig.xml
and schema.xml accordingly, to no avail. I then found this link, which seemed
a bit more applicable since I didn't care to use Alchemy or OpenCalais:
http://code.google.com/a/apache-extras.org/p/rondhuit-uima/?redir=1
Still, when I run a curl command that imports a PDF via SolrCell, I do not get
the additional UIMA fields, nor do I get anything in my logs. The test.pdf is
parsed, though, and I see the PDF in Solr using:
curl 'http://localhost:8080/solr/update/extract?fmap.content=content&literal.id=doc1&commit=true' -F "file=@test.pdf"

What I added to my solrconfig.xml:

<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
      </lst>
      <str name="analysisEngine">C:\web\solrcelluimacrawler\com\rondhuit\uima\desc\KeyphraseExtractAnnotatorDescriptor.xml</str>
      <bool name="ignoreErrors">true</bool>
      <str name="logField">id</str>
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
          <str>content</str>
        </arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name">com.rondhuit.uima.yahoo.Keyphrase</str>
          <lst name="mapping">
            <str name="feature">keyphrase</str>
            <str name="field">UIMAname</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

I also adjusted my requestHandler:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">uima</str>
  </lst>
</requestHandler>

Finally, my added entries in my schema.xml:

<field name="UIMAname" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<dynamicField name="*_sm" type="string" indexed="true" stored="true"/>

All I am trying to do is test *any* UIMA AE in Solr, and I cannot figure
out what I am doing wrong. Thank you in advance for reading this.


--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3863324.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

Posted by chris3001 <ch...@hotmail.com>.
Tommaso,
I apologize for my delayed response. Thank you very much for your time
looking into this!! 
I will try to replicate your efforts on my end this week.

Respectfully,
Chris

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3898094.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

Posted by debdoot <de...@gmail.com>.
Further observation on the error:

All requests to add documents through the /update URL end up with the same
error, irrespective of the fields contained in the document. If I don't use
the UIMAUpdateRequestProcessor, I can add/update documents successfully.

Here are the snippets relevant to the updateRequestProcessor declarations in my
solrconfig.xml:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">uima</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
      </lst>
      <str name="analysisEngine">C:\ex1\RoomNumberAnnotator.xml</str>
      <bool name="ignoreErrors">false</bool>
      
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
          <str>content</str>
        </arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name">org.apache.uima.tutorial.RoomNumber</str>
          <lst name="mapping">
            <str name="feature">building</str>
            <str name="field">UIMAname</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>


Please help.

Thanks
Debdoot

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3987083.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

Posted by Tommaso Teofili <to...@gmail.com>.
Hi all,

2012/6/1 Jack Krupansky <ja...@basetechnology.com>

> Is it failing on the first document? I see "uid=5", which suggests that it
> is not. If not, how is this document different from the others?
>
> I see the exception
> org.apache.uima.resource.ResourceInitializationException, suggesting
> that some file cannot be loaded.
>
> It sounds like it may be having trouble loading "aePath"
> ("analysisEngine"). Or maybe some other file?
>

Thanks Jack, that's correct; that is most likely what's causing the reported
error.
Tommaso
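
The NullPointerException raised in XMLInputSource while
OverridingParamsAEProvider builds the analysis engine is consistent with the
descriptor not being found. A minimal sketch of one thing to try, assuming
this version of the processor looks up "analysisEngine" as a classpath
resource rather than as a filesystem path (an assumption, worth checking
against the contrib README for your release). The resource location
"/desc/RoomNumberAnnotator.xml" is hypothetical; it stands for wherever the
descriptor sits inside a jar that Solr loads. The other uimaConfig entries
from the thread are omitted here:

```xml
<lst name="uimaConfig">
  <!-- Assumption: the value is resolved on the classpath, so it uses
       forward slashes and a leading "/" instead of a C:\ path.
       "/desc/RoomNumberAnnotator.xml" is a hypothetical location inside
       a jar that Solr loads. -->
  <str name="analysisEngine">/desc/RoomNumberAnnotator.xml</str>
  <!-- Keep errors visible while debugging, as suggested in the thread. -->
  <bool name="ignoreErrors">false</bool>
</lst>
```

If the descriptor loads, the ResourceInitializationException should go away;
if it does not, the cause is likely elsewhere.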



Re: Solr with UIMA

Posted by Jack Krupansky <ja...@basetechnology.com>.
Is it failing on the first document? I see "uid=5", which suggests that it is not.
If not, how is this document different from the others?

I see the exception
org.apache.uima.resource.ResourceInitializationException, suggesting that 
some file cannot be loaded.

It sounds like it may be having trouble loading "aePath" ("analysisEngine"). 
Or maybe some other file?

-- Jack Krupansky



Re: Solr with UIMA

Posted by debdoot <de...@gmail.com>.
Hi Tommaso,

I have followed the steps you have listed to try to deploy the example
RoomNumberAnnotator with Solr 3.5.
Here is the error trace that I get:


org.apache.solr.common.SolrException: processing error: null. uid=5, text="Test Room HAW GN-K35..."
    at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:107)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
    at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:192)
    at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:89)
    at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:919)
    at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1016)
    at com.ibm.ws.webcontainer.webapp.WebApp.handleRequest(WebApp.java:3703)
    at com.ibm.ws.webcontainer.webapp.WebGroup.handleRequest(WebGroup.java:304)
    at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:953)
    at com.ibm.ws.webcontainer.WSWebContainer.handleRequest(WSWebContainer.java:1655)
    at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:195)
    at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:452)
    at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:511)
    at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:305)
    at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.ready(HttpInboundLink.java:276)
    at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.sendToDiscriminators(NewConnectionInitialReadCallback.java:214)
    at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.complete(NewConnectionInitialReadCallback.java:113)
    at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
    at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
    at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
    at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:138)
    at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204)
    at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775)
    at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:905)
    at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1650)
Caused by: org.apache.uima.resource.ResourceInitializationException
    at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:86)
    at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processText(UIMAUpdateRequestProcessor.java:144)
    at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:77)
    ... 30 more
Caused by: java.lang.NullPointerException
    at org.apache.uima.util.XMLInputSource.<init>(XMLInputSource.java:118)
    at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:58)
    ... 32 more

    at com.ibm.ws.webcontainer.webapp.WebAppDispatcherContext.sendError(WebAppDispatcherContext.java:624)
    at com.ibm.ws.webcontainer.webapp.WebAppDispatcherContext.sendError(WebAppDispatcherContext.java:642)
    at com.ibm.ws.webcontainer.srt.SRTServletResponse.sendError(SRTServletResponse.java:1235)
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:380)
    at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:326)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265)
....
....

Please let me know if you have any insights on what could be the issue.

Thanks in advance,
Debdoot


--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3987056.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

Posted by Tommaso Teofili <to...@gmail.com>.
Hi again Chris,

I finally managed to find some time to test your configuration.
The first thing to note is that it worked for me, assuming the following
prerequisites were satisfied:
- you had the jar containing the AnalysisEngine for RoomNumberAnnotator.xml
in your libraries section (this is actually the uimaj-examples.jar, which is
shipped with the UIMA SDK under libs [1])
- you had the solr-uima jar in your libraries

Both are done by adding the following lines to solrconfig.xml (usually at
the top of the file, just beneath the <luceneMatchVersion> element):

  <lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />
  <lib dir="../../contrib/uima/lib" regex=".*\.jar" />
  <lib dir="/path/to/apache-uima/lib" />

If you want to know what's going wrong, I'd advise not ignoring errors
in the UIMAUpdateProcessor configuration:
<bool name="ignoreErrors">false</bool>

What I get if I run your same curl command and then make a *:* query is:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2</int>
    <lst name="params">
      <str name="wt">xml</str>
      <str name="start">0</str>
      <str name="q">*:*</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">4</str>
      <str name="content">Test Room HAW GN-K35</str>
      <arr name="UIMAname">
        <str>Hawthorne</str>
      </arr>
    </doc>
  </result>
</response>

which looks OK to me.
Hope this helps.
Tommaso

[1] : http://mirror.switch.ch/mirror/apache/dist//uima///uimaj-2.3.1-bin.zip


Re: Solr with UIMA

Posted by chris3001 <ch...@hotmail.com>.
Tommaso,
Thank you so much for looking into this, I am very grateful!

Chris 

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3865291.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

Posted by Tommaso Teofili <to...@gmail.com>.
Hi Chris,

I have never tried the Nutch integration, so I can't help with that.
However, I'll try to reproduce your setup and let you know how it
turns out for me.

Tommaso


Re: Solr with UIMA

Posted by chris3001 <ch...@hotmail.com>.
Still not getting there on Solr with UIMA...
Has anyone taken example 1 (RoomAnnotator) and successfully tested this by
any chance?

Thanks to Tommaso my curl statement has changed to /update:

curl http://localhost:8080/solr/update?commit=true -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">4</field><field name="content">Test Room HAW GN-K35</field></doc></add>'

Next- my solrconfig has these two parts:
Part1:
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
    <lst name="defaults">
      <str name="update.processor">uima</str>
    </lst>
  </requestHandler>

Part2:
<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
      </lst>
      <str name="analysisEngine">C:\uima\examples\descriptors\tutorial\ex1\RoomNumberAnnotator.xml</str>
      <bool name="ignoreErrors">true</bool>
      <str name="logField">id</str>
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        <arr name="fields">
          <str>content</str>
        </arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name">org.apache.uima.tutorial.RoomNumber</str>
          <lst name="mapping">
            <str name="feature">building</str>
            <str name="field">UIMAname</str>
          </lst>
        </lst>
      </lst>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Finally, my schema.xml:

<field name="UIMAname" type="string" indexed="true" stored="true" multiValued="true" required="false"/>

When I run this example AE XML descriptor in the Document Analyzer, I see the
token GN-K35 highlighted. However, when I integrate it into Solr using the
above settings and search for *:* in http://localhost:8080/solr/admin/, I do
not see the UIMAname field at all, let alone any data in it (namely, GN-K35
in this example).

Thank you for your time in reading this.



--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3864810.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

Posted by chris3001 <ch...@hotmail.com>.
Tommaso-
Thank you so much for your reply and pointing this out! I will look into it.
However, when I run Nutch I still don't see the new fields:

$ bin/nutch crawl urls -solr http://localhost:8080/solr/ -depth 1 -topN 2

Does that still have to do with the update/extract call?

Thanks again for your time.

Chris

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3864418.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr with UIMA

Posted by Tommaso Teofili <to...@gmail.com>.
Hi Chris,

If I understood things correctly, the problem is that you're calling
/update/extract, which is served by the SolrCell ExtractingRequestHandler, while
the UIMA update processor chain is only configured on the /update path, see:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
   <lst name="defaults">
     <str name="update.processor">uima</str>
   </lst>
 </requestHandler>
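
To make that concrete, here is a minimal sketch of how the same chain could be
attached to the extract handler instead. It assumes the stock
ExtractingRequestHandler definition from the Solr 3.5 example solrconfig.xml.
The parameter name update.chain is the Solr 3.2+ spelling of update.processor,
so adjust it if your release differs, and keep whatever other defaults your
handler already has:

```xml
<!-- Sketch only: route /update/extract through the "uima" chain. -->
<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- Assumption: Solr 3.2 or later; older releases call this update.processor -->
    <str name="update.chain">uima</str>
  </lst>
</requestHandler>
```

With that in place, the curl call to /update/extract should pass through the
UIMA processor before the document is indexed.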

HTH
Tommaso

>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3863324.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Solr with UIMA

Posted by dsy99 <ds...@rediffmail.com>.
Hi Rahul,
Thank you for the reply. I tried modifying the
updateRequestProcessorChain as follows:

<updateRequestProcessorChain name="uima" default="true">

But I still cannot see the UIMA fields in the result. I executed
the following curl command to index a file named "test.docx":

curl "http://localhost:8983/solr/update/extract?fmap.content=content&literal.id=doc47&commit=true" \
  -F "file=@test.docx"

When I searched for the same document with
"http://localhost:8983/solr/select?q=id:doc47", I got the following
result:

<result name="response" numFound="1" start="0">
  <doc>
     <str name="author">divakar</str>
     <arr name="content_type">
        <str>application/vnd.openxmlformats-officedocument.wordprocessingml.document</str>
     </arr>
     <str name="id">doc47</str>
     <date name="last_modified">2012-04-18T14:19:00Z</date>
  </doc>
</result>

Could you please help me find where I went wrong?

With Thanks & Regards:
Divakar

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3925670.html

Re: Solr with UIMA

Posted by introfini <in...@gmail.com>.
Rahul Warawdekar wrote
> 
> Hi Divakar,
> 
> Try making your updateRequestProcessorChain as default. Simply add
> default="true" as follows and check if that works.
> 
> <updateRequestProcessorChain name="uima" default="true">
> 
> 

Rahul,

This fixed my problem. You saved my week!

I was following the README.txt instructions and they didn't work. After
adding default="true", it immediately started working.
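
For anyone who finds this thread later, the chain ends up looking roughly like
the sketch below. Only the default="true" attribute is new; the uimaConfig body
is elided here because it stays exactly as the README (or your own setup)
defines it. As far as I understand it, a default chain is used by every update
handler that does not name a chain explicitly, which would explain why it also
takes effect for /update/extract:

```xml
<!-- Sketch: the only change is default="true" on the chain element. -->
<updateRequestProcessorChain name="uima" default="true">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <!-- existing runtimeParameters, analysisEngine, analyzeFields
           and fieldMappings go here, unchanged -->
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```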

Maybe that should go into the README.txt?

Thank you.




--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p4001014.html

Re: Solr with UIMA

Posted by Rahul Warawdekar <ra...@gmail.com>.
Hi Divakar,

Try making your updateRequestProcessorChain the default. Simply add
default="true" as follows and check whether that works.

<updateRequestProcessorChain name="uima" default="true">
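
One more thing that may be worth checking, although I have not verified it
against your setup: update processors run before copyField rules are applied,
so if text is only populated via copyField (as in the example schema), it would
still be empty when the UIMA processor looks at it. Your curl command maps the
extracted body to attr_content, so a sketch of analyzeFields pointing at that
field would be:

```xml
<lst name="analyzeFields">
  <bool name="merge">false</bool>
  <arr name="fields">
    <!-- Assumption: attr_content is where fmap.content sends the
         extracted text in your request; use your own field name -->
    <str>attr_content</str>
  </arr>
</lst>
```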


On Thu, Apr 19, 2012 at 12:01 PM, dsy99 <ds...@rediffmail.com> wrote:

> Hi Chris,
> Are you been able to get success to integrate the UIMA in SOLR.
>
> I too  tried to integrate Uima in Solr by following the instructions
> provided in README i.e. the following four steps:
>
> Step1. I set <lib/> tags in solrconfig.xml appropriately to point the jar
> files.
>
>   <lib dir="../../contrib/uima/lib" />
>    <lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />
>
> Step2. modified my "schema.xml" adding the fields I wanted to  hold
> metadata
> specifying proper values for type, indexed, stored and multiValued options
> as follows:
>
>    <field name="language" type="string" indexed="true" stored="true"
> required="false"/>
>  <field name="concept" type="string" indexed="true" stored="true"
> multiValued="true" required="false"/>
>   <field name="sentence" type="text" indexed="true" stored="true"
> multiValued="true" required="false" />
>
> Step3. modified my solrconfig.xml adding the following snippet:
>
>  <updateRequestProcessorChain name="uima">
>    <processor
> class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
>      <lst name="uimaConfig">
>        <lst name="runtimeParameters">
>           <str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
>          <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
>          <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
>          <str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
>          <str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
>          <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
>        </lst>
>        <str
>
> name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
>
>        <bool name="ignoreErrors">true</bool>
>
>         <lst name="analyzeFields">
>          <bool name="merge">false</bool>
>          <arr name="fields">
>             <str>text</str>
>           </arr>
>        </lst>
>        <lst name="fieldMappings">
>          <lst name="type">
>            <str
> name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
>            <lst name="mapping">
>              <str name="feature">text</str>
>              <str name="field">concept</str>
>            </lst>
>          </lst>
>          <lst name="type">
>            <str
> name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
>            <lst name="mapping">
>              <str name="feature">language</str>
>              <str name="field">language</str>
>            </lst>
>          </lst>
>          <lst name="type">
>            <str name="name">org.apache.uima.SentenceAnnotation</str>
>            <lst name="mapping">
>              <str name="feature">coveredText</str>
>              <str name="field">sentence</str>
>             </lst>
>          </lst>
>        </lst>
>      </lst>
>    </processor>
>    <processor class="solr.LogUpdateProcessorFactory" />
>    <processor class="solr.RunUpdateProcessorFactory" />
>  </updateRequestProcessorChain>
>
> Step 4: and finally created a new UpdateRequestHandler with the following:
>   <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
>    <lst name="defaults">
>      <str name="update.processor">uima</str>
>    </lst>
>
>
> Further I  indexed a word file called text.docx using the following
> command:
>
> curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" \
>   -F "myfile=@UIMA_sample_test.docx"
>
> When I searched the file I am not able to see the additional UIMA fields.
>
> Can you please help if you been able to solve the problem.
>
>
> With Regds & Thanks
> Divakar
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3923443.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Thanks and Regards
Rahul A. Warawdekar

Re: Solr with UIMA

Posted by dsy99 <ds...@rediffmail.com>.
Hi Chris,
Have you been able to integrate UIMA with Solr?

I also tried to integrate UIMA with Solr by following the instructions
provided in the README, i.e. the following four steps:

Step 1. I set the <lib/> tags in solrconfig.xml to point to the jar
files:

   <lib dir="../../contrib/uima/lib" />
   <lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />

Step 2. I modified my schema.xml, adding the fields I wanted to hold the metadata
and specifying proper values for the type, indexed, stored and multiValued options,
as follows:

  <field name="language" type="string" indexed="true" stored="true"
         required="false"/>
  <field name="concept" type="string" indexed="true" stored="true"
         multiValued="true" required="false"/>
  <field name="sentence" type="text" indexed="true" stored="true"
         multiValued="true" required="false" />

Step 3. I modified my solrconfig.xml, adding the following snippet:

  <updateRequestProcessorChain name="uima">
    <processor
class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
      <lst name="uimaConfig">
        <lst name="runtimeParameters">
          <str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
          <str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
          <str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
          <str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
          <str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
          <str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
        </lst>
        <str
name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
        
        <bool name="ignoreErrors">true</bool>
        
        <lst name="analyzeFields">
          <bool name="merge">false</bool>
          <arr name="fields">
            <str>text</str>
          </arr>
        </lst>
        <lst name="fieldMappings">
          <lst name="type">
            <str
name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
            <lst name="mapping">
              <str name="feature">text</str>
              <str name="field">concept</str>
            </lst>
          </lst>
          <lst name="type">
            <str
name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
            <lst name="mapping">
              <str name="feature">language</str>
              <str name="field">language</str>
            </lst>
          </lst>
          <lst name="type">
            <str name="name">org.apache.uima.SentenceAnnotation</str>
            <lst name="mapping">
              <str name="feature">coveredText</str>
              <str name="field">sentence</str>
            </lst>
          </lst>
        </lst>
      </lst>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

Step 4. Finally, I created a new update request handler with the following:
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
    <lst name="defaults">
      <str name="update.processor">uima</str>
    </lst>
  </requestHandler>


Next, I indexed a Word file called text.docx using the following command:

curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" \
  -F "myfile=@UIMA_sample_test.docx"

When I search for the file, I cannot see the additional UIMA fields.

Can you please help, if you have been able to solve the problem?


With Regards & Thanks
Divakar

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3923443.html