Posted to solr-user@lucene.apache.org by javaxmlsoapdev <vi...@yahoo.com> on 2009/11/25 17:16:45 UTC

Where to put ExternalRequestHandler and Tika jars

My SOLR_HOME is set to /home/solr_1_4_0/apache-solr-1.4.0/example/solr/conf in
tomcat.sh.

POI, PDFBox, Tika and related jars are under
/home/solr_1_4_0/apache-solr-1.4.0/lib

When I try to index files using the SolrJ API as follows, the content of the
file is not indexed. Only the file size (in bytes) and the file type end up in
the "content" field. See the schema definition below as well.
ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
up.addFile(file);
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
server.request(up);

schema.xml has the following:
 <field name="issueKey" type="slong" indexed="true" stored="true" required="true" />
 <field name="content" type="text" indexed="true" stored="true" multiValued="true"/>

<defaultSearchField>content</defaultSearchField>

And solrconfig.xml has:
<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
    <lst name="defaults">
      <str name="map.content">content</str>
      <str name="defaultField">content</str>
    </lst>
</requestHandler>

The Luke response is below. It shows the correct count (7) of indexed
documents, but none of the file content is in the index. I don't see any
errors in the Tomcat logs. Unless I am overlooking something, nothing seems
to be missing from the setup. Can anyone advise? Do I need to include the
Tika jars in Tomcat's deployed solr/lib, or under /example/lib in SOLR_HOME?

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">28</int>
  </lst>
  <lst name="index">
    <int name="numDocs">7</int>
    <int name="maxDoc">7</int>
    <int name="numTerms">25</int>
    <long name="version">1259164190261</long>
    <bool name="optimized">false</bool>
    <bool name="current">true</bool>
    <bool name="hasDeletions">false</bool>
    <str name="directory">org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/home/tomcat-solr/bin/docs/data/index</str>
    <date name="lastModified">2009-11-25T15:50:03Z</date>
  </lst>
  <lst name="fields">
    <lst name="content">
      <str name="type">text</str>
      <str name="schema">ITSM----------</str>
      <str name="index">ITS----------</str>
      <int name="docs">7</int>
      <int name="distinct">18</int>
      <lst name="topTerms">
        <int name="text">3</int>
        <int name="applic">3</int>
        <int name="msword">3</int>
        <int name="applicationmsword">3</int>
        <int name="plain">2</int>
        <int name="textplain">2</int>
        <int name="70144">1</int>
        <int name="453">1</int>
        <int name="2370">1</int>
        <int name="html">1</int>
      </lst>
      <lst name="histogram">
        <int name="1">12</int>
        <int name="2">2</int>
        <int name="4">4</int>
      </lst>
    </lst>
    <lst name="issueKey">
      <str name="type">slong</str>
      <str name="schema">I-S----O-----l</str>
      <str name="index">I-S----O-----</str>
      <int name="docs">7</int>
      <int name="distinct">7</int>
      <lst name="topTerms">
        <int name="1">1</int>
        <int name="2">1</int>
        <int name="3">1</int>
        <int name="4">1</int>
        <int name="5">1</int>
        <int name="6">1</int>
        <int name="0">1</int>
      </lst>
      <lst name="histogram">
        <int name="1">7</int>
      </lst>
    </lst>
  </lst>
  <lst name="info">
    <lst name="key">
      <str name="I">Indexed</str>
      <str name="T">Tokenized</str>
      <str name="S">Stored</str>
      <str name="M">Multivalued</str>
      <str name="V">TermVector Stored</str>
      <str name="o">Store Offset With TermVector</str>
      <str name="p">Store Position With TermVector</str>
      <str name="O">Omit Norms</str>
      <str name="L">Lazy</str>
      <str name="B">Binary</str>
      <str name="C">Compressed</str>
      <str name="f">Sort Missing First</str>
      <str name="l">Sort Missing Last</str>
    </lst>
    <str name="NOTE">Document Frequency (df) is not updated when a document is marked for deletion. df values include deleted documents.</str>
  </lst>
</response>
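
The topTerms list for "content" is what gives the problem away: it holds only MIME-type tokens and byte counts. A minimal sketch for pulling that list out of a saved Luke response with the JDK's XPath API, so the check can be repeated after each configuration change. The class name LukeTopTerms and the method topTerms are hypothetical helpers, not part of Solr or SolrJ, and the sketch assumes the response is raw XML without the browser's "-" tree markers.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for inspecting a Luke response; not part of Solr.
public class LukeTopTerms {

    /** Returns term -> document frequency for one field, in response order. */
    public static Map<String, Integer> topTerms(String lukeXml, String field) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The field name is spliced into the expression, so it must not contain quotes.
        String expr = "/response/lst[@name='fields']/lst[@name='" + field
                + "']/lst[@name='topTerms']/int";
        NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(lukeXml)), XPathConstants.NODESET);
        Map<String, Integer> terms = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            terms.put(e.getAttribute("name"), Integer.parseInt(e.getTextContent().trim()));
        }
        return terms;
    }

    public static void main(String[] args) throws Exception {
        // Trimmed-down response with the same nesting as the one above.
        String xml = "<response><lst name=\"fields\"><lst name=\"content\">"
                + "<lst name=\"topTerms\"><int name=\"msword\">3</int>"
                + "<int name=\"70144\">1</int></lst></lst></lst></response>";
        System.out.println(topTerms(xml, "content"));
    }
}
```

If the terms for "content" still look like file metadata rather than words from the documents, text extraction is not happening.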
-- 
View this message in context: http://old.nabble.com/Where-to-put-ExternalRequestHandler-and-Tika-jars-tp26515579p26515579.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Where to put ExternalRequestHandler and Tika jars

Posted by Juan Pedro Danculovic <jd...@gmail.com>.
It is working now. It was a library problem, as you said.

thanks!


On Mon, Nov 30, 2009 at 12:25 PM, javaxmlsoapdev <vi...@yahoo.com> wrote:

>
> Yes, the code I posted at the start of this thread does work, and I am able
> to retrieve data from the document index. Did you include all required jars
> in the deployed Solr application's lib folder? What errors are you seeing?
>

Re: Where to put ExternalRequestHandler and Tika jars

Posted by javaxmlsoapdev <vi...@yahoo.com>.
Yes, the code I posted at the start of this thread does work, and I am able
to retrieve data from the document index. Did you include all required jars
in the deployed Solr application's lib folder? What errors are you seeing?

Juan Pedro Danculovic wrote:
> 
> Hi! Did your example finally work? I index the data with SolrJ and have
> the same problem: I cannot retrieve the file data.


Re: Where to put ExternalRequestHandler and Tika jars

Posted by Juan Pedro Danculovic <jd...@gmail.com>.
Hi! Did your example finally work? I index the data with SolrJ and have
the same problem: I cannot retrieve the file data.


On Wed, Nov 25, 2009 at 3:41 PM, javaxmlsoapdev <vi...@yahoo.com> wrote:

>
> Grrr. I had to put the Tika and related parsing jars into
> tomcat/webapps/solr/WEB-INF/lib. This was an embarrassing mistake;
> apologies for all the noise.
>
> Thanks,

Re: Where to put ExternalRequestHandler and Tika jars

Posted by javaxmlsoapdev <vi...@yahoo.com>.
Grrr. I had to put the Tika and related parsing jars into
tomcat/webapps/solr/WEB-INF/lib. This was an embarrassing mistake;
apologies for all the noise.

Thanks,
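
For anyone hitting the same symptom, here is a rough sketch of a check that the parsing jars are in the webapp's lib directory. It uses only the JDK. The class name ExtractionJarCheck and the name fragments "tika", "pdfbox" and "poi" are assumptions taken from the libraries named in this thread, not an official list of what Solr requires, so adjust them to the jars that ship with your Solr version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical helper; not part of Solr or Tomcat.
public class ExtractionJarCheck {

    /** Returns the name fragments for which no .jar file was found in libDir. */
    public static List<String> missing(Path libDir, List<String> fragments) throws IOException {
        List<String> jarNames = new ArrayList<>();
        try (Stream<Path> entries = Files.list(libDir)) {
            entries.map(p -> p.getFileName().toString().toLowerCase(Locale.ROOT))
                   .filter(n -> n.endsWith(".jar"))
                   .forEach(jarNames::add);
        }
        List<String> notFound = new ArrayList<>();
        for (String fragment : fragments) {
            boolean present = jarNames.stream().anyMatch(n -> n.contains(fragment));
            if (!present) {
                notFound.add(fragment);
            }
        }
        return notFound;
    }

    public static void main(String[] args) throws IOException {
        // Default path is the one from this thread; pass your own as the first argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "tomcat/webapps/solr/WEB-INF/lib");
        // Assumed fragments, based on the libraries mentioned in the original post.
        List<String> fragments = Arrays.asList("tika", "pdfbox", "poi");
        System.out.println("No jar found for: " + missing(libDir, fragments));
    }
}
```

An empty list only means a jar with a matching name exists; it does not prove the versions are compatible.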