Posted to solr-user@lucene.apache.org by javaxmlsoapdev <vi...@yahoo.com> on 2009/11/25 17:16:45 UTC
Where to put ExternalRequestHandler and Tika jars
My SOLR_HOME = /home/solr_1_4_0/apache-solr-1.4.0/example/solr/conf in tomcat.sh.
POI, PDFBox, Tika and related jars are under /home/solr_1_4_0/apache-solr-1.4.0/lib.
When I try to index files using the SolrJ API as follows, the content of the
file is not indexed. Only the file size (bytes) and the file type end up in the
"content" field. See the schema definition below as well.
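// Send the file to the extracting handler registered at /update/extract
ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
up.addFile(file);
// Commit right away (waitFlush = true, waitSearcher = true)
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
server.request(up);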
schema.xml has the following:
<field name="issueKey" type="slong" indexed="true" stored="true" required="true" />
<field name="content" type="text" indexed="true" stored="true" multiValued="true"/>
<defaultSearchField>content</defaultSearchField>
And solrconfig.xml has:
<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
<lst name="defaults">
<str name="map.content">content</str>
<str name="defaultField">content</str>
</lst>
</requestHandler>
The Luke response is below. It shows the correct count (7) of indexed
documents, but the "content" field holds no file content. I don't see any
errors in the Tomcat logs. Unless I am overlooking something, I don't see
anything missing in the setup. Can anyone advise? Do I need to include the
Tika jars in Tomcat's deployed solr/lib or under /example/lib in
SOLR_HOME?
<?xml version="1.0" encoding="UTF-8" ?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">28</int>
</lst>
<lst name="index">
<int name="numDocs">7</int>
<int name="maxDoc">7</int>
<int name="numTerms">25</int>
<long name="version">1259164190261</long>
<bool name="optimized">false</bool>
<bool name="current">true</bool>
<bool name="hasDeletions">false</bool>
<str name="directory">org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/home/tomcat-solr/bin/docs/data/index</str>
<date name="lastModified">2009-11-25T15:50:03Z</date>
</lst>
<lst name="fields">
<lst name="content">
<str name="type">text</str>
<str name="schema">ITSM----------</str>
<str name="index">ITS----------</str>
<int name="docs">7</int>
<int name="distinct">18</int>
<lst name="topTerms">
<int name="text">3</int>
<int name="applic">3</int>
<int name="msword">3</int>
<int name="applicationmsword">3</int>
<int name="plain">2</int>
<int name="textplain">2</int>
<int name="70144">1</int>
<int name="453">1</int>
<int name="2370">1</int>
<int name="html">1</int>
</lst>
<lst name="histogram">
<int name="1">12</int>
<int name="2">2</int>
<int name="4">4</int>
</lst>
</lst>
<lst name="issueKey">
<str name="type">slong</str>
<str name="schema">I-S----O-----l</str>
<str name="index">I-S----O-----</str>
<int name="docs">7</int>
<int name="distinct">7</int>
<lst name="topTerms">
<int name="1">1</int>
<int name="2">1</int>
<int name="3">1</int>
<int name="4">1</int>
<int name="5">1</int>
<int name="6">1</int>
<int name="0">1</int>
</lst>
<lst name="histogram">
<int name="1">7</int>
</lst>
</lst>
</lst>
<lst name="info">
<lst name="key">
<str name="I">Indexed</str>
<str name="T">Tokenized</str>
<str name="S">Stored</str>
<str name="M">Multivalued</str>
<str name="V">TermVector Stored</str>
<str name="o">Store Offset With TermVector</str>
<str name="p">Store Position With TermVector</str>
<str name="O">Omit Norms</str>
<str name="L">Lazy</str>
<str name="B">Binary</str>
<str name="C">Compressed</str>
<str name="f">Sort Missing First</str>
<str name="l">Sort Missing Last</str>
</lst>
<str name="NOTE">Document Frequency (df) is not updated when a document is
marked for deletion. df values include deleted documents.</str>
</lst>
</response>
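The topTerms list for "content" in the response above is the main clue: it holds only MIME-type and file-size tokens (msword, plain, 70144 and so on), not words from the documents themselves. A minimal sketch of pulling that list out of a saved Luke response so it can be inspected or asserted on; the class name LukeTopTerms and the helper topTerms are made up for illustration, only the JDK's built-in XML APIs are used, and the field name is assumed to contain no quote characters.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr or SolrJ.
class LukeTopTerms {

    /** Returns term -> count from the topTerms list of one field, in response order. */
    static Map<String, Integer> topTerms(String lukeXml, String field) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(lukeXml.getBytes(StandardCharsets.UTF_8)));
            // Mirrors the nesting of the response: response > fields > <field> > topTerms > int
            String path = "/response/lst[@name='fields']/lst[@name='" + field
                    + "']/lst[@name='topTerms']/int";
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(path, doc, XPathConstants.NODESET);
            Map<String, Integer> terms = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element el = (Element) nodes.item(i);
                terms.put(el.getAttribute("name"), Integer.parseInt(el.getTextContent().trim()));
            }
            return terms;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not read Luke response", ex);
        }
    }

    public static void main(String[] args) {
        // Trimmed-down sample with the same structure as the response above.
        String sample =
                "<response><lst name=\"fields\"><lst name=\"content\"><lst name=\"topTerms\">"
              + "<int name=\"msword\">3</int><int name=\"plain\">2</int>"
              + "</lst></lst></lst></response>";
        System.out.println(topTerms(sample, "content")); // prints {msword=3, plain=2}
    }
}
```

If the terms returned for "content" are all metadata-like tokens, as here, the extracted body text never reached the field.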
--
View this message in context: http://old.nabble.com/Where-to-put-ExternalRequestHandler-and-Tika-jars-tp26515579p26515579.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Where to put ExternalRequestHandler and Tika jars
Posted by Juan Pedro Danculovic <jd...@gmail.com>.
Now it is working. It was a library problem, as you said.
Thanks!
On Mon, Nov 30, 2009 at 12:25 PM, javaxmlsoapdev <vi...@yahoo.com> wrote:
>
> Yes. code I posted in first thread does work. And I am able to retrieve data
> from the document index. did you include all required jars in deployed solr
> application's lib folder? what errors are you seeing?
Re: Where to put ExternalRequestHandler and Tika jars
Posted by javaxmlsoapdev <vi...@yahoo.com>.
Yes, the code I posted in the first message of this thread does work, and I am
able to retrieve data from the document index. Did you include all the required
jars in the deployed Solr application's lib folder? What errors are you seeing?
Juan Pedro Danculovic wrote:
>
> HI! does your example finally works? I index the data with solrj and I have
> the same problem and could not retrieve file data.
--
View this message in context: http://old.nabble.com/Where-to-put-ExternalRequestHandler-and-Tika-jars-tp26515579p26576242.html
Re: Where to put ExternalRequestHandler and Tika jars
Posted by Juan Pedro Danculovic <jd...@gmail.com>.
Hi! Did your example finally work? I index the data with SolrJ and have the
same problem: I cannot retrieve the file data.
On Wed, Nov 25, 2009 at 3:41 PM, javaxmlsoapdev <vi...@yahoo.com> wrote:
>
> grrrrrrrr. I had to include tika and related parsing jars into
> tomcat/webapps/solr/WEB-INF/lib.. this was an embarrassing mistake.
> apologies for all the noise.
>
> Thanks,
Re: Where to put ExternalRequestHandler and Tika jars
Posted by javaxmlsoapdev <vi...@yahoo.com>.
grrrrrrrr. I had to put the Tika and related parsing jars into
tomcat/webapps/solr/WEB-INF/lib. This was an embarrassing mistake.
Apologies for all the noise.
Thanks,
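The fix described above comes down to making sure the extraction jars sit in the deployed webapp's WEB-INF/lib. As a rough sketch of checking that, the following lists a lib directory and reports which expected jar-name prefixes are absent. The class name WebInfLibCheck, the prefixes (tika, pdfbox, poi) and the default path are assumptions for illustration; the actual jar names depend on the Solr distribution in use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Solr.
class WebInfLibCheck {

    /** Returns the prefixes from `required` that no .jar name in `fileNames` starts with. */
    static List<String> missingPrefixes(List<String> fileNames, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String prefix : required) {
            boolean found = fileNames.stream()
                    .map(String::toLowerCase)
                    .anyMatch(n -> n.startsWith(prefix) && n.endsWith(".jar"));
            if (!found) {
                missing.add(prefix);
            }
        }
        return missing;
    }

    /** Lists the names of the entries directly inside libDir. */
    static List<String> listFileNames(Path libDir) throws IOException {
        try (Stream<Path> files = Files.list(libDir)) {
            return files.map(p -> p.getFileName().toString()).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Default path taken from the message above; pass another directory as the first argument.
        Path lib = Paths.get(args.length > 0 ? args[0] : "tomcat/webapps/solr/WEB-INF/lib");
        List<String> required = Arrays.asList("tika", "pdfbox", "poi"); // assumed prefixes
        System.out.println("Missing jar prefixes: " + missingPrefixes(listFileNames(lib), required));
    }
}
```

An empty list only shows that jars with those name prefixes are present; it does not prove that every transitive parsing dependency is on the classpath.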
--
View this message in context: http://old.nabble.com/Where-to-put-ExternalRequestHandler-and-Tika-jars-tp26515579p26518100.html