Posted to solr-user@lucene.apache.org by Charlie Hubbard <ch...@gmail.com> on 2015/06/14 19:08:37 UTC
Solrj Tika/Cell not using defaultField
I'm having trouble getting Solr to honor the defaultField parameter
when I send a document to Solr Cell (Tika). Here is the POST request I'm
sending using SolrJ:
POST /solr/collection1/update/extract?extractOnly=true&defaultField=text&wt=javabin&version=2 HTTP/1.1
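To reproduce this request outside SolrJ (for example with curl), here is a minimal sketch of how that request line is assembled from the parameters. The class ExtractUrlSketch and the helper buildExtractPath are hypothetical names used only for illustration; they are not SolrJ API. The wt and version parameters are presumably added by SolrJ itself, since my code does not set them.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ExtractUrlSketch {

    // Hypothetical helper (not part of SolrJ): URL-encodes each parameter
    // and joins them into the path and query string of the extract request.
    // URLEncoder.encode(String, Charset) requires Java 10 or later.
    static String buildExtractPath(String collection, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return "/solr/" + collection + "/update/extract?" + query;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("extractOnly", "true");
        params.put("defaultField", "text");
        params.put("wt", "javabin"); // assumed to be added by SolrJ
        params.put("version", "2");  // assumed to be added by SolrJ
        System.out.println("POST " + buildExtractPath("collection1", params) + " HTTP/1.1");
    }
}
```

Sending the same path with a different wt value (json, say) would make the raw response readable, which should show whether the null names come from Solr or from how the client parses the result.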
When I get the response back, the NamedList contains the extracted content,
but the content and metadata are under the names null and null_metadata,
respectively. I've seen it return the defaultField I give it before, but
for some reason it no longer does. I've even tried configuring the
ExtractingRequestHandler like so:
<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="defaultField">text</str>
    <!--<str name="lowernames">true</str>-->
    <!--<str name="uprefix">ignored_</str>-->
    <!-- capture link hrefs but ignore div attributes -->
    <str name="captureAttr">true</str>
    <str name="fmap.content">text</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
  <!--<str name="tika.config">tika.config</str>-->
</requestHandler>
But even that doesn't get picked up. Here is the SolrJ code I use to set
the parameters:
public SolrRequest toSolrExtractRequest() throws IOException {
    ContentStreamUpdateRequest req =
            new ContentStreamUpdateRequest("/update/extract");
    req.addFile(getLocation(), null);
    req.setParam(EXTRACT_ONLY, "true");
    req.setParam(DEFAULT_FIELD, "text");
    return req;
}
So why is this not working?
Charlie