Posted to solr-user@lucene.apache.org by Vasiliy Boldyrev <va...@gmail.com> on 2017/06/14 17:30:45 UTC

Can't upload pdf file to example Core

 Hello,

 I am using Apache Solr™ version 6.6.0 but can't upload a PDF file to a core.

 The instructions and example were taken from
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika

 I added paths to the /dist/ and /contrib/extraction jar files in
solrconfig.xml:
<lib dir="${solr.install.dir:../../..}/contrib/extraction/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-cell-\d.*\.jar" />

 I changed the requestHandler with name=/update/extract:
 I added <str name="fmap.Last-Modified">last_modified</str> but did not add
the optional "tika.config" parameter to the requestHandler.

 From the web interface I tried to upload a PDF document to the "techproducts"
example core but received the error "Unsupported ContentType: application/pdf".

 At http://localhost:8983/solr/#/techproducts/documents I set Document Type to
File Upload, chose solr-word.pdf, and received this error:

 "Unsupported ContentType: application/pdf Not in: [application/xml,
application/csv, application/json, text/json, text/csv, text/xml,
application/javabin]"

 From Core log file:
 ERROR - 2017-06-14 17:19:01.190; [   x:techproducts]
org.apache.solr.common.SolrException; org.apache.solr.common.SolrException:
Unsupported ContentType: application/pdf  Not in: [application/xml,
application/csv, application/json, text/json, text/csv, text/xml,
application/javabin]
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:90)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Unknown Source)


I also tried the curl utility to upload the PDF file but received an error as well:

C:\install\solr-6.6.0\example\exampledocs>curl.exe http://localhost:8983/solr/techproducts/update/extract?literal.id=doc1&commit=true -F solr-word.pdf
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">31</int></lst><lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst><str name="msg">missing content stream</str><int name="code">400</int></lst>
</response>
'commit' is not recognized as an internal or external command, operable program or batch file.

BR, Vasily Boldyrev

Re: Can't upload pdf file to example Core

Posted by Susheel Kumar <su...@gmail.com>.
Try using the curl command directly in a terminal/console and it will work. I
just tried it with 6.6 on a Mac. Uploading through the UI will not work for PDFs
unless more parameters are provided, although it does work directly for
XML/JSON files etc.

curl 'http://localhost:8983/solr/techproducts/update/extract?literal.id=doc1&commit=true' -F "myfile=@example/exampledocs/solr-word.pdf"
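
For anyone scripting the same upload instead of typing it into a shell, the
curl request above can be sketched in Python with only the standard library.
This is a minimal sketch, not the documented Solr client API: it assumes Solr
is listening on localhost:8983 with the techproducts example core, the helper
names build_extract_url, encode_multipart and upload_pdf are hypothetical, and
the form field name "myfile" simply mirrors the curl command.

```python
import os
import urllib.parse
import urllib.request
import uuid


def build_extract_url(base, core, params):
    """Return the /update/extract URL with URL-encoded query parameters."""
    query = urllib.parse.urlencode(params)
    return f"{base}/solr/{core}/update/extract?{query}"


def encode_multipart(field, filename, data, content_type, boundary=None):
    """Encode one file as a multipart/form-data body, like curl -F "field=@file"."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(path, doc_id, base="http://localhost:8983", core="techproducts"):
    """POST a PDF to /update/extract; host, port and core name are assumptions."""
    url = build_extract_url(base, core, {"literal.id": doc_id, "commit": "true"})
    with open(path, "rb") as fh:
        body, ctype = encode_multipart(
            "myfile", os.path.basename(path), fh.read(), "application/pdf"
        )
    request = urllib.request.Request(url, data=body, headers={"Content-Type": ctype})
    # urlopen raises urllib.error.HTTPError for a 4xx/5xx response from Solr.
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

With a running instance, upload_pdf("example/exampledocs/solr-word.pdf", "doc1")
sends the same request as the curl command. Building the URL in code also avoids
shell quoting: in Windows cmd an unquoted & ends the command, which is why
'commit' was reported as an unrecognized command, and cmd needs double quotes
around the URL rather than single quotes.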

