Posted to solr-user@lucene.apache.org by MatthewMeredith <ma...@gmail.com> on 2017/06/26 23:21:22 UTC

Solr PDF parsing failing with java error

I have a shell script set up to clear a solr core and re-index a folder of
PDF files nightly like so:

cd /opt/solr/ &&
bin/post -c comox_core -host 67.231.17.10 \
  -d "<delete><query>attr_is_pdf:true</query></delete>" &&
bin/post -c comox_core -host 67.231.17.10 -filetypes pdf \
  /home/townofco/public_html/modx/assets/pdfs \
  -params "literal.is_pdf=true&uprefix=attr_"

All was working fine as far as I could tell but now I'm getting errors. For
every single file (558 of them) I'm getting something along these lines:

SimplePostTool: WARNING: IOException while reading response:
java.io.IOException: Server returned HTTP response code: 500 for URL:
http://67.231.17.10:8983/solr/comox_core/update/extract?literal.is_pdf=true&uprefix=attr_&resource.name=%2Fhome%2Ftownofco%2Fpublic_html%2Fmodx%2Fassets%2Fpdfs%2F2016+Meeting+dates.pdf&literal.id=%2Fhome%2Ftownofco%2Fpublic_html%2Fmodx%2Fassets%2Fpdfs%2F2016+Meeting+dates.pdf
SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url:
http://67.231.17.10:8983/solr/comox_core/update/extract?literal.is_pdf=true&uprefix=attr_&resource.name=%2Fhome%2Ftownofco%2Fpublic_html%2Fmodx%2Fassets%2Fpdfs%2FTips+on+packing+your+blue+box+for+a+windy+day.pdf&literal.id=%2Fhome%2Ftownofco%2Fpublic_html%2Fmodx%2Fassets%2Fpdfs%2FTips+on+packing+your+blue+box+for+a+windy+day.pdf
SimplePostTool: WARNING: Response: <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 500 Server Error</title>
</head>
<body>
HTTP ERROR 500

<p>Problem accessing /solr/comox_core/update/extract. Reason:
<pre>    Server Error</pre></p>
Caused by:
<pre>java.lang.NoClassDefFoundError: Could not initialize class org.apache.pdfbox.pdmodel.PDPage
    at org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:217)
    at org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:185)
    at org.apache.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:212)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:344)
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:134)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:146)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2053)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:652)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:229)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:184)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:518)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
    at java.lang.Thread.run(Thread.java:745)
</pre>

</body>
</html>

Does anyone know what is happening and how to fix it? After I run the script
I get the COMMIT confirmation saying 558 files were committed, but my Solr
Admin page shows only 162, and if I search for a specific text string from
one of the PDF files, nothing is returned.

EDIT: I should add that if I search for the title of a PDF file, it DOES get
returned...

I checked my lib dir in the solrconfig.xml file and everything looks fine.
Here's my ExtractingRequestHandler configuration:

<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler" >
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="fmap.meta">ignored_</str>
    <str name="fmap.content">_text_</str>
  </lst>
</requestHandler>

EDIT 2: When I try to run a query in the Solr Admin using the
/update/extract request handler, I get the following returned:

{
  "responseHeader":{
    "status":400,
    "QTime":0},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.common.SolrException"],
    "msg":"missing content stream",
    "code":400}}
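
A note on the 400 above: it is probably expected rather than a second fault. The Admin query form sends no file, and /update/extract needs a document in the request body, hence "missing content stream". A minimal sketch of exercising the handler by hand with curl follows. The function names are hypothetical, port 8983 is taken from the error URLs above, and extractOnly=true asks the handler to return the extracted text without indexing anything.

```shell
#!/bin/sh
# Build the /update/extract URL for a given host and core.
# extractOnly=true returns the extracted text instead of indexing it.
extract_url() {
  printf 'http://%s:8983/solr/%s/update/extract?extractOnly=true\n' "$1" "$2"
}

# POST one PDF to the handler as multipart form data.
# Usage: try_extract HOST CORE /path/to/file.pdf
try_extract() {
  curl -sS "$(extract_url "$1" "$2")" -F "myfile=@$3"
}
```

Something like `try_extract 67.231.17.10 comox_core /path/to/one.pdf` should then reproduce the 500 for a single file, which is easier to read than 558 copies of it.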



--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-PDF-parsing-failing-with-java-error-tp4342909.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr PDF parsing failing with java error

Posted by Erick Erickson <er...@gmail.com>.

Take a look at the Solr logs; they'll give you a more explicit message.

My guess: Someone went into the Solr admin UI, clicked "core admin"
and then said "I wonder what this 'new core' button does?". The
default name is, you guessed it, "new_core". And if you don't have the
underlying directories set up already, there's no conf directory. And
so there's no solrconfig.xml to parse. And...

You can get past this by going into /var/solr/data and running
'rm -rf new_core'.

Or, safer, go into

/var/solr/data/new_core/

and rename core.properties to anything else. Solr won't try to load
the core then. This assumes a relatively modern Solr (4.x and above),
or at least one that does not have cores defined in solr.xml.
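
The rename approach can be sketched as a small shell function. This is only an illustration: the function name and the ".disabled" suffix are arbitrary choices, and Solr would most likely need a restart before the change takes effect, since cores are discovered at startup.

```shell
#!/bin/sh
# Disable a core by renaming its core.properties so that core discovery
# no longer finds it. The directory and its contents are left in place.
# Usage: disable_core /var/solr/data/new_core
disable_core() {
  core_dir="$1"
  if [ ! -f "$core_dir/core.properties" ]; then
    echo "no core.properties in $core_dir" >&2
    return 1
  fi
  mv "$core_dir/core.properties" "$core_dir/core.properties.disabled"
}
```

Unlike 'rm -rf', this can be undone by renaming the file back.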

Best,
Erick

Re: Solr PDF parsing failing with java error

Posted by MatthewMeredith <ma...@gmail.com>.

Erick Erickson wrote
> Sure, someone changed the system variable "solr.install.dir" (i.e.
> -Dsolr.install.dir=some other place). Or removed it. Or changed the
> startup script. Or....
> 
> I've gotten very skeptical of "we didn't change anything but suddenly
> it stopped working". Usually it's something someone's changed
> unbeknownst to the person you're interacting with.
> 
> The solr log usually shows the paths where everything gets loaded
> from. You should be able to track where Solr is looking for all its
> resources.
> 
> It's also possible one of the jars was corrupted on disk (disks do go
> bad).
> 
> So you can also inspect the jars to see whether that class is present.
> Here's a way to look for one:
> 
> find . -name '*jar' -exec bash -c 'jar tvf {} | grep ParserDiscoveryAnnotation' \; -print
> 
> where ParserDiscoveryAnnotation is the class you're not finding.
> 
> Best,
> Erick

Erick,

Don't worry, I'm equally sceptical of the situation... But my client
doesn't have access to the server and I haven't logged in for months... So
unless my web host went tinkering :P Could an update have caused an issue?

If I type in:

cd $SOLR_INSTALL

as per the README file, I'm taken to /root. This doesn't seem right, does
it? In my Solr Admin, the CWD is listed as /opt/solr-6.0.1/server and my
core instance is at /var/solr/data/comox_core
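
Landing in /root is probably just the shell at work rather than a Solr problem. Assuming SOLR_INSTALL is a placeholder in the README and not a variable that is set for you, the unquoted expansion is empty, the command becomes a bare `cd`, and that goes to the home directory. A small sketch of the effect, with hypothetical function names and /opt/solr (the path from the nightly script) as an assumed fallback:

```shell
#!/bin/sh
# Show where `cd $VAR` lands for a given value of VAR.
# With an empty value the unquoted expansion disappears, so the shell runs a
# bare `cd`, which changes to $HOME (/root for the root user).
where_cd_lands() {
  ( cd $1 && pwd -P )
}

# Hypothetical guard: fall back to /opt/solr (the path the nightly script
# uses) when SOLR_INSTALL is unset or empty.
solr_install_dir() {
  printf '%s\n' "${SOLR_INSTALL:-/opt/solr}"
}
```

With this, `cd "$(solr_install_dir)"` goes to the Solr directory whether or not the variable has been exported.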

I tried going to the contrib/extraction/lib folder and running that find
command, but I just got:

bash: jar: command not found

a bunch of times (once per .jar file, I assume).
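
"jar: command not found" usually means only a JRE is installed, since the jar tool ships with the JDK. The same check can be approximated without it: a jar is a zip archive, and entry names are stored uncompressed, so a fixed-string grep over the files finds the class path. A rough sketch, with a hypothetical function name; the class path is the one from the stack trace in the original post:

```shell
#!/bin/sh
# Print every jar under DIR whose raw bytes contain ENTRY_NAME.
# Zip archives store entry names uncompressed, so a fixed-string search is
# enough to tell which jar carries a given class file. No JDK required.
# Usage: find_class_in_jars DIR ENTRY_NAME
find_class_in_jars() {
  find "$1" -name '*.jar' -exec grep -lF -- "$2" {} +
}
```

For example, `find_class_in_jars /opt/solr-6.0.1/contrib/extraction/lib org/apache/pdfbox/pdmodel/PDPage.class` should list the pdfbox jar if the class is present. It says nothing about whether the jar is intact.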

Another interesting thing is that when I opened my Solr Admin this morning,
I was shown the following error:

SolrCore Initialization Failures
Hacked:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core Hacked: Error loading solr config from
/var/solr/data/new_core/conf/solrconfig.xml

I have no idea where this "new_core" bit is coming from... I've only ever
had one core (comox_core).

I really appreciate any help you can give!




Re: Solr PDF parsing failing with java error

Posted by Erick Erickson <er...@gmail.com>.

Sure, someone changed the system property "solr.install.dir" (i.e.
-Dsolr.install.dir=some other place). Or removed it. Or changed the
startup script. Or....

I've gotten very skeptical of "we didn't change anything but suddenly
it stopped working". Usually it's something someone's changed
unbeknownst to the person you're interacting with.

The solr log usually shows the paths where everything gets loaded
from. You should be able to track where Solr is looking for all its
resources.
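
One thing worth looking for in that log: "Could not initialize class" is what the JVM reports when a class was found but its static initializer had already failed earlier in the same process. The first failure is normally reported as an ExceptionInInitializerError carrying the real cause. A minimal sketch for pulling it out, assuming a grep that supports -m and -A (GNU and BSD grep do) and a hypothetical function name; the log path below is an assumption and may differ on your install:

```shell
#!/bin/sh
# Print the first ExceptionInInitializerError in a log file, followed by the
# lines after it, where the "Caused by:" root cause usually appears.
# Usage: first_init_error LOGFILE [CONTEXT_LINES]
first_init_error() {
  grep -m 1 -A "${2:-30}" 'ExceptionInInitializerError' "$1"
}
```

Typical use would be `first_init_error /var/solr/logs/solr.log`. If Solr has been running for a long time, the first failure may sit in an older, rotated log file.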

It's also possible one of the jars was corrupted on disk (disks do go bad).

So you can also inspect the jars to see whether that class is present.
Here's a way to look for one:

find . -name '*jar' -exec bash -c 'jar tvf {} | grep ParserDiscoveryAnnotation' \; -print

where ParserDiscoveryAnnotation is the class you're not finding.

Best,
Erick

Re: Solr PDF parsing failing with java error

Posted by MatthewMeredith <ma...@gmail.com>.

Thanks so much for the reply, Erick!

I haven't touched anything in several months; I got a message from the
client I built the website for saying the PDF files they're putting into the
folder weren't being indexed so I went in to investigate and discovered the
error. Here's the applicable part of my solrconfig.xml (again, I haven't
changed anything in the files):

  <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar" />

The contrib/extraction/lib folder has 33 .jar files, including
pdfbox-1.8.8.jar, tika-core-1.7.jar, tika-java7-1.7.jar, etc.

Can you think of any other reason it would be giving that 500 error?




Re: Solr PDF parsing failing with java error

Posted by Erick Erickson <er...@gmail.com>.

Well, assuming you didn't, say, install a new Solr or some such, it
looks like somebody removed some of the jar files that Tika depends
on (they're in the contrib area), or changed the solrconfig.xml file so
that it no longer contains the <lib...> directives, something like:

  <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar" />

BTW, for various reasons I prefer to do the heavy Tika lifting on a
client rather than use Solr's extracting request handler; see:
https://lucidworks.com/2012/02/14/indexing-with-solrj/

That said it's up to you.

Best,
Erick
