Posted to user@tika.apache.org by Chris Bamford <cb...@mimecast.com> on 2016/04/22 18:13:49 UTC

Jempbox runtime error

Hi

I recently upgraded to Tika 1.12 from 1.7 and read the notes about Jempbox no longer being used.  My pom now pulls in 1.12 versions of tika-core, tika-parsers, tika-xmp and tika-bundle.
The app is running well but very occasionally we see:

 java.lang.NoClassDefFoundError: org/apache/jempbox/xmp/XMPMetadata

It is happening on an RTF file which unfortunately I cannot share.

I have generated a Maven dependency tree and there is no mention of jempbox in there at all.  Has anyone seen this issue, or does anyone have ideas about what I could try?

Thanks,

- Chris


Chris Bamford
Lead Software Engineer

CityPoint, One Ropemaker Street, 
London,
EC2Y 9AW.

mobile +44 7860 405292
tel: +44 (0) 207 847 8700
web www.mimecast.com


The information contained in this communication from cbamford@mimecast.com is confidential and may be legally privileged. It is intended solely for use by user@tika.apache.org and others authorized to receive it. If you are not user@tika.apache.org you are hereby notified that any disclosure, copying, distribution or taking action in reliance of the contents of this information is strictly prohibited and may be unlawful.


Mimecast Ltd. is a company registered in England and Wales with the company number 4698693 VAT No. GB 832 5179 29
Registered Office: CityPoint, One Ropemaker Street, Moorgate, London, EC2Y 9AW Email Address: info@mimecast.com

This email message has been scanned for viruses by Mimecast.
Mimecast delivers a complete managed email solution from a single web based platform.
For more information please visit http://www.mimecast.com


RE: Jempbox runtime error

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Ok, phew.  Yes, they are, but we’re not…yet. ☺

Tika 1.13 should be around the corner, and that’ll include PDFBox 2.0 (and Jempbox!).

Best,

          Tim


Re: Jempbox runtime error

Posted by Chris Bamford <cb...@mimecast.com>.
Thanks.

No, it was my confusion - PDFBox (which is also part of our app) has recently dropped it (see http://pdfbox.apache.org/2.0/migration.html).
So we may be actively managing it out - will revisit and hopefully all will be good.

Cheers,

- Chris


RE: Jempbox runtime error

Posted by "Allison, Timothy B." <ta...@mitre.org>.
That should be in our tika-parsers’ pom:

<jempbox.version>1.8.11</jempbox.version>

So, um, where did you see that we had dropped Jempbox?  I know that we wanted to at some point, but XMPBox only works on PDF/A so we aren’t going to move to that any time soon.

Cheers,

          Tim



Re: Jempbox runtime error

Posted by Chris Bamford <cb...@mimecast.com>.
Hi Tim,

Nice to hear from you too - and thanks for the quick reply!

Good to know about the dependency, will try to include it (what version do you recommend?).

Thanks

- Chris


RE: Jempbox runtime error

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Hi Chris,
  Good to hear from you.  We do still use Jempbox in 1.12 for the PDFParser and the JempboxExtractor.  The RTF must have an embedded PDF, JPEG or other image file.
  Is there any chance Maven is not smiling upon you with transitive dependencies?  When you bundle your app are you including all dependencies?
  Very strange that it isn’t showing up in the dependency tree…

  Hmmm…
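
One way to narrow down the transitive-dependency question above is to ask the running JVM directly whether the class from the stack trace is visible, and from which jar. The following is a minimal sketch, not part of Tika: the JempboxProbe class and its locate method are hypothetical names made up for this illustration. Only the class name org.apache.jempbox.xmp.XMPMetadata comes from the error reported in this thread.

```java
import java.security.CodeSource;

// Hypothetical helper (not a Tika or Jempbox API): reports whether a class
// can be loaded by the current class loader, and from which location.
public class JempboxProbe {

    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false, JempboxProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK core classes typically have no code source
            return src == null
                    ? "loaded (no code source, probably a JDK class)"
                    : "loaded from " + src.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, the error seen in this thread
            return "missing: " + e;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the NoClassDefFoundError in the original post
        System.out.println(locate("org.apache.jempbox.xmp.XMPMetadata"));
    }
}
```

Run it with the same classpath as the packaged application rather than the IDE or build classpath, since the packaged app is where a dependency may have been excluded. A "missing" result there, combined with a dependency tree that does not list jempbox, would point to an exclusion or dependency-management rule somewhere in the build.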