Posted to user@tika.apache.org by Chris Bamford <cb...@mimecast.com> on 2016/04/22 18:13:49 UTC
Jempbox runtime error
Hi
I recently upgraded to Tika 1.12 from 1.7 and read the notes about Jempbox no longer being used. My pom now pulls in the 1.12 versions of tika-core, tika-parsers, tika-xmp and tika-bundle.
The app is running well but very occasionally we see:
java.lang.NoClassDefFoundError: org/apache/jempbox/xmp/XMPMetadata
It is happening on an RTF file which unfortunately I cannot share.
I have generated a Maven dependency tree and there is no mention of jempbox in it at all. Has anyone seen this issue, or does anyone have ideas about what I could try?
Thanks,
- Chris
Chris Bamford
Lead Software Engineer
CityPoint, One Ropemaker Street,
London,
EC2Y 9AW.
mobile +44 7860 405292
tel: +44 (0) 207 847 8700
web www.mimecast.com
The information contained in this communication from cbamford@mimecast.com is confidential and may be legally privileged. It is intended solely for use by user@tika.apache.org and others authorized to receive it. If you are not user@tika.apache.org you are hereby notified that any disclosure, copying, distribution or taking action in reliance of the contents of this information is strictly prohibited and may be unlawful.
Mimecast Ltd. is a company registered in England and Wales with the company number 4698693 VAT No. GB 832 5179 29
Registered Office: CityPoint, One Ropemaker Street, Moorgate, London, EC2Y 9AW Email Address: info@mimecast.com
This email message has been scanned for viruses by Mimecast.
Mimecast delivers a complete managed email solution from a single web based platform.
For more information please visit http://www.mimecast.com
RE: Jempbox runtime error
Posted by "Allison, Timothy B." <ta...@mitre.org>.
Ok, phew. Yes, they are, but we’re not…yet. ☺
Tika 1.13 should be around the corner, and that’ll include PDFBox 2.0 (and Jempbox!).
Best,
Tim
Re: Jempbox runtime error
Posted by Chris Bamford <cb...@mimecast.com>.
Thanks.
No, it was my confusion - PDFBox (which is also part of our app) has recently dropped it (see http://pdfbox.apache.org/2.0/migration.html).
So we may be actively managing it out - will revisit and hopefully all will be good.
Cheers,
- Chris
RE: Jempbox runtime error
Posted by "Allison, Timothy B." <ta...@mitre.org>.
That should be in our tika-parsers’ pom:
<jempbox.version>1.8.11</jempbox.version>
So, um, where did you see that we had dropped Jempbox? I know that we wanted to at some point, but XMPBox only works on PDF/A so we aren’t going to move to that any time soon.
Cheers,
Tim
Re: Jempbox runtime error
Posted by Chris Bamford <cb...@mimecast.com>.
Hi Tim,
Nice to hear from you too - and thanks for the quick reply!
Good to know about the dependency, will try to include it (what version do you recommend?).
Thanks
- Chris
RE: Jempbox runtime error
Posted by "Allison, Timothy B." <ta...@mitre.org>.
Hi Chris,
Good to hear from you. We do still use Jempbox in 1.12 for the PDFParser and the JempboxExtractor. The RTF must have an embedded PDF, JPEG, or other image file.
Is there any chance Maven is not smiling upon you with transitive dependencies? When you bundle your app, are you including all dependencies?
Very strange that it isn’t showing up in the dependency tree…
Hmmm…
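[Editorial sketch, not part of the original message.] One way to check the bundling question above is to ask the running JVM whether it can see the class named in the error, and which jar it comes from. The class name ClasspathCheck and the helper locate are hypothetical names chosen for illustration; only org.apache.jempbox.xmp.XMPMetadata comes from the thread.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /**
     * Returns the location (usually a jar URL) the named class is loaded from,
     * or null if the class cannot be found or linked.
     */
    public static String locate(String className) {
        try {
            // initialize=false: look the class up without running its static initializers
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the JDK bootstrap loader report no code source
            return src == null ? "(JDK / bootstrap)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Default to the class from the NoClassDefFoundError in this thread
        String name = args.length > 0 ? args[0] : "org.apache.jempbox.xmp.XMPMetadata";
        String where = locate(name);
        System.out.println(name + " -> " + (where == null ? "NOT FOUND on classpath" : where));
    }
}
```

Run it with the same classpath as the deployed application. If it reports NOT FOUND there, the Jempbox jar is probably being dropped or excluded at packaging time rather than by the Tika poms.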