Posted to users@pdfbox.apache.org by "Allison, Timothy B." <ta...@mitre.org> on 2015/07/08 17:22:54 UTC

DomXmpParser: namespace not found

All,
Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:

Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
                at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
                at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
                at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
                at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
                at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
                at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)

On a handful of image files in our test docs on Tika, I'm getting this with:

http://ns.adobe.com/lightroom/1.0/
http://ns.adobe.com/exif/1.0/aux/

I'm still kicking the tires on whether we'll be able to make the migration to xmpbox from jempbox.

Thank you.

  Best,

              Tim
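
Editor's sketch (not part of the original message): one way to see up front which namespaces in an XMP packet would trip a schema-validating parser is to scan the packet with the JDK's own DOM parser. The class name XmpNamespaceScan, the `supported` set, and the sample packet are all hypothetical. The set is a stand-in, not the list of schemas xmpbox actually ships, and the sample properties are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class XmpNamespaceScan {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical sample packet using the two namespaces from the report.
    static final String SAMPLE =
        "<x:xmpmeta xmlns:x='adobe:ns:meta/'>"
      + "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>"
      + "<rdf:Description rdf:about=''"
      + " xmlns:dc='http://purl.org/dc/elements/1.1/'"
      + " xmlns:lr='http://ns.adobe.com/lightroom/1.0/'"
      + " xmlns:aux='http://ns.adobe.com/exif/1.0/aux/'"
      + " aux:Lens='50mm'>"
      + "<dc:format>image/jpeg</dc:format>"
      + "<lr:hierarchicalSubject><rdf:Bag><rdf:li>a|b</rdf:li></rdf:Bag></lr:hierarchicalSubject>"
      + "</rdf:Description></rdf:RDF></x:xmpmeta>";

    /** Returns the property namespaces found in rdf:Description elements that are not in `supported`. */
    static Set<String> unsupportedNamespaces(byte[] xmp, Set<String> supported) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getNamespaceURI()
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xmp));
        } catch (Exception e) {
            throw new IllegalArgumentException("Packet is not well-formed XML", e);
        }
        Set<String> missing = new TreeSet<>();
        // Note: this also visits rdf:Description elements nested inside structured properties.
        NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            Element description = (Element) descriptions.item(i);
            // Properties written as attributes, e.g. aux:Lens='50mm'.
            NamedNodeMap attrs = description.getAttributes();
            for (int a = 0; a < attrs.getLength(); a++) {
                collect(attrs.item(a), supported, missing);
            }
            // Properties written as child elements.
            for (Node child = description.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    collect(child, supported, missing);
                }
            }
        }
        return missing;
    }

    private static void collect(Node node, Set<String> supported, Set<String> missing) {
        String ns = node.getNamespaceURI();
        // Skip unqualified names, RDF syntax, xmlns declarations and xml:lang.
        if (ns == null || ns.equals(RDF_NS)
                || ns.equals(XMLConstants.XMLNS_ATTRIBUTE_NS_URI)
                || ns.equals(XMLConstants.XML_NS_URI)) {
            return;
        }
        if (!supported.contains(ns)) {
            missing.add(ns);
        }
    }

    public static void main(String[] args) {
        Set<String> supported = Collections.singleton("http://purl.org/dc/elements/1.1/");
        System.out.println(unsupportedNamespaces(SAMPLE.getBytes(StandardCharsets.UTF_8), supported));
    }
}
```

Run against the sample, the scan reports the lightroom and exif aux namespaces and leaves Dublin Core out, which is the kind of list one would need before deciding which schema definitions are missing.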

Re: DomXmpParser: namespace not found

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

> On 09.07.2015 at 18:13, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> On 09.07.2015 at 15:35, Allison, Timothy B. wrote:
>> From my perspective, it would be great to have a general xmp parser that also allows for some variance from spec (PDFBOX-2855).  We've been using jempbox for pdfs as well as images over on Tika, and it has worked well for us.
>> 
>> I'd prefer to continue using your xmp parser, but I understand if you need to limit what you're willing to take on.
>> 
>> I'll take a look at xmlgraphics, and I'll discuss the fallback option with Tika devs about moving jempbox into Tika.
> 
> I had a quick look at xmlgraphics xmp; it would also require extra implementation work.
> 
> I don't mind having it in xmpbox (we have some non-PDF stuff at other places too), we just need a schema definition. Or the most complex possible file with that namespace. "All" there is to do then is to add a file in org.apache.xmpbox.schema.

Would it be possible to get the XMP files that cause the exception, so we have something to test with?

BR
Maruan

> 
> Tilman
> 
>> 
>> Thank you.
>> 
>> Cheers,
>> 
>>                Tim
>> 
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
>> Sent: Thursday, July 09, 2015 4:56 AM
>> To: users@pdfbox.apache.org
>> Subject: Re: DomXmpParser: namespace not found
>> 
>> Hi,
>> 
>>> On 08.07.2015 at 22:42, Tilman Hausherr <TH...@t-online.de> wrote:
>>> 
>>> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>>>> All,
>>>> Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:
>>>> 
>>>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>>>                 at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>>>                 at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>>>                 at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>>>                 at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>>>> 
>>>> On a handful of image files in our test docs on Tika, I'm getting this with:
>>>> 
>>>> http://ns.adobe.com/lightroom/1.0/
>>>> http://ns.adobe.com/exif/1.0/aux/
>>>> 
>>> These namespaces are not supported by xmpbox. We've had this problem with another namespace (I can't remember which one), and it wasn't possible to support it because we couldn't find a schema definition.
>>> 
>>> But you say these are image files. So this isn't about pdf xmp.
>> xmpbox is targeted at PDF/A-1. So I think we should discuss extending it to support the metadata requirements of other PDF standards as well as generic XMP use cases, so that we again have a generic XMP library. OTOH, there is org.apache.xmlgraphics.xmp.
>> 
>> WDYT?
>> 
>> BR
>> Maruan
>> 
>> 
>>> Tilman
>>> 
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>> 

Re: DomXmpParser: namespace not found

Posted by Tilman Hausherr <TH...@t-online.de>.
On 09.07.2015 at 15:35, Allison, Timothy B. wrote:
>  From my perspective, it would be great to have a general xmp parser that also allows for some variance from spec (PDFBOX-2855).  We've been using jempbox for pdfs as well as images over on Tika, and it has worked well for us.
>
> I'd prefer to continue using your xmp parser, but I understand if you need to limit what you're willing to take on.
>
> I'll take a look at xmlgraphics, and I'll discuss the fallback option with Tika devs about moving jempbox into Tika.

I had a quick look at xmlgraphics xmp; it would also require extra
implementation work.

I don't mind having it in xmpbox (we have some non-PDF stuff at other 
places too), we just need a schema definition. Or the most complex 
possible file with that namespace. "All" there is to do then is to add a 
file in org.apache.xmpbox.schema.

Tilman

>
> Thank you.
>
> Cheers,
>
>                 Tim
>
> -----Original Message-----
> From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de]
> Sent: Thursday, July 09, 2015 4:56 AM
> To: users@pdfbox.apache.org
> Subject: Re: DomXmpParser: namespace not found
>
> Hi,
>
>> On 08.07.2015 at 22:42, Tilman Hausherr <TH...@t-online.de> wrote:
>>
>> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>>> All,
>>> Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:
>>>
>>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>>                  at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>>                  at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>>                  at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>>                  at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>>                  at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>>                  at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>>>
>>> On a handful of image files in our test docs on Tika, I'm getting this with:
>>>
>>> http://ns.adobe.com/lightroom/1.0/
>>> http://ns.adobe.com/exif/1.0/aux/
>>>
>> These namespaces are not supported by xmpbox. We've had this problem with another namespace (I can't remember which one), and it wasn't possible to support it because we couldn't find a schema definition.
>>
>> But you say these are image files. So this isn't about pdf xmp.
> xmpbox is targeted at PDF/A-1. So I think we should discuss extending it to support the metadata requirements of other PDF standards as well as generic XMP use cases, so that we again have a generic XMP library. OTOH, there is org.apache.xmlgraphics.xmp.
>
> WDYT?
>
> BR
> Maruan
>
>
>> Tilman
>>
>>
>>

RE: DomXmpParser: namespace not found

Posted by "Allison, Timothy B." <ta...@mitre.org>.
From my perspective, it would be great to have a general xmp parser that also allows for some variance from spec (PDFBOX-2855).  We've been using jempbox for pdfs as well as images over on Tika, and it has worked well for us.

I'd prefer to continue using your xmp parser, but I understand if you need to limit what you're willing to take on.

I'll take a look at xmlgraphics, and I'll discuss the fallback option with Tika devs about moving jempbox into Tika.

Thank you.

Cheers,

               Tim
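
Editor's sketch (not part of the original message): the "variance from spec" idea above can be pictured as a lenient reader that performs no schema validation at all and simply collects whatever properties it finds. This is not how jempbox or xmpbox is implemented. The class name LenientXmpProperties, the "{namespace}localName" key format, and the sample packet are assumptions made for illustration, and only the JDK's DOM parser is used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class LenientXmpProperties {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical sample packet; the property values are made up.
    static final String SAMPLE =
        "<x:xmpmeta xmlns:x='adobe:ns:meta/'>"
      + "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>"
      + "<rdf:Description rdf:about=''"
      + " xmlns:dc='http://purl.org/dc/elements/1.1/'"
      + " xmlns:lr='http://ns.adobe.com/lightroom/1.0/'"
      + " xmlns:aux='http://ns.adobe.com/exif/1.0/aux/'"
      + " aux:Lens='50mm'>"
      + "<dc:format>image/jpeg</dc:format>"
      + "<lr:hierarchicalSubject><rdf:Bag><rdf:li>a|b</rdf:li><rdf:li>c</rdf:li></rdf:Bag></lr:hierarchicalSubject>"
      + "</rdf:Description></rdf:RDF></x:xmpmeta>";

    /** Maps "{namespace}localName" to the property's values, without consulting any schema. */
    static Map<String, List<String>> read(byte[] xmp) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xmp));
        } catch (Exception e) {
            throw new IllegalArgumentException("Packet is not well-formed XML", e);
        }
        Map<String, List<String>> out = new LinkedHashMap<>();
        NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            Element description = (Element) descriptions.item(i);
            // Attribute-form properties carry a single simple value.
            NamedNodeMap attrs = description.getAttributes();
            for (int a = 0; a < attrs.getLength(); a++) {
                Node attr = attrs.item(a);
                if (isProperty(attr)) {
                    add(out, attr, attr.getNodeValue().trim());
                }
            }
            // Element-form properties: take rdf:li items if present, else the text content.
            for (Node child = description.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child.getNodeType() != Node.ELEMENT_NODE || !isProperty(child)) {
                    continue;
                }
                NodeList items = ((Element) child).getElementsByTagNameNS(RDF_NS, "li");
                if (items.getLength() == 0) {
                    add(out, child, child.getTextContent().trim());
                }
                for (int k = 0; k < items.getLength(); k++) {
                    add(out, child, items.item(k).getTextContent().trim());
                }
            }
        }
        return out;
    }

    // True for namespaced names other than RDF syntax, xmlns declarations and xml:lang.
    private static boolean isProperty(Node node) {
        String ns = node.getNamespaceURI();
        return ns != null && !ns.equals(RDF_NS)
            && !ns.equals(XMLConstants.XMLNS_ATTRIBUTE_NS_URI)
            && !ns.equals(XMLConstants.XML_NS_URI);
    }

    private static void add(Map<String, List<String>> out, Node node, String value) {
        String key = "{" + node.getNamespaceURI() + "}" + node.getLocalName();
        out.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        read(SAMPLE.getBytes(StandardCharsets.UTF_8))
            .forEach((key, values) -> System.out.println(key + " = " + values));
    }
}
```

The trade-off is the obvious one: nothing is type-checked, structured properties are flattened, and every value comes back as a string, so a reader like this suits metadata extraction better than PDF/A validation.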

-----Original Message-----
From: Maruan Sahyoun [mailto:sahyoun@fileaffairs.de] 
Sent: Thursday, July 09, 2015 4:56 AM
To: users@pdfbox.apache.org
Subject: Re: DomXmpParser: namespace not found

Hi,

> On 08.07.2015 at 22:42, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>> All,
>> Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:
>> 
>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>                 at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>                 at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>> 
>> On a handful of image files in our test docs on Tika, I'm getting this with:
>> 
>> http://ns.adobe.com/lightroom/1.0/
>> http://ns.adobe.com/exif/1.0/aux/
>> 
> 
> These namespaces are not supported by xmpbox. We've had this problem with another namespace (I can't remember which one), and it wasn't possible to support it because we couldn't find a schema definition.
> 
> But you say these are image files. So this isn't about pdf xmp.

xmpbox is targeted at PDF/A-1. So I think we should discuss extending it to support the metadata requirements of other PDF standards as well as generic XMP use cases, so that we again have a generic XMP library. OTOH, there is org.apache.xmlgraphics.xmp.

WDYT?

BR
Maruan


> 
> Tilman
> 
> 
> 

Re: DomXmpParser: namespace not found

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

> On 08.07.2015 at 22:42, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
>> All,
>> Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:
>> 
>> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>>                 at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>>                 at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>>                 at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>> 
>> On a handful of image files in our test docs on Tika, I'm getting this with:
>> 
>> http://ns.adobe.com/lightroom/1.0/
>> http://ns.adobe.com/exif/1.0/aux/
>> 
> 
> These namespaces are not supported by xmpbox. We've had this problem with another namespace (I can't remember which one), and it wasn't possible to support it because we couldn't find a schema definition.
> 
> But you say these are image files. So this isn't about pdf xmp.

xmpbox is targeted at PDF/A-1. So I think we should discuss extending it to support the metadata requirements of other PDF standards as well as generic XMP use cases, so that we again have a generic XMP library. OTOH, there is org.apache.xmlgraphics.xmp.

WDYT?

BR
Maruan


> 
> Tilman
> 
> 
> 


Re: DomXmpParser: namespace not found

Posted by Tilman Hausherr <TH...@t-online.de>.
On 08.07.2015 at 17:22, Allison, Timothy B. wrote:
> All,
> Apologies for the idiocy I'm about to reveal (well, that won't be a revelation to anyone, really), but is there an obvious solution for this kind of error:
>
> Caused by: org.apache.xmpbox.xml.XmpParsingException: Cannot find a definition for the namespace http://ns.adobe.com/lightroom/1.0/
>                  at org.apache.xmpbox.xml.DomXmpParser.checkPropertyDefinition(DomXmpParser.java:848)
>                  at org.apache.xmpbox.xml.DomXmpParser.parseChildrenAsProperties(DomXmpParser.java:290)
>                  at org.apache.xmpbox.xml.DomXmpParser.parseDescriptionRoot(DomXmpParser.java:234)
>                  at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:198)
>                  at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:105)
>                  at org.apache.tika.parser.image.xmp.JempboxExtractor.parse(JempboxExtractor.java:59)
>
> On a handful of image files in our test docs on Tika, I'm getting this with:
>
> http://ns.adobe.com/lightroom/1.0/
> http://ns.adobe.com/exif/1.0/aux/
>

These namespaces are not supported by xmpbox. We've had this problem 
with another namespace (I can't remember which one), and it wasn't 
possible to support it because we couldn't find a schema definition.

But you say these are image files. So this isn't about pdf xmp.

Tilman


