Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2019/02/06 16:48:00 UTC

[jira] [Closed] (PDFBOX-4196) getUnqualifiedSequenceDateValueList returns date with undesired default values

     [ https://issues.apache.org/jira/browse/PDFBOX-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr closed PDFBOX-4196.
-----------------------------------
    Resolution: Won't Do

I had a look to find out where the result of {{getUnqualifiedSequenceDateValueList}} comes from: it comes from {{DateConverter.toCalendar}}, which does apply some default values. However, {{Calendar}} supports neither "null" nor "-1" as a field value. If I set the month to -1, this is what happens:
{code}
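// requires: import java.util.Calendar;
Calendar cal = Calendar.getInstance();
cal.clear(); // unset all fields; the time falls back to the epoch
System.out.println(cal.getTime());
System.out.println(cal.get(Calendar.MONTH)); // month defaults to 0 (January)
cal.set(Calendar.MONTH, -1); // the default lenient mode wraps -1 around
System.out.println(cal.get(Calendar.MONTH));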
{code}
The output is:
{code}
Thu Jan 01 00:00:00 CET 1970
0
11
{code}
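The snippet above runs in the default lenient mode, which is why -1 silently wraps around to December. As a supplementary sketch (the class and method names are made up for illustration), switching the calendar to non-lenient mode does not make -1 storable either; the value is rejected as soon as the time is recomputed:

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical demo class, not part of PDFBox or XmpBox.
final class StrictCalendarDemo {

    /** Returns true if a non-lenient GregorianCalendar rejects the given month value. */
    static boolean rejectsMonth(int month) {
        Calendar cal = new GregorianCalendar();
        cal.clear();
        cal.setLenient(false);
        cal.set(Calendar.MONTH, month);
        try {
            // Fields are only validated when the time is recomputed.
            cal.getTime();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsMonth(-1));
        System.out.println(rejectsMonth(Calendar.JUNE));
    }
}
```

Either way, {{Calendar}} has no way to represent "month not specified".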
So you would need to get the raw value and parse it yourself. The best approach would be to take {{DateConverter.toCalendar}} as a starting point and modify it to your needs, so that it fills a Calendar-like structure.
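A minimal sketch of what such a Calendar-like structure could look like, assuming the raw text of the {{rdf:li}} element (e.g. "2017" or "2017-06") is already available as a string. {{PartialDate}} is a hypothetical class, not part of PDFBox or XmpBox; it does not reuse {{DateConverter}}, handles only the year, month and day parts, and leaves out how the raw string is obtained:

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value type: a date whose month and day may be absent. */
final class PartialDate {

    // Accepts "YYYY", "YYYY-MM" or "YYYY-MM-DD"; time and zone parts are not handled.
    private static final Pattern FORMAT =
            Pattern.compile("(\\d{4})(?:-(\\d{2})(?:-(\\d{2}))?)?");

    private final int year;
    private final OptionalInt month; // 1-12, empty if not present in the source
    private final OptionalInt day;   // 1-31, empty if not present in the source

    private PartialDate(int year, OptionalInt month, OptionalInt day) {
        this.year = year;
        this.month = month;
        this.day = day;
    }

    /** Parses the text without filling in defaults for missing parts. */
    static PartialDate parse(String text) {
        Matcher m = FORMAT.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported date: " + text);
        }
        return new PartialDate(Integer.parseInt(m.group(1)),
                toOptional(m.group(2)), toOptional(m.group(3)));
    }

    // A regex group that did not participate in the match is null.
    private static OptionalInt toOptional(String group) {
        return group == null ? OptionalInt.empty()
                             : OptionalInt.of(Integer.parseInt(group));
    }

    int getYear() { return year; }
    OptionalInt getMonth() { return month; }
    OptionalInt getDay() { return day; }

    public static void main(String[] args) {
        PartialDate yearOnly = PartialDate.parse("2017");
        System.out.println(yearOnly.getYear() + " month present: "
                + yearOnly.getMonth().isPresent());
        PartialDate yearMonth = PartialDate.parse("2017-06");
        System.out.println(yearMonth.getYear() + " month present: "
                + yearMonth.getMonth().isPresent());
    }
}
```

Using {{OptionalInt}} instead of a sentinel such as -1 keeps "not specified" distinct from any real field value, which is exactly what {{Calendar}} cannot do.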

> getUnqualifiedSequenceDateValueList returns date with undesired default values
> ------------------------------------------------------------------------------
>
>                 Key: PDFBOX-4196
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4196
>             Project: PDFBox
>          Issue Type: Wish
>          Components: XmpBox
>    Affects Versions: 2.0.8
>            Reporter: Johannes Manner
>            Priority: Major
>
> We try to integrate XMP metadata into PDF files (bibliography metadata written in LaTeX) and read this metadata later on.
> The problem is that sometimes only the year is specified, but the XMP metadata parsing within xmpbox creates calendar values with defaults; the month attribute in particular is critical for us.
> Maybe an example is more informative:
>  <dc:date>
>    <rdf:Seq>
>      <rdf:li>2017-06</rdf:li>
>     </rdf:Seq>
>  </dc:date>
>  In this case, we are only interested in the year and month value. The day value should be null or filled with a non-valid value like -1 or similar.
>  
> More problematic is the following case:
> <dc:date>
>    <rdf:Seq>
>        <rdf:li>2017</rdf:li>
>     </rdf:Seq>
>  </dc:date>
> By default, PDFBox fills the month with 1 (January), and my import recognizes this 1 as January, although the value is not present in the metadata.
> Our wish is to have functionality that lets us handle only the information that is present, without struggling with default values.


