Posted to users@pdfbox.apache.org by Olivier Cailloux <ol...@gmail.com> on 2017/04/24 12:53:33 UTC
Get list of PDPageLabelRange?
Dear list,
How can I obtain, from a given PDDocument, a list of the page label
ranges that it contains?
Here is an example where I obtain the first PDPageLabelRange. How to
retrieve the other ones? I realize I can iterate over all pages of the
document and query for the possible existence of a pageLabelRange at
each page, but I suspect there must be a more efficient (and simpler) way.
try (PDDocument document = PDDocument.load(…)) {
    assert !document.isEncrypted();
    PDDocumentCatalog catalog = document.getDocumentCatalog();
    PDPageLabels labels = catalog.getPageLabels();
    PDPageLabelRange pageLabelRange = labels.getPageLabelRange(0);
}
Thanks.
Olivier
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: Get list of PDPageLabelRange?
Posted by Olivier Cailloux <ol...@gmail.com>.
On 28/04/2017 at 21:32, Tilman Hausherr wrote:
> /**
> * Get an ordered set of page indices having a page label range.
> *
> * @return set of page indices.
> */
> public SortedSet<Integer> getPageIndices()
>
>
> Please give feedback whether it works for you.
Perfect. Thanks. (Unrelatedly, it’s very handy that the snapshots are
available through Maven.)
(Nitpicking: have you considered NavigableSet instead of SortedSet? But
for my purpose SortedSet is enough.)
Olivier
>
> Tilman
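On the SortedSet versus NavigableSet question, here is a minimal sketch, using only java.util, of what NavigableSet would add for this use case: finding the start index of the range that covers a given page. It assumes the set holds the page indices at which label ranges start. The class name, method names and sample indices are hypothetical and not part of PDFBox.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not part of PDFBox. */
class RangeStartLookup {

    /** NavigableSet: floor() returns the greatest element <= pageIndex, or null. */
    static Integer viaNavigableSet(NavigableSet<Integer> starts, int pageIndex) {
        return starts.floor(pageIndex);
    }

    /** SortedSet only: headSet() is exclusive, so ask for everything below pageIndex + 1. */
    static Integer viaSortedSet(SortedSet<Integer> starts, int pageIndex) {
        SortedSet<Integer> head = starts.headSet(pageIndex + 1);
        return head.isEmpty() ? null : head.last();
    }

    public static void main(String[] args) {
        // Assumed sample data: ranges starting at pages 0, 4 and 10.
        TreeSet<Integer> starts = new TreeSet<>(Arrays.asList(0, 4, 10));
        System.out.println(viaNavigableSet(starts, 7));
        System.out.println(viaSortedSet(starts, 7));
    }
}
```

Both methods give the same answer; the NavigableSet version is only shorter, which fits the remark that SortedSet is enough here.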
Re: Get list of PDPageLabelRange?
Posted by Tilman Hausherr <TH...@t-online.de>.
On 28.04.2017 at 13:42, Olivier Cailloux wrote:
> On 27/04/2017 at 18:15, Tilman Hausherr wrote:
>> So, would this help you?
>>
>>
>> public Set<Integer> getPageIndices()
>> {
>> return labels.keySet();
>> }
>>
>
> Perfect! (Possibly also with a (javadoc) guarantee about the ascending
> iteration order over the set contents, if that’s easy to provide.)
> Olivier
Done.
https://issues.apache.org/jira/browse/PDFBOX-3770
You can get a snapshot here:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.6-SNAPSHOT/
/**
* Get an ordered set of page indices having a page label range.
*
* @return set of page indices.
*/
public SortedSet<Integer> getPageIndices()
Please give feedback whether it works for you.
Tilman
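Why the ascending order matters: with sorted start indices, the extent of each range follows from consecutive elements. Below is a minimal sketch in plain java.util. In real code the indices would presumably come from labels.getPageIndices() and each range from labels.getPageLabelRange(start), as discussed in this thread; the class and method names here are hypothetical, and the sketch assumes every start index is below the page count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not part of PDFBox. */
class PageLabelSpans {

    /**
     * Turns ascending range start indices into {start, endExclusive} pairs.
     * Relies on the set iterating in ascending order.
     */
    static List<int[]> spans(SortedSet<Integer> startIndices, int pageCount) {
        List<int[]> result = new ArrayList<>();
        Integer previous = null;
        for (Integer start : startIndices) {
            if (previous != null) {
                // The previous range ends where this one begins.
                result.add(new int[] {previous, start});
            }
            previous = start;
        }
        if (previous != null) {
            // The last range runs to the end of the document.
            result.add(new int[] {previous, pageCount});
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample data: a 12-page document with ranges starting at 0, 4 and 10.
        SortedSet<Integer> starts = new TreeSet<>(Arrays.asList(0, 4, 10));
        for (int[] span : spans(starts, 12)) {
            System.out.println(span[0] + " to " + span[1] + " (exclusive)");
        }
    }
}
```

Without the ordering guarantee, the caller would have to copy and sort the indices before pairing them up.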
Re: Get list of PDPageLabelRange?
Posted by Olivier Cailloux <ol...@gmail.com>.
On 27/04/2017 at 18:15, Tilman Hausherr wrote:
> So, would this help you?
>
>
> public Set<Integer> getPageIndices()
> {
> return labels.keySet();
> }
>
Perfect! (Possibly also with a (javadoc) guarantee about the ascending
iteration order over the set contents, if that’s easy to provide.)
Olivier
Re: Get list of PDPageLabelRange?
Posted by Tilman Hausherr <TH...@t-online.de>.
On 26.04.2017 at 13:44, Olivier Cailloux wrote:
> On 25/04/2017 at 18:43, Tilman Hausherr wrote:
>> On 25.04.2017 at 14:15, Olivier Cailloux wrote:
>>> On 24/04/2017 at 20:10, Tilman Hausherr wrote:
>>>> On 24.04.2017 at 14:53, Olivier Cailloux wrote:
>>>>> Dear list,
>>>>> How can I obtain, from a given PDDocument, a list of the page
>>>>> label ranges that it contains?
>>>>>
>>>>> Here is an example where I obtain the first PDPageLabelRange. How
>>>>> to retrieve the other ones? I realize I can iterate over all pages
>>>>> of the document and query for the possible existence of a
>>>>> pageLabelRange at each page, but I suspect there must be a more
>>>>> efficient (and simpler) way.
>>>>>
>>>>> try (PDDocument document = PDDocument.load(…)) {
>>>>> assert !document.isEncrypted();
>>>>> PDDocumentCatalog catalog = document.getDocumentCatalog();
>>>>> PDPageLabels labels = catalog.getPageLabels();
>>>>> PDPageLabelRange pageLabelRange = labels.getPageLabelRange(0);
>>>>> }
>>>>
>>>> [accidentally mailed; repost for the list]
>>>>
>>>> Do you need the range (which is a naming scheme) or do you need the
>>>> label?
>>> Thanks for your reply. I need the ranges.
>>>
>>>>
>>>> First one isn't available for some reason. There's a Map<Integer,
>>>> PDPageLabelRange> labels, but it is not available to the public.
>>> Should I file a request for improving this situation somehow?
>>> Olivier
>>
>> I'm wondering whether it is really THAT important?
> Certainly not extremely important. But still more elegant and nice to
> have, I think. I suspect other users will wonder how to iterate over
> the label ranges, as it seems a natural piece of information to
> provide. (My two cents.)
>
>> I'm not sure if I should expose the map; maybe return a list of pages
>> as a set? You could then iterate on that one to get the ranges.
> A set of PDPageLabelRange in itself has low usefulness, as
> PDPageLabelRange objects do not contain the corresponding page
> indexes. I’d rather suggest providing a set of page indexes on which
> PDPageLabelRange start. (Assuming you do not plan to change the rest
> of the API.) This would fit well with the existing
> labels.getPageLabelRange method. Or even better (IMHO), add the page
> index info to the PDPageLabelRange object. I feel it belongs there.
The page label object mirrors the page label dictionary (see in the PDF
specification "Entries in a page label dictionary") so we won't add the
page number.
"I’d rather suggest providing a set of page indexes on which
PDPageLabelRange start."
That is what I meant, although my words "maybe return a list of pages as
a set?" missed the word "indexes".
So, would this help you?
public Set<Integer> getPageIndices()
{
    return labels.keySet();
}
Tilman
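One point to weigh with the keySet() proposal: the key set of a java.util map is a live view, so a caller who removes an index from it also removes the mapping. A minimal sketch of the difference follows, assuming the private map is a TreeMap (the thread only says Map<Integer, PDPageLabelRange>). The class is hypothetical and String stands in for PDPageLabelRange; this is not a claim about how PDFBox implements the method.

```java
import java.util.Collections;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;

/** Hypothetical stand-in for a class holding the private labels map; not PDFBox code. */
class LabelIndexView {

    // Assumption: a TreeMap, so that keys are kept in ascending order.
    private final TreeMap<Integer, String> labels = new TreeMap<>();

    void put(int startIndex, String scheme) {
        labels.put(startIndex, scheme);
    }

    int size() {
        return labels.size();
    }

    /** Live view: removing an element here also removes the map entry. */
    Set<Integer> liveIndices() {
        return labels.keySet();
    }

    /** Read-only ascending view: modification attempts throw UnsupportedOperationException. */
    SortedSet<Integer> readOnlyIndices() {
        return Collections.unmodifiableSortedSet(labels.navigableKeySet());
    }

    public static void main(String[] args) {
        LabelIndexView view = new LabelIndexView();
        view.put(0, "roman");
        view.put(4, "decimal");
        view.liveIndices().remove(4);      // silently drops the range starting at page 4
        System.out.println(view.size());
        System.out.println(view.readOnlyIndices());
    }
}
```

Whether the live view is acceptable depends on whether callers are meant to edit the ranges through the returned set.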
Re: Get list of PDPageLabelRange?
Posted by Olivier Cailloux <ol...@gmail.com>.
On 25/04/2017 at 18:43, Tilman Hausherr wrote:
> On 25.04.2017 at 14:15, Olivier Cailloux wrote:
>> On 24/04/2017 at 20:10, Tilman Hausherr wrote:
>>> On 24.04.2017 at 14:53, Olivier Cailloux wrote:
>>>> Dear list,
>>>> How can I obtain, from a given PDDocument, a list of the page label
>>>> ranges that it contains?
>>>>
>>>> Here is an example where I obtain the first PDPageLabelRange. How
>>>> to retrieve the other ones? I realize I can iterate over all pages
>>>> of the document and query for the possible existence of a
>>>> pageLabelRange at each page, but I suspect there must be a more
>>>> efficient (and simpler) way.
>>>>
>>>> try (PDDocument document = PDDocument.load(…)) {
>>>> assert !document.isEncrypted();
>>>> PDDocumentCatalog catalog = document.getDocumentCatalog();
>>>> PDPageLabels labels = catalog.getPageLabels();
>>>> PDPageLabelRange pageLabelRange = labels.getPageLabelRange(0);
>>>> }
>>>
>>> [accidentally mailed; repost for the list]
>>>
>>> Do you need the range (which is a naming scheme) or do you need the
>>> label?
>> Thanks for your reply. I need the ranges.
>>
>>>
>>> First one isn't available for some reason. There's a Map<Integer,
>>> PDPageLabelRange> labels, but it is not available to the public.
>> Should I file a request for improving this situation somehow?
>> Olivier
>
> I'm wondering whether it is really THAT important?
Certainly not extremely important. But still more elegant and nice to
have, I think. I suspect other users will wonder how to iterate over
the label ranges, as it seems a natural piece of information to provide.
(My two cents.)
> I'm not sure if I should expose the map; maybe return a list of pages
> as a set? You could then iterate on that one to get the ranges.
A set of PDPageLabelRange in itself has low usefulness, as
PDPageLabelRange objects do not contain the corresponding page indexes.
I’d rather suggest providing a set of page indexes on which
PDPageLabelRange start. (Assuming you do not plan to change the rest of
the API.) This would fit well with the existing labels.getPageLabelRange
method. Or even better (IMHO), add the page index info to the
PDPageLabelRange object. I feel it belongs there.
Olivier
Re: Get list of PDPageLabelRange?
Posted by Tilman Hausherr <TH...@t-online.de>.
On 25.04.2017 at 14:15, Olivier Cailloux wrote:
> On 24/04/2017 at 20:10, Tilman Hausherr wrote:
>> On 24.04.2017 at 14:53, Olivier Cailloux wrote:
>>> Dear list,
>>> How can I obtain, from a given PDDocument, a list of the page label
>>> ranges that it contains?
>>>
>>> Here is an example where I obtain the first PDPageLabelRange. How to
>>> retrieve the other ones? I realize I can iterate over all pages of
>>> the document and query for the possible existence of a
>>> pageLabelRange at each page, but I suspect there must be a more
>>> efficient (and simpler) way.
>>>
>>> try (PDDocument document = PDDocument.load(…)) {
>>> assert !document.isEncrypted();
>>> PDDocumentCatalog catalog = document.getDocumentCatalog();
>>> PDPageLabels labels = catalog.getPageLabels();
>>> PDPageLabelRange pageLabelRange = labels.getPageLabelRange(0);
>>> }
>>
>> [accidentally mailed; repost for the list]
>>
>> Do you need the range (which is a naming scheme) or do you need the
>> label?
> Thanks for your reply. I need the ranges.
>
>>
>> First one isn't available for some reason. There's a Map<Integer,
>> PDPageLabelRange> labels, but it is not available to the public.
> Should I file a request for improving this situation somehow?
> Olivier
I'm wondering whether it is really THAT important? I'm not sure if I
should expose the map; maybe return a list of pages as a set? You could
then iterate on that one to get the ranges.
Tilman
Re: Get list of PDPageLabelRange?
Posted by Olivier Cailloux <ol...@gmail.com>.
On 24/04/2017 at 20:10, Tilman Hausherr wrote:
> On 24.04.2017 at 14:53, Olivier Cailloux wrote:
>> Dear list,
>> How can I obtain, from a given PDDocument, a list of the page label
>> ranges that it contains?
>>
>> Here is an example where I obtain the first PDPageLabelRange. How to
>> retrieve the other ones? I realize I can iterate over all pages of
>> the document and query for the possible existence of a pageLabelRange
>> at each page, but I suspect there must be a more efficient (and
>> simpler) way.
>>
>> try (PDDocument document = PDDocument.load(…)) {
>> assert !document.isEncrypted();
>> PDDocumentCatalog catalog = document.getDocumentCatalog();
>> PDPageLabels labels = catalog.getPageLabels();
>> PDPageLabelRange pageLabelRange = labels.getPageLabelRange(0);
>> }
>
> [accidentally mailed; repost for the list]
>
> Do you need the range (which is a naming scheme) or do you need the label?
Thanks for your reply. I need the ranges.
>
> First one isn't available for some reason. There's a Map<Integer,
> PDPageLabelRange> labels, but it is not available to the public.
Should I file a request for improving this situation somehow?
Olivier
>
> Tilman
Re: Get list of PDPageLabelRange?
Posted by Tilman Hausherr <TH...@t-online.de>.
On 24.04.2017 at 14:53, Olivier Cailloux wrote:
> Dear list,
> How can I obtain, from a given PDDocument, a list of the page label
> ranges that it contains?
>
> Here is an example where I obtain the first PDPageLabelRange. How to
> retrieve the other ones? I realize I can iterate over all pages of the
> document and query for the possible existence of a pageLabelRange at
> each page, but I suspect there must be a more efficient (and simpler)
> way.
>
> try (PDDocument document = PDDocument.load(…)) {
> assert !document.isEncrypted();
> PDDocumentCatalog catalog = document.getDocumentCatalog();
> PDPageLabels labels = catalog.getPageLabels();
> PDPageLabelRange pageLabelRange = labels.getPageLabelRange(0);
> }
[accidentally mailed; repost for the list]
Do you need the range (which is a naming scheme) or do you need the
label? The latter would be easy:
labels.getLabelsByPageIndices();
The former isn't available for some reason. There's a Map<Integer,
PDPageLabelRange> labels, but it is not available to the public.
Tilman
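To make the range/label distinction concrete: a range is a numbering scheme that applies from some page index onward, while a label is the resulting string for one page. The sketch below expands ranges into per-page labels in plain java.util. It is a simplification under stated assumptions: only a decimal style with a prefix and a start number is modelled, and the Scheme class and method name are hypothetical stand-ins, not the PDFBox API.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of ranges versus labels; not PDFBox code. */
class LabelExpansion {

    /** Simplified stand-in for a page label range: a prefix plus a first number. */
    static final class Scheme {
        final String prefix;
        final int start;

        Scheme(String prefix, int start) {
            this.prefix = prefix;
            this.start = start;
        }
    }

    /** Expands ranges (keyed by the page index where each one starts) into one label per page. */
    static String[] labelsByPageIndex(TreeMap<Integer, Scheme> ranges, int pageCount) {
        String[] labels = new String[pageCount];
        for (int page = 0; page < pageCount; page++) {
            // The governing range is the one with the greatest start index <= page.
            Map.Entry<Integer, Scheme> entry = ranges.floorEntry(page);
            if (entry == null) {
                continue; // no range covers this page; the label stays null in this sketch
            }
            Scheme scheme = entry.getValue();
            int number = scheme.start + (page - entry.getKey());
            labels[page] = scheme.prefix + number;
        }
        return labels;
    }

    public static void main(String[] args) {
        // Assumed sample data: three pages labelled A-1, A-2, A-3, then numbering restarts at 1.
        TreeMap<Integer, Scheme> ranges = new TreeMap<>();
        ranges.put(0, new Scheme("A-", 1));
        ranges.put(3, new Scheme("", 1));
        System.out.println(Arrays.toString(labelsByPageIndex(ranges, 5)));
    }
}
```

Going the other way, from labels back to ranges, is not possible in general, which is why the ranges themselves need to be reachable.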