Posted to users@pdfbox.apache.org by karthick g <ik...@gmail.com> on 2017/05/18 03:47:05 UTC

Linearized dictionary

Hi team,

I am a long-time user of PDFBox. We have started migrating PDFBox from 1.8.2 to
2.0.5.
During the migration I found that the Linearized dictionary has moved to the
preflight jar. I created the PDDocument from the preflight context, but it
returns null, so I can't proceed further. What is the right way to get the
Linearized dictionary in the current version of PDFBox? Please guide me.
Please let me know if you need more details.

Regards,
Karthick G

Re: Linearized dictionary

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 18.05.2017 at 05:47, karthick g wrote:
> Hi team,
> 
> I am a long-time user of PDFBox. We have started migrating PDFBox from 1.8.2
> to 2.0.5.
> During the migration I found that the Linearized dictionary has moved to the
> preflight jar.
No, that's not correct. It was always part of preflight and located in the 
preflight jar.

> I created the PDDocument from the preflight context, but it returns null.
> Since the PDDocument is null I can't proceed further. What is the right way
> to get the Linearized dictionary in the current version of PDFBox? Please
> guide me.
The dictionary isn't needed to parse a linearized PDF. Why do you need it?

> Please let me know if you need more details.
> 
> Regards,
> Karthick G

Andreas



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Linearized dictionary

Posted by Tilman Hausherr <TH...@t-online.de>.
On 19.05.2017 at 05:57, karthick g wrote:
> Now the Linearized keyword is not in the list of COSName. How can I get the
> Linearized dictionary in PDFBox?



COSName.getPDFName("Linearized")






Re: Linearized dictionary

Posted by karthick g <ik...@gmail.com>.
Hi team,

The code you gave, which loads the document with the preflight parser, works fine.

Thank You

Regards,
karthick G


Re: Linearized dictionary

Posted by Tilman Hausherr <TH...@t-online.de>.
You could find the /L value with a regular expression. Obviously, it will look like this:

something << something /L Digits something  >> something
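A minimal sketch of that regular-expression idea, using only the Java standard library and no PDFBox classes. The class and method names (LinearizedLengthCheck, declaredLength, isStillLinearized) are hypothetical. It assumes the linearization dictionary is a flat "<< ... >>" near the start of the file, with no nested dictionaries or hex strings, and that the caller has decoded the first bytes of the file as ISO-8859-1 text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of PDFBox. */
final class LinearizedLengthCheck {

    // First flat dictionary that contains a /Linearized key.
    // [^>]* keeps the match inside a single "<< ... >>" pair.
    private static final Pattern DICT =
            Pattern.compile("<<([^>]*/Linearized\\b[^>]*)>>");

    // The /L entry: the name L, whitespace, then an integer.
    // "/Linearized" cannot match here because \s+ must follow "/L".
    private static final Pattern LENGTH =
            Pattern.compile("/L\\s+(\\d{1,18})\\b");

    /** Returns the /L value from the linearization dictionary, or -1 if none is found. */
    static long declaredLength(String head) {
        Matcher dict = DICT.matcher(head);
        if (!dict.find()) {
            return -1;
        }
        Matcher len = LENGTH.matcher(dict.group(1));
        return len.find() ? Long.parseLong(len.group(1)) : -1;
    }

    /**
     * True only if a /Linearized dictionary is present and its /L value equals
     * the real file size. An incremental update appends bytes, so the two
     * numbers no longer agree afterwards.
     */
    static boolean isStillLinearized(String head, long actualFileLength) {
        long declared = declaredLength(head);
        return declared >= 0 && declared == actualFileLength;
    }
}
```

For a real file, the head string could come from the first kilobyte or so of the file, and actualFileLength from java.nio.file.Files.size(path).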


I suspect that it's a difference between the normal parser and the 
preflight parser.

Another idea: open the document with preflight.

         // DICTIONARY_KEY_LINEARIZED is a constant from PreflightConstants
         PreflightParser parser = new PreflightParser(new File("XXXXX.pdf"));
         parser.parse();
         PreflightDocument preflightDocument = parser.getPreflightDocument();

         // walk all objects and print the dictionary that has a /Linearized key
         List<?> lObj = preflightDocument.getDocument().getObjects();
         for (Object object : lObj)
         {
             COSBase curObj = ((COSObject) object).getObject();
             if (curObj instanceof COSDictionary
                     && ((COSDictionary) curObj).keySet().contains(COSName.getPDFName(DICTIONARY_KEY_LINEARIZED)))
             {
                 System.out.println("Linearized: " + curObj);
             }
         }
         preflightDocument.close();


Output I get:

Linearized: COSDictionary{COSName{E}:COSInt{5442};COSName{H}:COSArray{[COSInt{811}, COSInt{211}]};COSName{L}:COSInt{54625};COSName{Linearized}:COSInt{1};COSName{N}:COSInt{18};COSName{O}:COSInt{92};COSName{T}:COSInt{54090};}

Tilman


On 30.05.2017 at 05:53, karthick g wrote:
> Hi,
>
> Thanks for your reply.
>
> "@karthick: as a "dumb" workaround, just read 1024 bytes (or whatever is
> best) and search for "Linearized"."
>
> With this we can check for the Linearized keyword. But if there is an
> incremental update, the keyword will still show up in the text even though
> technically the file is no longer linearized. To handle that, I read the value
> of /L from the dictionary, which is the size of the file:
>
> cOSDictionary.keySet().contains(COSName.getPDFName("Linearized"))
>                              && pdf_file_length == cOSDictionary.getInt("L")
>
> This code works fine in 1.8.2, but in the latest version I can't get /L from
> the dictionary, so I can't get the size of the file from the /L entry. How can
> I implement this in the latest version of PDFBox when there is an incremental
> update in the file?
>
> Regards,
> Karthick G


Re: Linearized dictionary

Posted by karthick g <ik...@gmail.com>.
Hi,

Thanks for your reply.

"@karthick: as a "dumb" workaround, just read 1024 bytes (or whatever is
best) and search for "Linearized"."

With this we can check for the Linearized keyword. But if there is an
incremental update, the keyword will still show up in the text even though
technically the file is no longer linearized. To handle that, I read the value
of /L from the dictionary, which is the size of the file:

cOSDictionary.keySet().contains(COSName.getPDFName("Linearized"))
                            && pdf_file_length == cOSDictionary.getInt("L")

This code works fine in 1.8.2, but in the latest version I can't get /L from
the dictionary, so I can't get the size of the file from the /L entry. How can I
implement this in the latest version of PDFBox when there is an incremental
update in the file?

Regards,
Karthick G




Re: Linearized dictionary

Posted by Tilman Hausherr <TH...@t-online.de>.
On 22.05.2017 at 12:30, Andreas Lehmkühler wrote:
>> With 1.8.2 the Linearized check works properly. But with 2.0.5 I cannot find
>> the Linearized entry, because it is not in any dictionary's key set. Please
>> let me know if you need more details.
> I can confirm the behaviour. The object is read but not dereferenced as it isn't needed. Consequently that dictionary isn't part of the object pool.
> I have no solution yet ....


@karthick: as a "dumb" workaround, just read 1024 bytes (or whatever is 
best) and search for "Linearized".
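
A minimal sketch of this workaround in plain Java, without PDFBox. The class name LinearizedSniffer and its methods are hypothetical, and 1024 is simply the byte count suggested above. It only tells you that the key appears near the start of the file; it does not prove the file is still linearized after an incremental update.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of PDFBox. */
final class LinearizedSniffer {

    /** Number of leading bytes to inspect (an assumption; tune as needed). */
    static final int HEAD_SIZE = 1024;

    /** Returns true if the first {@code length} bytes contain the /Linearized key. */
    static boolean containsLinearizedKey(byte[] head, int length) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data
        // in the header cannot break the decoding.
        String text = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        return text.contains("/Linearized");
    }

    /** Reads up to HEAD_SIZE bytes from the start of the file and searches them. */
    static boolean containsLinearizedKey(Path pdf) throws IOException {
        byte[] head = new byte[HEAD_SIZE];
        int total = 0;
        try (InputStream in = Files.newInputStream(pdf)) {
            int n;
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the stream ends.
            while (total < head.length
                    && (n = in.read(head, total, head.length - total)) != -1) {
                total += n;
            }
        }
        return containsLinearizedKey(head, total);
    }
}
```

Usage would be something like LinearizedSniffer.containsLinearizedKey(Paths.get("XXXXX.pdf")), where the file name is a placeholder.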

Tilman



Re: Linearized dictionary

Posted by Andreas Lehmkühler <an...@lehmi.de>.
> karthick g <ik...@gmail.com> wrote on 22 May 2017 at 06:17:
> 
> With 1.8.2 the Linearized check works properly. But with 2.0.5 I cannot find
> the Linearized entry, because it is not in any dictionary's key set. Please
> let me know if you need more details.
I can confirm the behaviour. The object is read but not dereferenced as it isn't needed. Consequently that dictionary isn't part of the object pool.
I have no solution yet ....

Andreas


Re: Linearized dictionary

Posted by karthick g <ik...@gmail.com>.
Hi team,

Here is the code. I am using COSName.getPDFName("Linearized"):

PDDocument pdDoc = PDDocument.load(new File(""));
COSDocument cosDoc = pdDoc.getDocument();
// iterate over all objects in the document's object pool
List<?> lObj = cosDoc.getObjects();
for (Object object : lObj) {
    COSBase curObj = ((COSObject) object).getObject();
    if (curObj instanceof COSDictionary) {
        COSDictionary cOSDictionary = (COSDictionary) curObj;
        // the linearization dictionary is the one with a /Linearized key
        if (cOSDictionary.keySet().contains(COSName.getPDFName("Linearized"))) {
            //System.out.println("Linearized");
        }
    }
}

With 1.8.2 the Linearized check works properly. But with 2.0.5 I cannot find
the Linearized entry, because it is not in any dictionary's key set. Please
let me know if you need more details.




Regards,
Karthick G


Re: Linearized dictionary

Posted by karthick g <ik...@gmail.com>.
Hi,
*I need to check whether my PDF file is linearized, for fast web view.*
In the previous version (1.8.2) of PDFBox, Linearized was in COSName. I would
get the COSDictionary, check whether Linearized was available in the COSName,
and conclude that the PDF is suited for fast web view. Now the Linearized
keyword is not in the list of COSName. How can I get the Linearized dictionary
in PDFBox?
Please let me know if you need more details.

Regards,
Karthick G


