Posted to users@pdfbox.apache.org by Kevin Day <ke...@trumpetinc.com> on 2019/09/20 20:23:47 UTC

Question about JBIG2ImageReader usage

I am trying to use JBIG2ImageReader to decode JBIG2 data from a PDF (the
image stream and globals are being provided - we are not using PDFBox to
parse the PDF itself).  Please let me know if I should be using a different
communication avenue for JBIG2-specific questions.


Here's what I'm trying to do:

    JBIG2ImageReader jbig2Reader = new JBIG2ImageReader(new JBIG2ImageReaderSpi());

    // raw bytes from the PDF (DecodeParms, JBIG2Globals)
    byte[] globalBytes = ...;
    ImageInputStream globalsInputStream =
            new DefaultInputStreamFactory().getInputStream(new ByteArrayInputStream(globalBytes));

    JBIG2Globals globals = jbig2Reader.processGlobals(globalsInputStream);
    jbig2Reader.setGlobals(globals);

    // raw JBIG2 image stream bytes from the PDF
    byte[] imageBytes = image.getImageAsBytes();
    ImageInputStream imageInputStream =
            new DefaultInputStreamFactory().getInputStream(new ByteArrayInputStream(imageBytes));
    jbig2Reader.setInput(imageInputStream);

    return jbig2Reader.read(0);


When I do this, I get a null pointer exception:

Exception in thread "main" java.lang.RuntimeException: Can't instantiate segment class
    at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:420)
    at org.apache.pdfbox.jbig2.JBIG2Page.createNormalPage(JBIG2Page.java:202)
    at org.apache.pdfbox.jbig2.JBIG2Page.createPage(JBIG2Page.java:168)
    at org.apache.pdfbox.jbig2.JBIG2Page.composePageBitmap(JBIG2Page.java:157)
    at org.apache.pdfbox.jbig2.JBIG2Page.getBitmap(JBIG2Page.java:133)
    at org.apache.pdfbox.jbig2.JBIG2ImageReader.read(JBIG2ImageReader.java:249)
    at javax.imageio.ImageReader.read(ImageReader.java:939)
    ....
Caused by: java.lang.NullPointerException
    at org.apache.pdfbox.jbig2.segments.TextRegion.initSymbols(TextRegion.java:1010)
    at org.apache.pdfbox.jbig2.segments.TextRegion.getSymbols(TextRegion.java:273)
    at org.apache.pdfbox.jbig2.segments.TextRegion.parseHeader(TextRegion.java:154)
    at org.apache.pdfbox.jbig2.segments.TextRegion.init(TextRegion.java:1128)
    at org.apache.pdfbox.jbig2.SegmentHeader.getSegmentData(SegmentHeader.java:413)
    ... 19 more

The SegmentHeader array in TextRegion looks like this:

 (org.apache.pdfbox.jbig2.SegmentHeader[]) [null,

#SegmentNr: 377
SegmentType: 0
PageAssociation: 1
Referred-to segments: none
]



Note that the first element is null.  I'm not sure why this is (maybe it's
not a valid JBIG2 data stream??).  This file opens and displays fine in PDF
viewers, so I'm assuming it must be something that I'm doing wrong.


Any pointers?

- K

Kevin Day

*trumpet**p| *480.961.6003 x1002
*e| *kevin@trumpetinc.com
*www.trumpetinc.com <http://trumpetinc.com/>*

LinkedIn <https://www.linkedin.com/company/trumpet-inc.>| Trumpet Blog
<http://trumpetinc.com/blog/>| Twitter  <https://twitter.com/trumpetinc>

Re: Question about JBIG2ImageReader usage

Posted by Kevin Day <ke...@trumpetinc.com>.
Thank you - that works.

Kevin Day



On Mon, Sep 23, 2019 at 8:32 PM Tilman Hausherr <TH...@t-online.de>
wrote:

> That is true, we're using the reader in a plugin-independent way, which
> is shown in the source of JBIG2Filter.java:
> [...]


Re: Question about JBIG2ImageReader usage

Posted by Tilman Hausherr <TH...@t-online.de>.
On 23.09.2019 at 23:40, Kevin Day wrote:
> So it would seem like the way that PDFBox invokes JBIG2ImageReader is not
> the above?  Could that be right??


That is true, we're using the reader in a plugin-independent way, which
is shown in the source of JBIG2Filter.java:


InputStream encoded = ...; // the input stream of the main image (without the globals)

InputStream source = encoded;

// if the decode parameters contain a JBIG2Globals stream, prepend it
if (globals instanceof COSStream)
{
    source = new SequenceInputStream(((COSStream) globals).createInputStream(), encoded);
}

...

ImageInputStream iis = ImageIO.createImageInputStream(source);

reader.setInput(iis);

image = reader.read(0, irp);
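
For a caller that only has the raw byte arrays (no COSStream), the same idea
might look roughly like the sketch below. The class and method names
(Jbig2StreamJoin, joinGlobalsAndImage, decode) are made up for illustration.
It assumes the jbig2-imageio plugin is on the classpath so that ImageIO can
find a "JBIG2" reader, and that globalBytes is null or empty when the image
has no JBIG2Globals entry.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, not part of PDFBox or jbig2-imageio.
final class Jbig2StreamJoin {

    /** Returns a stream yielding the globals bytes (if any) followed by the image bytes. */
    static InputStream joinGlobalsAndImage(byte[] globalBytes, byte[] imageBytes) {
        InputStream encoded = new ByteArrayInputStream(imageBytes);
        if (globalBytes == null || globalBytes.length == 0) {
            return encoded; // no globals: decode the image stream on its own
        }
        return new SequenceInputStream(new ByteArrayInputStream(globalBytes), encoded);
    }

    /** Decodes the first image, looking the reader up through ImageIO by format name. */
    static BufferedImage decode(byte[] globalBytes, byte[] imageBytes) throws IOException {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("JBIG2");
        if (!readers.hasNext()) {
            throw new IOException("No JBIG2 ImageIO reader found on the classpath");
        }
        ImageReader reader = readers.next();
        try (ImageInputStream iis =
                ImageIO.createImageInputStream(joinGlobalsAndImage(globalBytes, imageBytes))) {
            reader.setInput(iis);
            return reader.read(0, reader.getDefaultReadParam());
        } finally {
            reader.dispose();
        }
    }
}
```

With this shape there is no separate processGlobals/setGlobals step; the
reader simply sees the globals segments ahead of the page segments in one
stream.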



Tilman




---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Question about JBIG2ImageReader usage

Posted by Kevin Day <ke...@trumpetinc.com>.
PDFDebugger is working fine - so the issue must be with how I'm using the
library, or how I'm extracting the globals stream...

I checked the globals stream contents that I'm extracting and compared to
the globals in PDFDebugger, and they are identical bytes.

I also checked the image content stream, and it has identical bytes as well.
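
A byte-for-byte check like this can be scripted along the following lines.
This is only a sketch: the class and method names (ByteCompare,
describeDifference) are made up for illustration, and it assumes both streams
have already been dumped to byte arrays. Arrays.mismatch requires Java 9 or
later.

```java
import java.util.Arrays;

// Hypothetical helper for comparing two extracted streams.
final class ByteCompare {

    /** Describes where two byte arrays first differ, or reports that they are identical. */
    static String describeDifference(byte[] expected, byte[] actual) {
        // Arrays.mismatch returns -1 when both length and contents match.
        int idx = Arrays.mismatch(expected, actual);
        if (idx < 0) {
            return "identical (" + expected.length + " bytes)";
        }
        // When one array is a prefix of the other, idx equals the shorter length.
        if (idx >= expected.length || idx >= actual.length) {
            return "lengths differ: " + expected.length + " vs " + actual.length
                    + " (common prefix of " + idx + " bytes)";
        }
        return String.format("first difference at offset %d: 0x%02X vs 0x%02X",
                idx, expected[idx] & 0xFF, actual[idx] & 0xFF);
    }
}
```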


I even changed my code to be identical to yours:

    JBIG2ImageReader reader =
            (JBIG2ImageReader) ImageIO.getImageReadersByFormatName("JBIG2").next();
    JBIG2Globals globals = reader.processGlobals(
            ImageIO.createImageInputStream(new ByteArrayInputStream(globalBytes)));
    reader.setGlobals(globals);
    reader.setInput(ImageIO.createImageInputStream(new ByteArrayInputStream(imageBytes)));
    return reader.read(0, reader.getDefaultReadParam());

and it still fails.

But PDFDebugger works fine.


So it would seem like the way that PDFBox invokes JBIG2ImageReader is not
the above?  Could that be right??

- K


Kevin Day



On Fri, Sep 20, 2019 at 9:28 PM Tilman Hausherr <TH...@t-online.de>
wrote:

> I wonder if the PDF can be displayed with PDFDebugger. If no => bug. If
> yes, then you should debug this to see what calls are done, and whether
> you have the same data input. Your calls seem to be OK, they look
> similar to those I did when I debugged something in the jbig2 reader
> (link is before it went to Apache, don't open issues on github):
> https://github.com/levigo/jbig2-imageio/issues/21
>
> Tilman

Re: Question about JBIG2ImageReader usage

Posted by Tilman Hausherr <TH...@t-online.de>.
I wonder if the PDF can be displayed with PDFDebugger. If no => bug. If 
yes, then you should debug this to see what calls are done, and whether 
you have the same data input. Your calls seem to be OK, they look 
similar to those I did when I debugged something in the jbig2 reader 
(link is before it went to Apache, don't open issues on github):
https://github.com/levigo/jbig2-imageio/issues/21

Tilman


