Posted to dev@pdfbox.apache.org by "Timo Boehme (JIRA)" <ji...@apache.org> on 2009/02/04 15:38:02 UTC

[jira] Created: (PDFBOX-418) PDFStreamParser reads incorrect number (patch provided)

PDFStreamParser reads incorrect number (patch provided)
-------------------------------------------------------

                 Key: PDFBOX-418
                 URL: https://issues.apache.org/jira/browse/PDFBOX-418
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 0.8.0-incubator
            Reporter: Timo Boehme
             Fix For: 0.8.0-incubator


With one of our documents, PDFBox (Incubator version compiled 2009-02-04) throws a floating point number exception.
The reason: PDFStreamParser's parseNextToken() method is not strict enough when reading a number. In our case the string read
was '97.-96', which clearly cannot be parsed as a number. Thus, when parsing a number (starting at line 238), '+' and '-' should
only be allowed in the first position (and one should probably also make sure that '.' can only be read once).

                  StringBuffer buf = new StringBuffer();
                  
                  buf.append( c );
                  pdfSource.read();
                  
                  boolean dotNotRead = (c != '.');
                  
                  while( Character.isDigit(( c = (char)pdfSource.peek()) ) || (dotNotRead && (c == '.')) )
                  {
                      buf.append( c );
                      pdfSource.read();
                      
                      if (dotNotRead && (c == '.'))
                          dotNotRead = false;
                  }
                  retval = COSNumber.get( buf.toString() );
                break;


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Issue Comment Edited: (PDFBOX-418) PDFStreamParser reads incorrect number (patch provided)

Posted by "Andreas Lehmkühler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/PDFBOX-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670343#action_12670343 ] 

lehmi edited comment on PDFBOX-418 at 2/4/09 7:17 AM:
-------------------------------------------------------------------

Seems to be the same issue as PDFBOX-228

      was (Author: lehmi):
    Seems to be the same issue
  

[jira] Updated: (PDFBOX-418) PDFStreamParser reads incorrect number (patch provided)

Posted by "Timo Boehme (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PDFBOX-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timo Boehme updated PDFBOX-418:
-------------------------------

    Description: 
With one of our documents, PDFBox (Incubator version compiled 2009-02-04) throws a floating point number exception.
The reason: PDFStreamParser's parseNextToken() method is not strict enough when reading a number. In our case the string read
was '97.-96', which clearly cannot be parsed as a number. Thus, when parsing a number (starting at line 238), '+' and '-' should
only be allowed in the first position (and one should probably also make sure that '.' can only be read once).

The following patch completely replaces the code after "case '.':" at line 236. The first condition in the replaced code is not
necessary, since the test is already done by the 'case' statements, so we don't have to throw an exception either.

                  StringBuffer buf = new StringBuffer();
                  
                  buf.append( c );
                  pdfSource.read();
                  
                  boolean dotNotRead = (c != '.');
                  
                  while( Character.isDigit(( c = (char)pdfSource.peek()) ) || (dotNotRead && (c == '.')) )
                  {
                      buf.append( c );
                      pdfSource.read();
                      
                      if (dotNotRead && (c == '.'))
                          dotNotRead = false;
                  }
                  retval = COSNumber.get( buf.toString() );
                break;
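
To illustrate the reading rule described above outside of PDFBox, here is a minimal stand-alone sketch. It is not the PDFBox source: the class NumberTokenSketch, its readNumber and split helpers, and the use of java.io.PushbackReader in place of pdfSource are all assumptions made for this example. It only shows how a sign is accepted solely as the first character and how a second '.' ends the token.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not the PDFBox class itself.
class NumberTokenSketch {

    // Reads one numeric token. The caller is assumed to have checked that the
    // next character is a digit, '+', '-' or '.', as the 'case' labels do.
    static String readNumber(PushbackReader in) throws IOException {
        StringBuilder buf = new StringBuilder();
        char c = (char) in.read();
        buf.append(c);                       // first character may be a sign or '.'
        boolean dotNotRead = (c != '.');
        int next;
        while ((next = in.read()) != -1) {
            c = (char) next;
            boolean isFirstDot = dotNotRead && c == '.';
            if (!Character.isDigit(c) && !isFirstDot) {
                in.unread(next);             // leave the terminator for the next token
                break;
            }
            buf.append(c);
            if (isFirstDot) {
                dotNotRead = false;          // any further '.' now ends the token
            }
        }
        return buf.toString();
    }

    // Splits a string made up only of number characters into numeric tokens.
    static List<String> split(String s) throws IOException {
        PushbackReader in = new PushbackReader(new StringReader(s));
        List<String> tokens = new ArrayList<>();
        int next;
        while ((next = in.read()) != -1) {
            in.unread(next);
            tokens.add(readNumber(in));
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(split("97.-96")); // prints [97., -96]
    }
}
```

With this rule the problematic input '97.-96' is read as two tokens, '97.' and '-96', instead of one unparsable string.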



[jira] Resolved: (PDFBOX-418) PDFStreamParser reads incorrect number (patch provided)

Posted by "Brian Carrier (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PDFBOX-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Carrier resolved PDFBOX-418.
----------------------------------

    Resolution: Fixed

Checked into trunk.
