Posted to dev@avro.apache.org by "Michael Cooper (JIRA)" <ji...@apache.org> on 2011/09/21 06:37:08 UTC

[jira] [Created] (AVRO-892) Python snappy error: "integer out of range for 'I' format code"

Python snappy error: "integer out of range for 'I' format code"
---------------------------------------------------------------

                 Key: AVRO-892
                 URL: https://issues.apache.org/jira/browse/AVRO-892
             Project: Avro
          Issue Type: Bug
          Components: python
    Affects Versions: 1.5.4
         Environment: Linux michaelc 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu 11.04
Python 2.7.1+ (ubuntu stock version)

avro-1.5.4-py2.7.egg
snappy-1.0.4 (c library)
python-snappy-0.3.2
            Reporter: Michael Cooper


The Python library for avro fails to write some blocks when used with snappy compression.

The error is:

Traceback (most recent call last):
  File "tools/json_to_avro.py", line 74, in <module>
    writer.append(line)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 185, in append
    self._write_block()
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 169, in _write_block
    self.encoder.write_crc32(uncompressed_data)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/io.py", line 364, in write_crc32
    self.write(STRUCT_CRC32.pack(crc32(bytes)));
struct.error: integer out of range for 'I' format code


From my investigation, str(crc32(bytes)) is showing negative integers, so the issue seems to be fixed by masking the output.

This fix appears to work from my limited testing:

--- io.old.py	2011-09-21 14:32:38.992544680 +1000
+++ io.py	2011-09-21 14:33:11.492544686 +1000
@@ -360,7 +360,7 @@
     """
     A 4-byte, big-endian CRC32 checksum
     """
-    self.write(STRUCT_CRC32.pack(crc32(bytes)));
+    self.write(STRUCT_CRC32.pack(crc32(bytes) & 0xffffffff));
 
 #
 # DatumReader/Writer
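
For anyone wanting to reproduce the failure outside Avro, here is a minimal sketch of what seems to be going on. It assumes STRUCT_CRC32 is a '>I' (big-endian unsigned 32-bit) struct, which matches the docstring and the error message but is not copied from io.py. The helper as_signed32 is hypothetical: it only imitates the signed value that crc32 can return on Python 2, so the sketch also runs on newer Pythons where crc32 is always unsigned.

```python
import struct
import zlib

# Assumption: '>I' mirrors the struct implied by the docstring and the error;
# the name is reused here for illustration only.
STRUCT_CRC32 = struct.Struct('>I')


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as signed (hypothetical helper)."""
    return value - (1 << 32) if value & 0x80000000 else value


def pack_crc32(checksum):
    """Pack a checksum that may be negative by masking it to 32 bits first."""
    return STRUCT_CRC32.pack(checksum & 0xffffffff)


# 0xCBF43926 is the standard CRC-32 check value for b"123456789".
unsigned = zlib.crc32(b"123456789") & 0xffffffff
# Its top bit is set, so a signed 32-bit view of it is negative.
signed = as_signed32(unsigned)

# Packing the negative value without the mask is rejected by 'I'.
try:
    STRUCT_CRC32.pack(signed)
    unmasked_failed = False
except struct.error:
    unmasked_failed = True

# With the mask, the same four bytes come out regardless of sign.
packed = pack_crc32(signed)
```

Only checksums with the top bit set come back negative, which would explain why just some blocks fail to write.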


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (AVRO-892) Python snappy error: "integer out of range for 'I' format code"

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting resolved AVRO-892.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 1.6.0
         Assignee: Michael Cooper

It looks like the behavior differs in Python 2.7 and/or when running 64-bit.  In Python 2.6 on 32-bit Linux this just emits a warning, which I missed when committing AVRO-866.

I committed this fix.  Thanks, Michael!



[jira] [Commented] (AVRO-892) Python snappy error: "integer out of range for 'I' format code"

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112200#comment-13112200 ] 

Tom White commented on AVRO-892:
--------------------------------

This looks like the right fix to me. Thanks for reporting it! 

The mask was applied when comparing checksums, but was missed for the write path. See also the note at http://docs.python.org/library/binascii.html#binascii.crc32

It would be good to have an interoperability test for this that uses larger volumes of data than the testing I did in AVRO-866.
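
As a rough starting point for such a test, the sketch below round-trips checksums over many random blocks, with the mask applied on both the write and the compare side. It is not Avro's test code: the function names, the block counts, and the '>I' struct are assumptions for illustration, and a real interoperability test would go through the datafile writer and reader with the snappy codec.

```python
import os
import struct
import zlib

# Assumption: 4-byte big-endian unsigned checksum, as in the docstring above.
CRC32 = struct.Struct('>I')


def write_checksum(data):
    """Write path: mask before packing so a signed crc32 cannot overflow 'I'."""
    return CRC32.pack(zlib.crc32(data) & 0xffffffff)


def check_checksum(data, stored):
    """Read path: unpack the stored value and compare against a masked crc32."""
    (expected,) = CRC32.unpack(stored)
    return expected == (zlib.crc32(data) & 0xffffffff)


def roundtrip_ok(n_blocks=1000, block_size=4096):
    """Return True if every random block's checksum survives a round trip."""
    for _ in range(n_blocks):
        block = os.urandom(block_size)
        if not check_checksum(block, write_checksum(block)):
            return False
    return True
```

With enough random blocks, roughly half of the checksums have the top bit set, so a missing mask on the write path should show up quickly on an affected Python.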





[jira] [Updated] (AVRO-892) Python snappy error: "integer out of range for 'I' format code"

Posted by "Michael Cooper (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Cooper updated AVRO-892:
--------------------------------

    Description: 
The Python library for avro fails to write some blocks when used with snappy compression.

The error is:
{code}
Traceback (most recent call last):
  File "tools/json_to_avro.py", line 74, in <module>
    writer.append(line)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 185, in append
    self._write_block()
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 169, in _write_block
    self.encoder.write_crc32(uncompressed_data)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/io.py", line 364, in write_crc32
    self.write(STRUCT_CRC32.pack(crc32(bytes)));
struct.error: integer out of range for 'I' format code
{code}

From my investigation, str(crc32(bytes)) is showing negative integers, so the issue seems to be fixed by masking the output.

This fix appears to work from my limited testing:
{code}
--- io.old.py	2011-09-21 14:32:38.992544680 +1000
+++ io.py	2011-09-21 14:33:11.492544686 +1000
@@ -360,7 +360,7 @@
     """
     A 4-byte, big-endian CRC32 checksum
     """
-    self.write(STRUCT_CRC32.pack(crc32(bytes)));
+    self.write(STRUCT_CRC32.pack(crc32(bytes) & 0xffffffff));
 
 #
 # DatumReader/Writer
{code}

  was:
The Python library for avro fails to write some blocks when used with snappy compression.

The error is:

Traceback (most recent call last):
  File "tools/json_to_avro.py", line 74, in <module>
    writer.append(line)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 185, in append
    self._write_block()
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 169, in _write_block
    self.encoder.write_crc32(uncompressed_data)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/io.py", line 364, in write_crc32
    self.write(STRUCT_CRC32.pack(crc32(bytes)));
struct.error: integer out of range for 'I' format code


From my investigation, str(crc32(bytes)) is showing negative integers, so the issue seems to be fixed by masking the output.

This fix appears to work from my limited testing:

--- io.old.py	2011-09-21 14:32:38.992544680 +1000
+++ io.py	2011-09-21 14:33:11.492544686 +1000
@@ -360,7 +360,7 @@
     """
     A 4-byte, big-endian CRC32 checksum
     """
-    self.write(STRUCT_CRC32.pack(crc32(bytes)));
+    self.write(STRUCT_CRC32.pack(crc32(bytes) & 0xffffffff));
 
 #
 # DatumReader/Writer



