Posted to dev@avro.apache.org by "Michael Cooper (JIRA)" <ji...@apache.org> on 2011/09/21 06:37:08 UTC

[jira] [Updated] (AVRO-892) Python snappy error: "integer out of range for 'I' format code"

     [ https://issues.apache.org/jira/browse/AVRO-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Cooper updated AVRO-892:
--------------------------------

    Description: 
The Python library for Avro fails to write some blocks when used with snappy compression.

The error is:
{code}
Traceback (most recent call last):
  File "tools/json_to_avro.py", line 74, in <module>
    writer.append(line)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 185, in append
    self._write_block()
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 169, in _write_block
    self.encoder.write_crc32(uncompressed_data)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/io.py", line 364, in write_crc32
    self.write(STRUCT_CRC32.pack(crc32(bytes)));
struct.error: integer out of range for 'I' format code
{code}

From my investigation, crc32(bytes) returns negative integers for some inputs, which the unsigned 'I' format code cannot pack. Masking the result with 0xffffffff appears to fix the issue.

This fix appears to work from my limited testing:
{code}
--- io.old.py	2011-09-21 14:32:38.992544680 +1000
+++ io.py	2011-09-21 14:33:11.492544686 +1000
@@ -360,7 +360,7 @@
     """
     A 4-byte, big-endian CRC32 checksum
     """
-    self.write(STRUCT_CRC32.pack(crc32(bytes)));
+    self.write(STRUCT_CRC32.pack(crc32(bytes) & 0xffffffff));
 
 #
 # DatumReader/Writer
{code}
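
A minimal standalone sketch of why the mask helps, outside of Avro. It assumes STRUCT_CRC32 is a big-endian unsigned 32-bit struct (as the docstring in the patch suggests); pack_crc32 is a hypothetical helper name used only for this illustration, not part of the library.

```python
import struct
from binascii import crc32

# Assumption: same layout as described in the patch's docstring
# (4-byte, big-endian, unsigned).
STRUCT_CRC32 = struct.Struct('>I')


def pack_crc32(data):
    """Hypothetical helper: return the CRC32 of data as 4 big-endian bytes."""
    # On Python 2, crc32() can return a negative (signed 32-bit) value.
    # Masking maps it into the range 0..2**32-1; on Python 3 the mask
    # leaves the value unchanged.
    return STRUCT_CRC32.pack(crc32(data) & 0xffffffff)


# A negative checksum, as Python 2's crc32() can produce, cannot be
# packed with the unsigned 'I' format code.
signed_checksum = -1
try:
    STRUCT_CRC32.pack(signed_checksum)
    packed_without_mask = True
except struct.error:
    packed_without_mask = False

# After masking, the same value packs without error.
masked = STRUCT_CRC32.pack(signed_checksum & 0xffffffff)
```

The mask keeps the low 32 bits, so the bytes written are the same two's-complement bit pattern the signed value represents.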

  was:
The Python library for Avro fails to write some blocks when used with snappy compression.

The error is:

Traceback (most recent call last):
  File "tools/json_to_avro.py", line 74, in <module>
    writer.append(line)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 185, in append
    self._write_block()
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 169, in _write_block
    self.encoder.write_crc32(uncompressed_data)
  File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/io.py", line 364, in write_crc32
    self.write(STRUCT_CRC32.pack(crc32(bytes)));
struct.error: integer out of range for 'I' format code


From my investigation, crc32(bytes) returns negative integers for some inputs, which the unsigned 'I' format code cannot pack. Masking the result with 0xffffffff appears to fix the issue.

This fix appears to work from my limited testing:

--- io.old.py	2011-09-21 14:32:38.992544680 +1000
+++ io.py	2011-09-21 14:33:11.492544686 +1000
@@ -360,7 +360,7 @@
     """
     A 4-byte, big-endian CRC32 checksum
     """
-    self.write(STRUCT_CRC32.pack(crc32(bytes)));
+    self.write(STRUCT_CRC32.pack(crc32(bytes) & 0xffffffff));
 
 #
 # DatumReader/Writer



> Python snappy error: "integer out of range for 'I' format code"
> ---------------------------------------------------------------
>
>                 Key: AVRO-892
>                 URL: https://issues.apache.org/jira/browse/AVRO-892
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.5.4
>         Environment: Linux michaelc 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
> Ubuntu 11.04
> Python 2.7.1+ (ubuntu stock version)
> avro-1.5.4-py2.7.egg
> snappy-1.0.4 (c library)
> python-snappy-0.3.2
>            Reporter: Michael Cooper
>
> The Python library for Avro fails to write some blocks when used with snappy compression.
> The error is:
> {code}
> Traceback (most recent call last):
>   File "tools/json_to_avro.py", line 74, in <module>
>     writer.append(line)
>   File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 185, in append
>     self._write_block()
>   File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/datafile.py", line 169, in _write_block
>     self.encoder.write_crc32(uncompressed_data)
>   File "/home/michaelc/.python/2.7/avro-1.5.4-py2.7.egg/avro/io.py", line 364, in write_crc32
>     self.write(STRUCT_CRC32.pack(crc32(bytes)));
> struct.error: integer out of range for 'I' format code
> {code}
> From my investigation, crc32(bytes) returns negative integers for some inputs, which the unsigned 'I' format code cannot pack. Masking the result with 0xffffffff appears to fix the issue.
> This fix appears to work from my limited testing:
> {code}
> --- io.old.py	2011-09-21 14:32:38.992544680 +1000
> +++ io.py	2011-09-21 14:33:11.492544686 +1000
> @@ -360,7 +360,7 @@
>      """
>      A 4-byte, big-endian CRC32 checksum
>      """
> -    self.write(STRUCT_CRC32.pack(crc32(bytes)));
> +    self.write(STRUCT_CRC32.pack(crc32(bytes) & 0xffffffff));
>  
>  #
>  # DatumReader/Writer
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira