Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/09/27 01:57:00 UTC

[jira] [Resolved] (ARROW-1611) Crash in BitmapReader when length is zero

     [ https://issues.apache.org/jira/browse/ARROW-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved ARROW-1611.
---------------------------------
    Resolution: Fixed

Issue resolved by pull request 1137
[https://github.com/apache/arrow/pull/1137]

> Crash in BitmapReader when length is zero
> -----------------------------------------
>
>                 Key: ARROW-1611
>                 URL: https://issues.apache.org/jira/browse/ARROW-1611
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 0.7.1
>         Environment: Mac OS X 10.11.6
>            Reporter: Rene Sugar
>            Assignee: Rene Sugar
>              Labels: pull-request-available
>             Fix For: 0.7.1
>
>
> This was found when applying the fix for ARROW-1601 to parquet-cpp.
> BitmapReader can be called with a length of zero, which results in EXC_BAD_ACCESS when the constructor tries to read the first byte of the bitmap.
> The call stack says BitmapWriter because I added a BitmapWriter class to replace the same INIT_BITSET/READ_NEXT_BITSET pattern in the code that writes bitmaps in DefinitionLevelsToBitmap (parquet-cpp/src/parquet/column_reader.h). The two constructors are identical, so the compiler merged them.
> Old pull request (closed):
> https://github.com/apache/arrow/pull/1131
> New pull request with suggested changes:
> https://github.com/apache/arrow/pull/1133
> Process 17313 launched: './bin/FileConvert' (x86_64)
> Input files are: 
> ../../parquet-data/State_Drug_Utilization_Data_2016.csv
> Processing input file: ../../parquet-data/State_Drug_Utilization_Data_2016.csv
> Process 17313 stopped
> * thread #1: tid = 0x4be842, 0x0000000101840fe9 libparquet.1.dylib`arrow::internal::BitmapWriter::BitmapWriter(this=0x00007fff5fbf2908, bitmap="}", start_offset=1048576, length=0) + 89 at bit-util.h:99, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x106ba0000)
>     frame #0: 0x0000000101840fe9 libparquet.1.dylib`arrow::internal::BitmapWriter::BitmapWriter(this=0x00007fff5fbf2908, bitmap="}", start_offset=1048576, length=0) + 89 at bit-util.h:99
>    96  	  : bitmap_(bitmap), position_(0), length_(length) {
>    97  	    byte_offset_ = start_offset / 8;
>    98  	    bit_offset_ = start_offset % 8;
> -> 99  	    current_byte_ = bitmap[byte_offset_];
>    100 	  }
>    101 	
>    102 	  void Set() { current_byte_ |= (1 << bit_offset_); }
> (lldb) thread backtrace
> * thread #1: tid = 0x4be842, 0x0000000101840fe9 libparquet.1.dylib`arrow::internal::BitmapWriter::BitmapWriter(this=0x00007fff5fbf2908, bitmap="}", start_offset=1048576, length=0) + 89 at bit-util.h:99, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x106ba0000)
>   * frame #0: 0x0000000101840fe9 libparquet.1.dylib`arrow::internal::BitmapWriter::BitmapWriter(this=0x00007fff5fbf2908, bitmap="}", start_offset=1048576, length=0) + 89 at bit-util.h:99
>     frame #1: 0x0000000101840ded libparquet.1.dylib`arrow::internal::BitmapWriter::BitmapWriter(this=0x00007fff5fbf2908, bitmap="}", start_offset=1048576, length=0) + 45 at bit-util.h:96
>     frame #2: 0x0000000101964bf3 libparquet.1.dylib`parquet::Encoder<parquet::DataType<(parquet::Type::type)4> >::PutSpaced(this=0x0000000109b08bb0, src=0x000000012b86b000, num_values=0, valid_bits="}", valid_bits_offset=1048576) + 1747 at encoding.h:62
>     frame #3: 0x0000000101931913 libparquet.1.dylib`parquet::TypedColumnWriter<parquet::DataType<(parquet::Type::type)4> >::WriteValuesSpaced(this=0x0000000109b08cb8, num_values=0, valid_bits="}", valid_bits_offset=1048576, values=0x000000012b86b000) + 115 at column_writer.cc:612
> To reproduce this problem:
> 1) Download the CSV file.
> Source: https://catalog.data.gov/dataset?res_format=CSV
> State Drug Utilization Data 2016
> https://data.medicaid.gov/api/views/3v6v-qk5s/rows.csv?accessType=DOWNLOAD
> 2) Run FileConvert (see https://github.com/renesugar/FileConvert)
> ./bin/FileConvert -i ./State_Drug_Utilization_Data_2016.csv -o ./State_Drug_Utilization_Data_2016.parquet
> (FileConvert is built using the same process as MapD.)
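
The trace above stops at bit-util.h:99, where the constructor reads bitmap[byte_offset_] even though length is 0. The following is a minimal sketch of one way to guard against that read, assuming a simplified reader. GuardedBitmapReader and CountSetBits are hypothetical names used only for illustration, and this is not necessarily what pull request 1137 changed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration class; not the Arrow implementation.
class GuardedBitmapReader {
 public:
  GuardedBitmapReader(const uint8_t* bitmap, int64_t start_offset, int64_t length)
      : bitmap_(bitmap),
        position_(0),
        length_(length),
        current_byte_(0),
        byte_offset_(start_offset / 8),
        bit_offset_(start_offset % 8) {
    assert(start_offset >= 0 && length >= 0);
    // Only touch the buffer when there is at least one bit to read.
    // With length == 0 the offset may point one past the end of the buffer.
    if (length_ > 0) {
      current_byte_ = bitmap_[byte_offset_];
    }
  }

  // True if the bit at the current position is set.
  bool IsSet() const { return (current_byte_ & (1 << bit_offset_)) != 0; }

  // Advance by one bit, loading the next byte only if it is still in range.
  void Next() {
    ++bit_offset_;
    ++position_;
    if (bit_offset_ == 8) {
      bit_offset_ = 0;
      ++byte_offset_;
      if (position_ < length_) {
        current_byte_ = bitmap_[byte_offset_];
      }
    }
  }

 private:
  const uint8_t* bitmap_;
  int64_t position_;
  int64_t length_;
  uint8_t current_byte_;
  int64_t byte_offset_;
  int64_t bit_offset_;
};

// Counts set bits in [start_offset, start_offset + length).
int64_t CountSetBits(const uint8_t* bitmap, int64_t start_offset, int64_t length) {
  GuardedBitmapReader reader(bitmap, start_offset, length);
  int64_t count = 0;
  for (int64_t i = 0; i < length; ++i) {
    if (reader.IsSet()) ++count;
    reader.Next();
  }
  return count;
}
```

With this guard, a call such as CountSetBits(bitmap, 1048576, 0) never dereferences the bitmap pointer, which mirrors the start_offset=1048576, length=0 arguments seen in the backtrace. An alternative design is to have the caller return early when num_values is 0, but guarding in the constructor covers every call site.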



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)