Posted to issues@arrow.apache.org by "Hatem Helal (JIRA)" <ji...@apache.org> on 2019/06/14 12:51:00 UTC

[jira] [Comment Edited] (ARROW-5608) [C++][parquet] Invalid memory access when using parquet::arrow::ColumnReader

    [ https://issues.apache.org/jira/browse/ARROW-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16864026#comment-16864026 ] 

Hatem Helal edited comment on ARROW-5608 at 6/14/19 12:50 PM:
--------------------------------------------------------------

I'm completely baffled by how to copy and paste code into JIRA, so here is a [gist|https://gist.github.com/hatemhelal/892e76a48e5b372f0e28a34403893ddd#file-reader-writer-cc-L130]; see {{read_column_iteratively}}.
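
For readers who cannot open the gist, the access pattern under discussion (repeatedly asking a column reader for a fixed number of records until it runs dry) can be modelled roughly as below. This is only a self-contained sketch under stated assumptions: {{BatchReaderSketch}}, {{ReadColumnIteratively}} and the page size are hypothetical stand-ins invented for illustration, not the gist's code and not Arrow's {{ColumnReader}} or {{RecordReader}}. The {{Reset}} step mirrors the shape suggested by the stack trace (a copy of leftover data to the front of a buffer), with an explicit bounds check; it does not reproduce the crash or claim to describe its cause.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column reader. NOT parquet::arrow::ColumnReader.
class BatchReaderSketch {
 public:
  explicit BatchReaderSketch(std::vector<int32_t> source, std::size_t page_size = 4)
      : source_(std::move(source)), page_size_(page_size) {
    assert(page_size_ > 0);
  }

  // Writes up to batch_size records into *out and returns how many were read.
  // A return value of 0 means the column is exhausted (or batch_size <= 0).
  int64_t NextBatch(int64_t batch_size, std::vector<int32_t>* out) {
    out->clear();
    if (batch_size <= 0) return 0;
    const std::size_t wanted = static_cast<std::size_t>(batch_size);
    Fill(wanted);
    const std::size_t n = std::min(wanted, Unread());
    out->assign(buffer_.data() + position_, buffer_.data() + position_ + n);
    position_ += n;
    Reset();
    return static_cast<int64_t>(n);
  }

 private:
  // Number of buffered records that have not been handed out yet.
  std::size_t Unread() const { return buffer_.size() - position_; }

  // Loads whole pages until `wanted` records are buffered or the source ends.
  void Fill(std::size_t wanted) {
    while (Unread() < wanted && source_pos_ < source_.size()) {
      const std::size_t n = std::min(page_size_, source_.size() - source_pos_);
      buffer_.insert(buffer_.end(), source_.data() + source_pos_,
                     source_.data() + source_pos_ + n);
      source_pos_ += n;
    }
  }

  // Moves leftover records to the front of the buffer. The read range is
  // checked against the buffer size before anything is copied.
  void Reset() {
    assert(position_ <= buffer_.size());
    if (position_ == 0) return;  // nothing consumed, nothing to shift
    const std::size_t remaining = Unread();
    std::copy(buffer_.data() + position_, buffer_.data() + buffer_.size(),
              buffer_.data());
    buffer_.resize(remaining);
    position_ = 0;
  }

  std::vector<int32_t> source_;
  std::size_t page_size_;
  std::size_t source_pos_ = 0;
  std::vector<int32_t> buffer_;
  std::size_t position_ = 0;
};

// Reads the whole column in batches of batch_size; returns each batch's size.
std::vector<int64_t> ReadColumnIteratively(BatchReaderSketch* reader,
                                           int64_t batch_size) {
  std::vector<int64_t> sizes;
  std::vector<int32_t> batch;
  while (reader->NextBatch(batch_size, &batch) > 0) {
    sizes.push_back(static_cast<int64_t>(batch.size()));
  }
  return sizes;
}
```

With 10 source records and a batch size of 3, {{ReadColumnIteratively}} yields batches of 3, 3, 3 and 1 records. The batch size deliberately does not divide the page size, so every call leaves some records behind for {{Reset}} to shift.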



> [C++][parquet] Invalid memory access when using parquet::arrow::ColumnReader
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-5608
>                 URL: https://issues.apache.org/jira/browse/ARROW-5608
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Hatem Helal
>            Assignee: Hatem Helal
>            Priority: Major
>              Labels: Parquet
>
> I've observed occasional crashes when using the {{parquet::arrow::ColumnReader}} to iteratively read a fixed number of records. This has been quite tricky to isolate, but compiling the attached version of parquet-arrow-example with ASAN has pointed me to an out-of-bounds access at [cpp/src/parquet/arrow/record_reader.cc#L356|https://github.com/apache/arrow/blob/master/cpp/src/parquet/arrow/record_reader.cc#L356].
> ASAN stack trace:
> {code:java}
> ==18666==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00010c1b3038 at pc 0x000108330bdd bp 0x7ffee8d16450 sp 0x7ffee8d15c00
> READ of size 198 at 0x00010c1b3038 thread T0
> #0 0x108330bdc in __asan_memmove (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x54bdc)
> #1 0x107205e96 in parquet::internal::RecordReader::RecordReaderImpl::Reset() algorithm:1828
> #2 0x107205813 in parquet::internal::RecordReader::Reset() record_reader.cc:932
> #3 0x106faea47 in parquet::arrow::PrimitiveImpl::NextBatch(long long, std::__1::shared_ptr<arrow::ChunkedArray>*) reader.cc:1549
> #4 0x106f6e69b in parquet::arrow::ColumnReader::NextBatch(long long, std::__1::shared_ptr<arrow::ChunkedArray>*) reader.cc:1665
> #5 0x106f06afe in read_column_iterative() reader-writer.cc:162
> #6 0x106f09e9a in main reader-writer.cc:174
> #7 0x7fff79472ed8 in start (libdyld.dylib:x86_64+0x16ed8){code}