Posted to jira@arrow.apache.org by "Jay Baywatch (Jira)" <ji...@apache.org> on 2021/04/27 02:09:00 UTC

[jira] [Comment Edited] (ARROW-12547) Sigbus when using mmap in multiprocessing env over netapp

    [ https://issues.apache.org/jira/browse/ARROW-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332830#comment-17332830 ] 

Jay Baywatch edited comment on ARROW-12547 at 4/27/21, 2:08 AM:
----------------------------------------------------------------

I wrote a test and was unexpectedly able to replicate this using PyArrow 3.0 and pandas 0.25.2.



EDIT: On further review, it was a bus error, but not exactly the same one. It could be something specific to our environment, although it has now happened on 3 separate hosts.

I'll keep digging.


was (Author: baywatch):
I wrote a test and was unexpectedly able to replicate this using Pyarrow 3.0 and pandas 0.25.2.

> Sigbus when using mmap in multiprocessing env over netapp
> ---------------------------------------------------------
>
>                 Key: ARROW-12547
>                 URL: https://issues.apache.org/jira/browse/ARROW-12547
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 3.0.0
>            Reporter: Jay Baywatch
>            Priority: Minor
>
> We have noticed a condition where using Arrow to read Parquet files that reside on our NetApp from Slurm (via Python) occasionally raises signal 7 (SIGBUS).
> We haven’t tried disabling memory mapping yet, although we expect that turning memory mapping off in read_table will resolve the issue.
> This seems to occur when we read a file that has just been written, even though we write Parquet files to a transient location and then swap the file in using os.rename.
>  
> All that said, we were not sure whether this is a known issue or whether the PyArrow team is interested in the stack trace.
>  
>  
> Thread 1 (Thread 0x7fafa7dff700 (LWP 44408)):
> #0  __memcpy_avx_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:238
> #1  0x00007fafb9c40aba in snappy::RawUncompress(snappy::Source*, char*) () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libarrow.so.300
> #2  0x00007fafb9c41131 in snappy::RawUncompress(char const*, unsigned long, char*) () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libarrow.so.300
> #3  0x00007fafb942abbe in arrow::util::internal::(anonymous namespace)::SnappyCodec::Decompress(long, unsigned char const*, long, unsigned char*) () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libarrow.so.300
> #4  0x00007fafb4d0965e in parquet::(anonymous namespace)::SerializedPageReader::DecompressIfNeeded(std::shared_ptr<arrow::Buffer>, int, int, int) () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libparquet.so.300
> #5  0x00007fafb4d2bc2d in parquet::(anonymous namespace)::SerializedPageReader::NextPage() () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libparquet.so.300
> #6  0x00007fafb4d330c3 in parquet::(anonymous namespace)::ColumnReaderImplBase<parquet::PhysicalType<(parquet::Type::type)5> >::HasNextInternal() [clone .part.0] () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libparquet.so.300
> #7  0x00007fafb4d33eb8 in parquet::internal::(anonymous namespace)::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)5> >::ReadRecords(long) () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libparquet.so.300
> #8  0x00007fafb4d21bb8 in parquet::arrow::(anonymous namespace)::LeafReader::LoadBatch(long) () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libparquet.so.300
> #9  0x00007fafb4d489c8 in parquet::arrow::ColumnReaderImpl::NextBatch(long, std::shared_ptr<arrow::ChunkedArray>*) () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libparquet.so.300
> #10 0x00007fafb4d32db9 in arrow::internal::FnOnce<void ()>::FnImpl<std::_Bind<arrow::detail::ContinueFuture (arrow::Future<arrow::detail::Empty>, parquet::arrow::(anonymous namespace)::FileReaderImpl::GetRecordBatchReader(std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::unique_ptr<arrow::RecordBatchReader, std::default_delete<arrow::RecordBatchReader> >*)::{lambda()#1}::operator()()::{lambda(int)#1}, int)> >::invoke() () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libparquet.so.300
> #11 0x00007fafb9444ddd in std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::{lambda()#1}> > >::_M_run() () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libarrow.so.300
> #12 0x00007fafb9dd3580 in execute_native_thread_routine () from /home/svc_backtest/portfolio_analytics/prod/pyenv/lib/python3.7/site-packages/pyarrow/libarrow.so.300
> #13 0x00007fafefcdc6ba in start_thread (arg=0x7fafa7dff700) at pthread_create.c:333
> #14 0x00007fafefa1241d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
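
For reference, the two steps mentioned in the description (swapping a freshly written file into place, and reading with memory mapping turned off) could be sketched roughly as below. This is only an illustration under assumptions, not the reporter's actual code: the helper names publish_atomically and read_table_without_mmap are hypothetical, os.replace is used in place of the os.rename call from the description, and the reader argument exists only so the sketch can be exercised without pyarrow installed. Whether either step avoids the signal 7 reported here has not been tested.

```python
import os
import tempfile


def publish_atomically(data: bytes, final_path: str) -> None:
    """Write data to a temp file next to final_path, then swap it into place.

    Hypothetical helper: the temp file lives in the same directory as the
    target so the final rename stays within one filesystem.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to flush the bytes before the swap
        os.replace(tmp_path, final_path)  # rename over the target path
    except BaseException:
        # Clean up the temp file if anything failed before the swap completed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_table_without_mmap(path, reader=None):
    """Read a Parquet file with memory mapping explicitly disabled.

    Hypothetical helper: by default it calls pyarrow.parquet.read_table,
    passing memory_map=False; a different reader can be injected for testing.
    """
    if reader is None:
        import pyarrow.parquet as pq  # imported lazily; requires pyarrow

        reader = pq.read_table
    return reader(path, memory_map=False)
```

A caller would write the serialized Parquet bytes with publish_atomically(data, path) and later load them with read_table_without_mmap(path). Note that a rename only changes which file the path points to; a process that already has the old file open or mapped keeps referring to the old file.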



--
This message was sent by Atlassian Jira
(v8.3.4#803005)