Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/14 18:00:52 UTC

[GitHub] [arrow-datafusion] thomas-k-cameron opened a new issue, #2910: index out of range error from datafusion_row::write::write_field

thomas-k-cameron opened a new issue, #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910

   **Describe the bug**
   An index-out-of-range panic is raised from `datafusion_row::writer::write_field`.
   
   **To Reproduce**
   It happened when I used a proprietary data set that I bought from a vendor.
   I haven't been able to reproduce it without that data set.
   
   **Expected behavior**
   The query should complete without panicking.
   
   **Additional context**
   cargo 1.62.0 (a748cf5a3 2022-06-08)
   
   ```
   thread 'tokio-runtime-worker' panicked at 'range end index 153 out of range for slice of length 152', library/core/src/slice/index.rs:73:5
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:584:5
      1: core::panicking::panic_fmt
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panicking.rs:142:14
      2: core::slice::index::slice_end_index_len_fail_rt
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/slice/index.rs:73:5
      3: core::ops::function::FnOnce::call_once
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/ops/function.rs:248:5
      4: core::intrinsics::const_eval_select
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/intrinsics.rs:2372:5
      5: core::slice::index::slice_end_index_len_fail
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/slice/index.rs:67:9
      6: datafusion_row::writer::write_field
      7: datafusion_row::writer::write_row
      8: <datafusion::physical_plan::aggregates::row_hash::GroupedHashAggregateStreamV2 as futures_core::stream::Stream>::poll_next
      9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
     10: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
     11: tokio::runtime::task::harness::Harness<T,S>::poll
     12: std::thread::local::LocalKey<T>::with
     13: tokio::runtime::thread_pool::worker::Context::run_task
     14: tokio::runtime::thread_pool::worker::Context::run
     15: tokio::macros::scoped_tls::ScopedKey<T>::set
     16: tokio::runtime::thread_pool::worker::run
     17: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
     18: tokio::runtime::task::harness::Harness<T,S>::poll
     19: tokio::runtime::blocking::pool::Inner::run
   note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
   thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ArrowError(ExternalError(Execution("Join Error: task 15 panicked")))', src/main.rs:47:46
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:584:5
      1: core::panicking::panic_fmt
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panicking.rs:142:14
      2: core::result::unwrap_failed
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/result.rs:1785:5
      3: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
      4: std::thread::local::LocalKey<T>::with
      5: tokio::park::thread::CachedParkThread::block_on
      6: tokio::runtime::Runtime::block_on
      7: my_app::main
   note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
   thread 'tokio-runtime-worker' panicked at 'range end index 128 out of range for slice of length 120', library/core/src/slice/index.rs:73:5
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:584:5
      1: core::panicking::panic_fmt
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panicking.rs:142:14
      2: core::slice::index::slice_end_index_len_fail_rt
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/slice/index.rs:73:5
      3: core::ops::function::FnOnce::call_once
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/ops/function.rs:248:5
      4: core::intrinsics::const_eval_select
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/intrinsics.rs:2372:5
      5: core::slice::index::slice_end_index_len_fail
                at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/slice/index.rs:67:9
      6: datafusion_row::writer::write_field
      7: datafusion_row::writer::write_row
      8: <datafusion::physical_plan::aggregates::row_hash::GroupedHashAggregateStreamV2 as futures_core::stream::Stream>::poll_next
      9: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
     10: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
     11: tokio::runtime::task::harness::Harness<T,S>::poll
     12: std::thread::local::LocalKey<T>::with
     13: tokio::runtime::thread_pool::worker::Context::run_task
     14: tokio::runtime::thread_pool::worker::Context::run
     15: tokio::macros::scoped_tls::ScopedKey<T>::set
     16: tokio::runtime::thread_pool::worker::run
     17: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
     18: tokio::runtime::task::harness::Harness<T,S>::poll
     19: tokio::runtime::blocking::pool::Inner::run
   ```
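
For readers unfamiliar with this class of panic: the message `range end index 153 out of range for slice of length 152` is what Rust's standard library reports when a slice is indexed with a range whose end lies past the buffer. The sketch below is not DataFusion's row writer code, and it does not identify the cause of this issue. It only reproduces the same arithmetic on a plain byte buffer. The helper name `write_bytes_checked` and the offsets are hypothetical, chosen to match the numbers in the first panic message.

```rust
/// Hypothetical helper (not part of datafusion_row): copies `value` into
/// `buf` starting at `offset`. Returns an error instead of panicking when
/// the destination range does not fit inside the buffer.
fn write_bytes_checked(buf: &mut [u8], offset: usize, value: &[u8]) -> Result<(), String> {
    // Compute the exclusive end of the destination range without overflow.
    let end = offset
        .checked_add(value.len())
        .ok_or_else(|| "offset overflow".to_string())?;
    let buf_len = buf.len();
    // `get_mut` returns None for an out-of-bounds range, whereas
    // `&mut buf[offset..end]` would panic.
    match buf.get_mut(offset..end) {
        Some(dst) => {
            dst.copy_from_slice(value);
            Ok(())
        }
        None => Err(format!(
            "range end index {} out of range for slice of length {}",
            end, buf_len
        )),
    }
}

fn main() {
    // A 152-byte buffer, matching the slice length in the first panic.
    let mut row = vec![0u8; 152];

    // Writing 8 bytes at offset 144 fills bytes 144..152 and fits exactly.
    assert!(write_bytes_checked(&mut row, 144, &[1u8; 8]).is_ok());

    // Writing 8 bytes at offset 145 needs bytes 145..153, one past the end.
    let err = write_bytes_checked(&mut row, 145, &[1u8; 8]).unwrap_err();
    println!("{err}");
}
```

With unchecked indexing (`&mut row[145..153]`) the second write would panic rather than return an error, which is consistent with a buffer that was sized one byte too small for the fields written into it.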
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
comphead commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1187674159

   > Glad to hear that. Is there anything else that I can do?
   
   Nothing more is needed for now; I'll try to investigate why it's happening. Thanks for reporting such a weird thing.
   My guess is that the problem is somewhere in the hasher, which is used in both group by and hash joins.




[GitHub] [arrow-datafusion] comphead commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
comphead commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1186673960

   Thanks for providing the data. I was able to reproduce it locally. It looks related to #2877, but it occurs even without a join.




[GitHub] [arrow-datafusion] alamb closed issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
alamb closed issue #2910: index out of range error from datafusion_row::write::write_field 
URL: https://github.com/apache/arrow-datafusion/issues/2910




[GitHub] [arrow-datafusion] thomas-k-cameron commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
thomas-k-cameron commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1193182672

   Lovely. Thanks a lot!




[GitHub] [arrow-datafusion] thomas-k-cameron commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
thomas-k-cameron commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1186696077

   Glad to hear that.
   Is there anything else that I can do?




[GitHub] [arrow-datafusion] comphead commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
comphead commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1194875373

   Well, it turns out that the same data behaves differently depending on whether it is read from CSV or constructed manually: the manually constructed data works, while the CSV data fails.




[GitHub] [arrow-datafusion] comphead commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
comphead commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1193172370

   It looks like there are a couple of issues in the row writer. The interesting thing is that they occur only in very specific scenarios. Still investigating.




[GitHub] [arrow-datafusion] comphead commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
comphead commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1184845870

   Hi @thomas-k-cameron, what is the `tag` function?




[GitHub] [arrow-datafusion] thomas-k-cameron commented on issue #2910: index out of range error from datafusion_row::write::write_field

Posted by GitBox <gi...@apache.org>.
thomas-k-cameron commented on issue #2910:
URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1184947051

   @comphead 
   It's my own UDF. 
   It still generates the same error even without it.
   
   I was able to reproduce the error without the data set I was talking about.
   I uploaded the code here.
   
   https://github.com/thomas-k-cameron/df_rs_error_find
   

