Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/18 20:19:32 UTC
[GitHub] [arrow-datafusion] wangxiaoying opened a new issue, #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result
wangxiaoying opened a new issue, #2268:
URL: https://github.com/apache/arrow-datafusion/issues/2268
**Describe the bug**
In some cases, a LEFT JOIN panics with the following error:
```
thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
```
**To Reproduce**
```rust
use std::sync::Arc;

use datafusion::arrow::array::{BooleanArray, Int64Array, StringArray};
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::datasource::MemTable;
use datafusion::prelude::SessionContext;

let mut ctx = SessionContext::new();

// Left table `t1`: five rows.
let id64_array = Int64Array::from(vec![
    Some(5247),
    Some(3821),
    Some(6321),
    Some(8821),
    Some(7748),
]);
let str_array = StringArray::from(vec!["1", "2", "3", "4", "5"]);
let schema = Schema::new(vec![
    Field::new("id64", DataType::Int64, true),
    Field::new("STR", DataType::Utf8, false),
]);
let batch = RecordBatch::try_new(
    Arc::new(schema),
    vec![
        Arc::new(id64_array),
        Arc::new(str_array),
    ],
)
.unwrap();
let db1 = MemTable::try_new(batch.schema(), vec![vec![batch]]).unwrap();
ctx.register_table("t1", Arc::new(db1)).unwrap();

// Right table `t2`: four rows, with NULLs in the boolean column.
let id64_2array = Int64Array::from(vec![Some(358), Some(2820), Some(3804), Some(7748)]);
let bool_2array = BooleanArray::from(vec![Some(true), Some(false), None, None]);
let schema = Schema::new(vec![
    Field::new("id64", DataType::Int64, true),
    Field::new("col_b", DataType::Boolean, true),
]);
let batch2 = RecordBatch::try_new(
    Arc::new(schema),
    vec![
        Arc::new(id64_2array),
        Arc::new(bool_2array),
    ],
)
.unwrap();
let db2 = MemTable::try_new(batch2.schema(), vec![vec![batch2]]).unwrap();
ctx.register_table("t2", Arc::new(db2)).unwrap();

let sql = "select * from t1 left join t2 on t1.id64 = t2.id64";
let rt = Arc::new(tokio::runtime::Runtime::new().expect("Failed to create runtime"));
let df = rt.block_on(ctx.sql(sql)).unwrap();
rt.block_on(df.limit(5).unwrap().show()).unwrap();

// Count the rows actually returned by the join.
let num_rows = rt
    .block_on(df.collect())
    .unwrap()
    .into_iter()
    .map(|rb| rb.num_rows())
    .sum::<usize>();
println!("Final # rows: {}", num_rows);
```
The output is:
```
thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
+------+-----+------+-------+
| id64 | STR | id64 | col_b |
+------+-----+------+-------+
| 7748 | 5   | 7748 |       |
| 6321 | 3   |      |       |
+------+-----+------+-------+
thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
Final # rows: 2
```
**Expected behavior**
The query should not panic, and the result should contain all five rows of `t1`.
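For reference, the expected LEFT JOIN semantics on the reproducer's data can be sketched in plain Rust with only the standard library. This is an illustration of what the query should return, not DataFusion's join implementation; the `left_join` helper and its tuple layout are hypothetical names introduced for this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical helper: equi-join on the first tuple field, keeping every left row.
/// Left rows without a match carry `None` for the right side.
fn left_join<'a>(
    left: &[(Option<i64>, &'a str)],
    right: &[(Option<i64>, Option<bool>)],
) -> Vec<(Option<i64>, &'a str, Option<(i64, Option<bool>)>)> {
    // Index the right rows by key; NULL keys never match in an equi-join.
    let mut index: HashMap<i64, Vec<Option<bool>>> = HashMap::new();
    for (key, col_b) in right {
        if let Some(k) = key {
            index.entry(*k).or_default().push(*col_b);
        }
    }

    let mut out = Vec::new();
    for (key, s) in left {
        match key.and_then(|k| index.get(&k).map(|v| (k, v))) {
            // One output row per matching right row.
            Some((k, matches)) => {
                for col_b in matches {
                    out.push((*key, *s, Some((k, *col_b))));
                }
            }
            // No match: the left row is still emitted, with a NULL right side.
            None => out.push((*key, *s, None)),
        }
    }
    out
}

fn main() {
    // Same values as `t1` and `t2` in the reproducer.
    let t1: [(Option<i64>, &str); 5] = [
        (Some(5247), "1"),
        (Some(3821), "2"),
        (Some(6321), "3"),
        (Some(8821), "4"),
        (Some(7748), "5"),
    ];
    let t2: [(Option<i64>, Option<bool>); 4] = [
        (Some(358), Some(true)),
        (Some(2820), Some(false)),
        (Some(3804), None),
        (Some(7748), None),
    ];

    let rows = left_join(&t1, &t2);
    for row in &rows {
        println!("{:?}", row);
    }
    println!("Final # rows: {}", rows.len());
}
```

With this data only `7748` has a match in `t2`, so the sketch reports five rows: one matched row and four rows with an empty right side.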
**Additional context**
None
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] yjshen closed issue #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result
Posted by GitBox <gi...@apache.org>.
yjshen closed issue #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result
URL: https://github.com/apache/arrow-datafusion/issues/2268
[GitHub] [arrow-datafusion] alamb commented on issue #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2268:
URL: https://github.com/apache/arrow-datafusion/issues/2268#issuecomment-1102597127
Thank you for the report and reproducer @wangxiaoying ❤️