Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/18 20:19:32 UTC

[GitHub] [arrow-datafusion] wangxiaoying opened a new issue, #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result

wangxiaoying opened a new issue, #2268:
URL: https://github.com/apache/arrow-datafusion/issues/2268

   **Describe the bug**
   In some cases, a LEFT JOIN panics with the following error:
   ```
   thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
   ```
   
   **To Reproduce**
   ```rust
       // Imports assumed for this snippet (not part of the original report);
       // module paths may differ between DataFusion versions.
       use std::sync::Arc;
       use datafusion::arrow::array::{BooleanArray, Int64Array, StringArray};
       use datafusion::arrow::datatypes::{DataType, Field, Schema};
       use datafusion::arrow::record_batch::RecordBatch;
       use datafusion::datasource::MemTable;
       use datafusion::prelude::SessionContext;

       let mut ctx = SessionContext::new();
   
       let id64_array = Int64Array::from(vec![
           Some(5247),
           Some(3821),
           Some(6321),
           Some(8821),
           Some(7748),
       ]);
       let str_array = StringArray::from(vec!["1", "2", "3", "4", "5"]);
       let schema = Schema::new(vec![
           Field::new("id64", DataType::Int64, true),
           Field::new("STR", DataType::Utf8, false),
       ]);
   
       let batch = RecordBatch::try_new(
           Arc::new(schema),
           vec![
               Arc::new(id64_array),
               Arc::new(str_array),
           ],
       )
       .unwrap();
   
       let db1 = MemTable::try_new(batch.schema(), vec![vec![batch]]).unwrap();
       ctx.register_table("t1", Arc::new(db1)).unwrap();
   
       let id64_2array = Int64Array::from(vec![Some(358), Some(2820), Some(3804), Some(7748)]);
       let bool_2array = BooleanArray::from(vec![Some(true), Some(false), None, None]);
       let schema = Schema::new(vec![
           Field::new("id64", DataType::Int64, true),
           Field::new("col_b", DataType::Boolean, true),
       ]);
   
       let batch2 = RecordBatch::try_new(
           Arc::new(schema),
           vec![
               Arc::new(id64_2array),
               Arc::new(bool_2array),
           ],
       )
       .unwrap();
       let db2 = MemTable::try_new(batch2.schema(), vec![vec![batch2]]).unwrap();
       ctx.register_table("t2", Arc::new(db2)).unwrap();
   
       let sql = "select * from t1 left join t2 on t1.id64 = t2.id64";
   
       let rt = Arc::new(tokio::runtime::Runtime::new().expect("Failed to create runtime"));
       let df = rt.block_on(ctx.sql(sql)).unwrap();
       rt.block_on(df.limit(5).unwrap().show()).unwrap();
       let num_rows = rt
           .block_on(df.collect())
           .unwrap()
           .into_iter()
           .map(|rb| rb.num_rows())
           .sum::<usize>();
       println!("Final # rows: {}", num_rows);
   ```
   
   The output is:
   ```
   thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
   thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
   +------+-----+------+-------+
   | id64 | STR | id64 | col_b |
   +------+-----+------+-------+
   | 7748 | 5   | 7748 |       |
   | 6321 | 3   |      |       |
   +------+-----+------+-------+
   thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
   thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
   thread 'tokio-runtime-worker' panicked at 'assertion failed: i < self.len()', /Users/momo/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-11.1.0/src/array/array_boolean.rs:120:9
   Final # rows: 2
   ```
   
   **Expected behavior**
   The query should not panic, and the result should contain all five rows of `t1`. Rows without a match in `t2` should have NULL in the `t2` columns.
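
   For reference, the expected LEFT JOIN result can be sketched with a naive hash join in plain Rust. This is only an illustration of the SQL semantics, not DataFusion's implementation: `left_join`, `JoinedRow`, and `reproducer_tables` are hypothetical names, and the nullable `id64` key is simplified to a plain `i64` because every key in the reproducer is non-null.

   ```rust
   use std::collections::HashMap;

   /// One output row of `t1 LEFT JOIN t2 ON t1.id64 = t2.id64`:
   /// (t1.id64, t1.STR, t2.id64, t2.col_b). Right-side fields are `None` when unmatched.
   type JoinedRow = (i64, String, Option<i64>, Option<bool>);

   /// Naive reference LEFT JOIN on the id column (hypothetical helper, not DataFusion code).
   fn left_join(left: &[(i64, &str)], right: &[(i64, Option<bool>)]) -> Vec<JoinedRow> {
       // Index the right side by key; a key may map to several rows.
       let mut index: HashMap<i64, Vec<Option<bool>>> = HashMap::new();
       for &(id, col_b) in right {
           index.entry(id).or_default().push(col_b);
       }

       let mut out = Vec::new();
       for &(id, s) in left {
           match index.get(&id) {
               // Emit one output row per matching right row.
               Some(matches) => {
                   for &col_b in matches {
                       out.push((id, s.to_string(), Some(id), col_b));
                   }
               }
               // No match: keep the left row and pad the right side with NULLs.
               None => out.push((id, s.to_string(), None, None)),
           }
       }
       out
   }

   /// The two tables from the reproducer above, as plain tuples.
   fn reproducer_tables() -> (Vec<(i64, &'static str)>, Vec<(i64, Option<bool>)>) {
       let t1 = vec![(5247, "1"), (3821, "2"), (6321, "3"), (8821, "4"), (7748, "5")];
       let t2 = vec![(358, Some(true)), (2820, Some(false)), (3804, None), (7748, None)];
       (t1, t2)
   }
   ```

   Calling `left_join` on the output of `reproducer_tables()` yields five rows, one per row of `t1`. Only the row with `id64 = 7748` has a non-NULL `t2.id64`, and its `col_b` is NULL because that is the value stored in `t2`.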
   
   **Additional context**
   None
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] yjshen closed issue #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result

Posted by GitBox <gi...@apache.org>.
yjshen closed issue #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result
URL: https://github.com/apache/arrow-datafusion/issues/2268




[GitHub] [arrow-datafusion] alamb commented on issue #2268: LEFT JOIN panicked at 'assertion failed: i < self.len()' and derive incorrect result

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2268:
URL: https://github.com/apache/arrow-datafusion/issues/2268#issuecomment-1102597127

   Thank you for the report and reproducer @wangxiaoying ❤️ 

