Posted to issues@arrow.apache.org by "Jorge (Jira)" <ji...@apache.org> on 2020/08/20 19:47:00 UTC

[jira] [Created] (ARROW-9815) [Rust] [DataFusion] Deadlock in creation of physical plan with two udfs

Jorge created ARROW-9815:
----------------------------

             Summary: [Rust] [DataFusion] Deadlock in creation of physical plan with two udfs
                 Key: ARROW-9815
                 URL: https://issues.apache.org/jira/browse/ARROW-9815
             Project: Apache Arrow
          Issue Type: Bug
          Components: Rust, Rust - DataFusion
            Reporter: Jorge
            Assignee: Jorge


This one took me some time to understand, but I finally have a reproducible example: when two UDFs are nested, with one called on the result of the other, creating the physical plan deadlocks.

Example test:

{code}
#[test]
fn csv_query_sqrt_sqrt() -> Result<()> {
    let mut ctx = create_ctx()?;
    register_aggregate_csv(&mut ctx)?;
    let sql = "SELECT sqrt(sqrt(c12)) FROM aggregate_test_100 LIMIT 1";
    let actual = execute(&mut ctx, sql);
    // sqrt(sqrt(c12=0.9294097332465232)) = 0.9818650561397431
    let expected = "0.9818650561397431".to_string();
    assert_eq!(actual.join("\n"), expected);
    Ok(())
}
{code}

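I believe this is due to the recursive nature of the physical planner: it locks scalar_functions in the scrutinee of a match, so the lock guard is still alive when the planner recurses into the inner function call and tries to take the same lock again. That second lock attempt never succeeds, which blocks the whole thing.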
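
To illustrate the suspected mechanism, here is a minimal standalone sketch. It is not the DataFusion code: Planner, Expr, the "SqrtExec" string and both methods are hypothetical stand-ins, and only the name scalar_functions comes from the report above. The first method shows that a guard taken in a match scrutinee is still held inside the arm (probed with try_lock so the sketch itself cannot hang). The second shows one possible way around it, assuming the looked-up value can be cloned: copy it out in a let statement so the guard is dropped before recursing.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical, simplified logical expression.
enum Expr {
    Column(String),
    ScalarFunction { name: String, arg: Box<Expr> },
}

// Hypothetical planner holding a registry of UDFs behind a mutex.
struct Planner {
    scalar_functions: Mutex<HashMap<String, String>>,
}

impl Planner {
    fn new() -> Self {
        let mut registry = HashMap::new();
        registry.insert("sqrt".to_string(), "SqrtExec".to_string());
        Planner { scalar_functions: Mutex::new(registry) }
    }

    // Problem pattern: the guard created in the scrutinee is a temporary
    // that lives until the end of the whole match. try_lock is used as a
    // non-blocking probe; a plain lock() here is where a recursive planner
    // would block (or panic) on its own lock.
    fn guard_held_in_match_arm(&self, name: &str) -> bool {
        match self.scalar_functions.lock().unwrap().get(name) {
            Some(_) => self.scalar_functions.try_lock().is_err(),
            None => false,
        }
    }

    // One possible fix: clone the entry out in a let statement, so the
    // guard is dropped at the end of that statement, then recurse.
    fn plan(&self, expr: &Expr) -> Option<String> {
        match expr {
            Expr::Column(column) => Some(column.clone()),
            Expr::ScalarFunction { name, arg } => {
                let physical = self.scalar_functions.lock().unwrap().get(name).cloned()?;
                let inner = self.plan(arg)?; // lock is free again here
                Some(format!("{}({})", physical, inner))
            }
        }
    }
}
```

With this sketch, guard_held_in_match_arm("sqrt") reports that the lock is still held inside the arm, while plan on a nested sqrt(sqrt(c12)) expression completes because each level releases the lock before descending.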


