You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "wolffcm (via GitHub)" <gi...@apache.org> on 2023/04/13 17:50:51 UTC

[GitHub] [arrow-datafusion] wolffcm opened a new issue, #5996: SimplifyExpressions errors when simplifying `power` fn

wolffcm opened a new issue, #5996:
URL: https://github.com/apache/arrow-datafusion/issues/5996

   ### Describe the bug
   
   When calling the `power` function with a constant in the second argument, I get an error:
   ```
   Internal error: Optimizer rule 'simplify_expressions' failed due to unexpected error: Internal error: The expr has more than one schema, could not determine data type. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker.
   ```
   
   ### To Reproduce
   
   with `datafusion-cli`:
   ```sql
   select power(f, 100) from (values(10.0)) as t(f);
   ```
   I was also able to reproduce this with a unit test in `datafusion/optimizer/src/simplify_expressions/simlify_exprs.rs`:
   ```rust
       #[test]
       fn simplify_power() -> Result<()> {
           let schema = Schema::new(vec![
               Field::new("f", DataType::Float64, false),
           ]);
           let table_scan = table_scan(Some("test"), &schema, None)
               .expect("creating scan")
               .build()
               .expect("building plan");
   
           let plan = LogicalPlanBuilder::from(table_scan)
               .project(vec![call_fn("power", vec![col("f"), lit(1)])?])?
               .build()?;
           let rule = SimplifyExpressions::new();
           let _optimized_plan = rule
               .try_optimize(&plan, &OptimizerContext::new())
               .unwrap()
               .expect("failed to optimize plan");
           Ok(())
       }
   ```
   
   ### Expected behavior
   
    Calling `power` with a constant is a reasonable thing to want to do, so I would expect this to work.
   
   ### Additional context
   
   Unless the optimizer is configured with `skip_failed_rules` set to `false`, this could manifest in other, or the query might succeed just fine.
   
   Related issue: https://github.com/apache/arrow-datafusion/issues/4685
   
   I found this specifically when invoking `power` but this same issue probably occurs when simplifying other expressions.
   
   In IOx we want to run with `skip_failed_rules` set to `false` specifically so we can return reasonable error messages to the user: https://github.com/influxdata/influxdb_iox/issues/7330
   
   This test failure seems to happen in `simpl_pow` when trying to get the data type of the exponent argument:
   https://github.com/apache/arrow-datafusion/blob/fcd8b899e2a62f798413c536f47078289ece9d05/datafusion/optimizer/src/simplify_expressions/utils.rs#L403
   And then the `SimplifyContext` can't get the type because there is "more than one schema":
   https://github.com/apache/arrow-datafusion/blob/fcd8b899e2a62f798413c536f47078289ece9d05/datafusion/optimizer/src/simplify_expressions/context.rs#L135


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] wolffcm commented on issue #5996: SimplifyExpressions errors when simplifying `power` fn

Posted by "wolffcm (via GitHub)" <gi...@apache.org>.
wolffcm commented on issue #5996:
URL: https://github.com/apache/arrow-datafusion/issues/5996#issuecomment-1515581349

   For the simple example in the unit test, the plan is:
   ```
   Projection: power(test.f, Int32(1))
     TableScan: test
   ```
   
   And the two schemas in the error message are:
   ```
   [
       DFSchema { fields: [
           DFField { qualifier: None, field: Field { name: "power(test.f,Int32(1))", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }
       ], metadata: {} }, 
       DFSchema { fields: [
           DFField { qualifier: Some(Bare { table: "test" }), field: Field { name: "f", data_type: Float64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} } }
       ], metadata: {} }
   ]
   ```
   The first one is from the Project and the other is from the Scan.
   
   After spending some time looking at this today I think I might have a potential fix, will draft a PR tomorrow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #5996: SimplifyExpressions errors when simplifying `power` fn

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #5996:
URL: https://github.com/apache/arrow-datafusion/issues/5996#issuecomment-1509742988

   I am not sure why this is happening -- I think some debugging is in order. 
   
   Maybe one way to start is to figure out what the two schema are:
   
   https://github.com/apache/arrow-datafusion/blob/03851ce6892c4a682058bbc8689dd7d6d7990ad0/datafusion/optimizer/src/simplify_expressions/context.rs#L130-L138


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #5996: SimplifyExpressions errors when simplifying `power` fn

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #5996: SimplifyExpressions errors when simplifying `power` fn
URL: https://github.com/apache/arrow-datafusion/issues/5996


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #5996: SimplifyExpressions errors when simplifying `power` fn

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #5996:
URL: https://github.com/apache/arrow-datafusion/issues/5996#issuecomment-1519898793

   I believe the upstream fix is now available in IOx after https://github.com/influxdata/influxdb_iox/pull/7630


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] wolffcm commented on issue #5996: SimplifyExpressions errors when simplifying `power` fn

Posted by "wolffcm (via GitHub)" <gi...@apache.org>.
wolffcm commented on issue #5996:
URL: https://github.com/apache/arrow-datafusion/issues/5996#issuecomment-1515140356

   I am looking into this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org