You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/10 07:37:10 UTC

[GitHub] [arrow-datafusion] Jimexist opened a new pull request #303: add random SQL function

Jimexist opened a new pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303


   # Which issue does this PR close?
   
   add random SQL function
   
   Closes #.
   
    # Rationale for this change
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   
   # What changes are included in this PR?
   
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   
   # Are there any user-facing changes?
   
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   
   If there are any breaking changes to public APIs, please add the `breaking change` label.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630623061



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       @alamb @Dandandan and @jorgecarleitao thanks for the discussion.
   
   I agree that both ways are similar and equally “hacky” but for lack of a better solution they are okay. I'd slightly prefer null array because there's no ScalarValue::USize and having to convert from/to UInt32 is a bit cumbersome.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630080278



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       See also https://github.com/apache/arrow-datafusion/pull/307




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] nevi-me commented on a change in pull request #303: WIP add random SQL function

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629129273



##########
File path: datafusion/src/physical_plan/math_expressions.rs
##########
@@ -116,3 +116,12 @@ math_unary_function!("exp", exp);
 math_unary_function!("ln", ln);
 math_unary_function!("log2", log2);
 math_unary_function!("log10", log10);
+
+/// random SQL function
+pub fn random(_: &[ColumnarValue]) -> Result<ColumnarValue> {

Review comment:
       Would it be more efficient if this is implemented as an `arrow::compute` kernel, so that instead of creating individual scalar values, you could fill a buffer of `n` elements with random data at once?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629927595



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       the alternative is to drastically change the signature of scalar functions. it confess compared with that this is cleaner (but a bit hacky).
   
   will change to use a scalar.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629348533



##########
File path: datafusion/src/physical_plan/math_expressions.rs
##########
@@ -116,3 +116,21 @@ math_unary_function!("exp", exp);
 math_unary_function!("ln", ln);
 math_unary_function!("log2", log2);
 math_unary_function!("log10", log10);
+
+/// random SQL function
+pub fn random(args: &[ColumnarValue]) -> Result<ColumnarValue> {
+    let len = match &args[0] {
+        ColumnarValue::Array(array) => array.len(),
+        _ => {
+            return Err(DataFusionError::Internal(
+                "Expect random function to take no param".to_string(),
+            ))
+        }
+    };
+    let mut rng = thread_rng();
+    let mut array = Vec::with_capacity(len);
+    for _ in 0..len {
+        array.push(Some(rng.gen_range(0.0..1.0)))

Review comment:
       Yes, makes sense, thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632737073



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       So I think the plan should be merge  https://github.com/apache/arrow-datafusion/pull/328 and then rebase this PR to pick up that change




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632728287



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       Yeah, let's do that for now




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist commented on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist commented on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-836311947


   ```
   > select random() r, random() r2 union all select random() r, random() r2;
   +--------------------+--------------------+
   | r                  | r2                 |
   +--------------------+--------------------+
   | 0.9560537521872872 | 0.6483468545317812 |
   | 0.9101355192715241 | 0.6704899577404533 |
   +--------------------+--------------------+
   2 row in set. Query took 0 seconds.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] jorgecarleitao commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630285020



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       Note that `NullArray` is composed by zero buffers, zero childs, no validity and one datatype, so the cost to instantiate it is really small. The advantage over a `ScalarValue` is that the semantics of getting a length are preserved: use `array.len()` as any other array.
   
   I am not married with any; was just trying to think about this from a documentations' perspective:
   
   > We support zero-argument UDFs. They MUST be declared as accepting zero arguments and the function signature MUST be a single argument. DataFusion will pass an `Array` to it, from which you can retrieve its length via `Array::len()`. The function MUST return an array whose number of rows equals the length of the array.
   
   If we pass a scalar of any type, if the evaluation is distributed, I believe that we will have to serialize `Scalar -> Array` in Ballista.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-841758453


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#303](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (77ab4ba) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/1702d6c85ebfdbc968b1dc427a9799e74b64ff96?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1702d6c) will **increase** coverage by `0.02%`.
   > The diff coverage is `90.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/303/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #303      +/-   ##
   ==========================================
   + Coverage   75.72%   75.74%   +0.02%     
   ==========================================
     Files         143      143              
     Lines       23832    23869      +37     
   ==========================================
   + Hits        18046    18080      +34     
   - Misses       5786     5789       +3     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [datafusion/src/physical\_plan/type\_coercion.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi90eXBlX2NvZXJjaW9uLnJz) | `99.32% <ø> (ø)` | |
   | [datafusion/src/physical\_plan/udf.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi91ZGYucnM=) | `78.26% <ø> (ø)` | |
   | [datafusion/src/physical\_plan/math\_expressions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9tYXRoX2V4cHJlc3Npb25zLnJz) | `90.00% <86.66%> (-10.00%)` | :arrow_down: |
   | [datafusion/src/physical\_plan/functions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9mdW5jdGlvbnMucnM=) | `89.58% <88.23%> (+0.03%)` | :arrow_up: |
   | [datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi90ZXN0cy9zcWwucnM=) | `99.88% <100.00%> (+<0.01%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [1702d6c...77ab4ba](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629941928



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       I guess then another option is to add a new type of node which models no arg functions




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629298188



##########
File path: datafusion/src/physical_plan/math_expressions.rs
##########
@@ -116,3 +116,21 @@ math_unary_function!("exp", exp);
 math_unary_function!("ln", ln);
 math_unary_function!("log2", log2);
 math_unary_function!("log10", log10);
+
+/// random SQL function
+pub fn random(args: &[ColumnarValue]) -> Result<ColumnarValue> {
+    let len = match &args[0] {
+        ColumnarValue::Array(array) => array.len(),
+        _ => {
+            return Err(DataFusionError::Internal(
+                "Expect random function to take no param".to_string(),
+            ))
+        }
+    };
+    let mut rng = thread_rng();
+    let mut array = Vec::with_capacity(len);
+    for _ in 0..len {
+        array.push(Some(rng.gen_range(0.0..1.0)))

Review comment:
       `arrow::compute::kernels::arity::unary` could be probably used here.
   
   see also:
   https://github.com/apache/arrow-datafusion/pull/309




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629859725



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       What about passing the length as a `ColumnarValue::Scalar` value for now?
   I am not sure if I'm totally happy either, but that wil avoid generating a temporary array only for accessing the length.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629859725



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       What about passing the length as a `ColumnarValue::Scalar` value for now?
   I am not sure if I'm totally happy with that either, but that wil avoid generating a temporary array only for accessing the length.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] jorgecarleitao commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629356732



##########
File path: datafusion/src/physical_plan/math_expressions.rs
##########
@@ -116,3 +116,21 @@ math_unary_function!("exp", exp);
 math_unary_function!("ln", ln);
 math_unary_function!("log2", log2);
 math_unary_function!("log10", log10);
+
+/// random SQL function
+pub fn random(args: &[ColumnarValue]) -> Result<ColumnarValue> {
+    let len = match &args[0] {
+        ColumnarValue::Array(array) => array.len(),
+        _ => {
+            return Err(DataFusionError::Internal(
+                "Expect random function to take no param".to_string(),
+            ))
+        }
+    };
+    let mut rng = thread_rng();
+    let mut array = Vec::with_capacity(len);
+    for _ in 0..len {
+        array.push(Some(rng.gen_range(0.0..1.0)))

Review comment:
       `PrimitiveArray::<f64>::from_iter_values` would work here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] jorgecarleitao commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632735574



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       👍 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-841758453


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#303](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (77ab4ba) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/1702d6c85ebfdbc968b1dc427a9799e74b64ff96?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1702d6c) will **increase** coverage by `0.02%`.
   > The diff coverage is `90.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/303/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #303      +/-   ##
   ==========================================
   + Coverage   75.72%   75.74%   +0.02%     
   ==========================================
     Files         143      143              
     Lines       23832    23869      +37     
   ==========================================
   + Hits        18046    18080      +34     
   - Misses       5786     5789       +3     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [datafusion/src/physical\_plan/type\_coercion.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi90eXBlX2NvZXJjaW9uLnJz) | `99.32% <ø> (ø)` | |
   | [datafusion/src/physical\_plan/udf.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi91ZGYucnM=) | `78.26% <ø> (ø)` | |
   | [datafusion/src/physical\_plan/math\_expressions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9tYXRoX2V4cHJlc3Npb25zLnJz) | `90.00% <86.66%> (-10.00%)` | :arrow_down: |
   | [datafusion/src/physical\_plan/functions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9mdW5jdGlvbnMucnM=) | `89.58% <88.23%> (+0.03%)` | :arrow_up: |
   | [datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi90ZXN0cy9zcWwucnM=) | `99.88% <100.00%> (+<0.01%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [1702d6c...77ab4ba](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist commented on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist commented on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-836511210


   ```
   > select c1, random() r from test;
   +----+----------------------+
   | c1 | r                    |
   +----+----------------------+
   | c  | 0.5655965570298749   |
   | d  | 0.5421519281027325   |
   | b  | 0.13654323005560398  |
   | a  | 0.8308433815463256   |
   | b  | 0.9186802759034394   |
   | b  | 0.7445636998695224   |
   | e  | 0.07519801371257007  |
   | a  | 0.25136111798546534  |
   | d  | 0.5295747101676771   |
   | a  | 0.40990916212038053  |
   | d  | 0.6953850383573246   |
   | a  | 0.7315940259479854   |
   | e  | 0.8114995840750128   |
   | d  | 0.5784867126269881   |
   | b  | 0.5043455209680903   |
   | c  | 0.1485474143927108   |
   | e  | 0.03117840454799814  |
   | d  | 0.6236479179320964   |
   | d  | 0.7008741488202888   |
   | e  | 0.18371494725350446  |
   | e  | 0.5753780913309461   |
   | d  | 0.8313881968944341   |
   | a  | 0.9891383411533885   |
   | e  | 0.39443274475046164  |
   | c  | 0.7883888787267002   |
   | a  | 0.11551212043748382  |
   | c  | 0.2280335008009724   |
   | a  | 0.05986784536832124  |
   | a  | 0.17788633521196284  |
   | b  | 0.9116042616992985   |
   | e  | 0.7562969962961654   |
   | c  | 0.7036201290901811   |
   | e  | 0.4835128269497777   |
   | b  | 0.24683894699947628  |
   | a  | 0.4500284040416962   |
   | c  | 0.7808859382454547   |
   | d  | 0.4153951159657392   |
   | c  | 0.9322736705730224   |
   | c  | 0.18616032322350895  |
   | c  | 0.013816038991163904 |
   | b  | 0.24775819719833092  |
   | d  | 0.27429688634180494  |
   | d  | 0.7935576595699978   |
   | a  | 0.23005272798412935  |
   | e  | 0.6416103347500266   |
   | b  | 0.7578072279050923   |
   | b  | 0.8045671610701597   |
   | c  | 0.22788070846321595  |
   | a  | 0.7923072791591466   |
   | d  | 0.5123487785123308   |
   | b  | 0.49300577431592507  |
   | c  | 0.3327370429940055   |
   | d  | 0.6985993575806515   |
   | d  | 0.7679736992227877   |
   | b  | 0.3933313593453034   |
   | d  | 0.893368503058128    |
   | e  | 0.8216179334475555   |
   | b  | 0.41132389621208576  |
   | a  | 0.2190799810695121   |
     1 better err msg
   | b  | 0.17446572139705596  |
   | c  | 0.42130093097362353  |
   | b  | 0.6844927700858034   |
   | c  | 0.43682537610987593  |
   | e  | 0.07365743593350871  |
     1 pick b9ea16e47 fix 305
   | e  | 0.4784257958869569   |
   | d  | 0.16726586201233884  |
   | e  | 0.909180084107635    |
   | c  | 0.6397341730827932   |
   | d  | 0.06891523396188948  |
   | e  | 0.9332425864685405   |
   | e  | 0.5388023964378135   |
   | a  | 0.525252618536995    |
   | a  | 0.7750163345281538   |
   | e  | 0.5870405166590817   |
   | a  | 0.7229541396586825   |
   | b  | 0.19106271603993163  |
   | e  | 0.08862026816319024  |
   | c  | 0.8906844121568276   |
   | e  | 0.7893706764764692   |
   | c  | 0.293145187453989    |
   | a  | 0.22659651366231603  |
   | c  | 0.27982658851095454  |
   | b  | 0.05309405992658389  |
   | a  | 0.9985010613648426   |
   | a  | 0.8408072146747827   |
   | c  | 0.9361082666514462   |
   | a  | 0.7701435871742566   |
   | c  | 0.9257993780317568   |
   | c  | 0.6235457571369003   |
   | c  | 0.9273997152575428   |
   | b  | 0.6094183904162762   |
   | a  | 0.6744046616867845   |
   | a  | 0.750243015599535    |
   | b  | 0.6774154871357085   |
   | d  | 0.950910969179537    |
   | e  | 0.3040496852019885   |
   | e  | 0.7172651559939669   |
   | d  | 0.9587276644264635   |
   | b  | 0.17079263745303108  |
   | e  | 0.2878489806731648   |
   +----+----------------------+
   100 row in set. Query took 0 seconds.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629343884



##########
File path: datafusion/src/physical_plan/math_expressions.rs
##########
@@ -116,3 +116,21 @@ math_unary_function!("exp", exp);
 math_unary_function!("ln", ln);
 math_unary_function!("log2", log2);
 math_unary_function!("log10", log10);
+
+/// random SQL function
+pub fn random(args: &[ColumnarValue]) -> Result<ColumnarValue> {
+    let len = match &args[0] {
+        ColumnarValue::Array(array) => array.len(),
+        _ => {
+            return Err(DataFusionError::Internal(
+                "Expect random function to take no param".to_string(),
+            ))
+        }
+    };
+    let mut rng = thread_rng();
+    let mut array = Vec::with_capacity(len);
+    for _ in 0..len {
+        array.push(Some(rng.gen_range(0.0..1.0)))

Review comment:
       not yet, i guess https://github.com/apache/arrow-rs/pull/275 is needed first?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-841758453


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#303](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fb74a59) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/ed92673e19f1b20e0ea35397f73da49bdb304be4?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ed92673) will **increase** coverage by `0.01%`.
   > The diff coverage is `90.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/303/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #303      +/-   ##
   ==========================================
   + Coverage   75.71%   75.72%   +0.01%     
   ==========================================
     Files         143      143              
     Lines       23881    23910      +29     
   ==========================================
   + Hits        18081    18107      +26     
   - Misses       5800     5803       +3     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [datafusion/src/physical\_plan/type\_coercion.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi90eXBlX2NvZXJjaW9uLnJz) | `99.32% <ø> (ø)` | |
   | [datafusion/src/physical\_plan/functions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9mdW5jdGlvbnMucnM=) | `89.58% <85.71%> (-0.03%)` | :arrow_down: |
   | [datafusion/src/physical\_plan/math\_expressions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9tYXRoX2V4cHJlc3Npb25zLnJz) | `90.00% <86.66%> (-10.00%)` | :arrow_down: |
   | [datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi90ZXN0cy9zcWwucnM=) | `99.88% <100.00%> (+<0.01%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ed92673...fb74a59](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist edited a comment on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist edited a comment on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-836311947


   ## singular case
   
   ```
   > select random() r, random() r2 union all select random() r, random() r2;
   +--------------------+--------------------+
   | r                  | r2                 |
   +--------------------+--------------------+
   | 0.570782163304485  | 0.7122336302240444 |
   | 0.4344519217750207 | 0.256188733362102  |
   +--------------------+--------------------+
   2 row in set. Query took 0 seconds.
   ```
   
   ## tabular align case
   
   ```
   > select c1, random() r1, random() r2 from test;
   +----+-----------------------+-----------------------+
   | c1 | r1                    | r2                    |
   +----+-----------------------+-----------------------+
   | c  | 0.8858248695161692    | 0.6143187156567651    |
   | d  | 0.6852840904479431    | 0.2709251169769842    |
   | b  | 0.630098981690546     | 0.6077598267065687    |
   | a  | 0.16307864728987242   | 0.8227547597140914    |
   | b  | 0.008702150304241929  | 0.38766493311897166   |
   | b  | 0.9450226561353341    | 0.16156125141103117   |
   | e  | 0.6926494011573385    | 0.653725968146263     |
   | a  | 0.4099823846109887    | 0.07170807057149098   |
   | d  | 0.15579483001397176   | 0.3098832344415323    |
   | a  | 0.8059293634597469    | 0.7994994516372975    |
   | d  | 0.6593555317339335    | 0.19312486980857102   |
   | a  | 0.9086155399677391    | 0.7665270701938516    |
   | e  | 0.4044800758190923    | 0.05197052666680535   |
   | d  | 0.0025795018740371045 | 0.46326292155346094   |
   | b  | 0.2587391359273177    | 0.9162104619942888    |
   | c  | 0.13645516508526523   | 0.15752850692883258   |
   | e  | 0.3877465007257497    | 0.5699901097930642    |
   | d  | 0.47780316828309766   | 0.7885763689480612    |
   | d  | 0.22794474273876175   | 0.6095793437463859    |
   | e  | 0.5455296223081574    | 0.4289355617327004    |
   | e  | 0.2133157137081647    | 0.4184730286437153    |
   | d  | 0.8158975348529787    | 0.394255938540677     |
   | a  | 0.656112907355159     | 0.4385685299440685    |
   | e  | 0.7187571688285945    | 0.4932180835725457    |
   | c  | 0.4460089214506695    | 0.3288857238339975    |
   | a  | 0.7659867073691242    | 0.7338590718453064    |
   | c  | 0.8312317998888263    | 0.713249790823179     |
   | a  | 0.5034689184624486    | 0.3564531411596683    |
   | a  | 0.49555953754158066   | 0.5288964005078611    |
   | b  | 0.488574157540709     | 0.34840668143906095   |
   | e  | 0.962322293398729     | 0.0017991571576252419 |
   | c  | 0.405323557632822     | 0.13801508069882895   |
   | e  | 0.44810564754006976   | 0.32609793459010716   |
   | b  | 0.0481996264470097    | 0.5624886587309861    |
   | a  | 0.6856284932837569    | 0.30189559597954085   |
   | c  | 0.4392762780370185    | 0.22778780763979856   |
   | d  | 0.01862005761513874   | 0.9484007330304798    |
   | c  | 0.5493861573252898    | 0.6787226076159059    |
   | c  | 0.6804506222983264    | 0.40012929715726475   |
   | c  | 0.17031205277616834   | 0.28969160031465124   |
   | b  | 0.5516607121077097    | 0.32765665335953154   |
   | d  | 0.7902397504137304    | 0.8047820664058987    |
   | d  | 0.6235135313619369    | 0.8611740902370557    |
   | a  | 0.4089158775121935    | 0.7632865452823412    |
   | e  | 0.6479818623367095    | 0.3303923993061251    |
   | b  | 0.8405383629477621    | 0.5761157012684217    |
   | b  | 0.16013306898301072   | 0.18377799688319274   |
   | c  | 0.5237107246024528    | 0.18702870828721596   |
   | a  | 0.4267698184654345    | 0.6080320114682305    |
   | d  | 0.6752001786973243    | 0.18744579948119555   |
   | b  | 0.06394198453121214   | 0.8697468928632974    |
   | c  | 0.5533880608032804    | 0.410087636982861     |
   | d  | 0.17195857936051007   | 0.9642347754732317    |
   | d  | 0.24714036094951686   | 0.2087533372695889    |
   | b  | 0.38223418402701226   | 0.44797491855182825   |
   | d  | 0.354147713947109     | 0.1583774861902576    |
   | e  | 0.8978738349376183    | 0.6870679270888751    |
   | b  | 0.0962990269141899    | 0.9251103720761726    |
   | a  | 0.08754479049780262   | 0.3061691178397379    |
   | b  | 0.8347877947374489    | 0.10492831402932445   |
   | c  | 0.6772625649184507    | 0.5267610906406157    |
   | b  | 0.6956531376927251    | 0.9243742506850876    |
   | c  | 0.6096066968750522    | 0.15500300961880753   |
   | e  | 0.8991012695527614    | 0.014652679069998786  |
   | e  | 0.4048500168573612    | 0.7288386405759564    |
   | d  | 0.8738661862341139    | 0.5736561149057426    |
   | e  | 0.79628104001088      | 0.10359057613551692   |
   | c  | 0.6015511737143195    | 0.4246275983489023    |
   | d  | 0.5976422825371586    | 0.7110161517521416    |
   | e  | 0.13607511799429672   | 0.5692416938012763    |
   | e  | 0.7050104248345099    | 0.48394357812924893   |
   | a  | 0.3794121594380737    | 0.2762570292624906    |
   | a  | 0.3756937524394046    | 0.8349592211879893    |
   | e  | 0.6780514311786121    | 0.06202299343125328   |
   | a  | 0.9916898758023276    | 0.25480434620940917   |
   | b  | 0.11861101237985605   | 0.36793787570512615   |
   | e  | 0.9779069828477198    | 0.706631326575605     |
   | c  | 0.05169106965696968   | 0.757319676986232     |
   | e  | 0.5008578861141886    | 0.5542545101873164    |
   | c  | 0.3080413824916599    | 0.6461181444579944    |
   | a  | 0.3499918991104509    | 0.3738674842979186    |
   | c  | 0.49266139042465706   | 0.44786989508250286   |
   | b  | 0.8294106157763355    | 0.8250357289976049    |
   | a  | 0.8145941535102705    | 0.010227803378715983  |
   | a  | 0.19432122069500224   | 0.500087727457039     |
   | c  | 0.5228722683904334    | 0.19655375516418694   |
   | a  | 0.7264844525200564    | 0.7118351314074298    |
   | c  | 0.1526188214350377    | 0.2543946368251362    |
   | c  | 0.33950854157702826   | 0.960977006313132     |
   | c  | 0.6350990913317207    | 0.7731276647898677    |
   | b  | 0.24010294541852772   | 0.8652139521697786    |
   | a  | 0.0883054439477311    | 0.4627145656673548    |
   | a  | 0.540637100589954     | 0.8545095562126641    |
   | b  | 0.8328479394485582    | 0.5200373050323923    |
   | d  | 0.527288611622466     | 0.8929364305158876    |
   | e  | 0.20030569328547343   | 0.5677935767953404    |
   | e  | 0.7779991255911018    | 0.9255994347346632    |
   | d  | 0.8406514575091932    | 0.4562466008463426    |
   | b  | 0.7545561897792099    | 0.47341049730312923   |
   | e  | 0.2561046134849507    | 0.28863618306585237   |
   +----+-----------------------+-----------------------+
   100 row in set. Query took 0 seconds.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-841758453


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#303](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fb74a59) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/ed92673e19f1b20e0ea35397f73da49bdb304be4?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ed92673) will **increase** coverage by `0.01%`.
   > The diff coverage is `90.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/303/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #303      +/-   ##
   ==========================================
   + Coverage   75.71%   75.72%   +0.01%     
   ==========================================
     Files         143      143              
     Lines       23881    23910      +29     
   ==========================================
   + Hits        18081    18107      +26     
   - Misses       5800     5803       +3     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [datafusion/src/physical\_plan/type\_coercion.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi90eXBlX2NvZXJjaW9uLnJz) | `99.32% <ø> (ø)` | |
   | [datafusion/src/physical\_plan/functions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9mdW5jdGlvbnMucnM=) | `89.58% <85.71%> (-0.03%)` | :arrow_down: |
   | [datafusion/src/physical\_plan/math\_expressions.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9tYXRoX2V4cHJlc3Npb25zLnJz) | `90.00% <86.66%> (-10.00%)` | :arrow_down: |
   | [datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/303/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi90ZXN0cy9zcWwucnM=) | `99.88% <100.00%> (+<0.01%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ed92673...fb74a59](https://codecov.io/gh/apache/arrow-datafusion/pull/303?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630297583



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       Good call @jorgecarleitao - I feel both approaches are a bit "workarounds" to the issue of not having something in design for zero-arguments functions.
   
   I feel what would be "cleaner":
   * passing a `usize` argument for the batch size (`so you don't have to do `args[0].num_rows()` or something like that but use the length anywhere. This might break some stuff(?)
   * having another node for zero-argument functions. This might duplicate some stuff(?)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630407771



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       I like the idea of using `NullArray` too, but generally agree with @Dandandan  that both are somewhat hacky ways to handle zero argument functions by passing a "fake" argument




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb merged pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
alamb merged pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist removed a comment on pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist removed a comment on pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#issuecomment-836511210


   ```
   > select c1, random() r from test;
   +----+----------------------+
   | c1 | r                    |
   +----+----------------------+
   | c  | 0.5655965570298749   |
   | d  | 0.5421519281027325   |
   | b  | 0.13654323005560398  |
   | a  | 0.8308433815463256   |
   | b  | 0.9186802759034394   |
   | b  | 0.7445636998695224   |
   | e  | 0.07519801371257007  |
   | a  | 0.25136111798546534  |
   | d  | 0.5295747101676771   |
   | a  | 0.40990916212038053  |
   | d  | 0.6953850383573246   |
   | a  | 0.7315940259479854   |
   | e  | 0.8114995840750128   |
   | d  | 0.5784867126269881   |
   | b  | 0.5043455209680903   |
   | c  | 0.1485474143927108   |
   | e  | 0.03117840454799814  |
   | d  | 0.6236479179320964   |
   | d  | 0.7008741488202888   |
   | e  | 0.18371494725350446  |
   | e  | 0.5753780913309461   |
   | d  | 0.8313881968944341   |
   | a  | 0.9891383411533885   |
   | e  | 0.39443274475046164  |
   | c  | 0.7883888787267002   |
   | a  | 0.11551212043748382  |
   | c  | 0.2280335008009724   |
   | a  | 0.05986784536832124  |
   | a  | 0.17788633521196284  |
   | b  | 0.9116042616992985   |
   | e  | 0.7562969962961654   |
   | c  | 0.7036201290901811   |
   | e  | 0.4835128269497777   |
   | b  | 0.24683894699947628  |
   | a  | 0.4500284040416962   |
   | c  | 0.7808859382454547   |
   | d  | 0.4153951159657392   |
   | c  | 0.9322736705730224   |
   | c  | 0.18616032322350895  |
   | c  | 0.013816038991163904 |
   | b  | 0.24775819719833092  |
   | d  | 0.27429688634180494  |
   | d  | 0.7935576595699978   |
   | a  | 0.23005272798412935  |
   | e  | 0.6416103347500266   |
   | b  | 0.7578072279050923   |
   | b  | 0.8045671610701597   |
   | c  | 0.22788070846321595  |
   | a  | 0.7923072791591466   |
   | d  | 0.5123487785123308   |
   | b  | 0.49300577431592507  |
   | c  | 0.3327370429940055   |
   | d  | 0.6985993575806515   |
   | d  | 0.7679736992227877   |
   | b  | 0.3933313593453034   |
   | d  | 0.893368503058128    |
   | e  | 0.8216179334475555   |
   | b  | 0.41132389621208576  |
   | a  | 0.2190799810695121   |
     1 better err msg
   | b  | 0.17446572139705596  |
   | c  | 0.42130093097362353  |
   | b  | 0.6844927700858034   |
   | c  | 0.43682537610987593  |
   | e  | 0.07365743593350871  |
     1 pick b9ea16e47 fix 305
   | e  | 0.4784257958869569   |
   | d  | 0.16726586201233884  |
   | e  | 0.909180084107635    |
   | c  | 0.6397341730827932   |
   | d  | 0.06891523396188948  |
   | e  | 0.9332425864685405   |
   | e  | 0.5388023964378135   |
   | a  | 0.525252618536995    |
   | a  | 0.7750163345281538   |
   | e  | 0.5870405166590817   |
   | a  | 0.7229541396586825   |
   | b  | 0.19106271603993163  |
   | e  | 0.08862026816319024  |
   | c  | 0.8906844121568276   |
   | e  | 0.7893706764764692   |
   | c  | 0.293145187453989    |
   | a  | 0.22659651366231603  |
   | c  | 0.27982658851095454  |
   | b  | 0.05309405992658389  |
   | a  | 0.9985010613648426   |
   | a  | 0.8408072146747827   |
   | c  | 0.9361082666514462   |
   | a  | 0.7701435871742566   |
   | c  | 0.9257993780317568   |
   | c  | 0.6235457571369003   |
   | c  | 0.9273997152575428   |
   | b  | 0.6094183904162762   |
   | a  | 0.6744046616867845   |
   | a  | 0.750243015599535    |
   | b  | 0.6774154871357085   |
   | d  | 0.950910969179537    |
   | e  | 0.3040496852019885   |
   | e  | 0.7172651559939669   |
   | d  | 0.9587276644264635   |
   | b  | 0.17079263745303108  |
   | e  | 0.2878489806731648   |
   +----+----------------------+
   100 row in set. Query took 0 seconds.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629441606



##########
File path: datafusion/src/physical_plan/math_expressions.rs
##########
@@ -116,3 +116,21 @@ math_unary_function!("exp", exp);
 math_unary_function!("ln", ln);
 math_unary_function!("log2", log2);
 math_unary_function!("log10", log10);
+
+/// random SQL function
+pub fn random(args: &[ColumnarValue]) -> Result<ColumnarValue> {
+    let len = match &args[0] {
+        ColumnarValue::Array(array) => array.len(),
+        _ => {
+            return Err(DataFusionError::Internal(
+                "Expect random function to take no param".to_string(),
+            ))
+        }
+    };
+    let mut rng = thread_rng();
+    let mut array = Vec::with_capacity(len);
+    for _ in 0..len {
+        array.push(Some(rng.gen_range(0.0..1.0)))

Review comment:
       even better :100: 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Jimexist commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
Jimexist commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r629500577



##########
File path: datafusion/src/physical_plan/math_expressions.rs
##########
@@ -116,3 +116,21 @@ math_unary_function!("exp", exp);
 math_unary_function!("ln", ln);
 math_unary_function!("log2", log2);
 math_unary_function!("log10", log10);
+
+/// random SQL function
+pub fn random(args: &[ColumnarValue]) -> Result<ColumnarValue> {
+    let len = match &args[0] {
+        ColumnarValue::Array(array) => array.len(),
+        _ => {
+            return Err(DataFusionError::Internal(
+                "Expect random function to take no param".to_string(),
+            ))
+        }
+    };
+    let mut rng = thread_rng();
+    let mut array = Vec::with_capacity(len);
+    for _ in 0..len {
+        array.push(Some(rng.gen_range(0.0..1.0)))

Review comment:
       Updated per suggestions 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on a change in pull request #303: add random SQL function

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r632722624



##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
       @jorgecarleitao  / @Dandandan  / @Jimexist  are we all cool with using a `NullArray` to pass "how many rows are passed to this function that has no input arguments"? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org