Posted to github@arrow.apache.org by "izveigor (via GitHub)" <gi...@apache.org> on 2023/04/12 18:04:13 UTC

[GitHub] [arrow-datafusion] izveigor opened a new pull request, #5978: docs: improve expressions.md

izveigor opened a new pull request, #5978:
URL: https://github.com/apache/arrow-datafusion/pull/5978

   # Which issue does this PR close?
   
   Closes #5977
   
   # Rationale for this change
   
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   # What changes are included in this PR?
   
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are these changes tested?
   
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   # Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #5978: docs: improve expressions.md

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5978:
URL: https://github.com/apache/arrow-datafusion/pull/5978#issuecomment-1507405618

   Thanks @izveigor !




[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5978: docs: improve expressions.md

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5978:
URL: https://github.com/apache/arrow-datafusion/pull/5978#discussion_r1164565331


##########
docs/source/user-guide/expressions.md:
##########
@@ -125,105 +135,105 @@ Unlike to some databases the math functions in Datafusion works the same way as
 
 ## String Expressions
 
-| Function         | Notes |
-| ---------------- | ----- |
-| ascii            |       |
-| bit_length       |       |
-| btrim            |       |
-| char_length      |       |
-| character_length |       |
-| concat           |       |
-| concat_ws        |       |
-| chr              |       |
-| initcap          |       |
-| left             |       |
-| length           |       |
-| lower            |       |
-| lpad             |       |
-| ltrim            |       |
-| md5              |       |
-| octet_length     |       |
-| repeat           |       |
-| replace          |       |
-| reverse          |       |
-| right            |       |
-| rpad             |       |
-| rtrim            |       |
-| digest           |       |
-| split_part       |       |
-| starts_with      |       |
-| strpos           |       |
-| substr           |       |
-| translate        |       |
-| trim             |       |
-| upper            |       |
+| Function                                       | Notes                                                                                                                                                                                                                                    |
+| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ascii(character)                               | Returns a numeric representation of the character (`character`). Example: `ascii('a') -> 97`                                                                                                                                             |
+| bit_length(text)                               | Returns the length of the string (`text`) in bits. Example: `bit_length('spider') -> 48`                                                                                                                                                 |
+| btrim(text, characters)                        | Removes all specified characters (`characters`) from both the beginning and the end of the string (`text`). Example: `btrim('aabchelloccb', 'abc') -> hello`                                                                             |
+| char_length(text)                              | Returns number of characters in the string (`text`). The same as `character_length` and `length`. Example: `character_length('lion') -> 4`                                                                                               |
+| character_length(text)                         | Returns number of characters in the string (`text`). The same as `char_length` and `length`. Example: `char_length('lion') -> 4`                                                                                                         |
+| concat(value1, [value2 [, ...]])               | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments. NULL arguments are ignored. Example: `concat('aaa', 'bbc', NULL, 321) -> aaabbc321`                                                             |
+| concat_ws(separator, value1, [value2 [, ...]]) | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments with the separator (`separator`). NULL arguments are ignored. `concat_ws('/', 'path', 'to', NULL, 'my', 'folder', 123) -> path/to/my/folder/123` |
+| chr(integer)                                   | Returns a character by its numeric representation (`integer`). Example: `chr(90) -> 8`                                                                                                                                                   |
+| initcap                                        | Converts the first letter of each word to upper case and the rest to lower case. Example: `initcap('hi TOM') -> Hi Tom`                                                                                                                  |
+| left(text, number)                             | Returns a certain number (`number`) of first characters (`text`). Example: `left('like', 2) -> li`                                                                                                                                       |
+| length(text)                                   | Returns number of characters in the string (`text`). The same as `character_length` and `char_length`. Example: `length('lion') -> 4`                                                                                                    |
+| lower(text)                                    | Converts all characters in the string (`text`) into lower case. Example: `lower('HELLO') -> hello`                                                                                                                                       |
+| lpad(text, length, [, fill])                   | Extends the string to length (`lenght`) by prepending the characters (`fill`) (a space by default). Example: `lpad('bb', 5, 'a') → aaabb`                                                                                                |

Review Comment:
   ```suggestion
   | lpad(text, length, [, fill])                   | Extends the string to length (`length`) by prepending the characters (`fill`) (a space by default). Example: `lpad('bb', 5, 'a') → aaabb`                                                                                                |
   ```
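
The padding behaviour described in this row can be sketched outside DataFusion. The following is a minimal Python model, not DataFusion's implementation. The function names `lpad` and `rpad` mirror the table; the helper `_padding`, the truncation of over-long input, and the handling of an empty `fill` are assumptions borrowed from PostgreSQL-style semantics.

```python
def _padding(fill: str, count: int) -> str:
    """Repeat `fill` and cut it to exactly `count` characters."""
    if count <= 0 or not fill:
        return ""
    return (fill * (count // len(fill) + 1))[:count]


def lpad(text: str, length: int, fill: str = " ") -> str:
    # Assumption: input longer than `length` is truncated on the right.
    if len(text) >= length:
        return text[:max(length, 0)]
    return _padding(fill, length - len(text)) + text


def rpad(text: str, length: int, fill: str = " ") -> str:
    # Same assumption about truncation as in lpad.
    if len(text) >= length:
        return text[:max(length, 0)]
    return text + _padding(fill, length - len(text))


print(lpad("bb", 5, "a"))  # aaabb, matching the example in the row
print(rpad("bb", 5, "a"))  # bbaaa
```

In this model `lpad` prepends the fill characters and `rpad` appends them, which is the only difference between the two.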



##########
docs/source/user-guide/expressions.md:
##########
@@ -125,105 +135,105 @@ Unlike to some databases the math functions in Datafusion works the same way as
 
 ## String Expressions
 
-| Function         | Notes |
-| ---------------- | ----- |
-| ascii            |       |
-| bit_length       |       |
-| btrim            |       |
-| char_length      |       |
-| character_length |       |
-| concat           |       |
-| concat_ws        |       |
-| chr              |       |
-| initcap          |       |
-| left             |       |
-| length           |       |
-| lower            |       |
-| lpad             |       |
-| ltrim            |       |
-| md5              |       |
-| octet_length     |       |
-| repeat           |       |
-| replace          |       |
-| reverse          |       |
-| right            |       |
-| rpad             |       |
-| rtrim            |       |
-| digest           |       |
-| split_part       |       |
-| starts_with      |       |
-| strpos           |       |
-| substr           |       |
-| translate        |       |
-| trim             |       |
-| upper            |       |
+| Function                                       | Notes                                                                                                                                                                                                                                    |
+| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ascii(character)                               | Returns a numeric representation of the character (`character`). Example: `ascii('a') -> 97`                                                                                                                                             |
+| bit_length(text)                               | Returns the length of the string (`text`) in bits. Example: `bit_length('spider') -> 48`                                                                                                                                                 |
+| btrim(text, characters)                        | Removes all specified characters (`characters`) from both the beginning and the end of the string (`text`). Example: `btrim('aabchelloccb', 'abc') -> hello`                                                                             |
+| char_length(text)                              | Returns number of characters in the string (`text`). The same as `character_length` and `length`. Example: `character_length('lion') -> 4`                                                                                               |
+| character_length(text)                         | Returns number of characters in the string (`text`). The same as `char_length` and `length`. Example: `char_length('lion') -> 4`                                                                                                         |
+| concat(value1, [value2 [, ...]])               | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments. NULL arguments are ignored. Example: `concat('aaa', 'bbc', NULL, 321) -> aaabbc321`                                                             |
+| concat_ws(separator, value1, [value2 [, ...]]) | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments with the separator (`separator`). NULL arguments are ignored. `concat_ws('/', 'path', 'to', NULL, 'my', 'folder', 123) -> path/to/my/folder/123` |
+| chr(integer)                                   | Returns a character by its numeric representation (`integer`). Example: `chr(90) -> 8`                                                                                                                                                   |
+| initcap                                        | Converts the first letter of each word to upper case and the rest to lower case. Example: `initcap('hi TOM') -> Hi Tom`                                                                                                                  |
+| left(text, number)                             | Returns a certain number (`number`) of first characters (`text`). Example: `left('like', 2) -> li`                                                                                                                                       |
+| length(text)                                   | Returns number of characters in the string (`text`). The same as `character_length` and `char_length`. Example: `length('lion') -> 4`                                                                                                    |
+| lower(text)                                    | Converts all characters in the string (`text`) into lower case. Example: `lower('HELLO') -> hello`                                                                                                                                       |
+| lpad(text, length, [, fill])                   | Extends the string to length (`lenght`) by prepending the characters (`fill`) (a space by default). Example: `lpad('bb', 5, 'a') → aaabb`                                                                                                |
+| ltrim(text, text)                              | Removes all specified characters (`characters`) from the beginning of the string (`text`). Example: `ltrim('aabchelloccb', 'abc') -> helloccb`                                                                                           |
+| md5(text)                                      | Computes the MD5 hash of the argument (`text`).                                                                                                                                                                                          |
+| octet_length(text)                             | Returns number of bytes in the string (`text`).                                                                                                                                                                                          |
+| repeat(text, number)                           | Repeats the string the specified number of times. Example: `repeat('1', 4) -> 1111`                                                                                                                                                      |
+| replace(string, from, to)                      | Replaces a specified string (`from`) with another specified string (`to`) in the string (`string`). Example: `replace('Hello', 'replace', 'el') -> Hola`                                                                                 |
+| reverse(text)                                  | Reverses the order of the characters in the string (`text`). Example: `reverse('hello') -> olleh`                                                                                                                                        |
+| right(text, number)                            | Returns a certain number (`number`) of last characters (`text`). Example: `right('like', 2) -> ke`                                                                                                                                       |
+| rpad(text, length, [, fill])                   | Extends the string to length (`lenght`) by prepending the characters (`fill`) (a space by default). Example: `rpad('bb', 5, 'a') → bbaaa`                                                                                                |
+| rtrim                                          | Removes all specified characters (`characters`) from the end of the string (`text`). Example: `rtrim('aabchelloccb', 'abc') -> aabchello`                                                                                                |
+| digest(input, algorithm)                       | Computes the binary hash of `input`, using the `algorithm`.                                                                                                                                                                              |
+| split_part(string, delimiter, index)           | Splits the string (`string`) based on a delimiter (`delimiter`) and picks out the desired field based on the index (`index`).                                                                                                            |
+| starts_with(string, prefix)                    | Returns `true` if the string (`string`) starts with the specified prefix (`prefix`). If not, it returns `false`. Example: `starts_with('Hi Tom', 'Hi') -> true`                                                                          |
+| strpos                                         | Finds the position from where the `substring` matches the `string`                                                                                                                                                                       |
+| substr(string, position, [, length])           | Returns substring from the position (`position`) with length (`length`) characters in the string (`string`).                                                                                                                             |
+| translate(string, from, to)                    | Replaces the characters in `from` with the counterpart in `to`. Example: `translate('abcde', 'acd', '15') -> 1b5e`                                                                                                                       |
+| trim(string)                                   | Removes all characters, space by default from the string (`string`)                                                                                                                                                                      |
+| upper                                          | Converts all characters in the string into upper case. Example: `upper('hello') -> HELLO`                                                                                                                                                |
 
 ## Regular Expressions
 
-| Function       | Notes |
-| -------------- | ----- |
-| regexp_match   |       |
-| regexp_replace |       |
+| Function       | Notes                                                                         |
+| -------------- | ----------------------------------------------------------------------------- |
+| regexp_match   | Matches a regular expression against a string and returns matched substrings. |
+| regexp_replace | Replaces strings that match a regular expression                              |
 
 ## Temporal Expressions
 
-| Function             | Notes        |
-| -------------------- | ------------ |
-| date_part            |              |
-| date_trunc           |              |
-| from_unixtime        |              |
-| to_timestamp         |              |
-| to_timestamp_millis  |              |
-| to_timestamp_micros  |              |
-| to_timestamp_seconds |              |
-| now()                | current time |
+| Function             | Notes                                                  |
+| -------------------- | ------------------------------------------------------ |
+| date_part            | Extracts a subfield from the date.                     |
+| date_trunc           | Truncates the date to a specified level of precision.  |
+| from_unixtime        | Returns the unix time in format.                       |
+| to_timestamp         | Converts a string to a `Timestamp(_, _)`               |
+| to_timestamp_millis  | Converts a string to a `Timestamp(Milliseconds, None)` |
+| to_timestamp_micros  | Converts a string to a `Timestamp(Microseconds, None)` |
+| to_timestamp_seconds | Converts a string to a `Timestamp(Seconds, None)`      |
+| now()                | Returns current time.                                  |
 
 ## Other Expressions
 
-| Function | Notes |
-| -------- | ----- |
-| array    |       |
-| in_list  |       |
-| random   |       |
-| sha224   |       |
-| sha256   |       |
-| sha384   |       |
-| sha512   |       |
-| struct   |       |
-| to_hex   |       |
+| Function                     | Notes                                                                                                      |
+| ---------------------------- | ---------------------------------------------------------------------------------------------------------- |
+| array([value1, ...])         | Returns an array of fixed size with each argument (`[value1, ...]`) on it.                                 |
+| in_list(expr, list, negated) | Returns `true` if (`expr`) belongs or not belongs (`negated`) to a list (`list`), otherwise returns false. |
+| random()                     | Returns a random value from 0 (inclusive) to 1 (exclusive).                                                |
+| sha224(text)                 | Computes the SHA224 hash of the argument (`text`).                                                         |
+| sha256(text)                 | Computes the SHA256 hash of the argument (`text`).                                                         |
+| sha384(text)                 | Computes the SHA384 hash of the argument (`text`).                                                         |
+| sha512(text)                 | Computes the SHA512 hash of the argument (`text`).                                                         |
+| struct                       |                                                                                                            |

Review Comment:
   ```sql
   ❯ select struct('soo');
   +---------------------+
   | struct(Utf8("soo")) |
   +---------------------+
   | {c0: soo}           |
   +---------------------+
   ```
   
   Wow, that is a wild function 🤔
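
The output above suggests that `struct` wraps its argument in a record whose field is named `c0`. A rough Python model of that behaviour is sketched below. It assumes that further arguments would be named `c1`, `c2`, and so on by position; the query above only demonstrates the single-argument case.

```python
def struct(*values):
    # Assumption: fields are named c0, c1, ... by argument position,
    # generalising the single-field `{c0: soo}` result shown above.
    return {f"c{index}": value for index, value in enumerate(values)}


print(struct("soo"))  # {'c0': 'soo'}
```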



##########
docs/source/user-guide/expressions.md:
##########
@@ -125,105 +135,105 @@ Unlike to some databases the math functions in Datafusion works the same way as
 
 ## String Expressions
 
-| Function         | Notes |
-| ---------------- | ----- |
-| ascii            |       |
-| bit_length       |       |
-| btrim            |       |
-| char_length      |       |
-| character_length |       |
-| concat           |       |
-| concat_ws        |       |
-| chr              |       |
-| initcap          |       |
-| left             |       |
-| length           |       |
-| lower            |       |
-| lpad             |       |
-| ltrim            |       |
-| md5              |       |
-| octet_length     |       |
-| repeat           |       |
-| replace          |       |
-| reverse          |       |
-| right            |       |
-| rpad             |       |
-| rtrim            |       |
-| digest           |       |
-| split_part       |       |
-| starts_with      |       |
-| strpos           |       |
-| substr           |       |
-| translate        |       |
-| trim             |       |
-| upper            |       |
+| Function                                       | Notes                                                                                                                                                                                                                                    |
+| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ascii(character)                               | Returns a numeric representation of the character (`character`). Example: `ascii('a') -> 97`                                                                                                                                             |

Review Comment:
   At least some of these functions are already covered in https://arrow.apache.org/datafusion/user-guide/sql/scalar_functions.html#string-functions
   
   However, I see this is for the expression syntax. 🤔 
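
Some of the examples in the quoted String Expressions rows can be cross-checked with plain Python. The helpers below are hypothetical stand-ins built on Python built-ins with similar semantics; they are not DataFusion code, and their names (`ascii_code`, `bit_length`, `btrim`, `char_length`) are chosen here for illustration only.

```python
def ascii_code(character: str) -> int:
    # Numeric code of the first character, as in ascii('a').
    return ord(character[0])


def bit_length(text: str) -> int:
    # Assumption: length is measured over the UTF-8 encoding.
    return len(text.encode("utf-8")) * 8


def btrim(text: str, characters: str) -> str:
    # Strip any of `characters` from both ends of `text`.
    return text.strip(characters)


def char_length(text: str) -> int:
    # Number of characters, as in char_length / character_length / length.
    return len(text)


print(ascii_code("a"))               # 97
print(bit_length("spider"))          # 48
print(btrim("aabchelloccb", "abc"))  # hello
print(char_length("lion"))           # 4
print(chr(90))                       # Z
```

Under this model `chr(90)` yields `Z`, the inverse of `ascii_code("Z")`.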
   





[GitHub] [arrow-datafusion] alamb merged pull request #5978: docs: improve expressions.md

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #5978:
URL: https://github.com/apache/arrow-datafusion/pull/5978

