Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/04/12 19:43:15 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5978: docs: improve expressions.md

alamb commented on code in PR #5978:
URL: https://github.com/apache/arrow-datafusion/pull/5978#discussion_r1164565331


##########
docs/source/user-guide/expressions.md:
##########
@@ -125,105 +135,105 @@ Unlike to some databases the math functions in Datafusion works the same way as
 
 ## String Expressions
 
-| Function         | Notes |
-| ---------------- | ----- |
-| ascii            |       |
-| bit_length       |       |
-| btrim            |       |
-| char_length      |       |
-| character_length |       |
-| concat           |       |
-| concat_ws        |       |
-| chr              |       |
-| initcap          |       |
-| left             |       |
-| length           |       |
-| lower            |       |
-| lpad             |       |
-| ltrim            |       |
-| md5              |       |
-| octet_length     |       |
-| repeat           |       |
-| replace          |       |
-| reverse          |       |
-| right            |       |
-| rpad             |       |
-| rtrim            |       |
-| digest           |       |
-| split_part       |       |
-| starts_with      |       |
-| strpos           |       |
-| substr           |       |
-| translate        |       |
-| trim             |       |
-| upper            |       |
+| Function                                       | Notes                                                                                                                                                                                                                                    |
+| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ascii(character)                               | Returns a numeric representation of the character (`character`). Example: `ascii('a') -> 97`                                                                                                                                             |
+| bit_length(text)                               | Returns the length of the string (`text`) in bits. Example: `bit_length('spider') -> 48`                                                                                                                                                 |
+| btrim(text, characters)                        | Removes all specified characters (`characters`) from both the beginning and the end of the string (`text`). Example: `btrim('aabchelloccb', 'abc') -> hello`                                                                             |
+| char_length(text)                              | Returns number of characters in the string (`text`). The same as `character_length` and `length`. Example: `character_length('lion') -> 4`                                                                                               |
+| character_length(text)                         | Returns number of characters in the string (`text`). The same as `char_length` and `length`. Example: `char_length('lion') -> 4`                                                                                                         |
+| concat(value1, [value2 [, ...]])               | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments. NULL arguments are ignored. Example: `concat('aaa', 'bbc', NULL, 321) -> aaabbc321`                                                             |
+| concat_ws(separator, value1, [value2 [, ...]]) | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments with the separator (`separator`). NULL arguments are ignored. `concat_ws('/', 'path', 'to', NULL, 'my', 'folder', 123) -> path/to/my/folder/123` |
+| chr(integer)                                   | Returns a character by its numeric representation (`integer`). Example: `chr(90) -> 8`                                                                                                                                                   |
+| initcap                                        | Converts the first letter of each word to upper case and the rest to lower case. Example: `initcap('hi TOM') -> Hi Tom`                                                                                                                  |
+| left(text, number)                             | Returns a certain number (`number`) of first characters (`text`). Example: `left('like', 2) -> li`                                                                                                                                       |
+| length(text)                                   | Returns number of characters in the string (`text`). The same as `character_length` and `char_length`. Example: `length('lion') -> 4`                                                                                                    |
+| lower(text)                                    | Converts all characters in the string (`text`) into lower case. Example: `lower('HELLO') -> hello`                                                                                                                                       |
+| lpad(text, length, [, fill])                   | Extends the string to length (`lenght`) by prepending the characters (`fill`) (a space by default). Example: `lpad('bb', 5, 'a') → aaabb`                                                                                                |

Review Comment:
   ```suggestion
   | lpad(text, length, [, fill])                   | Extends the string to length (`length`) by prepending the characters (`fill`) (a space by default). Example: `lpad('bb', 5, 'a') → aaabb`                                                                                                |
   ```
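
The padding behaviour described in the `lpad` row can be modelled in a few lines of Python. This is only a sketch of the semantics stated in the table, not DataFusion's implementation: the helper names `lpad` and `rpad` mirror the SQL functions, and truncating inputs that are already longer than `length` is an assumption borrowed from PostgreSQL, since the table does not say what happens in that case.

```python
def _padding(pad_len: int, fill: str) -> str:
    """Repeat `fill` and cut it to exactly `pad_len` characters."""
    if pad_len <= 0 or not fill:
        return ""
    repeats = pad_len // len(fill) + 1
    return (fill * repeats)[:pad_len]


def lpad(text: str, length: int, fill: str = " ") -> str:
    """Extend `text` to `length` characters by prepending `fill`."""
    if length <= len(text):
        # Assumption (PostgreSQL-style): longer inputs are truncated on the right.
        return text[:max(length, 0)]
    return _padding(length - len(text), fill) + text


def rpad(text: str, length: int, fill: str = " ") -> str:
    """Mirror of lpad: the fill characters are appended instead."""
    if length <= len(text):
        return text[:max(length, 0)]
    return text + _padding(length - len(text), fill)


print(lpad("bb", 5, "a"))  # aaabb, matching the example in the table
print(rpad("bb", 5, "a"))  # bbaaa
```

A multi-character `fill` is repeated and cut to size, so under these assumptions `lpad('hi', 5, 'xy')` gives `xyxhi`.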



##########
docs/source/user-guide/expressions.md:
##########
@@ -125,105 +135,105 @@ Unlike to some databases the math functions in Datafusion works the same way as
 
 ## String Expressions
 
-| Function         | Notes |
-| ---------------- | ----- |
-| ascii            |       |
-| bit_length       |       |
-| btrim            |       |
-| char_length      |       |
-| character_length |       |
-| concat           |       |
-| concat_ws        |       |
-| chr              |       |
-| initcap          |       |
-| left             |       |
-| length           |       |
-| lower            |       |
-| lpad             |       |
-| ltrim            |       |
-| md5              |       |
-| octet_length     |       |
-| repeat           |       |
-| replace          |       |
-| reverse          |       |
-| right            |       |
-| rpad             |       |
-| rtrim            |       |
-| digest           |       |
-| split_part       |       |
-| starts_with      |       |
-| strpos           |       |
-| substr           |       |
-| translate        |       |
-| trim             |       |
-| upper            |       |
+| Function                                       | Notes                                                                                                                                                                                                                                    |
+| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ascii(character)                               | Returns a numeric representation of the character (`character`). Example: `ascii('a') -> 97`                                                                                                                                             |
+| bit_length(text)                               | Returns the length of the string (`text`) in bits. Example: `bit_length('spider') -> 48`                                                                                                                                                 |
+| btrim(text, characters)                        | Removes all specified characters (`characters`) from both the beginning and the end of the string (`text`). Example: `btrim('aabchelloccb', 'abc') -> hello`                                                                             |
+| char_length(text)                              | Returns number of characters in the string (`text`). The same as `character_length` and `length`. Example: `character_length('lion') -> 4`                                                                                               |
+| character_length(text)                         | Returns number of characters in the string (`text`). The same as `char_length` and `length`. Example: `char_length('lion') -> 4`                                                                                                         |
+| concat(value1, [value2 [, ...]])               | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments. NULL arguments are ignored. Example: `concat('aaa', 'bbc', NULL, 321) -> aaabbc321`                                                             |
+| concat_ws(separator, value1, [value2 [, ...]]) | Concatenates the text representations (`value1, [value2 [, ...]]`) of all the arguments with the separator (`separator`). NULL arguments are ignored. `concat_ws('/', 'path', 'to', NULL, 'my', 'folder', 123) -> path/to/my/folder/123` |
+| chr(integer)                                   | Returns a character by its numeric representation (`integer`). Example: `chr(90) -> 8`                                                                                                                                                   |
+| initcap                                        | Converts the first letter of each word to upper case and the rest to lower case. Example: `initcap('hi TOM') -> Hi Tom`                                                                                                                  |
+| left(text, number)                             | Returns a certain number (`number`) of first characters (`text`). Example: `left('like', 2) -> li`                                                                                                                                       |
+| length(text)                                   | Returns number of characters in the string (`text`). The same as `character_length` and `char_length`. Example: `length('lion') -> 4`                                                                                                    |
+| lower(text)                                    | Converts all characters in the string (`text`) into lower case. Example: `lower('HELLO') -> hello`                                                                                                                                       |
+| lpad(text, length, [, fill])                   | Extends the string to length (`lenght`) by prepending the characters (`fill`) (a space by default). Example: `lpad('bb', 5, 'a') → aaabb`                                                                                                |
+| ltrim(text, text)                              | Removes all specified characters (`characters`) from the beginning of the string (`text`). Example: `ltrim('aabchelloccb', 'abc') -> helloccb`                                                                                           |
+| md5(text)                                      | Computes the MD5 hash of the argument (`text`).                                                                                                                                                                                          |
+| octet_length(text)                             | Returns number of bytes in the string (`text`).                                                                                                                                                                                          |
+| repeat(text, number)                           | Repeats the string the specified number of times. Example: `repeat('1', 4) -> 1111`                                                                                                                                                      |
+| replace(string, from, to)                      | Replaces a specified string (`from`) with another specified string (`to`) in the string (`string`). Example: `replace('Hello', 'replace', 'el') -> Hola`                                                                                 |
+| reverse(text)                                  | Reverses the order of the characters in the string (`text`). Example: `reverse('hello') -> olleh`                                                                                                                                        |
+| right(text, number)                            | Returns a certain number (`number`) of last characters (`text`). Example: `right('like', 2) -> ke`                                                                                                                                       |
+| rpad(text, length, [, fill])                   | Extends the string to length (`lenght`) by prepending the characters (`fill`) (a space by default). Example: `rpad('bb', 5, 'a') → bbaaa`                                                                                                |
+| rtrim                                          | Removes all specified characters (`characters`) from the end of the string (`text`). Example: `rtrim('aabchelloccb', 'abc') -> aabchello`                                                                                                |
+| digest(input, algorithm)                       | Computes the binary hash of `input`, using the `algorithm`.                                                                                                                                                                              |
+| split_part(string, delimiter, index)           | Splits the string (`string`) based on a delimiter (`delimiter`) and picks out the desired field based on the index (`index`).                                                                                                            |
+| starts_with(string, prefix)                    | Returns `true` if the string (`string`) starts with the specified prefix (`prefix`). If not, it returns `false`. Example: `starts_with('Hi Tom', 'Hi') -> true`                                                                          |
+| strpos                                         | Finds the position from where the `substring` matches the `string`                                                                                                                                                                       |
+| substr(string, position, [, length])           | Returns substring from the position (`position`) with length (`length`) characters in the string (`string`).                                                                                                                             |
+| translate(string, from, to)                    | Replaces the characters in `from` with the counterpart in `to`. Example: `translate('abcde', 'acd', '15') -> 1b5e`                                                                                                                       |
+| trim(string)                                   | Removes all characters, space by default from the string (`string`)                                                                                                                                                                      |
+| upper                                          | Converts all characters in the string into upper case. Example: `upper('hello') -> HELLO`                                                                                                                                                |
 
 ## Regular Expressions
 
-| Function       | Notes |
-| -------------- | ----- |
-| regexp_match   |       |
-| regexp_replace |       |
+| Function       | Notes                                                                         |
+| -------------- | ----------------------------------------------------------------------------- |
+| regexp_match   | Matches a regular expression against a string and returns matched substrings. |
+| regexp_replace | Replaces strings that match a regular expression                              |
 
 ## Temporal Expressions
 
-| Function             | Notes        |
-| -------------------- | ------------ |
-| date_part            |              |
-| date_trunc           |              |
-| from_unixtime        |              |
-| to_timestamp         |              |
-| to_timestamp_millis  |              |
-| to_timestamp_micros  |              |
-| to_timestamp_seconds |              |
-| now()                | current time |
+| Function             | Notes                                                  |
+| -------------------- | ------------------------------------------------------ |
+| date_part            | Extracts a subfield from the date.                     |
+| date_trunc           | Truncates the date to a specified level of precision.  |
+| from_unixtime        | Returns the unix time in format.                       |
+| to_timestamp         | Converts a string to a `Timestamp(_, _)`               |
+| to_timestamp_millis  | Converts a string to a `Timestamp(Milliseconds, None)` |
+| to_timestamp_micros  | Converts a string to a `Timestamp(Microseconds, None)` |
+| to_timestamp_seconds | Converts a string to a `Timestamp(Seconds, None)`      |
+| now()                | Returns current time.                                  |
 
 ## Other Expressions
 
-| Function | Notes |
-| -------- | ----- |
-| array    |       |
-| in_list  |       |
-| random   |       |
-| sha224   |       |
-| sha256   |       |
-| sha384   |       |
-| sha512   |       |
-| struct   |       |
-| to_hex   |       |
+| Function                     | Notes                                                                                                      |
+| ---------------------------- | ---------------------------------------------------------------------------------------------------------- |
+| array([value1, ...])         | Returns an array of fixed size with each argument (`[value1, ...]`) on it.                                 |
+| in_list(expr, list, negated) | Returns `true` if (`expr`) belongs or not belongs (`negated`) to a list (`list`), otherwise returns false. |
+| random()                     | Returns a random value from 0 (inclusive) to 1 (exclusive).                                                |
+| sha224(text)                 | Computes the SHA224 hash of the argument (`text`).                                                         |
+| sha256(text)                 | Computes the SHA256 hash of the argument (`text`).                                                         |
+| sha384(text)                 | Computes the SHA384 hash of the argument (`text`).                                                         |
+| sha512(text)                 | Computes the SHA512 hash of the argument (`text`).                                                         |
+| struct                       |                                                                                                            |

Review Comment:
   ```sql
   ❯ select struct('soo');
   +---------------------+
   | struct(Utf8("soo")) |
   +---------------------+
   | {c0: soo}           |
   +---------------------+
   ```
   
   Wow, that is a wild function 🤔



##########
docs/source/user-guide/expressions.md:
##########
@@ -125,105 +135,105 @@ Unlike to some databases the math functions in Datafusion works the same way as
 
 ## String Expressions
 
-| Function         | Notes |
-| ---------------- | ----- |
-| ascii            |       |
-| bit_length       |       |
-| btrim            |       |
-| char_length      |       |
-| character_length |       |
-| concat           |       |
-| concat_ws        |       |
-| chr              |       |
-| initcap          |       |
-| left             |       |
-| length           |       |
-| lower            |       |
-| lpad             |       |
-| ltrim            |       |
-| md5              |       |
-| octet_length     |       |
-| repeat           |       |
-| replace          |       |
-| reverse          |       |
-| right            |       |
-| rpad             |       |
-| rtrim            |       |
-| digest           |       |
-| split_part       |       |
-| starts_with      |       |
-| strpos           |       |
-| substr           |       |
-| translate        |       |
-| trim             |       |
-| upper            |       |
+| Function                                       | Notes                                                                                                                                                                                                                                    |
+| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ascii(character)                               | Returns a numeric representation of the character (`character`). Example: `ascii('a') -> 97`                                                                                                                                             |

Review Comment:
   At least some of these functions are already covered in https://arrow.apache.org/datafusion/user-guide/sql/scalar_functions.html#string-functions
   
   However, I see this is for the expression syntax. 🤔 
   


