You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Tanner Clary (Jira)" <ji...@apache.org> on 2023/04/12 17:26:00 UTC

[jira] [Assigned] (CALCITE-5644) Implement BigQuery CONTAINS_SUBSTR

     [ https://issues.apache.org/jira/browse/CALCITE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tanner Clary reassigned CALCITE-5644:
-------------------------------------

    Assignee: Tanner Clary

> Implement BigQuery CONTAINS_SUBSTR
> ----------------------------------
>
>                 Key: CALCITE-5644
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5644
>             Project: Calcite
>          Issue Type: Task
>            Reporter: Tanner Clary
>            Assignee: Tanner Clary
>            Priority: Major
>
> BigQuery offers the [CONTAINS_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr] function that returns {{TRUE}} if a substring is present in an expression and {{FALSE}} if it is not. A basic example of this may be seen in [1]. 
> The expression can take many forms (more info in the linked doc) which makes its implementation more complex than other string functions that only accept arguments of type {{STRING}}. For instance, the expression to be searched can be a column or table reference. 
> The function also has an optional third argument called {{json_scope}} where the user can indicate the scope of JSON data (keys, values, or both) to be searched for the substring.
>  I am curious if anyone has thoughts on how the search of rows or tables could be implemented. I have a basic implementation (that supports expressions of type {{STRING}} and nothing else) that I will open a draft PR for as a starting point. To me, the challenge is implementing the additional features like the {{JSON_SCOPE}} argument (seen in [2]) and performing a cross field search as seen in [3]. 
> [1] {{SELECT CONTAINS_SUBSTR("hello", "he");}} would return {{TRUE}}. 
> [2] SELECT CONTAINS_SUBSTR(JSON '("lunch":"soup")', "lunch",
>          json_scope=>"JSON_VALUES") AS result;
> would return {{FALSE}}.
> [3] 
> {{SELECT *
> FROM Recipes
> WHERE CONTAINS_SUBSTR(
>   (SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
>   'potato'
> );}} would return: 
> ||Breakfast||Lunch||Dinner||
> |Potato pancakes|Toasted cheese sandwich|Beef stroganoff|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)