Posted to commits@druid.apache.org by "paul-rogers (via GitHub)" <gi...@apache.org> on 2023/03/21 23:13:19 UTC

[GitHub] [druid] paul-rogers opened a new pull request, #13958: Added VARIANT data type for external tables

paul-rogers opened a new pull request, #13958:
URL: https://github.com/apache/druid/pull/13958

   Druid recently added support for nested JSON types. The JSON-style form of `extern()` can label an input column as a JSON object or array using the Druid `complex<json>` type. This PR adds an equivalent type to SQL using the `VARIANT` keyword. Example:
   
   ```sql
   INSERT INTO dst SELECT *
   FROM TABLE(http(uris => ARRAY['http://foo.com/bar.json'],
                   format => 'json'))
        (x VARCHAR, y VARCHAR, z VARIANT)
   PARTITIONED BY ALL TIME
   ```
   
   The semantics are that the `z` column above holds _some_ complex JSON type, but the query does not have to say which: Druid's indexer will figure it out. Since the exact type is unknown (other than that it isn't simple), we let Druid do the work by declaring that the input column type varies: it is of type `VARIANT`. At run time, Druid picks the actual type, which may be `complex<json>` or may be something else.
   
   #### Release note
   
   See the comments above and the modified `reference.md` file.
   
   Druid has added a new keyword, `VARIANT`, in this release. In the unlikely event that your tables or columns use that name, you must now quote the name in SQL statements.
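   
   For tools that generate SQL text, the quoting rule above can be sketched as follows. This is only an illustration: the class `IdentifierQuotingSketch`, the method `quoteIfReserved`, and the one-entry reserved-word set are hypothetical and not part of Druid. It assumes standard SQL identifier quoting (double quotes, with embedded double quotes doubled).
   
   ```java
   import java.util.Locale;
   import java.util.Set;

   // Hypothetical helper for illustration; not part of Druid.
   final class IdentifierQuotingSketch
   {
     // Assumption: only the keyword named in this release note is listed here.
     private static final Set<String> RESERVED = Set.of("VARIANT");

     static String quoteIfReserved(String identifier)
     {
       // Keywords are matched case-insensitively; other names pass through unchanged.
       if (!RESERVED.contains(identifier.toUpperCase(Locale.ROOT))) {
         return identifier;
       }
       // Standard SQL quoting: wrap in double quotes and double any embedded quote.
       return "\"" + identifier.replace("\"", "\"\"") + "\"";
     }
   }
   ```
   
   With this sketch, a column named `variant` would be emitted as `"variant"`, while a column named `x` is left as-is.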
   
   <hr>
   
   This PR has:
   
   - [X] been self-reviewed.
   - [X] added documentation for new or modified features or behaviors.
   - [X] a release note entry in the PR description.
   - [X] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [X] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] paul-rogers merged pull request #13958: Added TYPE(native) data type for external tables

Posted by "paul-rogers (via GitHub)" <gi...@apache.org>.
paul-rogers merged PR #13958:
URL: https://github.com/apache/druid/pull/13958




[GitHub] [druid] paul-rogers commented on a diff in pull request #13958: Added TYPE(native) data type for external tables

Posted by "paul-rogers (via GitHub)" <gi...@apache.org>.
paul-rogers commented on code in PR #13958:
URL: https://github.com/apache/druid/pull/13958#discussion_r1145550463


##########
sql/src/main/java/org/apache/druid/sql/calcite/external/Externals.java:
##########
@@ -242,7 +242,11 @@ private static String convertType(String name, SqlDataTypeSpec dataType)
     if (typeName == null || !typeName.isSimple()) {
       throw unsupportedType(name, dataType);
     }
-    SqlTypeName type = SqlTypeName.get(typeName.getSimple());
+    String simpleName = typeName.getSimple();
+    if (StringUtils.toLowerCase(simpleName).startsWith(("complex<"))) {
+      return simpleName;
+    }
+    SqlTypeName type = SqlTypeName.get(simpleName);

Review Comment:
   Changed all the names. I hope the new names are a bit more clear.





[GitHub] [druid] clintropolis commented on a diff in pull request #13958: Added TYPE(native) data type for external tables

Posted by "clintropolis (via GitHub)" <gi...@apache.org>.
clintropolis commented on code in PR #13958:
URL: https://github.com/apache/druid/pull/13958#discussion_r1145496668


##########
sql/src/main/codegen/config.fmpp:
##########
@@ -68,6 +69,7 @@ data: {
       "CLUSTERED"
       "OVERWRITE"
       "PARTITIONED"
+      "VARIANT"

Review Comment:
   this seems stale



##########
sql/src/main/codegen/includes/common.ftl:
##########
@@ -92,3 +92,18 @@ SqlNodeList ClusterItems() :
     return new SqlNodeList(list, s.addAll(list).pos());
   }
 }
+
+SqlTypeNameSpec VariantType() :

Review Comment:
   nit: maybe rename something like `DruidTypePassthrough`?



##########
sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidTypeSystem.java:
##########
@@ -35,6 +35,8 @@ public class DruidTypeSystem implements RelDataTypeSystem
    */
   public static final int DEFAULT_TIMESTAMP_PRECISION = 3;
 
+  public static final String VARIANT_TYPE_NAME = "VARIANT";

Review Comment:
   nit: stale



##########
sql/src/main/java/org/apache/druid/sql/calcite/external/Externals.java:
##########
@@ -242,7 +242,11 @@ private static String convertType(String name, SqlDataTypeSpec dataType)
     if (typeName == null || !typeName.isSimple()) {
       throw unsupportedType(name, dataType);
     }
-    SqlTypeName type = SqlTypeName.get(typeName.getSimple());
+    String simpleName = typeName.getSimple();
+    if (StringUtils.toLowerCase(simpleName).startsWith(("complex<"))) {
+      return simpleName;
+    }
+    SqlTypeName type = SqlTypeName.get(simpleName);

Review Comment:
   nit: i know this isn't new, but these variable names are kinda confusing which initially had me wondering why we were looking up the type name of the type name 😅 but see now that `typeName` is an `SqlIdentifier` and `type` is an `SqlTypeName`.
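   
   To make the distinction concrete, here is a standalone sketch of the pass-through check in the hunk above, with names that separate the declared SQL type name from the resulting Druid type. It is not the actual `Externals.convertType` code: the class name, the plain-`String` signature, and the small `SQL_TO_DRUID` map (standing in for the Calcite `SqlTypeName` lookup) are assumptions made so the example is self-contained.
   
   ```java
   import java.util.Locale;
   import java.util.Map;

   // Standalone illustration; not the actual Externals.convertType implementation.
   final class TypePassThroughSketch
   {
     // Hypothetical, deliberately tiny mapping that stands in for the
     // SqlTypeName-based lookup in the real code.
     private static final Map<String, String> SQL_TO_DRUID = Map.of(
         "VARCHAR", "string",
         "BIGINT", "long",
         "DOUBLE", "double"
     );

     static String convertType(String columnName, String sqlTypeName)
     {
       // Complex types such as complex<json> are returned unchanged,
       // mirroring the startsWith("complex<") check in the diff.
       if (sqlTypeName.toLowerCase(Locale.ROOT).startsWith("complex<")) {
         return sqlTypeName;
       }
       String druidType = SQL_TO_DRUID.get(sqlTypeName.toUpperCase(Locale.ROOT));
       if (druidType == null) {
         throw new IllegalArgumentException(
             "Column [" + columnName + "] has an unsupported type: " + sqlTypeName
         );
       }
       return druidType;
     }

     public static void main(String[] args)
     {
       System.out.println(convertType("x", "VARCHAR"));
       System.out.println(convertType("z", "COMPLEX<json>"));
     }
   }
   ```
   
   The check is case-insensitive, but the complex type name is returned exactly as written, as in the hunk.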


