Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/04/15 19:09:00 UTC

[GitHub] [incubator-iceberg] rdblue opened a new pull request #924: Add void transform that always produces null

rdblue opened a new pull request #924: Add void transform that always produces null
URL: https://github.com/apache/incubator-iceberg/pull/924
 
 
   This adds a new transform function, `void`, that always produces a null value. Because `void` and `null` are reserved in Java and cannot be used as method names, the `PartitionSpecBuilder` is configured using `alwaysNull`.
   
   The purpose of this transform is to be a stand-in for partition transforms that are removed from a spec. In the v1 table format, IDs for partition fields are not tracked by `PartitionSpec`. Instead, they are assigned starting at 1000 for each spec. Because tables may have more than one spec, manifest files could have incompatible partition field structs. This is not a problem for job planning because each manifest is read independently, but it can break metadata tables that show a union of all manifest data files or entries.
   
   The `void` transform can be used to avoid this problem with ID assignment. If a table has two partition fields, `1000: categorical string, 1001: ts_day int`, then removing the `categorical` partition will create a new partition spec with `1000: ts_day int`. ID 1000 would then refer to a string field in the old spec and an int field in the new one, which would create a problem in the metadata tables. Instead of deleting the categorical partition, it should be replaced with a `void` partition to keep the IDs aligned: `1000: always_null string, 1001: ts_day int`.
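
   The ID shift described above can be reproduced with a small standalone sketch. It does not use the Iceberg API: `PartitionIdSketch` and `assignIds` are hypothetical names, and the only behavior modeled is the assumption that a v1 spec numbers its partition fields sequentially from 1000.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionIdSketch {
  // Assumption for illustration: v1 specs do not store field IDs, so each
  // spec numbers its partition fields sequentially starting at 1000.
  public static final int FIRST_ID = 1000;

  // Hypothetical helper: maps each field description to its assigned ID.
  public static Map<Integer, String> assignIds(List<String> fields) {
    Map<Integer, String> assigned = new LinkedHashMap<>();
    for (int i = 0; i < fields.size(); i++) {
      assigned.put(FIRST_ID + i, fields.get(i));
    }
    return assigned;
  }

  public static void main(String[] args) {
    // Original spec with two partition fields.
    Map<Integer, String> original =
        assignIds(Arrays.asList("categorical string", "ts_day int"));

    // Dropping the first field shifts ts_day from ID 1001 to ID 1000.
    Map<Integer, String> removed =
        assignIds(Arrays.asList("ts_day int"));

    // Replacing it with a void field keeps ts_day at ID 1001.
    Map<Integer, String> replaced =
        assignIds(Arrays.asList("always_null string", "ts_day int"));

    System.out.println("original: " + original);
    System.out.println("removed:  " + removed);
    System.out.println("replaced: " + replaced);
  }
}
```

   In the `removed` case, ID 1000 changes from a string field to an int field between specs; in the `replaced` case, both the IDs and the types line up with the original spec.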

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [incubator-iceberg] danielcweeks commented on issue #924: Add void transform that always produces null

Posted by GitBox <gi...@apache.org>.
danielcweeks commented on issue #924: Add void transform that always produces null
URL: https://github.com/apache/incubator-iceberg/pull/924#issuecomment-614238463
 
 
   One minor comment, but +1 (pending checks)



[GitHub] [incubator-iceberg] danielcweeks commented on a change in pull request #924: Add void transform that always produces null

Posted by GitBox <gi...@apache.org>.
danielcweeks commented on a change in pull request #924: Add void transform that always produces null
URL: https://github.com/apache/incubator-iceberg/pull/924#discussion_r409084413
 
 

 ##########
 File path: api/src/main/java/org/apache/iceberg/transforms/VoidTransform.java
 ##########
 @@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.transforms;
+
+import java.io.ObjectStreamException;
+import org.apache.iceberg.expressions.BoundPredicate;
+import org.apache.iceberg.expressions.UnboundPredicate;
+import org.apache.iceberg.types.Type;
+
+class VoidTransform<S> implements Transform<S, Void> {
+  private static final VoidTransform<Object> INSTANCE = new VoidTransform<>();
+
+  @SuppressWarnings("unchecked")
+  static <T> VoidTransform<T> get() {
+    return (VoidTransform<T>) INSTANCE;
+  }
+
+  private VoidTransform() {
+  }
+
+  @Override
+  public Void apply(Object value) {
+    return null;
+  }
+
+  @Override
+  public boolean canTransform(Type type) {
+    return true;
+  }
+
+  @Override
+  public Type getResultType(Type sourceType) {
+    return sourceType;
+  }
+
+  @Override
+  public UnboundPredicate<Void> projectStrict(String name, BoundPredicate<S> predicate) {
+    return null;
+  }
+
+  @Override
+  public UnboundPredicate<Void> project(String name, BoundPredicate<S> predicate) {
+    return null;
+  }
+
+  @Override
+  public String toHumanString(Void value) {
+    return "null";
+  }
+
+  @Override
+  public String toString() {
+    return "void";
+  }
 
 Review comment:
   Minor nit here to think about: we seem to be using both `void` and `null`. Would it make more sense to consistently use `void`, since it seems to better indicate that there is no expected value?



[GitHub] [incubator-iceberg] rdblue commented on issue #924: Add void transform that always produces null

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #924: Add void transform that always produces null
URL: https://github.com/apache/incubator-iceberg/pull/924#issuecomment-614228381
 
 
   @jun-he, FYI. This is related to #922. Instead of deleting partitions in v1, we can replace them with the `void` transform.



[GitHub] [incubator-iceberg] rdblue commented on a change in pull request #924: Add void transform that always produces null

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #924: Add void transform that always produces null
URL: https://github.com/apache/incubator-iceberg/pull/924#discussion_r409093584
 
 

 ##########
 File path: api/src/main/java/org/apache/iceberg/transforms/VoidTransform.java
 ##########
 Review comment:
   `null` is the human-readable string for the value produced by the transform; `void` is the name of the transform. I considered naming it something like `always_null`, but `void` seemed shorter and less error-prone (was that `alwaysNull` or `always-null`?).
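
   A minimal sketch of that distinction, assuming a trimmed-down, hypothetical `NamedTransform` interface rather than the real `Transform` API (the names `VoidNamingSketch` and `VoidSketch` are also made up for illustration):

```java
public class VoidNamingSketch {
  // Hypothetical, reduced interface; the real Transform interface has more methods.
  public interface NamedTransform<S, T> {
    T apply(S value);

    String toHumanString(T value);
  }

  public static final class VoidSketch<S> implements NamedTransform<S, Void> {
    @Override
    public Void apply(S value) {
      // The transform discards its input and always yields null.
      return null;
    }

    @Override
    public String toHumanString(Void value) {
      // Human-readable form of the produced partition value.
      return "null";
    }

    @Override
    public String toString() {
      // Name of the transform itself.
      return "void";
    }
  }

  public static void main(String[] args) {
    VoidSketch<String> transform = new VoidSketch<>();
    Void result = transform.apply("electronics");
    System.out.println("transform name:  " + transform);
    System.out.println("partition value: " + transform.toHumanString(result));
  }
}
```

   The two strings answer different questions: `toString()` names the transform, while `toHumanString` describes the value it produced.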



[GitHub] [incubator-iceberg] rdblue merged pull request #924: Add void transform that always produces null

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #924: Add void transform that always produces null
URL: https://github.com/apache/incubator-iceberg/pull/924
 
 
   
