You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "ClownXC (via GitHub)" <gi...@apache.org> on 2023/03/13 15:39:00 UTC

[GitHub] [spark] ClownXC opened a new pull request, #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

ClownXC opened a new pull request, #40400:
URL: https://github.com/apache/spark/pull/40400

   What changes were proposed in this pull request?
   The main change of this pr is refactor UnsafeRow#isMutable  and UnsafeRow#isFixedLength  method to use PhysicalDataType instead of DataType.
   
   Why are the changes needed?
   Simplify type match.
   
   Does this PR introduce any user-facing change?
   No
   
   How was this patch tested?
   Existing UnsafeRowSuite


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on PR #40400:
URL: https://github.com/apache/spark/pull/40400#issuecomment-1479542197

   @cloud-fan


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1148743917


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala:
##########
@@ -19,48 +19,83 @@ package org.apache.spark.sql.catalyst.types
 
 import org.apache.spark.sql.types._
 
-sealed abstract class PhysicalDataType
-
-case class PhysicalArrayType(elementType: DataType, containsNull: Boolean) extends PhysicalDataType
-
-class PhysicalBinaryType() extends PhysicalDataType
+sealed abstract class PhysicalDataType{
+  private[sql] def isPrimitive: Boolean
+}
+
+case class PhysicalArrayType(elementType: DataType, containsNull: Boolean)
+  extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
+
+class PhysicalBinaryType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
 case object PhysicalBinaryType extends PhysicalBinaryType
 
-class PhysicalBooleanType() extends PhysicalDataType
+class PhysicalBooleanType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = true

Review Comment:
   looking at the code changes here, seems it's simpler to add a new trait `PhysicalPrimitiveType` instead of a `def isPrimitive`. Then the caller side can just do `pdt instanceof PhysicalPrimitiveType`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1148677615


##########
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java:
##########
@@ -73,48 +74,44 @@ public static int calculateBitSetWidthInBytes(int numFields) {
   /**
    * Field types that can be updated in place in UnsafeRows (e.g. we support set() for these types)
    */
-  public static final Set<DataType> mutableFieldTypes;
+  public static final Set<PhysicalDataType> mutableFieldTypes;
 
   // DecimalType, DayTimeIntervalType and YearMonthIntervalType are also mutable
   static {
     mutableFieldTypes = Collections.unmodifiableSet(
       new HashSet<>(
         Arrays.asList(
-          NullType,
-          BooleanType,
-          ByteType,
-          ShortType,
-          IntegerType,
-          LongType,
-          FloatType,
-          DoubleType,
-          DateType,
-          TimestampType,
-          TimestampNTZType
+          PhysicalNullType$.MODULE$,

Review Comment:
   > this looks ugly. Instead of having a Set, shall we just add a `def isPrimitive` in `PhysicalDataType`?
   done
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on PR #40400:
URL: https://github.com/apache/spark/pull/40400#issuecomment-1479542802

   @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40400:
URL: https://github.com/apache/spark/pull/40400#issuecomment-1488035365

   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1149320479


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala:
##########
@@ -19,48 +19,83 @@ package org.apache.spark.sql.catalyst.types
 
 import org.apache.spark.sql.types._
 
-sealed abstract class PhysicalDataType
-
-case class PhysicalArrayType(elementType: DataType, containsNull: Boolean) extends PhysicalDataType
-
-class PhysicalBinaryType() extends PhysicalDataType
+sealed abstract class PhysicalDataType{
+  private[sql] def isPrimitive: Boolean
+}
+
+case class PhysicalArrayType(elementType: DataType, containsNull: Boolean)
+  extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
+
+class PhysicalBinaryType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
 case object PhysicalBinaryType extends PhysicalBinaryType
 
-class PhysicalBooleanType() extends PhysicalDataType
+class PhysicalBooleanType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = true

Review Comment:
   Thank you for the code review comments,  I will try to modify the code as you say.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1151292679


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala:
##########
@@ -19,48 +19,83 @@ package org.apache.spark.sql.catalyst.types
 
 import org.apache.spark.sql.types._
 
-sealed abstract class PhysicalDataType
-
-case class PhysicalArrayType(elementType: DataType, containsNull: Boolean) extends PhysicalDataType
-
-class PhysicalBinaryType() extends PhysicalDataType
+sealed abstract class PhysicalDataType{
+  private[sql] def isPrimitive: Boolean
+}
+
+case class PhysicalArrayType(elementType: DataType, containsNull: Boolean)
+  extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
+
+class PhysicalBinaryType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
 case object PhysicalBinaryType extends PhysicalBinaryType
 
-class PhysicalBooleanType() extends PhysicalDataType
+class PhysicalBooleanType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = true

Review Comment:
   > looking at the code changes here, seems it's simpler to add a new trait `PhysicalPrimitiveType` instead of a `def isPrimitive`. Then the caller side can just do `pdt instanceof PhysicalPrimitiveType`
   
   I applied the suggested code changes. Can you please do another review?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on PR #40400:
URL: https://github.com/apache/spark/pull/40400#issuecomment-1479553553

   can you spare some precious time to review? Thanks very much @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1151465013


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala:
##########
@@ -19,43 +19,46 @@ package org.apache.spark.sql.catalyst.types
 
 import org.apache.spark.sql.types._
 
+
 sealed abstract class PhysicalDataType
 
+sealed abstract class PhysicalPrimitiveType extends PhysicalDataType

Review Comment:
   I think ordering needs to understand the semantic and should belong to the logical type.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1149359986


##########
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java:
##########
@@ -70,51 +66,25 @@ public static int calculateBitSetWidthInBytes(int numFields) {
     return ((numFields + 63)/ 64) * 8;
   }
 
-  /**
-   * Field types that can be updated in place in UnsafeRows (e.g. we support set() for these types)
-   */
-  public static final Set<DataType> mutableFieldTypes;
-
-  // DecimalType, DayTimeIntervalType and YearMonthIntervalType are also mutable
-  static {
-    mutableFieldTypes = Collections.unmodifiableSet(
-      new HashSet<>(
-        Arrays.asList(
-          NullType,
-          BooleanType,
-          ByteType,
-          ShortType,
-          IntegerType,
-          LongType,
-          FloatType,
-          DoubleType,
-          DateType,
-          TimestampType,
-          TimestampNTZType
-        )));
-  }
-
   public static boolean isFixedLength(DataType dt) {

Review Comment:
   Ok, Let me do it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.
amaliujia commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1151093946


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala:
##########
@@ -19,43 +19,46 @@ package org.apache.spark.sql.catalyst.types
 
 import org.apache.spark.sql.types._
 
+
 sealed abstract class PhysicalDataType
 
+sealed abstract class PhysicalPrimitiveType extends PhysicalDataType

Review Comment:
   Just for my own education: 
   
   what is the pros/cons of adding `class` versus `trait` to act as base and shared interface here? 
   
   I am thinking to move some DataType stuff to here and one example is the Ordering implementation which I thought to define as a trait but not a `PhysicalDataTypeWithOrdering` class
   
   ```
   trait Ordering {
     private[sql] type InternalType
     private[sql] val tag: TypeTag[InternalType]
     private[sql] val ordering: Ordering[InternalType]
   }
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala:
##########
@@ -19,43 +19,46 @@ package org.apache.spark.sql.catalyst.types
 
 import org.apache.spark.sql.types._
 
+
 sealed abstract class PhysicalDataType
 
+sealed abstract class PhysicalPrimitiveType extends PhysicalDataType

Review Comment:
   cc @cloud-fan 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow
URL: https://github.com/apache/spark/pull/40400


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1147177710


##########
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java:
##########
@@ -73,48 +74,44 @@ public static int calculateBitSetWidthInBytes(int numFields) {
   /**
    * Field types that can be updated in place in UnsafeRows (e.g. we support set() for these types)
    */
-  public static final Set<DataType> mutableFieldTypes;
+  public static final Set<PhysicalDataType> mutableFieldTypes;
 
   // DecimalType, DayTimeIntervalType and YearMonthIntervalType are also mutable
   static {
     mutableFieldTypes = Collections.unmodifiableSet(
       new HashSet<>(
         Arrays.asList(
-          NullType,
-          BooleanType,
-          ByteType,
-          ShortType,
-          IntegerType,
-          LongType,
-          FloatType,
-          DoubleType,
-          DateType,
-          TimestampType,
-          TimestampNTZType
+          PhysicalNullType$.MODULE$,

Review Comment:
   this looks ugly. Instead of having a Set, shall we just add a `def isPrimitive` in `PhysicalDataType`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1148743201


##########
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java:
##########
@@ -70,51 +66,25 @@ public static int calculateBitSetWidthInBytes(int numFields) {
     return ((numFields + 63)/ 64) * 8;
   }
 
-  /**
-   * Field types that can be updated in place in UnsafeRows (e.g. we support set() for these types)
-   */
-  public static final Set<DataType> mutableFieldTypes;
-
-  // DecimalType, DayTimeIntervalType and YearMonthIntervalType are also mutable
-  static {
-    mutableFieldTypes = Collections.unmodifiableSet(
-      new HashSet<>(
-        Arrays.asList(
-          NullType,
-          BooleanType,
-          ByteType,
-          ShortType,
-          IntegerType,
-          LongType,
-          FloatType,
-          DoubleType,
-          DateType,
-          TimestampType,
-          TimestampNTZType
-        )));
-  }
-
   public static boolean isFixedLength(DataType dt) {

Review Comment:
   since we are touching the code here, can we add doc for these 2 methods? It's not easy to tell what the difference between fixed length and mutable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1151292571


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/PhysicalDataType.scala:
##########
@@ -19,48 +19,83 @@ package org.apache.spark.sql.catalyst.types
 
 import org.apache.spark.sql.types._
 
-sealed abstract class PhysicalDataType
-
-case class PhysicalArrayType(elementType: DataType, containsNull: Boolean) extends PhysicalDataType
-
-class PhysicalBinaryType() extends PhysicalDataType
+sealed abstract class PhysicalDataType{
+  private[sql] def isPrimitive: Boolean
+}
+
+case class PhysicalArrayType(elementType: DataType, containsNull: Boolean)
+  extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
+
+class PhysicalBinaryType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = false
+}
 case object PhysicalBinaryType extends PhysicalBinaryType
 
-class PhysicalBooleanType() extends PhysicalDataType
+class PhysicalBooleanType() extends PhysicalDataType {
+  override def isPrimitive: Boolean = true

Review Comment:
   > looking at the code changes here, seems it's simpler to add a new trait `PhysicalPrimitiveType` instead of a `def isPrimitive`. Then the caller side can just do `pdt instanceof PhysicalPrimitiveType`
   
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] clownxc commented on a diff in pull request #40400: [SPARK-41359][SQL] Use `PhysicalDataType` instead of DataType in UnsafeRow

Posted by "clownxc (via GitHub)" <gi...@apache.org>.
clownxc commented on code in PR #40400:
URL: https://github.com/apache/spark/pull/40400#discussion_r1148469116


##########
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java:
##########
@@ -73,48 +74,44 @@ public static int calculateBitSetWidthInBytes(int numFields) {
   /**
    * Field types that can be updated in place in UnsafeRows (e.g. we support set() for these types)
    */
-  public static final Set<DataType> mutableFieldTypes;
+  public static final Set<PhysicalDataType> mutableFieldTypes;
 
   // DecimalType, DayTimeIntervalType and YearMonthIntervalType are also mutable
   static {
     mutableFieldTypes = Collections.unmodifiableSet(
       new HashSet<>(
         Arrays.asList(
-          NullType,
-          BooleanType,
-          ByteType,
-          ShortType,
-          IntegerType,
-          LongType,
-          FloatType,
-          DoubleType,
-          DateType,
-          TimestampType,
-          TimestampNTZType
+          PhysicalNullType$.MODULE$,

Review Comment:
   Thank you, I will try to modify the code as you say
   
   
   
   > this looks ugly. Instead of having a Set, shall we just add a `def isPrimitive` in `PhysicalDataType`?
   Thank you, I will try to modify the code as you say
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org