You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@carbondata.apache.org by GitBox <gi...@apache.org> on 2021/05/31 11:00:03 UTC
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4129: [CARBONDATA-4179] Support renaming of complex columns (array/struct)

Indhumathi27 commented on a change in pull request #4129:
URL: https://github.com/apache/carbondata/pull/4129#discussion_r640293211



##########
File path: core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
##########
@@ -186,26 +208,37 @@ public static boolean isColumnMatches(boolean isTransactionalTable,
   }
 
   /**
-   * In case of Multilevel Complex column - Struct/StructOfStruct, traverse all the child dimension
-   * to check column Id
+   * In case of Multilevel Complex column - Struct/StructOfStruct, traverse all the child dimensions
+   * of tableColumn to check if any of its column Id has matched with that of queryColumn .
    *
-   * @param tableColumn
-   * @param queryColumn
+   * @param tableColumn - column entity that is present in the table block or in the segment
+   *                      properties.
+   * @param queryColumn - column entity that is present in the fired query or in the query model.
+   * tableColumn name and queryColumn name may or may not be the same in case schema has evolved.
+   * Hence matching happens based on the column ID
    * @return
    */
   private static boolean isColumnMatchesStruct(CarbonColumn tableColumn, CarbonColumn queryColumn) {
     if (tableColumn instanceof CarbonDimension) {
-      List<CarbonDimension> parentDimension =
+      List<CarbonDimension> childrenDimensions =
           ((CarbonDimension) tableColumn).getListOfChildDimensions();
-      CarbonDimension carbonDimension = null;
+      CarbonDimension carbonDimension;
       String[] colSplits = queryColumn.getColName().split("\\.");
       StringBuffer tempColName = new StringBuffer(colSplits[0]);
       for (String colSplit : colSplits) {
         if (!tempColName.toString().equalsIgnoreCase(colSplit)) {
-          tempColName = tempColName.append(".").append(colSplit);
+          tempColName = tempColName.append(CarbonCommonConstants.POINT).append(colSplit);
         }
-        carbonDimension = CarbonTable.getCarbonDimension(tempColName.toString(), parentDimension);
-        if (carbonDimension != null) {
+        carbonDimension =
+            CarbonTable.getCarbonDimension(tempColName.toString(), childrenDimensions);
+        if (carbonDimension == null) {
+          // Avoid returning true in case of SDK as the column name contains the id.
+          if (existingTableColumnIDMap.containsKey(queryColumn.getColumnId())
+              && !existingTableColumnIDMap.get(queryColumn.getColumnId())
+              .contains(queryColumn.getColumnId())) {

Review comment:
       should the validation should be with tablecolumn here?

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/catalyst/CarbonParserUtil.scala
##########
@@ -1132,10 +1135,60 @@ object CarbonParserUtil {
         } else if (scale < 0 || scale > 38) {
           throw new MalformedCarbonCommandException("Invalid value for scale")
         }
-        DataTypeInfo("decimal", precision, scale)
+        DataTypeInfo(columnName, "decimal", precision, scale)
+      case _ =>
+        DataTypeInfo(columnName,
+          DataTypeConverterUtil.convertToCarbonType(dataType).getName.toLowerCase)
+    }
+  }
+
+  /**
+   * This method will return the instantiated DataTypeInfo by parsing the column
+   */
+  def parseColumn(columnName: String, dataType: DataType,
+      values: Option[List[(Int, Int)]]): DataTypeInfo = {
+    // creates parent dataTypeInfo first
+    val dataTypeInfo = CarbonParserUtil.parseDataType(
+      columnName,
+      DataTypeConverterUtil
+        .convertToCarbonType(dataType.typeName)
+        .getName
+        .toLowerCase,
+      values)
+    // check which child type is present and create children dataTypeInfo accordingly
+    dataType match {
+      case arrayType: ArrayType =>
+        val childType: DataType = arrayType.elementType
+        val childName = columnName + ".val"
+        val childValues = childType match {
+          case d: DecimalType => Some(List((d.precision, d.scale)))
+          case _ => None
+        }
+        var childTypeInfoList: List[DataTypeInfo] = null
+        val childDatatypeInfo = parseColumn(childName, childType, childValues)
+        childTypeInfoList = List(childDatatypeInfo)
+        dataTypeInfo.setChildren(childTypeInfoList)
+      case childrenTypeList: StructType =>

Review comment:
       change to structType

##########
File path: docs/ddl-of-carbondata.md
##########
@@ -841,11 +843,23 @@ Users can specify which columns to include and exclude for local dictionary gene
      ```
      ALTER TABLE test_db.carbon CHANGE a3 a4 STRING
      ```
-     Example3:Change column a3's comment to "col_comment".
+     Example4:Change column a3's comment to "col_comment".
      
      ```
      ALTER TABLE test_db.carbon CHANGE a3 a3 STRING COMMENT 'col_comment'
      ```
+     
+     Example5:Change child column name in column: structField struct\<age:int> from age to id.

Review comment:
       ```suggestion
        Example5: Change child column name in column: structField struct<age:int> from age to id.
   ```
   same for below

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -143,27 +141,49 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
       val newColumnPrecision = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.precision
       val newColumnScale = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.scale
       // set isDataTypeChange flag
-      if (oldCarbonColumn.head.getDataType.getName
-        .equalsIgnoreCase(alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.dataType)) {
+      val oldDatatype = oldCarbonColumn.head.getDataType
+      val newDatatype = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.dataType
+      if (isColumnRename && (DataTypes.isMapType(oldDatatype) ||
+                             newDatatype.equalsIgnoreCase(CarbonCommonConstants.MAP))) {
+        throw new UnsupportedOperationException(
+          "Alter rename is unsupported for Map datatype column")
+      }
+      if (oldDatatype.getName.equalsIgnoreCase(newDatatype)) {
         val newColumnPrecision =
           alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.precision
         val newColumnScale = alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.scale
         // if the source datatype is decimal and there is change in precision and scale, then
         // along with rename, datatype change is also required for the command, so set the
         // isDataTypeChange flag to true in this case
-        if (oldCarbonColumn.head.getDataType.getName.equalsIgnoreCase("decimal") &&
-            (oldCarbonColumn.head.getDataType.asInstanceOf[DecimalType].getPrecision !=
+        if (DataTypes.isDecimal(oldDatatype) &&
+            (oldDatatype.asInstanceOf[DecimalType].getPrecision !=
              newColumnPrecision ||
-             oldCarbonColumn.head.getDataType.asInstanceOf[DecimalType].getScale !=
+             oldDatatype.asInstanceOf[DecimalType].getScale !=
              newColumnScale)) {
           isDataTypeChange = true
         }
+        if (DataTypes.isArrayType(oldDatatype) || DataTypes.isStructType(oldDatatype)) {
+          val oldParent = oldCarbonColumn.head
+          val oldChildren = oldParent.asInstanceOf[CarbonDimension].getListOfChildDimensions.asScala
+            .toList
+          AlterTableUtil.validateComplexStructure(oldChildren,
+            alterTableColRenameAndDataTypeChangeModel.dataTypeInfo.getChildren(),
+            alteredColumnNamesMap)
+        }
       } else {
+        if (oldDatatype.isComplexType ||
+            newDatatype.equalsIgnoreCase(CarbonCommonConstants.ARRAY) ||
+            newDatatype.equalsIgnoreCase(CarbonCommonConstants.STRUCT)) {
+          throw new UnsupportedOperationException(
+            "Old and new complex columns are not compatible in structure")
+        }
         isDataTypeChange = true
       }
+

Review comment:
       can revert this change

##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
##########
@@ -83,6 +75,9 @@ private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
     childTableColumnRename: Boolean = false)
   extends CarbonAlterTableColumnRenameCommand(alterTableColRenameAndDataTypeChangeModel.columnName,
     alterTableColRenameAndDataTypeChangeModel.newColumnName) {
+  // stores mapping of altered column names: old-column-name -> new-column-name.
+  // Including both parent/table and children columns

Review comment:
       here, you mean parent column?

##########
File path: core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
##########
@@ -50,6 +52,10 @@
  * Utility class for restructuring
  */
 public class RestructureUtil {
+  // if table column is of complex type- this look up stores the column id of the parent and their
+  // children. This helps to determine the existence of incoming query column by matching based
+  // on id.
+  private static Map<String, String> existingTableColumnIDMap = new HashMap<>();

Review comment:
       then, can remove initialization from line:58




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org