You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/10/12 06:27:05 UTC

[GitHub] [carbondata] Karan980 opened a new pull request #3979: [WIP] Fix insertion from ORC table into carbon table when sort scope is global sort

Karan980 opened a new pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979


   ### Why is this PR needed?
   Loading records from ORC table having null values into carbon table having sort scope as global sort gives NPE.
    
    ### What changes were proposed in this PR?
   Added null check for arrayType and mapType before writing the data into byteArrays
   
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714704138


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4641/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Karan980 commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
Karan980 commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503140543



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -1011,38 +1011,46 @@ object CommonUtil {
       objectDataType: DataType): AnyRef = {
     objectDataType match {
       case _: ArrayType =>
-        val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-        val arrayData = data.asInstanceOf[UnsafeArrayData]
-        val size = arrayData.numElements()
-        val childDataType = arrayDataType.elementType
-        val arrayChildObjects = new Array[AnyRef](size)
-        var i = 0
-        while (i < size) {
-          arrayChildObjects(i) = convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-            childDataType), childDataType)
-          i = i + 1
+        if (data == null) {

Review comment:
       For dateType if data is null that is handled by some other way which is not returning null. That's why i didn't put it after line 1011




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-708398610


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4436/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713438976


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4569/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r509038981



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -1009,6 +1009,10 @@ object CommonUtil {
 
   private def convertSparkComplexTypeToCarbonObject(data: AnyRef,
       objectDataType: DataType): AnyRef = {
+    if (data == null && (objectDataType.isInstanceOf[ArrayType]

Review comment:
       ok




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713450581


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4580/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-708407338


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2683/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] asfgit closed pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-715183340


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2901/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Karan980 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
Karan980 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713353614


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Karan980 commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
Karan980 commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r509000503



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala
##########
@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+    sql("create table TORCSource(name string,col array<String>,fee int) STORED AS orc")
+    sql("insert into TORCSource values('karan',null,2)")
+    sql("create table TCarbon(name string, col array<String>,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+    sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+    val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+    if(!"Success".equalsIgnoreCase(result)) {

Review comment:
       Done

##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -1011,38 +1011,46 @@ object CommonUtil {
       objectDataType: DataType): AnyRef = {
     objectDataType match {
       case _: ArrayType =>
-        val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-        val arrayData = data.asInstanceOf[UnsafeArrayData]
-        val size = arrayData.numElements()
-        val childDataType = arrayDataType.elementType
-        val arrayChildObjects = new Array[AnyRef](size)
-        var i = 0
-        while (i < size) {
-          arrayChildObjects(i) = convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-            childDataType), childDataType)
-          i = i + 1
+        if (data == null) {

Review comment:
       Done

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala
##########
@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+    sql("create table TORCSource(name string,col array<String>,fee int) STORED AS orc")
+    sql("insert into TORCSource values('karan',null,2)")
+    sql("create table TCarbon(name string, col array<String>,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+    sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+    val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+    if(!"Success".equalsIgnoreCase(result)) {
+      assert(false)
+    }
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+  }
+

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713375299


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2808/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ydvpankaj99 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ydvpankaj99 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714376464


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ydvpankaj99 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ydvpankaj99 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713229757


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-706972701


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2622/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-706985772


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4372/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503097655



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -1011,38 +1011,46 @@ object CommonUtil {
       objectDataType: DataType): AnyRef = {
     objectDataType match {
       case _: ArrayType =>
-        val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-        val arrayData = data.asInstanceOf[UnsafeArrayData]
-        val size = arrayData.numElements()
-        val childDataType = arrayDataType.elementType
-        val arrayChildObjects = new Array[AnyRef](size)
-        var i = 0
-        while (i < size) {
-          arrayChildObjects(i) = convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-            childDataType), childDataType)
-          i = i + 1
+        if (data == null) {

Review comment:
       after line 1011, add a check that if data type is array or struct or map and the data is null, return null to avoid changing many lines.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712748032


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2778/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503096192



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala
##########
@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+    sql("create table TORCSource(name string,col array<String>,fee int) STORED AS orc")
+    sql("insert into TORCSource values('karan',null,2)")
+    sql("create table TCarbon(name string, col array<String>,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+    sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+    val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+    if(!"Success".equalsIgnoreCase(result)) {

Review comment:
       please do a select query and compare orc and carbon table results 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Karan980 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
Karan980 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713415461


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713451630


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4583/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713451826


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2832/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712853433


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2782/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r509026576



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -1009,6 +1009,10 @@ object CommonUtil {
 
   private def convertSparkComplexTypeToCarbonObject(data: AnyRef,
       objectDataType: DataType): AnyRef = {
+    if (data == null && (objectDataType.isInstanceOf[ArrayType]

Review comment:
       check with  objectDataType.isComplexType




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713448948


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4574/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714443055


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4625/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Karan980 commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
Karan980 commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r509032391



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -1009,6 +1009,10 @@ object CommonUtil {
 
   private def convertSparkComplexTypeToCarbonObject(data: AnyRef,
       objectDataType: DataType): AnyRef = {
+    if (data == null && (objectDataType.isInstanceOf[ArrayType]

Review comment:
       DataType is a spark class here. It doesn't have isComplexType method.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714443695


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2870/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Karan980 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
Karan980 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713449806


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714709707


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2885/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] Karan980 commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
Karan980 commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503139533



##########
File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##########
@@ -1011,38 +1011,46 @@ object CommonUtil {
       objectDataType: DataType): AnyRef = {
     objectDataType match {
       case _: ArrayType =>
-        val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-        val arrayData = data.asInstanceOf[UnsafeArrayData]
-        val size = arrayData.numElements()
-        val childDataType = arrayDataType.elementType
-        val arrayChildObjects = new Array[AnyRef](size)
-        var i = 0
-        while (i < size) {
-          arrayChildObjects(i) = convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-            childDataType), childDataType)
-          i = i + 1
+        if (data == null) {

Review comment:
       For dateType if data is null that is handled by some other way which is not returning null. That's why i didn't put it after line 1011




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712854016


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4536/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712395039






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713437666


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2817/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-715211651


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] QiangCai commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
QiangCai commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-714860572


   please fix scala style issue


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713303716


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4558/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-713449259


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2825/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-715139071


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4657/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503099172



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala
##########
@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+    sql("create table TORCSource(name string,col array<String>,fee int) STORED AS orc")
+    sql("insert into TORCSource values('karan',null,2)")
+    sql("create table TCarbon(name string, col array<String>,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+    sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+    val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+    if(!"Success".equalsIgnoreCase(result)) {
+      assert(false)
+    }
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+  }
+

Review comment:
       please handle and verify 4 scenarios by comparing with ORC.
   a) local sort insert with complex type null value data
   b) global sort insert with complex type null value data
   c) same point a) with `carbon.enable.bad.record.handling.for.insert `as `true`
   d) same point b) with `carbon.enable.bad.record.handling.for.insert` as `true`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] QiangCai commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
QiangCai commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712799969


   retest this please


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712748210


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4532/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org