
Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/08/27 08:32:00 UTC

[GitHub] [carbondata] nihal0107 opened a new pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

nihal0107 opened a new pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905


    ### Why is this PR needed?
    With 1 million records and 500 segments, a select query without a filter throws a null pointer exception.
    
    ### What changes were proposed in this PR?
   A select query without a filter should execute the pruneWithoutFilter method rather than pruneWithMultiThread. Added a null check for the filter.
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-685923202


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3965/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688655958


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2256/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-681868472


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3888/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688820071


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2261/
   



[GitHub] [carbondata] akashrn5 commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484445743



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,47 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")

Review comment:
       Add a comment in both test cases stating which timezone they use and how DST applies, for the future reference of other developers.





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

ShreelekhyaG commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484802588



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,53 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
     sql("DROP TABLE test_time")
     CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
   }
 
   test("test load, update data with setlenient session level property for daylight " +
        "saving time from different timezone") {
     sql("set carbon.load.dateformat.setlenient.enable = true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
     sql("DROP TABLE test_time")
     defaultConfig()
   }
 
   def generateCSVFile(): Unit = {
     val rows = new ListBuffer[Array[String]]
     rows += Array("ID", "date", "time")
-    rows += Array("1", "1941-3-15", "1941-3-15 00:00:00")
+    rows += Array("1", "1941-3-15", "2019-3-10 02:00:00")
     rows += Array("2", "2016-7-24", "2016-7-24 01:02:30")
     BadRecordUtil.createCSV(rows, csvPath)
   }
 
   override def afterAll {
     sql("DROP TABLE IF EXISTS t3")
     FileUtils.forceDelete(new File(csvPath))
-    TimeZone.setDefault(defaultTimeZone)
+    CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
+    defaultConfig()

Review comment:
       Done

##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,65 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
+    sql("DROP TABLE IF EXISTS testhivetable")
+    // Create test_time and hive table
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
-    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql("CREATE TABLE testhivetable (ID Int, date Date, time TIMESTAMP) row format delimited fields terminated by ',' ")
+    // load data into test_time and hive table and validate query result
+    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time options('fileheader'='ID,date,time')")
+    sql(s"LOAD DATA local inpath '$resourcesPath/differentZoneTimeStamp.csv' overwrite INTO table testhivetable")
+    checkAnswer(sql("select * from test_time"), sql("select * from testhivetable"))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
     sql("DROP TABLE test_time")
     CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
   }
 
   test("test load, update data with setlenient session level property for daylight " +
        "saving time from different timezone") {
     sql("set carbon.load.dateformat.setlenient.enable = true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
-    sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
+    sql("DROP TABLE IF EXISTS testhivetable")
+    // Create test_time and hive table
+    sql("CREATE TABLE test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
-    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql("CREATE TABLE testhivetable (ID Int, date Date, time TIMESTAMP) row format delimited fields terminated by ',' ")
+    // load data into test_time and hive table and validate query result
+    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time options('fileheader'='ID,date,time')")
+    sql(s"LOAD DATA local inpath '$resourcesPath/differentZoneTimeStamp.csv' overwrite INTO table testhivetable")
+    checkAnswer(sql("select * from test_time"), sql("select * from testhivetable"))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    sql("DROP TABLE testhivetable")
     sql("DROP TABLE test_time")
-    defaultConfig()
+    sql("set carbon.load.dateformat.setlenient.enable = false")
   }
 
   def generateCSVFile(): Unit = {
     val rows = new ListBuffer[Array[String]]
-    rows += Array("ID", "date", "time")
-    rows += Array("1", "1941-3-15", "1941-3-15 00:00:00")
+    rows += Array("1", "1941-3-15", "2019-3-10 02:00:00")
     rows += Array("2", "2016-7-24", "2016-7-24 01:02:30")
     BadRecordUtil.createCSV(rows, csvPath)
   }
 
   override def afterAll {
     sql("DROP TABLE IF EXISTS t3")
     FileUtils.forceDelete(new File(csvPath))
-    TimeZone.setDefault(defaultTimeZone)
+    CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
+    sql("set carbon.load.dateformat.setlenient.enable = false")
   }
-}
+}

Review comment:
       Done
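
For readers unfamiliar with the DST gap described in the test comments above, here is a minimal Java sketch. It assumes the setlenient property corresponds to lenient parsing in `java.text.SimpleDateFormat`; this thread does not confirm how CarbonData implements it. The class `DstLenientSketch` and the helper `parseAndFormat` are hypothetical names used only for illustration. The formatter is pinned to America/Los_Angeles explicitly instead of relying on the JVM default.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical illustration class; not part of CarbonData.
final class DstLenientSketch {

  // Parses `text` in the America/Los_Angeles zone with the given leniency and
  // formats the result back in the same zone. Returns null if parsing is rejected.
  static String parseAndFormat(String text, boolean lenient) {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    format.setTimeZone(TimeZone.getTimeZone("America/Los_Angeles"));
    format.setLenient(lenient);
    try {
      return format.format(format.parse(text));
    } catch (ParseException e) {
      // In strict mode a wall-clock time inside the DST gap is not a valid time.
      return null;
    }
  }

  public static void main(String[] args) {
    // 02:00:00 on 2019-03-10 falls in the spring-forward gap in this zone.
    System.out.println("lenient: " + parseAndFormat("2019-3-10 02:00:00", true));
    System.out.println("strict:  " + parseAndFormat("2019-3-10 02:00:00", false));
    // A time outside the gap parses the same way in both modes.
    System.out.println("no gap:  " + parseAndFormat("2016-7-24 01:02:30", false));
  }
}
```

With lenient parsing the nonexistent 02:00:00 is shifted past the gap, which matches the 03:00:00 value the tests expect. Strict parsing rejects the same input.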





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688747092


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2259/
   



[GitHub] [carbondata] akashrn5 commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484752262



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,65 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
+    sql("DROP TABLE IF EXISTS testhivetable")
+    // Create test_time and hive table
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
-    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql("CREATE TABLE testhivetable (ID Int, date Date, time TIMESTAMP) row format delimited fields terminated by ',' ")
+    // load data into test_time and hive table and validate query result
+    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time options('fileheader'='ID,date,time')")
+    sql(s"LOAD DATA local inpath '$resourcesPath/differentZoneTimeStamp.csv' overwrite INTO table testhivetable")
+    checkAnswer(sql("select * from test_time"), sql("select * from testhivetable"))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
     sql("DROP TABLE test_time")
     CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
   }
 
   test("test load, update data with setlenient session level property for daylight " +
        "saving time from different timezone") {
     sql("set carbon.load.dateformat.setlenient.enable = true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
-    sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
+    sql("DROP TABLE IF EXISTS testhivetable")
+    // Create test_time and hive table
+    sql("CREATE TABLE test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
-    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql("CREATE TABLE testhivetable (ID Int, date Date, time TIMESTAMP) row format delimited fields terminated by ',' ")
+    // load data into test_time and hive table and validate query result
+    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time options('fileheader'='ID,date,time')")
+    sql(s"LOAD DATA local inpath '$resourcesPath/differentZoneTimeStamp.csv' overwrite INTO table testhivetable")
+    checkAnswer(sql("select * from test_time"), sql("select * from testhivetable"))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    sql("DROP TABLE testhivetable")
     sql("DROP TABLE test_time")
-    defaultConfig()
+    sql("set carbon.load.dateformat.setlenient.enable = false")
   }
 
   def generateCSVFile(): Unit = {
     val rows = new ListBuffer[Array[String]]
-    rows += Array("ID", "date", "time")
-    rows += Array("1", "1941-3-15", "1941-3-15 00:00:00")
+    rows += Array("1", "1941-3-15", "2019-3-10 02:00:00")
     rows += Array("2", "2016-7-24", "2016-7-24 01:02:30")
     BadRecordUtil.createCSV(rows, csvPath)
   }
 
   override def afterAll {
     sql("DROP TABLE IF EXISTS t3")
     FileUtils.forceDelete(new File(csvPath))
-    TimeZone.setDefault(defaultTimeZone)
+    CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
+    sql("set carbon.load.dateformat.setlenient.enable = false")
   }
-}
+}

Review comment:
       Add a new line at the end of the file.





[GitHub] [carbondata] asfgit closed pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.

Posted by GitBox <gi...@apache.org>.

asfgit closed pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905


   



[GitHub] [carbondata] nihal0107 removed a comment on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

nihal0107 removed a comment on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-684215300


   We get this exception with more than 0.1 million records. We can't load that much data for the test case.



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-685923980


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2225/
   



[GitHub] [carbondata] nihal0107 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

nihal0107 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-684215300


   We get this exception with more than 0.1 million records. We can't load that much data for the test case.



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688748633


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3999/
   



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688652826


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3996/
   



[GitHub] [carbondata] akashrn5 commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484445218



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -153,7 +153,7 @@ public CarbonTable getTable() {
     int carbonDriverPruningMultiThreadEnableFilesCount =
         CarbonProperties.getDriverPruningMultiThreadEnableFilesCount();
     if (numOfThreadsForPruning == 1 || indexesCount < numOfThreadsForPruning || totalFiles
-            < carbonDriverPruningMultiThreadEnableFilesCount) {
+            < carbonDriverPruningMultiThreadEnableFilesCount || !isFilterPresent) {

Review comment:
       Please add a comment here saying that when the query has no filter, we need to return all the blocklets, so multi-threaded pruning is not needed.
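
To illustrate the condition under review, here is a minimal sketch; it is not the actual TableIndex code. The class name, the helper name useSingleThreadedPruning, and the sample values in main are assumptions made for illustration. Only the boolean condition itself is taken from the diff above.

```java
class PruneStrategySketch {
    // Hypothetical helper mirroring the condition in the diff above.
    // Returns true when the driver should skip the multi-threaded pruning path.
    static boolean useSingleThreadedPruning(int numOfThreadsForPruning,
                                            int indexesCount,
                                            int totalFiles,
                                            int multiThreadEnableFilesCount,
                                            boolean isFilterPresent) {
        return numOfThreadsForPruning == 1
            || indexesCount < numOfThreadsForPruning
            || totalFiles < multiThreadEnableFilesCount
            // Without a filter all blocklets are returned, so there is
            // nothing to gain from pruning in multiple threads.
            || !isFilterPresent;
    }

    public static void main(String[] args) {
        // Sample values are made up; they only exercise the filter flag.
        System.out.println(useSingleThreadedPruning(4, 500, 200000, 100000, false));
        System.out.println(useSingleThreadedPruning(4, 500, 200000, 100000, true));
    }
}
```

With these sample values the thread and file-count checks all favour multi-threaded pruning, so the result depends only on whether a filter is present.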





[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

ShreelekhyaG commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484709973



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,47 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")

Review comment:
       ok done





[GitHub] [carbondata] nihal0107 commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

nihal0107 commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484645647



##########
File path: core/src/main/java/org/apache/carbondata/core/index/TableIndex.java
##########
@@ -153,7 +153,7 @@ public CarbonTable getTable() {
     int carbonDriverPruningMultiThreadEnableFilesCount =
         CarbonProperties.getDriverPruningMultiThreadEnableFilesCount();
     if (numOfThreadsForPruning == 1 || indexesCount < numOfThreadsForPruning || totalFiles
-            < carbonDriverPruningMultiThreadEnableFilesCount) {
+            < carbonDriverPruningMultiThreadEnableFilesCount || !isFilterPresent) {

Review comment:
       added.





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688819393


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4001/
   



[GitHub] [carbondata] nihal0107 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

nihal0107 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-682327952


   retest this please.



[GitHub] [carbondata] nihal0107 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.

Posted by GitBox <gi...@apache.org>.

nihal0107 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688802174


   > @nihal0107 this PR also contains some test case fixes. Please add those changes to the PR description and title; you can shorten the title, it doesn't need to be so long.
   
   Updated the PR description and title.



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-682366329


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3900/
   



[GitHub] [carbondata] akashrn5 commented on pull request #3905: [CARBONDATA-3964] Fixed, null pointer excption for select query and time zone dependent test failures.

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688866461


   LGTM



[GitHub] [carbondata] akashrn5 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688769415


   @nihal0107 this PR also contains some test case fixes. Please add those changes to the PR description and title; you can shorten the title, it doesn't need to be so long.



[GitHub] [carbondata] akashrn5 commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484649115



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,53 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))

Review comment:
       I feel that in this test case you could also create a Hive table and compare the results against it, so that the two stay in sync.
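
The DST behaviour described in the code comments of the hunk above can be reproduced outside CarbonData with the JDK alone. The following is a minimal sketch using java.text.SimpleDateFormat; the class name and the helper parseAndFormat are hypothetical, and the sketch assumes the JDK's lenient flag is a fair stand-in for the setlenient property, which this thread does not confirm.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

class LenientDstSketch {
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Hypothetical helper: parses the text in the given zone and formats the
    // parsed instant back in the same zone. Returns null if parsing is rejected.
    static String parseAndFormat(String text, String zoneId, boolean lenient) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        format.setLenient(lenient);
        try {
            return format.format(format.parse(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String zone = "America/Los_Angeles";
        // 02:00:00 on 2019-03-10 falls in the DST gap for this zone.
        System.out.println("lenient: " + parseAndFormat("2019-03-10 02:00:00", zone, true));
        System.out.println("strict:  " + parseAndFormat("2019-03-10 02:00:00", zone, false));
    }
}
```

In lenient mode the wall-clock time inside the gap is moved forward to 03:00:00, which matches the value the updated test expects; in strict mode the same input is rejected and the helper returns null.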





[GitHub] [carbondata] nihal0107 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

nihal0107 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-684216370


   > @nihal0107 , can a test case be added for your fix?
   
   We only get this exception with more than 0.1 million records. We can't load that amount of data in a test case.



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-682366182


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2159/
   



[GitHub] [carbondata] vikramahuja1001 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

vikramahuja1001 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-683869471


   @nihal0107 , can a test case be added for your fix?



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-681874973


   Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2147/
   



[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

ShreelekhyaG commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484710316



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,53 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))

Review comment:
       added





[GitHub] [carbondata] akashrn5 commented on a change in pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

akashrn5 commented on a change in pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#discussion_r484521356



##########
File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##########
@@ -318,48 +318,53 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
   test("test load, update data with setlenient carbon property for daylight " +
        "saving time from different timezone") {
     CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE, "true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
     sql("DROP TABLE test_time")
     CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
   }
 
   test("test load, update data with setlenient session level property for daylight " +
        "saving time from different timezone") {
     sql("set carbon.load.dateformat.setlenient.enable = true")
-    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
     sql("DROP TABLE IF EXISTS test_time")
     sql("CREATE TABLE IF NOT EXISTS test_time (ID Int, date Date, time Timestamp) STORED AS carbondata " +
         "TBLPROPERTIES('dateformat'='yyyy-MM-dd', 'timestampformat'='yyyy-MM-dd HH:mm:ss') ")
     sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/differentZoneTimeStamp.csv' into table test_time")
-    sql(s"insert into test_time select 11, '2016-7-24', '1941-3-15 00:00:00' ")
-    sql("update test_time set (time) = ('1941-3-15 00:00:00') where ID='2'")
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
-    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))))
+    sql(s"insert into test_time select 11, '2016-7-24', '2019-3-10 02:00:00' ")
+    sql("update test_time set (time) = ('2019-3-10 02:00:00') where ID='2'")
+    // Using America/Los_Angeles timezone (timezone is fixed to America/Los_Angeles for all tests)
+    // Here, 2019-3-10 02:00:00 is invalid data in America/Los_Angeles zone, as DST is observed and
+    // clocks were turned forward 1 hour to 2019-3-10 03:00:00. With lenience property enabled, can parse the time according to DST.
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 1"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 11"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
+    checkAnswer(sql("SELECT time FROM test_time WHERE ID = 2"), Seq(Row(Timestamp.valueOf("2019-3-10 03:00:00"))))
     sql("DROP TABLE test_time")
     defaultConfig()
   }
 
   def generateCSVFile(): Unit = {
     val rows = new ListBuffer[Array[String]]
     rows += Array("ID", "date", "time")
-    rows += Array("1", "1941-3-15", "1941-3-15 00:00:00")
+    rows += Array("1", "1941-3-15", "2019-3-10 02:00:00")
     rows += Array("2", "2016-7-24", "2016-7-24 01:02:30")
     BadRecordUtil.createCSV(rows, csvPath)
   }
 
   override def afterAll {
     sql("DROP TABLE IF EXISTS t3")
     FileUtils.forceDelete(new File(csvPath))
-    TimeZone.setDefault(defaultTimeZone)
+    CarbonProperties.getInstance().removeProperty(CarbonCommonConstants.CARBON_LOAD_DATEFORMAT_SETLENIENT_ENABLE)
+    defaultConfig()

Review comment:
       Why do we need to call defaultConfig() here? You have only set lenient to true, so reverting just that should be enough, right?





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3905: [CARBONDATA-3964] Fixed null pointer excption for select * and select count(*) without filter.

Posted by GitBox <gi...@apache.org>.

CarbonDataQA1 commented on pull request #3905:
URL: https://github.com/apache/carbondata/pull/3905#issuecomment-688442809





