You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/05/30 02:57:52 UTC

[GitHub] [carbondata] kevinjmh opened a new pull request #3780: [HOTFIX] Fix carbon store path

kevinjmh opened a new pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780


    ### Why is this PR needed?
    In the case of using carbon only setting `carbon.storelocation`, carbon will use local spark warehouse path instead of the configured value.
    
    ### What changes were proposed in this PR?
   `spark.sql.warehouse.dir` has its own default value in Spark, do not use `carbonStorePath` as default value, which will make `hiveStorePath.equals(carbonStorePath)` TRUE when user not set `spark.sql.warehouse.dir`
       
    ### Does this PR introduce any user interface change?
    - No
    - Yes. (please explain the change and update document)
   
    ### Is any new testcase added?
    - No
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kevinjmh commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kevinjmh commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432938670



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       no need. Spark conf has default value




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kevinjmh commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kevinjmh commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432951870



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       I think  he can try adding an empty carbon style OPTIONS in the create statement :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3780: [HOTFIX] Fix carbon store path

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#issuecomment-636278570


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1372/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kevinjmh commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kevinjmh commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432950129



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       In my case, a vanilla apache version of spark on yarn, the default warehouse location is under the local working directory where the thrift server launch. 
   
   If you check spark code, the default value of `spark.sql.warehouse.dir` barely comes from Java URI implementation.
   
   https://github.com/apache/spark/blob/6c792a79c10e7b01bd040ef14c848a2a2378e28c/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala#L33-L37
   
   https://github.com/apache/spark/blob/47dc332258bec20c460f666de50d9a8c5c0fbc0a/core/src/main/scala/org/apache/spark/util/Utils.scala#L1976
   
   ![image](https://user-images.githubusercontent.com/3809732/83354246-fab16c00-a389-11ea-96ca-48e28ebe799c.png)
   
   
   the problem of PR #3675 is : 
   1. carbon store path is not taken effect, default database warehouse URI (local fs, file://...) is used. <---  this pr fix this
   2. the scheme of default database warehouse URI is removed by `FileFactiry.getUpdatedFilePath` in `CarbonEnv.getDatabaseLocation`. And this is not an absolute URI string, so spark will make 
    qualified path in `SessionCatalog#createTable`. <--  #3675 fix this




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] QiangCai commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
QiangCai commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432938586



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       need check hiveStorePath  is null or not




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kevinjmh commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kevinjmh commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432950129



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       the problem of that PR need checking of the cluster setting. I haven't met the case local path is added hdfs prefix unconsciously.
   In my case, a vanilla apache version of spark on yarn, the default warehouse location is under the local working directory where the thrift server launch. 
   
   If you check spark code, the default value of `spark.sql.warehouse.dir` barely comes from Java URI implementation.
   https://github.com/apache/spark/blob/6c792a79c10e7b01bd040ef14c848a2a2378e28c/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala#L33-L37
   
   https://github.com/apache/spark/blob/47dc332258bec20c460f666de50d9a8c5c0fbc0a/core/src/main/scala/org/apache/spark/util/Utils.scala#L1976
   
   ![image](https://user-images.githubusercontent.com/3809732/83354246-fab16c00-a389-11ea-96ca-48e28ebe799c.png)
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] QiangCai commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
QiangCai commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432938586



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       need check hivestorePath is null or not




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] QiangCai commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
QiangCai commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432944937



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       ok, can you check this pr: https://github.com/apache/carbondata/pull/3675




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kevinjmh commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kevinjmh commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432951870



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       I think  he can try adding an empty carbon style option TBLPROPERTIES in the create statement :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] QiangCai commented on pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
QiangCai commented on pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#issuecomment-636582665


   LGTM


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kevinjmh commented on a change in pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kevinjmh commented on a change in pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#discussion_r432951870



##########
File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
##########
@@ -328,7 +328,7 @@ object CarbonEnv {
     if ((!EnvHelper.isLegacy(sparkSession)) &&
         (dbName.equals("default") || databaseLocation.endsWith(".db"))) {
       val carbonStorePath = CarbonProperties.getStorePath()
-      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir", carbonStorePath)
+      val hiveStorePath = sparkSession.conf.get("spark.sql.warehouse.dir")
       // if carbon.store does not point to spark.sql.warehouse.dir then follow the old table path
       // format
       if (carbonStorePath != null && !hiveStorePath.equals(carbonStorePath)) {

Review comment:
       I think  he can try adding an empty carbon style option TBLPROPERTIES in the create statement :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3780: [HOTFIX] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#issuecomment-636312777


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3097/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3780: [HOTFIX] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#issuecomment-636312576


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1375/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kevinjmh commented on pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kevinjmh commented on pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#issuecomment-636454529


   > Please create a JIRA for this
   
   OK


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] asfgit closed pull request #3780: [CARBONDATA-3836] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3780: [HOTFIX] Fix carbon store path

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#issuecomment-636278462


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3092/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] kunal642 commented on pull request #3780: [HOTFIX] Fix carbon store path & avoid exception when creating new carbon table

Posted by GitBox <gi...@apache.org>.
kunal642 commented on pull request #3780:
URL: https://github.com/apache/carbondata/pull/3780#issuecomment-636435451


   Please create a JIRA for this 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org