Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/03/20 06:43:31 UTC

[GitHub] [carbondata] ajantha-bhat opened a new pull request #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

ajantha-bhat opened a new pull request #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675
 
 
    ### Why is this PR needed?
   A select query fails with the call stack below when the warehouse directory is left at its default (not configured).
   
   ```
   0: jdbc:hive2://localhost:10000> create table ab(age int) stored as carbondata;
   ---------+
   Result
   ---------+
   ---------+
   No rows selected (0.093 seconds)
   0: jdbc:hive2://localhost:10000> select count from ab;
   Error: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'ab' not found in database 'tpch'; (state=,code=0)
   
   caused by
   java.io.FileNotFoundException: File hdfs://localhost:54311/home/root1/tools/spark-2.3.4-bin-hadoop2.7/spark-warehouse/tpch.db/ab/Metadata does not exist.
   ```
   
    ### What changes were proposed in this PR?
   When spark.sql.warehouse.dir is not configured, the default location under SPARK_HOME on the local file system is used. In a cluster, however, describe table shows the table location with an HDFS prefix.
   
   The reason is that we remove the local file system scheme, so when the table path is read, the HDFS prefix is added in the cluster. If we keep the scheme instead, the issue does not occur.
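
The effect described above can be illustrated with a minimal sketch. It is not the code changed in this PR and does not use the Hadoop or CarbonData APIs; the class `PathQualifierSketch` and its `qualify` method are hypothetical names, and the sketch only assumes that a path without a scheme is resolved against the cluster's default file system (the `hdfs://localhost:54311` value is taken from the error message above).

```java
import java.net.URI;

// Hypothetical helper for illustration only; not CarbonData or Hadoop code.
class PathQualifierSketch {

    // Returns the path unchanged if it already carries a scheme (file:, hdfs:, ...).
    // Otherwise prefixes it with the default file system, which is the assumed
    // behaviour when a scheme-less table path is read in a cluster.
    static String qualify(String path, String defaultFs) {
        if (URI.create(path).getScheme() != null) {
            return path;
        }
        return defaultFs + path;
    }

    public static void main(String[] args) {
        String defaultFs = "hdfs://localhost:54311";
        String local =
            "file:/home/root1/tools/spark-2.3.4-bin-hadoop2.7/spark-warehouse/tpch.db/ab";

        // Scheme removed: the local path is later looked up on HDFS.
        String stripped = URI.create(local).getPath();
        System.out.println(qualify(stripped, defaultFs));

        // Scheme kept: the path still points at the local file system.
        System.out.println(qualify(local, defaultFs));
    }
}
```

With the scheme stripped, the first call yields a path under `hdfs://localhost:54311`, matching the location in the FileNotFoundException; with the scheme kept, the path is returned unchanged.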
   
   
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No. The issue occurs only in a cluster with HDFS or OBS.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] asfgit closed pull request #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675
 
 
   


[GitHub] [carbondata] ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607099805
 
 
   Yes. If we set "spark.sql.warehouse.dir" to /home/root1/temp,
   the table is created with the hdfs scheme (because the cluster is hdfs).
   
   ![Screenshot from 2020-04-01 13-34-08](https://user-images.githubusercontent.com/5889404/78113600-9b80bb00-741d-11ea-8390-e3e3b9cb1cf8.png)
   
   ![Screenshot from 2020-04-01 13-38-13](https://user-images.githubusercontent.com/5889404/78113903-13e77c00-741e-11ea-9a87-79ded069563e.png)
   
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607350487
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2610/
   


[GitHub] [carbondata] ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602508634
 
 
   ![Screenshot from 2020-03-23 15-55-06](https://user-images.githubusercontent.com/5889404/77307132-ba48c880-6d1e-11ea-97be-7a8c452a2fc6.png)
   
   FileMetaStore uses the location from the catalog table instead of the table path, so the hdfs scheme is added.


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607348598
 
 
   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/901/
   


[GitHub] [carbondata] ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601610898
 
 
   retest this please


[GitHub] [carbondata] ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601748528
 
 
   @QiangCai  please check


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601640428
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/816/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601725797
 
 
   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2524/
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601644675
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2523/
   


[GitHub] [carbondata] ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602488252
 
 
   @QiangCai : I checked.
   
   When I run `create table t1(age int) stored as carbondata;`, the table is stored as a hive carbon table.
   ![Screenshot from 2020-03-23 15-10-56](https://user-images.githubusercontent.com/5889404/77303393-e9f4d200-6d18-11ea-9da1-6435d718d3a7.png)
   
   Here `Location` is **given by spark itself**, so spark adds the hdfs prefix because carbon did not provide any scheme.
   If you look at `storage properties` in the image above, it shows carbon's table location without the prefix.
   
   
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601724117
 
 
   Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/817/
   


[GitHub] [carbondata] jackylk commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
jackylk commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-609395273
 
 
   LGTM


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601579151
 
 
   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2518/
   


[GitHub] [carbondata] ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602510902
 
 
   Parquet adds a scheme while storing, hence there is no issue with parquet. After my changes, carbon behaves similarly to parquet.
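
As a rough sketch of the difference between storing a location with and without a scheme: the class `StoredLocationSketch` and both of its methods are hypothetical, and this is neither the parquet code path nor the change made in this PR. It only assumes that a stored location without a scheme is read from the default file system.

```java
import java.net.URI;

// Hypothetical sketch for illustration; not Spark, parquet or CarbonData code.
class StoredLocationSketch {

    // Writing side: record a local path with an explicit file: scheme.
    // Locations that already carry a scheme are kept as they are.
    static String toStoredLocation(String path) {
        return URI.create(path).getScheme() != null ? path : "file:" + path;
    }

    // Reading side: name the file system a stored location is read from,
    // falling back to the default file system when no scheme was stored.
    static String fileSystemFor(String storedLocation, String defaultFs) {
        String scheme = URI.create(storedLocation).getScheme();
        return scheme != null ? scheme : URI.create(defaultFs).getScheme();
    }

    public static void main(String[] args) {
        String defaultFs = "hdfs://localhost:54311";
        String tablePath =
            "/home/root1/tools/spark-2.3.4-bin-hadoop2.7/spark-warehouse/tpch.db/ab";

        // Stored without a scheme: resolved against the default file system.
        System.out.println(fileSystemFor(tablePath, defaultFs)); // prints "hdfs"

        // Stored with the scheme: still resolved as a local path.
        System.out.println(fileSystemFor(toStoredLocation(tablePath), defaultFs)); // prints "file"
    }
}
```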
   ![Screenshot from 2020-03-23 16-00-40](https://user-images.githubusercontent.com/5889404/77307608-96d24d80-6d1f-11ea-8c2f-120bc35f134d.png)
   


[GitHub] [carbondata] ajantha-bhat edited a comment on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
ajantha-bhat edited a comment on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607099805
 
 
   Yes. If we set "spark.sql.warehouse.dir" to /home/root1/temp,
   the table is created with the hdfs scheme (because the cluster is hdfs).
   
   ![Screenshot from 2020-04-01 13-34-08](https://user-images.githubusercontent.com/5889404/78113600-9b80bb00-741d-11ea-8390-e3e3b9cb1cf8.png)
   
   ![Screenshot from 2020-04-01 13-38-55](https://user-images.githubusercontent.com/5889404/78113987-2f528700-741e-11ea-9411-43feaab9631e.png)
   


[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601580895
 
 
   Build Failed  with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/812/
   


[GitHub] [carbondata] QiangCai commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
QiangCai commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-603616337
 
 
   But if we set "spark.sql.warehouse.dir" to /user/hive/warehouse,
   then in a cluster environment it should automatically use "defaultFS" as the prefix of the path, right?


[GitHub] [carbondata] QiangCai commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster

Posted by GitBox <gi...@apache.org>.
QiangCai commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602019750
 
 
   It is better to find the root cause:
   where do we append the "defaultFS" prefix to the store location, database location, or table path?
   
   In some places carbon appends the "defaultFS" prefix;
   we need to check whether spark does it as well.
