Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/03/20 06:43:31 UTC
[GitHub] [carbondata] ajantha-bhat opened a new pull request #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
ajantha-bhat opened a new pull request #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675
### Why is this PR needed?
A select query fails with the call stack below when the warehouse directory is left at its default (not configured).
```
0: jdbc:hive2://localhost:10000> create table ab(age int) stored as carbondata;
---------+
Result
---------+
---------+
No rows selected (0.093 seconds)
0: jdbc:hive2://localhost:10000> select count from ab;
Error: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'ab' not found in database 'tpch'; (state=,code=0)
caused by
java.io.FileNotFoundException: File hdfs://localhost:54311/home/root1/tools/spark-2.3.4-bin-hadoop2.7/spark-warehouse/tpch.db/ab/Metadata does not exist.
```
### What changes were proposed in this PR?
When `spark.sql.warehouse.dir` is not configured, the default warehouse directory on the local file system under SPARK_HOME is used. However, in a cluster, describe table shows the location with an HDFS prefix.
The reason is that we remove the local file system scheme, so when the table path is read, the HDFS prefix is added in a cluster. If we keep the scheme instead, the issue does not occur.
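The round trip described above can be sketched roughly as follows. This is a minimal illustration, not CarbonData's actual code: `strip_scheme`, `qualify`, and `DEFAULT_FS` are hypothetical names, and `DEFAULT_FS` simply reuses the HDFS address that appears in the error message.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the cluster's default file system.
DEFAULT_FS = "hdfs://localhost:54311"


def strip_scheme(path: str) -> str:
    """Drop the scheme and authority, keeping only the path component."""
    return urlparse(path).path


def qualify(path: str, default_fs: str = DEFAULT_FS) -> str:
    """Prefix a scheme-less path with the default file system.

    Paths that already carry a scheme are returned unchanged.
    """
    if urlparse(path).scheme:
        return path
    return default_fs.rstrip("/") + path


# Table created under the default (local) warehouse directory.
local_table = (
    "file:/home/root1/tools/spark-2.3.4-bin-hadoop2.7/"
    "spark-warehouse/tpch.db/ab"
)

# Scheme removed at write time, then re-qualified at read time:
# the local path silently becomes an HDFS path.
print(qualify(strip_scheme(local_table)))

# Scheme kept: the path still points at the local file system.
print(qualify(local_table))
```

In this sketch the first call yields a path on HDFS, which is where the lookup of the `Metadata` folder fails, while the second leaves the `file:` path intact.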
### Does this PR introduce any user interface change?
- No
### Is any new testcase added?
- No. The issue occurs only in a cluster with HDFS or OBS.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [carbondata] asfgit closed pull request #3675: [CARBONDATA-3744]
Fix select query failure issue when warehouse directory is default (not
configured) in cluster
Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675
----------------------------------------------------------------
[GitHub] [carbondata] ajantha-bhat commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607099805
Yes. If we set "spark.sql.warehouse.dir" to /home/root1/temp,
the table is created with the hdfs scheme (because the cluster is on HDFS).
![Screenshot from 2020-04-01 13-34-08](https://user-images.githubusercontent.com/5889404/78113600-9b80bb00-741d-11ea-8390-e3e3b9cb1cf8.png)
![Screenshot from 2020-04-01 13-38-13](https://user-images.githubusercontent.com/5889404/78113903-13e77c00-741e-11ea-9a87-79ded069563e.png)
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607350487
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2610/
----------------------------------------------------------------
[GitHub] [carbondata] ajantha-bhat commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602508634
![Screenshot from 2020-03-23 15-55-06](https://user-images.githubusercontent.com/5889404/77307132-ba48c880-6d1e-11ea-97be-7a8c452a2fc6.png)
FileMetaStore uses the location from the catalog table instead of the table path, so the hdfs scheme is added.
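A rough sketch of why the choice of field matters, assuming a catalog entry that carries both the Spark-assigned location and the table path Carbon recorded. The `CatalogEntry` class and the `metadata_dir` helper are hypothetical and only model the behaviour described in this thread; they are not FileMetaStore's API. The path values are taken from the error message in the PR description.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """Hypothetical model of the two path values discussed in this thread."""
    location: str    # what Spark reports as Location (qualified by Spark)
    table_path: str  # what Carbon recorded, without a scheme


def metadata_dir(base: str) -> str:
    """Append the Metadata folder name to a table directory."""
    return posixpath.join(base, "Metadata")


entry = CatalogEntry(
    location=(
        "hdfs://localhost:54311/home/root1/tools/"
        "spark-2.3.4-bin-hadoop2.7/spark-warehouse/tpch.db/ab"
    ),
    table_path=(
        "/home/root1/tools/spark-2.3.4-bin-hadoop2.7/"
        "spark-warehouse/tpch.db/ab"
    ),
)

# Reading from the catalog location gives the HDFS path that the
# FileNotFoundException complains about.
print(metadata_dir(entry.location))

# Reading from the recorded table path gives a scheme-less local path.
print(metadata_dir(entry.table_path))
```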
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607348598
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/901/
----------------------------------------------------------------
[GitHub] [carbondata] ajantha-bhat commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601610898
retest this please
----------------------------------------------------------------
[GitHub] [carbondata] ajantha-bhat commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601748528
@QiangCai please check
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601640428
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/816/
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601725797
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2524/
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601644675
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2523/
----------------------------------------------------------------
[GitHub] [carbondata] ajantha-bhat commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602488252
@QiangCai : I checked.
When I run `create table t1(age int) stored as carbondata;`, the table is stored as a Hive Carbon table.
![Screenshot from 2020-03-23 15-10-56](https://user-images.githubusercontent.com/5889404/77303393-e9f4d200-6d18-11ea-9da1-6435d718d3a7.png)
Here `Location` is **given by Spark itself**, so Spark adds the hdfs prefix, because the path Carbon supplied did not have any scheme.
If you look at `storage properties` in the image above, it shows Carbon's table location without the prefix.
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601724117
Build Success with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/817/
----------------------------------------------------------------
[GitHub] [carbondata] jackylk commented on issue #3675: [CARBONDATA-3744]
Fix select query failure issue when warehouse directory is default (not
configured) in cluster
Posted by GitBox <gi...@apache.org>.
jackylk commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-609395273
LGTM
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601579151
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/2518/
----------------------------------------------------------------
[GitHub] [carbondata] ajantha-bhat commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602510902
Parquet adds a scheme while storing, hence there is no issue with Parquet. After my changes, Carbon behaves the same way as Parquet.
![Screenshot from 2020-03-23 16-00-40](https://user-images.githubusercontent.com/5889404/77307608-96d24d80-6d1f-11ea-8c2f-120bc35f134d.png)
----------------------------------------------------------------
[GitHub] [carbondata] ajantha-bhat edited a comment on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
ajantha-bhat edited a comment on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-607099805
Yes. If we set "spark.sql.warehouse.dir" to /home/root1/temp,
the table is created with the hdfs scheme (because the cluster is on HDFS).
![Screenshot from 2020-04-01 13-34-08](https://user-images.githubusercontent.com/5889404/78113600-9b80bb00-741d-11ea-8390-e3e3b9cb1cf8.png)
![Screenshot from 2020-04-01 13-38-55](https://user-images.githubusercontent.com/5889404/78113987-2f528700-741e-11ea-9411-43feaab9631e.png)
----------------------------------------------------------------
[GitHub] [carbondata] CarbonDataQA1 commented on issue #3675:
[CARBONDATA-3744] Fix select query failure issue when warehouse directory
is default (not configured) in cluster
Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-601580895
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/812/
----------------------------------------------------------------
[GitHub] [carbondata] QiangCai commented on issue #3675: [CARBONDATA-3744]
Fix select query failure issue when warehouse directory is default (not
configured) in cluster
Posted by GitBox <gi...@apache.org>.
QiangCai commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-603616337
But if we set "spark.sql.warehouse.dir" to /user/hive/warehouse,
then in a cluster environment it should automatically use "defaultFS" as the prefix of the path, right?
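The three cases in question can be laid out in a small sketch. It assumes that an unset warehouse directory falls back to a `file:` location, that a scheme-less configured directory is prefixed with the default file system, and that a fully qualified directory is used as given. `resolve_warehouse` and its parameters are hypothetical names for illustration, not a Spark or CarbonData function.

```python
from typing import Optional
from urllib.parse import urlparse


def resolve_warehouse(
    configured: Optional[str], default_fs: str, local_default: str
) -> str:
    """Model how the effective warehouse location is chosen (assumed rules)."""
    if configured is None:
        # Not configured: fall back to the local default, scheme included.
        return local_default
    if urlparse(configured).scheme:
        # Already qualified: use as given.
        return configured
    # Scheme-less: prefix with the default file system.
    return default_fs.rstrip("/") + configured


DEFAULT_FS = "hdfs://localhost:54311"  # address taken from the error message
LOCAL_DEFAULT = "file:/home/root1/tools/spark-2.3.4-bin-hadoop2.7/spark-warehouse"

print(resolve_warehouse(None, DEFAULT_FS, LOCAL_DEFAULT))
print(resolve_warehouse("/user/hive/warehouse", DEFAULT_FS, LOCAL_DEFAULT))
print(resolve_warehouse("hdfs://localhost:54311/w", DEFAULT_FS, LOCAL_DEFAULT))
```

Under these assumptions only the first case involves a local path, which is the case where dropping the scheme changes the file system the path refers to.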
----------------------------------------------------------------
[GitHub] [carbondata] QiangCai commented on issue #3675: [CARBONDATA-3744]
Fix select query failure issue when warehouse directory is default (not
configured) in cluster
Posted by GitBox <gi...@apache.org>.
QiangCai commented on issue #3675: [CARBONDATA-3744] Fix select query failure issue when warehouse directory is default (not configured) in cluster
URL: https://github.com/apache/carbondata/pull/3675#issuecomment-602019750
It is better to find the root cause:
where do we append the "defaultFS" prefix: to the store location, the database location, or the table path?
In some places, Carbon appends the "defaultFS" prefix;
we need to check whether Spark does the same.