Posted to issues@carbondata.apache.org by "Ajantha Bhat (JIRA)" <ji...@apache.org> on 2019/07/23 09:56:00 UTC

[jira] [Comment Edited] (CARBONDATA-3472) Carbondata Integration with Presto

    [ https://issues.apache.org/jira/browse/CARBONDATA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890853#comment-16890853 ] 

Ajantha Bhat edited comment on CARBONDATA-3472 at 7/23/19 9:55 AM:
-------------------------------------------------------------------

Hi [~dibyad]: I tried the above steps with Presto 0.217 + CarbonData and found that the steps are incorrect.

a) In the create statement you used the same columns in both the inverted index and the no inverted index properties, so the create table failed. To get past this, I removed the no inverted index columns.
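For anyone hitting the same create table failure, the overlap described in (a) can be spotted before running the DDL. This is only a minimal Python sketch and not part of CarbonData: the helper names `split_columns` and `overlapping_columns`, the dict representation of TBLPROPERTIES, and the case-insensitive comparison are all assumptions made for illustration. The sample property values are made up, since the original create statement is not reproduced in this comment.

```python
def split_columns(value):
    """Split a comma-separated TBLPROPERTIES value into normalized column names.

    Assumption: column names are compared case-insensitively and
    surrounding whitespace is ignored.
    """
    return {name.strip().lower() for name in value.split(",") if name.strip()}


def overlapping_columns(tblproperties):
    """Hypothetical helper: return columns listed in both
    INVERTED_INDEX and NO_INVERTED_INDEX, sorted by name."""
    inverted = split_columns(tblproperties.get("INVERTED_INDEX", ""))
    no_inverted = split_columns(tblproperties.get("NO_INVERTED_INDEX", ""))
    return sorted(inverted & no_inverted)


# Illustrative values only; not taken from the reporter's statement.
props = {
    "INVERTED_INDEX": "inv_date_sk , inv_item_sk",
    "NO_INVERTED_INDEX": "inv_item_sk",
}

conflicts = overlapping_columns(props)
if conflicts:
    print("Columns in both lists:", ", ".join(conflicts))
```

If the helper returns an empty list, the two properties do not share any column and this particular cause of failure can be ruled out.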

b) You did not mention loading data, so I loaded one row from Spark and queried it from Presto.

The query succeeded for me; I did not face any issue.

 

presto:db1> select * from inventory;
 inv_date_sk | inv_item_sk | inv_warehouse_sk | inv_quantity_on_hand 
-------------+-------------+------------------+----------------------
           1 |           2 |                3 |                   30 
(1 row)

Query 20190723_094532_00009_u9get, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:01 [1 rows, 24B] [1 rows/s, 24B/s]

presto:db1> describe inventory;
        Column        |  Type   | Extra | Comment 
----------------------+---------+-------+---------
 inv_date_sk          | integer |       |         
 inv_item_sk          | integer |       |         
 inv_warehouse_sk     | integer |       |         
 inv_quantity_on_hand | bigint  |       |         
(4 rows)

 

 

 

From Spark:

 

0: jdbc:hive2://localhost:10000> create table inventory ( inv_date_sk int, inv_item_sk int, inv_warehouse_sk int, inv_quantity_on_hand bigint) STORED AS carbondata TBLPROPERTIES ('INVERTED_INDEX'='inv_date_sk , inv_item_sk , inv_warehouse_sk , inv_quantity_on_hand','SORT_COLUMNS'='inv_date_sk , inv_item_sk , inv_warehouse_sk , inv_quantity_on_hand','DICTIONARY_INCLUDE'='inv_date_sk , inv_item_sk , inv_warehouse_sk , inv_quantity_on_hand', 'TABLE_BLOCKSIZE'='128');
+---------+--+
| Result |
+---------+--+
+---------+--+
No rows selected (0.691 seconds)
0: jdbc:hive2://localhost:10000> insert into inventory select 1, 2 , 3, 30;
+---------+--+
| Result |
+---------+--+
+---------+--+
No rows selected (7.969 seconds)
0: jdbc:hive2://localhost:10000> select * from inventory;
+--------------+--------------+-------------------+-----------------------+--+
| inv_date_sk  | inv_item_sk  | inv_warehouse_sk  | inv_quantity_on_hand  |
+--------------+--------------+-------------------+-----------------------+--+
| 1            | 2            | 3                 | 30                    |
+--------------+--------------+-------------------+-----------------------+--+
1 row selected (1.111 seconds)

 



> Carbondata Integration with Presto
> ----------------------------------
>
>                 Key: CARBONDATA-3472
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3472
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query, presto-integration
>    Affects Versions: 1.6.0
>         Environment: centos 7
>            Reporter: Dibya
>            Priority: Major
>
> Hi,
> I came across the below issue when I was trying to query a table stored in carbondata format through presto:
> java.lang.RuntimeException: Failed to create reader 
>  at org.apache.carbondata.presto.CarbondataPageSource.createReaderForColumnar(CarbondataPageSource.java:366)
>  at org.apache.carbondata.presto.CarbondataPageSource.initializeForColumnar(CarbondataPageSource.java:136)
>  at org.apache.carbondata.presto.CarbondataPageSource.initialize(CarbondataPageSource.java:130)
>  at org.apache.carbondata.presto.CarbondataPageSource.<init>(CarbondataPageSource.java:120)
>  at org.apache.carbondata.presto.CarbondataPageSourceProvider.createPageSource(CarbondataPageSourceProvider.java:88)
>  at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:44)
>  at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:56)
>  at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:221)
>  at com.facebook.presto.operator.Driver.processInternal(Driver.java:379)
>  at com.facebook.presto.operator.Driver.lambda$processFor$8(Driver.java:283)
>  at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:675)
>  at com.facebook.presto.operator.Driver.processFor(Driver.java:276)
>  at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1077)
>  at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162)
>  at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:483)
>  at com.facebook.presto.$gen.Presto_0_217____20190711_064626_1.run(Unknown Source)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Last dictionary chunk does not exist
>  at org.apache.carbondata.core.reader.CarbonDictionaryMetadataReaderImpl.readLastEntryOfDictionaryMetaChunk(CarbonDictionaryMetadataReaderImpl.java:115)
>  at org.apache.carbondata.core.cache.dictionary.AbstractDictionaryCache.readLastChunkFromDictionaryMetadataFile(AbstractDictionaryCache.java:93)
>  at org.apache.carbondata.core.cache.dictionary.AbstractDictionaryCache.checkAndLoadDictionaryData(AbstractDictionaryCache.java:198)
>  at org.apache.carbondata.core.cache.dictionary.ForwardDictionaryCache.getDictionary(ForwardDictionaryCache.java:212)
>  at org.apache.carbondata.core.cache.dictionary.ForwardDictionaryCache.get(ForwardDictionaryCache.java:80)
>  at org.apache.carbondata.core.cache.dictionary.ForwardDictionaryCache.get(ForwardDictionaryCache.java:45)
>  at org.apache.carbondata.presto.CarbonDictionaryDecodeReadSupport$$anonfun$initialize$1.apply(CarbonDictionaryDecodeReadSupport.scala:65)
>  at org.apache.carbondata.presto.CarbonDictionaryDecodeReadSupport$$anonfun$initialize$1.apply(CarbonDictionaryDecodeReadSupport.scala:53)
>  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>  at org.apache.carbondata.presto.CarbonDictionaryDecodeReadSupport.initialize(CarbonDictionaryDecodeReadSupport.scala:53)
>  at org.apache.carbondata.presto.CarbondataPageSource.createReaderForColumnar(CarbondataPageSource.java:359)
>  ... 18 more
>  
> This issue is seen only while querying a table that has a dictionary created on one of its columns during table creation. The same queries run fine on tables that do not have dictionaries on any of their columns.
> Please look into it.
> Thanks
>  


