You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by "Prasanna Ravichandran (JIRA)" <ji...@apache.org> on 2018/08/01 10:27:00 UTC

[jira] [Commented] (CARBONDATA-2539) MV Dataset - Subqueries is not accessing the data from the MV datamap.

    [ https://issues.apache.org/jira/browse/CARBONDATA-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565103#comment-16565103 ] 

Prasanna Ravichandran commented on CARBONDATA-2539:
---------------------------------------------------

Still the sub-queries are not accessing the data from the MV datamap.

Terminal:

> create datamap dm3 using 'mv' as *select min(workgroupcategory) from origintable*;
+---------+--+
| Result |
+---------+--+
+---------+--+
No rows selected (0.392 seconds)

> select distinct workgroupcategory from originTable;
+--------------------+--+
| workgroupcategory |
+--------------------+--+
| 1 |
| 3 |
| 2 |
+--------------------+--+
3 rows selected (0.664 seconds)

> select count(*) from originTable where workgroupcategory=1;
+-----------+--+
| count(1) |
+-----------+--+
| 5 |
+-----------+--+
1 row selected (0.349 seconds)

> explain SELECT max(empno) FROM originTable WHERE workgroupcategory IN (*select min(workgroupcategory) from originTable*) group by empname;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| plan |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| == CarbonData Profiler ==
Table Scan on origintable
 - total blocklets: 1
 - filter: none
 - pruned by Main DataMap
 - skipped blocklets: 0
Table Scan on origintable
 - total blocklets: 1
 - filter: none
 - pruned by Main DataMap
 - skipped blocklets: 0
 |
| == Physical Plan ==
*HashAggregate(keys=[empname#24982], functions=[max(empno#24981)])
+- Exchange hashpartitioning(empname#24982, 200)
 +- *HashAggregate(keys=[empname#24982], functions=[partial_max(empno#24981)])
 +- *Project [empno#24981, empname#24982]
 +- *BroadcastHashJoin [workgroupcategory#24985], [*min(workgroupcategory)*#25804], LeftSemi, BuildRight
 :- *FileScan carbondata *rtyo.origintable*[empno#24981,empname#24982,designation#24983,doj#24984,workgroupcategory#24985,workgroupcategoryname#24986,deptno#24987,deptname#24988,projectcode#24989,projectjoindate#24990,projectenddate#24991,attendance#24992,utilization#24993,salary#24994]
 +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)))
 +- *HashAggregate(keys=[], functions=[min(workgroupcategory#24985)])
 +- Exchange SinglePartition
 +- *HashAggregate(keys=[], functions=[partial_min(workgroupcategory#24985)])
 +- *FileScan carbondata *rtyo.origintable*[workgroupcategory#24985] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+

 

> MV Dataset - Subqueries is not accessing the data from the MV datamap.
> ----------------------------------------------------------------------
>
>                 Key: CARBONDATA-2539
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2539
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>         Environment: 3 node opensource ANT cluster.
>            Reporter: Prasanna Ravichandran
>            Assignee: Ravindra Pesala
>            Priority: Minor
>             Fix For: 1.5.0, 1.4.1
>
>         Attachments: data.csv
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Inner subquery is not accessing the data from the MV datamap. It is accessing the data from the main table.
> Test queries - Spark shell:
> scala> carbon.sql("drop table if exists origintable").show()
> ++
> ||
> ++
> ++
>  scala> carbon.sql("CREATE TABLE originTable (empno int, empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int, utilization int,salary int) STORED BY 'org.apache.carbondata.format'").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("LOAD DATA local inpath 'hdfs://hacluster/user/prasanna/data.csv' INTO TABLE originTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"','timestampformat'='dd-MM-yyyy')").show(200,false)
> ++
> ||
> ++
> ++
>  
> scala> carbon.sql("drop datamap datamap_subqry").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("create datamap datamap_subqry using 'mv' as select min(salary) from originTable group by empno").show(200,false)
> ++
> ||
> ++
> ++
> scala> carbon.sql("explain SELECT max(empno) FROM originTable WHERE salary IN (select min(salary) from originTable group by empno ) group by empname").show(200,false)
> +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |plan |
> +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |== CarbonData Profiler ==
> Table Scan on origintable
>  - total blocklets: 2
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0
> Table Scan on origintable
>  - total blocklets: 2
>  - filter: none
>  - pruned by Main DataMap
>  - skipped blocklets: 0
>  |
> |== Physical Plan ==
> *HashAggregate(keys=[empname#2132], functions=[max(empno#2131)])
> +- Exchange hashpartitioning(empname#2132, 200)
>  +- *HashAggregate(keys=[empname#2132], functions=[partial_max(empno#2131)])
>  +- *Project [empno#2131, empname#2132]
>  +- *BroadcastHashJoin [salary#2144], [*min(salary*)#2219], LeftSemi, BuildRight
>  :- *BatchedScan CarbonDatasourceHadoopRelation [ *Database name :default, Table name :origintable*, Schema :Some(StructType(StructField(empno,IntegerType,true), StructField(empname,StringType,true), StructField(designation,StringType,true), StructField(doj,TimestampType,true), StructField(workgroupcategory,IntegerType,true), StructField(workgroupcategoryname,StringType,true), StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), StructField(projectcode,IntegerType,true), StructField(projectjoindate,TimestampType,true), StructField(projectenddate,TimestampType,true), StructField(attendance,IntegerType,true), StructField(utilization,IntegerType,true), StructField(salary,IntegerType,true))) ] default.origintable[empno#2131,empname#2132,designation#2133,doj#2134,workgroupcategory#2135,workgroupcategoryname#2136,deptno#2137,deptname#2138,projectcode#2139,projectjoindate#2140,projectenddate#2141,attendance#2142,utilization#2143,salary#2144]
>  +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)))
>  +- *HashAggregate(keys=[empno#2131], functions=[min(salary#2144)])
>  +- Exchange hashpartitioning(empno#2131, 200)
>  +- *HashAggregate(keys=[empno#2131], functions=[partial_min(salary#2144)])
>  +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :default, Table name :origintable, Schema :Some(StructType(StructField(empno,IntegerType,true), StructField(empname,StringType,true), StructField(designation,StringType,true), StructField(doj,TimestampType,true), StructField(workgroupcategory,IntegerType,true), StructField(workgroupcategoryname,StringType,true), StructField(deptno,IntegerType,true), StructField(deptname,StringType,true), StructField(projectcode,IntegerType,true), StructField(projectjoindate,TimestampType,true), StructField(projectenddate,TimestampType,true), StructField(attendance,IntegerType,true), StructField(utilization,IntegerType,true), StructField(salary,IntegerType,true))) ] default.origintable[empno#2131,salary#2144]|
> +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)