You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@carbondata.apache.org by Deepak Kulkarni <de...@gmail.com> on 2019/07/02 09:22:16 UTC

Info needed -

Hi,



We are evaluating the Carbondata for following alternative to Oracle –and
hit following issues when achieving this migration -



   1. Primary Key Indexes – how we can avoid duplicate records on Carbon?
   2. Secondary Indexes
   3. Constraints such as NULL, NOT NULL – how can we set this at column
   levels in carbon?
   4. Functions such as DECODE – what alternatives we have in Carbon to use
   these SQL functions?



Appreciate the response.



-- 
Regards,
Deepak

Re: Info needed -

Posted by Deepak Kulkarni <de...@gmail.com>.

Thanks so much, Ravindra, appreciated.

On Thu, Jul 18, 2019 at 8:03 PM Ravindra Pesala <ra...@gmail.com>
wrote:

> Hi Deepak,
>
> I have verified your query and pre-aggregate works as expected.
>
> spark.sql("DROP TABLE IF EXISTS MTRC00_NRAW_000_RECENT_1572019_2")
> spark.sql("CREATE TABLE MTRC00_NRAW_000_RECENT_1572019_2 (IDX_METRIC int, IDX_RESOURCE int, DTE_DATE int, DBL_VALUE int) stored by 'org.apache.carbondata.format'")
>
> spark.sql("CREATE DATAMAP DM_preaggregate_new on table MTRC00_NRAW_000_RECENT_1572019_2 USING 'preaggregate' AS select IDX_METRIC,IDX_RESOURCE ,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value) FROM MTRC00_NRAW_000_RECENT_1572019_2 group by IDX_METRIC,IDX_RESOURCE")
>
> // show datamap
> spark.sql("show datamap on table MTRC00_NRAW_000_RECENT_1572019_2").show(false)
>
> spark.sql("insert into MTRC00_NRAW_000_RECENT_1572019_2 select 1,1,1,1")
>
> spark.sql("EXPLAIN select IDX_METRIC,IDX_RESOURCE ,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value) FROM MTRC00_NRAW_000_RECENT_1572019_2 group by IDX_METRIC,IDX_RESOURCE").show(false)
>
>
> Output
>
> ===========
>
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> |== Physical Plan ==
> *(2) HashAggregate(keys=[mtrc00_nraw_000_recent_1572019_2_idx_metric#380, mtrc00_nraw_000_recent_1572019_2_idx_resource#381], functions=[sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L), sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L as double)), sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L as double)), min(mtrc00_nraw_000_recent_1572019_2_dbl_value_min#384), max(mtrc00_nraw_000_recent_1572019_2_dbl_value_max#385), sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L)])
> +- Exchange hashpartitioning(mtrc00_nraw_000_recent_1572019_2_idx_metric#380, mtrc00_nraw_000_recent_1572019_2_idx_resource#381, 200)
>    +- *(1) HashAggregate(keys=[mtrc00_nraw_000_recent_1572019_2_idx_metric#380, mtrc00_nraw_000_recent_1572019_2_idx_resource#381], functions=[partial_sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L), partial_sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L as double)), partial_sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L as double)), partial_min(mtrc00_nraw_000_recent_1572019_2_dbl_value_min#384), partial_max(mtrc00_nraw_000_recent_1572019_2_dbl_value_max#385), partial_sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L)])
>       +- *(1) FileScan carbondata default.*mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new*[mtrc00_nraw_000_recent_1572019_2_idx_metric#380,mtrc00_nraw_000_recent_1572019_2_idx_resource#381,mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L,mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L,mtrc00_nraw_000_recent_1572019_2_dbl_value_min#384,mtrc00_nraw_000_recent_1572019_2_dbl_value_max#385]|
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
>
>
> In the above explain plan clearly shows that it has choosen pre-aggregate table. I am attaching the example scala file for your reference.
>
>
>
> Regards,
> Ravindra.
>
> On Tue, 16 Jul 2019 at 15:28, Deepak Kulkarni <de...@gmail.com>
> wrote:
>
>> Hello,
>>
>> This is what we have been doing -
>>
>> Please find below steps that we followed to create tables and datamap and
>> tried explain command to see if we can get some information about the use
>> of datamap while query execution but we did not get the expected output :-
>>
>>
>>
>>    1. *Table creation:*
>>
>> *carbon.sql("CREATE TABLE MTRC00_NRAW_000_RECENT_1572019_2 (IDX_METRIC
>> int, IDX_RESOURCE int, DTE_DATE int, DBL_VALUE int) stored by
>> 'org.apache.carbondata.format'").show()*
>>
>> 2019-07-15 23:03:40 AUDIT audit:72 - {"time":"July 15, 2019 11:03:40 PM
>> IST","username":"root","opName":"CREATE
>> TABLE","opId":"11174429655716046","opStatus":"START"}
>>
>> 2019-07-15 23:03:40 WARN  HiveExternalCatalog:66 - Couldn't find
>> corresponding Hive SerDe for data source provider
>> org.apache.spark.sql.CarbonSource. Persisting data source table
>> `default`.`mtrc00_nraw_000_recent_1572019_2` into Hive metastore in Spark
>> SQL specific format, which is NOT compatible with Hive.
>>
>> 2019-07-15 23:03:40 AUDIT audit:93 - {"time":"July 15, 2019 11:03:40 PM
>> IST","username":"root","opName":"CREATE
>> TABLE","opId":"11174429655716046","opStatus":"SUCCESS","opTime":"87
>> ms","table":"default.mtrc00_nraw_000_recent_1572019_2","extraInfo":{"bad_record_path":"","local_dictionary_enable":"true","external":"false","sort_columns":"","comment":""}}
>>
>> ++
>>
>> ||
>>
>> ++
>>
>> ++
>>
>>    1. *DataMap Creation:*
>>
>>
>>
>> *scala>* *carbon.sql("CREATE DATAMAP DM_preaggregate_new on table
>> MTRC00_NRAW_000_RECENT_1572019_2 USING 'preaggregate' AS select
>> IDX_METRIC,IDX_RESOURCE
>> ,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
>> FROM MTRC00_NRAW_000_RECENT_1572019_2 group by IDX_METRIC,IDX_RESOURCE")*
>>
>> 2019-07-15 23:05:35 AUDIT audit:72 - {"time":"July 15, 2019 11:05:35 PM
>> IST","username":"root","opName":"CREATE
>> DATAMAP","opId":"11174544343554709","opStatus":"START"}
>>
>> 2019-07-15 23:05:35 AUDIT audit:72 - {"time":"July 15, 2019 11:05:35 PM
>> IST","username":"root","opName":"CREATE
>> TABLE","opId":"11174544390284045","opStatus":"START"}
>>
>> 2019-07-15 23:05:35 WARN  HiveExternalCatalog:66 - Couldn't find
>> corresponding Hive SerDe for data source provider
>> org.apache.spark.sql.CarbonSource. Persisting data source table
>> `default`.`mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new` into Hive
>> metastore in Spark SQL specific format, which is NOT compatible with Hive.
>>
>> 2019-07-15 23:05:35 AUDIT audit:93 - {"time":"July 15, 2019 11:05:35 PM
>> IST","username":"root","opName":"CREATE
>> TABLE","opId":"11174544390284045","opStatus":"SUCCESS","opTime":"96
>> ms","table":"default.mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new","extraInfo":{"local_dictionary_threshold":"10000","bad_record_path":"","table_blocksize":"1024","sort_scope":"LOCAL_SORT","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","_internal.deferred.rebuild":"false"}}
>>
>> 2019-07-15 23:05:35 AUDIT audit:93 - {"time":"July 15, 2019 11:05:35 PM
>> IST","username":"root","opName":"CREATE
>> DATAMAP","opId":"11174544343554709","opStatus":"SUCCESS","opTime":"231
>> ms","table":"default.mtrc00_nraw_000_recent_1572019_2","extraInfo":{"provider":"preaggregate","dmName":"DM_preaggregate_new"}}
>>
>> res121: org.apache.spark.sql.DataFrame = []
>>
>>
>>
>>    1. *Data Loading:*
>>
>> *scala> carbon.sql("LOAD DATA inpath
>> 'hdfs://10.46.40.149:9000/root/test/Wireline/NRAW/MTRC00_NRAW_000_RECENT.csv
>> <http://10.46.40.149:9000/root/test/Wireline/NRAW/MTRC00_NRAW_000_RECENT.csv>'
>> into table MTRC00_NRAW_000_RECENT_1572019_2
>> options('BAD_RECORDS_ACTION'='FORCE','DELIMITER'=',','FILEHEADER'='IDX_METRIC,IDX_RESOURCE,DTE_DATE,DBL_VALUE')");*
>>
>> 2019-07-15 23:08:50 AUDIT audit:72 - {"time":"July 15, 2019 11:08:50 PM
>> IST","username":"root","opName":"LOAD
>> DATA","opId":"11174739779154705","opStatus":"START"}
>>
>> 2019-07-15 23:08:51 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-15 23:08:51 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11174740740416991
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      6 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   6
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>>    1. *Query Execution with EXPLAIN command:-*
>>
>>
>>
>> scala> *carbon.sql("EXPLAIN select groupNumber,
>> resourceNumber,SUM(VsumM1) as VsumM1, SUM(VcountM1) as VcountM1,
>> SUM(VsumM2) as VsumM2, SUM(VcountM2) as VcountM2, SUM(VsumM3) as VsumM3,
>> SUM(VcountM3) as VcountM3, SUM(VsumM4) as VsumM4, SUM(VcountM4) as
>> VcountM4, SUM(VsumM5) as VsumM5, SUM(VcountM5) as VcountM5,(SELECT
>> MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2210 AND
>> FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE
>> FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
>> FRML1.STR_ACTION in ('I','U'))) as M1NAME, (SELECT MIN(STR_NAME) FROM
>> FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2211 AND FRML.DTE_DATE = (SELECT
>> MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE FRML1.IDX_IND = FRML.IDX_IND
>> AND FRML1.DTE_DATE < 1561507200 AND FRML1.STR_ACTION in ('I','U'))) as
>> M2NAME, (SELECT MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND =
>> 2214 AND FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1
>> WHERE FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
>> FRML1.STR_ACTION in ('I','U'))) as M3NAME, (SELECT MIN(STR_NAME) FROM
>> FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2215 AND FRML.DTE_DATE = (SELECT
>> MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE FRML1.IDX_IND = FRML.IDX_IND
>> AND FRML1.DTE_DATE < 1561507200 AND FRML1.STR_ACTION in ('I','U'))) as
>> M4NAME, (SELECT MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND =
>> 2234 AND FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1
>> WHERE FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
>> FRML1.STR_ACTION in ('I','U'))) as M5NAME FROM (select S1.IDX_GROUP as
>> groupNumber ,M1.idx_resource as resourceNumber ,CASE when
>> m1.idx_metric=2210 then SUM(M1.dbl_value) end as VsumM1,CASE when
>> m1.idx_metric=2210 then count(m1.dbl_value) end as VcountM1,CASE when
>> m1.idx_metric=2211 then SUM(M1.dbl_value) end as VsumM2,CASE when
>> m1.idx_metric=2211 then count(m1.dbl_value) end as VcountM2,CASE when
>> m1.idx_metric=2214 then SUM(M1.dbl_value) end as VsumM3,CASE when
>> m1.idx_metric=2214 then count(m1.dbl_value) end as VcountM3,CASE when
>> m1.idx_metric=2215 then SUM(M1.dbl_value) end as VsumM4,CASE when
>> m1.idx_metric=2215 then count(m1.dbl_value) end as VcountM4,CASE when
>> m1.idx_metric=2234 then SUM(M1.dbl_value) end as VsumM5,CASE when
>> m1.idx_metric=2234 then count(m1.dbl_value) end as VcountM5 FROM
>> MTRC00_NRAW_000_RECENT_1572019_2  M1, SE_GRP_MEMBER_HIST S1 where
>> M1.idx_resource=S1.IDX_ALIAS_INST and S1.IDX_GROUP=200075906 and
>> M1.idx_resource= 200071954 and M1.IDX_METRIC in (2210,2211,2214,2215,2234)
>> group by M1.IDX_METRIC,M1.idx_resource,S1.IDX_GROUP) GROUP BY
>> groupNumber,resourceNumber HAVING SUM(VcountM1) <> 0 OR SUM(VcountM2) <> 0
>> OR SUM(VcountM3) <> 0 OR SUM(VcountM4) <> 0 OR SUM(VcountM5) <> 0").show()*
>>
>> 2019-07-16 12:33:25 AUDIT audit:72 - {"time":"July 16, 2019 12:33:25 PM
>> IST","username":"root","opName":"EXPLAIN","opId":"11223014419086786","opStatus":"START"}
>>
>> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 2
>>
>> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 1
>>
>> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 1
>>
>> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 1
>>
>> 2019-07-16 12:33:25 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014841650012
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|                     |
>> 10 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                  10
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014952642509
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      9 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   9
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014986644062
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      9 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   9
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014913695441
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      9 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   9
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>>               |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223015020278657
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|                     |
>> 11 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                  11
>> |                      0 |
>>
>> |        +--------------------+
>>    +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014881878925
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      9 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   9
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014903200116
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      3 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   4
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      1 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014976273697
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      4 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   4
>> |                      0 |
>>
>> |        +--------------------+
>>            +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223015007180119
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      4 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   4
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014939414859
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      6 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   6
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>>    |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014868807332
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>>            |                      3 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   3
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:27 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223014828073061
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|
>> |                      3 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                   3
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 1
>>
>> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 1
>>
>> 2019-07-16 12:33:27 AUDIT audit:93 - {"time":"July 16, 2019 12:33:27 PM
>> IST","username":"root","opName":"EXPLAIN","opId":"11223014419086786","opStatus":"SUCCESS","opTime":"2570
>> ms","table":"NA","extraInfo":{"query":"'Filter (((NOT ('SUM('VcountM1) = 0)
>> || NOT ('SUM('VcountM2) = 0)) || NOT ('SUM('VcountM3) = 0)) || (NOT
>> ('SUM('VcountM4) = 0) || NOT ('SUM('VcountM5) = 0)))"}}
>>
>> +--------------------+
>>
>> |                plan|
>>
>> +--------------------+
>>
>> |== CarbonData Pro...|
>>
>> |== Physical Plan ...|
>>
>> +--------------------+
>>
>>    1. *Even with simple select query we do not see any reference of
>>    datamp being getting hit while execution:-*
>>
>>
>>
>> scala> carbon.sql("EXPLAIN select IDX_METRIC,IDX_RESOURCE
>> ,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
>> FROM MTRC00_NRAW_000_RECENT_1572019_2 group by
>> IDX_METRIC,IDX_RESOURCE").show()
>>
>> 2019-07-16 12:37:47 AUDIT audit:72 - {"time":"July 16, 2019 12:37:47 PM
>> IST","username":"root","opName":"EXPLAIN","opId":"11223276220632808","opStatus":"START"}
>>
>> 2019-07-16 12:37:47 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:37:47 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
>> Print query statistic for query id: 11223276310826076
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Module|      Operation Step|     Total Query Cost|              Query
>> Cost|
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>> |  Driver|  Load blocks driver|                     |
>> 16 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |    Part|    Block allocation|                  16
>> |                      0 |
>>
>> |        +--------------------+
>> +------------------------+
>>
>> |        |Block identification|
>> |                      0 |
>>
>>
>> +--------+--------------------+---------------------+------------------------+
>>
>>
>>
>> 2019-07-16 12:37:47 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
>> for Carbon Optimizer to optimize: 0
>>
>> 2019-07-16 12:37:47 AUDIT audit:93 - {"time":"July 16, 2019 12:37:47 PM
>> IST","username":"root","opName":"EXPLAIN","opId":"11223276220632808","opStatus":"SUCCESS","opTime":"288
>> ms","table":"NA","extraInfo":{"query":"'Aggregate ['IDX_METRIC,
>> 'IDX_RESOURCE], ['IDX_METRIC, 'IDX_RESOURCE,
>> unresolvedalias('sum('dbl_value), None), unresolvedalias('avg('dbl_value),
>> None), unresolvedalias('min('dbl_value), None),
>> unresolvedalias('max('dbl_value), None),
>> unresolvedalias('count('dbl_value), None)]"}}
>>
>> +--------------------+
>>
>> |                plan|
>>
>> +--------------------+
>>
>> |== CarbonData Pro...|
>>
>> |== Physical Plan ...|
>>
>> +--------------------+
>>
>> The required carbon property  enable.query.statistics = true has been
>> set .
>>
>>
>> BR,
>>
>> Deepak
>>
>> On Mon, Jul 15, 2019 at 11:24 AM Ravindra Pesala <ra...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Please provide the script/test case  you are using, I can try it.
>>>
>>> Regards,
>>> Ravindra
>>>
>>> On Fri, 12 Jul 2019 at 16:55, Deepak Kulkarni <de...@gmail.com>
>>> wrote:
>>>
>>>> thx. We tried this example but we did not get the similar output for
>>>> EXPLAIN PLAN command. Can yo help?
>>>> i
>>>>
>>>> On Fri, Jul 12, 2019 at 4:08 PM Ravindra Pesala <ra...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> You want to use pre-aggregate datamaps, please try the example
>>>>> `org.apache.carbondata.examples.PreAggregateDataMapExample` to understand
>>>>> how it works. Please note that we are going to obsolete pre-aggregate
>>>>> datamaps from next version and make MV(materialized view ) datamaps going
>>>>> to replace it.
>>>>>
>>>>> Regards,
>>>>> Ravindra.
>>>>>
>>>>> On Thu, 11 Jul 2019 at 17:40, Deepak Kulkarni <de...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thx. However we want to tame sure that the queries are using
>>>>>> pre-aggregate datamaps . We tried using explain plan but the output showed
>>>>>> in this link https://carbondata.apache.org/datamap-management.html we
>>>>>> are not getting.
>>>>>>
>>>>>> Can you help?
>>>>>>
>>>>>> BR,
>>>>>> Deepak
>>>>>>
>>>>>> On Tue, Jul 9, 2019 at 8:51 AM Ravindra Pesala <ra...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> 1. Currently there is no way to impose primary key indexes on
>>>>>>> carbon. We may consider it in future.
>>>>>>>
>>>>>>> 2. We have datamaps interface opened for implementing secondary
>>>>>>> indexes. Right now we have implemented for min/max , bloom indexes. User
>>>>>>> can add there own implementations as well.
>>>>>>>
>>>>>>> 3. There is no constraints supported on carbon right now.
>>>>>>>
>>>>>>> 4. Spark/hive has many predefined functions and also user can have
>>>>>>> there own user defined function and register to spark and use them carbon.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ravindra.
>>>>>>>
>>>>>>> On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> We are evaluating the Carbondata for following alternative to
>>>>>>>> Oracle –and hit following issues when achieving this migration -
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>    1. Primary Key Indexes – how we can avoid duplicate records on
>>>>>>>>    Carbon?
>>>>>>>>    2. Secondary Indexes
>>>>>>>>    3. Constraints such as NULL, NOT NULL – how can we set this at
>>>>>>>>    column levels in carbon?
>>>>>>>>    4. Functions such as DECODE – what alternatives we have in
>>>>>>>>    Carbon to use these SQL functions?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Appreciate the response.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>> Deepak
>>>>>>>>
>>>>>>> --
>>>>>>> Thanks & Regards,
>>>>>>> Ravi
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Deepak
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>> Ravi
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Deepak
>>>>
>>>
>>>
>>> --
>>> Thanks & Regards,
>>> Ravi
>>>
>>
>>
>> --
>> Regards,
>> Deepak
>>
>
>
> --
> Thanks & Regards,
> Ravi
>


-- 
Regards,
Deepak

Re: Info needed -

Posted by Ravindra Pesala <ra...@gmail.com>.

Hi Deepak,

I have verified your query and pre-aggregate works as expected.

spark.sql("DROP TABLE IF EXISTS MTRC00_NRAW_000_RECENT_1572019_2")
spark.sql("CREATE TABLE MTRC00_NRAW_000_RECENT_1572019_2 (IDX_METRIC
int, IDX_RESOURCE int, DTE_DATE int, DBL_VALUE int) stored by
'org.apache.carbondata.format'")

spark.sql("CREATE DATAMAP DM_preaggregate_new on table
MTRC00_NRAW_000_RECENT_1572019_2 USING 'preaggregate' AS select
IDX_METRIC,IDX_RESOURCE
,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
FROM MTRC00_NRAW_000_RECENT_1572019_2 group by
IDX_METRIC,IDX_RESOURCE")

// show datamap
spark.sql("show datamap on table MTRC00_NRAW_000_RECENT_1572019_2").show(false)

spark.sql("insert into MTRC00_NRAW_000_RECENT_1572019_2 select 1,1,1,1")

spark.sql("EXPLAIN select IDX_METRIC,IDX_RESOURCE
,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
FROM MTRC00_NRAW_000_RECENT_1572019_2 group by
IDX_METRIC,IDX_RESOURCE").show(false)


Output

===========

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|plan






















                 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|== Physical Plan ==
*(2) HashAggregate(keys=[mtrc00_nraw_000_recent_1572019_2_idx_metric#380,
mtrc00_nraw_000_recent_1572019_2_idx_resource#381],
functions=[sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L),
sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L as
double)), sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L
as double)), min(mtrc00_nraw_000_recent_1572019_2_dbl_value_min#384),
max(mtrc00_nraw_000_recent_1572019_2_dbl_value_max#385),
sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L)])
+- Exchange hashpartitioning(mtrc00_nraw_000_recent_1572019_2_idx_metric#380,
mtrc00_nraw_000_recent_1572019_2_idx_resource#381, 200)
   +- *(1) HashAggregate(keys=[mtrc00_nraw_000_recent_1572019_2_idx_metric#380,
mtrc00_nraw_000_recent_1572019_2_idx_resource#381],
functions=[partial_sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L),
partial_sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L
as double)), partial_sum(cast(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L
as double)), partial_min(mtrc00_nraw_000_recent_1572019_2_dbl_value_min#384),
partial_max(mtrc00_nraw_000_recent_1572019_2_dbl_value_max#385),
partial_sum(mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L)])
      +- *(1) FileScan carbondata
default.*mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new*[mtrc00_nraw_000_recent_1572019_2_idx_metric#380,mtrc00_nraw_000_recent_1572019_2_idx_resource#381,mtrc00_nraw_000_recent_1572019_2_dbl_value_sum#382L,mtrc00_nraw_000_recent_1572019_2_dbl_value_count#383L,mtrc00_nraw_000_recent_1572019_2_dbl_value_min#384,mtrc00_nraw_000_recent_1572019_2_dbl_value_max#385]|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


In the above explain plan clearly shows that it has choosen
pre-aggregate table. I am attaching the example scala file for your
reference.



Regards,
Ravindra.

On Tue, 16 Jul 2019 at 15:28, Deepak Kulkarni <de...@gmail.com> wrote:

> Hello,
>
> This is what we have been doing -
>
> Please find below steps that we followed to create tables and datamap and
> tried explain command to see if we can get some information about the use
> of datamap while query execution but we did not get the expected output :-
>
>
>
>    1. *Table creation:*
>
> *carbon.sql("CREATE TABLE MTRC00_NRAW_000_RECENT_1572019_2 (IDX_METRIC
> int, IDX_RESOURCE int, DTE_DATE int, DBL_VALUE int) stored by
> 'org.apache.carbondata.format'").show()*
>
> 2019-07-15 23:03:40 AUDIT audit:72 - {"time":"July 15, 2019 11:03:40 PM
> IST","username":"root","opName":"CREATE
> TABLE","opId":"11174429655716046","opStatus":"START"}
>
> 2019-07-15 23:03:40 WARN  HiveExternalCatalog:66 - Couldn't find
> corresponding Hive SerDe for data source provider
> org.apache.spark.sql.CarbonSource. Persisting data source table
> `default`.`mtrc00_nraw_000_recent_1572019_2` into Hive metastore in Spark
> SQL specific format, which is NOT compatible with Hive.
>
> 2019-07-15 23:03:40 AUDIT audit:93 - {"time":"July 15, 2019 11:03:40 PM
> IST","username":"root","opName":"CREATE
> TABLE","opId":"11174429655716046","opStatus":"SUCCESS","opTime":"87
> ms","table":"default.mtrc00_nraw_000_recent_1572019_2","extraInfo":{"bad_record_path":"","local_dictionary_enable":"true","external":"false","sort_columns":"","comment":""}}
>
> ++
>
> ||
>
> ++
>
> ++
>
>    1. *DataMap Creation:*
>
>
>
> *scala>* *carbon.sql("CREATE DATAMAP DM_preaggregate_new on table
> MTRC00_NRAW_000_RECENT_1572019_2 USING 'preaggregate' AS select
> IDX_METRIC,IDX_RESOURCE
> ,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
> FROM MTRC00_NRAW_000_RECENT_1572019_2 group by IDX_METRIC,IDX_RESOURCE")*
>
> 2019-07-15 23:05:35 AUDIT audit:72 - {"time":"July 15, 2019 11:05:35 PM
> IST","username":"root","opName":"CREATE
> DATAMAP","opId":"11174544343554709","opStatus":"START"}
>
> 2019-07-15 23:05:35 AUDIT audit:72 - {"time":"July 15, 2019 11:05:35 PM
> IST","username":"root","opName":"CREATE
> TABLE","opId":"11174544390284045","opStatus":"START"}
>
> 2019-07-15 23:05:35 WARN  HiveExternalCatalog:66 - Couldn't find
> corresponding Hive SerDe for data source provider
> org.apache.spark.sql.CarbonSource. Persisting data source table
> `default`.`mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new` into Hive
> metastore in Spark SQL specific format, which is NOT compatible with Hive.
>
> 2019-07-15 23:05:35 AUDIT audit:93 - {"time":"July 15, 2019 11:05:35 PM
> IST","username":"root","opName":"CREATE
> TABLE","opId":"11174544390284045","opStatus":"SUCCESS","opTime":"96
> ms","table":"default.mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new","extraInfo":{"local_dictionary_threshold":"10000","bad_record_path":"","table_blocksize":"1024","sort_scope":"LOCAL_SORT","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","_internal.deferred.rebuild":"false"}}
>
> 2019-07-15 23:05:35 AUDIT audit:93 - {"time":"July 15, 2019 11:05:35 PM
> IST","username":"root","opName":"CREATE
> DATAMAP","opId":"11174544343554709","opStatus":"SUCCESS","opTime":"231
> ms","table":"default.mtrc00_nraw_000_recent_1572019_2","extraInfo":{"provider":"preaggregate","dmName":"DM_preaggregate_new"}}
>
> res121: org.apache.spark.sql.DataFrame = []
>
>
>
>    1. *Data Loading:*
>
> *scala> carbon.sql("LOAD DATA inpath
> 'hdfs://10.46.40.149:9000/root/test/Wireline/NRAW/MTRC00_NRAW_000_RECENT.csv
> <http://10.46.40.149:9000/root/test/Wireline/NRAW/MTRC00_NRAW_000_RECENT.csv>'
> into table MTRC00_NRAW_000_RECENT_1572019_2
> options('BAD_RECORDS_ACTION'='FORCE','DELIMITER'=',','FILEHEADER'='IDX_METRIC,IDX_RESOURCE,DTE_DATE,DBL_VALUE')");*
>
> 2019-07-15 23:08:50 AUDIT audit:72 - {"time":"July 15, 2019 11:08:50 PM
> IST","username":"root","opName":"LOAD
> DATA","opId":"11174739779154705","opStatus":"START"}
>
> 2019-07-15 23:08:51 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-15 23:08:51 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11174740740416991
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 6 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   6 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
>    1. *Query Execution with EXPLAIN command:-*
>
>
>
> scala> *carbon.sql("EXPLAIN select groupNumber,
> resourceNumber,SUM(VsumM1) as VsumM1, SUM(VcountM1) as VcountM1,
> SUM(VsumM2) as VsumM2, SUM(VcountM2) as VcountM2, SUM(VsumM3) as VsumM3,
> SUM(VcountM3) as VcountM3, SUM(VsumM4) as VsumM4, SUM(VcountM4) as
> VcountM4, SUM(VsumM5) as VsumM5, SUM(VcountM5) as VcountM5,(SELECT
> MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2210 AND
> FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE
> FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
> FRML1.STR_ACTION in ('I','U'))) as M1NAME, (SELECT MIN(STR_NAME) FROM
> FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2211 AND FRML.DTE_DATE = (SELECT
> MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE FRML1.IDX_IND = FRML.IDX_IND
> AND FRML1.DTE_DATE < 1561507200 AND FRML1.STR_ACTION in ('I','U'))) as
> M2NAME, (SELECT MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND =
> 2214 AND FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1
> WHERE FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
> FRML1.STR_ACTION in ('I','U'))) as M3NAME, (SELECT MIN(STR_NAME) FROM
> FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2215 AND FRML.DTE_DATE = (SELECT
> MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE FRML1.IDX_IND = FRML.IDX_IND
> AND FRML1.DTE_DATE < 1561507200 AND FRML1.STR_ACTION in ('I','U'))) as
> M4NAME, (SELECT MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND =
> 2234 AND FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1
> WHERE FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
> FRML1.STR_ACTION in ('I','U'))) as M5NAME FROM (select S1.IDX_GROUP as
> groupNumber ,M1.idx_resource as resourceNumber ,CASE when
> m1.idx_metric=2210 then SUM(M1.dbl_value) end as VsumM1,CASE when
> m1.idx_metric=2210 then count(m1.dbl_value) end as VcountM1,CASE when
> m1.idx_metric=2211 then SUM(M1.dbl_value) end as VsumM2,CASE when
> m1.idx_metric=2211 then count(m1.dbl_value) end as VcountM2,CASE when
> m1.idx_metric=2214 then SUM(M1.dbl_value) end as VsumM3,CASE when
> m1.idx_metric=2214 then count(m1.dbl_value) end as VcountM3,CASE when
> m1.idx_metric=2215 then SUM(M1.dbl_value) end as VsumM4,CASE when
> m1.idx_metric=2215 then count(m1.dbl_value) end as VcountM4,CASE when
> m1.idx_metric=2234 then SUM(M1.dbl_value) end as VsumM5,CASE when
> m1.idx_metric=2234 then count(m1.dbl_value) end as VcountM5 FROM
> MTRC00_NRAW_000_RECENT_1572019_2  M1, SE_GRP_MEMBER_HIST S1 where
> M1.idx_resource=S1.IDX_ALIAS_INST and S1.IDX_GROUP=200075906 and
> M1.idx_resource= 200071954 and M1.IDX_METRIC in (2210,2211,2214,2215,2234)
> group by M1.IDX_METRIC,M1.idx_resource,S1.IDX_GROUP) GROUP BY
> groupNumber,resourceNumber HAVING SUM(VcountM1) <> 0 OR SUM(VcountM2) <> 0
> OR SUM(VcountM3) <> 0 OR SUM(VcountM4) <> 0 OR SUM(VcountM5) <> 0").show()*
>
> 2019-07-16 12:33:25 AUDIT audit:72 - {"time":"July 16, 2019 12:33:25 PM
> IST","username":"root","opName":"EXPLAIN","opId":"11223014419086786","opStatus":"START"}
>
> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 2
>
> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 1
>
> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 1
>
> 2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 1
>
> 2019-07-16 12:33:25 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014841650012
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 10 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                  10 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014952642509
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 9 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   9 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014986644062
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 9 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   9 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014913695441
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 9 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   9 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223015020278657
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 11 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                  11 |
> 0 |
>
> |        +--------------------+
>    +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014881878925
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 9 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   9 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014903200116
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 3 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   4 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 1 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014976273697
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 4 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   4 |
> 0 |
>
> |        +--------------------+
>            +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223015007180119
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 4 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   4 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014939414859
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 6 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   6 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014868807332
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 3 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   3 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:27 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223014828073061
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 3 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                   3 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 1
>
> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 1
>
> 2019-07-16 12:33:27 AUDIT audit:93 - {"time":"July 16, 2019 12:33:27 PM
> IST","username":"root","opName":"EXPLAIN","opId":"11223014419086786","opStatus":"SUCCESS","opTime":"2570
> ms","table":"NA","extraInfo":{"query":"'Filter (((NOT ('SUM('VcountM1) = 0)
> || NOT ('SUM('VcountM2) = 0)) || NOT ('SUM('VcountM3) = 0)) || (NOT
> ('SUM('VcountM4) = 0) || NOT ('SUM('VcountM5) = 0)))"}}
>
> +--------------------+
>
> |                plan|
>
> +--------------------+
>
> |== CarbonData Pro...|
>
> |== Physical Plan ...|
>
> +--------------------+
>
>    1. *Even with simple select query we do not see any reference of
>    datamp being getting hit while execution:-*
>
>
>
> scala> carbon.sql("EXPLAIN select IDX_METRIC,IDX_RESOURCE
> ,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
> FROM MTRC00_NRAW_000_RECENT_1572019_2 group by
> IDX_METRIC,IDX_RESOURCE").show()
>
> 2019-07-16 12:37:47 AUDIT audit:72 - {"time":"July 16, 2019 12:37:47 PM
> IST","username":"root","opName":"EXPLAIN","opId":"11223276220632808","opStatus":"START"}
>
> 2019-07-16 12:37:47 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:37:47 STATISTIC DriverQueryStatisticsRecorderImpl:123 -
> Print query statistic for query id: 11223276310826076
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Module|      Operation Step|     Total Query Cost|              Query
> Cost|
>
>
> +--------+--------------------+---------------------+------------------------+
>
> |  Driver|  Load blocks driver|                     |
> 16 |
>
> |        +--------------------+
> +------------------------+
>
> |    Part|    Block allocation|                  16 |
> 0 |
>
> |        +--------------------+
> +------------------------+
>
> |        |Block identification|                     |
> 0 |
>
>
> +--------+--------------------+---------------------+------------------------+
>
>
>
> 2019-07-16 12:37:47 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
> for Carbon Optimizer to optimize: 0
>
> 2019-07-16 12:37:47 AUDIT audit:93 - {"time":"July 16, 2019 12:37:47 PM
> IST","username":"root","opName":"EXPLAIN","opId":"11223276220632808","opStatus":"SUCCESS","opTime":"288
> ms","table":"NA","extraInfo":{"query":"'Aggregate ['IDX_METRIC,
> 'IDX_RESOURCE], ['IDX_METRIC, 'IDX_RESOURCE,
> unresolvedalias('sum('dbl_value), None), unresolvedalias('avg('dbl_value),
> None), unresolvedalias('min('dbl_value), None),
> unresolvedalias('max('dbl_value), None),
> unresolvedalias('count('dbl_value), None)]"}}
>
> +--------------------+
>
> |                plan|
>
> +--------------------+
>
> |== CarbonData Pro...|
>
> |== Physical Plan ...|
>
> +--------------------+
>
> The required carbon property  enable.query.statistics = true has been set
> .
>
>
> BR,
>
> Deepak
>
> On Mon, Jul 15, 2019 at 11:24 AM Ravindra Pesala <ra...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Please provide the script/test case  you are using, I can try it.
>>
>> Regards,
>> Ravindra
>>
>> On Fri, 12 Jul 2019 at 16:55, Deepak Kulkarni <de...@gmail.com>
>> wrote:
>>
>>> thx. We tried this example but we did not get the similar output for
>>> EXPLAIN PLAN command. Can yo help?
>>> i
>>>
>>> On Fri, Jul 12, 2019 at 4:08 PM Ravindra Pesala <ra...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> You want to use pre-aggregate datamaps, please try the example
>>>> `org.apache.carbondata.examples.PreAggregateDataMapExample` to understand
>>>> how it works. Please note that we are going to obsolete pre-aggregate
>>>> datamaps from next version and make MV(materialized view ) datamaps going
>>>> to replace it.
>>>>
>>>> Regards,
>>>> Ravindra.
>>>>
>>>> On Thu, 11 Jul 2019 at 17:40, Deepak Kulkarni <de...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thx. However we want to tame sure that the queries are using
>>>>> pre-aggregate datamaps . We tried using explain plan but the output showed
>>>>> in this link https://carbondata.apache.org/datamap-management.html we
>>>>> are not getting.
>>>>>
>>>>> Can you help?
>>>>>
>>>>> BR,
>>>>> Deepak
>>>>>
>>>>> On Tue, Jul 9, 2019 at 8:51 AM Ravindra Pesala <ra...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> 1. Currently there is no way to impose primary key indexes on carbon.
>>>>>> We may consider it in future.
>>>>>>
>>>>>> 2. We have datamaps interface opened for implementing secondary
>>>>>> indexes. Right now we have implemented for min/max , bloom indexes. User
>>>>>> can add there own implementations as well.
>>>>>>
>>>>>> 3. There is no constraints supported on carbon right now.
>>>>>>
>>>>>> 4. Spark/hive has many predefined functions and also user can have
>>>>>> there own user defined function and register to spark and use them carbon.
>>>>>>
>>>>>> Regards,
>>>>>> Ravindra.
>>>>>>
>>>>>> On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> We are evaluating the Carbondata for following alternative to Oracle
>>>>>>> –and hit following issues when achieving this migration -
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>    1. Primary Key Indexes – how we can avoid duplicate records on
>>>>>>>    Carbon?
>>>>>>>    2. Secondary Indexes
>>>>>>>    3. Constraints such as NULL, NOT NULL – how can we set this at
>>>>>>>    column levels in carbon?
>>>>>>>    4. Functions such as DECODE – what alternatives we have in
>>>>>>>    Carbon to use these SQL functions?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Appreciate the response.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Deepak
>>>>>>>
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Ravi
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Deepak
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Ravi
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Deepak
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Ravi
>>
>
>
> --
> Regards,
> Deepak
>


-- 
Thanks & Regards,
Ravi

Re: Info needed -

Posted by Deepak Kulkarni <de...@gmail.com>.

Hello,

This is what we have been doing -

Please find below steps that we followed to create tables and datamap and
tried explain command to see if we can get some information about the use
of datamap while query execution but we did not get the expected output :-



   1. *Table creation:*

*carbon.sql("CREATE TABLE MTRC00_NRAW_000_RECENT_1572019_2 (IDX_METRIC int,
IDX_RESOURCE int, DTE_DATE int, DBL_VALUE int) stored by
'org.apache.carbondata.format'").show()*

2019-07-15 23:03:40 AUDIT audit:72 - {"time":"July 15, 2019 11:03:40 PM
IST","username":"root","opName":"CREATE
TABLE","opId":"11174429655716046","opStatus":"START"}

2019-07-15 23:03:40 WARN  HiveExternalCatalog:66 - Couldn't find
corresponding Hive SerDe for data source provider
org.apache.spark.sql.CarbonSource. Persisting data source table
`default`.`mtrc00_nraw_000_recent_1572019_2` into Hive metastore in Spark
SQL specific format, which is NOT compatible with Hive.

2019-07-15 23:03:40 AUDIT audit:93 - {"time":"July 15, 2019 11:03:40 PM
IST","username":"root","opName":"CREATE
TABLE","opId":"11174429655716046","opStatus":"SUCCESS","opTime":"87
ms","table":"default.mtrc00_nraw_000_recent_1572019_2","extraInfo":{"bad_record_path":"","local_dictionary_enable":"true","external":"false","sort_columns":"","comment":""}}

++

||

++

++

   1. *DataMap Creation:*



*scala>* *carbon.sql("CREATE DATAMAP DM_preaggregate_new on table
MTRC00_NRAW_000_RECENT_1572019_2 USING 'preaggregate' AS select
IDX_METRIC,IDX_RESOURCE
,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
FROM MTRC00_NRAW_000_RECENT_1572019_2 group by IDX_METRIC,IDX_RESOURCE")*

2019-07-15 23:05:35 AUDIT audit:72 - {"time":"July 15, 2019 11:05:35 PM
IST","username":"root","opName":"CREATE
DATAMAP","opId":"11174544343554709","opStatus":"START"}

2019-07-15 23:05:35 AUDIT audit:72 - {"time":"July 15, 2019 11:05:35 PM
IST","username":"root","opName":"CREATE
TABLE","opId":"11174544390284045","opStatus":"START"}

2019-07-15 23:05:35 WARN  HiveExternalCatalog:66 - Couldn't find
corresponding Hive SerDe for data source provider
org.apache.spark.sql.CarbonSource. Persisting data source table
`default`.`mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new` into Hive
metastore in Spark SQL specific format, which is NOT compatible with Hive.

2019-07-15 23:05:35 AUDIT audit:93 - {"time":"July 15, 2019 11:05:35 PM
IST","username":"root","opName":"CREATE
TABLE","opId":"11174544390284045","opStatus":"SUCCESS","opTime":"96
ms","table":"default.mtrc00_nraw_000_recent_1572019_2_dm_preaggregate_new","extraInfo":{"local_dictionary_threshold":"10000","bad_record_path":"","table_blocksize":"1024","sort_scope":"LOCAL_SORT","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","_internal.deferred.rebuild":"false"}}

2019-07-15 23:05:35 AUDIT audit:93 - {"time":"July 15, 2019 11:05:35 PM
IST","username":"root","opName":"CREATE
DATAMAP","opId":"11174544343554709","opStatus":"SUCCESS","opTime":"231
ms","table":"default.mtrc00_nraw_000_recent_1572019_2","extraInfo":{"provider":"preaggregate","dmName":"DM_preaggregate_new"}}

res121: org.apache.spark.sql.DataFrame = []



   1. *Data Loading:*

*scala> carbon.sql("LOAD DATA inpath
'hdfs://10.46.40.149:9000/root/test/Wireline/NRAW/MTRC00_NRAW_000_RECENT.csv
<http://10.46.40.149:9000/root/test/Wireline/NRAW/MTRC00_NRAW_000_RECENT.csv>'
into table MTRC00_NRAW_000_RECENT_1572019_2
options('BAD_RECORDS_ACTION'='FORCE','DELIMITER'=',','FILEHEADER'='IDX_METRIC,IDX_RESOURCE,DTE_DATE,DBL_VALUE')");*

2019-07-15 23:08:50 AUDIT audit:72 - {"time":"July 15, 2019 11:08:50 PM
IST","username":"root","opName":"LOAD
DATA","opId":"11174739779154705","opStatus":"START"}

2019-07-15 23:08:51 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-15 23:08:51 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11174740740416991

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
6 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   6 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



   1. *Query Execution with EXPLAIN command:-*



scala> *carbon.sql("EXPLAIN select groupNumber, resourceNumber,SUM(VsumM1)
as VsumM1, SUM(VcountM1) as VcountM1, SUM(VsumM2) as VsumM2, SUM(VcountM2)
as VcountM2, SUM(VsumM3) as VsumM3, SUM(VcountM3) as VcountM3, SUM(VsumM4)
as VsumM4, SUM(VcountM4) as VcountM4, SUM(VsumM5) as VsumM5, SUM(VcountM5)
as VcountM5,(SELECT MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE
FRML.IDX_IND = 2210 AND FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM
FRML_DESC_HIST FRML1 WHERE FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE
< 1561507200 AND FRML1.STR_ACTION in ('I','U'))) as M1NAME, (SELECT
MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2211 AND
FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE
FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
FRML1.STR_ACTION in ('I','U'))) as M2NAME, (SELECT MIN(STR_NAME) FROM
FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2214 AND FRML.DTE_DATE = (SELECT
MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE FRML1.IDX_IND = FRML.IDX_IND
AND FRML1.DTE_DATE < 1561507200 AND FRML1.STR_ACTION in ('I','U'))) as
M3NAME, (SELECT MIN(STR_NAME) FROM FRML_DESC_HIST FRML WHERE FRML.IDX_IND =
2215 AND FRML.DTE_DATE = (SELECT MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1
WHERE FRML1.IDX_IND = FRML.IDX_IND AND FRML1.DTE_DATE < 1561507200 AND
FRML1.STR_ACTION in ('I','U'))) as M4NAME, (SELECT MIN(STR_NAME) FROM
FRML_DESC_HIST FRML WHERE FRML.IDX_IND = 2234 AND FRML.DTE_DATE = (SELECT
MAX(DTE_DATE) FROM FRML_DESC_HIST FRML1 WHERE FRML1.IDX_IND = FRML.IDX_IND
AND FRML1.DTE_DATE < 1561507200 AND FRML1.STR_ACTION in ('I','U'))) as
M5NAME FROM (select S1.IDX_GROUP as groupNumber ,M1.idx_resource as
resourceNumber ,CASE when m1.idx_metric=2210 then SUM(M1.dbl_value) end as
VsumM1,CASE when m1.idx_metric=2210 then count(m1.dbl_value) end as
VcountM1,CASE when m1.idx_metric=2211 then SUM(M1.dbl_value) end as
VsumM2,CASE when m1.idx_metric=2211 then count(m1.dbl_value) end as
VcountM2,CASE when m1.idx_metric=2214 then SUM(M1.dbl_value) end as
VsumM3,CASE when m1.idx_metric=2214 then count(m1.dbl_value) end as
VcountM3,CASE when m1.idx_metric=2215 then SUM(M1.dbl_value) end as
VsumM4,CASE when m1.idx_metric=2215 then count(m1.dbl_value) end as
VcountM4,CASE when m1.idx_metric=2234 then SUM(M1.dbl_value) end as
VsumM5,CASE when m1.idx_metric=2234 then count(m1.dbl_value) end as
VcountM5 FROM MTRC00_NRAW_000_RECENT_1572019_2  M1, SE_GRP_MEMBER_HIST S1
where M1.idx_resource=S1.IDX_ALIAS_INST and S1.IDX_GROUP=200075906 and
M1.idx_resource= 200071954 and M1.IDX_METRIC in (2210,2211,2214,2215,2234)
group by M1.IDX_METRIC,M1.idx_resource,S1.IDX_GROUP) GROUP BY
groupNumber,resourceNumber HAVING SUM(VcountM1) <> 0 OR SUM(VcountM2) <> 0
OR SUM(VcountM3) <> 0 OR SUM(VcountM4) <> 0 OR SUM(VcountM5) <> 0").show()*

2019-07-16 12:33:25 AUDIT audit:72 - {"time":"July 16, 2019 12:33:25 PM
IST","username":"root","opName":"EXPLAIN","opId":"11223014419086786","opStatus":"START"}

2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 2

2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 1

2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 1

2019-07-16 12:33:25 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 1

2019-07-16 12:33:25 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014841650012

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
10 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                  10 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014952642509

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
9 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   9 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014986644062

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
9 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   9 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014913695441

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
9 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   9 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223015020278657

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
11 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                  11 |
0 |

|        +--------------------+
   +------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014881878925

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
9 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   9 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014903200116

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
3 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   4 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
1 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014976273697

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
4 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   4 |
0 |

|        +--------------------+
           +------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223015007180119

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
4 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   4 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014939414859

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
6 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   6 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:26 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014868807332

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
3 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   3 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:27 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223014828073061

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
3 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                   3 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 1

2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:33:27 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 1

2019-07-16 12:33:27 AUDIT audit:93 - {"time":"July 16, 2019 12:33:27 PM
IST","username":"root","opName":"EXPLAIN","opId":"11223014419086786","opStatus":"SUCCESS","opTime":"2570
ms","table":"NA","extraInfo":{"query":"'Filter (((NOT ('SUM('VcountM1) = 0)
|| NOT ('SUM('VcountM2) = 0)) || NOT ('SUM('VcountM3) = 0)) || (NOT
('SUM('VcountM4) = 0) || NOT ('SUM('VcountM5) = 0)))"}}

+--------------------+

|                plan|

+--------------------+

|== CarbonData Pro...|

|== Physical Plan ...|

+--------------------+

   1. *Even with simple select query we do not see any reference of datamp
   being getting hit while execution:-*



scala> carbon.sql("EXPLAIN select IDX_METRIC,IDX_RESOURCE
,sum(dbl_value),avg(dbl_value),min(dbl_value),max(dbl_value),count(dbl_value)
FROM MTRC00_NRAW_000_RECENT_1572019_2 group by
IDX_METRIC,IDX_RESOURCE").show()

2019-07-16 12:37:47 AUDIT audit:72 - {"time":"July 16, 2019 12:37:47 PM
IST","username":"root","opName":"EXPLAIN","opId":"11223276220632808","opStatus":"START"}

2019-07-16 12:37:47 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:37:47 STATISTIC DriverQueryStatisticsRecorderImpl:123 - Print
query statistic for query id: 11223276310826076

+--------+--------------------+---------------------+------------------------+

|  Module|      Operation Step|     Total Query Cost|              Query
Cost|

+--------+--------------------+---------------------+------------------------+

|  Driver|  Load blocks driver|                     |
16 |

|        +--------------------+
+------------------------+

|    Part|    Block allocation|                  16 |
0 |

|        +--------------------+
+------------------------+

|        |Block identification|                     |
0 |

+--------+--------------------+---------------------+------------------------+



2019-07-16 12:37:47 STATISTIC QueryStatisticsRecorderImpl:72 - Time taken
for Carbon Optimizer to optimize: 0

2019-07-16 12:37:47 AUDIT audit:93 - {"time":"July 16, 2019 12:37:47 PM
IST","username":"root","opName":"EXPLAIN","opId":"11223276220632808","opStatus":"SUCCESS","opTime":"288
ms","table":"NA","extraInfo":{"query":"'Aggregate ['IDX_METRIC,
'IDX_RESOURCE], ['IDX_METRIC, 'IDX_RESOURCE,
unresolvedalias('sum('dbl_value), None), unresolvedalias('avg('dbl_value),
None), unresolvedalias('min('dbl_value), None),
unresolvedalias('max('dbl_value), None),
unresolvedalias('count('dbl_value), None)]"}}

+--------------------+

|                plan|

+--------------------+

|== CarbonData Pro...|

|== Physical Plan ...|

+--------------------+

The required carbon property  enable.query.statistics = true has been set .


BR,

Deepak

On Mon, Jul 15, 2019 at 11:24 AM Ravindra Pesala <ra...@gmail.com>
wrote:

> Hi,
>
> Please provide the script/test case  you are using, I can try it.
>
> Regards,
> Ravindra
>
> On Fri, 12 Jul 2019 at 16:55, Deepak Kulkarni <de...@gmail.com>
> wrote:
>
>> thx. We tried this example but we did not get the similar output for
>> EXPLAIN PLAN command. Can yo help?
>> i
>>
>> On Fri, Jul 12, 2019 at 4:08 PM Ravindra Pesala <ra...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> You want to use pre-aggregate datamaps, please try the example
>>> `org.apache.carbondata.examples.PreAggregateDataMapExample` to understand
>>> how it works. Please note that we are going to obsolete pre-aggregate
>>> datamaps from next version and make MV(materialized view ) datamaps going
>>> to replace it.
>>>
>>> Regards,
>>> Ravindra.
>>>
>>> On Thu, 11 Jul 2019 at 17:40, Deepak Kulkarni <de...@gmail.com>
>>> wrote:
>>>
>>>> Thx. However we want to tame sure that the queries are using
>>>> pre-aggregate datamaps . We tried using explain plan but the output showed
>>>> in this link https://carbondata.apache.org/datamap-management.html we
>>>> are not getting.
>>>>
>>>> Can you help?
>>>>
>>>> BR,
>>>> Deepak
>>>>
>>>> On Tue, Jul 9, 2019 at 8:51 AM Ravindra Pesala <ra...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> 1. Currently there is no way to impose primary key indexes on carbon.
>>>>> We may consider it in future.
>>>>>
>>>>> 2. We have datamaps interface opened for implementing secondary
>>>>> indexes. Right now we have implemented for min/max , bloom indexes. User
>>>>> can add there own implementations as well.
>>>>>
>>>>> 3. There is no constraints supported on carbon right now.
>>>>>
>>>>> 4. Spark/hive has many predefined functions and also user can have
>>>>> there own user defined function and register to spark and use them carbon.
>>>>>
>>>>> Regards,
>>>>> Ravindra.
>>>>>
>>>>> On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>>
>>>>>> We are evaluating the Carbondata for following alternative to Oracle
>>>>>> –and hit following issues when achieving this migration -
>>>>>>
>>>>>>
>>>>>>
>>>>>>    1. Primary Key Indexes – how we can avoid duplicate records on
>>>>>>    Carbon?
>>>>>>    2. Secondary Indexes
>>>>>>    3. Constraints such as NULL, NOT NULL – how can we set this at
>>>>>>    column levels in carbon?
>>>>>>    4. Functions such as DECODE – what alternatives we have in Carbon
>>>>>>    to use these SQL functions?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Appreciate the response.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Deepak
>>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>> Ravi
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Deepak
>>>>
>>>
>>>
>>> --
>>> Thanks & Regards,
>>> Ravi
>>>
>>
>>
>> --
>> Regards,
>> Deepak
>>
>
>
> --
> Thanks & Regards,
> Ravi
>


-- 
Regards,
Deepak

Re: Info needed -

Posted by Ravindra Pesala <ra...@gmail.com>.

Hi,

Please provide the script/test case  you are using, I can try it.

Regards,
Ravindra

On Fri, 12 Jul 2019 at 16:55, Deepak Kulkarni <de...@gmail.com> wrote:

> thx. We tried this example but we did not get the similar output for
> EXPLAIN PLAN command. Can yo help?
> i
>
> On Fri, Jul 12, 2019 at 4:08 PM Ravindra Pesala <ra...@gmail.com>
> wrote:
>
>> Hi,
>>
>> You want to use pre-aggregate datamaps, please try the example
>> `org.apache.carbondata.examples.PreAggregateDataMapExample` to understand
>> how it works. Please note that we are going to obsolete pre-aggregate
>> datamaps from next version and make MV(materialized view ) datamaps going
>> to replace it.
>>
>> Regards,
>> Ravindra.
>>
>> On Thu, 11 Jul 2019 at 17:40, Deepak Kulkarni <de...@gmail.com>
>> wrote:
>>
>>> Thx. However we want to tame sure that the queries are using
>>> pre-aggregate datamaps . We tried using explain plan but the output showed
>>> in this link https://carbondata.apache.org/datamap-management.html we
>>> are not getting.
>>>
>>> Can you help?
>>>
>>> BR,
>>> Deepak
>>>
>>> On Tue, Jul 9, 2019 at 8:51 AM Ravindra Pesala <ra...@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> 1. Currently there is no way to impose primary key indexes on carbon.
>>>> We may consider it in future.
>>>>
>>>> 2. We have datamaps interface opened for implementing secondary
>>>> indexes. Right now we have implemented for min/max , bloom indexes. User
>>>> can add there own implementations as well.
>>>>
>>>> 3. There is no constraints supported on carbon right now.
>>>>
>>>> 4. Spark/hive has many predefined functions and also user can have
>>>> there own user defined function and register to spark and use them carbon.
>>>>
>>>> Regards,
>>>> Ravindra.
>>>>
>>>> On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>>
>>>>> We are evaluating the Carbondata for following alternative to Oracle
>>>>> –and hit following issues when achieving this migration -
>>>>>
>>>>>
>>>>>
>>>>>    1. Primary Key Indexes – how we can avoid duplicate records on
>>>>>    Carbon?
>>>>>    2. Secondary Indexes
>>>>>    3. Constraints such as NULL, NOT NULL – how can we set this at
>>>>>    column levels in carbon?
>>>>>    4. Functions such as DECODE – what alternatives we have in Carbon
>>>>>    to use these SQL functions?
>>>>>
>>>>>
>>>>>
>>>>> Appreciate the response.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Deepak
>>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Ravi
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Deepak
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Ravi
>>
>
>
> --
> Regards,
> Deepak
>


-- 
Thanks & Regards,
Ravi

Re: Info needed -

Posted by Deepak Kulkarni <de...@gmail.com>.

thx. We tried this example but we did not get the similar output for
EXPLAIN PLAN command. Can yo help?
i

On Fri, Jul 12, 2019 at 4:08 PM Ravindra Pesala <ra...@gmail.com>
wrote:

> Hi,
>
> You want to use pre-aggregate datamaps, please try the example
> `org.apache.carbondata.examples.PreAggregateDataMapExample` to understand
> how it works. Please note that we are going to obsolete pre-aggregate
> datamaps from next version and make MV(materialized view ) datamaps going
> to replace it.
>
> Regards,
> Ravindra.
>
> On Thu, 11 Jul 2019 at 17:40, Deepak Kulkarni <de...@gmail.com>
> wrote:
>
>> Thx. However we want to tame sure that the queries are using
>> pre-aggregate datamaps . We tried using explain plan but the output showed
>> in this link https://carbondata.apache.org/datamap-management.html we
>> are not getting.
>>
>> Can you help?
>>
>> BR,
>> Deepak
>>
>> On Tue, Jul 9, 2019 at 8:51 AM Ravindra Pesala <ra...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> 1. Currently there is no way to impose primary key indexes on carbon. We
>>> may consider it in future.
>>>
>>> 2. We have datamaps interface opened for implementing secondary indexes.
>>> Right now we have implemented for min/max , bloom indexes. User can add
>>> there own implementations as well.
>>>
>>> 3. There is no constraints supported on carbon right now.
>>>
>>> 4. Spark/hive has many predefined functions and also user can have there
>>> own user defined function and register to spark and use them carbon.
>>>
>>> Regards,
>>> Ravindra.
>>>
>>> On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> We are evaluating the Carbondata for following alternative to Oracle
>>>> –and hit following issues when achieving this migration -
>>>>
>>>>
>>>>
>>>>    1. Primary Key Indexes – how we can avoid duplicate records on
>>>>    Carbon?
>>>>    2. Secondary Indexes
>>>>    3. Constraints such as NULL, NOT NULL – how can we set this at
>>>>    column levels in carbon?
>>>>    4. Functions such as DECODE – what alternatives we have in Carbon
>>>>    to use these SQL functions?
>>>>
>>>>
>>>>
>>>> Appreciate the response.
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Deepak
>>>>
>>> --
>>> Thanks & Regards,
>>> Ravi
>>>
>>
>>
>> --
>> Regards,
>> Deepak
>>
>
>
> --
> Thanks & Regards,
> Ravi
>


-- 
Regards,
Deepak

Re: Info needed -

Posted by Ravindra Pesala <ra...@gmail.com>.

Hi,

You want to use pre-aggregate datamaps, please try the example
`org.apache.carbondata.examples.PreAggregateDataMapExample` to understand
how it works. Please note that we are going to obsolete pre-aggregate
datamaps from next version and make MV(materialized view ) datamaps going
to replace it.

Regards,
Ravindra.

On Thu, 11 Jul 2019 at 17:40, Deepak Kulkarni <de...@gmail.com> wrote:

> Thx. However we want to tame sure that the queries are using pre-aggregate
> datamaps . We tried using explain plan but the output showed in this link
> https://carbondata.apache.org/datamap-management.html we are not getting.
>
> Can you help?
>
> BR,
> Deepak
>
> On Tue, Jul 9, 2019 at 8:51 AM Ravindra Pesala <ra...@gmail.com>
> wrote:
>
>> Hello,
>>
>> 1. Currently there is no way to impose primary key indexes on carbon. We
>> may consider it in future.
>>
>> 2. We have datamaps interface opened for implementing secondary indexes.
>> Right now we have implemented for min/max , bloom indexes. User can add
>> there own implementations as well.
>>
>> 3. There is no constraints supported on carbon right now.
>>
>> 4. Spark/hive has many predefined functions and also user can have there
>> own user defined function and register to spark and use them carbon.
>>
>> Regards,
>> Ravindra.
>>
>> On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> We are evaluating the Carbondata for following alternative to Oracle
>>> –and hit following issues when achieving this migration -
>>>
>>>
>>>
>>>    1. Primary Key Indexes – how we can avoid duplicate records on
>>>    Carbon?
>>>    2. Secondary Indexes
>>>    3. Constraints such as NULL, NOT NULL – how can we set this at
>>>    column levels in carbon?
>>>    4. Functions such as DECODE – what alternatives we have in Carbon to
>>>    use these SQL functions?
>>>
>>>
>>>
>>> Appreciate the response.
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Deepak
>>>
>> --
>> Thanks & Regards,
>> Ravi
>>
>
>
> --
> Regards,
> Deepak
>


-- 
Thanks & Regards,
Ravi

Re: Info needed -

Posted by Deepak Kulkarni <de...@gmail.com>.

Thx. However we want to tame sure that the queries are using pre-aggregate
datamaps . We tried using explain plan but the output showed in this link
https://carbondata.apache.org/datamap-management.html we are not getting.

Can you help?

BR,
Deepak

On Tue, Jul 9, 2019 at 8:51 AM Ravindra Pesala <ra...@gmail.com>
wrote:

> Hello,
>
> 1. Currently there is no way to impose primary key indexes on carbon. We
> may consider it in future.
>
> 2. We have datamaps interface opened for implementing secondary indexes.
> Right now we have implemented for min/max , bloom indexes. User can add
> there own implementations as well.
>
> 3. There is no constraints supported on carbon right now.
>
> 4. Spark/hive has many predefined functions and also user can have there
> own user defined function and register to spark and use them carbon.
>
> Regards,
> Ravindra.
>
> On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com>
> wrote:
>
>> Hi,
>>
>>
>>
>> We are evaluating the Carbondata for following alternative to Oracle –and
>> hit following issues when achieving this migration -
>>
>>
>>
>>    1. Primary Key Indexes – how we can avoid duplicate records on Carbon?
>>    2. Secondary Indexes
>>    3. Constraints such as NULL, NOT NULL – how can we set this at column
>>    levels in carbon?
>>    4. Functions such as DECODE – what alternatives we have in Carbon to
>>    use these SQL functions?
>>
>>
>>
>> Appreciate the response.
>>
>>
>>
>> --
>> Regards,
>> Deepak
>>
> --
> Thanks & Regards,
> Ravi
>


-- 
Regards,
Deepak

Re: Info needed -

Posted by Ravindra Pesala <ra...@gmail.com>.

Hello,

1. Currently there is no way to impose primary key indexes on carbon. We
may consider it in future.

2. We have datamaps interface opened for implementing secondary indexes.
Right now we have implemented for min/max , bloom indexes. User can add
there own implementations as well.

3. There is no constraints supported on carbon right now.

4. Spark/hive has many predefined functions and also user can have there
own user defined function and register to spark and use them carbon.

Regards,
Ravindra.

On Tue, 2 Jul 2019 at 2:52 PM, Deepak Kulkarni <de...@gmail.com> wrote:

> Hi,
>
>
>
> We are evaluating the Carbondata for following alternative to Oracle –and
> hit following issues when achieving this migration -
>
>
>
>    1. Primary Key Indexes – how we can avoid duplicate records on Carbon?
>    2. Secondary Indexes
>    3. Constraints such as NULL, NOT NULL – how can we set this at column
>    levels in carbon?
>    4. Functions such as DECODE – what alternatives we have in Carbon to
>    use these SQL functions?
>
>
>
> Appreciate the response.
>
>
>
> --
> Regards,
> Deepak
>
-- 
Thanks & Regards,
Ravi