Posted to issues@carbondata.apache.org by "xubo245 (JIRA)" <ji...@apache.org> on 2018/02/03 07:35:00 UTC

[jira] [Commented] (CARBONDATA-2085) It's different between load twice and create datamap with load again after load data and create datamap

    [ https://issues.apache.org/jira/browse/CARBONDATA-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351298#comment-16351298 ] 

xubo245 commented on CARBONDATA-2085:
-------------------------------------

df is the result of the select on mainTable, so you can ignore "sql("select * from maintable_agg0_minute").show(100)".

Why does the check below fail in test case 1 but succeed in test case 2?


{code:java}
checkAnswer(df,
       Seq(Row(Timestamp.valueOf("2016-02-23 01:01:00"), 120),
         Row(Timestamp.valueOf("2016-02-23 01:02:00"), 280)))
 
{code}


> It's different between load twice and create datamap with load again after load data and create datamap
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2085
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2085
>             Project: CarbonData
>          Issue Type: Bug
>          Components: core, spark-integration
>    Affects Versions: 1.3.0
>            Reporter: xubo245
>            Priority: Major
>             Fix For: 1.3.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The two test cases below give different results:
> test case 1: load twice, create the datamap, and then query
> test case 2: load once, create the datamap, load again, and then query
> {code:java}
> +  test("load data into mainTable after create timeseries datamap on table 1") {
>  +    sql("drop table if exists mainTable")
>  +    sql(
>  +      """
>  +        | CREATE TABLE mainTable(
>  +        |   mytime timestamp,
>  +        |   name string,
>  +        |   age int)
>  +        | STORED BY 'org.apache.carbondata.format'
>  +      """.stripMargin)
>  +
>  +    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into table mainTable")
>  +
>  +    sql(
>  +      """
>  +        | create datamap agg0 on table mainTable
>  +        | using 'preaggregate'
>  +        | DMPROPERTIES (
>  +        |   'timeseries.eventTime'='mytime',
>  +        |   'timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1')
>  +        | as select mytime, sum(age)
>  +        | from mainTable
>  +        | group by mytime""".stripMargin)
>  +
>  +    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into table mainTable")
>  +    val df = sql(
>  +      """
>  +        | select
>  +        |   timeseries(mytime,'minute') as minuteLevel,
>  +        |   sum(age) as sum
>  +        | from mainTable
>  +        | where timeseries(mytime,'minute')>='2016-02-23 01:01:00'
>  +        | group by
>  +        |   timeseries(mytime,'minute')
>  +        | order by
>  +        |   timeseries(mytime,'minute')
>  +      """.stripMargin)
>  +
>  +    // for debugging only; remove before merge
>  +    df.show()
>  +    sql("select * from maintable_agg0_minute").show(100)
>  +
>  +    checkAnswer(df,
>  +      Seq(Row(Timestamp.valueOf("2016-02-23 01:01:00"), 120),
>  +        Row(Timestamp.valueOf("2016-02-23 01:02:00"), 280)))
>  +
>  +  }
>  +
>  +  test("load data into mainTable after create timeseries datamap on table 2") {
>  +    sql("drop table if exists mainTable")
>  +    sql(
>  +      """
>  +        | CREATE TABLE mainTable(
>  +        |   mytime timestamp,
>  +        |   name string,
>  +        |   age int)
>  +        | STORED BY 'org.apache.carbondata.format'
>  +      """.stripMargin)
>  +
>  +    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into table mainTable")
>  +    sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into table mainTable")
>  +    sql(
>  +      """
>  +        | create datamap agg0 on table mainTable
>  +        | using 'preaggregate'
>  +        | DMPROPERTIES (
>  +        |   'timeseries.eventTime'='mytime',
>  +        |   'timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1')
>  +        | as select mytime, sum(age)
>  +        | from mainTable
>  +        | group by mytime""".stripMargin)
>  +
>  +
>  +    val df = sql(
>  +      """
>  +        | select
>  +        |   timeseries(mytime,'minute') as minuteLevel,
>  +        |   sum(age) as sum
>  +        | from mainTable
>  +        | where timeseries(mytime,'minute')>='2016-02-23 01:01:00'
>  +        | group by
>  +        |   timeseries(mytime,'minute')
>  +        | order by
>  +        |   timeseries(mytime,'minute')
>  +      """.stripMargin)
>  +
>  +    // for debugging only; remove before merge
>  +    df.show()
>  +    sql("select * from maintable_agg0_minute").show(100)
>  +
>  +
>  +    checkAnswer(df,
>  +      Seq(Row(Timestamp.valueOf("2016-02-23 01:01:00"), 120),
>  +        Row(Timestamp.valueOf("2016-02-23 01:02:00"), 280)))
>  +  }
>  +
> {code}
> Test cases 1 and 2 should both succeed, but test case 1 fails.
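
To make the expectation concrete, the following is a minimal sketch in plain Scala collections (no Spark or CarbonData involved) of why both orderings should return the same minute-level sums. Everything in it is an assumption made for illustration: the object name, the helper functions, and especially the rows in `batch`, which are hypothetical values chosen so that loading them twice yields 120 and 280. They are not the contents of timeseriestest.csv, and the sketch does not reproduce how the preaggregate datamap is actually maintained.

```scala
import java.sql.Timestamp
import java.time.temporal.ChronoUnit

object MinuteRollupSketch {
  // Hypothetical (mytime, age) rows; NOT the real contents of timeseriestest.csv.
  val batch: Seq[(Timestamp, Int)] = Seq(
    (Timestamp.valueOf("2016-02-23 01:01:30"), 10),
    (Timestamp.valueOf("2016-02-23 01:01:50"), 50),
    (Timestamp.valueOf("2016-02-23 01:02:40"), 140)
  )

  // Truncate an event time to the start of its minute.
  def toMinute(ts: Timestamp): Timestamp =
    Timestamp.valueOf(ts.toLocalDateTime.truncatedTo(ChronoUnit.MINUTES))

  // Sum age per minute, ordered by minute.
  def rollup(rows: Seq[(Timestamp, Int)]): Seq[(Timestamp, Int)] =
    rows.groupBy { case (ts, _) => toMinute(ts) }
      .map { case (minute, group) => minute -> group.map(_._2).sum }
      .toSeq
      .sortBy(_._1.getTime)

  // Combine two rollups by adding the sums of matching minutes.
  def merge(a: Seq[(Timestamp, Int)], b: Seq[(Timestamp, Int)]): Seq[(Timestamp, Int)] =
    (a ++ b).groupBy(_._1)
      .map { case (minute, group) => minute -> group.map(_._2).sum }
      .toSeq
      .sortBy(_._1.getTime)

  def main(args: Array[String]): Unit = {
    // Ordering A: both loads happen first, then the aggregate is built once.
    val loadTwiceThenAggregate = rollup(batch ++ batch)
    // Ordering B: the aggregate is built after the first load,
    // then the second load is aggregated and merged in.
    val aggregateThenLoadAgain = merge(rollup(batch), rollup(batch))

    // Summation is order-independent, so both orderings must agree.
    assert(loadTwiceThenAggregate == aggregateThenLoadAgain)
    assert(loadTwiceThenAggregate.map(_._2) == Seq(120, 280))
    loadTwiceThenAggregate.foreach(println)
  }
}
```

Because the aggregate is a plain sum, the order of "load" and "build aggregate" cannot change the result in this model, which is why a difference between the two test cases points at the datamap handling rather than at the query.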


