Posted to issues@carbondata.apache.org by chenliang613 <gi...@git.apache.org> on 2018/05/04 02:39:06 UTC

[GitHub] carbondata pull request #2268: [CARBONDATA-2434] Add ExternalTableExample an...

GitHub user chenliang613 opened a pull request:

    https://github.com/apache/carbondata/pull/2268

    [CARBONDATA-2434] Add ExternalTableExample and LuceneDataMapExample

    In preparation for the 1.4.0 release.
    
    Be sure to complete all of the following checklist items to help us
    incorporate your contribution quickly and easily:
    
     - [X] Any interfaces changed?
     NA
     - [X] Any backward compatibility impacted?
     NA
     - [X] Document update required?
     NA
     - [X] Testing done
     DONE
     - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenliang613/carbondata external_example

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2268
    
----
commit 3d3b19fbb78421ab59fd2b20cbe0b7bfb693b6c4
Author: chenliang613 <ch...@...>
Date:   2018-05-04T02:35:52Z

    [CARBONDATA-2434] Add ExternalTableExample and LuceneDataMapExample

----


---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4751/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4514/



---

[GitHub] carbondata pull request #2268: [CARBONDATA-2434] Add ExternalTableExample an...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/2268


---

[GitHub] carbondata pull request #2268: [CARBONDATA-2434] Add ExternalTableExample an...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2268#discussion_r186705671
  
    --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/LuceneDataMapExample.scala ---
    @@ -0,0 +1,118 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.carbondata.examples
    +
    +import java.io.File
    +
    +import org.apache.spark.sql.{SaveMode, SparkSession}
    +
    +import org.apache.carbondata.examples.util.ExampleUtils
    +
    +
    +/**
    + * This example is for lucene datamap.
    + */
    +
    +object LuceneDataMapExample {
    +
    +  def main(args: Array[String]) {
    +    val spark = ExampleUtils.createCarbonSession("LuceneDataMapExample")
    +    exampleBody(spark)
    +    spark.close()
    +  }
    +
    +  def exampleBody(spark : SparkSession): Unit = {
    +
    +    // Build the test data; increase the row count for a more obvious comparison.
    +    // If the data is larger than 100M, the example will take 10+ minutes.
    +    import scala.util.Random
    +
    +    import spark.implicits._
    +    val r = new Random()
    +    val df = spark.sparkContext.parallelize(1 to 10 * 10 * 1000)
    +      .map(x => ("which test" + r.nextInt(10000) + " good" + r.nextInt(10),
    +      "who and name" + x % 8, "city" + x % 50, x % 60))
    +      .toDF("id", "name", "city", "age")
    +
    +    spark.sql("DROP TABLE IF EXISTS personTable")
    +    df.write.format("carbondata")
    +      .option("tableName", "personTable")
    +      .option("compress", "true")
    +      .mode(SaveMode.Overwrite).save()
    +
    +    // create lucene datamap on personTable
    +    spark.sql(
    +      s"""
    +         | CREATE DATAMAP IF NOT EXISTS dm ON TABLE personTable
    +         | USING 'lucene'
    +         | DMProperties('INDEX_COLUMNS'='id , name')
    +      """.stripMargin)
    +
    +    spark.sql("refresh datamap dm ON TABLE personTable")
    +
    +    // 1. Compare the performance:
    +
    +    def time(code: => Unit): Double = {
    +      val start = System.currentTimeMillis()
    +      code
    +      // return time in second
    +      (System.currentTimeMillis() - start).toDouble / 1000
    +    }
    +
    +    val time_without_lucenedatamap = time {
    +
    +      spark.sql(
    +        s"""
    +           | SELECT count(*)
    +           | FROM personTable where id like '% test1 %'
    +      """.stripMargin).show()
    +
    +    }
    +
    +    val time_with_lucenedatamap = time {
    +
    +      spark.sql(
    +        s"""
    +           | SELECT count(*)
    +           | FROM personTable where TEXT_MATCH('id:test1')
    +      """.stripMargin).show()
    +
    +    }
    +
    +    // scalastyle:off
    +    println("time for query on table with lucene datamap table:" + time_with_lucenedatamap.toString)
    +    println("time for query on table without lucene datamap table:" + time_without_lucenedatamap.toString)
    +    // scalastyle:on
    +
    +    // 2. Search for rows whose id field contains the word "test1" but not "good1"
    +    spark.sql(
    +      s"""
    +         | SELECT id,name
    +         | FROM personTable where TEXT_MATCH('id:test1 -id:good1')
    +      """.stripMargin).show(100)
    +
    +    // 3. TEXT_MATCH_WITH_LIMIT usage:
    +//    spark.sql(
    +//      s"""
    +//         | SELECT id,name
    +//         | FROM personTable where TEXT_MATCH_WITH_LIMIT('id:test1,10')
    +//      """.stripMargin).show()
    --- End diff --
    
    Uncomment this block and change the call to `TEXT_MATCH_WITH_LIMIT('id:test1',10)` so that it works.
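
A minimal sketch of the suggested fix, assuming `TEXT_MATCH_WITH_LIMIT` takes the Lucene query string as its first argument and the row limit as a separate second argument, as the comment above indicates. The `TextMatchWithLimitSketch` object and its `textMatchWithLimitQuery` helper are hypothetical names used only for illustration and are not part of the pull request. The helper only builds the SQL text, so it can be run without Spark or CarbonData:

```scala
object TextMatchWithLimitSketch {

  // Hypothetical helper (not in the PR): builds the corrected query text.
  // The limit is a second argument, placed outside the quoted Lucene query.
  def textMatchWithLimitQuery(table: String, luceneQuery: String, limit: Int): String =
    s"""
       | SELECT id,name
       | FROM $table where TEXT_MATCH_WITH_LIMIT('$luceneQuery',$limit)
     """.stripMargin.trim

  def main(args: Array[String]): Unit = {
    // Print the query so the argument layout can be checked by eye.
    // scalastyle:off
    println(textMatchWithLimitQuery("personTable", "id:test1", 10))
    // scalastyle:on
  }
}
```

In the example from the diff, the resulting string would be passed to `spark.sql(...).show()` in place of the commented-out block.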


---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5677/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4800/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5737/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5674/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5627/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4754/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4467/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4579/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4517/



---

[GitHub] carbondata issue #2268: [CARBONDATA-2434] Add ExternalTableExample and Lucen...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2268
  
    LGTM


---