Posted to reviews@spark.apache.org by CrazyJacky <gi...@git.apache.org> on 2017/10/05 00:15:33 UTC

[GitHub] spark pull request #19434: [SPARK-21785][SQL]Support create table from a par...

GitHub user CrazyJacky opened a pull request:

    https://github.com/apache/spark/pull/19434

    [SPARK-21785][SQL]Support create table from a parquet file schema

    ## Support creating a table from a Parquet file schema
    As described in the JIRA ticket (SPARK-21785), this adds support for statements such as:
    ```sql
    CREATE EXTERNAL TABLE IF NOT EXISTS test LIKE 'PARQUET' '/user/test/abc/a.snappy.parquet'
    STORED AS PARQUET LOCATION '/user/test/def/';
    ```
    This is a very rough fix, and I would like someone to help review and refine it.
    It currently supports creating Hive tables only.
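
    For comparison (not part of this patch), roughly the same result can probably be obtained through the existing public API by reading the file's schema and passing it to `spark.catalog.createTable`. The sketch below assumes Spark 2.2 or later running in local mode. The `createTableLike` helper, the table name, and the temporary paths are hypothetical, and it creates a data source table rather than the Hive table this PR targets.

    ```scala
    import java.nio.file.Files

    import org.apache.spark.sql.SparkSession

    object CreateTableFromParquetSchema {
      // Hypothetical helper: creates an external table at `location` whose
      // schema is copied from an existing Parquet file.
      def createTableLike(
          spark: SparkSession,
          table: String,
          parquetFile: String,
          location: String): Unit = {
        // Only the schema of the source file is used; its rows are not copied.
        val schema = spark.read.parquet(parquetFile).schema
        spark.catalog.createTable(table, "parquet", schema, Map("path" -> location))
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("create-table-from-parquet-schema")
          .getOrCreate()
        import spark.implicits._

        // Write a small sample file so that the sketch is self-contained.
        val source = Files.createTempDirectory("src").resolve("a.parquet").toString
        Seq((1, "a")).toDF("id", "name").write.parquet(source)

        // The new table points at an empty directory and starts with no rows.
        val location = Files.createTempDirectory("tbl").toString
        createTableLike(spark, "test", source, location)

        spark.table("test").printSchema()
        spark.stop()
      }
    }
    ```

    Unlike the proposed `LIKE 'PARQUET'` syntax, this route needs a running `SparkSession` and cannot be expressed as a single SQL statement.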
    
    ## Tested with a unit test and by building the runnable distribution
    
    ```scala
    test("create table like parquet") {
      // Locate the sample Parquet file in the test resources.
      val f = getClass.getClassLoader
        .getResource("test-data/dec-in-fixed-len.parquet").getPath
      // The schema is taken from the Parquet file; the storage format and
      // location are given explicitly.
      val v1 =
        s"""
           |create table if not exists db1.table1 like 'parquet' '$f'
           |stored as sequencefile
           |location '/tmp/table1'
         """.stripMargin

      val (desc, allowExisting) = extractTableDesc(v1)

      assert(allowExisting)
      assert(desc.identifier.database == Some("db1"))
      assert(desc.identifier.table == "table1")
      // A table created with an explicit location is treated as external.
      assert(desc.tableType == CatalogTableType.EXTERNAL)
      // The schema must match the single column stored in the Parquet file.
      assert(desc.schema == new StructType()
        .add("fixed_len_dec", "decimal(10,2)"))
      assert(desc.bucketSpec.isEmpty)
      assert(desc.viewText.isEmpty)
      assert(desc.viewDefaultDatabase.isEmpty)
      assert(desc.viewQueryColumnNames.isEmpty)
      assert(desc.storage.locationUri == Some(new URI("/tmp/table1")))
      // `stored as sequencefile` selects the SequenceFile formats and LazySimpleSerDe.
      assert(desc.storage.inputFormat == Some("org.apache.hadoop.mapred.SequenceFileInputFormat"))
      assert(desc.storage.outputFormat == Some("org.apache.hadoop.mapred.SequenceFileOutputFormat"))
      assert(desc.storage.serde == Some("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"))
    }
    ```


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jacshen/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19434.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19434
    
----
commit 6b23cb8ff5a778f4f1b4ca4f218cbe8c4e422101
Author: Shen <ja...@lm-sea-11008031.corp.ebay.com>
Date:   2017-10-04T20:35:03Z

    Add support to create table which schema is reading from a given parquet file

commit 877a57ec439db4e688c71568ddd312bdc2a50cec
Author: jacshen <ja...@ebay.com>
Date:   2017-10-04T20:37:08Z

    Merge branch 'master' of https://github.com/apache/spark

commit a22c39e795ab4a730d0277c4162cdfadd37dbf22
Author: jacshen <ja...@ebay.com>
Date:   2017-10-04T21:21:02Z

    Add support to create table which schema is reading from a given parquet file

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19434: [SPARK-21785][SQL]Support create table from a par...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19434



[GitHub] spark issue #19434: [SPARK-21785][SQL]Support create table from a parquet fi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19434
  
    Can one of the admins verify this patch?



[GitHub] spark issue #19434: [SPARK-21785][SQL]Support create table from a parquet fi...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/19434
  
    @CrazyJacky Can you close this for now, since it has been inactive for a long time?

