Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2015/10/10 09:02:32 UTC
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/9060
[SPARK-11044][SQL] Parquet writer version fixed as version1
https://issues.apache.org/jira/browse/SPARK-11044
Spark always writes Parquet files with writer version 1, ignoring any writer version given by the user.
This PR keeps the writer version if one is given and uses version 1 as the default.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-11044
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9060.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9060
----
commit 5e72fbc93ec0783d5a440f8f70c7653f8fc39d9a
Author: HyukjinKwon <gu...@gmail.com>
Date: 2015-10-10T06:59:52Z
[SPARK-11044][SQL] Apply the writer version if given.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/9060#discussion_r41705069
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystWriteSupport.scala ---
@@ -431,6 +431,7 @@ private[parquet] object CatalystWriteSupport {
configuration.set(SPARK_ROW_SCHEMA, schema.json)
configuration.set(
ParquetOutputFormat.WRITER_VERSION,
- ParquetProperties.WriterVersion.PARQUET_1_0.toString)
+ configuration.get(ParquetOutputFormat.WRITER_VERSION,
--- End diff --
Yep, I just updated it.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156077334
You can construct a Parquet file consisting of a single dictionary-encoded column using:
```scala
val path = "file:///tmp/parquet/dict"
sqlContext.range(1 << 16).selectExpr("(id % 4) AS i").coalesce(1).write.mode("overwrite").parquet(path)
```
And here are instructions for building and installing the parquet-tools CLI tool. Then you can inspect the Parquet metadata using:
```
$ parquet-meta /tmp/parquet/dict
file: file:/private/tmp/parquet/dict/part-r-00000-88498608-9eed-4728-b96a-b60bc5ebc2a8.gz.parquet
creator: parquet-mr version 1.6.0
extra: org.apache.spark.sql.parquet.row.metadata = {"type":"struct","fields":[{"name":"i","type":"long","nullable":true,"metadata":{}}]}
file schema: root
----------------------------------------------------------------------------------------------------------------------------------------------
i: OPTIONAL INT64 R:0 D:1
row group 1: RC:65536 TS:16615 OFFSET:4
----------------------------------------------------------------------------------------------------------------------------------------------
i: INT64 GZIP DO:0 FPO:4 SZ:198/16615/83.91 VC:65536 ENC:BIT_PACKED,RLE,PLAIN_DICTIONARY
```
The `ENC:...` part in the last line is the column encoding information.
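If the `parquet-meta` output is captured as text, the encoding list can be pulled out of a column line with a small helper. This is only a sketch: `ParquetMetaLine` and `encodings` are hypothetical names, and the parsing assumes the `ENC:` field is a comma-separated list with no spaces, as in the output above.

```scala
object ParquetMetaLine {
  // Matches the "ENC:<comma-separated encodings>" field of a parquet-meta column line.
  private val EncField = """ENC:(\S+)""".r

  // Returns the encoding names found on the line, or an empty set if there is no ENC field.
  def encodings(line: String): Set[String] =
    EncField.findFirstMatchIn(line)
      .map(_.group(1).split(",").toSet)
      .getOrElse(Set.empty[String])
}
```

For the column line shown above, `ParquetMetaLine.encodings` would contain `PLAIN_DICTIONARY` and not `RLE_DICTIONARY`.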
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156354310
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45831/
Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155718158
@HyukjinKwon Oh yeah, sorry. Finally got some time to clean up my review queue :)
I wonder whether there is an easy way to add a test case for this. At first I thought `WriterVersion` corresponds to the `version` field of the Thrift struct `FileMetaData` described in [parquet-format][1], but it doesn't. I only found that when `WriterVersion` is set to v2, the Thrift field `PageHeader.type` is set to `DATA_PAGE_V2`.
[1]: https://github.com/apache/parquet-format#metadata
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156072061
I think we can check for column encoding information, which is accessible from Parquet footers. For example, `PARQUET_2_0` uses `RLE_DICTIONARY` while `PARQUET_1_0` uses `PLAIN_DICTIONARY` (see [here][1]).
The [parquet-meta CLI tool][2] can be a reference for how to inspect related metadata.
[1]: https://github.com/apache/parquet-mr/blob/apache-parquet-1.7.0/parquet-column/src/main/java/org/apache/parquet/column/ParquetProperties.java#L116-L123
[2]: https://github.com/apache/parquet-mr/blob/master/parquet-tools/src/main/java/org/apache/parquet/tools/util/MetadataUtils.java#L139
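A rough sketch of the footer check described above, assuming the parquet-mr 1.7-era static method `ParquetFileReader.readFooter(Configuration, Path)` and a path that points at a single Parquet file. The names `FooterEncodings`, `columnEncodings`, and `looksLikeWriterV2` are hypothetical, and the check is only meaningful for columns that are actually dictionary-encoded.

```scala
import scala.collection.JavaConverters._

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.column.Encoding
import org.apache.parquet.hadoop.ParquetFileReader

object FooterEncodings {
  // Collects the encodings of every column chunk in every row group of one Parquet file.
  def columnEncodings(file: Path, conf: Configuration): Set[Encoding] = {
    val footer = ParquetFileReader.readFooter(conf, file)
    footer.getBlocks.asScala
      .flatMap(_.getColumns.asScala)
      .flatMap(_.getEncodings.asScala)
      .toSet
  }

  // Per the comment above: PARQUET_2_0 uses RLE_DICTIONARY, PARQUET_1_0 uses PLAIN_DICTIONARY.
  def looksLikeWriterV2(encodings: Set[Encoding]): Boolean =
    encodings.contains(Encoding.RLE_DICTIONARY)
}
```

A test could then assert `looksLikeWriterV2(columnEncodings(path, conf))` on a file written with `PARQUET_2_0`.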
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156309116
Merged build started.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9060#discussion_r44765188
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala ---
@@ -513,6 +515,41 @@ class ParquetIOSuite extends QueryTest with ParquetTest with SharedSQLContext {
}
}
+ test("SPARK-11044 Parquet writer version fixed as version1 ") {
+
+ // For dictionary encoding, Parquet changes the encoding types according to its writer version
+ // So, this test checks the encoding types in order to ensure that the file is written with
+ // writer version2.
+ withTempPath { dir =>
+ val clonedConf = new Configuration(hadoopConfiguration)
+ try {
+
+ // Write a Parquet file with writer version 2
+ hadoopConfiguration.set(ParquetOutputFormat.WRITER_VERSION,
+ ParquetProperties.WriterVersion.PARQUET_2_0.toString)
+
+ // By default, dictionary encoding is enabled from Parquet 1.2.0 but
+ // it is enabled just in case.
+ hadoopConfiguration.setBoolean(ParquetOutputFormat.ENABLE_DICTIONARY, true)
+ val path = s"${dir.getCanonicalPath}/part-r-0.parquet"
+ sqlContext.range(1 << 16).selectExpr("(id % 4) AS i")
+ .coalesce(1).write.mode("overwrite").parquet(path)
+
+ val blockMetadata = readFooter(new Path(path), hadoopConfiguration).getBlocks.asScala.head
+ val columnChunkMetadata = blockMetadata.getColumns.asScala.head
+
+ // If the file is written with version 2, this should include
+ // [[Encoding.RLE_DICTIONARY]] type. For version 1, it is Encoding.PLAIN_DICTIONARY
--- End diff --
BTW, the `[[...]]` notation is only useful when writing ScalaDoc. In inline comments like this one, you may either omit the brackets or use backquotes to emphasize that the quoted part is a Scala/Java entity.
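To illustrate the difference, here is a small sketch that is not taken from the PR. `CommentStyleExample` and `dictionaryEncodingName` are hypothetical names, and the mapping simply restates the `PLAIN_DICTIONARY` / `RLE_DICTIONARY` distinction mentioned in this thread.

```scala
object CommentStyleExample {
  /**
   * Looks up the dictionary encoding name for a writer version label.
   * The result is a [[scala.Option]]; in ScalaDoc the brackets turn into a link.
   */
  def dictionaryEncodingName(writerVersion: String): Option[String] = {
    // Inline comment: refer to `Option` with backquotes (or plain text), not brackets,
    // because inline comments are not processed by ScalaDoc.
    writerVersion match {
      case "PARQUET_1_0" => Some("PLAIN_DICTIONARY")
      case "PARQUET_2_0" => Some("RLE_DICTIONARY")
      case _ => None
    }
  }
}
```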
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156322233
[Test build #45810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45810/console) for PR 9060 at commit [`2d1d343`](https://github.com/apache/spark/commit/2d1d343ab4a0218cfcbc6aaaa21c6fccb77397e7).
* This patch **fails Spark unit tests**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9060#discussion_r44764961
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala ---
@@ -513,6 +515,41 @@ class ParquetIOSuite extends QueryTest with ParquetTest with SharedSQLContext {
}
}
+ test("SPARK-11044 Parquet writer version fixed as version1 ") {
+
+ // For dictionary encoding, Parquet changes the encoding types according to its writer version
+ // So, this test checks the encoding types in order to ensure that the file is written with
+ // writer version2.
+ withTempPath { dir =>
+ val clonedConf = new Configuration(hadoopConfiguration)
+ try {
+
--- End diff --
Nit: Remove this empty line.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156309719
**[Test build #45811 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45811/consoleFull)** for PR 9060 at commit [`7e80ad6`](https://github.com/apache/spark/commit/7e80ad6082a9f5b53f08800bfb519a2a80632ec8).
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156325499
**[Test build #45811 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45811/consoleFull)** for PR 9060 at commit [`7e80ad6`](https://github.com/apache/spark/commit/7e80ad6082a9f5b53f08800bfb519a2a80632ec8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-147047845
Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156891628
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45964/
Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155752417
**[Test build #45626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45626/consoleFull)** for PR 9060 at commit [`2eee7e3`](https://github.com/apache/spark/commit/2eee7e37b6f366336cbe19bd9545f07abb13f7db).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156309106
Merged build triggered.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156879507
**[Test build #45964 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45964/consoleFull)** for PR 9060 at commit [`cea5034`](https://github.com/apache/spark/commit/cea50348da091e5d83c14474a76d4f49e1ff3c9b).
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155753068
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45626/
Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9060#discussion_r44764956
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala ---
@@ -513,6 +515,41 @@ class ParquetIOSuite extends QueryTest with ParquetTest with SharedSQLContext {
}
}
+ test("SPARK-11044 Parquet writer version fixed as version1 ") {
+
--- End diff --
Nit: Remove this empty line.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156322309
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45810/
Test FAILed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156076494
Thank you very much. I will try it that way.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-157108064
Sure
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/9060#discussion_r41695242
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystWriteSupport.scala ---
@@ -431,6 +431,7 @@ private[parquet] object CatalystWriteSupport {
configuration.set(SPARK_ROW_SCHEMA, schema.json)
configuration.set(
ParquetOutputFormat.WRITER_VERSION,
- ParquetProperties.WriterVersion.PARQUET_1_0.toString)
+ configuration.get(ParquetOutputFormat.WRITER_VERSION,
--- End diff --
Can you just use `setIfUnset` here?
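A minimal sketch of what this suggestion could look like, assuming Hadoop's `Configuration.setIfUnset(name, value)` and the Parquet constants already used in the diff. The names `WriterVersionDefault` and `applyDefaultWriterVersion` are hypothetical and are not the PR's actual code.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.parquet.column.ParquetProperties
import org.apache.parquet.hadoop.ParquetOutputFormat

object WriterVersionDefault {
  // Keeps a writer version the user already set; falls back to PARQUET_1_0 otherwise.
  def applyDefaultWriterVersion(conf: Configuration): Unit = {
    conf.setIfUnset(
      ParquetOutputFormat.WRITER_VERSION,
      ParquetProperties.WriterVersion.PARQUET_1_0.toString)
  }
}
```

Compared with `configuration.set(key, configuration.get(key, default))`, this expresses the same intent in one call and leaves an existing user setting untouched.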
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156322308
Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155719264
**[Test build #45626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45626/consoleFull)** for PR 9060 at commit [`2eee7e3`](https://github.com/apache/spark/commit/2eee7e37b6f366336cbe19bd9545f07abb13f7db).
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-154597634
@liancheng I assume you missed this.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156325563
Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155718954
Merged build started.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155753066
Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156354224
**[Test build #45831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45831/consoleFull)** for PR 9060 at commit [`78449ec`](https://github.com/apache/spark/commit/78449ec530007bbebf729c19e74364dd0e001b81).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class TypedColumn[-T, U](`
  * `class JavaTrackStateDStream[KeyType, ValueType, StateType, EmittedType](`
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156306860
[Test build #45810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45810/consoleFull) for PR 9060 at commit [`2d1d343`](https://github.com/apache/spark/commit/2d1d343ab4a0218cfcbc6aaaa21c6fccb77397e7).
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156327584
**[Test build #45831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45831/consoleFull)** for PR 9060 at commit [`78449ec`](https://github.com/apache/spark/commit/78449ec530007bbebf729c19e74364dd0e001b81).
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155720490
I will try to find and test them first tomorrow before adding a commit!
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156325565
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45811/
Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-148994769
/cc @liancheng
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156327284
Merged build started.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156306712
Build started.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156891627
Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155718167
ok to test
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-157027289
@marmbrus Is this one OK for branch-1.6?
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-157027571
@HyukjinKwon Thanks! I've merged this one to master. And yes, please feel free to add the decimal test case(s).
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155973698
@liancheng I tried a few ways to determine the writer version but, as you said, it is pretty tricky to check: the writer version only changes the version of the data pages, which is visible only inside Parquet's internals.
Would it be inappropriate to write Parquet files with both version1 and version2 and then check whether their sizes differ?
Since the encoding types are different, the sizes should be different as well.
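For reference, a minimal sketch of an alternative to comparing file sizes: inspecting the encodings recorded in the Parquet footer, which is the kind of check the test added in this PR performs. The object and method names below are hypothetical, and the sketch assumes parquet-hadoop and hadoop-common are on the classpath and that `ParquetFileReader.readFooter(Configuration, Path)` is available (it may be deprecated in newer Parquet releases).

```scala
import scala.collection.JavaConverters._

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.column.Encoding
import org.apache.parquet.hadoop.ParquetFileReader

// Hypothetical helper object, for illustration only.
object WriterVersionCheck {

  // True if the encodings include RLE_DICTIONARY, which the test in this PR
  // expects for dictionary-encoded columns written with writer version 2
  // (version 1 is expected to report PLAIN_DICTIONARY instead).
  def usesV2DictionaryEncoding(encodings: Set[Encoding]): Boolean =
    encodings.contains(Encoding.RLE_DICTIONARY)

  // Reads the footer of a single Parquet file and returns the encodings of
  // the first column chunk in the first row group.
  def firstColumnEncodings(file: String, conf: Configuration): Set[Encoding] = {
    val footer = ParquetFileReader.readFooter(conf, new Path(file))
    val firstBlock = footer.getBlocks.asScala.head
    firstBlock.getColumns.asScala.head.getEncodings.asScala.toSet
  }
}
```

This only distinguishes the versions when the column is actually dictionary encoded, so the data should have few distinct values and dictionary encoding should be enabled.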
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156099372
Thanks! I will follow that approach.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156306727
Fortunately, I have worked with parquet-tools once and have looked through the Parquet code several times :).
Thank you very much for your help. This could be done much more easily than I thought because of it.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/9060#discussion_r44764972
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala ---
@@ -513,6 +515,41 @@ class ParquetIOSuite extends QueryTest with ParquetTest with SharedSQLContext {
     }
   }
+  test("SPARK-11044 Parquet writer version fixed as version1 ") {
+
+    // For dictionary encoding, Parquet changes the encoding types according to its writer version
+    // So, this test checks the encoding types in order to ensure that the file is written with
+    // writer version2.
+    withTempPath { dir =>
+      val clonedConf = new Configuration(hadoopConfiguration)
+      try {
+
+        // Write a Parquet file with writer version 2
+        hadoopConfiguration.set(ParquetOutputFormat.WRITER_VERSION,
+          ParquetProperties.WriterVersion.PARQUET_2_0.toString)
+
+        // By default, dictionary encoding is enabled from Parquet 1.2.0 but
+        // it is enabled just in case.
+        hadoopConfiguration.setBoolean(ParquetOutputFormat.ENABLE_DICTIONARY, true)
+        val path = s"${dir.getCanonicalPath}/part-r-0.parquet"
+        sqlContext.range(1 << 16).selectExpr("(id % 4) AS i")
+          .coalesce(1).write.mode("overwrite").parquet(path)
+
+        val blockMetadata = readFooter(new Path(path), hadoopConfiguration).getBlocks.asScala.head
+        val columnChunkMetadata = blockMetadata.getColumns.asScala.head
+
+        // If the file is written with version 2, this should include
+        // [[Encoding.RLE_DICTIONARY]] type. For version 1, it is Encoding.PLAIN_DICTIONARY
+        assert(columnChunkMetadata.getEncodings.contains(Encoding.RLE_DICTIONARY))
+      } finally {
+
--- End diff --
Nit: Remove this empty line.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156354309
Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156891545
**[Test build #45964 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45964/consoleFull)** for PR 9060 at commit [`cea5034`](https://github.com/apache/spark/commit/cea50348da091e5d83c14474a76d4f49e1ff3c9b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156306692
Build triggered.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156879272
I accidentally came across `TODO Adds test case for reading dictionary encoded decimals written as 'FIXED_LEN_BYTE_ARRAY'`.
I will also add this test in the follow-up PR that uses the overloaded `writeMetaFile`.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-155718924
Merged build triggered.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/9060
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156327273
Merged build triggered.
[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...
Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9060#issuecomment-156379942
LGTM except for a few minor styling issues. I can merge it right after you fix them.