Posted to dev@parquet.apache.org by "Fokko Driesprong (JIRA)" <ji...@apache.org> on 2019/06/12 18:46:00 UTC

[jira] [Commented] (PARQUET-1596) PARQUET-1375 broke parquet-cli's to-avro command

    [ https://issues.apache.org/jira/browse/PARQUET-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862367#comment-16862367 ] 

Fokko Driesprong commented on PARQUET-1596:
-------------------------------------------

Thanks [~sekikn], I'll come up with a test and a fix first thing tomorrow. Cheers!

> PARQUET-1375 broke parquet-cli's to-avro command
> ------------------------------------------------
>
>                 Key: PARQUET-1596
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1596
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cli
>            Reporter: Kengo Seki
>            Assignee: Fokko Driesprong
>            Priority: Major
>
> Given the following JSON file:
> {code}
> $ cat /tmp/sample.json 
> { "id": 1, "name": "Alice" }
> { "id": 2, "name": "Bob" }
> { "id": 3, "name": "Carol" }
> { "id": 4, "name": "Dave" }
> {code}
> using {{to-avro}} on the master branch to convert it to Avro fails with a NullPointerException:
> {code}
> $ git branch -v
> * master 47398be7 PARQUET-1375: Upgrade to Jackson 2.9.9 (#616)
> $ mvn clean install -DskipTests
> (snip)
> [INFO] --- maven-install-plugin:2.5.2:install (default-install) @ parquet-cli ---
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.jar
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/pom.xml to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.pom
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-tests.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-tests.jar
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-runtime.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-runtime.jar
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time:  14.769 s
> [INFO] Finished at: 2019-06-12T23:52:57+09:00
> [INFO] ------------------------------------------------------------------------
> $ mvn dependency:copy-dependencies
> (snip)
> $ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main to-avro /tmp/sample.json -o /tmp/sample.avro
> Unknown error
> java.lang.RuntimeException: Failed on record 0
> 	at org.apache.parquet.cli.commands.ToAvroCommand.run(ToAvroCommand.java:120)
> 	at org.apache.parquet.cli.Main.run(Main.java:147)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.parquet.cli.Main.main(Main.java:177)
> Caused by: java.lang.NullPointerException
> 	at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:153)
> 	at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:145)
> 	at org.apache.parquet.cli.commands.ToAvroCommand.run(ToAvroCommand.java:112)
> 	... 3 more
> $ echo $?
> 1
> {code}
> But with its previous revision, it succeeds:
> {code}
> $ git checkout HEAD^
> HEAD is now at 9d6fb45e PARQUET-1576 Bump Apache Avro to 1.9.0 (#638)
> $ mvn clean install -DskipTests
> (snip)
> [INFO] --- maven-install-plugin:2.5.2:install (default-install) @ parquet-cli ---
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.jar
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/pom.xml to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.pom
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-tests.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-tests.jar
> [INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-runtime.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-runtime.jar
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time:  15.822 s
> [INFO] Finished at: 2019-06-12T23:57:04+09:00
> [INFO] ------------------------------------------------------------------------
> $ mvn dependency:copy-dependencies
> (snip)
> $ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main to-avro /tmp/sample.json -o /tmp/sample.avro
> $ echo $?
> 0
> $ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main head /tmp/sample.avro
> {"id": 1, "name": "Alice"}
> {"id": 2, "name": "Bob"}
> {"id": 3, "name": "Carol"}
> {"id": 4, "name": "Dave"}
> {code}
> Reverting the following code
> {code:title=AvroJson.java}
>    public static Iterator<JsonNode> parser(final InputStream stream) {
>      try(JsonParser parser = FACTORY.createParser(stream)) {
> {code}
> to
> {code}
>    public static Iterator<JsonNode> parser(final InputStream stream) {
>      try {
>        JsonParser parser = FACTORY.createParser(stream);
> {code}
> seems to work.
> cc [~Fokko] :)
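
A possible explanation for why the revert helps, not confirmed anywhere in this thread: try-with-resources closes the parser as soon as the method returns, so an iterator that reads lazily from that parser has nothing left to read by the time the caller consumes it. The sketch below shows that effect using only the JDK. FakeParser, closedOnReturn, leftOpen and drain are hypothetical names made up for this illustration. They are not the AvroJson or Jackson code, and whether Jackson's iterator reacts to a closed parser in the same way is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyIteratorDemo {

  /** Hypothetical stand-in for a streaming parser: yields records until closed. */
  static final class FakeParser implements AutoCloseable {
    private final Iterator<String> source;
    private boolean closed = false;

    FakeParser(List<String> records) {
      this.source = records.iterator();
    }

    /** Lazy view: each call checks whether the parser is still open. */
    Iterator<String> readValues() {
      return new Iterator<String>() {
        @Override
        public boolean hasNext() {
          return !closed && source.hasNext();
        }

        @Override
        public String next() {
          if (!hasNext()) {
            throw new NoSuchElementException();
          }
          return source.next();
        }
      };
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Same shape as the reported change: the parser is closed when the method returns. */
  public static Iterator<String> closedOnReturn(List<String> records) {
    try (FakeParser parser = new FakeParser(records)) {
      return parser.readValues();
    }
  }

  /** Same shape as the revert: the parser stays open (and is never closed here). */
  public static Iterator<String> leftOpen(List<String> records) {
    FakeParser parser = new FakeParser(records);
    return parser.readValues();
  }

  /** Collects everything the iterator yields into a list. */
  public static List<String> drain(Iterator<String> it) {
    List<String> out = new ArrayList<>();
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> records = Arrays.asList("Alice", "Bob", "Carol", "Dave");
    System.out.println(drain(closedOnReturn(records))); // prints []
    System.out.println(drain(leftOpen(records)));       // prints [Alice, Bob, Carol, Dave]
  }
}
```

In this sketch the leftOpen variant never closes the parser, so a complete fix would still need to close it once the iterator has been fully consumed.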


