Posted to commits@beam.apache.org by "Wout Scheepers (JIRA)" <ji...@apache.org> on 2018/07/30 14:31:00 UTC

[jira] [Comment Edited] (BEAM-4839) EOF Exception writing non-english Characters to Spanner

    [ https://issues.apache.org/jira/browse/BEAM-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553970#comment-16553970 ] 

Wout Scheepers edited comment on BEAM-4839 at 7/30/18 2:30 PM:
---------------------------------------------------------------

There are a number of duplicate tickets describing this bug: 

https://issues.apache.org/jira/browse/BEAM-4359
https://issues.apache.org/jira/browse/BEAM-4759

This is resolved on the master branch and will probably be released in 2.6.

As a workaround, I added the latest version of `MutationGroupEncoder` from master to my project, in the same package as the original class (org.apache.beam.sdk.io.gcp.spanner), so that my copy is loaded instead of the one shipped in the Beam jar.
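
For anyone wondering how a non-ASCII character can end in an EOFException during decode, here is a minimal sketch of one way it can happen. It assumes the encoder writes a length prefix counted in characters while the decoder reads that many UTF-8 bytes. This is not the Beam source: the class `LengthPrefixDemo`, its methods, and the 2-byte length prefix are all hypothetical and chosen only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; this is not the Beam MutationGroupEncoder source.
class LengthPrefixDemo {

  // Writes each string as a 2-byte length followed by its UTF-8 bytes.
  // countBytes=false reproduces the assumed bug: the prefix holds the char count.
  static byte[] encode(String[] values, boolean countBytes) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buffer);
    for (String value : values) {
      byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
      out.writeShort(countBytes ? utf8.length : value.length());
      out.write(utf8);
    }
    out.flush();
    return buffer.toByteArray();
  }

  // Reads `count` strings back, trusting each length prefix to be a byte count.
  static String[] decode(byte[] data, int count) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    String[] values = new String[count];
    for (int i = 0; i < count; i++) {
      byte[] utf8 = new byte[in.readUnsignedShort()];
      in.readFully(utf8); // throws EOFException if the prefix overstates what is left
      values[i] = new String(utf8, StandardCharsets.UTF_8);
    }
    return values;
  }

  // Returns "ok" on a clean round trip, otherwise the name of the exception thrown.
  static String roundTrip(String[] values, boolean countBytes) {
    try {
      String[] decoded = decode(encode(values, countBytes), values.length);
      return Arrays.equals(values, decoded) ? "ok" : "mismatch";
    } catch (IOException e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    // "\u00e1" is the character from LAHQ in the failing row of the log.
    String[] row = {"\u00e1", "c2"};
    System.out.println("char-count prefix: " + roundTrip(row, false));
    System.out.println("byte-count prefix: " + roundTrip(row, true));
  }
}
```

With the char-count prefix, the decoder consumes only the first of the two UTF-8 bytes of the accented character, reads the leftover byte as part of the next length, and then runs off the end of the buffer. ASCII-only rows round-trip either way, which would match the observation that only rows with non-English characters fail.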

 


was (Author: wouts):
There are a number of duplicate tickets describing this bug: 

https://issues.apache.org/jira/browse/BEAM-4359
https://issues.apache.org/jira/browse/BEAM-4759


This is resolved on the master branch and will probably be released for 2.6

As a workaround, I added the latest version of `MutationGroupEncoder` to my project, in the correct package (quantum.base.transform.entity.spanner).

 

> EOF Exception writing non-english Characters to Spanner
> -------------------------------------------------------
>
>                 Key: BEAM-4839
>                 URL: https://issues.apache.org/jira/browse/BEAM-4839
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-core, runner-dataflow
>    Affects Versions: 2.3.0, 2.4.0, 2.5.0
>         Environment: GCP and Local (High Sierra)
>            Reporter: Tom
>            Assignee: Kenneth Knowles
>            Priority: Minor
>             Fix For: Not applicable
>
>
> I am having an issue with Apache Beam ^2.3 and Google Cloud Platform Spanner.
> In short, I'm trying to write data into Spanner. Some of this data contains non-English characters, which blows up the dataflow job when using Beam 2.3 or higher.
> Currently, I'm trying to use Apache Beam 2.5, google-api-client 1.23 and Java 1.8.
> This error occurs using Apache Beam 2.3, and not when using 2.2. When using Apache Beam 2.2, we have to drop the google-http/api-client to 1.22 from 1.23.
> {quote}<beam.version>2.5.0</beam.version>{quote}
> {quote}
> <dependency>
>  <groupId>org.apache.beam</groupId>
>  <artifactId>beam-sdks-java-core</artifactId>
>  <version>${beam.version}</version>
> </dependency>
> <dependency>
>  <groupId>org.apache.beam</groupId>
>  <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
>  <version>${beam.version}</version>
> </dependency>
> <dependency>
>  <groupId>org.apache.beam</groupId>
>  <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
>  <version>${beam.version}</version>
> </dependency>
> {quote}
> The google libraries
> {quote}
> <google-clients.version>1.23.0</google-clients.version>
> <google-package.version>0.26.0-alpha</google-package.version>
> {quote}
> {quote}
> <dependency>
>  <groupId>com.google.cloud</groupId>
>  <artifactId>google-cloud</artifactId>
>  <version>${google-package.version}</version>
> </dependency>
> <dependency>
>  <groupId>com.google.http-client</groupId>
>  <artifactId>google-http-client</artifactId>
>  <version>${google-clients.version}</version>
>  <exclusions>
>  <!-- Exclude an old version of guava that is being pulled
>  in by a transitive dependency of google-api-client -->
>  <exclusion>
>  <groupId>com.google.guava</groupId>
>  <artifactId>guava-jdk5</artifactId>
>  </exclusion>
>  </exclusions>
>  </dependency>
> {quote}
> Here's the stack trace. You'll see that in this quick sample runner we tried inserting 4 rows. Everything runs fine until I attempt to write mutations to the table.
> {quote}
> [INFO] Scanning for projects...
> [INFO]
> [INFO] --------------------< com.testing:dataflowtwofive >---------------------
> [INFO] Building dataflowtwofive 0.1
> [INFO] --------------------------------[ jar ]---------------------------------
> [WARNING] The POM for com.google.oauth-client:google-oauth-client:jar:1.23.0 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details
> [WARNING] The POM for com.google.http-client:google-http-client-jackson:jar:1.23.0 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details
> [WARNING] The POM for com.google.apis:google-api-services-storage:jar:v1-rev114-1.23.0 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details
> [INFO]
> [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ dataflowtwofive ---
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
> [INFO] Copying 0 resource
> [INFO]
> [INFO] --- maven-compiler-plugin:3.6.1:compile (default-compile) @ dataflowtwofive ---
> [INFO] Nothing to compile - all classes are up to date
> [INFO]
> [INFO] --- exec-maven-plugin:1.4.0:java (default-cli) @ dataflowtwofive ---
> Jul 19, 2018 3:22:41 PM runners.SimpleTestRunner main
> WARNING: ATTN: Coder for mutations: [class com.google.cloud.spanner.Mutation]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.765000000Z, ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.809000000Z, GUID=f2360671-557d-4406-b693-c0d66..., LAHQ=a, LAISO=a2, LASPEZ=a3, SPRAS=a4}]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.765000000Z, ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.809000000Z, GUID=cba60659-6986-4dc3-a523-116e2..., LAHQ=á, LAISO=c2, LASPEZ=c3, SPRAS=c4}]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.765000000Z, ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.809000000Z, GUID=889d9d14-df57-4d72-8748-8de0f..., LAHQ=d, LAISO=d2, LASPEZ=d3, SPRAS=d4}]
> Jul 19, 2018 3:22:42 PM transforms.TestMutationBuilder processElement
> WARNING: ATTN: mutation: [\{ROW_INSERT_TIMESTAMP=2018-07-19T20:22:42.765000000Z, ROW_UPDATE_TIMESTAMP=2018-07-19T20:22:42.809000000Z, GUID=888a2a76-aff7-4f64-9e24-2c231..., LAHQ=b, LAISO=b2, LASPEZ=b3, SPRAS=b4}]
> [WARNING]
> java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke (Method.java:498)
>  at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:293)
>  at java.lang.Thread.run (Thread.java:748)
> Caused by: org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.RuntimeException: java.io.EOFException
>  at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish (DirectRunner.java:349)
>  at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish (DirectRunner.java:319)
>  at org.apache.beam.runners.direct.DirectRunner.run (DirectRunner.java:210)
>  at org.apache.beam.runners.direct.DirectRunner.run (DirectRunner.java:66)
>  at org.apache.beam.sdk.Pipeline.run (Pipeline.java:311)
>  at org.apache.beam.sdk.Pipeline.run (Pipeline.java:297)
>  at runners.SimpleTestRunner.main (SimpleTestRunner.java:53)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke (Method.java:498)
>  at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:293)
>  at java.lang.Thread.run (Thread.java:748)
> Caused by: java.lang.RuntimeException: java.io.EOFException
>  at org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decode (MutationGroupEncoder.java:271)
>  at org.apache.beam.sdk.io.gcp.spanner.SpannerIO$BatchFn.processElement (SpannerIO.java:1030)
> Caused by: java.io.EOFException
>  at java.io.DataInputStream.readFully (DataInputStream.java:197)
>  at java.io.DataInputStream.readFully (DataInputStream.java:169)
>  at org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.readBytes (MutationGroupEncoder.java:475)
>  at org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decodePrimitive (MutationGroupEncoder.java:434)
>  at org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decodeModification (MutationGroupEncoder.java:326)
>  at org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decodeMutation (MutationGroupEncoder.java:280)
>  at org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decode (MutationGroupEncoder.java:264)
>  at org.apache.beam.sdk.io.gcp.spanner.SpannerIO$BatchFn.processElement (SpannerIO.java:1030)
>  at org.apache.beam.sdk.io.gcp.spanner.SpannerIO$BatchFn$DoFnInvoker.invokeProcessElement (Unknown Source)
>  at org.apache.beam.repackaged.beam_runners_direct_java.runners.core.SimpleDoFnRunner.invokeProcessElement (SimpleDoFnRunner.java:185)
>  at org.apache.beam.repackaged.beam_runners_direct_java.runners.core.SimpleDoFnRunner.processElement (SimpleDoFnRunner.java:146)
>  at org.apache.beam.repackaged.beam_runners_direct_java.runners.core.SimplePushbackSideInputDoFnRunner.processElementInReadyWindows (SimplePushbackSideInputDoFnRunner.java:87)
>  at org.apache.beam.runners.direct.ParDoEvaluator.processElement (ParDoEvaluator.java:189)
>  at org.apache.beam.runners.direct.DoFnLifecycleManagerRemovingTransformEvaluator.processElement (DoFnLifecycleManagerRemovingTransformEvaluator.java:55)
>  at org.apache.beam.runners.direct.DirectTransformExecutor.processElements (DirectTransformExecutor.java:161)
>  at org.apache.beam.runners.direct.DirectTransformExecutor.run (DirectTransformExecutor.java:125)
>  at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
>  at java.util.concurrent.FutureTask.run (FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run (Thread.java:748)
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 8.817 s
> [INFO] Finished at: 2018-07-19T15:22:47-05:00
> [INFO] ------------------------------------------------------------------------
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java (default-cli) on project dataflowtwofive: An exception occured while executing the Java class. null: InvocationTargetException: java.lang.RuntimeException: java.io.EOFException -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> {quote}
> If anything else is needed to help solve this issue, please let me know.


