Posted to commits@beam.apache.org by "Shashank Prabhakara (JIRA)" <ji...@apache.org> on 2018/01/16 10:33:00 UTC

[jira] [Updated] (BEAM-3482) Java serialization exception when using BigQueryIO

     [ https://issues.apache.org/jira/browse/BEAM-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashank Prabhakara updated BEAM-3482:
--------------------------------------
    Description: 
When writing data to BigQuery using BigQueryIO, we get the following exception during checkpointing. To reproduce, run BigQueryIO.writeTableRows() on a sufficiently large dataset with the latest code from the master branch; a minimal reproduction sketch is included after the stack trace below. A patch is attached to this ticket.
The following stack trace is from a run using the ApexRunner:

ERROR com.datatorrent.stram.engine.StreamingContainer: Operator set [OperatorDeployInfo[id=20,name=BigQueryIO.Write/PrepareWrite/ParDo(Anonymous)/ParMultiDo(Anonymous),type=GENERIC,checkpoint={ffffffffffffffff, 0, 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=stream9,sourceNodeId=19,sourcePortName=output,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=<null>]],outputs=[OperatorDeployInfo.OutputDeployInfo[portName=output,streamId=stream56,bufferServer=<null>]]]] stopped running due to an exception.
com.esotericsoftware.kryo.KryoException: Error during Java serialization.
Serialization trace:
doFn (org.apache.beam.runners.apex.translation.operators.ApexParDoOperator)
         at com.esotericsoftware.kryo.serializers.JavaSerializer.write(JavaSerializer.java:33)
         at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
         at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
         at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599)
         at com.datatorrent.common.util.FSStorageAgent.store(FSStorageAgent.java:190)
         at com.datatorrent.common.util.AsyncFSStorageAgent.save(AsyncFSStorageAgent.java:101)
         at com.datatorrent.stram.engine.Node.checkpoint(Node.java:521)
         at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:461)
         at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1428)
 Caused by: java.io.NotSerializableException: org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations$SideInputAccessorViaProcessContext
         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
         at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
         at com.esotericsoftware.kryo.serializers.JavaSerializer.write(JavaSerializer.java:30)
         ... 9 more
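
For reference, a pipeline along the following lines is roughly what triggers the failure for us. This is only a sketch of the setup described above, not the exact job: the project, dataset, table name, schema and row count are placeholders, and the runner is selected with --runner=ApexRunner on the command line.

import java.util.Collections;

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;

public class BigQueryWriteRepro {
  public static void main(String[] args) {
    // Pass --runner=ApexRunner (plus the usual Apex options) to run on the Apex runner.
    Pipeline p = Pipeline.create(
        PipelineOptionsFactory.fromArgs(args).withValidation().create());

    p.apply(Create.of(1L))
        // Fan out enough rows that the runner checkpoints while the write is still in flight.
        .apply(ParDo.of(new DoFn<Long, TableRow>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            for (long i = 0; i < 10_000_000L; i++) {
              c.output(new TableRow().set("id", i));
            }
          }
        }))
        .apply(BigQueryIO.writeTableRows()
            // Placeholder destination and schema.
            .to("my-project:my_dataset.my_table")
            .withSchema(new TableSchema().setFields(Collections.singletonList(
                new TableFieldSchema().setName("id").setType("INTEGER"))))
            .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(WriteDisposition.WRITE_APPEND));

    p.run().waitUntilFinish();
  }
}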

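The direct cause visible in the trace: the Apex runner checkpoints its ApexParDoOperator by serializing the doFn field with plain Java serialization (via Kryo's JavaSerializer), and the DoFn from BigQueryIO.Write transitively references DynamicDestinations$SideInputAccessorViaProcessContext, which does not implement Serializable. The stand-alone snippet below shows the same failure mode in isolation; the class names are stand-ins for illustration, not Beam code.

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class NotSerializableDemo {
  // Stand-in for SideInputAccessorViaProcessContext: does not implement Serializable.
  static class SideInputAccessor {}

  // Stand-in for the DoFn held by the operator: Serializable itself, but it
  // keeps a reference to the non-serializable accessor.
  static class MyDoFn implements Serializable {
    final SideInputAccessor accessor = new SideInputAccessor();
  }

  public static void main(String[] args) throws Exception {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      // Fails with java.io.NotSerializableException: NotSerializableDemo$SideInputAccessor,
      // just as checkpointing fails for DynamicDestinations$SideInputAccessorViaProcessContext.
      out.writeObject(new MyDoFn());
    }
  }
}
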

> Java serialization exception when using BigQueryIO
> --------------------------------------------------
>
>                 Key: BEAM-3482
>                 URL: https://issues.apache.org/jira/browse/BEAM-3482
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>    Affects Versions: 2.3.0
>            Reporter: Shashank Prabhakara
>            Assignee: Chamikara Jayalath
>            Priority: Major
>              Labels: easyfix, newbie
>         Attachments: m.patch
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)