Posted to issues@tez.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2021/10/19 19:17:00 UTC

[jira] [Updated] (TEZ-4152) Upgrade to protobuf 3.x and take care of relocated protobuf classes

     [ https://issues.apache.org/jira/browse/TEZ-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor updated TEZ-4152:
------------------------------
    Description: 
Jiras under HADOOP-13363 cover the process of protobuf upgrade and relocation in Hadoop.

Tez is on protobuf 2.5, while Hadoop 3.3 is on protobuf 3.x.
Tez usually follows Hadoop's dependency versions, so a Hadoop 3.3 upgrade means a protobuf upgrade in Tez as well. An additional relocation-like step will also be needed in Tez, because Hadoop expects protobuf messages from the org.apache.hadoop.thirdparty.protobuf package, even where that package is not exposed in public APIs. For example:
{code}
java.lang.ClassCastException: org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$SubmitDAGRequestProto cannot be cast to org.apache.hadoop.thirdparty.protobuf.Message
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
	at com.sun.proxy.$Proxy11.submitDAG(Unknown Source)
	at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:706)
	at org.apache.tez.client.TezClient.submitDAG(TezClient.java:593)
	at org.apache.tez.dag.app.TestMockDAGAppMaster.testMixedEdgeRouting(TestMockDAGAppMaster.java:392)
{code}

Here is the failing line:
https://github.com/apache/hadoop/blob/e103c83765898f756f88c27b2243c8dd3098a989/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java#L232
{code}
final Message theRequest = (Message) args[1];
{code}
Relocation means a full rewrite in the binary, so here Message refers to org.apache.hadoop.thirdparty.protobuf.Message, but Tez supplies a non-relocated message, which implements com.google.protobuf.Message. (The Hadoop method signature takes Object, so the code compiles.)
So even if protobuf 2.5 is fully compatible with 3.x (not checked yet), we will need extra effort on the Tez side to generate protobuf messages that are compatible with Hadoop's relocated messages. As these classes are generated from proto files, we cannot and should not hack them by manually manipulating the Java source code.
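
To illustrate why this compiles but fails at runtime, here is a minimal, self-contained sketch. It does not use the real protobuf or Hadoop classes: OriginalMessage, RelocatedMessage and SubmitDagRequest are hypothetical stand-ins for com.google.protobuf.Message, org.apache.hadoop.thirdparty.protobuf.Message and the generated Tez request class, and extractRequest only mimics the shape of the cast in ProtobufRpcEngine.

```java
public class RelocationCastDemo {
    // Hypothetical stand-in for com.google.protobuf.Message (non-relocated).
    interface OriginalMessage {}

    // Hypothetical stand-in for org.apache.hadoop.thirdparty.protobuf.Message (relocated).
    // Same shape as OriginalMessage, but a completely unrelated type to the JVM.
    interface RelocatedMessage {}

    // Hypothetical stand-in for a message class generated against the non-relocated protobuf.
    static final class SubmitDagRequest implements OriginalMessage {}

    // Mimics the failing line: the arguments are typed as Object,
    // so the compiler accepts the cast and the check happens only at runtime.
    static RelocatedMessage extractRequest(Object[] args) {
        return (RelocatedMessage) args[1];
    }

    // Returns true if the cast is rejected with a ClassCastException.
    static boolean castFails(Object request) {
        try {
            extractRequest(new Object[] {null, request});
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A message built against the non-relocated interface is rejected.
        System.out.println("non-relocated rejected: " + castFails(new SubmitDagRequest()));
        // A message implementing the relocated interface is accepted.
        System.out.println("relocated rejected: " + castFails(new RelocatedMessage() {}));
    }
}
```

The sketch only shows the type mismatch. It says nothing about wire compatibility between protobuf 2.5 and 3.x, which is a separate question.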

I'm thinking of a Maven profile-based approach that can take care of both protobuf 3 and relocation-compatible protobuf objects on the Tez side.

  was:
Jiras under HADOOP-13363 cover the process of protobuf upgrade and relocation in Hadoop.

Tez is on protobuf 2.5, while Hadoop 3.3 is on protobuf 3.x.
Tez usually follows Hadoop's dependency versions, so a Hadoop 3.3 upgrade means a protobuf upgrade in Tez as well. An additional relocation-like step will also be needed in Tez, because Hadoop expects protobuf messages from the org.apache.hadoop.thirdparty.protobuf package, even where that package is not exposed in public APIs. For example:
{code}
java.lang.ClassCastException: org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$SubmitDAGRequestProto cannot be cast to org.apache.hadoop.thirdparty.protobuf.Message
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
	at com.sun.proxy.$Proxy11.submitDAG(Unknown Source)
	at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:706)
	at org.apache.tez.client.TezClient.submitDAG(TezClient.java:593)
	at org.apache.tez.dag.app.TestMockDAGAppMaster.testMixedEdgeRouting(TestMockDAGAppMaster.java:392)
{code}

Here is the failing line:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java#L237
{code}
final Message theRequest = (Message) args[1];
{code}
Relocation means a full rewrite in the binary, so here Message refers to org.apache.hadoop.thirdparty.protobuf.Message, but Tez supplies a non-relocated message, which implements com.google.protobuf.Message. (The Hadoop method signature takes Object, so the code compiles.)
So even if protobuf 2.5 is fully compatible with 3.x (not checked yet), we will need extra effort on the Tez side to generate protobuf messages that are compatible with Hadoop's relocated messages. As these classes are generated from proto files, we cannot and should not hack them by manually manipulating the Java source code.

I'm thinking of a Maven profile-based approach that can take care of both protobuf 3 and relocation-compatible protobuf objects on the Tez side.


> Upgrade to protobuf 3.x and take care of relocated protobuf classes
> -------------------------------------------------------------------
>
>                 Key: TEZ-4152
>                 URL: https://issues.apache.org/jira/browse/TEZ-4152
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major


