Posted to user@tez.apache.org by Adrian Nicoara <ad...@microsoft.com> on 2018/08/07 00:51:41 UTC

Null EdgeManager deadlock

Hello,

Consider the following graph:
A ---- E ----> B
With source vertex A connected to destination vertex B through edge E.

Edge E is defined through:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-api/src/main/java/org/apache/tez/dag/api/EdgeProperty.java#L136-L156
with a null EdgeManagerPluginDescriptor, so that the edge manager is set up at runtime.

Assume that both vertex A and vertex B have a custom VertexManager, and need to be configured at runtime.
Vertex B waits for vertex A to be configured, before attempting to configure itself. As part of its configuration, vertex B will use:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPluginContext.java#L172-L208
to create the edge manager for E amongst other things.

However, because of the following code:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2898-L2909
which adds edge E to vertex A's set of uninitialized edges, once vertex A has been reconfigured, its call to:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPluginContext.java#L349-L353
will end up not sending any VertexState.CONFIGURED updates:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L1932
because canInitVertex returns false, as vertex.uninitializedEdges is not empty:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L3172

Now, vertex A has no way of creating an edge manager for edge E, so that edge will never be initialized - only vertex B does that initialization.
This ends up in a deadlock, as vertex B never gets the notification that vertex A has been configured.
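
The circular wait described above can be reproduced in a small, self-contained toy model. This is only a sketch: ToyVertex, its fields, and the listener list are hypothetical stand-ins and not the real VertexImpl or Tez API. The only behaviour copied from the description is that the CONFIGURED notification is suppressed while the set of uninitialized edges is non-empty.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NullEdgeManagerDeadlock {
    /** Toy stand-in for a vertex; hypothetical, not the real VertexImpl. */
    static class ToyVertex {
        final String name;
        final Set<String> uninitializedEdges = new HashSet<>();
        final List<Runnable> configuredListeners = new ArrayList<>();

        ToyVertex(String name) {
            this.name = name;
        }

        // Mirrors the condition from the thread: a vertex cannot finish
        // initializing while any attached edge has no edge manager.
        boolean canInitVertex() {
            return uninitializedEdges.isEmpty();
        }

        // Models the "done reconfiguring" call made by the vertex manager.
        void doneReconfiguring() {
            if (!canInitVertex()) {
                return; // the CONFIGURED notification is never published
            }
            for (Runnable listener : configuredListeners) {
                listener.run();
            }
        }
    }

    /** Runs the A ---- E ----> B scenario; returns true if E ever gets initialized. */
    static boolean edgeGetsInitialized() {
        ToyVertex a = new ToyVertex("A");
        ToyVertex b = new ToyVertex("B");
        // E was created without an edge manager, so both endpoints track it.
        a.uninitializedEdges.add("E");
        b.uninitializedEdges.add("E");
        // B sets the edge manager only after hearing that A is configured.
        a.configuredListeners.add(() -> {
            a.uninitializedEdges.remove("E");
            b.uninitializedEdges.remove("E");
        });
        a.doneReconfiguring();
        return a.uninitializedEdges.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("edge E initialized: " + edgeGetsInitialized());
    }
}
```

In this model A waits for E, E waits for B, and B waits for A's notification, so the listener never runs.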

What, then, is the right way of constructing this graph?
Do uninitialized target edges need to be monitored in source vertices at all? i.e. is this code needed:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2898-L2909

Thank you for any insight.
Adrian

RE: Null EdgeManager deadlock

Posted by ". Anupam" <an...@microsoft.com.INVALID>.
Moving to dev@


RE: Null EdgeManager deadlock

Posted by Adrian Nicoara <ad...@microsoft.com>.
After a bit of trial and error, it seems that the condition for all edges to be initialized is needed, because:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/Edge.java#L337-L338

I fixed my scenario by having B listen for the VertexState.PARALLELISM_UPDATED event, as that will always be sent by A (in my scenario):
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L1868
and it is equivalent, for my purposes, to A being configured.
Sorry for the noise.
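
The workaround can be illustrated with the same kind of toy model. This is a sketch under stated assumptions: ToyState, ToyVertex, and the listener registry are hypothetical and not the Tez API. It assumes, as in the scenario above, that A always changes its parallelism, and that PARALLELISM_UPDATED is published at that point whether or not the edge is initialized, while CONFIGURED stays gated on the edge.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class ParallelismUpdatedWorkaround {
    /** Toy subset of vertex states; hypothetical, not the full Tez enum. */
    enum ToyState { PARALLELISM_UPDATED, CONFIGURED }

    /** Toy source vertex with one outgoing edge; hypothetical stand-in. */
    static class ToyVertex {
        boolean edgeInitialized = false;
        final Map<ToyState, List<Runnable>> listeners = new EnumMap<>(ToyState.class);

        void listen(ToyState state, Runnable listener) {
            listeners.computeIfAbsent(state, k -> new ArrayList<>()).add(listener);
        }

        void publish(ToyState state) {
            List<Runnable> registered =
                listeners.getOrDefault(state, Collections.<Runnable>emptyList());
            for (Runnable listener : registered) {
                listener.run();
            }
        }

        // Assumption: a parallelism change is announced immediately,
        // regardless of the state of the outgoing edge.
        void reconfigure() {
            publish(ToyState.PARALLELISM_UPDATED);
        }

        // CONFIGURED is only announced once the edge has an edge manager.
        void doneReconfiguring() {
            if (edgeInitialized) {
                publish(ToyState.CONFIGURED);
            }
        }
    }

    /** Returns true if B, listening for the given state, gets to initialize the edge. */
    static boolean run(ToyState trigger) {
        ToyVertex a = new ToyVertex();
        // B's reaction: set the edge manager, which initializes the edge.
        a.listen(trigger, () -> a.edgeInitialized = true);
        a.reconfigure();
        a.doneReconfiguring();
        return a.edgeInitialized;
    }

    public static void main(String[] args) {
        System.out.println("listen for CONFIGURED: " + run(ToyState.CONFIGURED));
        System.out.println("listen for PARALLELISM_UPDATED: " + run(ToyState.PARALLELISM_UPDATED));
    }
}
```

Listening for the earlier event breaks the cycle in this model because that event does not depend on the edge being initialized.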

One last question: if B can configure the edge managers at any time after tasks in A have run, is there code that asserts that the new EdgeManager specifies the same number of outputs for each source task in A as the old EdgeManager did, and hence matches the output spec that the tasks in A actually ran with?
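
To make the question concrete, a check of this kind might look like the sketch below. Everything here is hypothetical: ToyEdgeManager and checkCompatible are illustration-only names, and nothing in this thread confirms whether Tez performs such a validation.

```java
class EdgeOutputCountCheck {
    /** Hypothetical stand-in for the one edge manager property that matters here. */
    interface ToyEdgeManager {
        int numSourceTaskPhysicalOutputs(int sourceTaskIndex);
    }

    /**
     * Throws if the replacement edge manager expects a different number of
     * outputs from any source task than the edge manager it replaces.
     */
    static void checkCompatible(ToyEdgeManager oldManager,
                                ToyEdgeManager newManager,
                                int numSourceTasks) {
        for (int task = 0; task < numSourceTasks; task++) {
            int before = oldManager.numSourceTaskPhysicalOutputs(task);
            int after = newManager.numSourceTaskPhysicalOutputs(task);
            if (before != after) {
                throw new IllegalStateException(
                    "source task " + task + " ran with " + before
                        + " outputs but the new edge manager expects " + after);
            }
        }
    }

    public static void main(String[] args) {
        ToyEdgeManager original = task -> 4;
        ToyEdgeManager sameShape = task -> 4;
        ToyEdgeManager differentShape = task -> 8;

        checkCompatible(original, sameShape, 3); // passes silently
        try {
            checkCompatible(original, differentShape, 3);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```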
