You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tez.apache.org by "Piyush Narang (JIRA)" <ji...@apache.org> on 2016/07/21 21:58:20 UTC

[jira] [Created] (TEZ-3369) Fixing Tez's DAGClients to work with Cascading

Piyush Narang created TEZ-3369:
----------------------------------

             Summary: Fixing Tez's DAGClients to work with Cascading
                 Key: TEZ-3369
                 URL: https://issues.apache.org/jira/browse/TEZ-3369
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Piyush Narang


Hi,
We seem to be running into issues when we try to use the newest version of Tez (0.9.0-SNAPSHOT) with Cascading. The issue seems to be:
{code}
java.lang.ClassCastException: cascading.stats.tez.util.TezTimelineClient cannot be cast to org.apache.tez.dag.api.client.DAGClient
	at cascading.stats.tez.util.TezStatsUtil.createTimelineClient(TezStatsUtil.java:142)
{code}
(Full stack trace at the end)

Relevant Cascading code is:
1) [Cascading tries to create a TezTimelineClient and cast it to a DAGClient | https://github.com/Cascading/cascading/blob/3.1/cascading-hadoop2-tez-stats/src/main/java/cascading/stats/tez/util/TezStatsUtil.java#L142]
2) [TezTimelineClient extends from DAGClientTimelineImpl | https://github.com/Cascading/cascading/blob/3.1/cascading-hadoop2-tez-stats/src/main/java/cascading/stats/tez/util/TezTimelineClient.java#L53]
3) [DAGClientTimelineImpl extends from DAGClientInternal | https://github.com/apache/tez/blob/dacd0191b684208d71ea457ca849f2d01212bb7e/tez-api/src/main/java/org/apache/tez/dag/api/client/DAGClientTimelineImpl.java#L68]
4) [DAGClientInternal extends Closeable which is why things break | https://github.com/apache/tez/blob/dacd0191b684208d71ea457ca849f2d01212bb7e/tez-api/src/main/java/org/apache/tez/dag/api/client/DAGClientInternal.java#L38].

This behavior was 'broken' in this [commit | https://github.com/apache/tez/commit/2af886b509015200e1c04527275474cbc771c667] (release 0.8.3)

The TezTimelineClient in Cascading seems to do two things:
1) DAGClient functionalities - ends up delegating to the inner DAGClient object.
2) Retrieve stuff like vertexID, vertexChildren and vertexChild (from this [interface|https://github.com/Cascading/cascading/blob/3.1/cascading-hadoop2-tez-stats/src/main/java/cascading/stats/tez/util/TimelineClient.java#L31]). 

As there's no good way to get the vertexID / vertexChildren / vertexChild (correct me if I'm wrong), they end up extending from the DAGClientTimelineImpl which has the http client and json parsing code to allow [things like this | https://github.com/Cascading/cascading/blob/3.1/cascading-hadoop2-tez-stats/src/main/java/cascading/stats/tez/util/TezTimelineClient.java#L93]:

{code}
@Override
  public String getVertexID( String vertexName ) throws IOException, TezException
    {
    // the filter 'vertexName' is in the 'otherinfo' field, so it must be requested, otherwise timeline server throws
    // an NPE. to be safe, we include both fields in the result
    String format = "%s/%s?primaryFilter=%s:%s&secondaryFilter=vertexName:%s&fields=%s";
    String url = String.format( format, baseUri, TEZ_VERTEX_ID, TEZ_DAG_ID, dagId, vertexName, FILTER_BY_FIELDS );

    JSONObject jsonRoot = getJsonRootEntity( url );
    JSONArray entitiesNode = jsonRoot.optJSONArray( ENTITIES );
...
{code}

Some options I can think of:
1) Ideally these methods getVertexID / getVertexChildren / getVertexChild would be part of DAGClient? Or even part of the DAGClientTimelineImpl? That way the cascading code wouldn't need updating if the uri changed / json format changed, it would end up being updated in these clients as well. I suspect adding this to DAGClient would require more work as it'll also need to be supported by the RPCClient and I don't think there are the relevant protos and such available. 
2) A simpler fix would be to have DAGClientInternal extend DAGClient (currently it just implements Closeable). This will not require any changes on the Cascading side as DAGClientTimelineImpl will continue to be a DAGClient. 

Full stack trace:
{code}
Exception in thread "flow com.twitter.data_platform.e2e_testing.jobs.parquet.E2ETestConvertThriftToParquet" java.lang.ClassCastException: cascading.stats.tez.util.TezTimelineClient cannot be cast to org.apache.tez.dag.api.client.DAGClient
	at cascading.stats.tez.util.TezStatsUtil.createTimelineClient(TezStatsUtil.java:142)
	at cascading.flow.tez.planner.Hadoop2TezFlowStepJob$1.getJobStatusClient(Hadoop2TezFlowStepJob.java:117)
	at cascading.flow.tez.planner.Hadoop2TezFlowStepJob$1.getJobStatusClient(Hadoop2TezFlowStepJob.java:105)
	at cascading.stats.tez.TezStepStats$1.getJobStatusClient(TezStepStats.java:60)
	at cascading.stats.tez.TezStepStats$1.getJobStatusClient(TezStepStats.java:56)
	at cascading.stats.CounterCache.cachedCounters(CounterCache.java:229)
	at cascading.stats.CounterCache.cachedCounters(CounterCache.java:187)
	at cascading.stats.CounterCache.getCounterValue(CounterCache.java:167)
	at cascading.stats.BaseCachedStepStats.getCounterValue(BaseCachedStepStats.java:105)
	at cascading.stats.FlowStats.getCounterValue(FlowStats.java:170)
	at cascading.flow.tez.Hadoop2TezFlow.getTotalSliceCPUMilliSeconds(Hadoop2TezFlow.java:303)
	at cascading.flow.BaseFlow.run(BaseFlow.java:1287)
	at cascading.flow.BaseFlow.access$100(BaseFlow.java:82)
	at cascading.flow.BaseFlow$1.run(BaseFlow.java:928)
	at java.lang.Thread.run(Thread.java:745)
Exception in thread "main" java.lang.Throwable: If you know what exactly caused this error, please consider contributing to GitHub via following link.
https://github.com/twitter/scalding/wiki/Common-Exceptions-and-possible-reasons#javalangclasscastexception
	at com.twitter.scalding.Tool$.main(Tool.scala:152)
	at com.twitter.scalding.Tool.main(Tool.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassCastException: cascading.stats.tez.util.TezTimelineClient cannot be cast to org.apache.tez.dag.api.client.DAGClient
	at cascading.stats.tez.util.TezStatsUtil.createTimelineClient(TezStatsUtil.java:142)
	at cascading.flow.tez.planner.Hadoop2TezFlowStepJob$1.getJobStatusClient(Hadoop2TezFlowStepJob.java:117)
	at cascading.flow.tez.planner.Hadoop2TezFlowStepJob$1.getJobStatusClient(Hadoop2TezFlowStepJob.java:105)
	at cascading.stats.tez.TezStepStats$1.getJobStatusClient(TezStepStats.java:60)
	at cascading.stats.tez.TezStepStats$1.getJobStatusClient(TezStepStats.java:56)
	at cascading.stats.CounterCache.cachedCounters(CounterCache.java:229)
	at cascading.stats.CounterCache.cachedCounters(CounterCache.java:187)
	at cascading.stats.CounterCache.getCountersFor(CounterCache.java:155)
	at cascading.stats.BaseCachedStepStats.getCountersFor(BaseCachedStepStats.java:93)
	at cascading.stats.FlowStats.getCountersFor(FlowStats.java:159)
	at com.twitter.scalding.Stats$.getAllCustomCounters(Stats.scala:93)
	at com.twitter.scalding.Job.handleStats(Job.scala:269)
	at com.twitter.scalding.Job.run(Job.scala:298)
	at com.twitter.scalding.Tool.start$1(Tool.scala:124)
	at com.twitter.scalding.Tool.run(Tool.scala:140)
	at com.twitter.scalding.Tool.run(Tool.scala:68)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at com.twitter.scalding.Tool$.main(Tool.scala:148)
	... 7 more
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)