Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2009/11/12 05:51:39 UTC
[jira] Assigned: (PIG-113) Make Grunt's explain output more understandable
[ https://issues.apache.org/jira/browse/PIG-113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates reassigned PIG-113:
------------------------------
Assignee: Pi Song
> Make Grunt's explain output more understandable
> -----------------------------------------------
>
> Key: PIG-113
> URL: https://issues.apache.org/jira/browse/PIG-113
> Project: Pig
> Issue Type: Improvement
> Components: grunt
> Affects Versions: 0.1.0
> Reporter: Pi Song
> Assignee: Pi Song
> Priority: Minor
> Fix For: 0.1.0
>
> Attachments: pig_printtree_1.patch, pig_printtree_2.patch
>
>
> I think the execution plan would be easier to read if it were displayed in a more understandable way. One intuitive option is to show the output as a tree, as SQL Server does.
> We could add 'AS <format>' as an optional argument to the explain command.
> For example:
> {noformat}
> Grunt> explain bag1 AS tree ;
> Grunt> explain bag1 AS xml ;
> {noformat}
> and
> {noformat}
> Grunt> explain bag1
> {noformat}
> will display the default format.
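The optional 'AS <format>' argument described above could be parsed roughly as follows. This is only a sketch under stated assumptions: the class `ExplainCommand`, its `Format` enum, and the regular expression are hypothetical and are not taken from the attached patches or from Grunt's actual parser. The input is the command text without the `Grunt>` prompt.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for a parsed "explain <alias> [AS <format>]" command.
class ExplainCommand {
    // Assumed set of formats: the default one plus the two named in the proposal.
    enum Format { DEFAULT, TREE, XML }

    // explain <alias> [AS <format>] [;]  (keywords matched case-insensitively)
    private static final Pattern SYNTAX = Pattern.compile(
            "^\\s*explain\\s+(\\w+)(?:\\s+as\\s+(\\w+))?\\s*;?\\s*$",
            Pattern.CASE_INSENSITIVE);

    final String alias;
    final Format format;

    ExplainCommand(String alias, Format format) {
        this.alias = alias;
        this.format = format;
    }

    static ExplainCommand parse(String line) {
        Matcher m = SYNTAX.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an explain command: " + line);
        }
        String requested = m.group(2);
        // No AS clause means the default format is used.
        Format format = Format.DEFAULT;
        if (requested != null) {
            // valueOf throws IllegalArgumentException for an unknown format name.
            format = Format.valueOf(requested.toUpperCase(Locale.ROOT));
        }
        return new ExplainCommand(m.group(1), format);
    }

    public static void main(String[] args) {
        ExplainCommand cmd = parse("explain bag1 AS tree ;");
        System.out.println(cmd.alias + " -> " + cmd.format);
    }
}
```

Keeping the format in an enum means a later format such as XML only needs a new constant and a matching printer, while the parsing stays unchanged.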
> I have attached a patch that generates tree output.
> Here is a sample of the existing output format:
> {noformat}
> Logical Plan:
> Group root-Sun Feb 17 19:37:07 GMT+10:00 2008-5
> Object id: 9814147
> Inputs: 26335425
> Schema: (group, (sum, (), (), ()))
> EvalSpecs:
> Generate: has 2 children
> Project: (0)
> Star
> Split root-Sun Feb 17 19:37:07 GMT+10:00 2008-2
> Object id: 25199001
> Inputs: 29132923
> Schema: (sum, (), (), ())
> EvalSpecs:
> Eval root-Sun Feb 17 19:37:07 GMT+10:00 2008-1
> Object id: 29132923
> Inputs: 10774273
> Schema: (sum, (), (), ())
> EvalSpecs:
> Generate: has 4 children
> FuncEval: name: org.apache.pig.impl.builtin.ADD args:
> Generate: has 2 children
> Project: (0)
> Project: (1)
> Project: (0)
> Project: (1)
> Project: (2)
> Load root-Sun Feb 17 19:37:07 GMT+10:00 2008-0
> Object id: 10774273
> Inputs:
> Schema: ()
> EvalSpecs:
> -----------------------------------------------
> Physical Plan:
> MAPREDUCE
> Object id: 17671659
> Inputs: 682933706
> Map:
> Star
> Grouping Funcs:
> Generate: has 2 children
> Project: (0)
> Star
> Input Files: /tmp/temp678140026/tmp1867058340
> MAPREDUCE
> Object id: 17308974
> Inputs:
> Map:
> Composite: has 2 children
> Star
> Generate: has 4 children
> FuncEval: name: org.apache.pig.impl.builtin.ADD args:
> Generate: has 2 children
> Project: (0)
> Project: (1)
> Project: (0)
> Project: (1)
> Project: (2)
> Input Files: /tmp/data1.txt
> Output File: /tmp/temp678140026/tmp1613817084
> {noformat}
> Here is a sample of my tree output, which is more compact and easier to understand:
> {noformat}
> grunt> explain c1 as tree ;
> Logical Plan:
> |---LOCogroup ( GENERATE {[PROJECT $0],[*]} )
> |---LOSplitOutput ( )
> |---LOSplit ( ([PROJECT $0] < ['5']),([PROJECT $0] >= ['5']) )
> |---LOEval ( GENERATE {[org.apache.pig.impl.builtin.ADD(GENERATE {[PROJECT $0],[PROJECT $1]})],[PROJECT $0],[PROJECT $1],[PROJECT $2]} )
> |---LOLoad ( file = /tmp/data1.txt )
> -----------------------------------------------
> Physical Plan:
> |---POMapreduce
> Map : *
> Grouping : Generate(Project(0),*)
> Input File(s) : /tmp/temp678140026/tmp1867058340
> |---POMapreduce
> Map : Composite(*,Generate(FuncEval(org.apache.pig.impl.builtin.ADD(Generate(Project(0),Project(1)))),Project(0),Project(1),Project(2)))
> Input File(s) : /tmp/data1.txt
> {noformat}
> I'm also thinking about adding XML output, as it might benefit people who are working on displaying execution plans in a GUI.
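The tree format shown in the sample can be produced by a simple recursive walk over the plan, printing each operator with a `|---` prefix and indenting its inputs one level deeper. The following is a minimal sketch and not the code from pig_printtree_1.patch or pig_printtree_2.patch: the `PlanTreePrinter` and `Node` classes are hypothetical stand-ins for Pig's operator classes, and the four-space indent is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class PlanTreePrinter {
    /** Hypothetical plan operator: a label plus the operators that feed it. */
    static final class Node {
        final String label;
        final List<Node> inputs = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        Node addInput(Node input) {
            inputs.add(input);
            return this;
        }
    }

    private static final String BRANCH = "|---";
    private static final String INDENT = "    "; // assumed indent width

    /** Renders the operator and, recursively, all of its inputs. */
    static String render(Node root) {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    private static void render(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append(INDENT);
        }
        out.append(BRANCH).append(node.label).append('\n');
        // Each input is printed one level deeper than the operator consuming it.
        for (Node input : node.inputs) {
            render(input, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Labels borrowed from the sample above; the plan itself is illustrative.
        Node load = new Node("LOLoad ( file = /tmp/data1.txt )");
        Node eval = new Node("LOEval").addInput(load);
        Node split = new Node("LOSplit").addInput(eval);
        System.out.print(render(split));
    }
}
```

Because the walk only depends on a label and a list of inputs, the same traversal could emit XML by swapping the line-building step, which is one way the two proposed formats might share code.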