Posted to dev@pig.apache.org by "Daniel Dai (Created) (JIRA)" <ji...@apache.org> on 2012/03/14 06:05:55 UTC

[jira] [Created] (PIG-2586) A better plan/data flow visualizer

A better plan/data flow visualizer
----------------------------------

                 Key: PIG-2586
                 URL: https://issues.apache.org/jira/browse/PIG-2586
             Project: Pig
          Issue Type: Improvement
          Components: impl
            Reporter: Daniel Dai


Pig supports a dot-graph-style plan for visualizing the logical/physical/mapreduce plan (explain with the -dot option, see http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). However, the dot graph takes an extra step to turn into a plan graph, and the quality of the output is not good. We should implement a better visualizer for Pig. It should:
1. show operator type and alias
2. turn the output schema on/off
3. dive into a foreach inner plan on demand
4. provide a way to show operator source code, e.g., as a tooltip on an operator (the plan doesn't currently have this information, but you can assume it is in place)
5. besides visualizing the logical/physical/mapreduce plan, visualize the script itself, which is also useful
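
To make requirements 1 to 3 concrete, here is a minimal sketch of a plan model with the two display toggles. It is an illustration only, not Pig code: the `Operator` class, its field names, and the `render` function are hypothetical, and the operator names are used purely as sample labels.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Operator:
    """One node of a plan. All names here are hypothetical, not Pig's API."""
    op_type: str                      # sample label such as "LOLoad"
    alias: str                        # alias from the script, e.g. "A"
    schema: Optional[str] = None      # output schema as display text
    inner_plan: List["Operator"] = field(default_factory=list)


def render(ops, show_schema=False, expand_inner=False, indent=0):
    """Return one text line per operator, honoring the two display toggles."""
    lines = []
    for op in ops:
        label = f"{'  ' * indent}{op.alias}: {op.op_type}"
        if show_schema and op.schema:
            label += f" {op.schema}"
        if op.inner_plan and not expand_inner:
            label += " [+]"           # marks a collapsed inner plan
        lines.append(label)
        if op.inner_plan and expand_inner:
            lines.extend(
                render(op.inner_plan, show_schema, expand_inner, indent + 1)
            )
    return lines


plan = [
    Operator("LOLoad", "A", "(name:chararray, age:int)"),
    Operator("LOForEach", "B", "(name:chararray)",
             inner_plan=[Operator("LOGenerate", "B", "(name:chararray)")]),
]

print("\n".join(render(plan)))
print("\n".join(render(plan, show_schema=True, expand_inner=True)))
```

A real visualizer would draw nodes rather than print text, but the same two flags could drive what each node shows.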

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481080#comment-13481080 ] 

Daniel Dai commented on PIG-2586:
---------------------------------

There is a partial patch at https://github.com/manuranga/svg-graph. It has not been linked to Pig yet.
                
> A better plan/data flow visualizer
> ----------------------------------
>
>                 Key: PIG-2586
>                 URL: https://issues.apache.org/jira/browse/PIG-2586
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Daniel Dai
>              Labels: gsoc2012
>
> Pig supports a dot-graph-style plan for visualizing the logical/physical/mapreduce plan (explain with the -dot option, see http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). However, the dot graph takes an extra step to turn into a plan graph, and the quality of the output is not good. We should implement a better visualizer for Pig. It should:
> 1. show operator type and alias
> 2. turn the output schema on/off
> 3. dive into a foreach inner plan on demand
> 4. provide a way to show operator source code, e.g., as a tooltip on an operator (the plan doesn't currently have this information, but you can assume it is in place)
> 5. besides visualizing the logical/physical/mapreduce plan, visualize the script itself, which is also useful
> 6. possibly rely on a Java graphics library such as Swing
> This is a candidate project for Google Summer of Code 2012. More information about the program can be found at https://cwiki.apache.org/confluence/display/PIG/GSoc2012


[jira] [Updated] (PIG-2586) A better plan/data flow visualizer

Posted by "Daniel Dai (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2586:
----------------------------

    Description: 
Pig supports a dot-graph-style plan for visualizing the logical/physical/mapreduce plan (explain with the -dot option, see http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). However, the dot graph takes an extra step to turn into a plan graph, and the quality of the output is not good. We should implement a better visualizer for Pig. It should:
1. show operator type and alias
2. turn the output schema on/off
3. dive into a foreach inner plan on demand
4. provide a way to show operator source code, e.g., as a tooltip on an operator (the plan doesn't currently have this information, but you can assume it is in place)
5. besides visualizing the logical/physical/mapreduce plan, visualize the script itself, which is also useful
6. possibly rely on a Java graphics library such as Swing

This is a candidate project for Google Summer of Code 2012. More information about the program can be found at https://cwiki.apache.org/confluence/display/PIG/GSoc2012

  was:
Pig supports a dot-graph-style plan for visualizing the logical/physical/mapreduce plan (explain with the -dot option, see http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). However, the dot graph takes an extra step to turn into a plan graph, and the quality of the output is not good. We should implement a better visualizer for Pig. It should:
1. show operator type and alias
2. turn the output schema on/off
3. dive into a foreach inner plan on demand
4. provide a way to show operator source code, e.g., as a tooltip on an operator (the plan doesn't currently have this information, but you can assume it is in place)
5. besides visualizing the logical/physical/mapreduce plan, visualize the script itself, which is also useful
6. possibly rely on a Java graphics library such as Swing

    

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270692#comment-13270692 ] 

Daniel Dai commented on PIG-2586:
---------------------------------

@Dmitriy
Sorry, I just noticed this. We got one student accepted. The proposal is at http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/manuranga/5002.

@Bill
Can you post some screenshots?

Also we need to make use of the source location (PIG-2659) in the visualizer.
                

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265100#comment-13265100 ] 

Bill Graham commented on PIG-2586:
----------------------------------

As part of Twitter's Hackweek we developed a first pass at a visualization tool for Pig that focuses on visualizing the run-time execution of the jobs in a Pig script. This helps our developers when running scripts with very large DAGs. We're in the process of open sourcing it, but I'll describe it here to see if parts of it might be leveraged, built upon or learned from.

* Design
When executing a Pig script from the command line, we insert a {{PigProgressNotificationListener}} (PPNL) per PIG-2525. The PPNL launches an embedded Jetty server that exposes a JSON API of DAG/script/job/progress info. Also embedded is the HTML/JS/CSS content for a single page that renders the DAG, polls for updates, and shows progress.

* Viz
We use d3.js to render a chord diagram of the script (see http://mbostock.github.com/d3/ex/chord.html), where each arc in the circle is a job and each chord is a dependency. This requires PIG-2660. We also render a table view of all jobs, where we initially show alias and feature, then add jobName, #reducers, #mappers and progress percentages once we have them. Other related patches required are PIG-2663 and PIG-2664.

* Future work
- Better visualization. The chord diagram is OK, but we'd like to find a good JS library for DAG rendering (à la Graphviz) and include that as an option too.
- Non-embedded mode. The Jetty server should be deployable as a standalone app server. Clients can push their state to it, and the server has a persistent data store. Embedded mode is still useful during development.
- Better script bindings. Being able to reference a pop-up of the script with highlighting of certain parts (see PIG-2659) would be useful.
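
For readers unfamiliar with chord diagrams: a chord layout such as d3's takes a square matrix as input, with one row and column per arc. The sketch below shows one way a job DAG could be turned into such a matrix. The function name, the job ids, and the choice of a symmetric 0/1 weighting are assumptions made for illustration; the comment above does not say how the actual tool builds its matrix.

```python
def chord_matrix(jobs, deps):
    """Build a square matrix for a chord layout.

    jobs: ordered list of job ids (one arc each).
    deps: (upstream, downstream) pairs (one chord each).
    The matrix is symmetric so both ends of a dependency get some arc width.
    """
    index = {job: i for i, job in enumerate(jobs)}
    matrix = [[0] * len(jobs) for _ in jobs]
    for upstream, downstream in deps:
        matrix[index[upstream]][index[downstream]] = 1
        matrix[index[downstream]][index[upstream]] = 1
    return matrix


# Hypothetical three-job script: job_3 depends on job_1 and job_2.
jobs = ["job_1", "job_2", "job_3"]
deps = [("job_1", "job_3"), ("job_2", "job_3")]
matrix = chord_matrix(jobs, deps)
print(matrix)
```

A JSON API like the one described could serve this matrix together with the job list, so the page only has to hand it to the layout.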


                

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Aniket Mokashi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489838#comment-13489838 ] 

Aniket Mokashi commented on PIG-2586:
-------------------------------------

Is there any work/patch for explain -script 111.pig -graphics? This is a very useful feature.
                

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264608#comment-13264608 ] 

Dmitriy V. Ryaboy commented on PIG-2586:
----------------------------------------

I know at least one of the students who applied to GSOC for this project was accepted, with Russ and Daniel co-mentoring. Could the proposal be posted in this ticket?


                

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229507#comment-13229507 ] 

Daniel Dai commented on PIG-2586:
---------------------------------

Great, thanks Russell!
                

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Manuranga Perera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410996#comment-13410996 ] 

Manuranga Perera commented on PIG-2586:
---------------------------------------

Hi, I am currently working on creating (client-side) SVGs for a given plan (in JSON format).
The source code is available in the following GitHub repo: https://github.com/manuranga/svg-graph
Currently it can generate simple plans with some nested plans.
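
As a rough illustration of the idea (a plan in JSON goes in, an SVG comes out), here is a small sketch. It is not taken from the repository above, which works client-side in the browser; Python is used here only to show the shape of the transformation. The JSON layout (`nodes` with `id` and `label`, `edges` as id pairs) and the function name are assumptions.

```python
import json
from xml.sax.saxutils import escape


def plan_to_svg(plan_json, box_w=160, box_h=30, gap=20):
    """Stack operators top to bottom and draw one line per edge."""
    plan = json.loads(plan_json)
    nodes = plan["nodes"]
    # Vertical position of each box, keyed by node id.
    y_of = {n["id"]: i * (box_h + gap) for i, n in enumerate(nodes)}
    parts = []
    for n in nodes:
        y = y_of[n["id"]]
        parts.append(
            f'<rect x="0" y="{y}" width="{box_w}" height="{box_h}" '
            'fill="none" stroke="black"/>'
        )
        parts.append(f'<text x="8" y="{y + 20}">{escape(n["label"])}</text>')
    mid = box_w // 2
    for src, dst in plan["edges"]:
        # Connect the bottom of the source box to the top of the target box.
        parts.append(
            f'<line x1="{mid}" y1="{y_of[src] + box_h}" '
            f'x2="{mid}" y2="{y_of[dst]}" stroke="black"/>'
        )
    height = len(nodes) * (box_h + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{box_w}" height="{height}">' + "".join(parts) + "</svg>"
    )


# Hypothetical two-operator plan.
sample = json.dumps({
    "nodes": [{"id": "A", "label": "A: LOLoad"},
              {"id": "B", "label": "B: LOFilter"}],
    "edges": [["A", "B"]],
})
svg = plan_to_svg(sample)
print(svg)
```

Nested plans could be handled by calling the same function recursively and embedding the result inside the parent box, which this sketch leaves out.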

                

        

[jira] [Updated] (PIG-2586) A better plan/data flow visualizer

Posted by "Daniel Dai (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2586:
----------------------------

    Description: 
Pig supports a dot-graph-style plan for visualizing the logical/physical/mapreduce plan (explain with the -dot option, see http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). However, the dot graph takes an extra step to turn into a plan graph, and the quality of the output is not good. We should implement a better visualizer for Pig. It should:
1. show operator type and alias
2. turn the output schema on/off
3. dive into a foreach inner plan on demand
4. provide a way to show operator source code, e.g., as a tooltip on an operator (the plan doesn't currently have this information, but you can assume it is in place)
5. besides visualizing the logical/physical/mapreduce plan, visualize the script itself, which is also useful
6. possibly rely on a Java graphics library such as Swing

  was:
Pig supports a dot-graph-style plan for visualizing the logical/physical/mapreduce plan (explain with the -dot option, see http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). However, the dot graph takes an extra step to turn into a plan graph, and the quality of the output is not good. We should implement a better visualizer for Pig. It should:
1. show operator type and alias
2. turn the output schema on/off
3. dive into a foreach inner plan on demand
4. provide a way to show operator source code, e.g., as a tooltip on an operator (the plan doesn't currently have this information, but you can assume it is in place)
5. besides visualizing the logical/physical/mapreduce plan, visualize the script itself, which is also useful

    

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489961#comment-13489961 ] 

Daniel Dai commented on PIG-2586:
---------------------------------

Unfortunately no.
                

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270897#comment-13270897 ] 

Bill Graham commented on PIG-2586:
----------------------------------

Yes, I should be able to post some sanitized snapshots later this week. We also have plans to integrate PIG-2659.
                

        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Aniket Mokashi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476599#comment-13476599 ] 

Aniket Mokashi commented on PIG-2586:
-------------------------------------

Do we have a patch for this?
                

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Dimitris Bousis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244794#comment-13244794 ] 

Dimitris Bousis commented on PIG-2586:
--------------------------------------

Hi all,

My name is Dimitris Bousis, and I am currently doing my Master's in Computer Engineering & Informatics at the University of Patras, Greece. My research interests include cloud and distributed computing with related technologies such as Hadoop, HBase, Cassandra, Pig and Hive. Though I have not started any research activity with these technologies (I plan to do so after the summer), I took an elective course on Hadoop, HDFS, HBase and Cassandra during my undergraduate studies.

I am interested in applying for this GSoC 2012 project. Flow visualization is really useful for debugging and breaking down any form of structured query. From the mentor's comment above I assume that there should be a web interface that parses the DOT format in order to present the plans produced by explain. Furthermore, I'd like to suggest D3.js, a JS library that lets you bind arbitrary data to a Document Object Model (DOM) and then apply data-driven transformations to the document. This library uses HTML5, CSS3 and SVG to represent data within a page.

Please comment on this post with anything you consider necessary. Looking forward to working with you this summer.

Dimitris Bousis


                
> A better plan/data flow visualizer
> ----------------------------------
>
>                 Key: PIG-2586
>                 URL: https://issues.apache.org/jira/browse/PIG-2586
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Daniel Dai
>              Labels: gsoc2012
>
> Pig supports a dot-graph-style plan to visualize the logical/physical/mapreduce plan (explain with the -dot option, see http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html). However, the dot graph takes an extra step to generate the plan graph, and the quality of the output is not good. It would be better to implement a better visualizer for Pig. It should:
> 1. show operator type and alias
> 2. turn on/off output schema
> 3. dive into foreach inner plan on demand
> 4. provide a way to show operator source code, e.g., as a tooltip on an operator (the plan doesn't currently have this information, but you can assume it is in place)
> 5. besides visualizing the logical/physical/mapreduce plan, visualizing the script itself is also useful
> 6. may rely on a Java graphics library such as Swing
> This is a candidate project for Google Summer of Code 2012. More information about the program can be found at https://cwiki.apache.org/confluence/display/PIG/GSoc2012


        

[jira] [Commented] (PIG-2586) A better plan/data flow visualizer

Posted by "Russell Jurney (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229503#comment-13229503 ] 

Russell Jurney commented on PIG-2586:
-------------------------------------

I would like to mentor this.  The way to go here is with a web interface to this data.  

For instance, using these libraries:

https://github.com/glejeune/Ruby-Graphviz
http://www.sinatrarb.com/
http://neyric.github.com/wireit/

there would be enough time for a GSoC participant to make serious progress on this.
                
