Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2008/06/16 23:45:44 UTC

[jira] Created: (PIG-271) Add tutorial files and builds to Pig SVN

Add tutorial files and builds to Pig SVN
----------------------------------------

                 Key: PIG-271
                 URL: https://issues.apache.org/jira/browse/PIG-271
             Project: Pig
          Issue Type: New Feature
            Reporter: Olga Natkovich


Corrine built a tutorial for Pig: http://wiki.apache.org/pig/PigTutorial.

We should store the files in SVN and also add build targets to construct the tutorial.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606139#action_12606139 ] 

Benjamin Reed commented on PIG-271:
-----------------------------------

Patch looks good technically. It seems pretty racy to be used as a tutorial example. My only critique is that 2 spaces are used to indent rather than the Pig standard of 4 spaces.



[jira] Commented: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606606#action_12606606 ] 

Olga Natkovich commented on PIG-271:
------------------------------------

I have uploaded a new patch. Please review. If I don't hear any objections, I will commit it tomorrow morning PST.



[jira] Updated: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-271:
-------------------------------

    Attachment: PIG-271_v2.patch

New patch with clean data and modified scripts.



[jira] Commented: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Christopher Olston (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606200#action_12606200 ] 

Christopher Olston commented on PIG-271:
----------------------------------------

+1

About the content: whether we like it or not, the Excite query log contains porn-related queries. Hey, it's real data! I'd say that by filtering out porn words we are making the real data less racy.

It's important to note that the user does not need to look at the content of the porn dictionary to use/understand the tutorial.



[jira] Commented: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606434#action_12606434 ] 

Olga Natkovich commented on PIG-271:
------------------------------------

I looked at the scripts, and they still have filters and UDFs even if the porn processing is removed. The only thing that would be missing is loading data from DFS into a UDF, which might not be that important for a first encounter with Pig.

I am planning to do the following unless I hear any complaints:

- prefilter the data using the porn filter and replace the original data in the tutorial with the cleaned version
- remove pornwords and the porn filter from the tutorial.

Comments?
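
For illustration only, the prefiltering step proposed above could look roughly like the sketch below. It is not the tutorial's code: the function names are hypothetical, the tab-separated layout with the query in the third field is an assumption about the log format, and the blocklist entries are placeholders.

```python
def load_blocklist(lines):
    """Build a set of lowercase blocked words, ignoring blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def is_clean(query, blocklist):
    """Return True if no whitespace-separated token of the query is blocked."""
    return not any(token in blocklist for token in query.lower().split())


def prefilter(log_lines, blocklist, query_field=2, sep="\t"):
    """Yield only the log lines whose query field passes the blocklist check.

    Assumption of this sketch: lines with too few fields are dropped.
    """
    for line in log_lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) <= query_field:
            continue
        if is_clean(fields[query_field], blocklist):
            yield line


if __name__ == "__main__":
    # Placeholder blocklist and log records, made up for this example.
    blocklist = load_blocklist(["badword\n", "\n", "Worse\n"])
    log = [
        "u1\t970916105432\tpig latin tutorial\n",
        "u2\t970916105433\tsome BADWORD here\n",
        "u3\t970916105434\n",
    ]
    for line in prefilter(log, blocklist):
        print(line, end="")
```

Running a filter like this once, offline, would let the cleaned log replace the original data so that neither the word list nor the filter has to ship with the tutorial.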



[jira] Commented: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606405#action_12606405 ] 

Alan Gates commented on PIG-271:
--------------------------------

I'm with Ben on this one.  I think we should filter the porn words out of the data before we make it part of the patch and drop the porn dictionary.  We can make an example of filtering on something else.  Yeah, it's real data, and yeah we all get spam with worse every day.  But in a tutorial we want to put our best foot (hoof?) forward.  And including a bunch of porn words (in the data and in the filter dictionary) doesn't do that.



[jira] Updated: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-271:
-------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606146#action_12606146 ] 

Benjamin Reed commented on PIG-271:
-----------------------------------

+1 on technical merits. Somebody else needs to approve the example content. It seems too racy to be used as a tutorial example to me.



[jira] Updated: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-271:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I committed the tutorial.



[jira] Updated: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-271:
-------------------------------

    Attachment: PIG-271.patch

This patch stores all tutorial-related code and data in SVN and builds the tutorial from SVN via Ant.

Tutorial structure:

A      tutorial
A      tutorial/src
A      tutorial/src/org
A      tutorial/src/org/apache
A      tutorial/src/org/apache/pig
A      tutorial/src/org/apache/pig/tutorial
A      tutorial/src/org/apache/pig/tutorial/TutorialUtil.java
A      tutorial/src/org/apache/pig/tutorial/ScoreGenerator.java
A      tutorial/src/org/apache/pig/tutorial/NonPornDetector.java
A      tutorial/src/org/apache/pig/tutorial/TutorialTest.java
A      tutorial/src/org/apache/pig/tutorial/NonURLDetector.java
A      tutorial/src/org/apache/pig/tutorial/ExtractHour.java
A      tutorial/src/org/apache/pig/tutorial/NGramGenerator.java
A      tutorial/src/org/apache/pig/tutorial/ToLower.java
A      tutorial/scripts
A      tutorial/scripts/script1-hadoop.pig
A      tutorial/scripts/script1-local.pig
A      tutorial/scripts/script2-hadoop.pig
A      tutorial/scripts/script2-local.pig
A      tutorial/data
A      tutorial/data/excite-small.log
A      tutorial/data/pornwords
A      tutorial/data/excite.log.bz2
A      tutorial/build.xml




[jira] Assigned: (PIG-271) Add tutorial files and builds to Pig SVN

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-271:
----------------------------------

    Assignee: Olga Natkovich
