Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2009/03/04 00:46:52 UTC

[Pig Wiki] Trivial Update of "RunPig" by CorinneC

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

------------------------------------------------------------------------------
    * Script file: attachment:id.pig 
    * Embedded program: attachment:idlocal.java and attachment:idhadoop.java
  
- To start, we're going to parse a small text file, namely the /etc/passwd file.  (Don't worry -- for arcane reasons there are no passwords in the /etc/passwd file, only user names and public info.) Copy the passwd file into your local directory: `cp /etc/passwd .`
+ To start, we're going to parse a small text file, namely the /etc/passwd file.  (Don't worry -- for arcane reasons there are no passwords in the /etc/passwd file, only user names and public info.) Copy the passwd file into your local directory: 
+ {{{ 
+ cp /etc/passwd .
+ }}}
  
  Your file may look something like this. Fields are separated by colons (:).
  
@@ -179, +182 @@

  {{{
  $ pig -x mapreduce -verbose
  }}}
- (in newer versions run `pig -x hadoop`)
  
  You should see it first connect to the namenode:
  {{{
  1    [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine  - Connecting to hadoop file system at: hdfs://namenode.your.domain.org:9000
  }}}
  
- If you see a line like
+ If you see a line like this, Pig is not correctly finding your cluster:
  {{{
  2008-12-02 20:53:02,983 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
  }}}
- pig is not correctly finding your cluster.
+ 
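As a rough way to tell the two cases apart, the startup log line can be checked for the file system URI it reports. The sketch below is only an illustration: `check_pig_fs` is a hypothetical helper name, not part of Pig, and it assumes the log wording shown in the two examples above.

```shell
# Hypothetical helper (not part of Pig): classify a Pig startup log line
# by the file system URI it reports.
check_pig_fs() {
  case "$1" in
    *"Connecting to hadoop file system at: hdfs://"*) echo "cluster" ;;  # found the namenode
    *"Connecting to hadoop file system at: file:///"*) echo "local" ;;   # fell back to the local file system
    *) echo "unknown" ;;                                                  # not a connection line
  esac
}

# Example with the second log line quoted above.
check_pig_fs "2008-12-02 20:53:02,983 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///"
# prints: local
```

If the helper prints `local` for the line you see after `pig -x mapreduce`, Pig has not picked up your cluster configuration.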
  
  The Grunt shell is invoked and you can enter commands at the prompt. Let's, you guessed it, extract the first column from the text file.  Compared with local mode, it will be much slower (due to the overhead) but way awesomer.
  {{{