Posted to user@pig.apache.org by Shashi Vishwakarma <sh...@gmail.com> on 2015/11/21 08:11:29 UTC

Running Pig+Java on remote cluster with Eclipse

I am trying to run simple Pig code in Java using Eclipse on my Windows
machine. I am launching this code against a Cloudera VM.

Here is the code that I am trying to execute:

import java.io.IOException;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class PigConnect {

    public static void main(String[] args) {
        try {
            PigServer pigServer = new PigServer("mapreduce");
            runIdQuery(pigServer,
                    "hdfs://quickstart.cloudera:8020/user/cloudera/myFile.txt");
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }

    public static void runIdQuery(PigServer pigServer, String inputFile)
            throws IOException {
        pigServer.registerQuery(
                "A = load 'hdfs://quickstart.cloudera:8020/user/cloudera/myFile.txt' "
                + "using PigStorage(':');");
        pigServer.registerQuery("B = foreach A generate $0 as id;");
        pigServer.store("B", "idout");
        System.out.println("Success");
    }
}

I added hdfs-site.xml, yarn-site.xml, core-site.xml and mapred-site.xml to the
resource folder in Eclipse, and added all the required jars to the project.

While running the code I am getting the error below.

Error during parsing. Unable to check name
hdfs://quickstart.cloudera:8020/user/shashi


shashi is my Windows user name, which it is picking up by default. How should
I change it to use the hdfs user?
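
For reference, one thing that might be worth trying is sketched below. It assumes the cluster uses simple (non-Kerberos) authentication, in which case the Hadoop client reads the user name from the HADOOP_USER_NAME environment variable or Java system property instead of the local OS login. The class name HadoopUserOverride, the helper useHadoopUser, and the choice of "cloudera" as the user (taken from the /user/cloudera path above) are illustrative assumptions, not something confirmed for this setup.

```java
public class HadoopUserOverride {

    // Hypothetical helper: sets the system property that Hadoop's simple
    // authentication consults for the user name. It has to run before the
    // first Hadoop/Pig call, because the login user is resolved only once.
    static String useHadoopUser(String user) {
        System.setProperty("HADOOP_USER_NAME", user);
        return System.getProperty("HADOOP_USER_NAME");
    }

    public static void main(String[] args) {
        // "cloudera" is an assumption based on the HDFS path in this post.
        String effective = useHadoopUser("cloudera");
        System.out.println("HADOOP_USER_NAME=" + effective);

        // The existing code would follow from here, for example:
        // PigServer pigServer = new PigServer("mapreduce");
    }
}
```

If this applies, the call would go at the very top of main in PigConnect, ahead of the PigServer constructor. Setting HADOOP_USER_NAME as an environment variable in the Eclipse run configuration should have the same effect under the same assumption.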

Thanks
Shashi