Posted to user@pig.apache.org by "Khanolkar,Anagha" <AK...@travelers.com> on 2014/05/14 21:35:20 UTC

Issues with embedded Pig (Java) and passing parameters to a Pig script

Folks-
I am running into issues passing parameters to a Pig script from within a Java program.
Details are below.
Any pointers are greatly appreciated.


Version:
Pig 0.11.0

Attempted:
Run a Pig script from Java, in mapreduce mode, passing a map of script parameters as the second argument to the PigServer registerScript method. One of the parameters is the HDFS source data path.

Issue:
The parameters passed are not getting resolved: the error pasted below shows the literal $INPUT_FILE in the input path, so the substitution never happened. I am trying to understand whether there is an issue with my code or a bug in PigServer. Any help is greatly appreciated.
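
For context, here is roughly what the relevant part of reconUtil.pig looks like (recordCountDS is the actual alias queried in the Java program below; the other aliases and the exact load/filter details are simplified):

    raw = LOAD '$INPUT_FILE' USING PigStorage() AS (line:chararray);
    matched = FILTER raw BY (line MATCHES '$SEARCH_STRING');
    grouped = GROUP matched ALL;
    recordCountDS = FOREACH grouped GENERATE group, COUNT(matched);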

Tested:
- Tested the pig script on the CLI successfully, passing the same parameters with -param (invocation shown below)
- Tested the embedded pig program successfully without parameters (with the parameter values hard-coded directly in the pig script)
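
For reference, the CLI invocation that works is roughly this (same placeholder host/paths as in the error below):

    pig -param INPUT_FILE='hdfs://xxxxx.xxxxx.net:8020/data/pi/opsbia/callanalytics/rawlogs/nuance/processed/*' \
        -param SEARCH_STRING='.*SWIclnd.*' \
        nuanceivranalytics/scripts/pig/reconUtil.pig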

Alternatives explored:
Tried passing the pig script parameters as part of the first argument to registerScript. Received the same error.
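
If the parameter map simply cannot be made to work, the fallback I am considering is doing the substitution myself and registering the pre-processed script text. Untested sketch (assumes the parameters appear in the script only as $INPUT_FILE and $SEARCH_STRING; needs java.io.ByteArrayInputStream, java.nio.file.Files, and java.nio.file.Paths):

    // Read the script source, splice in the parameter values, and register
    // the result via the InputStream overload of registerScript.
    String script = new String(Files.readAllBytes(
            Paths.get("nuanceivranalytics/scripts/pig/reconUtil.pig")));
    for (Map.Entry<String, String> e : params.entrySet()) {
        script = script.replace("$" + e.getKey(), e.getValue());
    }
    pigServer.registerScript(new ByteArrayInputStream(script.getBytes()));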

Error:
ERROR mapReduceLayer.Launcher: Backend error message during job submission
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://xxxxx.xxxxx.net:8020/user/xxxxx/$INPUT_FILE
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1105)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1122)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:177)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:1021)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:974)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:974)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:582)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:319)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.startReadyJobs(JobControl.java:239)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:270)
    at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:160)
    at java.lang.Thread.run(Thread.java:662)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://xxxxx.xxxxx.net:8020/user/xxxxx/$INPUT_FILE

Java program:
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.Tuple;

public class RunRecon {

    public static void main(String[] args) {

        // Script parameters; $INPUT_FILE and $SEARCH_STRING in reconUtil.pig
        // should be substituted with these values.
        Map<String, String> params = new HashMap<String, String>();
        params.put("INPUT_FILE", "hdfs://xxxxx.xxxxx.net:8020/data/pi/opsbia/callanalytics/rawlogs/nuance/processed/*");
        params.put("SEARCH_STRING", ".*SWIclnd.*");

        try {
            // Connect to the cluster and register the script with the parameter map.
            PigServer pigServer = new PigServer(ExecType.MAPREDUCE);
            pigServer.registerScript("nuanceivranalytics/scripts/pig/reconUtil.pig", params);

            // Pull back the (group, count) tuples produced by the script.
            Iterator<Tuple> it = pigServer.openIterator("recordCountDS");
            while (it.hasNext()) {
                System.out.println("Record count=" + it.next().get(1));
            }
        } catch (ExecException ex) {
            Logger.getLogger(RunRecon.class.getName()).log(Level.SEVERE, null, ex);
        } catch (IOException ex) {
            Logger.getLogger(RunRecon.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
