Posted to user@pig.apache.org by lulynn_2008 <lu...@163.com> on 2012/02/14 15:17:18 UTC

Questions about pig-0.9.1 e2e tests

Hi,
I am running the e2e tests against pig-0.9.1. Here is my reference: https://cwiki.apache.org/confluence/display/PIG/HowToTest
Please give your suggestions on the following questions:
1. The tests need a cluster with one Name Node/Job Tracker and three Data Nodes/Task Trackers. Where is the cluster information saved? Is it in the Hadoop conf files?
2. I assume the tests need a Hadoop cluster environment, and that I just install Pig (an older version plus pig-0.9.1) and run the test commands on the Name Node/Job Tracker to generate the test data. Please correct me if I am wrong.
3. Is there any data transfer between cluster nodes while generating the test data and running the tests? If so, when does it happen?

Thank you.

Re: Questions about pig-0.9.1 e2e tests

Posted by Alan Gates <ga...@hortonworks.com>.
Where are they hung?  Can you run other Hadoop jobs?  Other Pig jobs?  There shouldn't be a need to regenerate data or run ant clean.

Sending the last few K of the log file the test harness creates would be helpful.
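
If it helps, here is roughly how I'd check things and grab that log tail (the log path below is only a guess -- substitute wherever your harness actually writes its log):

    # On each cluster machine, confirm the Hadoop daemons are actually up (Hadoop 1.x):
    jps    # should list NameNode/JobTracker on the master, DataNode/TaskTracker on the slaves

    # Then send the tail of the harness log (path is hypothetical):
    tail -c 4096 test/e2e/pig/testdist/log/test_harness.log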

Alan.

On Feb 15, 2012, at 3:17 AM, lulynn_2008 wrote:

> Hi,
> The error is resolved now. The cause was that the Hadoop datanode had not started successfully.
> Now I have another question; please give your suggestions:
> The tests seem to be hung now. What should I do to restart the e2e tests?
> --Should I regenerate the test data?
> --Should I do something to the Hadoop fs, e.g., remove some data?
> --Should I restart Hadoop?
> --Should I run "ant clean"?
> --Are there any other items I missed?
> 
> Thank you.


Re:Re: Questions about pig-0.9.1 e2e tests

Posted by lulynn_2008 <lu...@163.com>.
Hi,
The error is resolved now. The cause was that the Hadoop datanode had not started successfully.
Now I have another question; please give your suggestions:
The tests seem to be hung now. What should I do to restart the e2e tests?
--Should I regenerate the test data?
--Should I do something to the Hadoop fs, e.g., remove some data?
--Should I restart Hadoop?
--Should I run "ant clean"?
--Are there any other items I missed?

Thank you.

At 2012-02-15 12:53:29,"Alan Gates" <ga...@hortonworks.com> wrote:
>That means it isn't connecting correctly to your Hadoop cluster.  What version of Hadoop are you using?
>
>Alan.

Re:Re: Questions about pig-0.9.1 e2e tests

Posted by lulynn_2008 <lu...@163.com>.
hadoop-1.0.0, the latest Hadoop version.

At 2012-02-15 12:53:29,"Alan Gates" <ga...@hortonworks.com> wrote:
>That means it isn't connecting correctly to your Hadoop cluster.  What version of Hadoop are you using?
>
>Alan.

Re: Questions about pig-0.9.1 e2e tests

Posted by Alan Gates <ga...@hortonworks.com>.
That means it isn't connecting correctly to your Hadoop cluster.  What version of Hadoop are you using?
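
A quick sanity check from the machine running the tests (the hostname and port below come straight from your stack trace):

    # Can the client reach the NameNode RPC port at all?
    $HADOOP_HOME/bin/hadoop fs -ls hdfs://svltest150.svl.ibm.com:9000/

    # Compare the Hadoop build on this machine with the one on the cluster;
    # an EOFException during getProtocolVersion is commonly a client/server
    # Hadoop version mismatch:
    $HADOOP_HOME/bin/hadoop version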

Alan.

On Feb 14, 2012, at 5:38 PM, lulynn_2008 wrote:

> Thank you for the details.
> Most tests failed with the following error; please give your suggestions:
> 
> ERROR 2999: Unexpected internal error. Failed to create DataStorage
> java.lang.RuntimeException: Failed to create DataStorage
> Caused by: java.io.IOException: Call to svltest150.svl.ibm.com/9.30.225.100:9000 failed on local exception: java.io.EOFException


Re:Re: Questions about pig-0.9.1 e2e tests

Posted by lulynn_2008 <lu...@163.com>.
Thank you for the details.
Most tests failed with the following error; please give your suggestions:

ERROR 2999: Unexpected internal error. Failed to create DataStorage
java.lang.RuntimeException: Failed to create DataStorage
        at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
        at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:214)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:134)
        at org.apache.pig.impl.PigContext.connect(PigContext.java:183)
        at org.apache.pig.PigServer.<init>(PigServer.java:226)
        at org.apache.pig.PigServer.<init>(PigServer.java:215)
        at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:55)
        at org.apache.pig.Main.run(Main.java:492)
        at org.apache.pig.Main.main(Main.java:107)
Caused by: java.io.IOException: Call to svltest150.svl.ibm.com/9.30.225.100:9000 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
        at org.apache.hadoop.ipc.Client.call(Client.java:743)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
        at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
        ... 9 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:386)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)


Re: Questions about pig-0.9.1 e2e tests

Posted by Alan Gates <ga...@hortonworks.com>.
On Feb 14, 2012, at 6:17 AM, lulynn_2008 wrote:

> Hi,
> I am running the e2e tests against pig-0.9.1. Here is my reference: https://cwiki.apache.org/confluence/display/PIG/HowToTest
> Please give your suggestions on the following questions:
> 1. The tests need a cluster with one Name Node/Job Tracker and three Data Nodes/Task Trackers. Where is the cluster information saved? Is it in the Hadoop conf files?
Pig uses the standard Hadoop configuration files (mapred-site.xml, core-site.xml, hdfs-site.xml) to find cluster information.  If you set harness.hadoop.home to the directory where you installed Hadoop when you call ant test-e2e (it will fail if you don't), these are picked up automatically.
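
So, assuming Hadoop is installed under /opt/hadoop (the path here is just an example), the invocation would look something like:

    cd pig-0.9.1
    ant -Dharness.hadoop.home=/opt/hadoop test-e2e

The harness takes a few other -Dharness.* properties as well (e.g. where the old Pig to compare against lives); the HowToTest page lists the full set.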

> 2. I assume the tests need a Hadoop cluster environment, and that I just install Pig (an older version plus pig-0.9.1) and run the test commands on the Name Node/Job Tracker to generate the test data. Please correct me if I am wrong.

Any machine that has access to the cluster and has Hadoop installed on it with the same configuration files as your cluster will work.  It need not be the NN/JT specifically.  But that machine will work fine.
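
A quick way to confirm a candidate client machine is wired up correctly (just a sketch):

    # If this lists the cluster's HDFS root, the conf files that
    # $HADOOP_HOME/bin/hadoop picks up point at the right cluster:
    $HADOOP_HOME/bin/hadoop fs -ls /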

> 3. Is there any data transfer between cluster nodes while generating the test data and running the tests? If so, when does it happen?

Yes, the harness generates data on the machine it's run on and then does a copyFromLocal to load it into HDFS.
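
By hand, the equivalent would be something like this (the local and HDFS paths are hypothetical; the harness picks its own):

    # Generate data locally, then push it into HDFS the way the harness does:
    $HADOOP_HOME/bin/hadoop fs -mkdir /user/pig/tests/data
    $HADOOP_HOME/bin/hadoop fs -copyFromLocal /tmp/studenttab10k /user/pig/tests/data/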
> 
> Thank you.

Alan.