Posted to user@pig.apache.org by yo...@wipro.com on 2012/07/25 12:48:48 UTC
DATA not storing as comma-separated
Hi All,
I am new to Pig and am trying to store data in HDFS as comma-separated values using the command
store RECORDS into 'hadoop/pig/records' using PigStorage(',');
If I do
dump RECORDS ;
it shows
(YogeshKumar 210 hello)
(Mohitkumar 211 hi)
(AAshichoudhary 212 hii)
(renuchoudhary 213 namestey)
I want it to store as
(YogeshKumar, 210, hello)
(Mohitkumar, 211,hi)
(AAshichoudhary, 212, hii)
(renuchoudhary, 213, namestey)
Please suggest and Help
Thanks & Regards
Yogesh Kumar
Please do not print this email unless it is absolutely necessary.
The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments.
WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email.
www.wipro.com
Re: DATA not storing as comma-separated
Posted by Norbert Burger <no...@gmail.com>.
Yogesh -- based on the log info you provided, it seems like your input
data is not tab-delimited, and tab is the default delimiter when using
PigStorage. As a result, your 3 space-separated fields are being
pulled as one into name:chararray, and then can't be split out
again when you try to store the results into HDFS.
Either override the default delimiter (by explicitly specifying
PigStorage(' ') in your call to LOAD) or change your input data to be
tab-delimited.
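To illustrate the point outside of Pig, here is a rough shell sketch that uses awk as a stand-in for a delimiter-based loader. The sample line comes from the original post; the helper name count_fields is made up for this example, and awk's splitting is only an approximation of what PigStorage does.

```shell
# One record from the original post, with fields separated by single spaces.
line='YogeshKumar 210 hello'

# count_fields DELIM: hypothetical helper that reports how many fields a
# loader would see if it split the sample line on DELIM.
count_fields() {
    printf '%s\n' "$line" | awk -F "$1" '{ print NF }'
}

# Tab (the PigStorage default): the whole line stays in one field.
count_fields "$(printf '\t')"
# Space: three fields. Note that awk treats a single-space separator as
# "any run of blanks", which is looser than splitting on each space.
count_fields ' '

# Splitting on spaces and re-joining with commas gives the desired layout.
printf '%s\n' "$line" | awk -F ' ' -v OFS=',' '{ $1 = $1; print }'
```

In Pig itself the equivalent change would presumably be along the lines of
A = LOAD '/hello/demotry.txt' USING PigStorage(' ') AS (name:chararray, roll:int, mssg:chararray);
assuming the input really is separated by single spaces.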
Norbert
On Wed, Jul 25, 2012 at 9:07 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:
> Why are you trying 0.7, Yogesh? It's ancient at this point.
>
> "Unable to create input splits for: file:///hello/demotry.txt"
> implies the file does not exist.
>
> Can you show a whole session in which you load data, store it using
> PigStorage(','), cat it, and it comes out wrong?
> So far I've been unable to reproduce your results.
>
> D
>
> On Wed, Jul 25, 2012 at 7:09 AM, Mohammad Tariq <do...@gmail.com> wrote:
>> Hello Yogesh,
>>
>> Also add these lines: export PIG_CLASSPATH=/HADOOP_HOME/conf and
>> export HADOOP_CONF_DIR=/HADOOP_HOME/conf, and see if it works for you.
>>
>> Regards,
>> Mohammad Tariq
>>
>>
>> On Wed, Jul 25, 2012 at 6:01 PM, <yo...@wipro.com> wrote:
>>> Hi mohammad,
>>>
>>> when I try the command
>>>
>>> pig
>>>
>>> it shows an error for version 0.7.0
>>>
>>> mediaadmin$ pig
>>> 12/07/25 17:54:15 INFO pig.Main: Logging error messages to: /users/mediaadmin/pig_1343219055229.log
>>> 2012-07-25 17:54:15,451 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
>>>
>>> and this .log file doesn't exist in /users/mediaadmin/
>>>
>>> Why is that? I have set these properties in the pig-0.7.0/bin/pig file.
>>>
>>> ---------------------------------------------------------------------
>>> The Pig command script
>>> #
>>> # Environment Variables
>>> #
>>> export JAVA_HOME=/Library/Java/Home
>>> #
>>> # PIG_CLASSPATH Extra Java CLASSPATH entries.
>>> #
>>> export HADOOP_HOME=/HADOOP/hadoop-0.20.2
>>>
>>> export HADOOP_CONF_DIR=/HADOOP/hadoop-0.20.2/conf
>>>
>>> # PIG_HEAPSIZE The maximum amount of heap to use, in MB.
>>> # Default is 1000.
>>> #
>>> # PIG_OPTS Extra Java runtime options.
>>> #
>>> export PIG_CONF_DIR=/HADOOP/pig-0.7.0/conf
>>> #
>>> # PIG_ROOT_LOGGER The root appender. Default is INFO,console
>>> #
>>> # PIG_HADOOP_VERSION Version of hadoop to run with. Default is 20 (0.20).
>>>
>>> ----------------------------------------------------------------
>>>
>>>
>>>
>>>
>>> ________________________________________
>>> From: Mohammad Tariq [dontariq@gmail.com]
>>> Sent: Wednesday, July 25, 2012 5:34 PM
>>> To: user@pig.apache.org
>>> Subject: Re: DATA not storing as comma-separated
>>>
>>> Also, it would help to go to the MapReduce web UI and have a look
>>> at the details of the job corresponding to this query.
>>>
>>> Regards,
>>> Mohammad Tariq
>>>
>>>
>>> On Wed, Jul 25, 2012 at 5:31 PM, Mohammad Tariq <do...@gmail.com> wrote:
>>>> I have worked with pig-0.7.0 once and it was working fine. Try to see
>>>> if there is anything interesting in the log files. Also, if possible,
>>>> share 2-3 lines of your file. I'll give it a try on my machine.
>>>>
>>>> Regards,
>>>> Mohammad Tariq
>>>>
>>>>
>>>> On Wed, Jul 25, 2012 at 5:20 PM, <yo...@wipro.com> wrote:
>>>>> Hi Mohammad,
>>>>>
>>>>> I have switched from Pig 0.10.0 to 0.7.0 and it's a horrible experience.
>>>>> I run
>>>>>
>>>>> grunt> A = load '/hello/demotry.txt'
>>>>> >> as (name:chararray, roll:int, mssg:chararray);
>>>>>
>>>>> grunt> dump A;
>>>>>
>>>>> it shows this error:
>>>>>
>>>>> grunt> dump A;
>>>>> 2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for A
>>>>> 2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for A
>>>>> 2012-07-25 17:20:34,102 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
>>>>> 2012-07-25 17:20:34,169 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(file:/tmp/temp61624047/tmp1087576502:org.apache.pig.builtin.BinStorage) - 1-18 Operator Key: 1-18)
>>>>> 2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
>>>>> 2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
>>>>> 2012-07-25 17:20:34,211 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>> 2012-07-25 17:20:34,217 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>> 2012-07-25 17:20:34,217 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
>>>>> 2012-07-25 17:20:35,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
>>>>> 2012-07-25 17:20:35,599 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>> 2012-07-25 17:20:35,600 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
>>>>> 2012-07-25 17:20:35,606 [Thread-7] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>> 2012-07-25 17:20:35,750 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>> 2012-07-25 17:20:35,763 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>> 2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
>>>>> 2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
>>>>> 2012-07-25 17:20:36,101 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
>>>>> 2012-07-25 17:20:36,107 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:/tmp/temp61624047/tmp1087576502"
>>>>> 2012-07-25 17:20:36,107 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
>>>>> 2012-07-25 17:20:36,120 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>> 2012-07-25 17:20:36,121 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///hello/demotry.txt
>>>>> Details at logfile: /users/mediaadmin/pig_1343217013235.log
>>>>>
>>>>>
>>>>> Why is this happening? :-(
>>>>>
>>>>> Please help and suggest
>>>>>
>>>>> Thanks & Regards
>>>>> yogesh Kumar
>>>>>
>>>>>
>>>>>
>>>>> ________________________________________
>>>>> From: Mohammad Tariq [dontariq@gmail.com]
>>>>> Sent: Wednesday, July 25, 2012 5:00 PM
>>>>> To: user@pig.apache.org
>>>>> Subject: Re: DATA not storing as comma-separated
>>>>>
>>>>> Hi Yogesh,
>>>>>
>>>>> Is 'load' working fine with PigStorage()? Try to load
>>>>> something using PigStorage(',') and dump it to see if that is working.
>>>>>
>>>>> Regards,
>>>>> Mohammad Tariq
>>>>>
>>>>>
>>>>> On Wed, Jul 25, 2012 at 4:41 PM, <yo...@wipro.com> wrote:
>>>>>> Hello Dmitriy,
>>>>>>
>>>>>> I have also run the cat command in Hadoop:
>>>>>>
>>>>>> hadoop dfs -cat /hadoop/pig/records/part-m-00000
>>>>>>
>>>>>> but it still shows the same output, without commas.
>>>>>> Please suggest
>>>>>>
>>>>>> Thanks & regards
>>>>>> Yogesh Kumar
>>>>>> ________________________________________
>>>>>> From: Dmitriy Ryaboy [dvryaboy@gmail.com]
>>>>>> Sent: Wednesday, July 25, 2012 4:33 PM
>>>>>> To: user@pig.apache.org
>>>>>> Subject: Re: DATA not storing as comma-separated
>>>>>>
>>>>>> Using the store expression you wrote should work. Dump is its own thing and doesn't know anything about the format you store things in. To see the files created on HDFS, you can use cat.
Re: DATA not storing as comma-separated
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Why are you trying 0.7, Yogesh? It's ancient at this point.
"Unable to create input splits for: file:///hello/demotry.txt"
implies the file does not exist.
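The file:/// prefix in that message suggests Pig resolved the path against the local filesystem rather than HDFS. A quick way to check both places is sketched below in shell. The function name path_scheme is invented for this example, and the hadoop command is only shown as a comment because it needs a working cluster.

```shell
# path_scheme URI: print the scheme part of a URI ("file", "hdfs", ...),
# or "none" when the argument is a bare path.
path_scheme() {
    case "$1" in
        *://*) printf '%s\n' "${1%%://*}" ;;
        *)     printf 'none\n' ;;
    esac
}

# The URI reported in the error message.
uri='file:///hello/demotry.txt'
path_scheme "$uri"

# Strip the scheme and test the local path that Pig was looking for.
local_path="${uri#file://}"
if [ -e "$local_path" ]; then
    echo "found locally: $local_path"
else
    echo "not on the local filesystem: $local_path"
fi

# If the file was meant to live in HDFS, it can be checked with:
#   hadoop fs -ls /hello/demotry.txt
```

If the file only exists in HDFS, the more likely problem is that Pig is not picking up the cluster configuration at all.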
Can you show a whole session in which you load data, store it using
PigStorage(','), cat it, and it comes out wrong?
So far I've been unable to reproduce your results.
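A session of the kind being asked for might look roughly like the following. This is a sketch only: the paths are the ones mentioned in this thread, the script file is an arbitrary temporary file, and the space delimiter on LOAD is an assumption about the input. The shell snippet just writes the Pig statements to a file and prints the commands to run by hand, since actually running them needs a working Pig and Hadoop install.

```shell
# Write the Pig Latin statements to a scratch file (name is arbitrary).
script="$(mktemp)"
cat > "$script" <<'EOF'
-- Load the input, splitting each line on single spaces (assumed delimiter).
A = LOAD '/hello/demotry.txt' USING PigStorage(' ')
    AS (name:chararray, roll:int, mssg:chararray);
-- Write the relation back out with commas between fields.
STORE A INTO 'hadoop/pig/records' USING PigStorage(',');
EOF

# Commands to run by hand once Pig and Hadoop are set up.
echo "pig $script"
echo "hadoop dfs -cat hadoop/pig/records/part-m-00000"
```

Note that the store path in the original post is relative, so the cat command here uses the same relative path. If an absolute path such as /hadoop/pig/records is passed to cat instead, it may be showing a different, older file.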
D
Re: DATA not storing as comma-separated
Posted by Mohammad Tariq <do...@gmail.com>.
Hello Yogesh,
Also add these lines: export PIG_CLASSPATH=/HADOOP_HOME/conf and
export HADOOP_CONF_DIR=/HADOOP_HOME/conf, and see if it works for you.
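Spelled out as shell, that could look something like the sketch below. It reads /HADOOP_HOME in the lines above as a placeholder for the Hadoop install directory, which is an assumption, and it takes the default location from the bin/pig excerpt quoted in this thread.

```shell
# Assumed Hadoop location, copied from the bin/pig excerpt quoted in this
# thread; adjust to the real install path.
HADOOP_HOME="${HADOOP_HOME:-/HADOOP/hadoop-0.20.2}"
export HADOOP_HOME
export HADOOP_CONF_DIR="$HADOOP_HOME/conf"
# Put the Hadoop config directory on Pig's classpath so it can find the
# cluster settings instead of falling back to file:///.
export PIG_CLASSPATH="$HADOOP_CONF_DIR"

# Sanity check: warn if the directory is not there.
if [ -d "$HADOOP_CONF_DIR" ]; then
    echo "using Hadoop config in $HADOOP_CONF_DIR"
else
    echo "warning: $HADOOP_CONF_DIR does not exist" >&2
fi
```

After this, starting pig again should hopefully show it connecting to the HDFS address from the Hadoop config rather than to file:///.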
Regards,
Mohammad Tariq
On Wed, Jul 25, 2012 at 6:01 PM, <yo...@wipro.com> wrote:
> Hi mohammad,
>
> when I try the command
>
> Pig
>
> its shows error for 0.7.0 version
>
> mediaadmin$ pig
> 12/07/25 17:54:15 INFO pig.Main: Logging error messages to: /users/mediaadmin/pig_1343219055229.log
> 2012-07-25 17:54:15,451 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
>
> and this .log file doesn't exist /users/mediaadmin/
>
> Wht is it so, I have set the thses properties in pig-0.70.0/bin/pig file.
>
> ---------------------------------------------------------------------
> The Pig command script
> #
> # Environment Variables
> #
> export JAVA_HOME=/Library/Java/Home
> #
> # PIG_CLASSPATH Extra Java CLASSPATH entries.
> #
> export HADOOP_HOME=/HADOOP/hadoop-0.20.2
>
> export HADOOP_CONF_DIR=/HADOOP/hadoop-0.20.2/conf
>
> # PIG_HEAPSIZE The maximum amount of heap to use, in MB.
> # Default is 1000.
> #
> # PIG_OPTS Extra Java runtime options.
> #
> export PIG_CONF_DIR=/HADOOP/pig-0.7.0/conf
> #
> # PIG_ROOT_LOGGER The root appender. Default is INFO,console
> #
> # PIG_HADOOP_VERSION Version of hadoop to run with. Default is 20 (0.20).
>
> ----------------------------------------------------------------
>
>
>
>
> ________________________________________
> From: Mohammad Tariq [dontariq@gmail.com]
> Sent: Wednesday, July 25, 2012 5:34 PM
> To: user@pig.apache.org
> Subject: Re: DATA not storing as comma-separted
>
> Also, it would be help to go to the MapReduce web UI and having a look
> at the details of the job corresponding to this query.
>
> Regards,
> Mohammad Tariq
>
>
> On Wed, Jul 25, 2012 at 5:31 PM, Mohammad Tariq <do...@gmail.com> wrote:
>> I have worked with pig-0.7.0 once and it was working fine. Try to see
>> if there is anything interesting in the log files. Also, if possible,
>> share 2-3 lines of your file..I'll give it a try on my machine.
>>
>> Regards,
>> Mohammad Tariq
>>
>>
>> On Wed, Jul 25, 2012 at 5:20 PM, <yo...@wipro.com> wrote:
>>> Hi Mohammad,
>>>
>>> I have switched from pig 0.10.0 to 0.7.0 and its horrible experience.
>>> I do perform
>>>
>>> grunt> A = load '/hello/demotry.txt'
>>>>> as (name:chararray, roll:int, mssg:chararray);
>>>
>>> grunt> dump A;
>>>
>>> it shows this error:
>>>
>>> grunt> dump A;
>>> 2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for A
>>> 2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for A
>>> 2012-07-25 17:20:34,102 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
>>> 2012-07-25 17:20:34,169 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(file:/tmp/temp61624047/tmp1087576502:org.apache.pig.builtin.BinStorage) - 1-18 Operator Key: 1-18)
>>> 2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
>>> 2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
>>> 2012-07-25 17:20:34,211 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2012-07-25 17:20:34,217 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2012-07-25 17:20:34,217 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
>>> 2012-07-25 17:20:35,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
>>> 2012-07-25 17:20:35,599 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2012-07-25 17:20:35,600 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
>>> 2012-07-25 17:20:35,606 [Thread-7] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>> 2012-07-25 17:20:35,750 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2012-07-25 17:20:35,763 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
>>> 2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
>>> 2012-07-25 17:20:36,101 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
>>> 2012-07-25 17:20:36,107 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:/tmp/temp61624047/tmp1087576502"
>>> 2012-07-25 17:20:36,107 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
>>> 2012-07-25 17:20:36,120 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>> 2012-07-25 17:20:36,121 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///hello/demotry.txt
>>> Details at logfile: /users/mediaadmin/pig_1343217013235.log
>>>
>>>
>>> Why is this happening? :-(
>>>
>>> Please help and suggest.
>>>
>>> Thanks & Regards
>>> yogesh Kumar
>>>
>>>
>>>
>>> ________________________________________
>>> From: Mohammad Tariq [dontariq@gmail.com]
>>> Sent: Wednesday, July 25, 2012 5:00 PM
>>> To: user@pig.apache.org
>>> Subject: Re: DATA not storing as comma-separted
>>>
>>> Hi Yogesh,
>>>
>>> Is 'load' working fine with PigStorage()?? Try to load
>>> something using PigStorage(',') and dump it to see if that is working.
>>>
>>> Regards,
>>> Mohammad Tariq
>>>
>>>
>>> On Wed, Jul 25, 2012 at 4:41 PM, <yo...@wipro.com> wrote:
>>>> Hello Dmitriy,
>>>>
>>>> I have also performed the cat command in hadoop.
>>>>
>>>> hadoop dfs -cat /hadoop/pig/records/part-m-00000
>>>>
>>>> but still it shows same output without commas.
>>>> Please suggest
>>>>
>>>> Thanks & regards
>>>> Yogesh Kumar
>>>> ________________________________________
>>>> From: Dmitriy Ryaboy [dvryaboy@gmail.com]
>>>> Sent: Wednesday, July 25, 2012 4:33 PM
>>>> To: user@pig.apache.org
>>>> Subject: Re: DATA not storing as comma-separted
>>>>
>>>> Using the store expression you wrote should work. Dump is its own thing and doesn't know anything about the format you store things in. To see files created on hdfs, you can use cat.
>>>>
>>>> On Jul 25, 2012, at 3:48 AM, <yo...@wipro.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I am new to PIG, trying to stroe data in HDFS as comma separated by using command
>>>>>
>>>>> store RECORDS into 'hadoop/pig/records' using PigStorage(',');
>>>>>
>>>>> If I do
>>>>>
>>>>> dump RECORDS ;
>>>>>
>>>>> it shows
>>>>>
>>>>> (YogeshKumar 210 hello)
>>>>> (Mohitkumar 211 hi)
>>>>> (AAshichoudhary 212 hii)
>>>>> (renuchoudhary 213 namestey)
>>>>>
>>>>> I want it to store as
>>>>>
>>>>> (YogeshKumar, 210, hello)
>>>>> (Mohitkumar, 211,hi)
>>>>> (AAshichoudhary, 212, hii)
>>>>> (renuchoudhary, 213, namestey)
>>>>>
>>>>>
>>>>> Please suggest and Help
>>>>>
>>>>> Thanks & Regards
>>>>> Yogesh Kumar
>>>>>
>>>>>
>>>>>
>>>>>
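The delimiter question discussed in the quoted messages can be reproduced outside Pig. The sketch below is a simplified model in plain shell and awk, not Pig itself: it splits two of the sample lines from this thread on a chosen input delimiter and re-joins the fields with commas, which is roughly what a load followed by a store using PigStorage(',') does. The function name split_and_join is hypothetical, and the model ignores schemas and null padding.

```shell
# Sample lines copied from the thread; fields are separated by single spaces.
sample='YogeshKumar 210 hello
Mohitkumar 211 hi'

# split_and_join IN_DELIM: split each sample line on IN_DELIM, then re-join
# the resulting fields with commas. Hypothetical helper for illustration.
split_and_join() {
  printf '%s\n' "$sample" | awk -F"$1" 'BEGIN { OFS = "," } { $1 = $1; print }'
}

# Tab as input delimiter (the PigStorage default): each line is a single
# field, so there is nothing to put a comma between.
split_and_join "$(printf '\t')"

# Space as input delimiter: three fields per line, joined with commas.
split_and_join ' '
```

With a tab as the input delimiter the lines come back unchanged, which matches the symptom reported in the thread; with a space they come back comma-separated.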
RE: DATA not storing as comma-separted
Posted by yo...@wipro.com.
Hi Mohammad,
when I try the command
pig
it shows an error with version 0.7.0:
mediaadmin$ pig
12/07/25 17:54:15 INFO pig.Main: Logging error messages to: /users/mediaadmin/pig_1343219055229.log
2012-07-25 17:54:15,451 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
and this .log file doesn't exist in /users/mediaadmin/.
Why is that? I have set these properties in the pig-0.7.0/bin/pig file.
---------------------------------------------------------------------
The Pig command script
#
# Environment Variables
#
export JAVA_HOME=/Library/Java/Home
#
# PIG_CLASSPATH Extra Java CLASSPATH entries.
#
export HADOOP_HOME=/HADOOP/hadoop-0.20.2
export HADOOP_CONF_DIR=/HADOOP/hadoop-0.20.2/conf
# PIG_HEAPSIZE The maximum amount of heap to use, in MB.
# Default is 1000.
#
# PIG_OPTS Extra Java runtime options.
#
export PIG_CONF_DIR=/HADOOP/pig-0.7.0/conf
#
# PIG_ROOT_LOGGER The root appender. Default is INFO,console
#
# PIG_HADOOP_VERSION Version of hadoop to run with. Default is 20 (0.20).
----------------------------------------------------------------
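A note on the log line "Connecting to hadoop file system at: file:///": file:/// is the local file system, so one thing worth checking is whether the Hadoop conf directory named in the script actually defines a default file system. The sketch below is a small shell check under that assumption; check_hadoop_conf is a hypothetical helper, not part of Pig or Hadoop, and the fallback path is the one from the script excerpt.

```shell
# Hypothetical helper (not part of Pig or Hadoop): report whether a Hadoop
# conf directory contains a core-site.xml that mentions fs.default.name.
check_hadoop_conf() {
  conf_dir="$1"
  if [ -f "$conf_dir/core-site.xml" ] &&
     grep -qF 'fs.default.name' "$conf_dir/core-site.xml"; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Path taken from the script excerpt; adjust it for your own install.
check_hadoop_conf "${HADOOP_CONF_DIR:-/HADOOP/hadoop-0.20.2/conf}"
```

If this prints ok and Pig still connects to file:///, one thing to try is adding the conf directory to PIG_CLASSPATH (export PIG_CLASSPATH="$HADOOP_CONF_DIR"), since the script's own comments describe that variable as extra classpath entries. Whether that resolves the problem in this setup is untested.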
________________________________________
From: Mohammad Tariq [dontariq@gmail.com]
Sent: Wednesday, July 25, 2012 5:34 PM
To: user@pig.apache.org
Subject: Re: DATA not storing as comma-separted
Also, it would be helpful to go to the MapReduce web UI and have a look
at the details of the job corresponding to this query.
Regards,
Mohammad Tariq
On Wed, Jul 25, 2012 at 5:31 PM, Mohammad Tariq <do...@gmail.com> wrote:
> I have worked with pig-0.7.0 once and it was working fine. Try to see
> if there is anything interesting in the log files. Also, if possible,
> share 2-3 lines of your file. I'll give it a try on my machine.
>
> Regards,
> Mohammad Tariq
Re: DATA not storing as comma-separted
Posted by Mohammad Tariq <do...@gmail.com>.
Also, it would be helpful to go to the MapReduce web UI and have a look
at the details of the job corresponding to this query.
Regards,
Mohammad Tariq
On Wed, Jul 25, 2012 at 5:31 PM, Mohammad Tariq <do...@gmail.com> wrote:
> I have worked with pig-0.7.0 once and it was working fine. Try to see
> if there is anything interesting in the log files. Also, if possible,
> share 2-3 lines of your file. I'll give it a try on my machine.
>
> Regards,
> Mohammad Tariq
Re: DATA not storing as comma-separted
Posted by Mohammad Tariq <do...@gmail.com>.
I have worked with pig-0.7.0 once and it was working fine. Try to see
if there is anything interesting in the log files. Also, if possible,
share 2-3 lines of your file. I'll give it a try on my machine.
Regards,
Mohammad Tariq
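When scanning a long grunt log like the one in this thread, it can help to pull out just the error codes first. A minimal sketch in shell, assuming the log uses the "ERROR <number>" form seen in the quoted output; extract_error_codes is a hypothetical helper name, and the sample line is copied from the thread.

```shell
# Sample line copied from the grunt output quoted in this thread.
log_line='ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///hello/demotry.txt'

# Hypothetical helper: read log text on stdin and print each distinct
# "ERROR <code>" token once, sorted.
extract_error_codes() {
  grep -o 'ERROR [0-9][0-9]*' | sort -u
}

printf '%s\n' "$log_line" | extract_error_codes
```

To run it against a real log file, redirect the file into the function, for example extract_error_codes < /users/mediaadmin/pig_1343217013235.log (the path reported by grunt), provided that file exists.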
On Wed, Jul 25, 2012 at 5:20 PM, <yo...@wipro.com> wrote:
> Hi Mohammad,
>
> I have switched from Pig 0.10.0 to 0.7.0 and it's a horrible experience.
> When I run
>
> grunt> A = load '/hello/demotry.txt'
>>> as (name:chararray, roll:int, mssg:chararray);
>
> grunt> dump A;
>
> it shows this error:
>
> grunt> dump A;
> 2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for A
> 2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for A
> 2012-07-25 17:20:34,102 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
> 2012-07-25 17:20:34,169 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(file:/tmp/temp61624047/tmp1087576502:org.apache.pig.builtin.BinStorage) - 1-18 Operator Key: 1-18)
> 2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
> 2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
> 2012-07-25 17:20:34,211 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2012-07-25 17:20:34,217 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2012-07-25 17:20:34,217 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2012-07-25 17:20:35,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2012-07-25 17:20:35,599 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2012-07-25 17:20:35,600 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2012-07-25 17:20:35,606 [Thread-7] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 2012-07-25 17:20:35,750 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2012-07-25 17:20:35,763 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2012-07-25 17:20:36,101 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
> 2012-07-25 17:20:36,107 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:/tmp/temp61624047/tmp1087576502"
> 2012-07-25 17:20:36,107 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
> 2012-07-25 17:20:36,120 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> 2012-07-25 17:20:36,121 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///hello/demotry.txt
> Details at logfile: /users/mediaadmin/pig_1343217013235.log
>
>
> Why is this happening? :-(
>
> Please help and suggest.
>
> Thanks & Regards
> yogesh Kumar
RE: DATA not storing as comma-separted
Posted by yo...@wipro.com.
Hi Mohammad,
I have switched from Pig 0.10.0 to 0.7.0, and it has been a horrible experience.
I ran:
grunt> A = load '/hello/demotry.txt'
>> as (name:chararray, roll:int, mssg:chararray);
grunt> dump A;
and it shows this error:
2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for A
2012-07-25 17:20:34,081 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for A
2012-07-25 17:20:34,102 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2012-07-25 17:20:34,169 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: Store(file:/tmp/temp61624047/tmp1087576502:org.apache.pig.builtin.BinStorage) - 1-18 Operator Key: 1-18)
2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2012-07-25 17:20:34,195 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2012-07-25 17:20:34,211 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2012-07-25 17:20:34,217 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2012-07-25 17:20:34,217 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2012-07-25 17:20:35,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2012-07-25 17:20:35,599 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2012-07-25 17:20:35,600 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2012-07-25 17:20:35,606 [Thread-7] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-07-25 17:20:35,750 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2012-07-25 17:20:35,763 [Thread-7] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2012-07-25 17:20:36,101 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2012-07-25 17:20:36,101 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
2012-07-25 17:20:36,107 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:/tmp/temp61624047/tmp1087576502"
2012-07-25 17:20:36,107 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2012-07-25 17:20:36,120 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2012-07-25 17:20:36,121 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: file:///hello/demotry.txt
Details at logfile: /users/mediaadmin/pig_1343217013235.log
Why is this happening? :-(
Please help and suggest.
Thanks & Regards
Yogesh Kumar
________________________________________
From: Mohammad Tariq [dontariq@gmail.com]
Sent: Wednesday, July 25, 2012 5:00 PM
To: user@pig.apache.org
Subject: Re: DATA not storing as comma-separted
Hi Yogesh,
Is 'load' working fine with PigStorage()?? Try to load
something using PigStorage(',') and dump it to see if that is working.
Regards,
Mohammad Tariq
Re: DATA not storing as comma-separted
Posted by Mohammad Tariq <do...@gmail.com>.
Hi Yogesh,
Is 'load' working fine with PigStorage()?? Try to load
something using PigStorage(',') and dump it to see if that is working.
Regards,
Mohammad Tariq
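A minimal sketch of the check suggested above, in Pig Latin. The file name /hello/demo_commas.txt is hypothetical (any small comma-delimited file would do), and the single-space delimiter in the second LOAD is an assumption based on the dump output shown in this thread. The field names are the ones used elsewhere in the thread.

```
-- Hypothetical comma-delimited test file; substitute a file that exists.
B = LOAD '/hello/demo_commas.txt' USING PigStorage(',')
    AS (name:chararray, roll:int, mssg:chararray);
DUMP B;

-- Assumption: the thread's input is separated by single spaces, not tabs,
-- so the delimiter is given explicitly instead of relying on the tab default.
A = LOAD '/hello/demotry.txt' USING PigStorage(' ')
    AS (name:chararray, roll:int, mssg:chararray);
DESCRIBE A;
```

If DUMP B shows three separate fields per tuple, PigStorage is splitting on the delimiter it was given, and any remaining problem is more likely in the input file's delimiter than in the STORE statement.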
On Wed, Jul 25, 2012 at 4:41 PM, <yo...@wipro.com> wrote:
> Hello Dmitriy,
>
> I also ran the cat command in Hadoop:
>
> hadoop dfs -cat /hadoop/pig/records/part-m-00000
>
> but it still shows the same output, without commas.
> Please suggest.
>
> Thanks & regards
> Yogesh Kumar
RE: DATA not storing as comma-separted
Posted by yo...@wipro.com.
Hello Dmitriy,
I also ran the cat command in Hadoop:
hadoop dfs -cat /hadoop/pig/records/part-m-00000
but it still shows the same output, without commas.
Please suggest.
Thanks & regards
Yogesh Kumar
________________________________________
From: Dmitriy Ryaboy [dvryaboy@gmail.com]
Sent: Wednesday, July 25, 2012 4:33 PM
To: user@pig.apache.org
Subject: Re: DATA not storing as comma-separted
Using the store expression you wrote should work. Dump is its own thing and doesn't know anything about the format you store things in. To see the files created on HDFS, you can use cat.
Re: DATA not storing as comma-separted
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Using the store expression you wrote should work. Dump is its own thing and doesn't know anything about the format you store things in. To see the files created on HDFS, you can use cat.
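A minimal sketch of that check from the Grunt shell, assuming RECORDS is already defined. Grunt's own cat command is used so that the relative path is resolved the same way for STORE and for reading the result back. A relative path such as hadoop/pig/records is resolved against the current working directory (by default the user's home directory in HDFS), which may be a different location from the absolute path /hadoop/pig/records.

```
-- Write RECORDS with a comma between fields (fails if the directory already exists).
grunt> STORE RECORDS INTO 'hadoop/pig/records' USING PigStorage(',');
-- Print the part files under the same relative path.
grunt> cat hadoop/pig/records
```

The stored files contain commas only between fields that Pig treats as separate. A record that was loaded as a single field is written back as a single field, without commas.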
On Jul 25, 2012, at 3:48 AM, <yo...@wipro.com> wrote:
> Hi All,
>
> I am new to Pig and am trying to store data in HDFS as comma-separated values, using the command
>
> store RECORDS into 'hadoop/pig/records' using PigStorage(',');
>
> If I do
>
> dump RECORDS ;
>
> it shows
>
> (YogeshKumar 210 hello)
> (Mohitkumar 211 hi)
> (AAshichoudhary 212 hii)
> (renuchoudhary 213 namestey)
>
> I want it to store as
>
> (YogeshKumar, 210, hello)
> (Mohitkumar, 211,hi)
> (AAshichoudhary, 212, hii)
> (renuchoudhary, 213, namestey)
>
>
> Please suggest and Help
>
> Thanks & Regards
> Yogesh Kumar