Posted to dev@pig.apache.org by "Russell Jurney (JIRA)" <ji...@apache.org> on 2012/07/07 03:03:33 UTC

[jira] [Resolved] (PIG-2792) Wonderdog stopped working in Pig 0.10.0 (worked in 0.9.2)

     [ https://issues.apache.org/jira/browse/PIG-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney resolved PIG-2792.
---------------------------------

    Resolution: Fixed

Fixed here: https://github.com/infochimps-labs/wonderdog/pull/8

Pull in the conf object if the hadoop cache call misses.
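
For illustration only, here is a minimal sketch of that kind of fallback. It is not the code from the pull request. The class, the method names, the two maps standing in for the Hadoop cache and the job conf object, and the file path are all hypothetical. The only details taken from the report below are the es.config key and the fact that System.setProperty throws a NullPointerException when handed a null value, which matches the "Caused by" frames in the stack trace.

```java
import java.util.HashMap;
import java.util.Map;

class ConfigFallbackSketch {

    // Prefer the value found through the cache lookup; if that lookup
    // misses (returns null), fall back to the job configuration.
    // Both maps are hypothetical stand-ins, not Hadoop or Wonderdog types.
    static String resolve(Map<String, String> cache,
                          Map<String, String> conf,
                          String key) {
        String value = cache.get(key);
        if (value == null) {
            value = conf.get(key);
        }
        return value;
    }

    // System.setProperty throws NullPointerException for a null value,
    // so fail early with a message that names the missing key.
    static void apply(String key, String value) {
        if (value == null) {
            throw new IllegalStateException("No value found for " + key);
        }
        System.setProperty(key, value);
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();      // simulated cache miss
        Map<String, String> conf = new HashMap<>();
        conf.put("es.config", "/tmp/elasticsearch.yml");  // hypothetical path

        String esConfig = resolve(cache, conf, "es.config");
        apply("es.config", esConfig);
        System.out.println("Using [" + System.getProperty("es.config") + "] as es.config");
    }
}
```

With an empty cache the value comes from the conf map. If neither source has the key, the explicit check reports which key is missing instead of surfacing a bare NullPointerException from inside System.setProperty.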
                
> Wonderdog stopped working in Pig 0.10.0 (worked in 0.9.2)
> ---------------------------------------------------------
>
>                 Key: PIG-2792
>                 URL: https://issues.apache.org/jira/browse/PIG-2792
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.10.0, 0.11, 0.10.1
>         Environment: Pig with Wonderdog https://github.com/infochimps-labs/wonderdog for elasticsearch integration. Elasticsearch 0.18.6. Pig local mode.
>            Reporter: Russell Jurney
>            Priority: Blocker
>              Labels: a, about, area, book, did, i, moving, of, omg, technology, why, write
>             Fix For: 0.10.1
>
>
> The Pig UDFs in Wonderdog for ElasticSearch integration, which worked in 0.9.2, stopped working in 0.10.0.
> In 0.10.0 there is an error: Wonderdog is unable to read its configuration from the hadoop cache.
> If someone can help identify what the issue is, or advise how Wonderdog or Pig can be modified so that Wonderdog works with Pig 0.10, it would be greatly appreciated.
> This issue is duplicated in the Wonderdog project here: https://github.com/infochimps-labs/wonderdog/issues/6 https://github.com/infochimps-labs/wonderdog/issues/5 and https://github.com/infochimps-labs/wonderdog/issues/7
> The error is below:
> 2012-07-06 16:50:51,501 [main] INFO  org.apache.pig.Main - Apache Pig version 0.10.0-SNAPSHOT (rexported) compiled Jun 22 2012, 15:56:16
> 2012-07-06 16:50:51,502 [main] INFO  org.apache.pig.Main - Logging error messages to: /private/tmp/pig_1341618651472.log
> 2012-07-06 16:50:51,829 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
> {"ok":true}
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> 100    11  100    11    0     0    647      0 --:--:-- --:--:-- --:--:--   733
> 2012-07-06 16:50:53,206 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
> 2012-07-06 16:50:53,379 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2012-07-06 16:50:53,403 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
> 2012-07-06 16:50:53,403 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
> 2012-07-06 16:50:53,441 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2012-07-06 16:50:53,449 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2012-07-06 16:50:53,494 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2012-07-06 16:50:53,560 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2012-07-06 16:50:53,587 [Thread-7] WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2012-07-06 16:50:53,597 [Thread-7] WARN  org.apache.hadoop.mapred.JobClient - No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> ****file:/tmp/emails.json
> 2012-07-06 16:50:53,711 [Thread-7] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2
> 2012-07-06 16:50:53,711 [Thread-7] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 2
> 2012-07-06 16:50:53,734 [Thread-7] WARN  org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library not loaded
> 2012-07-06 16:50:53,737 [Thread-7] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 3
> 2012-07-06 16:50:54,008 [Thread-8] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorPlugin : null
> 2012-07-06 16:50:54,023 [Thread-8] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/tmp/emails.json/part-m-00000:0+33554432
> 2012-07-06 16:50:54,029 [Thread-8] INFO  com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using field:[message_id] for document ids
> 2012-07-06 16:50:54,029 [Thread-8] INFO  com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as es.config
> 2012-07-06 16:50:54,029 [Thread-8] INFO  com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as es.plugins.dir
> 2012-07-06 16:50:54,033 [Thread-8] WARN  org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
> 2012-07-06 16:50:54,034 [Thread-8] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> java.lang.RuntimeException: java.lang.NullPointerException
> 	at com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:133)
> 	at com.infochimps.elasticsearch.ElasticSearchOutputFormat.getRecordWriter(ElasticSearchOutputFormat.java:262)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> 	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:628)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:753)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> Caused by: java.lang.NullPointerException
> 	at java.util.Hashtable.put(Hashtable.java:394)
> 	at java.util.Properties.setProperty(Properties.java:143)
> 	at java.lang.System.setProperty(System.java:746)
> 	at com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:130)
> 	... 6 more
> 2012-07-06 16:50:54,506 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
> 2012-07-06 16:50:54,506 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2012-07-06 16:50:59,022 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0001 has failed! Stop running all dependent jobs
> 2012-07-06 16:50:59,023 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2012-07-06 16:50:59,024 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> 2012-07-06 16:50:59,024 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats reported below may be incomplete
> 2012-07-06 16:50:59,025 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: 
> HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
> 1.0.2	0.10.0-SNAPSHOT	rjurney	2012-07-06 16:50:53	2012-07-06 16:50:59	UNKNOWN
> Failed!
> Failed Jobs:
> JobId	Alias	Feature	Message	Outputs
> job_local_0001	json_emails	MAP_ONLY	Message: Job failed! Error - NA	es://email/email?id=message_id&json=true&size=1000,
> Input(s):
> Failed to read data from "/tmp/emails.json"
> Output(s):
> Failed to produce result in "es://email/email?id=message_id&json=true&size=1000"
> Job DAG:
> job_local_0001
> 2012-07-06 16:50:59,025 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
> 2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
> 2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
> 	at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
> 	at org.apache.pig.tools.grunt.GruntParser.processShCommand(GruntParser.java:1025)
> 	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:167)
> 	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
> 	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> 	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> 	at org.apache.pig.Main.run(Main.java:555)
> 	at org.apache.pig.Main.main(Main.java:111)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Details also at logfile: /private/tmp/pig_1341618651472.log
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> 100   193  100   193    0     0   2475      0 --:--:-- --:--:-- --:--:--  2539
> {
>   "took" : 75,
>   "timed_out" : false,
>   "_shards" : {
>     "total" : 5,
>     "successful" : 5,
>     "failed" : 0
>   },
>   "hits" : {
>     "total" : 0,
>     "max_score" : null,
>     "hits" : [ ]
>   }
> }
> 2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
> 2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
> 	at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
> 	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
> 	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> 	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> 	at org.apache.pig.Main.run(Main.java:555)
> 	at org.apache.pig.Main.main(Main.java:111)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
