Posted to issues-all@impala.apache.org by "Michael Brown (JIRA)" <ji...@apache.org> on 2018/06/14 00:44:00 UTC

[jira] [Updated] (IMPALA-7170) "tests/comparison/data_generator.py populate" is broken

     [ https://issues.apache.org/jira/browse/IMPALA-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Brown updated IMPALA-7170:
----------------------------------
    Summary: "tests/comparison/data_generator.py populate" is broken  (was: test/comparison is broken)

> "tests/comparison/data_generator.py populate" is broken
> -------------------------------------------------------
>
>                 Key: IMPALA-7170
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7170
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.0
>            Reporter: Tianyi Wang
>            Priority: Major
>
> tests/comparison in Impala 3.x is broken, presumably by the switch to Hadoop 3.
> First, to run the tests in Impala 3.x, the mini-cluster needs to be started with YARN, which is not documented anywhere.
> Then, data_generator.py exits with the following error:
> {noformat}
> 2018-04-23 23:15:46,065 INFO:db_connection[752]:Dropping database randomness
> 2018-04-23 23:15:46,095 INFO:db_connection[234]:Creating database randomness
> 2018-04-23 23:15:52,390 INFO:data_generator[235]:Starting MR job to generate data for randomness
> Traceback (most recent call last):
>   File "tests/comparison/data_generator.py", line 339, in <module>
>     populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
>   File "tests/comparison/data_generator.py", line 134, in populate_db
>     self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
>   File "tests/comparison/data_generator.py", line 244, in _run_data_generator_mr_job
>     % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
>   File "/home/impdev/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
>     stderr=subprocess.STDOUT, env=env)
>   File "/home/impdev/projects/impala/tests/util/shell_util.py", line 113, in shell
>     "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
> Exception: Command returned non-zero exit code: 1
> cmd: set -euo pipefail
> hadoop jar /home/impdev/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar -D mapred.reduce.tasks=36 \
>         -D stream.num.map.output.key.fields=2 \
>         -files tests/comparison/common.py,tests/comparison/db_types.py,tests/comparison/data_generator_mapred_common.py,tests/comparison/data_generator_mapper.py,tests/comparison/data_generator_reducer.py,tests/comparison/random_val_generator.py \
>         -input /tmp/data_gen_randomness_mr_input_1524525348 \
>         -output /tmp/data_gen_randomness_mr_output_1524525348 \
>         -mapper data_generator_mapper.py \
>         -reducer data_generator_reducer.py
> stdout: packageJobJar: [] [/home/impdev/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob2990195923122538287.jar tmpDir=null
> 18/04/23 23:15:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 18/04/23 23:15:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 18/04/23 23:15:54 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/impdev/.staging/job_1524519161700_0002
> 18/04/23 23:15:54 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 18/04/23 23:15:54 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 2b3bd7731ff3ef5d8585a004b90696630e5cea96]
> 18/04/23 23:15:54 INFO mapred.FileInputFormat: Total input files to process : 1
> 18/04/23 23:15:54 INFO mapreduce.JobSubmitter: number of splits:2
> 18/04/23 23:15:54 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
> 18/04/23 23:15:54 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
> 18/04/23 23:15:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524519161700_0002
> 18/04/23 23:15:54 INFO mapreduce.JobSubmitter: Executing with tokens: []
> 18/04/23 23:15:54 INFO conf.Configuration: resource-types.xml not found
> 18/04/23 23:15:54 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
> 18/04/23 23:15:54 INFO impl.YarnClientImpl: Submitted application application_1524519161700_0002
> 18/04/23 23:15:54 INFO mapreduce.Job: The url to track the job: http://c37e0835e988:8088/proxy/application_1524519161700_0002/
> 18/04/23 23:15:54 INFO mapreduce.Job: Running job: job_1524519161700_0002
> 18/04/23 23:16:00 INFO mapreduce.Job: Job job_1524519161700_0002 running in uber mode : false
> 18/04/23 23:16:00 INFO mapreduce.Job:  map 0% reduce 0%
> 18/04/23 23:16:06 INFO mapreduce.Job: Job job_1524519161700_0002 failed with state FAILED due to: Application application_1524519161700_0002 failed 2 times due to AM Container for appattempt_1524519161700_0002_000002 exited with  exitCode: 255
> Failing this attempt.Diagnostics: [2018-04-23 23:16:06.473]Exception from container-launch.
> Container id: container_1524519161700_0002_02_000001
> Exit code: 255
> [2018-04-23 23:16:06.475]Container exited with a non-zero exit code 255. Error file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
> Apr 23, 2018 11:16:03 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
> INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
> log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> [2018-04-23 23:16:06.476]Container exited with a non-zero exit code 255. Error file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
> Apr 23, 2018 11:16:03 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
> INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
> Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
> log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> For more detailed output, check the application tracking page: http://localhost:8088/cluster/app/application_1524519161700_0002 Then click on links to logs of each attempt.
> . Failing the application.
> 18/04/23 23:16:06 INFO mapreduce.Job: Counters: 0
> 18/04/23 23:16:06 ERROR streaming.StreamJob: Job not successful!
> Streaming Command Failed!
> {noformat}
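
The output above buries the useful facts (the YARN application ID, the ApplicationMaster exit code, and two deprecated property names) in a lot of noise. As a triage aid only, here is a minimal Python sketch that pulls those three items out of captured streaming output. The function names summarize_streaming_failure and yarn_logs_command are hypothetical and are not part of tests/comparison. The sketch does not address the root cause, which the report does not identify. Whether "yarn logs" returns anything also depends on how log aggregation is configured on the mini-cluster, which is an assumption here.

```python
import re

# Patterns written against the log lines quoted in the report above.
APP_ID_RE = re.compile(r"application_\d+_\d+")
EXIT_CODE_RE = re.compile(r"exitCode:\s*(-?\d+)")
DEPRECATION_RE = re.compile(
    r"Configuration\.deprecation: (\S+) is deprecated\. Instead, use (\S+)")


def summarize_streaming_failure(output):
    """Extract the application ID, AM exit code, and deprecated properties.

    `output` is the combined stdout/stderr text of the `hadoop jar` command.
    Missing items are reported as None (or an empty dict) rather than raising.
    """
    app_match = APP_ID_RE.search(output)
    exit_match = EXIT_CODE_RE.search(output)
    return {
        "application_id": app_match.group(0) if app_match else None,
        "am_exit_code": int(exit_match.group(1)) if exit_match else None,
        # Maps each deprecated property name to its suggested replacement.
        "deprecated": dict(DEPRECATION_RE.findall(output)),
    }


def yarn_logs_command(application_id):
    """Build (but do not run) the command that fetches the container logs."""
    return ["yarn", "logs", "-applicationId", application_id]
```

Fed the output quoted in this issue, the summary would name application_1524519161700_0002 with exit code 255, and would map mapred.reduce.tasks to mapreduce.job.reduces. The resulting command list can be passed to subprocess to retrieve the ApplicationMaster logs that the streaming client does not print.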
