Posted to issues@beam.apache.org by "Léopold Boudard (Jira)" <ji...@apache.org> on 2020/04/01 12:21:00 UTC

[jira] [Commented] (BEAM-9645) Python flinkrunner cannot inspect container

    [ https://issues.apache.org/jira/browse/BEAM-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072710#comment-17072710 ] 

Léopold Boudard commented on BEAM-9645:
---------------------------------------

Small update on this: the failure clearly seems to be linked to problems during initialization of the portable executor's container, though I couldn't find a way to debug or diagnose what the problem is (and I don't hit the same init issues on the Dataflow runner).
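
One way to look for the failed container on the worker host might be to list all containers, including stopped ones, and read the logs of any that exited. The sketch below is only an illustration, not something tried on this cluster: the helper names and the "|" separator are made up for the example, and it assumes the `docker` CLI is reachable on the host where the Flink task manager runs. If the container has already been removed, as the "No such object" error suggests may be the case, nothing will show up here.

```python
import subprocess

# Go-template format passed to `docker ps --format`; "|" is an arbitrary separator
# chosen for this example.
PS_FORMAT = "{{.ID}}|{{.Image}}|{{.Status}}"


def parse_ps_output(text):
    """Turn `docker ps --format PS_FORMAT` output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        container_id, image, status = line.split("|", 2)
        rows.append({"id": container_id, "image": image, "status": status})
    return rows


def exited_containers(rows):
    """Keep only the containers whose status reports that they exited."""
    return [row for row in rows if row["status"].startswith("Exited")]


def list_all_containers():
    """List containers on this host, including stopped ones (needs docker)."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--no-trunc", "--format", PS_FORMAT],
        check=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_ps_output(result.stdout)


def container_logs(container_id):
    """Return stdout and stderr of a (possibly stopped) container."""
    result = subprocess.run(
        ["docker", "logs", container_id],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout + result.stderr
```

Calling `exited_containers(list_all_containers())` and then `container_logs(...)` on each returned id would print whatever the SDK container wrote before it stopped.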

I could work around this issue by directly using a custom container image that embeds the required packages:

[https://beam.apache.org/documentation/runtime/environments/#customizing-container-images]
{code:java}
GOOGLE_APPLICATION_CREDENTIALS=~/gcp/dataflow.json python -m listing_beam_pipeline.run --runner FlinkRunner --flink_master={} --flink_version 1.9  --input gs://... --output gs://... --environment_config eu.gcr.io/ma-dev2/listing_beam_pipeline:latest
{code}
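
For anyone who prefers to assemble these flags in code, here is a minimal sketch of the same invocation as a Python list. The function name and its parameters are hypothetical. `--environment_type DOCKER` is spelled out as an assumption: the command above omits it and presumably relies on the default. `--input` and `--output` are this pipeline's own arguments, not Beam options.

```python
def portable_flink_args(flink_master, image, input_path, output_path,
                        flink_version="1.9"):
    """Build the flags for a FlinkRunner job that uses a custom SDK container."""
    return [
        "--runner", "FlinkRunner",
        "--flink_master={}".format(flink_master),
        "--flink_version", flink_version,
        # Pipeline-specific arguments, parsed by the pipeline's own argparse.
        "--input", input_path,
        "--output", output_path,
        # Assumption: made explicit here; the original command leaves it out.
        "--environment_type", "DOCKER",
        # The image that already contains the required Python packages.
        "--environment_config", image,
    ]
```

The resulting list could then be handed to the pipeline's argument parser, or to `PipelineOptions(...)` for the Beam-specific part.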

> Python flinkrunner cannot inspect container
> -------------------------------------------
>
>                 Key: BEAM-9645
>                 URL: https://issues.apache.org/jira/browse/BEAM-9645
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.19.0
>         Environment: dataproc cluster, running flink version 1.9
>            Reporter: Léopold Boudard
>            Priority: Major
>
> Hi,
> I'm trying to submit a Python pipeline as a portable job on the FlinkRunner with a very simple setup file, but I can't see any error logs because the runner fails to retrieve logs/state from the underlying container:
> {code:java}
> Caused by: java.io.IOException: Received exit code 1 for command 'docker inspect -f {{.State.Running}} 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b'. stderr: Error: No such object: 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b
> Caused by: java.io.IOException: Received exit code 1 for command 'docker inspect -f {{.State.Running}} 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b'. stderr: Error: No such object: 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b
>     at org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:234)
>     at org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:168)
>     at org.apache.beam.runners.fnexecution.environment.DockerCommand.isContainerRunning(DockerCommand.java:112)
>     at org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory.createEnvironment(DockerEnvironmentFactory.java:165)
>     at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:200)
>     at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:184)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
>     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
>     ... 12 more
>     Suppressed: java.io.IOException: Received exit code 1 for command 'docker kill 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b'. stderr: Error response from daemon: Cannot kill container: 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b: No such container: 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b
>         at org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:234)
>         at org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:168)
>         at org.apache.beam.runners.fnexecution.environment.DockerCommand.killContainer(DockerCommand.java:148)
>         at org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory.createEnvironment(DockerEnvironmentFactory.java:192)
>         ... 22 more
> ERROR:root:java.io.IOException: Received exit code 1 for command 'docker inspect -f {{.State.Running}} 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b'. stderr: Error: No such object: 248d660be908ef58385eb962658cee831c4c3b1be9ea1f835e4563f53016363b
> [flink-runner-job-invoker] INFO org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService - Manifest at /var/folders/s3/29yl1s8125j33_9vb5tczdxw0000gn/T/beam-tempiwx1szcz/artifacts6ptzmo6u/job_a340349c-cc95-4e32-9cbd-7f915c2d0407/MANIFEST has 1 artifact locations
> [flink-runner-job-invoker] INFO org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactStagingService - Removed dir /var/folders/s3/29yl1s8125j33_9vb5tczdxw0000gn/T/beam-tempiwx1szcz/artifacts6ptzmo6u/job_a340349c-cc95-4e32-9cbd-7f915c2d0407/
> Traceback (most recent call last):
>   File "importer/test_runner.py", line 46, in <module>
>     run()
>   File "importer/test_runner.py", line 41, in run
>     | 'write to file' >> WriteToText(known_args.output)
>   File "/Users/leopold/.pyenv/versions/BenchmarkListingStreaming/lib/python3.6/site-packages/apache_beam/pipeline.py", line 481, in __exit__
>     self.run().wait_until_finish()
>   File "/Users/leopold/.pyenv/versions/BenchmarkListingStreaming/lib/python3.6/site-packages/apache_beam/runners/portability/portable_runner.py", line 455, in wait_until_finish
>     self._job_id, self._state, self._last_error_message()))
> {code}
>  
> Job args:
> --runner FlinkRunner --flink_master={flink_master} --flink_version 1.9 --setup_file ./setup.py
> And the corresponding setup file:
> {code:java}
> import setuptools
>
> install_requires = ['networkx==2.4', 'PyYAML==5.3']
>
> setuptools.setup(
>     name='benchmark_listing',
>     version='0.1.0',  # build_version('0.1.0'),
>     author='MeilleursAgents',
>     author_email='geeks@meilleursagents.com',
>     description='Benchmark Listing Streaming',
>     # long_description=long_description,
>     # long_description_content_type='text/markdown',
>     install_requires=install_requires,
>     # packages=setuptools.find_packages(),
>     # python_requires='>=3.7.6',
>     # include_package_data=True,
> )
> {code}
> Could you advise on this issue please?
> Thanks!


