Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/11/19 10:08:00 UTC

[jira] [Commented] (FLINK-10925) NPE in PythonPlanStreamer

    [ https://issues.apache.org/jira/browse/FLINK-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691500#comment-16691500 ] 

ASF GitHub Bot commented on FLINK-10925:
----------------------------------------

kkolman opened a new pull request #7139: [FLINK-10925] [python] Fix NPE in PythonPlanStreamer
URL: https://github.com/apache/flink/pull/7139
 
 
   ## What is the purpose of the change
   
   This pull request makes it easier to troubleshoot Python Batch API failures that occur when a Python Batch API job is run without Python installed.
   
   ## Brief change log
   
     - added a null check in *PythonPlanStreamer* before calling `close()` on the `server` instance variable
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
     - Manually verified the change by submitting a Python Batch API job to a cluster without Python installed:
       - with this fix, the job fails with an explanatory error message "**Failed to run plan: python does not point to a valid python binary.**"
       - prior to this fix, the job would fail with a confusing "**Failed to run plan: null**" message.
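
The masking effect described above can be reproduced in isolation. The following is a minimal sketch, not the actual Flink code: `NullSafeCloseSketch`, `Streamer`, and `runPlan` are hypothetical stand-ins, and the try/finally layout is an assumption about how the binder calls `close()`. It shows how an exception thrown from a `finally` block replaces the original startup error, and how the null check lets the original error through.

```java
import java.io.Closeable;
import java.io.IOException;

public class NullSafeCloseSketch {

    /** Hypothetical stand-in for the streamer; not the real PythonPlanStreamer. */
    static class Streamer {
        private Closeable server;        // never assigned because open() fails first
        private final boolean nullCheck; // true = behavior with the fix applied

        Streamer(boolean nullCheck) {
            this.nullCheck = nullCheck;
        }

        void open() throws IOException {
            // Simulates the Python binary check failing before `server` is created.
            throw new IOException("python does not point to a valid python binary.");
        }

        void close() throws IOException {
            if (nullCheck) {
                if (server != null) {
                    server.close();
                }
            } else {
                server.close(); // throws NullPointerException while server is null
            }
        }
    }

    /** Returns the exception the caller ends up seeing (assumed try/finally layout). */
    static Exception runPlan(boolean nullCheck) {
        Streamer streamer = new Streamer(nullCheck);
        try {
            try {
                streamer.open();
            } finally {
                // An exception thrown here discards the one thrown by open().
                streamer.close();
            }
            return null;
        } catch (Exception e) {
            return e;
        }
    }

    public static void main(String[] args) {
        System.out.println("without check: " + runPlan(false).getClass().getName());
        System.out.println("with check:    " + runPlan(true).getMessage());
    }
}
```

Without the check, the caller only sees a `NullPointerException`. On older JVMs such as Java 8 its message is `null`, which would explain the "Failed to run plan: null" output; newer JVMs may attach a descriptive message instead. With the check, the original `IOException` and its message reach the caller.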
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
     - Does this pull request introduce a new feature? no
   



> NPE in PythonPlanStreamer
> -------------------------
>
>                 Key: FLINK-10925
>                 URL: https://issues.apache.org/jira/browse/FLINK-10925
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.6.2
>            Reporter: Karel Kolman
>            Priority: Major
>              Labels: pull-request-available
>
> Encountered the following issue while testing Python Batch API:
> {noformat}
> root@a0810aa4b51b:/opt/flink# ./bin/pyflink.sh examples/python/batch/WordCount.py  -
> Starting execution of program
> Failed to run plan: null
> The program didn't contain a Flink job. Perhaps you forgot to call execute() on the execution environment.
> {noformat}
> with logs containing the following stacktrace:
> {noformat}
> 2018-11-19 09:11:51,036 ERROR org.apache.flink.python.api.PythonPlanBinder                  - Failed to run plan.
> java.lang.NullPointerException
>         at org.apache.flink.python.api.streaming.plan.PythonPlanStreamer.close(PythonPlanStreamer.java:129)
>         at org.apache.flink.python.api.PythonPlanBinder.runPlan(PythonPlanBinder.java:201)
>         at org.apache.flink.python.api.PythonPlanBinder.main(PythonPlanBinder.java:98)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>         at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>         at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:426)
> {noformat}
> The root cause was that Python was not installed on the Docker image being used.
> Patching Flink with 
> {noformat}
> diff --git a/flink-libraries/flink-python/src/main/java/org/apache/flink/python/api/streaming/plan/PythonPlanStreamer.java b/flink-libraries/flink-python/src/main/java/org/apache/flink/python/api/streaming/plan/PythonPlanStreamer.java
> index d25f3d51ff..3e6a068d8a 100644
> --- a/flink-libraries/flink-python/src/main/java/org/apache/flink/python/api/streaming/plan/PythonPlanStreamer.java
> +++ b/flink-libraries/flink-python/src/main/java/org/apache/flink/python/api/streaming/plan/PythonPlanStreamer.java
> @@ -126,7 +126,9 @@ public class PythonPlanStreamer {
>                         process.destroy();
>                 } finally {
>                         try {
> -                               server.close();
> +                               if (server != null) {
> +                                       server.close();
> +                               }
>                         } catch (IOException e) {
>                                 LOG.error("Failed to close socket.", e);
>                         }
> {noformat}
> an attempt to run the Python Batch API example fails with
> {noformat}
> root@33837d1efa28:/opt/flink# ./bin/pyflink.sh examples/python/batch/WordCount.py -
> Starting execution of program
> Failed to run plan: python does not point to a valid python binary.
> The program didn't contain a Flink job. Perhaps you forgot to call execute() on the execution environment
> {noformat}
> which correctly identifies the problem I was facing: a missing Python installation (or an incorrect Python binary path).


