You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by iemejia <gi...@git.apache.org> on 2016/07/13 14:05:02 UTC

[GitHub] flink pull request #2239: [FLINK-4208] Support Running Flink processes in fo...

GitHub user iemejia opened a pull request:

    https://github.com/apache/flink/pull/2239

    [FLINK-4208] Support Running Flink processes in foreground mode

    Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
    If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
    In addition to going through the list, please provide a meaningful description of your changes.
    
    - [x] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    
    - [x] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    
    - [x] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/iemejia/flink FLINK-4208

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2239.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2239
    
----
commit af9ac6cdb3f6601d6248abe82df4fd44de4453e5
Author: Isma�l Mej�a <ie...@gmail.com>
Date:   2016-07-13T13:55:13Z

    [FLINK-4208] Add support to run flink as a foreground process

commit 5a27c3ce0bfaf7cf862aba3929f7712a3897bc19
Author: Isma�l Mej�a <ie...@gmail.com>
Date:   2016-07-13T13:56:13Z

    [FLINK-4208] Make the flink daemon a real daemon
    nohup ignores the HUP terminal signals, so the process is still alive
    even after the terminal ends

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #2239: [FLINK-4208] Support Running Flink processes in fo...

Posted by iemejia <gi...@git.apache.org>.
Github user iemejia closed the pull request at:

    https://github.com/apache/flink/pull/2239


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    Hi @iemejia, is the situation with Docker that if the Flink processes are started as daemons and the script returns that Docker assumes the process has terminated?
    
    Skipping the pid file might work fine for a container where one wouldn't start multiple TaskManagers but would cause logfile issues otherwise. We do need to add locking to the pid file.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by iemejia <gi...@git.apache.org>.
Github user iemejia commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    Thanks Greg, I was probably too tired last night because I put the wait in a weird place, I just tried now and everything is working, it is still not 'real' foreground, since Ctrl-C gets captured by the wait, but it fixes the issue for the docker image so I am going to close this pull request and the issue.
    If you or anybody else is still interested I will keep it, but I am going to close it if you don't mind.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #2239: [FLINK-4208] Support Running Flink processes in fo...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2239#discussion_r70844709
  
    --- Diff: flink-dist/src/main/flink-bin/bin/flink-daemon.sh ---
    @@ -77,31 +77,36 @@ if [[ ${JAVA_VERSION} =~ ${IS_NUMBER} ]]; then
     fi
     
     case $STARTSTOP in
    +    (start|start-foreground)
    +      # Rotate log files
    +      rotateLogFile $log
    +      rotateLogFile $out
    +
    +      # Print a warning if daemons are already running on host
    +      if [ -f $pid ]; then
    +        active=()
    +        while IFS='' read -r p || [[ -n "$p" ]]; do
    +          kill -0 $p >/dev/null 2>&1
    +          if [ $? -eq 0 ]; then
    +            active+=($p)
    +          fi
    +        done < "${pid}"
     
    -    (start)
    -        # Rotate log files
    -        rotateLogFile $log
    -        rotateLogFile $out
    -
    -        # Print a warning if daemons are already running on host
    -        if [ -f $pid ]; then
    -          active=()
    -          while IFS='' read -r p || [[ -n "$p" ]]; do
    -            kill -0 $p >/dev/null 2>&1
    -            if [ $? -eq 0 ]; then
    -              active+=($p)
    -            fi
    -          done < "${pid}"
    -
    -          count="${#active[@]}"
    +        count="${#active[@]}"
     
    -          if [ ${count} -gt 0 ]; then
    -            echo "[INFO] $count instance(s) of $DAEMON are already running on $HOSTNAME."
    -          fi
    +        if [ ${count} -gt 0 ]; then
    +          echo "[INFO] $count instance(s) of $DAEMON are already running on $HOSTNAME."
             fi
    +      fi
    +
    +      if [[ $STARTSTOP == "start-foreground" ]]; then
    +        echo "Starting $DAEMON as a foreground process on host $HOSTNAME."
    +        $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null
    +      fi
     
    +      if [[ $STARTSTOP == "start" ]]; then
             echo "Starting $DAEMON daemon on host $HOSTNAME."
    -        $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null &
    +        nohup $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null &
    --- End diff --
    
    Under what circumstances is the `nohup` necessary?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by iemejia <gi...@git.apache.org>.
Github user iemejia commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    I tried not to change the current daemon behavior, that's the reason why I took
    the decision to add an additional option.  I am not sure if using wait may work for what I want but if it does, perfect, can you give me hints of how to test this ? I naively did this and it does not seem to work.
    
    ```
            if [[ ${mypid} =~ ${IS_NUMBER} ]] && kill -0 $mypid > /dev/null 2>&1 ; then
                echo $mypid >> $pid
    +           wait $mypid # I also tried with $pid and it does not work either
            else
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by iemejia <gi...@git.apache.org>.
Github user iemejia commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    @aljoscha As discussed in FLINK-4118 I am doing this to support this use case and eventually remove the dependency on python and supervisord from the docker image.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by iemejia <gi...@git.apache.org>.
Github user iemejia commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    Hi, Yes, this is exactly the situation, in a previous pull request I was optimizing the flink docker image, however I found that the image used supervisord to catch and keep alive those daemons, so I wanted to remove this dependency (because it adds around 40MB to the image + python and some extra stuff).
    
    Can you give me some hints on the best way to address this ? Or how can I improve my current approach (notice that I took the start-foreground idea from zookeeper).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    Using `wait $mypid` or just `wait` works for me if I `./bin/jobmanager.sh start cluster` (jobmanager starts in foreground), then in another terminal `./bin/jobmanager.sh stop` and both terminals are now at the command prompt.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #2239: [FLINK-4208] Support Running Flink processes in fo...

Posted by iemejia <gi...@git.apache.org>.
Github user iemejia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2239#discussion_r70873347
  
    --- Diff: flink-dist/src/main/flink-bin/bin/flink-daemon.sh ---
    @@ -77,31 +77,36 @@ if [[ ${JAVA_VERSION} =~ ${IS_NUMBER} ]]; then
     fi
     
     case $STARTSTOP in
    +    (start|start-foreground)
    +      # Rotate log files
    +      rotateLogFile $log
    +      rotateLogFile $out
    +
    +      # Print a warning if daemons are already running on host
    +      if [ -f $pid ]; then
    +        active=()
    +        while IFS='' read -r p || [[ -n "$p" ]]; do
    +          kill -0 $p >/dev/null 2>&1
    +          if [ $? -eq 0 ]; then
    +            active+=($p)
    +          fi
    +        done < "${pid}"
     
    -    (start)
    -        # Rotate log files
    -        rotateLogFile $log
    -        rotateLogFile $out
    -
    -        # Print a warning if daemons are already running on host
    -        if [ -f $pid ]; then
    -          active=()
    -          while IFS='' read -r p || [[ -n "$p" ]]; do
    -            kill -0 $p >/dev/null 2>&1
    -            if [ $? -eq 0 ]; then
    -              active+=($p)
    -            fi
    -          done < "${pid}"
    -
    -          count="${#active[@]}"
    +        count="${#active[@]}"
     
    -          if [ ${count} -gt 0 ]; then
    -            echo "[INFO] $count instance(s) of $DAEMON are already running on $HOSTNAME."
    -          fi
    +        if [ ${count} -gt 0 ]; then
    +          echo "[INFO] $count instance(s) of $DAEMON are already running on $HOSTNAME."
             fi
    +      fi
    +
    +      if [[ $STARTSTOP == "start-foreground" ]]; then
    +        echo "Starting $DAEMON as a foreground process on host $HOSTNAME."
    +        $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null
    +      fi
     
    +      if [[ $STARTSTOP == "start" ]]; then
             echo "Starting $DAEMON daemon on host $HOSTNAME."
    -        $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null &
    +        nohup $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null &
    --- End diff --
    
    I added this because I was trying to grasp what makes a daemon a daemon and I found a reference that convinced me that nohup was missing:
    https://stackoverflow.com/questions/3430330/best-way-to-make-a-shell-script-daemon
    
    Additionally when I looked for inspiration for my changes (the start-foreground name), I look
    at how they started the daemon in zookeeper and I noticed they use nohup
    too.
    https://github.com/apache/zookeeper/blob/trunk/bin/zkServer.sh#L219
    
    This is an extra thing and not the core of the Pull Request, if you don't agree
    I can rebase and remove that commit, but I think it is worth the addition.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    What if instead of changing how we start the daemon (so continue to always start as a background process), we instead add a `wait` after the PID file has been updated when starting a foreground process?
    
    If FLINK-4212 is accepted we would also need to release the file lock before calling `wait`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

Posted by iemejia <gi...@git.apache.org>.
Github user iemejia commented on the issue:

    https://github.com/apache/flink/pull/2239
  
    I hope this gets into 1.1.0 so I can push the changes to the docker image once this is available in the official binary distribution.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #2239: [FLINK-4208] Support Running Flink processes in fo...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2239#discussion_r70877433
  
    --- Diff: flink-dist/src/main/flink-bin/bin/flink-daemon.sh ---
    @@ -77,31 +77,36 @@ if [[ ${JAVA_VERSION} =~ ${IS_NUMBER} ]]; then
     fi
     
     case $STARTSTOP in
    +    (start|start-foreground)
    +      # Rotate log files
    +      rotateLogFile $log
    +      rotateLogFile $out
    +
    +      # Print a warning if daemons are already running on host
    +      if [ -f $pid ]; then
    +        active=()
    +        while IFS='' read -r p || [[ -n "$p" ]]; do
    +          kill -0 $p >/dev/null 2>&1
    +          if [ $? -eq 0 ]; then
    +            active+=($p)
    +          fi
    +        done < "${pid}"
     
    -    (start)
    -        # Rotate log files
    -        rotateLogFile $log
    -        rotateLogFile $out
    -
    -        # Print a warning if daemons are already running on host
    -        if [ -f $pid ]; then
    -          active=()
    -          while IFS='' read -r p || [[ -n "$p" ]]; do
    -            kill -0 $p >/dev/null 2>&1
    -            if [ $? -eq 0 ]; then
    -              active+=($p)
    -            fi
    -          done < "${pid}"
    -
    -          count="${#active[@]}"
    +        count="${#active[@]}"
     
    -          if [ ${count} -gt 0 ]; then
    -            echo "[INFO] $count instance(s) of $DAEMON are already running on $HOSTNAME."
    -          fi
    +        if [ ${count} -gt 0 ]; then
    +          echo "[INFO] $count instance(s) of $DAEMON are already running on $HOSTNAME."
             fi
    +      fi
    +
    +      if [[ $STARTSTOP == "start-foreground" ]]; then
    +        echo "Starting $DAEMON as a foreground process on host $HOSTNAME."
    +        $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null
    +      fi
     
    +      if [[ $STARTSTOP == "start" ]]; then
             echo "Starting $DAEMON daemon on host $HOSTNAME."
    -        $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null &
    +        nohup $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null &
    --- End diff --
    
    This is excellent documentation. It looks like (from a `grep daemon` through the code) that Flink is daemonizing its threads internally. Perhaps @StephanEwen can look at this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---