You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@knox.apache.org by GitBox <gi...@apache.org> on 2020/01/02 18:21:02 UTC

[GitHub] [knox] smolnar82 opened a new pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start

smolnar82 opened a new pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start
URL: https://github.com/apache/knox/pull/230
 
 
   ## What changes were proposed in this pull request?
   
   Apart from the already existing PID check, we are going to verify if the Jetty server is in `STARTED` state. We can achieve it by implementing Jetty's `LifeCylce.Listener` interface and write out the status (STARTING, STARTED, FAILURE, STOPPING, STOPPED) into `$DATA_DIR/gatewayServer.status` file. The startup script will return 0 when this file contains `STARTED`; 1 otherwise (the file is overwritten in every status change).
   
   Additionally, two new command-line options are introduced to the `start` command:
   * `--test-gateway-retry-attempts`: indicates the number of tries the startup script should execute before it fails. Defaults to 5.
   * `--test-gateway-retry-sleep`: the amount of time that the test process will wait or sleep before a retry is issued. Defaults to 2s.
   
   **Note**: currently, the `stop` command simply kills the Java process and there is no hook in the Java code to handle graceful shutdown. Because of this the `STOPPING` and `STOPPED` statuses never got written into the status file. To avoid future errors I rather added a delete command in the shell script.
   
   ## How was this patch tested?
   
   Executed a full build:
   ```
   $ mvn clean -Dshellcheck=true -T1C verify -Prelease,package
   ...
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time: 18:24 min (Wall Clock)
   [INFO] Finished at: 2020-01-02T18:50:47+01:00
   [INFO] Final Memory: 410M/1798M
   [INFO] ------------------------------------------------------------------------
   ```
   
   Tested manually:
   * confirmed that the new status file became created/removed when started/stopped the server
   * tried the `start` command with different options (`--printEnv`, `--test-gateway-retry-attempts`, `--test-gateway-retry-sleep`)
   * made sure (using debug messages in the shell scripts) that the status check took place (in my environment it needed 3 re-tries)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] risdenk commented on issue #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start

Posted by GitBox <gi...@apache.org>.
risdenk commented on issue #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start
URL: https://github.com/apache/knox/pull/230#issuecomment-570297655
 
 
   So just to make sure I understand the problem, the issue is that the launcher runs the gateway process and doesn't wait for the initialization of the server to complete before returning and saying this is up and running.
   
   Is it possible to maybe use the listener or another method in the GatewayServer to wait until initialization is complete? Is the launcher returning too soon? 
   
   Ideally after `GatewayServer` starts the server would be completely ready. It is weird that we are saying that everything is good before that :/
   
   I am not sure I am a fan of the writing a file and checking the output to avoid what looks to be a fundamental gateway starting issue. I wonder how we avoid this in tests currently. There must be an existing method to make sure that the topology is deployed or something along those lines or we would see issues like this in the tests. The solution in this PR wouldn't help the tests since it would be in the shell script only.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] smolnar82 commented on issue #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start

Posted by GitBox <gi...@apache.org>.
smolnar82 commented on issue #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start
URL: https://github.com/apache/knox/pull/230#issuecomment-570520825
 
 
   The main difference between our tests and the way how we start it from the startup script is:
   * within the tests we directly call `GatewayServer.startServer` (usually in methods annotated with `@BeforeClass`) and wait until it returns
   * whereas in the script - where the application is configured to run in the background by default - creates a Java process with `nohup` which invokes `GatewayServer.main()` with the given argument and let the process live its own life in the background. Right after the Java call the script continues to check if the PID exists. Usually, it takes some time for the server to start up but the script indicates it's up&running because the PID is there.
   
   It's not _easy_ to add hook in a shell script to check the 2nd case:
   1. either the server persists its state somewhere which the script can read (this is what I implemented here)
   2. or the script tries to ping the server somehow
   2.1. The REST API would be ideal, but this would require at least one servlet which does not require authentication.
   2.2. Or one might try to check if the port is open. For instance:
   ```
   $ nc -z localhost 8443
   
   $ echo $?
   1
   
   $ nc -z localhost 8443
   
   $ echo $?
   1
   
   $ nc -z localhost 8443
   Connection to localhost port 8443 [tcp/pcsync-https] succeeded!
   
   $ echo $?
   0
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start and registering shutdown hook in order to stop the server gracefully.

Posted by GitBox <gi...@apache.org>.
smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start and registering shutdown hook in order to stop the server gracefully.
URL: https://github.com/apache/knox/pull/230#discussion_r363677522
 
 

 ##########
 File path: gateway-server/src/main/java/org/apache/knox/gateway/GatewayServerLifecyleListener.java
 ##########
 @@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.knox.gateway;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.Locale;
+
+import org.apache.knox.gateway.config.GatewayConfig;
+import org.apache.knox.gateway.i18n.messages.MessagesFactory;
+import org.eclipse.jetty.util.component.LifeCycle;
+
+public class GatewayServerLifecyleListener implements LifeCycle.Listener {
+
+  private static final GatewayMessages log = MessagesFactory.get(GatewayMessages.class);
+
+  private static final ThreadLocal<DateFormat> DATE_FORMAT = ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.getDefault()));
+
+  private enum Status {
+    STARTING, STARTED, FAILURE, STOPPING, STOPPED
+  };
+
+  private final Path lifeCylceFilePath;
 
 Review comment:
   Done.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] risdenk commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start

Posted by GitBox <gi...@apache.org>.
risdenk commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start
URL: https://github.com/apache/knox/pull/230#discussion_r363338928
 
 

 ##########
 File path: gateway-release-common/home/bin/knox-functions.sh
 ##########
 @@ -250,6 +302,11 @@ function appStop {
      exit 1
    else
      rm -f "$APP_PID_FILE"
+
+     #TODO: due to the current way of shutting down the server the status has never been changed to STOPPED (STARTING seems to be not reliable either which is weird).
 
 Review comment:
   This TODO seems relevant if this isn't being written then this will cause issues. Were you able to figure out why lifecycle isn't writing this file?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start

Posted by GitBox <gi...@apache.org>.
smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start
URL: https://github.com/apache/knox/pull/230#discussion_r363462354
 
 

 ##########
 File path: gateway-server/src/main/java/org/apache/knox/gateway/GatewayServerLifecyleListener.java
 ##########
 @@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.knox.gateway;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.Locale;
+
+import org.apache.knox.gateway.config.GatewayConfig;
+import org.apache.knox.gateway.i18n.messages.MessagesFactory;
+import org.eclipse.jetty.util.component.LifeCycle;
+
+public class GatewayServerLifecyleListener implements LifeCycle.Listener {
+
+  private static final GatewayMessages log = MessagesFactory.get(GatewayMessages.class);
+
+  private static final ThreadLocal<DateFormat> DATE_FORMAT = ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.getDefault()));
+
+  private enum Status {
+    STARTING, STARTED, FAILURE, STOPPING, STOPPED
+  };
+
+  private final Path lifeCylceFilePath;
 
 Review comment:
   Will change it as soon as I can.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start and registering shutdown hook in order to stop the server gracefully.

Posted by GitBox <gi...@apache.org>.
smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start and registering shutdown hook in order to stop the server gracefully.
URL: https://github.com/apache/knox/pull/230#discussion_r363677458
 
 

 ##########
 File path: gateway-release-common/home/bin/knox-functions.sh
 ##########
 @@ -250,6 +302,11 @@ function appStop {
      exit 1
    else
      rm -f "$APP_PID_FILE"
+
+     #TODO: due to the current way of shutting down the server the status has never been changed to STOPPED (STARTING seems to be not reliable either which is weird).
 
 Review comment:
   Added the above-mentioned shutdown hook and it's working like a charm.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start

Posted by GitBox <gi...@apache.org>.
smolnar82 commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start
URL: https://github.com/apache/knox/pull/230#discussion_r363465413
 
 

 ##########
 File path: gateway-release-common/home/bin/knox-functions.sh
 ##########
 @@ -250,6 +302,11 @@ function appStop {
      exit 1
    else
      rm -f "$APP_PID_FILE"
+
+     #TODO: due to the current way of shutting down the server the status has never been changed to STOPPED (STARTING seems to be not reliable either which is weird).
 
 Review comment:
   Currently, the JVM is "stopped" with a `kill` command but there is no [shutdown hook](https://docs.oracle.com/javase/8/docs/api/java/lang/Runtime.html#addShutdownHook-java.lang.Thread-) registered in `GatewayServer`. Adding a shutdown hook would help in most of the cases but it may still send a SIGKILL when stopping a server 'normally' did not succeed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] risdenk commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start

Posted by GitBox <gi...@apache.org>.
risdenk commented on a change in pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start
URL: https://github.com/apache/knox/pull/230#discussion_r363339216
 
 

 ##########
 File path: gateway-server/src/main/java/org/apache/knox/gateway/GatewayServerLifecyleListener.java
 ##########
 @@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.knox.gateway;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.Locale;
+
+import org.apache.knox.gateway.config.GatewayConfig;
+import org.apache.knox.gateway.i18n.messages.MessagesFactory;
+import org.eclipse.jetty.util.component.LifeCycle;
+
+public class GatewayServerLifecyleListener implements LifeCycle.Listener {
+
+  private static final GatewayMessages log = MessagesFactory.get(GatewayMessages.class);
+
+  private static final ThreadLocal<DateFormat> DATE_FORMAT = ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.getDefault()));
+
+  private enum Status {
+    STARTING, STARTED, FAILURE, STOPPING, STOPPED
+  };
+
+  private final Path lifeCylceFilePath;
 
 Review comment:
   Should be `lifeCycleFilePath`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [knox] smolnar82 merged pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start and registering shutdown hook in order to stop the server gracefully.

Posted by GitBox <gi...@apache.org>.
smolnar82 merged pull request #230: KNOX-2157 - Verifying the server's state in addition to PID check at gateway start and registering shutdown hook in order to stop the server gracefully.
URL: https://github.com/apache/knox/pull/230
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services