You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by rj...@apache.org on 2006/05/14 22:42:43 UTC
svn commit: r406423 - /tomcat/connectors/trunk/jk/native/TODO

Author: rjung
Date: Sun May 14 13:42:42 2006
New Revision: 406423

URL: http://svn.apache.org/viewcvs?rev=406423&view=rev
Log:
Adding a list of further TODOs and ideas.

Added:
    tomcat/connectors/trunk/jk/native/TODO

Added: tomcat/connectors/trunk/jk/native/TODO
URL: http://svn.apache.org/viewcvs/tomcat/connectors/trunk/jk/native/TODO?rev=406423&view=auto
==============================================================================
--- tomcat/connectors/trunk/jk/native/TODO (added)
+++ tomcat/connectors/trunk/jk/native/TODO Sun May 14 13:42:42 2006
@@ -0,0 +1,230 @@
+TODO for tomcat-connectors
+
+$Id$
+
+1) Counter types in shm
+=======================
+
+Why are elected and errors of type size_t?
+
+    /* Number of times the worker was elected */
+    volatile size_t  elected;
+    /* Number of non 200 responses */
+    volatile size_t  errors;
+
+Shouldn't they be jk_uint64_t?
+
+2) Implement "distance"
+=======================
+
+Add a new attribut "distance" to balanced workers.
+Without session Id or stickyness the lb worker will
+choose between al usable workers of the smallest distance.
+
+3) Reduce number of string comparisons in most_suitable
+========================================================
+
+a) redirect/domains
+
+It would be easy to improve the redirect string b an integer, giving the
+index of the worker in the lb. Then lb would not need to search for the redirect worker.
+
+The same way, one could add a list with indizes to workers in the same domain.
+Whenever domain names are managed (init and status worker update) one would
+scan the worker list and update the index list.
+
+Finally one could have a list of workers, whose domain is the same as the redirect
+attribute of the worker, because that's also something we consider.
+
+What I'm not sure about, even in the existing code, is the locking between updates
+by the status worker and the process local information about the workers,
+especially in the case, when status updates a redirect or domain attribute.
+
+I would like to keep these attributes and the new index arrays process local,
+and the processes should find out about changes made by status to shm (redirect/domain)
+and then rebuild their data. No need to get these on every request from the shm,
+only the check for up-to-date should be made.
+
+b) exact matches for jvmRoutes
+
+Could we use hashes instead of string comparisons all the time?
+I'm not sure, if a good enough hash takes longer than a string comparison though.
+
+4) Optimization of JK_WORKER_USABLE
+====================================
+
+We use that one quite a lot. Since it is now a combination of four
+ANDs of negated values, we could also do
+
+#define JK_WORKER_USABLE(w)   (!((w)->in_error_state || ($w)->is_stopped || (w)->is_disabled || (w)->is_busy))
+
+I think it we should consider combining the flags into an additional
+is_usable (makes checks easier, but of course we would have to set it
+every time we change one of the four other flags). But in terms of
+performance that happens rarely.
+
+5) Code separation between factory, validate and init in lb
+============================================================
+
+The factory contains:
+
+        private_data->worker.retries = JK_RETRIES;
+        private_data->s->recover_wait_time = WAIT_BEFORE_RECOVER;
+
+I think, this should move to validate() or init().
+It might even be obsolete, because in init, we already have:
+
+    pThis->retries = jk_get_worker_retries(props, p->s->name,
+    p->s->retries = pThis->retries;
+    p->s->recover_wait_time = jk_get_worker_recover_timeout(props, p->s->name, WAIT_BEFORE_RECOVER);
+    if (p->s->recover_wait_time < WAIT_BEFORE_RECOVER)
+        p->s->recover_wait_time = WAIT_BEFORE_RECOVER;
+
+Then: In validate there is
+
+                p->lb_workers[i].s->error_time = 0;
+
+So shouldn't there also be
+
+                p->lb_workers[i].s->maintain_time = time(NULL);
+
+6) Refactor Logging
+====================
+
+a) Use the same code files for the request logging functions in Apache 1.3 and 2.0.
+
+b) Use the same code files for piped logging in Apache 1.3 and 2.0.
+
+7) ajpget
+==========
+
+Combine ajplib and Apache ab to an ajp13 commandline client ajpget.
+
+8) Parsing workers.properties
+=============================
+
+Parsing of workers.properties aditionally to just looking up attributes
+would help users to detect syntax errors in the file. At the moment
+no information will be logged, e.g. when attributes contain typos.
+
+9) Persisting workers.properties
+=================================
+
+Make workers.properties persist from inside status worker.
+
+10) Reduce number of uses of time(NULL)
+=======================================
+
+We use time(NULL) a lot. Since it only has resolution of a second,
+I'm asking myself, if we could update the actual time in only a few
+places and get time out of some variables when needed. The same does
+not hold true for millisecond time, but in several cases we use the time,
+it's not very critical, that it is exact. These cases are related to:
+
+- last_access for usage against timeout value that is ~minutes
+- error_time for usage against retry timeout that is ~minutes
+- maintain_time for usage against transfer division interval, that is ~minutes
+- uri_worker_map checked for usage against JK_URIMAP_RELOAD=1 minute
+- check against worker_maintain_time which is ~minutes
+
+So I think, it would suffice to set an actual time at the beginning of
+the request/response cycle (used by everything before the request is being
+sent over the socket) and maybe after the response shows up/ an error occurs
+(for everything else, if there is).
+
+For which cases would it be OK, to use the time before sending to TC:
+- uri_worker_map "checked" (uri map lookup starts early)
+- maintain (starts in front of the request)
+- "now" inside retry_worker could be taken from the calling maintain
+- setting/testing last_access in
+  - jk_ajp_common.c:ajp_connect_to_endpoint()
+  - jk_ajp_common.c:ajp_get_endpoint()
+  - jk_ajp_common.c:ajp_maintain()
+
+What about the others:
+- setting last_access in init should use the actual time in
+  jk_ajp_common.c:ajp_create_endpoint_cache()
+
+- setting last_access again after the service could also use the 
+  actual time in jk_ajp_common.c:ajp_done()
+- setting error_time should better use the actual time
+  jk_lb_worker.c service(): rec->s->error_time = time(NULL);
+
+The last two cases could again use the same time, which then would be needed
+to be generated at the end or directly after service.
+
+11) Access/Modification Time in shm
+==================================
+
+a) [Discussion] What will this generally be used for? At the moment,
+only jk_status "uses" it, but it only sets the values, it never asks for them.
+
+b) [Improvement, minor] jk_shm_set_workers_time() implicitly calls
+jk_shm_sync_access_time(), but jk_status does:
+
+            jk_shm_set_workers_time(time(NULL));
+            /* Since we updated the config no need to reload
+             * on the next request
+             */
+            jk_shm_sync_access_time();
+
+two times. So depending on the idea of the functionality of these calls,
+either set_workers_time and sync_access_time should be independently,
+or the second call in jk_status coulkd be removed.
+
+12) "Destroy" functionality
+==========================
+
+[Hint] Destroy on a worker never seems to free shm,
+but I think that was already a flaw without shm.
+
+13) Locks against shm
+=====================
+
+It might be an intersting experiment to implement an improved locking structure.
+It looks like one would need in fact different types of locks.
+In shm we have as read/write information:
+
+Changed only by status worker:
+- redirect, domain, lb_factor, sticky_session, sticky_session_force,
+  recover_wait_time, retries, status (is_disabled, is_stopped).
+
+These changes need some kind of reconfiguration in the threads after
+change and before the next request/response. Since changes are rare,
+here we would be better of, with a simple detect change and copy from
+shm to process procedure. status updates the data in shm and after that
+the time stamp on the shh. Each process checks the time stamp before
+doing a request, and when the time stamp changed it does a writer CS
+lock and updates it's local copy. All threads always do a reader CS
+lock when doing a request/response cycle. Reader CS locks are concurrent,
+writers are exclusive. So readers are not allowed, when the config data is being updated.
+
+Changed by the threads themselves (and via reset by the status worker):
+- counters needed by routing decisions (lb_value, readed, transferred, busy)
+- timers needed by maintenance functions (error_time, servic_time/maintain_time)
+- status is_busy, in_error_state
+- uncritical data with no influence on routing decisions (max_busy, elected, errors,
+  in_recovering)
+
+Here again we could improve by using reader/writer locks. I have a
+tendency for the PESSIMISTIC side of locking, but I think we could
+shrink the code blocks that should be locked. At the monent they are
+pretty big (most of get_most_suitable_worker).
+
+Read-only: name and id.
+
+By the way: at several places we don't check for errors on getting the lock.
+
+14) What I didn't yet check
+===========================
+
+a) Correctness of is_busy handling
+
+b) Correctness of the reset values after reset by status worker
+
+c) What would be the exact behaviour, if shm does not work (memory case).
+   Will this be a critical failure, or will we only experience a
+   degradation in routing decisions.
+
+d) How complete is mod_proxy_ajp/mod_proxy_balancer.
+   Port changes from mod_jk to them.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org