| TODO for tomcat-connectors |
| |
| $Id$ |
| |
| 1) Optimize "distance" |
| ====================== |
| |
| Sorting the list of balanced workers by distance would be nice, but: |
| How to combine the sorting with the offset implementation (especially |
| useful for strategy BUSYNESS under low load). |
| |
| 2) Reduce number of string comparisons in most_suitable |
| ======================================================== |
| |
| a) redirect/domains |
| |
| It would be easy to improve the redirect string b an integer, giving the |
| index of the worker in the lb. Then lb would not need to search for the redirect worker. |
| |
| The same way, one could add a list with indizes to workers in the same domain. |
| Whenever domain names are managed (init and status worker update) one would |
| scan the worker list and update the index list. |
| |
| Finally one could have a list of workers, whose domain is the same as the redirect |
| attribute of the worker, because that's also something we consider. |
| |
| What I'm not sure about, even in the existing code, is the locking between updates |
| by the status worker and the process local information about the workers, |
| especially in the case, when status updates a redirect or domain attribute. |
| |
| I would like to keep these attributes and the new index arrays process local, |
| and the processes should find out about changes made by status to shm (redirect/domain) |
| and then rebuild their data. No need to get these on every request from the shm, |
| only the check for up-to-date should be made. |
| |
| b) exact matches for jvmRoutes |
| |
| Could we use hashes instead of string comparisons all the time? |
| I'm not sure, if a good enough hash takes longer than a string comparison though. |
| |
| 3) Optimization of JK_WORKER_USABLE |
| ==================================== |
| |
| We use that one quite a lot. Since it is now a combination of four |
| ANDs of negated values, we could also do |
| |
| #define JK_WORKER_USABLE(w) (!((w)->in_error_state || ($w)->is_stopped || (w)->is_disabled || (w)->is_busy)) |
| |
| I think it we should consider combining the flags into an additional |
| is_usable (makes checks easier, but of course we would have to set it |
| every time we change one of the four other flags). But in terms of |
| performance that happens rarely. |
| |
| 4) Code separation between factory, validate and init in lb |
| ============================================================ |
| |
| The factory contains: |
| |
| private_data->worker.retries = JK_RETRIES; |
| private_data->s->recover_wait_time = WAIT_BEFORE_RECOVER; |
| |
| I think, this should move to validate() or init(). |
| It might even be obsolete, because in init, we already have: |
| |
| pThis->retries = jk_get_worker_retries(props, p->s->name, |
| p->s->retries = pThis->retries; |
| p->s->recover_wait_time = jk_get_worker_recover_timeout(props, p->s->name, WAIT_BEFORE_RECOVER); |
| if (p->s->recover_wait_time < WAIT_BEFORE_RECOVER) |
| p->s->recover_wait_time = WAIT_BEFORE_RECOVER; |
| |
| Then: In validate there is |
| |
| p->lb_workers[i].s->error_time = 0; |
| |
| So shouldn't there also be |
| |
| p->lb_workers[i].s->maintain_time = time(NULL); |
| |
| 5) Refactor Logging |
| ==================== |
| |
| a) Use the same code files for the request logging functions in Apache 1.3 and 2.0. |
| |
| b) Use the same code files for piped logging in Apache 1.3 and 2.0. |
| |
| 6) ajpget |
| ========== |
| |
| Combine ajplib and Apache ab to an ajp13 commandline client ajpget. |
| |
| 7) Manage lb method and locking via jk_status |
| ============================================= |
| |
| It's not yet contained in the shm. |
| |
| 8) Parsing workers.properties |
| ============================= |
| |
| Parsing of workers.properties aditionally to just looking up attributes |
| would help users to detect syntax errors in the file. At the moment |
| no information will be logged, e.g. when attributes contain typos. |
| |
| 9) Persisting workers.properties |
| ================================ |
| |
| Make workers.properties persist from inside status worker. |
| |
| 10) Reduce number of uses of time(NULL) |
| ======================================= |
| |
| We use time(NULL) a lot. Since it only has resolution of a second, |
| I'm asking myself, if we could update the actual time in only a few |
| places and get time out of some variables when needed. The same does |
| not hold true for millisecond time, but in several cases we use the time, |
| it's not very critical, that it is exact. These cases are related to: |
| |
| Some of this is already been done, the remaining parts are: |
| |
| - last_access for usage against timeout value that is ~minutes |
| - error_time for usage against retry timeout that is ~minutes |
| - uri_worker_map checked for usage against JK_URIMAP_RELOAD=1 minute |
| |
| So I think, it would suffice to set an actual time at the beginning of |
| the request/response cycle (used by everything before the request is being |
| sent over the socket) and maybe after the response shows up/ an error occurs |
| (for everything else, if there is). |
| |
| For which cases would it be OK, to use the time before sending to TC: |
| - uri_worker_map "checked" (uri map lookup starts early) |
| - setting/testing last_access in |
| - jk_ajp_common.c:ajp_connect_to_endpoint() |
| - jk_ajp_common.c:ajp_get_endpoint() |
| - jk_ajp_common.c:ajp_maintain() |
| |
| What about the others: |
| - setting last_access in init should use the actual time in |
| jk_ajp_common.c:ajp_create_endpoint_cache() |
| |
| - setting last_access again after the service could also use the |
| actual time in jk_ajp_common.c:ajp_done() |
| - setting error_time should better use the actual time |
| jk_lb_worker.c service(): rec->s->error_time = time(NULL); |
| |
| The last two cases could again use the same time, which then would be needed |
| to be generated at the end or directly after service. |
| |
| 11) Access/Modification Time in shm |
| =================================== |
| |
| a) [Discussion] What will this generally be used for? At the moment, |
| only jk_status "uses" it, but it only sets the values, it never asks for them. |
| |
| b) [Improvement, minor] jk_shm_set_workers_time() implicitly calls |
| jk_shm_sync_access_time(), but jk_status does: |
| |
| jk_shm_set_workers_time(time(NULL)); |
| /* Since we updated the config no need to reload |
| * on the next request |
| */ |
| jk_shm_sync_access_time(); |
| |
| two times. So depending on the idea of the functionality of these calls, |
| either set_workers_time and sync_access_time should be independently, |
| or the second call in jk_status coulkd be removed. |
| |
| 12) "Destroy" functionality |
| =========================== |
| |
| [Hint] Destroy on a worker never seems to free shm, |
| but I think that was already a flaw without shm. |
| |
| 13) Locks against shm |
| ===================== |
| |
| It might be an intersting experiment to implement an improved locking structure. |
| It looks like one would need in fact different types of locks. |
| In shm we have as read/write information: |
| |
| Changed only by status worker: |
| - redirect, domain, lb_factor, sticky_session, sticky_session_force, |
| recover_wait_time, retries, status (is_disabled, is_stopped). |
| |
| These changes need some kind of reconfiguration in the threads after |
| change and before the next request/response. Since changes are rare, |
| here we would be better of, with a simple detect change and copy from |
| shm to process procedure. status updates the data in shm and after that |
| the time stamp on the shh. Each process checks the time stamp before |
| doing a request, and when the time stamp changed it does a writer CS |
| lock and updates it's local copy. All threads always do a reader CS |
| lock when doing a request/response cycle. Reader CS locks are concurrent, |
| writers are exclusive. So readers are not allowed, when the config data is being updated. |
| |
| Changed by the threads themselves (and via reset by the status worker): |
| - counters needed by routing decisions (lb_value, readed, transferred, busy) |
| - timers needed by maintenance functions (error_time, servic_time/maintain_time) |
| - status is_busy, in_error_state |
| - uncritical data with no influence on routing decisions (max_busy, elected, errors, |
| in_recovering) |
| |
| Here again we could improve by using reader/writer locks. I have a |
| tendency for the PESSIMISTIC side of locking, but I think we could |
| shrink the code blocks that should be locked. At the monent they are |
| pretty big (most of get_most_suitable_worker). |
| |
| Read-only: name and id. |
| |
| By the way: at several places we don't check for errors on getting the lock. |
| |
| 14) What I didn't yet check |
| =========================== |
| |
| a) Correctness of is_busy handling |
| |
| b) Correctness of the reset values after reset by status worker |
| |
| c) What would be the exact behaviour, if shm does not work (memory case). |
| Will this be a critical failure, or will we only experience a |
| degradation in routing decisions. |
| |
| d) How complete is mod_proxy_ajp/mod_proxy_balancer. |
| Port changes from mod_jk to them. |