To Do:
===========================================

Documentation:
===========================================
Why is Tribes unique compared to JGroups/Appia and other group communication protocols?

1. Uses NIO and TCP for guaranteed delivery and the ability to join large groups.

2. Guarantees messages in the following ways:
   a) TCP messaging, with a subsequent READ on the NIO channel to ensure the channel is not broken
   b) ACK messages from the receiver
   c) ACK after processing

3. The same (single) channel can handle all types of guarantees (a, b, c) at the same time,
   and can process both synchronous and asynchronous messaging.
   This is key to supporting different messaging requirements through
   the same channel, to save system resources.
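A minimal sketch of how per-message delivery guarantees can ride on a single channel as bit flags. The constant names and values below are illustrative only, not the actual Channel.SEND_OPTIONS_* constants:

```java
// Sketch: per-message delivery guarantees expressed as bit flags.
// Constant names/values are illustrative, not the real Channel.SEND_OPTIONS_* values.
public class SendOptions {
    public static final int OPT_USE_ACK          = 0x0001; // (b) receiver ACKs arrival
    public static final int OPT_SYNCHRONIZED_ACK = 0x0002; // (c) receiver ACKs after processing
    public static final int OPT_ASYNCHRONOUS     = 0x0004; // queue and return immediately

    public static boolean isSet(int options, int flag) {
        return (options & flag) == flag;
    }

    public static void main(String[] args) {
        // One message wants full processing guarantees, another is fire-and-forget;
        // both can travel over the same channel because the options ride with each message.
        int guaranteed = OPT_USE_ACK | OPT_SYNCHRONIZED_ACK;
        int fireAndForget = OPT_ASYNCHRONOUS;
        System.out.println(isSet(guaranteed, OPT_USE_ACK));    // true
        System.out.println(isSet(fireAndForget, OPT_USE_ACK)); // false
    }
}
```

Because the flags travel with the message rather than being fixed channel configuration, guarantee levels (a), (b), and (c) can coexist on one channel.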

4. For asynchronous messaging, errors are reported through an error handler callback.

5. Ability to send on multiple streams at the same time, in parallel, to improve performance.

6. Designed with replication in mind, where some pieces of data don't need to be totally ordered.

7. It is not built with the uniform group model in mind, but that can be accomplished using interceptors.

8. Future versions will have WAN membership and replication.
Bugs:
===========================================
a) Somehow the first NIO connection made always closes down - why?
c) RpcChannel - collect "no reply" replies, so that we don't have to time out.

Code Tasks:
===========================================
38. Make the AbstractReplicatedMap accept non-serializable elements, but simply not replicate them.

36. UDP sender and receiver, initially without flow control or guaranteed delivery.
    This can easily be done as an interceptor, and operate in parallel with the TCP sender.
    It can implement auto-detection, so that if the message is going to a destination on the same network
    path, it is sent over UDP.

35. The ability to share one channel amongst multiple processes.

34. Configurable payload for the membership heartbeat, so that the app can decide what to heartbeat,
    such as a JMX management port, ala Andy Piper's suggestion.

33. PerfectFDInterceptor - when a member is reported missing, first check the TCP path too.

32. Replicated JNDI entries in Tomcat, in the format
    cluster:<map name>/<entry key>, for example
    cluster:myapps/db/shared/dbinfo
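A rough sketch of parsing such a cluster: name. The task above does not pin down where the map name ends and the entry key begins, so this assumes the first path segment is the map name and the remainder is the entry key:

```java
// Sketch: parsing a "cluster:<map name>/<entry key>" JNDI-style name.
// Assumes the first path segment is the map name and the rest is the entry key;
// the task description does not fix the exact split, so this is one interpretation.
public class ClusterName {
    final String mapName;
    final String entryKey;

    ClusterName(String name) {
        if (!name.startsWith("cluster:")) {
            throw new IllegalArgumentException("Not a cluster name: " + name);
        }
        String rest = name.substring("cluster:".length());
        int slash = rest.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("Missing entry key: " + name);
        }
        this.mapName = rest.substring(0, slash);
        this.entryKey = rest.substring(slash + 1);
    }

    public static void main(String[] args) {
        ClusterName n = new ClusterName("cluster:myapps/db/shared/dbinfo");
        System.out.println(n.mapName);  // myapps
        System.out.println(n.entryKey); // db/shared/dbinfo
    }
}
```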

31. A layer on top of the GroupChannel, to allow multiple processes to share
    a channel to send/receive messages, to conserve system resources - this is way in the future.

30. CookieBasedReplicationMap - a very simple extension to the LazyReplicatedMap,
    but instead of randomly selecting a backup node and then publishing the PROXY to all
    the other nodes in the group, this will simply
    read/write a cookie for a backup location, so the nodes will
    never know until the request comes in.
    This is useful in extremely large clusters, and essentially reduces
    much of the network chatter. This task is dependent on task 25.
    Question to be answered: How does the map know of the cookie?
    Potential answer: Use a thread local and bind the request/response to it.

29. Thread pool - shrink dynamically.
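Dynamic shrinking can be sketched with the standard java.util.concurrent.ThreadPoolExecutor: with a SynchronousQueue the pool grows toward the maximum under load, idle threads die after the keep-alive time, and allowCoreThreadTimeOut lets even the core threads be reclaimed. A minimal sketch (factory name is hypothetical):

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: a pool that grows toward maxThreads under load (SynchronousQueue hands
// tasks straight to threads) and shrinks back when threads sit idle too long.
public class ShrinkingPool {
    public static ThreadPoolExecutor create(int minThreads, int maxThreads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minThreads, maxThreads,
                60L, TimeUnit.SECONDS,              // idle threads die after 60s
                new SynchronousQueue<Runnable>());  // no buffering: spawn up to max
        pool.allowCoreThreadTimeOut(true);          // let even the core threads shrink away
        return pool;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create(2, 8);
        System.out.println(pool.getCorePoolSize());    // 2
        System.out.println(pool.getMaximumPoolSize()); // 8
        pool.shutdown();
    }
}
```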

27. XmlConfigurator - read an XML file to configure the channel.

26. JNDIChannel - a way to bind the group channel in a JNDI tree,
    so that shared resources can access it.

25. Member.uniqueId - 16 bytes, unique per member, a UUID.
    Needed so as not to confuse a crashed member with a revived member on the same port.

23. TotalOrderInterceptor - fairly straightforward implementation.
    This interceptor depends on there being some sort of
    membership coordinator; see task 9.
    Once there is a coordinator in the group, the total order protocol is the same
    as the OrderInterceptor, except that it gets its message number from
    the coordinator prior to sending it out.
    The TotalOrderInterceptor will keep an order number per member;
    this way, ordering is kept intact when different messages are sent
    to different members, i.e.
    Message A - all members - total order (mbrA-2, mbrB-2, mbrC-2, mbrD-2)
    Message B - mbrC,mbrD only - total order (mbrA-2, mbrB-2, mbrC-3, mbrD-3)
    - The combination of member uniqueId and orderId is unique, nothing else;
      this way, if a member crashes, we don't hold the queue - instead we start over.
    - A TotalOrder token will contain the coordinator uniqueId as well.
    - One parameter should be "receive sequence timeout", in case the coordinator is not responding.
    - OPTION A)
      The coordinator doesn't forward the message,
      since the app would not receive the proper error message;
      instead the sequencer just returns the sequence, then the member itself sends the message.
      pros: the app will find out if the send failed/succeeded
      cons: if the send fails, the sequencer is out of sync for the failed member
      OPTION B)
      The coordinator receives the message, adds on the sequence number,
      then sends the message on behalf of the requesting members.
      pros: the sequencer is in charge of the sequence
      cons: the sequencer can become overloaded, since it has to handle all the traffic;
            the requesting member will not know if the message failed/succeeded
      OPTION C) Research papers on total order; better algorithms exist.
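The per-member counter bookkeeping in the example above (mbrC advancing to 3 while mbrA stays at 2) can be sketched as the sequencer the coordinator would run for OPTION A; the class and method names here are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: coordinator-side sequencer for OPTION A above.
// Each destination member has its own counter; a message addressed to a
// subset only advances the counters of that subset. Names are hypothetical.
public class TotalOrderSequencer {
    private final Map<String, Integer> counters = new HashMap<String, Integer>();

    // Returns the assigned order number for each destination of one message.
    public synchronized Map<String, Integer> nextSequence(String... destinations) {
        Map<String, Integer> assigned = new HashMap<String, Integer>();
        for (String member : destinations) {
            int next = counters.getOrDefault(member, 0) + 1;
            counters.put(member, next);
            assigned.put(member, next);
        }
        return assigned;
    }

    public static void main(String[] args) {
        TotalOrderSequencer seq = new TotalOrderSequencer();
        seq.nextSequence("mbrA", "mbrB", "mbrC", "mbrD"); // all counters now 1
        seq.nextSequence("mbrA", "mbrB", "mbrC", "mbrD"); // all counters now 2 (Message A)
        Map<String, Integer> b = seq.nextSequence("mbrC", "mbrD"); // Message B
        System.out.println(b.get("mbrC")); // 3 - mbrA and mbrB stay at 2
    }
}
```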

21. Implement a WAN membership layer, using a WANMbrInterceptor and a
    WAN router/forwarder (Tipi on top of a ManagedChannel).

20. Implement a TCP membership interceptor, for guaranteed functionality, not just discovery.

19. Implement a hardcoded TCP membership.

18. Implement SSL encryption over message transfers, BIO and NIO.

8. WaitForCompletionInterceptor - waits for the message to get processed by all receivers before returning.
   (This is useful when synchronized=false and waitForAck=false to improve
   parallel processing, but you want to have all messages sent in parallel and
   not return until all have been processed on the remote end.)
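The core of that wait-for-completion behavior can be sketched with a CountDownLatch: dispatch to all receivers in parallel, then block until every one reports that it has processed (not merely received) the message. The names here are hypothetical, and local Runnables stand in for remote receivers:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch: send to all receivers in parallel, then block until every receiver
// has *processed* the message, or the timeout expires. Names are hypothetical.
public class WaitForCompletion {
    public static boolean sendAndWait(Runnable[] receivers, long timeoutMs)
            throws InterruptedException {
        final CountDownLatch done = new CountDownLatch(receivers.length);
        for (final Runnable receiver : receivers) {
            new Thread(new Runnable() {
                public void run() {
                    receiver.run();   // stands in for the remote end processing the message
                    done.countDown(); // one receiver finished processing
                }
            }).start();
        }
        return done.await(timeoutMs, TimeUnit.MILLISECONDS); // false on timeout
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable noop = new Runnable() { public void run() {} };
        boolean allDone = sendAndWait(new Runnable[] { noop, noop, noop }, 1000);
        System.out.println(allDone); // true
    }
}
```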

9. CoordinatorInterceptor - manages the selection of a cluster coordinator.
   Just had a brilliant idea: if the GroupChannel keeps its own view of members,
   the coordinator interceptor can hold on to the member added/disappeared events.
   It can also intercept down-going messages if the coordinator disappeared,
   while a new coordinator is chosen.
   It can also intercept down-going messages for disappeared members that the
   calling app does not yet know about, to avoid a ChannelException.


10. Xa2PhaseCommitInterceptor - make sure the message doesn't reach the receiver unless all members got it.
11. Code a ReplicatedFileSystem example, package org.apache.catalina.tipis.

13. StateTransfer interceptor.
    The idea: the state transfer interceptor
    will hold all incoming messages until it has received a message
    with a STATE_TRANSFER header as the first of the bytes.
    Once it has received state, it will pretty much take itself out of the loop.
    The benefit of the new ParallelNioSender is that it doesn't require knowing about
    a member to transfer state; all it has to do is reply to a message that came in.
    State is a one-time deal for the entire channel, so a
    session replication cluster would transfer state as one block, not one per context.

14. Keepalive count and idle kill-off for NIO senders.

16. Guaranteed delivery of messages, i.e. either all get it or none get it.
    Meaning that all receivers get it, then wait for a process command,
    ala the Gossip protocol - this is fairly redundant with a Xa2PhaseCommitInterceptor,
    except it doesn't keep a transaction log.

17. Implement transactions - the ability to start a transaction, send several messages,
    and then commit the transaction.
Tasks Completed
===========================================
1. True synchronous/asynchronous replication enabled using the flags
   Sender.sendAck/Receiver.waitForAck/Receiver.synchronized
   Task Desc: waitForAck should only mean that we received the message, not that the
   message got processed. This should improve throughput, and an interceptor
   can do waitForCompletion.
   Status: Complete
   Notes:

2. Unique id - send it in a byte array instead of a string.

3. DataSender or ReplicationTransmitter swallows IOException; these exceptions should be
   propagated instead.
   Notes: This has only been fixed for the pooled synchronized sender; the fastasynch
   one isn't working that well.

4. ChannelMessage.getMessage should return a streamable, so that we can wrap it,
   pass it around and all those good things without having to copy byte arrays
   left and right.
   Notes: Instead of using a streamable, this is implemented using the XByteBuffer,
   which is very easy to use. It also becomes a single spot for optimizations.
   Ideally, there would be a pool of XByteBuffers that all use direct ByteBuffers
   for their data handling.
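The pooling idea in that note can be sketched as a simple free-list of direct ByteBuffers: borrow on demand, allocate lazily, and reuse returned buffers rather than reallocating. This is a simplified stand-in, not the actual XByteBuffer implementation:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: reuse direct ByteBuffers instead of allocating one per message.
// A simplified stand-in for a hypothetical XByteBuffer pool.
public class BufferPool {
    private final Deque<ByteBuffer> free = new ArrayDeque<ByteBuffer>();
    private final int capacity;

    public BufferPool(int bufferCapacity) {
        this.capacity = bufferCapacity;
    }

    public synchronized ByteBuffer borrow() {
        ByteBuffer buf = free.pollFirst();
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(capacity); // allocate lazily on demand
        }
        buf.clear(); // reset position/limit for the new user
        return buf;
    }

    public synchronized void release(ByteBuffer buf) {
        free.addFirst(buf); // most recently returned buffer is reused first
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(1024);
        ByteBuffer a = pool.borrow();
        pool.release(a);
        ByteBuffer b = pool.borrow();
        System.out.println(a == b); // true - the buffer was reused, not reallocated
    }
}
```

Direct buffers are expensive to allocate and are not freed promptly by GC, which is why pooling them, as the note suggests, pays off for high message rates.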

5. OrderInterceptor - guarantees the order of messages.
   Notes: completed

6. NIO and IO DataSender, since the IO one is blocking.
   Notes: completed. Works very well; suspect error logging has not been implemented.

7. FragmentationInterceptor - splits up messages that are larger than X bytes.
   Notes: completed

15. Remove DataSenderFactory and DataSender.properties -
    these cause the settings to be hard coded and not pluggable.
    Notes: Completed; now you can initialize a transport class.

12. LazyReplicatedHashMap - memory-efficient clustered map.
    This map can be used for PRIMARY/SECONDARY session replication.
    Ahh, the beauty of storing data in remote locations.
    The lazy hash map will only replicate its attribute names to all members in the group;
    along with each name, it will also replicate the source (where to get the object)
    and the backup member where it can find a backup if the source is gone.
    If the source disappears, the backup node will replicate the attributes that
    it stores to a new primary; backups can be chosen round robin.
    When a new member arrives and requests state, that member will get all the attribute
    names and the locations.
    It can replicate every X seconds, or on dirty flags set by the stored objects,
    or on a request to scan for dirty flags, or on a request with the objects.
    Notes: the map has been completed
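The lazy scheme above can be sketched as a map whose entries are either full data (on the primary and backup) or lightweight proxies that only record where the data lives. The class and field names here are hypothetical, not the real LazyReplicatedHashMap internals:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the lazy replication idea: every node knows every key, but only
// the primary and backup hold the value; everyone else holds a proxy entry
// recording where to fetch it. Names are hypothetical.
public class LazyMapSketch {
    static class Entry {
        final String primary; // member that owns the value
        final String backup;  // member holding the backup copy
        Object value;         // non-null only on the primary/backup nodes

        Entry(String primary, String backup, Object value) {
            this.primary = primary;
            this.backup = backup;
            this.value = value;
        }

        boolean isProxy() { return value == null; }
    }

    final Map<String, Entry> entries = new HashMap<String, Entry>();

    public static void main(String[] args) {
        // What a third (non-owning) node sees: the key and the locations only.
        LazyMapSketch node = new LazyMapSketch();
        node.entries.put("sessionX", new Entry("mbrA", "mbrB", null));
        Entry e = node.entries.get("sessionX");
        System.out.println(e.isProxy()); // true - value must be fetched from mbrA
        System.out.println(e.primary);   // mbrA
    }
}
```

This is what makes the map memory-efficient: a cluster of N nodes stores each value twice, not N times, at the cost of a remote fetch when a proxy node needs the data.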

22. sendAck and synchronized should not have to be XML config;
    they can be configured on a per-packet basis using ClusterData.getOptions().
    Notes: see the Channel.SEND_OPT_XXXX variables

28. The thread pool should have maxThreads and minThreads and grow dynamically.

24. MessageDispatchInterceptor - for asynchronous sending.
    - Looks at the options flag SEND_OPTIONS_ASYNCHRONOUS.
    - Has two modes:
      a) async parallel send - each message to all destinations before the next message
      b) async per-member - one thread per member using the FastAsyncQueue (good for groups with slow receivers)
    - Callback error handler - for when messages fail and the application wishes to be notified.
    - MUST HAVE A QUEUE SIZE LIMIT IN MB, to avoid OOM errors, or persist the queue.
    - MUST USE ClusterData.deepclone() to ensure thread safety if ClusterData objects get recycled.
    Notes: Simple implementation - one thread invokes all senders in parallel.
    Deep cloning is configurable as an optimization.
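The "queue size limit in MB" requirement above can be sketched as a dispatch queue that tracks its payload size in bytes and rejects new messages once a budget is exceeded, instead of growing until the JVM runs out of memory. Names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: an async dispatch queue bounded by total payload bytes rather than
// message count, so large messages cannot queue up into an OOM. Hypothetical names.
public class BoundedDispatchQueue {
    private final Deque<byte[]> queue = new ArrayDeque<byte[]>();
    private final long maxBytes;
    private long queuedBytes = 0;

    public BoundedDispatchQueue(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    public synchronized boolean offer(byte[] message) {
        if (queuedBytes + message.length > maxBytes) {
            return false; // reject; the caller can report this via the error handler callback
        }
        queue.addLast(message);
        queuedBytes += message.length;
        return true;
    }

    public synchronized byte[] poll() {
        byte[] message = queue.pollFirst();
        if (message != null) {
            queuedBytes -= message.length;
        }
        return message;
    }

    public static void main(String[] args) {
        BoundedDispatchQueue q = new BoundedDispatchQueue(10); // 10-byte budget
        System.out.println(q.offer(new byte[6])); // true
        System.out.println(q.offer(new byte[6])); // false - would exceed the budget
        q.poll();                                 // dispatching frees 6 bytes
        System.out.println(q.offer(new byte[6])); // true
    }
}
```

A byte budget (rather than a message-count limit) matters here because message sizes vary widely; persisting the queue, as the note suggests, would be the alternative to rejection.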

37. Interceptor.getOptionFlag() - lets the system configure a flag to be used
    for the interceptor. That way, not all constants have to be defined
    in Channel.SEND_FLAG_XXXX.
    Also, the GroupChannel will make a conflict check upon startup,
    so that there is no conflict. I will change options to a long,
    so that we can have 63 flags, hence up to 60 interceptors.
    Notes: Completed; options remained an int, so 31 flags.
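The startup conflict check described above boils down to verifying that no two interceptors claim overlapping bits in the options int. A minimal sketch (the real check lives in GroupChannel; this standalone version is illustrative):

```java
// Sketch of the startup conflict check: each interceptor claims an option
// flag, and overlapping bits are detected before the channel starts.
public class FlagConflictCheck {
    // Returns true if every flag uses distinct bits (no two flags overlap).
    public static boolean noConflicts(int[] interceptorFlags) {
        int claimed = 0;
        for (int flag : interceptorFlags) {
            if ((claimed & flag) != 0) {
                return false; // some bit is already claimed by another interceptor
            }
            claimed |= flag;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(noConflicts(new int[] { 0x1, 0x2, 0x4 })); // true
        System.out.println(noConflicts(new int[] { 0x1, 0x3 }));      // false - bit 0 claimed twice
    }
}
```

With options as a 32-bit int and the sign bit avoided, 31 distinct single-bit flags are available, matching the note above.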

b) State synchronization for the map - will need to add in MSG_INIT.
   Fixed map bug.