==============================
Next actions
==============================
- Add an MbeanFactory to generate a dynamic cluster at runtime.
Problem: How can we start those central services?
- StandardEngine should support loading MBeans from an external file.
- When a lot of sessions expire, this causes a burst of messages
- Every 60 sec, when ManagerBase#processExpires is called, a lot of messages are sent!
- Better to transfer one special expire message with an array of expired session messages.
- This reduces message transfer and reduces waits for acks.
- Complicated to implement: sessions also expire when isValid is called :-(
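A minimal sketch of the batching idea above. The class SessionsExpiredMessage and the map of last-access times are assumptions made for illustration; they stand in for the real manager and message classes and do not exist in the cluster module.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ExpireBatchSketch {

    /** Hypothetical message that carries every session ID expired in one pass. */
    static final class SessionsExpiredMessage implements Serializable {
        private static final long serialVersionUID = 1L;
        final String[] sessionIds;

        SessionsExpiredMessage(String[] sessionIds) {
            this.sessionIds = sessionIds;
        }
    }

    /**
     * Collects all sessions idle for at least maxInactiveMillis into ONE message,
     * so the sender transfers (and waits for an ack on) a single message per pass.
     * The map (session ID -> last access time in millis) is an assumption that
     * stands in for the manager's real session table.
     */
    static SessionsExpiredMessage collectExpired(Map<String, Long> lastAccessed,
                                                 long now, long maxInactiveMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastAccessed.entrySet()) {
            if (now - entry.getValue() >= maxInactiveMillis) {
                expired.add(entry.getKey());
            }
        }
        return new SessionsExpiredMessage(expired.toArray(new String[0]));
    }
}
```

A caller would skip sending when sessionIds is empty. The sketch does not cover sessions that expire lazily inside isValid, which is the hard part noted above.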
- Build a tool that receives the state and presents it as a simple web app.
Detect some problems:
different active sessions
long queues
long waits for acks
Idea: Write an Ant script with the new JMX tasks!
Optimized information access:
Create an MBean attribute that stores the complete cluster state
information inside one attribute of type TabularData and CompositeData!
Implement as a SimpleTcpCluster operation
Cluster <CompositeData>
-- Attributes <SimpleType>
-- ClusterMembership
-- Attributes <SimpleType>
-- Members <CompositeType>
-- Attributes <SimpleType>
-- ClusterReceiver
-- Attributes <SimpleType>
-- ClusterSender
-- Attributes <SimpleType>
-- DataSenders <TabularData>
-- Sender <CompositeType>
-- Attributes <SimpleType>
-- (optional)Queue Stats <CompositeData>
-- Attributes <SimpleType>
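A rough sketch of how part of the tree above could be built with the standard javax.management.openmbean classes. Only the Cluster and DataSenders levels are shown. The item names (address, port, queueSize, clusterName) are assumptions chosen for illustration, not attributes taken from SimpleTcpCluster.

```java
import javax.management.openmbean.CompositeData;
import javax.management.openmbean.CompositeDataSupport;
import javax.management.openmbean.CompositeType;
import javax.management.openmbean.OpenDataException;
import javax.management.openmbean.OpenType;
import javax.management.openmbean.SimpleType;
import javax.management.openmbean.TabularDataSupport;
import javax.management.openmbean.TabularType;

final class ClusterStateSketch {

    // Hypothetical item names for one "Sender" row.
    private static final String[] SENDER_ITEMS = {"address", "port", "queueSize"};

    /** Row type of the DataSenders table: simple attributes only. */
    static CompositeType senderType() {
        try {
            return new CompositeType("Sender", "State of one data sender",
                    SENDER_ITEMS,
                    new String[] {"Target address", "Target port", "Queued messages"},
                    new OpenType<?>[] {SimpleType.STRING, SimpleType.INTEGER, SimpleType.INTEGER});
        } catch (OpenDataException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds one Sender row. */
    static CompositeData sender(String address, int port, int queueSize) {
        try {
            return new CompositeDataSupport(senderType(), SENDER_ITEMS,
                    new Object[] {address, port, queueSize});
        } catch (OpenDataException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds the top-level Cluster value with a nested DataSenders table. */
    static CompositeData clusterState(String clusterName, CompositeData... senders) {
        try {
            // Rows are indexed by address and port.
            TabularType tableType = new TabularType("DataSenders", "All data senders",
                    senderType(), new String[] {"address", "port"});
            TabularDataSupport table = new TabularDataSupport(tableType);
            for (CompositeData row : senders) {
                table.put(row);
            }
            String[] items = {"clusterName", "dataSenders"};
            CompositeType clusterType = new CompositeType("Cluster", "Compact cluster state",
                    items,
                    new String[] {"Cluster name", "One row per data sender"},
                    new OpenType<?>[] {SimpleType.STRING, tableType});
            return new CompositeDataSupport(clusterType, items,
                    new Object[] {clusterName, table});
        } catch (OpenDataException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A JMX client can then read the whole state with a single attribute or operation call and navigate it with CompositeData.get and TabularData.get.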
- add cluster setup template (src)
- documentation
write a completely new how-to
add example configurations
add complete attribute descriptions
add JMX information
add deployment help
Need help (pero)
- implement fragmentation of large replication objects
Compress at message level
Split messages a la FarmDeployer WAR handling
- add a message type to the message header.
- filtering at the receiver that drops a message before the Object is built
- define a short type definition.
- Every session message with a different type
- type as String
- add test cluster project
functional testing of a lot of scenarios
different replication modes
restart after failure
crash under load
compressed and uncompressed
JUnit test (started)
automated regression testing with some standard configs
write a test client that can get a JMX state to verify the different cluster test scenarios
- direct JSR 160 client - preferred option
- via MX4J HTTP adaptor (XML protocol)
- set up different scenarios
use new remote JMX Ant tasks to grab information from MBeans
- filter all cluster messages
- Filter so that not every message is sent to every member.
- Now, with domain mode registration, session messages are only sent to
members of the same domain.
- set up a Cluster LifecycleListener to send the cluster status! (monitoring)
- Make the compact state available via the JMX API first.
- set up a cluster Listener that sends its own state to a special member (the sender of a message like GET_ALL_SESSIONS)
- create a cluster status app (html/xml)
- attributes
- send message
- current members
- receive message
- active sessions at the different clustered managers
- which apps are clustered (registered managers)
- avg processing times
- operations
- resetStats
- send a message (String)
- getDisplay state from other members??
- stop queue to send
- queue received messages from some members (send later, after redeploying)
- configure some send parameters
- keepalive
- compress
- wait for ack
- other things
- based on Cluster JMX API
- watch some values from complete cluster and display some graphs
- display the information from all nodes
- display information from other cluster domains
via XML documents and http
- display stats as xml
- operation via JMX (MX4J adaptor)
================
problems
================
- MemoryUser principal from UserDatabaseRealm is not handled for replication
- look inside DeltaRequest.setPrincipal(Principal,boolean)
detected by Dirk de Kok (tomdev 16.8.2005)
- only GenericPrincipal from all other realms is handled well.
- We do not set SimpleTcpCluster properties when the element exists inside the config.
The element must have all properties!! - Note this inside the docs!!
- How can we stop the request traffic when restarting an application?
Currently jk 1.2.10 can only disable the complete loadbalancer,
but this only affects the decision for new session requests.
Requests marked with a session are still sent to Tomcat.
Fix: jk > 1.2.11 has a stopped flag, but then all applications stop receiving traffic
and session transfer from other nodes is not stopped!!
- Can't stop message replication for a special member and application
- this needs a special cluster message and a send filter at SimpleTcpCluster
- Don't generate cluster messages when no other member is in the cluster!
- Register DeltaManager as Cluster LifecycleListener and stop creating and sending
- Reduce memory consumption when only the local node is active
- Important feature when nodes have crashed and only one server remains under load...
==============================
Nice to have:
==============================
- Replication ContextAttributes
- Cluster config at engine level (user request 06/05)
Register a cluster infrastructure for many vhosts
configure backup systems!
Add Cluster element to digester
- Configure the McastService to not accept every member.
- receive a secret key
- have an allow or deny list like RequestFilterValve
- Also, the receiver doesn't accept requests from members that are not allowed
- ReplicationListener and SocketReplicationListener only accept data from cluster members (low-level IP restriction)
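One possible shape for the allow/deny check above, as a sketch only. The class name and the rule order (deny wins, then allow, and no configuration means accept) are assumptions made for this example and are not taken from RequestFilterValve or McastService.

```java
import java.util.regex.Pattern;

final class MemberFilterSketch {

    private final Pattern allow; // null means "no allow list configured"
    private final Pattern deny;  // null means "no deny list configured"

    MemberFilterSketch(String allowRegex, String denyRegex) {
        this.allow = allowRegex == null ? null : Pattern.compile(allowRegex);
        this.deny = denyRegex == null ? null : Pattern.compile(denyRegex);
    }

    /**
     * Assumed rule order: a deny match always rejects; otherwise an allow list,
     * if present, must match; with neither list configured every member passes.
     */
    boolean accepts(String memberAddress) {
        if (deny != null && deny.matcher(memberAddress).matches()) {
            return false;
        }
        if (allow != null) {
            return allow.matcher(memberAddress).matches();
        }
        return true;
    }
}
```

The same check could run both when a membership ping arrives and when the receiver accepts a socket, which would cover the two cases listed above.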
- PooledSocketSender
Add more stats
check all Pooled sender checkKeepAlive
- Implement a NonSerializable interface for session attributes that should
not be replicated
Then we must have ClusterNonSerializable in the common classloader
- Extend StandardSession if possible
- Implement primary/secondary replication logic
Now we have a domain sending mode, but we can send a broadcast when
the local node has no backup.
Wait for a period of time, then find a backup
- Implement context attribute replication (?)
pero:
Also send Start/Stop messages from Context to the complete cluster!
With 5.5.10 you can write a Cluster Lifecycle Manager that does this.
Register for AFTER_MANAGERREGISTER_EVENT and AFTER_MANAGERUNREGISTER_EVENT
and also for the context ServletContextAttributeEvent listener
Access the Context
((Manager)event.getData()).getContainer()
- Fix farm deployment for 5.5
pero:
On every start, all applications are deployed only to the cluster nodes that are currently running
Newly registered nodes don't get the applications!
Deploy must send a GET_ALL_APP to all other Deployers.
FH: Correction, you should not send GET_ALL_APP to all deployers, only
to the main one. Which could be another property of the Member object, it
would not make sense to transfer the same webapp over and over again.
Only the watchEnabled Deployer sends this member all deployed applications.
pero:
Yes, very true, but currently we also distribute all WARs from the watch node at the beginning.
Waiting to start the other nodes is the only chance to not get these WARs!
I have made some experiments with registering WAR deployment when a new member is added to the cluster.
Add JMX Support
Resend Deployed Applications to all or one cluster node.
Show all watched resources
Processing time
Change FileMessage buffer size.
Start/Stop applications cluster-wide
Deployer and Watcher sync with the engine background thread!
Fixed!
Last FileMessage fragment needs a longer ackTimeout
<Cluster ..> <Sender ... ackTimeout="60000"/> </Cluster>
- Change the cluster protocol so that developers can add their own data serialization/deserialization format (high risk)
Currently
header 7 bytes (FLT2002)
compressflag 4 bytes
data.length 4 bytes
data,
end header 7 bytes (TLF2003)
Optimized to
header 2 bytes (TC)
type 1 byte
compressflag 1 byte
data.length 4 bytes,
data | <real uncompressed data.length (4 bytes)> data
"type" means a user-defined type; the receiver extracts the bytes and the type and sends them to a callback
see ObjectReader or SocketObjectReader
compress 1
the first 4 data bytes are the real uncompressed data length. (This is for better memory management at the receiver side, see XByteBuffer)
change DataSender.writeData and XByteBuffer, and add flexible handling to ClusterSender and ClusterReceiver
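The optimized layout above could be encoded and decoded roughly as follows. This is a sketch of the proposed format only: the class and method names are hypothetical, big-endian byte order is an assumption, and the compression step itself is left out (only the flag is carried).

```java
import java.nio.ByteBuffer;

final class CompactFrame {

    static final byte[] MAGIC = {'T', 'C'};  // 2-byte header
    static final int HEADER_LENGTH = 8;      // magic(2) + type(1) + compress(1) + length(4)

    final byte type;          // user-defined message type
    final boolean compressed; // compress flag from the header
    final byte[] data;        // payload exactly as it was on the wire

    CompactFrame(byte type, boolean compressed, byte[] data) {
        this.type = type;
        this.compressed = compressed;
        this.data = data;
    }

    /** Writes header and payload into one array (ByteBuffer defaults to big-endian). */
    static byte[] encode(byte type, boolean compressed, byte[] data) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + data.length);
        buf.put(MAGIC).put(type).put((byte) (compressed ? 1 : 0));
        buf.putInt(data.length).put(data);
        return buf.array();
    }

    /** Reads only the type byte, so a receiver can drop a frame before building any object. */
    static byte peekType(byte[] frame) {
        checkMagic(frame);
        return frame[2];
    }

    static CompactFrame decode(byte[] frame) {
        checkMagic(frame);
        ByteBuffer buf = ByteBuffer.wrap(frame, 2, frame.length - 2);
        byte type = buf.get();
        boolean compressed = buf.get() == 1;
        int length = buf.getInt();
        if (length != buf.remaining()) {
            throw new IllegalArgumentException("length field does not match payload size");
        }
        byte[] data = new byte[length];
        buf.get(data);
        return new CompactFrame(type, compressed, data);
    }

    private static void checkMagic(byte[] frame) {
        if (frame.length < HEADER_LENGTH || frame[0] != MAGIC[0] || frame[1] != MAGIC[1]) {
            throw new IllegalArgumentException("not a TC frame");
        }
    }
}
```

With the compress flag set, the first 4 bytes of data would hold the uncompressed length, so the receiver can size its buffer before inflating. That step is not shown here.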
- Add single sign on support
==============================
COMPLETED
==============================
5.5.10 (pero)
- add a sender mapping properties file (IDataSenderFactory)
- lets advanced users implement their own sender mode more easily
- Can we register different applications with the same name from different hosts?
SimpleTcpManager registers the manager with app name + hostname when the Cluster is configured as an Engine element.
- Configured DeltaManager inside context
- SimpleTcpCluster setProperty and transferproperty reflect changes only to defaultMode managers
- Look inside SimpleTcpCluster.addManager and DeltaManager.start?
- Session serialization eats memory, but now we can send session messages in blocks...
When all sessions are serialized after GET_ALL_SESSION is received, the following happens:
- find all sessions
- serialize a block or all sessions as a byte array
- serialize the complete SessionMessageImpl to the transfer message
- WaitForAck mode and resend problem
- Now the message creator can configure resend and compress mode!
- Add a default simple cluster config with good defaults and only
one cluster element inside server.xml. Setup with fastasyncmode.
Service Elements
ReplicationTransmitter,
SocketReplicationListener,
McastService,
ClusterSessionListener
ReplicationValve
You can change property settings with the SimpleTcpCluster prefixes "sender.XXX, receiver.XXX, valve.XXX, listener.XXX, service.XXX"
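To illustrate the prefix convention above, here is a small sketch that groups prefixed property names by target element. It is not the SimpleTcpCluster implementation. The class and method names are hypothetical, and only the five prefixes named above are recognized.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PrefixedPropertiesSketch {

    // The five prefixes named in the note above.
    static final List<String> PREFIXES =
            Arrays.asList("sender", "receiver", "valve", "listener", "service");

    /**
     * Splits "prefix.name" keys into one map per prefix, for example
     * "sender.ackTimeout" -> group "sender", key "ackTimeout".
     * Keys without a known prefix are ignored by this sketch.
     */
    static Map<String, Map<String, String>> groupByPrefix(Map<String, String> properties) {
        Map<String, Map<String, String>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            String name = entry.getKey();
            int dot = name.indexOf('.');
            String prefix = dot > 0 ? name.substring(0, dot) : "";
            if (!PREFIXES.contains(prefix)) {
                continue; // not addressed to one of the service elements
            }
            grouped.computeIfAbsent(prefix, key -> new LinkedHashMap<>())
                   .put(name.substring(dot + 1), entry.getValue());
        }
        return grouped;
    }
}
```

Each resulting group could then be applied to the matching element, for example through its setter methods.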
- Fix resend of GET_ALL_SESSIONS when wait ACK failed at receiver side
- Fix that ClusterValve is not removed when the cluster stops
- Set timestamp only the first time inside SessionMessageImpl
- Set timestamp from findsessions when handling GET_ALL_SESSION
- Set this timestamp on all SEND_SESSION_DATA and TRANSFER Complete messages
- Drop all received messages inside the GET_ALL_SESSION message queue (DeltaManager)
- Mcast Service as JMX MBean (change cluster domain at runtime)
- send cluster domain with mcast ping
- With sendClusterDomainOnly=true only session messages from the same domain are received
- Sessions are only replicated to members of the same domain; with sendClusterDomainOnly=false
at the Sender (ReplicationTransmitter), session messages are sent to all members.
- GET ALL Session is sent to the first member inside the same cluster domain
- better restart scenario at DeltaManager after failure restart (java service wrapper).
queue all other session events
when STATE Transfer Complete is received, dequeue all received session messages.
- restructure methods at DeltaManager
- extract handleXXX methods for better DeltaManager subclassing.
- split big get all sessions from one server into blocks of sessions and separate STATE Transfer message!
- no complete sync sessions when GET ALL Sessions event is received.
- add JMX API for ClusterReceiver
- ClusterReceiver is now the callback when a message is received
- SimpleTcpCluster only receives ClusterMessage (API change)
Redesign SimpleTcpCluster message receiving to ClusterReceiverBase:
- optimized data uncompression
- better extensibility
- XByteBuffer only buffers bytes and doesn't uncompress.
- Add receiver JMX stats with new attribute 'Receiver.doReceivedProcessingStats'
- optimized createManager and addManager so that a normal StandardManager can also be configured
to use cluster message transfer without replication.
- add support for dynamic property transfer from SimpleTcpCluster to the Manager
like ReplicationTransmitter
All manager attributes can be configured:
- expireSessionsOnShutdown (false)
- notifySessionListenersOnReplication (true)
- notifyListenersOnReplication (true)
- maxActiveSessions (-1)
- timeoutAllSession (60 sec)
- sessionIdLength (16)
- processExpiresFrequency (6 - expire on every sixth engine background periodic event (60 sec))
- algorithm (MD5)
- entropy
- randomFile
- randomClass
- Setup the cluster without SessionReplication Manager
Only a message bus.
Configure the bus with your ClusterListener and valves and a StandardManager
- reduce CPU and memory consumption (Receiver)
- set new compress sender flag to default=false ( < CPU usage)
- Make compact algo
currently message receive data is split at XByteBuffer#extractPackage and
SimpleTcpCluster#messageDataReceived
- reduce memory and CPU consumption (send message)
- set new compress sender flag to default=false ( < CPU usage)
- don't copy the buffer to add the message header
transfer this from SimpleTcpCluster to DataSender pushMessage
successfully refactored
- make it possible for a subclass to encrypt the transferred messages
subclass ReplicationTransmitter and override createMessageData
- don't copy START and END header for every message; instead send them directly in DataSender.writeData.
- Add a flag for replicated attribute events, to enable or disable them
Can now be configured with notifyListenersOnReplication=false at SimpleTCPCluster
HttpSessionListener events can also be dropped;
configure with notifySessionListenersOnReplication=false at SimpleTCPCluster
- Refactoring DeltaManager
- Transfer attributes from Cluster config to DeltaManager
- Fix the 5.5.9 cluster hang bug!
- Add more Valve to direct cluster config
- Add Lifecycle Listener support to direct cluster config
- Add ClusterListener support to direct cluster config
- Add new SocketReplicationListener
- Add Stats to DeltaManager
5.5.9 (pero)
- JMX friendly
pero: Add some MBean support to SimpleTCPCluster, ReplicationTransmitter and Senders
- Add Keep Alive and WaitForAck to the async mode implementation.
Make this feature configurable on the Sender element in server.xml
Included with 5.5.8
- Add support for the new async mode from Rainer Jung
Integrated with 5.5.9
fastasyncqueue mode