<div class="wiki-content maincontent">
<p>It'd be great to offer kick-ass support for streaming files of arbitrary size over ActiveMQ. The basic idea is to fragment the stream into multiple messages and send/receive those over JMS.</p>
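<p>As a very rough sketch of the fragmentation idea (the <code>streamId</code>, <code>fragmentNumber</code> and <code>endOfStream</code> header names are made up for illustration; no wire protocol has been agreed yet), the producer side might look something like this:</p>
<pre>
import java.io.InputStream;
import java.util.UUID;
import javax.jms.BytesMessage;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class StreamPublisher {
    private static final int FRAGMENT_SIZE = 64 * 1024; // 64KB per message

    /** Fragments an InputStream into a sequence of BytesMessages. */
    public static void publish(Session session, MessageProducer producer, InputStream in)
            throws Exception {
        String streamId = UUID.randomUUID().toString();
        byte[] buffer = new byte[FRAGMENT_SIZE];
        long fragmentNumber = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            BytesMessage msg = session.createBytesMessage();
            // Hypothetical headers - the real protocol is still to be designed.
            msg.setStringProperty("streamId", streamId);
            msg.setLongProperty("fragmentNumber", fragmentNumber++);
            msg.writeBytes(buffer, 0, read);
            producer.send(msg);
        }
        // An empty end-of-stream marker tells the consumer the stream is complete.
        BytesMessage eos = session.createBytesMessage();
        eos.setStringProperty("streamId", streamId);
        eos.setLongProperty("fragmentNumber", fragmentNumber);
        eos.setBooleanProperty("endOfStream", true);
        producer.send(eos);
    }
}
</pre>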
<p>There are a few issues to consider...</p>
<h3>Use cases</h3>
<ol><li>Many producers writing and only one consumer. Ideally each consumer should completely process one stream at a time (IO streams often use blocking IO, so the client thread can't do much with other streams while it's reading one).</li><li>If a consumer gets 1GB through a 10GB file and dies, we need to re-deliver the messages to another consumer.</li></ol>
<h3>Goal</h3>
<p>Our goal should be</p>
<ul><li>for each consumer to process a single stream in one go before trying to process another.</li></ul>
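<p>A minimal sketch of the consumer side, matching the made-up headers above: the consumer blocks on a single stream and drains fragments until the end-of-stream marker arrives, before it goes back for another stream.</p>
<pre>
import java.io.OutputStream;
import javax.jms.BytesMessage;
import javax.jms.MessageConsumer;

public class StreamReceiver {
    /** Drains one complete stream in one go, blocking until the end-of-stream marker. */
    public static void receive(MessageConsumer consumer, OutputStream out) throws Exception {
        byte[] buffer = new byte[64 * 1024];
        while (true) {
            BytesMessage msg = (BytesMessage) consumer.receive();
            // getBooleanProperty returns false when the property is absent.
            if (msg.getBooleanProperty("endOfStream")) {
                break;
            }
            int read;
            while ((read = msg.readBytes(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        out.flush();
    }
}
</pre>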
<h3>Possible Failure Conditions</h3>
<ul><li>Consumer could die in the middle of reading a stream. Recovery options:
<ol><li>Trash the rest of the stream</li><li>Restart delivery of the stream to the next consumer</li><li>Continue delivery of the stream to the next consumer at the point of failure.</li></ol>
</li><li>Producer could die in the middle of writing a stream. We may need to detect this failure. We could
<ol><li>Send the stream in a transaction. The stream is not dispatched to a consumer until it's fully received by the broker (see the transacted-send sketch after this list). Downside: the consumer sees high latency before it can receive the first message.</li><li>Consumer timeout: if a message is not received soon enough, the consumer assumes the producer is dead. (What if it's not??)</li></ol>
</li><li>Consumer could start to receive in the middle of a stream. This condition could happen:
<ol><li>if another consumer assumed the broker was dead (but it was not).</li><li>if a non-stream consumer accidentally removed messages, or messages were sent to the DLQ due to consumer rollbacks.</li></ol>
</li></ul>
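<p>A sketch of producer-failure option 1 above: send the fragments in a transacted session, so the broker only dispatches the stream once the whole thing has been committed. The <code>STREAMS.DEMO</code> queue name and the reuse of the earlier <code>StreamPublisher</code> sketch are illustrative assumptions:</p>
<pre>
import java.io.InputStream;
import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class TransactedStreamPublisher {
    /** The whole stream stays invisible to consumers until commit. */
    public static void publish(Connection connection, InputStream in) throws Exception {
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        try {
            MessageProducer producer = session.createProducer(session.createQueue("STREAMS.DEMO"));
            StreamPublisher.publish(session, producer, in); // fragment loop from the earlier sketch
            session.commit();   // the broker dispatches nothing until this succeeds
        } catch (Exception e) {
            session.rollback(); // a partial stream is discarded, never delivered
            throw e;
        } finally {
            session.close();
        }
    }
}
</pre>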
<h3>Implementation Issues</h3>
<ul><li>we can use message groups to ensure the same consumer processes all messages for a given stream (see the JMSXGroupID sketch after this list) - but unfortunately message groups do not prevent a consumer being overloaded with more than one message group at a time - maybe that's a new feature we can add?</li><li>avoid the broker running out of RAM - so spool to disk (or throttle the producer) if the only consumer is working on a different stream.
<ul><li>If messages are sent using transactions, major changes may need to be made to how we do transaction management and message journaling. Currently, all messages in an in-progress transaction are held in memory until it commits (synchronization callbacks are holding on to the messages). The journal currently holds on to all transacted messages in its journal + memory until commit; it does not allow the journal to roll over messages that are part of a transaction that is still in progress.</li></ul>
</li><li>given that the entire stream must be processed in one go, we can only ACK the entire stream of messages - so do we need to disable prefetch?
<ul><li>Prefetch disabling is not needed as the consumers send back a special kind of ack which only temporarily expands the prefetch window.</li></ul>
</li></ul>
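<p>A sketch of the message-groups idea: tagging every fragment with the standard <code>JMSXGroupID</code> property, using the stream id as the group key, pins the whole stream to a single consumer. The helper below is hypothetical; closing the group by sending a <code>JMSXGroupSeq</code> of -1 is ActiveMQ's documented way to release the group to another consumer.</p>
<pre>
import javax.jms.Message;

public class StreamGrouping {
    /** Tags a fragment so Message Groups route the whole stream to one consumer. */
    public static void tag(Message msg, String streamId, int seq) throws Exception {
        // All messages sharing a JMSXGroupID go to the same consumer.
        msg.setStringProperty("JMSXGroupID", streamId);
        // JMSXGroupSeq is 1-based; passing -1 closes the group so the
        // next stream with this id can be picked up by another consumer.
        msg.setIntProperty("JMSXGroupSeq", seq);
    }
}
</pre>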
</div>