Date: 2022-09-13
Accepted (lazy consensus).
Implemented.
To reduce memory allocation, Apache James buffers large emails into temporary files, both in SMTP and during email processing.
This means that every James entity relying on a Mail object needs resource management in place to prevent temporary file leaks. Otherwise, the result can be both unreasonable disk usage and file descriptor leaks.
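As an illustration of the resource management this requires, here is a minimal sketch of a message backed by a temporary file and released deterministically with try-with-resources. `TempFileMessage` and its methods are hypothetical names invented for this example; they are not the James Mail API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a message whose body was spilled to disk.
// This is NOT the James Mail API, only an illustration of the ownership rule.
final class TempFileMessage implements AutoCloseable {
    private final Path file;

    TempFileMessage(byte[] content) throws IOException {
        Path created = Files.createTempFile("mail-", ".eml");
        try {
            Files.write(created, content);
        } catch (IOException e) {
            // Do not leak the file if buffering fails half way.
            Files.deleteIfExists(created);
            throw e;
        }
        this.file = created;
    }

    Path file() {
        return file;
    }

    long size() throws IOException {
        return Files.size(file);
    }

    // Releasing the message deletes its temporary file.
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(file);
    }

    // Whoever creates the message owns it and must release it on every code path.
    static long process(byte[] content) throws IOException {
        try (TempFileMessage message = new TempFileMessage(content)) {
            return message.size();
        } // close() runs here, even if size() throws
    }
}
```

Any code path that creates or copies such an object without reaching `close()` leaves a file behind, which is the kind of leak described above.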
Core components were found to manage those resources badly: some mailets (bounce), the mail queue APIs and the mail repository APIs were causing temporary file leaks.
James allows users to define custom mailets, which can mismanage emails too. If insiders tend to make such errors, we should expect our users to make them as well.
This points toward the need for a systematic approach to detect and mitigate temporary file leaks.
Other libraries perform similar leak detection. One example is the Netty buffer library, which relies on phantom references to detect leaks. Phantom references allow the Java garbage collector to report recently collected objects. Whenever a new phantom reference is allocated, a check can be run on recently collected objects, and the related resources released if need be.
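The mechanism can be sketched as follows. This is a simplified illustration under stated assumptions, not the Netty or James implementation: `LeakDetectorSketch`, `Tracker`, `track` and `drain` are hypothetical names, and the "released" flag is assumed to be shared with the tracked resource so that a proper release can be told apart from a leak.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical names throughout: this is neither the James nor the Netty API.
final class LeakDetectorSketch {

    // Holds the release state and the cleanup action, never the resource itself.
    static final class Tracker extends PhantomReference<Object> {
        private final AtomicBoolean released;
        private final Runnable cleanup;

        Tracker(Object resource, ReferenceQueue<Object> queue,
                AtomicBoolean released, Runnable cleanup) {
            super(resource, queue);
            this.released = released;
            this.cleanup = cleanup;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Trackers must stay strongly reachable, or they would be collected
    // together with the resource and never enqueued.
    private final Set<Tracker> trackers = ConcurrentHashMap.newKeySet();

    // Registers a resource. The cleanup action must not capture the resource,
    // otherwise the resource stays reachable and is never reported.
    Tracker track(Object resource, AtomicBoolean released, Runnable cleanup) {
        drain(); // piggyback the check on each new allocation
        Tracker tracker = new Tracker(resource, queue, released, cleanup);
        trackers.add(tracker);
        return tracker;
    }

    // Polls references whose resource was collected; returns the number of leaks found.
    int drain() {
        int leaks = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Tracker tracker = (Tracker) ref;
            trackers.remove(tracker);
            // Collected without having been released: this is a leak, clean up now.
            if (tracker.released.compareAndSet(false, true)) {
                leaks++;
                tracker.cleanup.run();
            }
        }
        return leaks;
    }
}
```

In this sketch the resource sets the shared flag to true when it is released properly, so only objects collected while the flag is still false are counted as leaks. Collection timing is up to the garbage collector, so a leak is reported at some point after the object becomes unreachable, not immediately.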
Implement a leak detector “a la Netty” using phantom references.
Allow several modes:
Makes James core components leak-proof.
Allows users to safely write extensions involving emails (e.g. duplication or sending).
Performance impact of turning on “simple” leak detection (default behaviour) is expected not to be significant.
Similar functionality can be implemented more simply by relying on the Java finalize method, which is called upon garbage collection.
Yet this should not be done, as the use of ‘finalize’ causes many operational problems:
Mail management in the James test suite is poor, which leads to many false positives and prevents us from leveraging the leak detector as part of our test suite. Significant work would be needed to change this.