 $Id: $


                           Commons Digester Package
                                 Version 2.0 alpha
                                Release Notes


 INTRODUCTION:
 =============

 Release 2.0 of the Apache Jakarta Commons Digester package is a significant
 rewrite of the original package. All the fundamental concepts remain the same,
 but the APIs have been redesigned based on the lessons learnt from the 1.x
 series of releases.

 IMPORTANT NOTES
 ===============


 Dependencies
 ------------
 The 2.0 Digester release requires:
    Commons Logging 1.0.x and Commons BeanUtils 1.7

 MAJOR CHANGES SINCE 1.x
 =======================

 This section is intended for the use of those familiar with the 1.x releases
 of this product. There are many changes, but those listed below are the
 most significant. Mostly, this information is restricted to listing changes
 in *functionality*; only a few implementation-level changes are listed here.

 Versioning
 ----------
 At the current time, the new code uses the package name
   org.apache.commons.digester2.*
 There will no doubt be debate over whether this is a good idea, or whether
 the original
   org.apache.commons.digester.*
 package names should be used.

 General principles
 ------------------
 * Protected members are not used for classes in the o.a.c.digester2 package.
   Instead, members are private, and protected setter/getter methods are provided
   where needed. This makes it easier in future to change classes without
   breaking existing subclasses that have been defined by users of the Digester
   classes.
 * It is still undecided whether concrete Action classes should follow the above
   approach or use protected members.

 Renamed/repackaged classes
 --------------------------
 * Rule --> Action
   The term "rule" has confused a number of people over the years. The new
   and hopefully clearer term "action" is used instead. The word "rule" is
   now used only to refer to a (pattern, action) pair, which is more intuitive.

 * Rules --> RuleManager
   Using the word "Rules" to mean *not* a collection of Rule objects, but
   instead the pattern-matching engine that happens to *contain* a collection
   of Rule objects, was always confusing.

 * RulesBase --> DefaultRuleManager
   This should speak for itself.

 * All the basic action classes (formerly Rule classes) now reside in the
   o.a.c.digester2.actions package.

 * Renamed actions:
    NodeCreateRule --> CreateNodeAction
    ObjectCreateRule --> CreateObjectAction
    FactoryCreateRule --> CreateObjectWithFactoryAction
    ObjectCreationFactory --> ObjectFactory
    AbstractObjectCreationFactory --> AbstractObjectFactory

 Digester class
 --------------
 * Digester has been split into:
    * Digester
    * SAXHandler
    * Context
    * ActionFactory

   The old Digester interface had a huge number of methods. Many of these
   existed only because Digester also implemented the interfaces necessary to:
   (a) handle the SAX parser callbacks,
   (b) let the Rule (now Action) classes store data on it during the
       parse (the object stack etc), and
   (c) conveniently create Rule (now Action) instances.

   These pieces of functionality have now been split out into separate
   classes, so:
    * Digester now contains only the basic methods that users of the
      library need to interact with.
    * SAXHandler handles the callbacks from the parser.
    * Context holds the object stack, current match path, and related data.
    * ActionFactory provides the factory methods to conveniently create,
      configure and add Action objects to a Digester or RuleManager. Moving
      this functionality out of the Digester object also allows the Digester
      class to be distributed with a subset (including none) of the default
      Action classes if desired.

   Note that because parsing state is stored on the Context object now, it
   is easier to implement the often-requested feature of being able to parse
   multiple xml documents with the same Digester instance.

 Namespace-aware parsing
 -----------------------
 The Digester now *always* uses a namespace-aware xml parser.
 The DefaultRuleManager patterns properly support namespaces, eg
   /ns1:foo/ns2:bar/baz
 where the URIs that ns1 and ns2 correspond to have been defined via
 earlier calls to method DefaultRuleManager.addNamespace(prefix, uri).
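
 As a rough illustration of how prefixes in a pattern could be resolved, the
 sketch below expands each prefix into its URI. The NamespacePatterns class,
 its expand method and the {uri}name output form are hypothetical and are not
 taken from the DefaultRuleManager implementation; only the idea of
 registering (prefix, uri) pairs first comes from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Digester: expands prefixed pattern
// segments such as "ns1:foo" into "{uri}foo" using registered prefixes.
class NamespacePatterns {
    private final Map<String, String> prefixToUri = new HashMap<>();

    // Plays the role that addNamespace(prefix, uri) plays in the notes above.
    void addNamespace(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
    }

    String expand(String pattern) {
        StringBuilder out = new StringBuilder();
        String[] segments = pattern.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            String segment = segments[i];
            int colon = segment.indexOf(':');
            if (colon < 0) {
                // No prefix: the segment is copied through unchanged.
                out.append(segment);
                continue;
            }
            String prefix = segment.substring(0, colon);
            String uri = prefixToUri.get(prefix);
            if (uri == null) {
                throw new IllegalArgumentException("Undeclared prefix: " + prefix);
            }
            out.append('{').append(uri).append('}')
               .append(segment.substring(colon + 1));
        }
        return out.toString();
    }
}
```

 Failing fast on an undeclared prefix is a design choice of this sketch; it
 surfaces a forgotten addNamespace call when the pattern is registered
 rather than as a silent non-match during the parse.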

 Entity Resolution
 -----------------
 The basic functionality previously provided for entity resolution has been
 improved.
 * By default any attempt to access an external entity which has not
   been explicitly mapped to some (presumably local) resource is regarded as a
   fatal error. See setAllowUnknownExternalEntities.
 * External DTDs can be ignored. Yes, this has dangers, but sometimes it is
   necessary. See setIgnoreExternalDTD.
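
 The default behaviour described above can be approximated with a plain SAX
 EntityResolver. This is only a sketch: StrictEntityResolver and its register
 method are hypothetical names, and the way Digester wires its own resolver
 internally is not shown here.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical resolver: only explicitly mapped system ids may be loaded.
class StrictEntityResolver implements EntityResolver {
    private final Map<String, String> knownEntities = new HashMap<>();

    // Maps a system id to local content (held in memory for this sketch).
    void register(String systemId, String localContent) {
        knownEntities.put(systemId, localContent);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException {
        String content = knownEntities.get(systemId);
        if (content == null) {
            // Unmapped external entity: treat it as a fatal problem.
            throw new SAXException("Unknown external entity: " + systemId);
        }
        InputSource source = new InputSource(new StringReader(content));
        source.setSystemId(systemId);
        return source;
    }
}
```

 Such a resolver would be installed on an XMLReader with setEntityResolver.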

 DefaultRuleManager
 ------------------
 The DefaultRuleManager (formerly RulesBase) now uses a more xpath-like
 syntax for its patterns. It still isn't full xpath support, just a little
 closer for general consistency. In particular, a leading slash is required
 on absolute paths. A pattern with no leading slash is a relative path, and
 is equivalent to the old "*/" prefix.
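
 The absolute/relative distinction can be pictured with the sketch below.
 PatternMatcher is a hypothetical class, and the matching logic is a
 deliberate simplification (exact match for absolute patterns, suffix match
 for relative ones); the real DefaultRuleManager may behave differently in
 edge cases.

```java
// Hypothetical matcher illustrating absolute versus relative patterns.
// Paths are assumed to always start with '/', e.g. "/root/child".
class PatternMatcher {
    static boolean matches(String pattern, String path) {
        if (pattern.startsWith("/")) {
            // Absolute pattern: must describe the whole path.
            return path.equals(pattern);
        }
        // Relative pattern: matches when the path ends with it on an
        // element boundary, like the old "*/" prefix.
        return path.endsWith("/" + pattern);
    }
}
```

 For example, the relative pattern "a/b" matches the path "/x/a/b", while the
 absolute pattern "/a/b" does not.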

 Action (formerly Rule) API changes
 ----------------------------------
 * Action is an interface. The AbstractAction class has been defined and is
   the recommended base for all custom actions.
 * Action classes no longer have a "digester" member pointing to their "owner".
   Instead, the begin/body/end methods are always passed a Context object that
   allows them to access the object stack etc.
 * Action classes are required to avoid modification of any member variable
   during parsing (ie from their begin/body/end methods). All data must instead
   be stored on the provided Context object. This effectively makes an Action
   instance both re-entrant and thread-safe.
 * The two requirements above mean that an Action instance can now be used
   concurrently by multiple Digester instances (eg in a pool).
 * Deprecated methods have been removed.
 * Actions get "bodySegment" callbacks when their content is mixed
   text and child elements. This allows Actions to process XHTML-style
   markup input more easily.
 * Actions get a new "beginParse" callback when startDocument occurs.
 * The finish method has been renamed to finishParse.
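
 To make the "no state on the Action" requirement concrete, here is a small
 sketch. SketchContext, SketchAction and CollectTextAction are hypothetical,
 simplified stand-ins; the real Context and Action signatures are not
 reproduced here and may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for Context: it owns all per-parse state (an object
// stack plus a scratch area keyed by the action that stored the data).
class SketchContext {
    private final Deque<Object> stack = new ArrayDeque<>();
    private final Map<Object, Object> scratch = new IdentityHashMap<>();

    void push(Object value) { stack.push(value); }
    Object peek() { return stack.peek(); }
    void put(Object key, Object value) { scratch.put(key, value); }
    Object get(Object key) { return scratch.get(key); }
}

// Hypothetical callback interface; every callback receives the context.
interface SketchAction {
    void begin(SketchContext context);
    void body(SketchContext context, String text);
    void end(SketchContext context);
}

// Has no fields, so nothing on the action itself changes during a parse.
class CollectTextAction implements SketchAction {
    @Override
    public void begin(SketchContext context) {
        context.put(this, new StringBuilder());
    }

    @Override
    public void body(SketchContext context, String text) {
        ((StringBuilder) context.get(this)).append(text);
    }

    @Override
    public void end(SketchContext context) {
        // Publish the collected text on the object stack.
        context.push(context.get(this).toString());
    }
}
```

 Because each parse has its own context, a single CollectTextAction instance
 can serve two parses at once without their text being mixed together.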

 SetPropertiesAction
 -------------------
 * The option now exists to specify the custom attr->property mapping via a
   Map parameter, not just a pair of String arrays. This is much nicer.
 * Hyphenated xml attribute names are now automatically mapped to camelCase,
   eg some-attr="1" causes a call to setSomeAttr("1").
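
 The hyphen-to-camelCase mapping could look roughly like this. AttributeNames
 and its two methods are hypothetical names for this sketch and are not the
 SetPropertiesAction code itself.

```java
// Hypothetical helper showing one way to map "some-attr" to "someAttr"
// and then to the setter name "setSomeAttr".
class AttributeNames {
    static String toPropertyName(String attributeName) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : attributeName.toCharArray()) {
            if (c == '-') {
                // Drop the hyphen and capitalise the following character.
                upperNext = true;
                continue;
            }
            out.append(upperNext ? Character.toUpperCase(c) : c);
            upperNext = false;
        }
        return out.toString();
    }

    static String toSetterName(String attributeName) {
        String property = toPropertyName(attributeName);
        if (property.isEmpty()) {
            return "set";
        }
        return "set" + Character.toUpperCase(property.charAt(0))
                + property.substring(1);
    }
}
```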

 CreateNodeAction
 ----------------
 * It is now possible to create DOM1 (ie non-namespaced) nodes and attributes
   even when the parser being used is namespace-aware.
 * Namespace-aware elements and attributes are created by default.
 * The implementation has changed; rather than redirecting the xml parser
   to itself, the SAXHandler object is requested to forward ContentHandler
   calls to itself. This has no externally-visible effect, but makes the
   implementation much cleaner (esp. cleanup after a parse failure).

 CreateObjectAction
 ------------------
 * The ignoreCreateException functionality has been removed. I'm not sure
   what use-cases it supports, or whether anybody actually uses it. The code
   is rather complex and nasty, so if someone really needs this functionality
   they can complain, and we can add it back in later with sufficient comments
   to allow future maintainers to know when the feature is useful...

 Exceptions
 ----------
 A lot more methods are declared to throw explicit Exceptions, which should
 result in more reliable and explicit error-handling.

 Terminology
 -----------
 The word "pattern" is now used exclusively for a string that is interpreted
 by a RuleManager instance.

 The word "path" is now used for a string that describes an absolute path
 from the root document node to the current xml element. When a pattern
 matches the path, the associated Action is executed.

 Xml-rules
 ---------
 The xmlrules module has not yet been reimplemented. However the following
 changes are planned:
 * A RuleManager instance will be returned rather than a Digester.
   Because a RuleManager is thread-safe, this allows a pool of Digester
   instances to be configured with this object without having to reparse
   the xmlrules input file.
 * The xmlrules file will be able to specify what RuleManager subclass
   is desired (with the default being the DefaultRuleManager class).
 * The rule parser constructor will take a list of Action (formerly Rule)
   classes, and will auto-configure itself by using reflection against these
   classes rather than the current system where code is written for each
   Rule class.
 * Because the list of Actions to support is passed in at runtime, the rule
   parser class will not have explicit dependencies upon the default actions.
   This allows the class to be distributed without the set of default actions
   if desired. The ActionFactory class will provide a factory method for
   creating a rule parser instance which knows about all the default actions.
 * The input xmlrules file will be able to specify custom action classes.

 Other notes
 -----------
 * The Digester class now only deals in XMLReader rather than SAXParser.
   This shouldn't remove any functionality, just simplify the code.
 * The default errorHandler methods now throw an exception for errors and
   fatal-errors reported by the parser rather than the old behaviour of just
   logging the error then continuing.
 * ParserFeatureSetterFactory and related classes have not been reimplemented,
   and will not be reimplemented by me. If they are wanted, someone else will
   have to do this.
 * I haven't implemented RuleSets. Are they useful to anyone?
 * The peek and pop methods on the digester, parameter and named stacks
   now throw an exception if misused rather than return null.
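
 The stricter error handling mentioned in this list can be sketched with a
 standard SAX ErrorHandler. StrictErrorHandler is a hypothetical name, and
 counting warnings instead of throwing is an assumption of this sketch; the
 notes above only say that errors and fatal errors now raise an exception.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical handler: errors and fatal errors abort the parse by
// rethrowing, instead of being logged and ignored.
class StrictErrorHandler implements ErrorHandler {
    private int warningCount;

    int getWarningCount() {
        return warningCount;
    }

    @Override
    public void warning(SAXParseException exception) {
        // Assumption: warnings are only counted, never thrown.
        warningCount++;
    }

    @Override
    public void error(SAXParseException exception) throws SAXParseException {
        throw exception;
    }

    @Override
    public void fatalError(SAXParseException exception) throws SAXParseException {
        throw exception;
    }
}
```

 A handler like this would be installed with XMLReader.setErrorHandler, which
 fits the move from SAXParser to XMLReader noted above.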

 Still TO-DO
 -----------
 * Think about alternative ways of performing logging.
 * Think about how to support pattern syntax of "/foo[@attr=value]" style.
   This may require a quite different API for RuleManager, so that RuleManager
   is passed the actual Elements required, rather than a string representing
   just the current path.
 * Break up CallParamAction into multiple simpler actions.
 * Refactor CallMethodAction to clean up its constructor.
 * Fix rules that store data on themselves.
 * Think about resolving dependency issues on BeanUtils by allowing digester
   to use BeanUtils via a local classloader. That means that it is ok to use
   digester even in a situation where another version of BeanUtils is the
   default.
 * Sort out schemaLocation/schemaLanguage mess.
 * Support rules to handle processing instructions.
 * Look into moving from BeanUtils to Morph, as BeanUtils has a lot of
   functionality we don't use.