RELEASE_2_1_13/src/documentation/xdocs/performancetips.xml - cocoon - Git at Google

 <?xml version="1.0" encoding="UTF-8"?>
 <!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
 -->
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "document-v10.dtd">

 <document>
  <header>
   <title>Apache Cocoon Performance Tips</title>
   <authors>
    <person name="Gerhard Froehlich" email="g-froehlich@gmx.de"/>
   </authors>
  </header>

  <body>

  <s1 title="Disclaimer">
    <p>The Cocoon Performance Tips in this version is a loose collection of
    usenet articles regarding how to improve the Apache Cocoon performance.</p>
    <p>As in the real world, it needs some kind of evolution to get better.
    If you have suggestions how to make it better or new kool tips, then be brave and
    send it to the <link href="http://cocoon.apache.org/community/mail-lists.html">
    Cocoon Mailing Lists</link>!</p>
    <note>Sometimes the tips maybe doubled or contradictory. If you notice something
    like that, then send a note to the <link href="http://cocoon.apache.org/community/mail-lists.html">
    Cocoon Mailing Lists</link>.</note>
  </s1>

  <s1 title="Common">
    <ul>
       <li>Logging kills performance. Consider disabling logging entirely from
       Cocoon (leave only the ERROR channel) and let Apache or the servlet
       container log accesses and stuff.</li>

       <li>Use a transparent proxy in front of your web server! The fastest
       response is the one that is not even processed. Cocoon is very slow
       (compared to a proxy server) to read resources such as stylesheets and
       images. A transparent proxy (SQUID, for example, don't use Apache 1.3
       mod_proxy because it is not fully compatible with HTTP/1.1 and disables
       connection keep-alive - Apache 2 mod_proxy is fine).
       Make sure you tune how long the static resources
       that Cocoon "read"s from the sitemap are cached (look into the readers
       code to find out more).</li>

       <li>Consider prerendering or time-based batch-process the static parts
       of your site. PDF reports, rasterized SVG graphs or things that change
       regularly.</li>

       <li>For optimum performance with Tomcat 4 and Cocoon 2,
       use the HTTP/1.0 connector.</li>

       <li>Move static content out of Cocoon's control. Move your static content out of the
       Cocoon servlet context and into its own context (just letting Tomcat serve directly).
       An even better approach would be to use a front-end webserver to serve the static, but
       installing Apache + Tomcat + our Cocoon app would be a bit much when Tomcat + our Cocoon
       app is doing fine.</li>

       <li>Disable resource reloading. The disk I/O system could become the
       bottleneck.</li>

       <li>Search for messages such as "decommissioning instance of...". This reveals some
       undersized pools which are corrected by tuning cocoon.xconf and sitemap.xmap.
       Undersized pools act like an object factory, plus the ComponentManager
       overhead.</li>
    </ul>
  </s1>

  <s1 title="Caching and Pooling">
    <ul>
      <li>Fine-tune the pool sizes for components in the files cocoon.xconf and
      sitemap.xmap. If the pools are too small for the load this will have a great
      impact on your performance. The goal is to achieve such a configuration that for
      every request there is a free component in the pool. Suppose, you have up
      to 100 simultaneous requests and your pipelines have up to 2 xslt
      transformers, then you need to set the maximum pool size to 200 xslt
      transformers. They will be created when needed and retained to the pool
      for future use.
      </li>

      <li>Fine-tune the Cocoon settings for the store and the other stuff.</li>

      <li>Important is the size of the documents that will be cached, because
      caching appears to be very time consuming process.</li>

      <li>Use the Caching Pipelines in your sitemaps where relevant.</li>
      <li>If your cache is set too small to keep the entire XML in memory,
       then the cache will be of no benefit.</li>

      <li>Watch the cachability in the log files, and make sure that things
       are being fed from the cache.</li>
      <li>Only use dynamic data when it is needed. Dynamic pages can't be
       cached 100%.</li>

      <li>Don't put Cocoon webapp too deep into directory structure. Cache
      keys contain absolute file names (or hash values of the absolute file
      names - in 2.0.X series), and the deeper cocoon is located in the
      filesystem, the longer keys are becoming. Obviously, longer keys will
      take more time to process them. In worst case scenario, slowdown up to
      10% could be achieved (unscientific observations, do your own
      test).</li>

    </ul>
      <p>More information about caching can be found
        <link href="userdocs/concepts/caching.html">in the Caching documentation</link>.
        Especially look at the information about the expires configuration.
      </p>
  </s1>

  <s1 title="JVM and OS">
    <ul>
       <li>Consider using a good JVM on a good OS. Scalability is a very
       different beast than pure speed. An Apple DualG4 866 seems to run faster
       than a Sun Enterprise 4500 (and costs a fraction), but try hitting them
       with 2000 concurrent Cocoon requests.</li>

       <li>Fine-tune your JVM settings (max heap-size, initial memory, s.o.).
       Please read the <link
       href="http://java.sun.com/docs/hotspot/PerformanceFAQ.html">Java Performance
       FAQ's</link> and the <link href="http://java.sun.com/docs/hotspot/gc/index.html">Tuning
       Garbage Collection</link> Document.</li>

       <li>Don't specify the -Xms parameter.</li>

       <li>Set the <code>-Xnoclassgc</code> parameter on the Sun JDK 1.3.1!
       It reduces the frequency of need for garbage collection by permitting the
       memory allocated to unused classes to be reused (instead of having to be
       collected and/or compacted).  Less fragmentation means less collection
       means better response times.</li>
    </ul>
  </s1>

  <s1 title="Perfomance Formulas">
    <ul>
      <li>Consider following formula for Pipeline Processing:<br />
      <code>Number_of_simultaneous_users * depth_of_content_aggregation</code>
      </li>

      <li>Consider following formula for Generators/Transformers/Serializers:<br />
      <code>Amount_required_to_process_one_request * Number_of_simultaneous_users</code>
      </li>

      <li>Consider following formula for Connectors:<br />
      <code>Count_of_pipeline_components_to_process_one_request *
      Number_of_simultaneous_users</code></li>

    </ul>
  </s1>

  <s1 title="Pipelines">
    <ul>
      <li>Keep an eye on the overall complexity of pipelines.</li>

      <li>Try to keep the size of the documents going through the pipeline
      small. To big documents slows down translation.</li>

      <li>Use the Caching Pipelines in your sitemaps where relevant.</li>

      <li>Use the <code>expires</code> parameter (see above) as frequently as you can.
      	It improves the end user experience dramatically. Browsers and intermediate
      	proxy servers love the HTTP <code>Expires</code> header.</li>
    </ul>
  </s1>

  <s1 title="XSP">
    <p>Consider turning your XSPs into Generators by hand and call them
     directly. Of course you don't need to do this for all pages, but it's
     recommended to it for those which are heavy loaded.</p>
    <p>You can try it this way:</p>
    <p>Cocoon will compile your XSP's into Java classes
     (see tomcat/work/..../org/apache/cocoon/www/my_xsp.class). After that, add
     the generated Generator to the Sitemap:<br />
     <code>
      &lt;map:generator type="myXSP" src="org.apache.cocoon.www.my_xsp"/&gt;
     </code>
    </p>
    <p>And use it:<br />
     <code>
      &lt;map:generate type="myXSP"/&gt;
     </code>
    </p>
  </s1>

  <s1 title="XSLT and XSL">
    <note>For more tips and information about XSL and XSLT grep the Internet and the
    <link href="http://xml.apache.org/xalan-j/index.html">Xalan Homepage</link>
    </note>

    <ul>
     <li>Try to keep the number of templates in the XSL translation small.</li>

     <li>There are several ways of doing the same stuff in XSLT, test the
     difference between them.</li>

     <li>Consider browser-dependent targetting to perform client-side XSLT.
     Cocoon is very fast if it doesn't do transformations. IE 5.5 and 6 are
     pretty compliant and might be something around 30% of your hits
     (probably more on some popular public web sites like Nasa's). Reducing
     one/third of the transformations might speed up a LOT.</li>

    <li>How complicated are the XSLT stylesheets? If you are not using global
    variables or parameters this will speeds things up.</li>

    <li>Consider using XSLTC instead of Xalan. XSLTC compiles XSLT to bytecode (translets)
    	the first time a stylesheet is used. Consequently it uses the compiled code
    	which is faster by a magnitude than the interpreted one.</li>

    </ul>
   </s1>
  </body>
 </document>
	<?xml version="1.0" encoding="UTF-8"?>
	<!--
	Licensed to the Apache Software Foundation (ASF) under one or more
	contributor license agreements. See the NOTICE file distributed with
	this work for additional information regarding copyright ownership.
	The ASF licenses this file to You under the Apache License, Version 2.0
	(the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	-->
	<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "document-v10.dtd">

	<document>
	<header>
	<title>Apache Cocoon Performance Tips</title>
	<authors>
	<person name="Gerhard Froehlich" email="g-froehlich@gmx.de"/>
	</authors>
	</header>

	<body>

	<s1 title="Disclaimer">
	<p>The Cocoon Performance Tips in this version is a loose collection of
	usenet articles regarding how to improve the Apache Cocoon performance.</p>
	<p>As in the real world, it needs some kind of evolution to get better.
	If you have suggestions how to make it better or new kool tips, then be brave and
	send it to the <link href="http://cocoon.apache.org/community/mail-lists.html">
	Cocoon Mailing Lists</link>!</p>
	<note>Sometimes the tips maybe doubled or contradictory. If you notice something
	like that, then send a note to the <link href="http://cocoon.apache.org/community/mail-lists.html">
	Cocoon Mailing Lists</link>.</note>
	</s1>

	<s1 title="Common">
	<ul>
	<li>Logging kills performance. Consider disabling logging entirely from
	Cocoon (leave only the ERROR channel) and let Apache or the servlet
	container log accesses and stuff.</li>

	<li>Use a transparent proxy in front of your web server! The fastest
	response is the one that is not even processed. Cocoon is very slow
	(compared to a proxy server) to read resources such as stylesheets and
	images. A transparent proxy (SQUID, for example, don't use Apache 1.3
	mod_proxy because it is not fully compatible with HTTP/1.1 and disables
	connection keep-alive - Apache 2 mod_proxy is fine).
	Make sure you tune how long the static resources
	that Cocoon "read"s from the sitemap are cached (look into the readers
	code to find out more).</li>

	<li>Consider prerendering or time-based batch-process the static parts
	of your site. PDF reports, rasterized SVG graphs or things that change
	regularly.</li>

	<li>For optimum performance with Tomcat 4 and Cocoon 2,
	use the HTTP/1.0 connector.</li>

	<li>Move static content out of Cocoon's control. Move your static content out of the
	Cocoon servlet context and into its own context (just letting Tomcat serve directly).
	An even better approach would be to use a front-end webserver to serve the static, but
	installing Apache + Tomcat + our Cocoon app would be a bit much when Tomcat + our Cocoon
	app is doing fine.</li>

	<li>Disable resource reloading. The disk I/O system could become the
	bottleneck.</li>

	<li>Search for messages such as "decommissioning instance of...". This reveals some
	undersized pools which are corrected by tuning cocoon.xconf and sitemap.xmap.
	Undersized pools act like an object factory, plus the ComponentManager
	overhead.</li>
	</ul>
	</s1>

	<s1 title="Caching and Pooling">
	<ul>
	<li>Fine-tune the pool sizes for components in the files cocoon.xconf and
	sitemap.xmap. If the pools are too small for the load this will have a great
	impact on your performance. The goal is to achieve such a configuration that for
	every request there is a free component in the pool. Suppose, you have up
	to 100 simultaneous requests and your pipelines have up to 2 xslt
	transformers, then you need to set the maximum pool size to 200 xslt
	transformers. They will be created when needed and retained to the pool
	for future use.
	</li>

	<li>Fine-tune the Cocoon settings for the store and the other stuff.</li>

	<li>Important is the size of the documents that will be cached, because
	caching appears to be very time consuming process.</li>

	<li>Use the Caching Pipelines in your sitemaps where relevant.</li>
	<li>If your cache is set too small to keep the entire XML in memory,
	then the cache will be of no benefit.</li>

	<li>Watch the cachability in the log files, and make sure that things
	are being fed from the cache.</li>
	<li>Only use dynamic data when it is needed. Dynamic pages can't be
	cached 100%.</li>

	<li>Don't put Cocoon webapp too deep into directory structure. Cache
	keys contain absolute file names (or hash values of the absolute file
	names - in 2.0.X series), and the deeper cocoon is located in the
	filesystem, the longer keys are becoming. Obviously, longer keys will
	take more time to process them. In worst case scenario, slowdown up to
	10% could be achieved (unscientific observations, do your own
	test).</li>

	</ul>
	<p>More information about caching can be found
	<link href="userdocs/concepts/caching.html">in the Caching documentation</link>.
	Especially look at the information about the expires configuration.
	</p>
	</s1>

	<s1 title="JVM and OS">
	<ul>
	<li>Consider using a good JVM on a good OS. Scalability is a very
	different beast than pure speed. An Apple DualG4 866 seems to run faster
	than a Sun Enterprise 4500 (and costs a fraction), but try hitting them
	with 2000 concurrent Cocoon requests.</li>

	<li>Fine-tune your JVM settings (max heap-size, initial memory, s.o.).
	Please read the <link
	href="http://java.sun.com/docs/hotspot/PerformanceFAQ.html">Java Performance
	FAQ's</link> and the <link href="http://java.sun.com/docs/hotspot/gc/index.html">Tuning
	Garbage Collection</link> Document.</li>

	<li>Don't specify the -Xms parameter.</li>

	<li>Set the <code>-Xnoclassgc</code> parameter on the Sun JDK 1.3.1!
	It reduces the frequency of need for garbage collection by permitting the
	memory allocated to unused classes to be reused (instead of having to be
	collected and/or compacted). Less fragmentation means less collection
	means better response times.</li>
	</ul>
	</s1>

	<s1 title="Perfomance Formulas">
	<ul>
	<li>Consider following formula for Pipeline Processing:<br />
	<code>Number_of_simultaneous_users * depth_of_content_aggregation</code>
	</li>

	<li>Consider following formula for Generators/Transformers/Serializers:<br />
	<code>Amount_required_to_process_one_request * Number_of_simultaneous_users</code>
	</li>

	<li>Consider following formula for Connectors:<br />
	<code>Count_of_pipeline_components_to_process_one_request *
	Number_of_simultaneous_users</code></li>

	</ul>
	</s1>

	<s1 title="Pipelines">
	<ul>
	<li>Keep an eye on the overall complexity of pipelines.</li>

	<li>Try to keep the size of the documents going through the pipeline
	small. To big documents slows down translation.</li>

	<li>Use the Caching Pipelines in your sitemaps where relevant.</li>

	<li>Use the <code>expires</code> parameter (see above) as frequently as you can.
	It improves the end user experience dramatically. Browsers and intermediate
	proxy servers love the HTTP <code>Expires</code> header.</li>
	</ul>
	</s1>

	<s1 title="XSP">
	<p>Consider turning your XSPs into Generators by hand and call them
	directly. Of course you don't need to do this for all pages, but it's
	recommended to it for those which are heavy loaded.</p>
	<p>You can try it this way:</p>
	<p>Cocoon will compile your XSP's into Java classes
	(see tomcat/work/..../org/apache/cocoon/www/my_xsp.class). After that, add
	the generated Generator to the Sitemap:<br />
	<code>
	<map:generator type="myXSP" src="org.apache.cocoon.www.my_xsp"/>
	</code>
	</p>
	<p>And use it:<br />
	<code>
	<map:generate type="myXSP"/>
	</code>
	</p>
	</s1>

	<s1 title="XSLT and XSL">
	<note>For more tips and information about XSL and XSLT grep the Internet and the
	<link href="http://xml.apache.org/xalan-j/index.html">Xalan Homepage</link>
	</note>

	<ul>
	<li>Try to keep the number of templates in the XSL translation small.</li>

	<li>There are several ways of doing the same stuff in XSLT, test the
	difference between them.</li>

	<li>Consider browser-dependent targetting to perform client-side XSLT.
	Cocoon is very fast if it doesn't do transformations. IE 5.5 and 6 are
	pretty compliant and might be something around 30% of your hits
	(probably more on some popular public web sites like Nasa's). Reducing
	one/third of the transformations might speed up a LOT.</li>

	<li>How complicated are the XSLT stylesheets? If you are not using global
	variables or parameters this will speeds things up.</li>

	<li>Consider using XSLTC instead of Xalan. XSLTC compiles XSLT to bytecode (translets)
	the first time a stylesheet is used. Consequently it uses the compiled code
	which is faster by a magnitude than the interpreted one.</li>

	</ul>
	</s1>
	</body>
	</document>