blob: 7319fe581e62eef23a9849454afe816ea87326c0 [file] [log] [blame]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<head>
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1252">
<TITLE></TITLE>
<META NAME="GENERATOR" CONTENT="StarOffice 6.0 (Win32)">
<META NAME="CREATED" CONTENT="20010710;11455400">
<META NAME="CHANGED" CONTENT="20010802;10173004">
<STYLE>
<!--
@page { margin: 2cm }
H3 { margin-bottom: 0.21cm }
H2 { margin-bottom: 0.21cm; page-break-before: auto; page-break-after: auto }
P { margin-bottom: 0.21cm; text-align: left; widows: 2; orphans: 2; page-break-before: auto; page-break-after: auto }
H1 { margin-bottom: 0.21cm }
P { margin-bottom: 0.21cm; text-align: left; widows: 2; orphans: 2; page-break-before: auto; page-break-after: auto }
-->
</STYLE>
</head>
<body>
<H1>How To Generate Profiling Data</H1>
<P>Andre Fischer and Carsten Driesner, July 11<SUP>th</SUP> , 2001.
Last change on August 2<SUP>nd</SUP> , 2001.</P>
<H2>Introduction</H2>
<P>The task that is to be solved is the generation of profiling data
for a manually selected set of points in programs of the
OpenOffice/Sun One Webtop family. Moreover this should be
accomplished as automatically as possible, so that timings of
different program versions can be compared to each other
automatically.</P>
<P>The main reason for not using an existing tool like Truetime or
Quantify is the huge amount of data generated by these programs. In
general the developers of the different modules of the office know
about the time critical parts of their code. Therefore only profiling
informations about these code parts are of interest. Everything else
only obscures the view to the relevant data.</P>
<P>With this in mind we use a different approach. Instead of blindly
instrumenting---i.e. adding code that generates profiling data---all
files, we concentrate of the areas of interest. The developers have
to add once manually commands into their code that emits time stamps
together with context information. From then on everything else goes
automatically. If the those files are compiled with a certain
preprocessor symbol defined and the office is run with a certain
environment variable set, then profiling information is written to a
log file (or, to be more precise, to one log file per process). These
log files are then processed by a Perl script and are transfomed into
a Calc document.</P>
<P>The details of this process are explained below. Note that this is
a work in progress. Especially the transformation of the log files
into a Calc document will have to be adapted to the needs of those
who use those documents.</P>
<H2>Instrumenting the source files</H2>
<P>The header file &lt;rtl/logfile.hxx&gt; (guarded by the
preprocessor symbol <CODE>_RTL_LOGFILE_HXX_</CODE>) contains a set of
macros that can be used to emit time stamps at certain points in the
code. The reason for using macros instead of a more decent C++
construct is to provide a way of removing the code for writing
profiling data completely from the compiled executable and it's
libraries. The macros wrap calls to the class ::rtl::Logfile
(declared in &lt;rtl/logfile.hxx&gt;) and the function
rtl_logfile_trace (declared in &lt;rtl/logfile.h&gt;). They generate
actual code only when at compile time the preprocessor symbol TIMELOG
is defined. This is either accomplished by building a special office
version (which will be the normal Pro-version) or by setting an
environment variable <CODE>ENVCDEFS</CODE> to include <CODE>-DTIMELOG</CODE>.
Depending on the shell you use</P>
<P><CODE>set ENVCDEFS=-DTIMELOG</CODE></P>
<P>will do that (4NT).</P>
<P>The macros can be used in two ways:</P>
<OL>
<LI><P><CODE>RTL_LOGFILE_CONTEXT_AUTHOR(instance,project,author,name)</CODE>
creates a context instance that can be referenced by <CODE>instance</CODE>.
The three remaining parameters contain the name of the project in
which the macro is used, the Sun id of the author beeing responsible
for the code in which the macro is used (for example af119097 for
one of the authors) and the <CODE>name</CODE> of the function or
other scope that this context refers to. The context instance emits
one time stamp when it is created and one when it is destroyed.
Therefore, if placed as first statement of a function definition, a
time stamp is written to the log file when the function is entered
and one when it is exited even when there is more then one
(implicit) return statement.
</P>
<P>The name passed to the context instantiation is written with
every time stamp that originates from such a context. This is also
true for the following macros.</P>
<LI><P><CODE>RTL_LOGFILE_CONTEXT_TRACE(instance,message)</CODE>,</P>
<P><CODE>RTL_LOGFILE_CONTEXT_TRACE1(instance,message,arg1)</CODE>,</P>
<P><CODE>RTL_LOGFILE_CONTEXT_TRACE2(instance,message,arg1,arg2)</CODE>,
and</P>
<P><CODE>RTL_LOGFILE_CONTEXT_TRACE3(instance,message,arg1,arg2,arg3)</CODE>
write time stamps with arbitrary messages to the log file. The
actual message is given by the string <CODE>message</CODE>. Together
with zero to three arguments it is passed to a printf style output
function. That means that for every argument there has to be a
suitable % format string in <CODE>message</CODE>. In order to allow
the log file be parsed by the existing Perl scripts message may
contain newlines. Each message is prefixed with the context name.</P>
<P>These macros exist also in a context free version.</P>
<P><CODE>RTL_LOGFILE_TRACE_AUTHOR(project,author,message)</CODE>,</P>
<P><CODE>RTL_LOGFILE_TRACE_AUTHOR1(project,author,message,arg1)</CODE>,</P>
<P><CODE>RTL_LOGFILE_TRACE_AUTHOR2(project,author,message,arg1,arg2)</CODE>,
and</P>
<P><CODE>RTL_LOGFILE_TRACE_AUTHOR3(project,author,message,arg1,arg2,arg3)</CODE>
do the same thing as their context twins except that the messages
are not prefixed with a context name. As you can see, the project
and author have to given to each trace macro instead of just to the
one creating a context. They are defined in &lt;rtl/logfile.h&gt;
which is included from &lt;rtl/logfile.hxx&gt;. Therefore for only
using the context free macros it is sufficient to include
&lt;rtl/logfile.h&gt;.</P>
</OL>
<P>For every introduced above there is an analogon without the
<CODE>_AUTHOR</CODE> suffix which do not accept the <CODE>project</CODE>
and <CODE>author</CODE> arguments. Because both of these are used
later in the post processing stage we discourage you to use these
other macro versions.</P>
<P>Depending on which of these two ways is used to generate a time
stamp, they are named function/scope time stamps respectively message
time stamps.</P>
<H2>Creating profiling information</H2>
<P>If you have instrumented your code like described above and have
compiled and installed it you can create profiling information by
starting the office with the bootsrap variable <CODE>RTL_LOGFILE</CODE>
set to a file name prefix. This prefix is completed by appending an
underscore, the process id and a &#147;.log&#148; suffix. Note that
backslashes have to be escaped by another backslash. The variable
<CODE>RTL_LOGFILE</CODE> can be set in one of the following ways:</P>
<UL>
<LI><P>Setting the environment variable <CODE>RTL_LOGFILE</CODE>.
This for instance be done with</P>
<UL>
<P><CODE>set RTL_LOG<CODE>FILE</CODE>=c:\\logfile</CODE></P>
</UL>
<P>from a 4NT shell.</P>
<LI><P>Passing the argument <CODE>-env:RTL_LOGFIL<CODE>E=</CODE>c:\\logfile</CODE>
on the commandline to the office.</P>
<LI><P>Put an entry into &lt;executable&gt;<CODE>.ini</CODE> or
&lt;executable&gt;<CODE>rc</CODE> file (the first with, the second
without a dot) where &lt;executable&gt; is the name of the
executable. For more information on this method please refer to
<A HREF="http://udk.openoffice.org/common/man/concept/uno_default_bootstrapping.html">http://udk.openoffice.org/common/man/concept/uno_default_bootstrapping.html</A></P>
</UL>
<H2>Transforming the log files into Calc documents</H2>
<P>Once you have created a log file you may want to convert it into a
more readable form. There is a set of Perl scripts that create a Calc
document from a log file that contains for every thread a set of
pages with different views of the profiling data.
</P>
<P>The first page shows a pretty printed version of the list of time
stamps. They are indented according to the calling hierarchy. For
each time stamp you can see the time it has been written (or to be
more precise the time on which it's writing has been requested), if
applicable the time the function/scope took to compute and the
function/scope name and message.</P>
<P>The second page shows a list of all functions/scopes for which
timing informations exist. Every list entry shows the function/scope
name, total, minimal, maximal, and average time and the number of
calls.</P>
<H3>Filtering</H3>
<P>The list of time stamps can be filtered in order to reduce a large
amount of data to a managable size and to exclude profiling
information from projects you are not interested in. There are two
filters. One is for explicit inclusion of time stamps, the other for
explicit exclusion. Each is initialized from a file containing
regular expressions (Perl style) given on the command line. The file
after the <CODE>-i</CODE> switch defines those time stamps that are
to be included in the reports and the file after the <CODE>-e</CODE>
switch defines those time stamps that are to be excluded. The files
may contain empty lines or comment lines whose first character is an
'#'. An empty inclusion filter does have no effect. If both filters
are specified, then a time stamp is written to the report if it
matches at least one of the regular expression of the inclusion
filter and none of the exclusion filter.</P>
<P>The idea is to use the inclusion filter if you have just a small
number of functions or messages you are interested in and to use the
exclusion filter if there is only a small number you would not like
in your reports.</P>
<P>An inclusion file might look like this.</P>
<P><CODE># Regular expression matching all function names that shall
show up</CODE></P>
<P><CODE># in the reports. </CODE>
</P>
<P><CODE># Show calls to BuildChart in the chart project.</CODE></P>
<P><CODE>BuildChart</CODE></P>
<H3>Outlook</H3>
<P>The form and number of reports will of course have to be changed
and extended. We are looking forward to your feedback.</P>
<H2>Further Documentation</H2>
<P>Furthor documentation can be found in the C++ header files
<CODE>&lt;rtl/logfile.h&gt;</CODE> and <CODE>&lt;rtl/logfile.hxx&gt;</CODE>
and the Perl scripts in project <CODE>tools/contrib/profiling</CODE>.
You can use for instance <CODE>pod2html</CODE> to extract and
transform the Perl scripts documentation in HTML documents.</P>
<P>Documentation of the time stamp format can be found <A HREF="time-stamp-format.html">here</A>.</P>
<H2>Contact</H2>
<P>You can contact the authors under <A HREF="mailto:andre.w.fischer@sun.com">andre.w.fischer@sun.com</A>
or <A HREF="mailto:carsten.driesner@sun.com">carsten.driesner@sun.com</A>.</P>
<P><BR><BR>
</P>
</body>
</HTML>