| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> |
| <HTML> |
| <head> |
| <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1252"> |
| <TITLE></TITLE> |
| <META NAME="GENERATOR" CONTENT="StarOffice 6.0 (Win32)"> |
| <META NAME="CREATED" CONTENT="20010710;11455400"> |
| <META NAME="CHANGED" CONTENT="20010802;10173004"> |
| <STYLE> |
| <!-- |
| @page { margin: 2cm } |
| H3 { margin-bottom: 0.21cm } |
| H2 { margin-bottom: 0.21cm; page-break-before: auto; page-break-after: auto } |
| P { margin-bottom: 0.21cm; text-align: left; widows: 2; orphans: 2; page-break-before: auto; page-break-after: auto } |
| H1 { margin-bottom: 0.21cm } |
| P { margin-bottom: 0.21cm; text-align: left; widows: 2; orphans: 2; page-break-before: auto; page-break-after: auto } |
| --> |
| </STYLE> |
| </head> |
| <body> |
| <H1>How To Generate Profiling Data</H1> |
| <P>Andre Fischer and Carsten Driesner, July 11<SUP>th</SUP> , 2001. |
| Last change on August 2<SUP>nd</SUP> , 2001.</P> |
| <H2>Introduction</H2> |
| <P>The task that is to be solved is the generation of profiling data |
| for a manually selected set of points in programs of the |
| OpenOffice/Sun One Webtop family. Moreover this should be |
| accomplished as automatically as possible, so that timings of |
| different program versions can be compared to each other |
| automatically.</P> |
| <P>The main reason for not using an existing tool like Truetime or |
| Quantify is the huge amount of data generated by these programs. In |
| general the developers of the different modules of the office know |
| about the time critical parts of their code. Therefore only profiling |
| informations about these code parts are of interest. Everything else |
| only obscures the view to the relevant data.</P> |
| <P>With this in mind we use a different approach. Instead of blindly |
| instrumenting---i.e. adding code that generates profiling data---all |
| files, we concentrate of the areas of interest. The developers have |
| to add once manually commands into their code that emits time stamps |
| together with context information. From then on everything else goes |
| automatically. If the those files are compiled with a certain |
| preprocessor symbol defined and the office is run with a certain |
| environment variable set, then profiling information is written to a |
| log file (or, to be more precise, to one log file per process). These |
| log files are then processed by a Perl script and are transfomed into |
| a Calc document.</P> |
| <P>The details of this process are explained below. Note that this is |
| a work in progress. Especially the transformation of the log files |
| into a Calc document will have to be adapted to the needs of those |
| who use those documents.</P> |
| <H2>Instrumenting the source files</H2> |
| <P>The header file <rtl/logfile.hxx> (guarded by the |
| preprocessor symbol <CODE>_RTL_LOGFILE_HXX_</CODE>) contains a set of |
| macros that can be used to emit time stamps at certain points in the |
| code. The reason for using macros instead of a more decent C++ |
| construct is to provide a way of removing the code for writing |
| profiling data completely from the compiled executable and it's |
| libraries. The macros wrap calls to the class ::rtl::Logfile |
| (declared in <rtl/logfile.hxx>) and the function |
| rtl_logfile_trace (declared in <rtl/logfile.h>). They generate |
| actual code only when at compile time the preprocessor symbol TIMELOG |
| is defined. This is either accomplished by building a special office |
| version (which will be the normal Pro-version) or by setting an |
| environment variable <CODE>ENVCDEFS</CODE> to include <CODE>-DTIMELOG</CODE>. |
| Depending on the shell you use</P> |
| <P><CODE>set ENVCDEFS=-DTIMELOG</CODE></P> |
| <P>will do that (4NT).</P> |
| <P>The macros can be used in two ways:</P> |
| <OL> |
| <LI><P><CODE>RTL_LOGFILE_CONTEXT_AUTHOR(instance,project,author,name)</CODE> |
| creates a context instance that can be referenced by <CODE>instance</CODE>. |
| The three remaining parameters contain the name of the project in |
| which the macro is used, the Sun id of the author beeing responsible |
| for the code in which the macro is used (for example af119097 for |
| one of the authors) and the <CODE>name</CODE> of the function or |
| other scope that this context refers to. The context instance emits |
| one time stamp when it is created and one when it is destroyed. |
| Therefore, if placed as first statement of a function definition, a |
| time stamp is written to the log file when the function is entered |
| and one when it is exited even when there is more then one |
| (implicit) return statement. |
| </P> |
| <P>The name passed to the context instantiation is written with |
| every time stamp that originates from such a context. This is also |
| true for the following macros.</P> |
| <LI><P><CODE>RTL_LOGFILE_CONTEXT_TRACE(instance,message)</CODE>,</P> |
| <P><CODE>RTL_LOGFILE_CONTEXT_TRACE1(instance,message,arg1)</CODE>,</P> |
| <P><CODE>RTL_LOGFILE_CONTEXT_TRACE2(instance,message,arg1,arg2)</CODE>, |
| and</P> |
| <P><CODE>RTL_LOGFILE_CONTEXT_TRACE3(instance,message,arg1,arg2,arg3)</CODE> |
| write time stamps with arbitrary messages to the log file. The |
| actual message is given by the string <CODE>message</CODE>. Together |
| with zero to three arguments it is passed to a printf style output |
| function. That means that for every argument there has to be a |
| suitable % format string in <CODE>message</CODE>. In order to allow |
| the log file be parsed by the existing Perl scripts message may |
| contain newlines. Each message is prefixed with the context name.</P> |
| <P>These macros exist also in a context free version.</P> |
| <P><CODE>RTL_LOGFILE_TRACE_AUTHOR(project,author,message)</CODE>,</P> |
| <P><CODE>RTL_LOGFILE_TRACE_AUTHOR1(project,author,message,arg1)</CODE>,</P> |
| <P><CODE>RTL_LOGFILE_TRACE_AUTHOR2(project,author,message,arg1,arg2)</CODE>, |
| and</P> |
| <P><CODE>RTL_LOGFILE_TRACE_AUTHOR3(project,author,message,arg1,arg2,arg3)</CODE> |
| do the same thing as their context twins except that the messages |
| are not prefixed with a context name. As you can see, the project |
| and author have to given to each trace macro instead of just to the |
| one creating a context. They are defined in <rtl/logfile.h> |
| which is included from <rtl/logfile.hxx>. Therefore for only |
| using the context free macros it is sufficient to include |
| <rtl/logfile.h>.</P> |
| </OL> |
| <P>For every introduced above there is an analogon without the |
| <CODE>_AUTHOR</CODE> suffix which do not accept the <CODE>project</CODE> |
| and <CODE>author</CODE> arguments. Because both of these are used |
| later in the post processing stage we discourage you to use these |
| other macro versions.</P> |
| <P>Depending on which of these two ways is used to generate a time |
| stamp, they are named function/scope time stamps respectively message |
| time stamps.</P> |
| <H2>Creating profiling information</H2> |
| <P>If you have instrumented your code like described above and have |
| compiled and installed it you can create profiling information by |
| starting the office with the bootsrap variable <CODE>RTL_LOGFILE</CODE> |
| set to a file name prefix. This prefix is completed by appending an |
| underscore, the process id and a “.log” suffix. Note that |
| backslashes have to be escaped by another backslash. The variable |
| <CODE>RTL_LOGFILE</CODE> can be set in one of the following ways:</P> |
| <UL> |
| <LI><P>Setting the environment variable <CODE>RTL_LOGFILE</CODE>. |
| This for instance be done with</P> |
| <UL> |
| <P><CODE>set RTL_LOG<CODE>FILE</CODE>=c:\\logfile</CODE></P> |
| </UL> |
| <P>from a 4NT shell.</P> |
| <LI><P>Passing the argument <CODE>-env:RTL_LOGFIL<CODE>E=</CODE>c:\\logfile</CODE> |
| on the commandline to the office.</P> |
| <LI><P>Put an entry into <executable><CODE>.ini</CODE> or |
| <executable><CODE>rc</CODE> file (the first with, the second |
| without a dot) where <executable> is the name of the |
| executable. For more information on this method please refer to |
| <A HREF="http://udk.openoffice.org/common/man/concept/uno_default_bootstrapping.html">http://udk.openoffice.org/common/man/concept/uno_default_bootstrapping.html</A></P> |
| </UL> |
| <H2>Transforming the log files into Calc documents</H2> |
| <P>Once you have created a log file you may want to convert it into a |
| more readable form. There is a set of Perl scripts that create a Calc |
| document from a log file that contains for every thread a set of |
| pages with different views of the profiling data. |
| </P> |
| <P>The first page shows a pretty printed version of the list of time |
| stamps. They are indented according to the calling hierarchy. For |
| each time stamp you can see the time it has been written (or to be |
| more precise the time on which it's writing has been requested), if |
| applicable the time the function/scope took to compute and the |
| function/scope name and message.</P> |
| <P>The second page shows a list of all functions/scopes for which |
| timing informations exist. Every list entry shows the function/scope |
| name, total, minimal, maximal, and average time and the number of |
| calls.</P> |
| <H3>Filtering</H3> |
| <P>The list of time stamps can be filtered in order to reduce a large |
| amount of data to a managable size and to exclude profiling |
| information from projects you are not interested in. There are two |
| filters. One is for explicit inclusion of time stamps, the other for |
| explicit exclusion. Each is initialized from a file containing |
| regular expressions (Perl style) given on the command line. The file |
| after the <CODE>-i</CODE> switch defines those time stamps that are |
| to be included in the reports and the file after the <CODE>-e</CODE> |
| switch defines those time stamps that are to be excluded. The files |
| may contain empty lines or comment lines whose first character is an |
| '#'. An empty inclusion filter does have no effect. If both filters |
| are specified, then a time stamp is written to the report if it |
| matches at least one of the regular expression of the inclusion |
| filter and none of the exclusion filter.</P> |
| <P>The idea is to use the inclusion filter if you have just a small |
| number of functions or messages you are interested in and to use the |
| exclusion filter if there is only a small number you would not like |
| in your reports.</P> |
| <P>An inclusion file might look like this.</P> |
| <P><CODE># Regular expression matching all function names that shall |
| show up</CODE></P> |
| <P><CODE># in the reports. </CODE> |
| </P> |
| <P><CODE># Show calls to BuildChart in the chart project.</CODE></P> |
| <P><CODE>BuildChart</CODE></P> |
| <H3>Outlook</H3> |
| <P>The form and number of reports will of course have to be changed |
| and extended. We are looking forward to your feedback.</P> |
| <H2>Further Documentation</H2> |
| <P>Furthor documentation can be found in the C++ header files |
| <CODE><rtl/logfile.h></CODE> and <CODE><rtl/logfile.hxx></CODE> |
| and the Perl scripts in project <CODE>tools/contrib/profiling</CODE>. |
| You can use for instance <CODE>pod2html</CODE> to extract and |
| transform the Perl scripts documentation in HTML documents.</P> |
| <P>Documentation of the time stamp format can be found <A HREF="time-stamp-format.html">here</A>.</P> |
| <H2>Contact</H2> |
| <P>You can contact the authors under <A HREF="mailto:andre.w.fischer@sun.com">andre.w.fischer@sun.com</A> |
| or <A HREF="mailto:carsten.driesner@sun.com">carsten.driesner@sun.com</A>.</P> |
| <P><BR><BR> |
| </P> |
| </body> |
| </HTML> |