| \documentclass[11pt,a4paper]{article} |
| \setlength{\parindent}{0ex} |
| \setlength{\parskip}{1em plus1ex minus.5ex} |
| %% $Id$ |
| \usepackage{fullpage} |
| \usepackage{verbatim} |
| \usepackage[dvips]{graphicx} |
| %%\addtolength{\textheight}{2cm} |
| \begin{document} |
| |
| \pagestyle{empty} |
| |
| \begin{titlepage} |
| \begin{center} |
| \Huge |
| \textbf{Apache Tcl} |
| |
| \sffamily |
| \small |
| Fast, Light, Easy, Powerful\\ |
| |
| By David N. Welton, Apache Software Foundation\\ |
| |
| \footnotesize |
| davidw@apache.org |
| \normalsize |
| \end{center} |
| \end{titlepage} |
| |
| \sffamily |
| |
| \begin{abstract} |
| Programming for the web can be both easy and powerful, quick and |
| elegant, fast and lightweight. Find out what's available for |
| both new users and experienced programmers, as we cover everything |
| from how to get started, through the different systems available and |
| programming strategies, to the low-level beauty of integrating Apache and Tcl. |
| \end{abstract} |
| |
| \tableofcontents |
| |
| \section{Introduction to Apache and Tcl} |
| |
| \subsection{Apache} |
| |
| Apache Tcl is, of course, the name for the projects which have the |
| Apache web server and Tcl language in common, but more than that, we |
| have all arrived at this junction between Apache and Tcl because we |
| have found it to be the optimal solution. |
| |
| Apache is, of course, the world's most popular web server. According |
| to Netcraft, Apache accounts for around 60\% of the web server market, |
| putting it ahead of all other contenders combined. |
| |
| Apache is very flexible, fast, correct and secure. To attach meanings |
| to these descriptions, let's go through them and explain. |
| |
| The flexibility in Apache lies in its ability to be configured in many |
| different ways, from minimalistic to a full-fledged ``web |
| application'' server. |
| |
| While there are faster web servers available, Apache competes very |
| admirably with the current offerings on the market. |
| |
| Of great importance, Apache is correct: it adheres to defined |
| Internet standards. |
| |
| And, most of all, Apache is very secure. The code has been examined |
| by experts the world over, and when bugs are discovered, fixes are |
| promptly available. |
| |
| The Apache web server was first created by a group of sysadmins who |
| were using the NCSA server and had created patches to enhance and |
| improve it. Thus, ``A Patchy'' web server. |
| |
| The Apache Software Foundation is a non-profit corporation registered |
| in the U.S., with members throughout the world. In addition to Tcl, it |
| now encompasses a large variety of diverse projects: the web server, |
| XML, Java, Perl, PHP, APR, and Python. |
| |
| The ASF consists of 60-some members, and hundreds of ``committers'': |
| those who have the rights to make changes to different projects' code. |
| |
| \subsection{Tcl} |
| |
| Tcl stands for ``Tool Command Language''. It was created in the late |
| 1980s by John Ousterhout, then a professor at the University of |
| California at Berkeley, as a replacement for the many small extension |
| languages he was writing for his applications. From the very |
| beginning, Tcl was engineered to be combined with other systems. |
| |
| The original idea behind Tcl is still very valid today: powerful |
| applications can be made much more so by letting the user access parts |
| of them programmatically. One-off languages are best replaced by a |
| more general solution. The answer: a reusable library that provides a |
| scripting language. This allows for ``embedding and extending'' of |
| Tcl. It can either be ``embedded'' into another application, where |
| the other program controls what is going on, and calls Tcl, or |
| ``extended'', which means that Tcl is in control, and it either has |
| extra code built in, or it loads it dynamically. Apache Tcl projects |
| are almost exclusively of the first type. Apache is the center of |
| attention, and it makes calls to Tcl at opportune times. |
| |
| Let's look at some of Tcl's most important features: |
| |
| First of all, it's easy to learn, having a very simple syntax. This |
| means that anyone can pick it up and start using it almost right away, |
| even if they are not an expert software engineer. As we will see |
| later, this is at times important for a web language. |
| |
| Despite being easy, Tcl is also quite flexible. It doesn't limit the |
| expert programmer, and encourages them to take advantage of its |
| features in order to solve problems most elegantly. It is actually |
| possible to create new control structures such as ``if'' or ``while'' in |
| Tcl code itself, without resorting to C! |
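| |
| As a minimal sketch of that idea, a new looping construct can be |
| defined with \textbf{proc} and \textbf{uplevel}. The command name |
| ``repeat'' is an illustrative assumption, not a built-in. |
| |
| \begin{verbatim} |
| # repeat: evaluate a script body a fixed number of times. |
| proc repeat {count body} { |
|     for {set i 0} {$i < $count} {incr i} { |
|         # uplevel evaluates the body in the caller's scope, so the |
|         # body can read and modify the caller's variables. |
|         uplevel 1 $body |
|     } |
| } |
| |
| set total 0 |
| repeat 3 { |
|     incr total 10 |
| } |
| puts $total   ;# prints 30 |
| \end{verbatim} |
| |
| Because \textbf{uplevel} runs the body in the caller's scope, the body |
| can change ``total'' just as the body of a built-in ``for'' could. |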
| |
| As mentioned above, embedding and extending are central to Tcl's |
| philosophy. It is possible to access a great deal of the language's |
| internals via its well-documented C API. |
| |
| What's more, Tcl doesn't take up a lot of system resources. It |
| doesn't waste memory, nor disk space, and is reasonably fast as a |
| byte-compiled scripting language. |
| |
| Of course, Tcl is Free Software. You can do pretty much anything you |
| like with it, as it's distributed under a very liberal BSD-style |
| license. It's used in many free software projects, as well as |
| proprietary applications in large corporations. |
| |
| Another point that needs mentioning is that the Tcl language is |
| natively multiplatform. It is designed to work equally well on Unix, |
| Windows, and both classic Mac OS and Mac OS X. |
| |
| The Tk toolkit is almost always mentioned in the same phrase as Tcl, |
| but it's important to understand that they are two separate things, |
| despite both being developed by John Ousterhout, and having a closely |
| entwined history. Tk is a graphical toolkit for rapidly developing |
| visual applications with a native look and feel. |
| |
| Not only does Tcl do the simple things well, it also makes complex |
| things possible (to borrow a phrase). The standard library, tcllib, |
| contains code to perform all kinds of tasks, as well as manage complex |
| data structures, such as graphs, queues, and matrices. As a |
| testament to Tcl's flexibility, not even object orientation is built |
| into the language itself, but can be loaded as a package, the most |
| popular one being [Incr Tcl]. Tcl has been around long enough that |
| there is probably a package out there that does what you need. |
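| |
| As a small illustration, here is how a queue from tcllib might be |
| used. This sketch assumes that tcllib is installed and that its |
| \textbf{struct::queue} package is on the package path; the queue name |
| ``jobs'' and the file names are made up for the example. |
| |
| \begin{verbatim} |
| package require struct::queue |
| |
| # Create a queue object; "jobs" becomes a new Tcl command. |
| struct::queue jobs |
| |
| # Add two items, then take them off again in FIFO order. |
| jobs put index.rvt about.rvt |
| puts "queued: [jobs size]" |
| puts "first:  [jobs get]" |
| puts "second: [jobs get]" |
| |
| # Release the queue when it is no longer needed. |
| jobs destroy |
| \end{verbatim} |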
| |
| \subsection{Tcl and the Web} |
| |
| The web (and XML, for that matter) is primarily text oriented, so a |
| language that's good at dealing with text is well suited to the web, |
| and much, much faster to develop with than a low-level language like |
| C. This is why languages like Perl and Tcl have been so successful |
| for web programming to date. It's easier, security is improved by not |
| worrying about buffer overflows, and everything is generally more |
| flexible. |
| |
| The second big advantage when using a ``scripting language'' for the |
| web is that it makes the craft of creating dynamic web pages available |
| to more people, who might not necessarily be skilled enough to use a |
| more complex language such as Java or C++. Not everyone who needs to |
| get something up on the web has the luxury of knowing many languages, |
| and/or having the time to learn a more difficult one. While Tcl is a |
| fine tool in the hands of an expert software engineer, it is also |
| particularly well suited to this type of use, because of its simple |
| structure, and the ease with which it is learned. As an added bonus, |
| once the language has been learned, it can be used beyond the web for system |
| tasks, graphical apps, and more. |
| |
| As far back as 1995, when many of us were still experimenting with |
| simplistic CGI scripts, Tcl was being integrated into what was to go on to |
| become Vignette's StoryServer, and AOLserver, two systems that are |
| still in use today. AOLserver is even open source, and is a very well |
| regarded high-performance, multi-threaded web server with built-in |
| scriptability, thanks to Tcl. |
| |
| \subsection{Tcl Example 1.} |
| |
| A few small examples of Tcl code follow. They demonstrate some of the |
| fundamental constructs of the language. |
| |
| \begin{verbatim} |
| set seconds [clock seconds] |
| puts "The time is [clock format $seconds]" |
| \end{verbatim} |
| |
| In the above example, three Tcl commands are used: ``set'', ``puts'', and |
| ``clock''. The first line sets the variable ``seconds'' to the value |
| returned by the command ``clock seconds''. In the second line, |
| ``clock format'' is called with the value of \$seconds, and its result |
| is substituted into the string, which is then passed to ``puts'', which |
| sends it to stdout. |
| |
| \begin{verbatim} |
| proc greeting {lang} { |
|     switch $lang { |
|         en { |
|             return "Hello, how is it going?" |
|         } |
|         it { |
|             return "Buon giorno, come va?" |
|         } |
|         fr { |
|             return "Bonjour, .......?" |
|         } |
|         de { |
|             return "Guten Tag, wie gehts?" |
|         } |
|         zt { |
|             return "Grüezi, wi gaat's?" |
|         } |
|     } |
| } |
| \end{verbatim} |
| |
| This is an example of a ``proc'': a new Tcl procedure, or command. It |
| accepts one argument, ``lang'', which it examines in order to |
| return a different string depending on its value. |
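| |
| A procedure like this is called just like any built-in command. The |
| sketch below uses a shortened variant of the procedure so that it can |
| be run on its own; the ``default'' branch is an addition made for |
| illustration, and is not part of the example above. |
| |
| \begin{verbatim} |
| proc greeting {lang} { |
|     switch -- $lang { |
|         en { return "Hello, how is it going?" } |
|         it { return "Buon giorno, come va?" } |
|         default { |
|             # Fall back to English for unknown language codes. |
|             return "Hello, how is it going?" |
|         } |
|     } |
| } |
| |
| puts [greeting it]   ;# prints the Italian greeting |
| puts [greeting xx]   ;# unknown code: prints the English greeting |
| \end{verbatim} |
| |
| Without a ``default'' branch, an unrecognized code matches nothing, |
| and the procedure returns an empty string. |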
| |
| \section{Apache Tcl} |
| |
| Our projects aren't just related by the fact that we all use Tcl: the |
| Apache Tcl projects share a common philosophy that ``simple things |
| should be easy, and hard things should be possible'' (to borrow a |
| quote; it works well for them!). We all arrived at Tcl not because it's the |
| only tool in our box, but because for us it is the right tool. |
| |
| The ``Apache Tcl Project'' currently comprises 5 individual |
| projects, and is moving towards having 3 or fewer in the future. |
| |
| \includegraphics[scale=0.6]{./future.ps} |
| |
| Getting ahead of ourselves, our short term plans are to release Rivet, |
| and phase out mod\_dtcl and NeoWebScript. Our long term goals, should |
| they come to fruition, are to further modularize all our offerings, so |
| that at some point, it would be possible to run Rivet and Websh as |
| packages loaded from within mod\_tcl, as it is capable of giving us |
| low level access to the Apache internals from Tcl. We still wouldn't |
| have just one project, but there would be, at the same time, |
| more modularity and tighter integration amongst the remaining systems. |
| |
| Without further ado, our 5 projects are: |
| |
| \begin{itemize} |
| \item mod\_dtcl\\ |
| mod\_dtcl was created in 1998 by David Welton, and, while there were |
| other products out there that did similar things, mod\_dtcl was one |
| of the first that was Free Software. |
| |
| The Apache Software Foundation subsequently created the Tcl group, |
| with mod\_dtcl as its basis in late 2000. |
| |
| mod\_dtcl will eventually be replaced by Apache Rivet, which is, |
| in fact, getting just about 100\% of our development time lately. |
| |
| The original design goals of mod\_dtcl were based on observations |
| about the success of PHP, and sought to borrow the best of that |
| system, while avoiding some of its pitfalls. Principally, mod\_dtcl |
| was meant to be fast, light, and simple to use. |
| |
| \item NeoWebScript\\ |
| NeoWebScript was created by Karl Lehenbauer, one of the early Tcl |
| users and contributors, and is currently maintained by Damon |
| Courtney. NWS, as it's known, became part of the Apache Tcl project |
| in early 2001, after having been released as Free Software. In its |
| early years, NWS was not really ``Open Source''. |
| |
| NWS was written with an ISP type environment in mind, and as a |
| consequence, has features such as a sandbox, built on Tcl's safe |
| interpreters, which is able to keep users contained. In addition, |
| NWS aims to make a lot of popular extensions such as ``gd'' and |
| database extensions available as part of the core product. |
| |
| \item mod\_tcl\\ |
| mod\_tcl was created by Michael Link, for Apache 2.0, with the goal |
| of exposing the Apache API in Tcl, in order to make it possible to |
| write Apache modules in Tcl. Most of the other projects take |
| advantage of Apache features, and are tightly linked to the server, |
| but could conceivably also operate independently of Apache. |
| mod\_tcl takes the opposite approach and gives you access to a great |
| deal of Apache's C API. This flexibility means that, one day, other |
| Apache Tcl modules might just be plugins for mod\_tcl. |
| |
| \item Websh\\ |
| Websh was born in 1995 as a C++ library incorporating a Tcl |
| interpreter, and gradually migrated to being a Tcl extension. It |
| became Free Software in 2000, and was contributed to the ASF in |
| 2001. Andrej Vckovski was the original author, with major work done |
| by Ronnie Brunner (Websh 2) and Simon Hefti (Websh 3). |
| |
| Websh is more of a complete ``application development'' environment |
| than the other systems, which is in some ways a tradeoff. It |
| provides many powerful tools for the programmer, but takes more time |
| to learn, and requires just a bit more ramp-up time to start |
| creating pages. |
| |
| One unique feature of Websh is that it is relatively independent of |
| Apache, and, in fact, a version is provided that runs as a |
| stand-alone CGI. |
| |
| \item Apache Rivet\\ |
| The most recent of the Apache Tcl projects, Rivet is the future of |
| both mod\_dtcl and NeoWebScript. Both David Welton and Damon |
| Courtney are collaborating to combine the best of the two previous |
| systems into something that has the best of both worlds, and leaves |
| behind some of their baggage. |
| |
| Our aim is to produce a system that is fast, light, simple, |
| powerful, takes advantage of the best of Apache and Tcl, and that |
| comes with a rich variety of existing Tcl extensions. |
| |
| Rivet is alpha software, although it works quite well, and we are |
| preparing a release (it should be available by the time ApacheCon |
| arrives). |
| \end{itemize} |
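| |
| The sandboxing mentioned for NeoWebScript rests on a standard Tcl |
| feature, safe interpreters. The following is a minimal sketch of that |
| feature in plain Tcl, not NeoWebScript code; the variable names are |
| arbitrary. |
| |
| \begin{verbatim} |
| # Create a child interpreter with dangerous commands hidden. |
| set sandbox [interp create -safe] |
| |
| # Harmless code runs normally inside the sandbox. |
| puts [interp eval $sandbox {expr {6 * 7}}] |
| |
| # Commands such as exec are not available to sandboxed code, |
| # so this attempt fails and the error is caught out here. |
| if {[catch {interp eval $sandbox {exec ls}} err]} { |
|     puts "blocked: $err" |
| } |
| |
| interp delete $sandbox |
| \end{verbatim} |
| |
| Untrusted user code can be evaluated in such a child interpreter, |
| while the parent interpreter keeps full access to the system. |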
| |
| \subsection{Rivet Example 1.} |
| |
| The following example shows Apache Rivet at work. |
| |
| \verbatiminput{rivetexample1.rvt} |
| |
| This example shows us several features of Tcl. In reality, the Tcl |
| code used here is entirely generic. The only things that tell us that |
| it is used in Rivet are the $<$? and ?$>$ delimiters, which indicate |
| that the enclosed section of the file should be parsed as Tcl code, |
| and not HTML. The table (see below) which results from this code is |
| created by the two \textbf{for} loops, which set and then increment |
| their respective variables. The value used to determine the shade of |
| gray for a particular cell is calculated by the \textbf{expr} command, |
| and is in turn used in the \textbf{format} command, which sets up |
| the bgcolor to use for the cell. Note that the numbers to be |
| displayed in the cell are interpolated directly in the string, and are |
| not handled by the format command. |
| |
| \includegraphics[scale=1.0]{./table.ps} |
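| |
| The same technique can be tried outside of Apache, in a plain Tcl |
| shell. The following is a minimal sketch written for that purpose, |
| and is not the contents of rivetexample1.rvt; the table size and the |
| gray-level formula are assumptions. In a Rivet page, code like this |
| would sit between the $<$? and ?$>$ delimiters. |
| |
| \begin{verbatim} |
| puts "<table>" |
| for {set i 1} {$i <= 8} {incr i} { |
|     puts "<tr>" |
|     for {set j 1} {$j <= 8} {incr j} { |
|         # Scale the product (1..64) to a gray level from 3 to 255. |
|         set num [expr {$i * $j * 4 - 1}] |
|         # format fills in the three hex color components; $i and $j |
|         # are interpolated by Tcl before format sees the string. |
|         puts [format "<td bgcolor=\"#%02x%02x%02x\">$i x $j</td>" \ |
|                   $num $num $num] |
|     } |
|     puts "</tr>" |
| } |
| puts "</table>" |
| \end{verbatim} |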
| |
| \subsection{Programming Strategies} |
| |
| One of the fundamental advantages of Tcl web programming is that it |
| gives us the freedom to choose from several available strategies for |
| creating a web site. |
| |
| ``Quick and dirty'' is what we will call the first style. It's what |
| happens when HTML and code are freely mixed throughout a web page. |
| It's common for novice programmers to take this approach because it's |
| very immediate. However, there are times when you just need to ``get |
| it done'', and the sooner the better. So we do support this style of |
| programming. |
| |
| Of course, for large sites, where it's important to keep some kind of |
| order, and where the resources are available to separate out HTML |
| writing, graphic design, and programming, we can certainly use Tcl in |
| such a way as to draw a clear line between the logic and the |
| presentation. |
| |
| One of the best and easiest ways to do this is to create Tcl commands |
| that take few or no arguments, so that they look almost like more HTML |
| tags on a web page, as we see in the next example: |
| |
| \includegraphics[scale=1.0]{./example2.ps} |
| |
| Here we see that the Tcl commands act as if they were just extra tags. |
| Of course, this is overly simplified, but it gives the general |
| idea of how one might begin to go about separating content from how it |
| is displayed. An interesting idea that hasn't been fully explored is |
| to create a series of high-level function-based widgets that work both |
| as HTML and as a Tk interface. |
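| |
| To make the ``commands as tags'' approach concrete, here is a minimal |
| sketch in plain Tcl. The command names ``page\_header'' and |
| ``page\_footer'' are hypothetical, chosen only for this illustration. |
| The programmer writes the procedures once; the page author then only |
| needs to call them. |
| |
| \begin{verbatim} |
| # Presentation commands, maintained by the programmer. |
| proc page_header {title} { |
|     puts "<html><head><title>$title</title></head><body>" |
|     puts "<h1>$title</h1>" |
| } |
| |
| proc page_footer {} { |
|     puts "<hr>" |
|     puts "</body></html>" |
| } |
| |
| # What the page itself boils down to. |
| page_header "Apache Tcl" |
| puts "<p>Content written by the HTML author goes here.</p>" |
| page_footer |
| \end{verbatim} |
| |
| In a Rivet page, each call would appear between the $<$? and ?$>$ |
| delimiters, surrounded by ordinary HTML. |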
| |
| \section{Integrating Apache and Tcl} |
| |
| This section briefly discusses some of what went into integrating |
| Apache and Tcl through their C interfaces. Both of these systems make |
| a large portion of their functionality available to the programmer via |
| their APIs. It has been our experience that both are well thought |
| out, and a pleasure to work with. Hooking them together has been very |
| enjoyable work! Because it's a very broad topic, and more than one |
| talk could be dedicated to each individual project, and also because |
| the author is most familiar with it, Rivet will be the basis for the |
| following discussion, although much of it applies to the other |
| projects as well. |
| |
| Tcl has an extensive list of functions that may be accessed via its |
| C interface. One may manipulate interpreters, variables, and threads, |
| create and manipulate ``channels'' (more on that in just a bit), |
| utilize the event loop, and write new Tcl commands. There is also a host of |
| convenience features, such as systems to translate between Unicode, |
| UTF-8, and a variety of character sets, in addition to hash tables, |
| dynamic strings, and more. Yet again, this is too much to discuss in this |
| space; we hope those interested will investigate on their own. |
| |
| \subsection{Apache Initialization and Directives} |
| |
| Apache's configuration directives are difficult to get right, but |
| provide the user with the ability to control Apache at each step of |
| its ``life cycle''. To begin with, Rivet lets you specify scripts |
| that may be run when Apache starts up, and stops, called GlobalInit |
| and GlobalExit. Because Apache 1.3 still forks sub-processes, it's |
| useful to be able to intervene when these child processes are |
| launched. That is handled through the Child Init and Exit configuration |
| directives. Yet another useful way to modify Rivet's behavior is by |
| inserting code before and after a page is executed. This could be |
| used to insert custom headers or footers, for instance. Furthermore, |
| options are also available to control file upload characteristics, as |
| well as determine whether separate virtual hosts get their own |
| interpreters. This last feature is particularly important in |
| environments where separate clients might want to have their own Tcl |
| code, loaded separately. |
| |
| \subsection{How Rivet Serves Pages} |
| |
| When a Rivet page (.rvt) is requested via HTTP, Apache hands the |
| request to the module, which checks to see if a cached |
| version of the page's bytecode is available. If it isn't, the page is |
| read into memory, and parsed into a script, which is then executed, as |
| well as stored in the cache. If it is available in cache, the script |
| is executed directly, without having to touch the disk at all to |
| reload it. Naturally the cache size is configurable via an Apache |
| directive. The parser works by transforming chunks of HTML, that is, |
| everything outside of the $<$? ?$>$ tags, into large \textbf{puts} |
| statements, which can then be executed along with the rest of the Tcl |
| code as one large script. |
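| |
| The transformation and the cache lookup described above can be |
| sketched in a few lines of Tcl. This is an illustration only: the |
| real parser and cache live in Rivet's C code and work on byte-compiled |
| scripts, whereas this sketch caches plain script text. The names |
| ``template\_to\_script'', ``run\_page'' and ``page\_cache'' are |
| hypothetical. |
| |
| \begin{verbatim} |
| # Turn a page template into one Tcl script. |
| proc template_to_script {template} { |
|     set script "" |
|     while {[set start [string first "<?" $template]] >= 0} { |
|         set end [string first "?>" $template $start] |
|         if {$end < 0} { error "unterminated <? section" } |
|         # HTML before the delimiter becomes a puts of a literal. |
|         set html [string range $template 0 [expr {$start - 1}]] |
|         if {$html ne ""} { |
|             append script [list puts -nonewline $html] "\n" |
|         } |
|         # Code between the delimiters is copied through unchanged. |
|         append script [string range $template \ |
|             [expr {$start + 2}] [expr {$end - 1}]] "\n" |
|         set template [string range $template [expr {$end + 2}] end] |
|     } |
|     if {$template ne ""} { |
|         append script [list puts -nonewline $template] "\n" |
|     } |
|     return $script |
| } |
| |
| # Parse a page only the first time it is seen, then reuse the script. |
| proc run_page {name template} { |
|     global page_cache |
|     if {![info exists page_cache($name)]} { |
|         set page_cache($name) [template_to_script $template] |
|     } |
|     uplevel #0 $page_cache($name) |
| } |
| |
| # Prints <b>2</b> |
| run_page hello {<b><? puts -nonewline [expr {1 + 1}] ?></b>} |
| \end{verbatim} |
| |
| Building each \textbf{puts} command with \textbf{list} quotes the HTML |
| safely, so that braces, brackets, or dollar signs in the markup are |
| not interpreted as Tcl. |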
| |
| \subsection{Tcl Channels} |
| One of the especially interesting things about the Rivet |
| implementation is the use of Tcl channels to send output to Apache. |
| What this means is that, instead of having to use a special, custom |
| Tcl command to send text to the web, we can use regular \textbf{puts} |
| in Rivet code, which means it is even easier to take normal, |
| unmodified Tcl scripts, and have them ``just work'' on the web. |
| |
| Tcl channels are input/output drivers that can be created and |
| manipulated at the C level. They let you supply custom \textbf{Close}, |
| \textbf{Input}, \textbf{Output}, \textbf{Seek}, \textbf{Set} and |
| \textbf{Get} Option, \textbf{GetHandle}, \textbf{Block}, |
| \textbf{Flush}, and \textbf{Event} handler functions. |
| |
| Here is an example of the output function used to pass data from Tcl |
| to Apache. |
| |
| \begin{verbatim} |
| static int |
| outputproc(ClientData instancedata, char *buf, |
|            int toWrite, int *errorCodePtr) |
| { |
|     rivet_server_conf *rsc = (rivet_server_conf *)instancedata; |
|     rivet_interp_globals *globals = |
|         Tcl_GetAssocData(rsc->server_interp, "rivet", NULL); |
| |
|     TclWeb_PrintHeaders(globals->req); |
|     if (globals->req->content_sent == 0) |
|     { |
|         ap_rwrite(buf, toWrite, globals->r); |
|         ap_rflush(globals->r); |
|     } |
|     return toWrite; |
| } |
| \end{verbatim} |
| |
| It's really very simple! Most of the function is about getting the |
| correct context to operate on, and then ``buf'' and ``toWrite'' are |
| passed directly down to the Apache layer. |
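| |
| The same idea can be tried without writing any C. Tcl 8.5 and later |
| (newer than the version this paper was written against) offer |
| script-level ``reflected'' channels through \textbf{chan create}. The |
| sketch below is an analogue of the output function above, not Rivet |
| code: it collects output in a variable instead of handing it to |
| Apache, and the namespace name ``collector'' is hypothetical. |
| |
| \begin{verbatim} |
| namespace eval collector { |
|     variable buffer "" |
| |
|     # Tell Tcl which handler methods this channel implements. |
|     proc initialize {chan mode} { |
|         return {initialize finalize watch write} |
|     } |
|     proc finalize {chan} {} |
|     proc watch {chan events} {} |
| |
|     # The counterpart of outputproc: consume the data and report |
|     # how many bytes were accepted. |
|     proc write {chan data} { |
|         variable buffer |
|         append buffer $data |
|         return [string length $data] |
|     } |
| |
|     namespace export initialize finalize watch write |
|     namespace ensemble create |
| } |
| |
| set ch [chan create write collector] |
| puts $ch "Hello from a custom channel" |
| flush $ch |
| close $ch |
| |
| # The text written with a regular puts ended up in our buffer. |
| puts -nonewline $collector::buffer |
| \end{verbatim} |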
| |
| \subsection{Apache's C Interface} |
| |
| This is another topic that merits hours of discussion on its own, so |
| we are going to stick with a few highlights that have been of |
| particular help to us when working with Apache code. |
| |
| It would be hard not to talk about ``pools'', which are Apache's |
| memory management system for modules, and are very convenient for the |
| module author. Pools take care of freeing memory after you have used |
| it for tasks during the request (well, at other points too, but we are |
| aiming for a simple talk!). There are also lots of convenient string |
| functions that take advantage of pools to make life easier. |
| |
| It's not part of Apache proper but the apreq code is very useful for |
| handling data received from users - cookies, GET and POST variables. |
| Rivet uses apreq to facilitate the transformation of this raw |
| information into something that Tcl can turn into a variable. |
| |
| apxs: while it's certainly not a very exciting aspect of Apache, |
| having a build system that tells you everything it knows about how to |
| build things on your platform is very convenient. Rivet uses its own |
| build system to get out of the autoconf/make pit. |
| |
| \section{Conclusion} |
| |
| With all of these APIs to work with, uniting them has been both very |
| pleasant, and not all that difficult. In all the Rivet .c and .h |
| files, there are around 5500 lines, total. Not bad, considering Tcl is |
| 150K lines, and Apache 1.3 is around 120K! |
| |
| The ability to leverage so much power, linking together two |
| different, flexible, multipurpose systems, is a testament to the |
| strength and adaptability of both Tcl and Apache. |
| |
| More information is available at the following sites: |
| |
| \begin{itemize} |
| \item http://www.tcl.tk - The Tcl web site. |
| \item http://wiki.tcl.tk - The Tcl Wiki. |
| \item news:comp.lang.tcl - The Tcl newsgroup. |
| \item I can be reached for questions at davidw@apache.org |
| \end{itemize} |
| |
| \end{document} |
| |