| <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.2//EN"> |
| <html> |
| <head> |
| <title>HTMLArea Spell Checker</title> |
| </head> |
| |
| <body> |
| <h1>HTMLArea Spell Checker</h1> |
| |
| <p>The HTMLArea Spell Checker subsystem consists of the following |
| files:</p> |
| |
| <ul> |
| |
| <li>spell-checker.js — the spell checker plugin interface for |
| HTMLArea</li> |
| |
| <li>spell-checker-ui.html — the HTML code for the user |
| interface</li> |
| |
| <li>spell-checker-ui.js — functionality of the user |
| interface</li> |
| |
| <li>spell-checker-logic.cgi — Perl CGI script that checks a text |
| given through POST for spelling errors</li> |
| |
| <li>spell-checker-style.css — style for mispelled words</li> |
| |
| <li>lang/en.js — main language file (English).</li> |
| |
| </ul> |
| |
| <h2>Process overview</h2> |
| |
| <p> |
| When an end-user clicks the "spell-check" button in the HTMLArea |
| editor, a new window is opened with the URL of "spell-check-ui.html". |
| This window initializes itself with the text found in the editor (uses |
| <tt>window.opener.SpellChecker.editor</tt> global variable) and it |
| submits the text to the server-side script "spell-check-logic.cgi". |
| The target of the FORM is an inline frame which is used both to |
| display the text and correcting. |
| </p> |
| |
| <p> |
| Further, spell-check-logic.cgi calls Aspell for each portion of plain |
| text found in the given HTML. It rebuilds an HTML file that contains |
| clear marks of which words are incorrect, along with suggestions for |
| each of them. This file is then loaded in the inline frame. Upon |
| loading, a JavaScript function from "spell-check-ui.js" is called. |
| This function will retrieve all mispelled words from the HTML of the |
| iframe and will setup the user interface so that it allows correction. |
| </p> |
| |
| <h2>The server-side script (spell-check-logic.cgi)</h2> |
| |
| <p> |
| <strong>Unicode safety</strong> — the program <em>is</em> |
| Unicode safe. HTML entities are expanded into their corresponding |
| Unicode characters. These characters will be matched as part of the |
| word passed to Aspell. All texts passed to Aspell are in Unicode |
| (when appropriate). <strike>However, Aspell seems to not support Unicode |
| yet (<a |
| href="http://mail.gnu.org/archive/html/aspell-user/2000-11/msg00007.html">thread concerning Aspell and Unicode</a>). |
| This mean that words containing Unicode |
| characters that are not in 0..255 are likely to be reported as "mispelled" by Aspell.</strike> |
| </p> |
| |
| <p> |
| <strong style="font-variant: small-caps; color: |
| red;">Update:</strong> though I've never seen it mentioned |
| anywhere, it looks that Aspell <em>does</em>, in fact, speak |
| Unicode. Or else, maybe <code>Text::Aspell</code> does |
| transparent conversion; anyway, this new version of our |
| SpellChecker plugin is, as tests show so far, fully |
| Unicode-safe... well, probably the <em>only</em> freeware |
| Web-based spell-checker which happens to have Unicode support. |
| </p> |
| |
| <p> |
| The Perl Unicode manual (man perluniintro) states: |
| </p> |
| |
| <blockquote> |
| <em> |
| Starting from Perl 5.6.0, Perl has had the capacity to handle Unicode |
| natively. Perl 5.8.0, however, is the first recommended release for |
| serious Unicode work. The maintenance release 5.6.1 fixed many of the |
| problems of the initial Unicode implementation, but for example regular |
| expressions still do not work with Unicode in 5.6.1. |
| </em> |
| </blockquote> |
| |
| <p>In other words, do <em>not</em> assume that this script is |
| Unicode-safe on Perl interpreters older than 5.8.0.</p> |
| |
| <p>The following Perl modules are required:</p> |
| |
| <ul> |
| <li><a href="http://search.cpan.org/search?query=Text%3A%3AAspell&mode=all" target="_blank">Text::Aspell</a></li> |
| <li><a href="http://search.cpan.org/search?query=XML%3A%3ADOM&mode=all" target="_blank">XML::DOM</a></li> |
| <li><a href="http://search.cpan.org/search?query=CGI&mode=all" target="_blank">CGI</a></li> |
| </ul> |
| |
| <p>Of these, only Text::Aspell might need to be installed manually. The |
| others are likely to be available by default in most Perl distributions.</p> |
| |
| <hr /> |
| <address><a href="http://dynarch.com/mishoo/">Mihai Bazon</a></address> |
| <!-- Created: Thu Jul 17 13:22:27 EEST 2003 --> |
| <!-- hhmts start --> Last modified: Fri Jan 30 19:14:11 EET 2004 <!-- hhmts end --> |
| <!-- doc-lang: English --> |
| </body> |
| </html> |