| <!-- doc/src/sgml/sources.sgml --> |
| |
| <chapter id="source"> |
| <title>PostgreSQL Coding Conventions</title> |
| |
| <sect1 id="source-format"> |
| <title>Formatting</title> |
| |
| <para> |
| Source code formatting uses 4-column tab spacing, with |
| tabs preserved (i.e., tabs are not expanded to spaces). |
| Each logical indentation level is one additional tab stop. |
| </para> |
| |
| <para> |
| Layout rules (brace positioning, etc.) follow BSD conventions. In |
| particular, curly braces for the controlled blocks of <literal>if</literal>, |
| <literal>while</literal>, <literal>switch</literal>, etc. go on their own lines. |
| </para> |
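| |
| <para> |
| As a minimal sketch of the layout rules above, the following stand-alone |
| function places every brace of a controlled block on its own line and |
| indents each nesting level by one tab. The function name |
| <function>count_positive</function> is hypothetical and chosen only for |
| illustration; it is not part of the <productname>PostgreSQL</productname> |
| sources. |
| </para> |

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the positive entries of an array.  Each brace that opens or
 * closes a controlled block sits on its own line, and each nesting
 * level is indented by one tab.
 */
int
count_positive(const int *values, size_t nvalues)
{
	int count = 0;
	size_t i;

	/* A NULL array is acceptable only when it is empty. */
	assert(values != NULL || nvalues == 0);

	for (i = 0; i < nvalues; i++)
	{
		if (values[i] > 0)
		{
			count++;
		}
	}

	return count;
}
```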
| |
| <para> |
| Limit line lengths so that the code is readable in an 80-column window. |
| (This doesn't mean that you must never go past 80 columns. For instance, |
| breaking a long error message string in arbitrary places just to keep the |
| code within 80 columns is probably not a net gain in readability.) |
| </para> |
| |
| <para> |
| To maintain a consistent coding style, do not use C++ style comments |
| (<literal>//</literal> comments). <application>pgindent</application> |
| will replace them with <literal>/* ... */</literal>. |
| </para> |
| |
| <para> |
| The preferred style for multi-line comment blocks is |
| <programlisting> |
| /* |
| * comment text begins here |
| * and continues here |
| */ |
| </programlisting> |
| Note that comment blocks that begin in column 1 will be preserved as-is |
| by <application>pgindent</application>, but it will re-flow indented comment blocks |
| as though they were plain text. If you want to preserve the line breaks |
| in an indented block, add dashes like this: |
| <programlisting> |
| /*---------- |
| * comment text begins here |
| * and continues here |
| *---------- |
| */ |
| </programlisting> |
| </para> |
| |
| <para> |
| While submitted patches do not absolutely have to follow these formatting |
| rules, it's a good idea to do so. Your code will get run through |
| <application>pgindent</application> before the next release, so there's no point in |
| making it look nice under some other set of formatting conventions. |
| A good rule of thumb for patches is <quote>make the new code look like |
| the existing code around it</quote>. |
| </para> |
| |
| <para> |
| The <filename>src/tools</filename> directory contains sample settings |
| files that can be used with the <productname>emacs</productname>, |
| <productname>xemacs</productname> or <productname>vim</productname> |
| editors to help ensure that they format code according to these |
| conventions. |
| </para> |
| |
| <para> |
| The text browsing tools <application>more</application> and |
| <application>less</application> can be invoked as: |
| <programlisting> |
| more -x4 |
| less -x4 |
| </programlisting> |
| to make them show tabs appropriately. |
| </para> |
| </sect1> |
| |
| <sect1 id="error-message-reporting"> |
| <title>Reporting Errors Within the Server</title> |
| |
| <indexterm> |
| <primary>ereport</primary> |
| </indexterm> |
| <indexterm> |
| <primary>elog</primary> |
| </indexterm> |
| |
| <para> |
| Error, warning, and log messages generated within the server code |
| should be created using <function>ereport</function>, or its older cousin |
| <function>elog</function>. The use of these functions is complex enough to |
| require some explanation. |
| </para> |
| |
| <para> |
| There are two required elements for every message: a severity level |
| (ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary |
| message text. In addition there are optional elements, the most |
| common of which is an error identifier code that follows the SQL spec's |
| SQLSTATE conventions. |
| <function>ereport</function> itself is just a shell macro that exists |
| mainly for the syntactic convenience of making message generation |
| look like a single function call in the C source code. The only parameter |
| accepted directly by <function>ereport</function> is the severity level. |
| The primary message text and any optional message elements are |
| generated by calling auxiliary functions, such as <function>errmsg</function>, |
| within the <function>ereport</function> call. |
| </para> |
| |
| <para> |
| A typical call to <function>ereport</function> might look like this: |
| <programlisting> |
| ereport(ERROR, |
| errcode(ERRCODE_DIVISION_BY_ZERO), |
| errmsg("division by zero")); |
| </programlisting> |
| This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill |
| error). The <function>errcode</function> call specifies the SQLSTATE error code |
| using a macro defined in <filename>src/include/utils/errcodes.h</filename>. The |
| <function>errmsg</function> call provides the primary message text. |
| </para> |
| |
| <para> |
| You will also frequently see this older style, with an extra set of |
| parentheses surrounding the auxiliary function calls: |
| <programlisting> |
| ereport(ERROR, |
| (errcode(ERRCODE_DIVISION_BY_ZERO), |
| errmsg("division by zero"))); |
| </programlisting> |
| The extra parentheses were required |
| before <productname>PostgreSQL</productname> version 12, but are now |
| optional. |
| </para> |
| |
| <para> |
| Here is a more complex example: |
| <programlisting> |
| ereport(ERROR, |
| errcode(ERRCODE_AMBIGUOUS_FUNCTION), |
| errmsg("function %s is not unique", |
| func_signature_string(funcname, nargs, |
| NIL, actual_arg_types)), |
| errhint("Unable to choose a best candidate function. " |
| "You might need to add explicit typecasts.")); |
| </programlisting> |
| This illustrates the use of format codes to embed run-time values into |
| a message text. Also, an optional <quote>hint</quote> message is provided. |
| The auxiliary function calls can be written in any order, but |
| conventionally <function>errcode</function> |
| and <function>errmsg</function> appear first. |
| </para> |
| |
| <para> |
| If the severity level is <literal>ERROR</literal> or higher, |
| <function>ereport</function> aborts execution of the current query |
| and does not return to the caller. If the severity level is |
| lower than <literal>ERROR</literal>, <function>ereport</function> returns normally. |
| </para> |
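| |
| <para> |
| The difference between returning and not returning can be pictured with a |
| toy model in plain C. Everything in it (<function>toy_report</function>, |
| <function>toy_run_query</function>, and the <literal>TOY_</literal> levels) |
| is hypothetical, and the use of <function>setjmp</function> and |
| <function>longjmp</function> is an assumption made for the sketch, not a |
| description of the server's internals. |
| </para> |

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical severity levels; only their ordering matters here. */
typedef enum ToyLevel
{
	TOY_NOTICE,
	TOY_WARNING,
	TOY_ERROR
} ToyLevel;

static jmp_buf toy_recovery_point;

/*
 * Toy stand-in for a reporting call: print the message, then either
 * return normally (levels below TOY_ERROR) or jump to the recovery
 * point (TOY_ERROR).
 */
void
toy_report(ToyLevel level, const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	if (level >= TOY_ERROR)
		longjmp(toy_recovery_point, 1);
}

/*
 * Run a "query" that reports at the given level.  Returns 1 if control
 * came back to the statement after toy_report(), 0 if it was aborted.
 */
int
toy_run_query(ToyLevel level)
{
	if (setjmp(toy_recovery_point) != 0)
		return 0;		/* aborted: toy_report() did not return */

	toy_report(level, "something happened");
	return 1;			/* reached only for levels below TOY_ERROR */
}
```

| <para> |
| The practical consequence for callers is that code placed after a report |
| at <literal>ERROR</literal> or higher is never reached, so any cleanup has |
| to happen before the report or through a separate mechanism. |
| </para> |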
| |
| <para> |
| The available auxiliary routines for <function>ereport</function> are: |
| <itemizedlist> |
| <listitem> |
| <para> |
| <function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier |
| code for the condition. If this routine is not called, the error |
| identifier defaults to |
| <literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is |
| <literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the |
| error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal> |
| and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>. |
| While these defaults are often convenient, always consider whether they |
| are appropriate before omitting the <function>errcode()</function> call. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errmsg(const char *msg, ...)</function> specifies the primary error |
| message text, and possibly run-time values to insert into it. Insertions |
| are specified by <function>sprintf</function>-style format codes. In addition to |
| the standard format codes accepted by <function>sprintf</function>, the format |
| code <literal>%m</literal> can be used to insert the error message returned |
| by <function>strerror</function> for the current value of <literal>errno</literal>. |
| <footnote> |
| <para> |
| That is, the value that was current when the <function>ereport</function> call |
| was reached; changes of <literal>errno</literal> within the auxiliary reporting |
| routines will not affect it. That would not be true if you were to |
| write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s |
| parameter list; accordingly, do not do so. |
| </para> |
| </footnote> |
| <literal>%m</literal> does not require any |
| corresponding entry in the parameter list for <function>errmsg</function>. |
| Note that the message string will be run through <function>gettext</function> |
| for possible localization before format codes are processed. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errmsg_internal(const char *msg, ...)</function> is the same as |
| <function>errmsg</function>, except that the message string will not be |
| translated nor included in the internationalization message dictionary. |
| This should be used for <quote>cannot happen</quote> cases that are probably |
| not worth expending translation effort on. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errmsg_plural(const char *fmt_singular, const char *fmt_plural, |
| unsigned long n, ...)</function> is like <function>errmsg</function>, but with |
| support for various plural forms of the message. |
| <replaceable>fmt_singular</replaceable> is the English singular format, |
| <replaceable>fmt_plural</replaceable> is the English plural format, |
| <replaceable>n</replaceable> is the integer value that determines which plural |
| form is needed, and the remaining arguments are formatted according |
| to the selected format string. For more information see |
| <xref linkend="nls-guidelines"/>. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errdetail(const char *msg, ...)</function> supplies an optional |
| <quote>detail</quote> message; this is to be used when there is additional |
| information that seems inappropriate to put in the primary message. |
| The message string is processed in just the same way as for |
| <function>errmsg</function>. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errdetail_internal(const char *msg, ...)</function> is the same |
| as <function>errdetail</function>, except that the message string will not be |
| translated nor included in the internationalization message dictionary. |
| This should be used for detail messages that are not worth expending |
| translation effort on, for instance because they are too technical to be |
| useful to most users. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errdetail_plural(const char *fmt_singular, const char *fmt_plural, |
| unsigned long n, ...)</function> is like <function>errdetail</function>, but with |
| support for various plural forms of the message. |
| For more information see <xref linkend="nls-guidelines"/>. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errdetail_log(const char *msg, ...)</function> is the same as |
| <function>errdetail</function> except that this string goes only to the server |
| log, never to the client. If both <function>errdetail</function> (or one of |
| its equivalents above) and |
| <function>errdetail_log</function> are used then one string goes to the client |
| and the other to the log. This is useful for error details that are |
| too security-sensitive or too bulky to include in the report |
| sent to the client. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errdetail_log_plural(const char *fmt_singular, const char |
| *fmt_plural, unsigned long n, ...)</function> is like |
| <function>errdetail_log</function>, but with support for various plural forms of |
| the message. |
| For more information see <xref linkend="nls-guidelines"/>. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errhint(const char *msg, ...)</function> supplies an optional |
| <quote>hint</quote> message; this is to be used when offering suggestions |
| about how to fix the problem, as opposed to factual details about |
| what went wrong. |
| The message string is processed in just the same way as for |
| <function>errmsg</function>. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errhint_plural(const char *fmt_singular, const char *fmt_plural, |
| unsigned long n, ...)</function> is like <function>errhint</function>, but with |
| support for various plural forms of the message. |
| For more information see <xref linkend="nls-guidelines"/>. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errcontext(const char *msg, ...)</function> is not normally called |
| directly from an <function>ereport</function> message site; rather it is used |
| in <literal>error_context_stack</literal> callback functions to provide |
| information about the context in which an error occurred, such as the |
| current location in a PL function. |
| The message string is processed in just the same way as for |
| <function>errmsg</function>. Unlike the other auxiliary functions, this can |
| be called more than once per <function>ereport</function> call; the successive |
| strings thus supplied are concatenated with separating newlines. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errposition(int cursorpos)</function> specifies the textual location |
| of an error within a query string. Currently it is only useful for |
| errors detected in the lexical and syntactic analysis phases of |
| query processing. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errtable(Relation rel)</function> specifies a relation whose |
| name and schema name should be included as auxiliary fields in the error |
| report. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errtablecol(Relation rel, int attnum)</function> specifies |
| a column whose name, table name, and schema name should be included as |
| auxiliary fields in the error report. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errtableconstraint(Relation rel, const char *conname)</function> |
| specifies a table constraint whose name, table name, and schema name |
| should be included as auxiliary fields in the error report. Indexes |
| should be considered to be constraints for this purpose, whether or |
| not they have an associated <structname>pg_constraint</structname> entry. Be |
| careful to pass the underlying heap relation, not the index itself, as |
| <literal>rel</literal>. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errdatatype(Oid datatypeOid)</function> specifies a data |
| type whose name and schema name should be included as auxiliary fields |
| in the error report. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errdomainconstraint(Oid datatypeOid, const char *conname)</function> |
| specifies a domain constraint whose name, domain name, and schema name |
| should be included as auxiliary fields in the error report. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errcode_for_file_access()</function> is a convenience function that |
| selects an appropriate SQLSTATE error identifier for a failure in a |
| file-access-related system call. It uses the saved |
| <literal>errno</literal> to determine which error code to generate. |
| Usually this should be used in combination with <literal>%m</literal> in the |
| primary error message text. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errcode_for_socket_access()</function> is a convenience function that |
| selects an appropriate SQLSTATE error identifier for a failure in a |
| socket-related system call. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errhidestmt(bool hide_stmt)</function> can be called to specify |
| suppression of the <literal>STATEMENT:</literal> portion of a message in the |
| postmaster log. Generally this is appropriate if the message text |
| includes the current statement already. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| <function>errhidecontext(bool hide_ctx)</function> can be called to |
| specify suppression of the <literal>CONTEXT:</literal> portion of a message in |
| the postmaster log. This should only be used for verbose debugging |
| messages where the repeated inclusion of context would bloat the log |
| too much. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
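| |
| <para> |
| The footnote on <literal>%m</literal> above warns that |
| <literal>errno</literal> can change between the failing call and the point |
| where the message is formatted. The following plain C sketch illustrates |
| the hazard and the usual remedy of copying <literal>errno</literal> first. |
| The names <function>noisy_helper</function> and |
| <function>describe_open_failure</function> are hypothetical; inside the |
| server, <literal>%m</literal> makes this bookkeeping unnecessary. |
| </para> |

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper that clobbers errno, standing in for any work done
 * between the failing call and the formatting of the message.
 */
static void
noisy_helper(void)
{
	errno = 0;
}

/*
 * Format a "could not open file" message into buf.  errno is copied into
 * a local variable first, so later calls cannot change the reported reason.
 */
void
describe_open_failure(char *buf, size_t buflen, const char *path)
{
	int saved_errno = errno;	/* capture before doing anything else */

	noisy_helper();				/* errno is now 0, saved_errno is not */
	snprintf(buf, buflen, "could not open file \"%s\": %s",
			 path, strerror(saved_errno));
}
```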
| |
| <note> |
| <para> |
| At most one of the functions <function>errtable</function>, |
| <function>errtablecol</function>, <function>errtableconstraint</function>, |
| <function>errdatatype</function>, or <function>errdomainconstraint</function> should |
| be used in an <function>ereport</function> call. These functions exist to |
| allow applications to extract the name of a database object associated |
| with the error condition without having to examine the |
| potentially-localized error message text. |
| These functions should be used in error reports for which it's likely |
| that applications would wish to have automatic error handling. As of |
| <productname>PostgreSQL</productname> 9.3, complete coverage exists only for |
| errors in SQLSTATE class 23 (integrity constraint violation), but this |
| is likely to be expanded in the future. |
| </para> |
| </note> |
| |
| <para> |
| There is an older function <function>elog</function> that is still heavily used. |
| An <function>elog</function> call: |
| <programlisting> |
| elog(level, "format string", ...); |
| </programlisting> |
| is exactly equivalent to: |
| <programlisting> |
| ereport(level, errmsg_internal("format string", ...)); |
| </programlisting> |
| Notice that the SQLSTATE error code is always defaulted, and the message |
| string is not subject to translation. |
| Therefore, <function>elog</function> should be used only for internal errors and |
| low-level debug logging. Any message that is likely to be of interest to |
| ordinary users should go through <function>ereport</function>. Nonetheless, |
| there are enough internal <quote>cannot happen</quote> error checks in the |
| system that <function>elog</function> is still widely used; it is preferred for |
| those messages for its notational simplicity. |
| </para> |
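| |
| <para> |
| Because <function>elog</function> always takes the default error code, it |
| helps to keep the defaulting rule described under |
| <function>errcode</function> in mind. The sketch below restates that rule |
| as a stand-alone C function. The <literal>SKETCH_</literal> severity |
| levels and the function <function>default_condition_name</function> are |
| hypothetical; only the three returned condition names come from the text |
| above. |
| </para> |

```c
#include <assert.h>
#include <string.h>

/* Hypothetical severity levels, ordered from least to most severe. */
typedef enum SketchLevel
{
	SKETCH_DEBUG,
	SKETCH_LOG,
	SKETCH_NOTICE,
	SKETCH_WARNING,
	SKETCH_ERROR,
	SKETCH_FATAL,
	SKETCH_PANIC
} SketchLevel;

/*
 * Return the name of the condition used when no error code is given:
 * internal error for ERROR and above, warning for WARNING, and
 * successful completion for everything below WARNING.
 */
const char *
default_condition_name(SketchLevel level)
{
	if (level >= SKETCH_ERROR)
		return "ERRCODE_INTERNAL_ERROR";
	if (level == SKETCH_WARNING)
		return "ERRCODE_WARNING";
	return "ERRCODE_SUCCESSFUL_COMPLETION";
}
```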
| |
| <para> |
| Advice about writing good error messages can be found in |
| <xref linkend="error-style-guide"/>. |
| </para> |
| </sect1> |
| |
| <sect1 id="error-style-guide"> |
| <title>Error Message Style Guide</title> |
| |
| <para> |
| This style guide is offered in the hope of maintaining a consistent, |
| user-friendly style throughout all the messages generated by |
| <productname>PostgreSQL</productname>. |
| </para> |
| |
| <simplesect> |
| <title>What Goes Where</title> |
| |
| <para> |
| The primary message should be short, factual, and avoid reference to |
| implementation details such as specific function names. |
| <quote>Short</quote> means <quote>should fit on one line under normal |
| conditions</quote>. Use a detail message if needed to keep the primary |
| message short, or if you feel a need to mention implementation details |
| such as the particular system call that failed. Both primary and detail |
| messages should be factual. Use a hint message for suggestions about what |
| to do to fix the problem, especially if the suggestion might not always be |
| applicable. |
| </para> |
| |
| <para> |
| For example, instead of: |
| <programlisting> |
| IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m |
| (plus a long addendum that is basically a hint) |
| </programlisting> |
| write: |
| <programlisting> |
| Primary: could not create shared memory segment: %m |
| Detail: Failed syscall was shmget(key=%d, size=%u, 0%o). |
| Hint: the addendum |
| </programlisting> |
| </para> |
| |
| <para> |
| Rationale: keeping the primary message short helps keep it to the point, |
| and lets clients lay out screen space on the assumption that one line is |
| enough for error messages. Detail and hint messages can be relegated to a |
| verbose mode, or perhaps a pop-up error-details window. Also, details and |
| hints would normally be suppressed from the server log to save |
| space. Reference to implementation details is best avoided since users |
| aren't expected to know the details. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Formatting</title> |
| |
| <para> |
| Don't put any specific assumptions about formatting into the message |
| texts. Expect clients and the server log to wrap lines to fit their own |
| needs. In long messages, newline characters (<literal>\n</literal>) can be used to indicate |
| suggested paragraph breaks. Don't end a message with a newline. Don't |
| use tabs or other formatting characters. (In error context displays, |
| newlines are automatically added to separate levels of context such as |
| function calls.) |
| </para> |
| |
| <para> |
| Rationale: Messages are not necessarily displayed on terminal-type |
| displays. In GUI displays or browsers these formatting instructions are |
| at best ignored. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Quotation Marks</title> |
| |
| <para> |
| English text should use double quotes when quoting is appropriate. |
| Text in other languages should consistently use one kind of quotes that is |
| consistent with publishing customs and computer output of other programs. |
| </para> |
| |
| <para> |
| Rationale: The choice of double quotes over single quotes is somewhat |
| arbitrary, but tends to be the preferred use. Some have suggested |
| choosing the kind of quotes depending on the type of object according to |
| SQL conventions (namely, strings single quoted, identifiers double |
| quoted). But this is a language-internal technical issue that many users |
| aren't even familiar with, it won't scale to other kinds of quoted terms, |
| it doesn't translate to other languages, and it's pretty pointless, too. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Use of Quotes</title> |
| |
| <para> |
| Always use quotes to delimit file names, user-supplied identifiers, and |
| other variables that might contain words. Do not use them to mark up |
| variables that will not contain words (for example, operator names). |
| </para> |
| |
| <para> |
| There are functions in the backend that will double-quote their own output |
| as needed (for example, <function>format_type_be()</function>). Do not put |
| additional quotes around the output of such functions. |
| </para> |
| |
| <para> |
| Rationale: Objects can have names that create ambiguity when embedded in a |
| message. Be consistent about denoting where a plugged-in name starts and |
| ends. But don't clutter messages with unnecessary or duplicate quote |
| marks. |
| </para> |
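| |
| <para> |
| A small sketch of the quoting rule, written as plain C so that it can be |
| compiled outside the server: a user-supplied name is wrapped in double |
| quotes, while an operator name is not. Both function names are |
| hypothetical and the message wording is only illustrative. |
| </para> |

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A relation name might contain words or spaces, so it is quoted. */
int
format_missing_relation(char *buf, size_t buflen, const char *relname)
{
	return snprintf(buf, buflen, "relation \"%s\" does not exist", relname);
}

/* An operator name will not contain words, so it is left unquoted. */
int
format_missing_operator(char *buf, size_t buflen, const char *opname)
{
	return snprintf(buf, buflen, "operator does not exist: %s", opname);
}
```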
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Grammar and Punctuation</title> |
| |
| <para> |
| The rules are different for primary error messages and for detail/hint |
| messages: |
| </para> |
| |
| <para> |
| Primary error messages: Do not capitalize the first letter. Do not end a |
| message with a period. Do not even think about ending a message with an |
| exclamation point. |
| </para> |
| |
| <para> |
| Detail and hint messages: Use complete sentences, and end each with |
| a period. Capitalize the first word of sentences. Put two spaces after |
| the period if another sentence follows (for English text; might be |
| inappropriate in other languages). |
| </para> |
| |
| <para> |
| Error context strings: Do not capitalize the first letter and do |
| not end the string with a period. Context strings should normally |
| not be complete sentences. |
| </para> |
| |
| <para> |
| Rationale: Avoiding punctuation makes it easier for client applications to |
| embed the message into a variety of grammatical contexts. Often, primary |
| messages are not grammatically complete sentences anyway. (And if they're |
| long enough to be more than one sentence, they should be split into |
| primary and detail parts.) However, detail and hint messages are longer |
| and might need to include multiple sentences. For consistency, they should |
| follow complete-sentence style even when there's only one sentence. |
| </para> |
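| |
| <para> |
| The capitalization and punctuation rules above are mechanical enough to |
| check in code. The following is a rough sketch in plain C; the function |
| names are hypothetical and no such checker is claimed to exist in the |
| <productname>PostgreSQL</productname> tree. It deliberately ignores |
| legitimate exceptions, such as a primary message that begins with an |
| upper-case SQL key word. |
| </para> |

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Primary message: no leading capital, no trailing '.' or '!'. */
bool
primary_message_style_ok(const char *msg)
{
	size_t len = strlen(msg);

	if (len == 0)
		return false;
	if (isupper((unsigned char) msg[0]))
		return false;
	if (msg[len - 1] == '.' || msg[len - 1] == '!')
		return false;
	return true;
}

/* Detail or hint message: leading capital and trailing period. */
bool
detail_message_style_ok(const char *msg)
{
	size_t len = strlen(msg);

	if (len == 0)
		return false;
	return isupper((unsigned char) msg[0]) && msg[len - 1] == '.';
}
```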
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Upper Case vs. Lower Case</title> |
| |
| <para> |
| Use lower case for message wording, including the first letter of a |
| primary error message. Use upper case for SQL commands and key words if |
| they appear in the message. |
| </para> |
| |
| <para> |
| Rationale: It's easier to make everything look more consistent this |
| way, since some messages are complete sentences and some not. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Avoid Passive Voice</title> |
| |
| <para> |
| Use the active voice. Use complete sentences when there is an acting |
| subject (<quote>A could not do B</quote>). Use telegram style without |
| subject if the subject would be the program itself; do not use |
| <quote>I</quote> for the program. |
| </para> |
| |
| <para> |
| Rationale: The program is not human. Don't pretend otherwise. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Present vs. Past Tense</title> |
| |
| <para> |
| Use past tense if an attempt to do something failed, but could perhaps |
| succeed next time (perhaps after fixing some problem). Use present tense |
| if the failure is certainly permanent. |
| </para> |
| |
| <para> |
| There is a nontrivial semantic difference between sentences of the form: |
| <programlisting> |
| could not open file "%s": %m |
| </programlisting> |
| and: |
| <programlisting> |
| cannot open file "%s" |
| </programlisting> |
| The first one means that the attempt to open the file failed. The |
| message should give a reason, such as <quote>disk full</quote> or |
| <quote>file doesn't exist</quote>. The past tense is appropriate because |
| next time the disk might not be full anymore or the file in question might |
| exist. |
| </para> |
| |
| <para> |
| The second form indicates that the functionality of opening the named file |
| does not exist at all in the program, or that it's conceptually |
| impossible. The present tense is appropriate because the condition will |
| persist indefinitely. |
| </para> |
| |
| <para> |
| Rationale: Granted, the average user will not be able to draw great |
| conclusions merely from the tense of the message, but since the language |
| provides us with a grammar we should use it correctly. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Type of the Object</title> |
| |
| <para> |
| When citing the name of an object, state what kind of object it is. |
| </para> |
| |
| <para> |
| Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote> |
| refers to. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Brackets</title> |
| |
| <para> |
| Square brackets are only to be used (1) in command synopses to denote |
| optional arguments, or (2) to denote an array subscript. |
| </para> |
| |
| <para> |
| Rationale: Anything else does not correspond to widely known customary |
| usage and will confuse people. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Assembling Error Messages</title> |
| |
| <para> |
| When a message includes text that is generated elsewhere, embed it in |
| this style: |
| <programlisting> |
| could not open file %s: %m |
| </programlisting> |
| </para> |
| |
| <para> |
| Rationale: It would be difficult to account for all possible error codes |
| to paste this into a single smooth sentence, so some sort of punctuation |
| is needed. Putting the embedded text in parentheses has also been |
| suggested, but it's unnatural if the embedded text is likely to be the |
| most important part of the message, as is often the case. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Reasons for Errors</title> |
| |
| <para> |
| Messages should always state the reason why an error occurred. |
| For example: |
| <programlisting> |
| BAD: could not open file %s |
| BETTER: could not open file %s (I/O failure) |
| </programlisting> |
| If no reason is known, you had better fix the code. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Function Names</title> |
| |
| <para> |
| Don't include the name of the reporting routine in the error text. We have |
| other mechanisms for finding that out when needed, and for most users it's |
| not helpful information. If the error text doesn't make as much sense |
| without the function name, reword it. |
| <programlisting> |
| BAD: pg_strtoint32: error in "z": cannot parse "z" |
| BETTER: invalid input syntax for type integer: "z" |
| </programlisting> |
| </para> |
| |
| <para> |
| Avoid mentioning called function names, either; instead say what the code |
| was trying to do: |
| <programlisting> |
| BAD: open() failed: %m |
| BETTER: could not open file %s: %m |
| </programlisting> |
| If it really seems necessary, mention the system call in the detail |
| message. (In some cases, providing the actual values passed to the |
| system call might be appropriate information for the detail message.) |
| </para> |
| |
| <para> |
| Rationale: Users don't know what all those functions do. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Tricky Words to Avoid</title> |
| |
| <formalpara> |
| <title>Unable</title> |
| <para> |
| <quote>Unable</quote> is nearly the passive voice. It is better to use |
| <quote>cannot</quote> or <quote>could not</quote>, as appropriate. |
| </para> |
| </formalpara> |
| |
| <formalpara> |
| <title>Bad</title> |
| <para> |
| Error messages like <quote>bad result</quote> are really hard to interpret |
| intelligently. It's better to write why the result is <quote>bad</quote>, |
| e.g., <quote>invalid format</quote>. |
| </para> |
| </formalpara> |
| |
| <formalpara> |
| <title>Illegal</title> |
| <para> |
| <quote>Illegal</quote> stands for a violation of the law; the rest is |
| <quote>invalid</quote>. Better yet, say why it's invalid. |
| </para> |
| </formalpara> |
| |
| <formalpara> |
| <title>Unknown</title> |
| <para> |
| Try to avoid <quote>unknown</quote>. Consider <quote>error: unknown |
| response</quote>. If you don't know what the response is, how do you know |
| it's erroneous? <quote>Unrecognized</quote> is often a better choice. |
| Also, be sure to include the value being complained of. |
| <programlisting> |
| BAD: unknown node type |
| BETTER: unrecognized node type: 42 |
| </programlisting> |
| </para> |
| </formalpara> |
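| |
| <para> |
| The advice on <quote>unrecognized</quote> usually applies in the |
| <literal>default</literal> branch of a <literal>switch</literal>. A |
| minimal sketch in plain C, with a hypothetical node tag enumeration and a |
| hypothetical function <function>describe_node</function>, could look like |
| this: |
| </para> |

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node tags, for illustration only. */
typedef enum SketchNodeTag
{
	SKETCH_NODE_INT = 1,
	SKETCH_NODE_STRING = 2
} SketchNodeTag;

/*
 * Write a description of the tag into buf.  Returns 0 for a known tag;
 * for anything else, writes a message that includes the offending value
 * and returns -1.
 */
int
describe_node(char *buf, size_t buflen, int tag)
{
	switch (tag)
	{
		case SKETCH_NODE_INT:
			snprintf(buf, buflen, "integer");
			return 0;
		case SKETCH_NODE_STRING:
			snprintf(buf, buflen, "string");
			return 0;
		default:
			snprintf(buf, buflen, "unrecognized node type: %d", tag);
			return -1;
	}
}
```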
| |
| <formalpara> |
| <title>Find vs. Exists</title> |
| <para> |
| If the program uses a nontrivial algorithm to locate a resource (e.g., a |
| path search) and that algorithm fails, it is fair to say that the program |
| couldn't <quote>find</quote> the resource. If, on the other hand, the |
| expected location of the resource is known but the program cannot access |
| it there, then say that the resource doesn't <quote>exist</quote>. Using |
| <quote>find</quote> in this case sounds weak and confuses the issue. |
| </para> |
| </formalpara> |
| |
| <formalpara> |
| <title>May vs. Can vs. Might</title> |
| <para> |
| <quote>May</quote> suggests permission (e.g., "You may borrow my rake."), |
| and has little use in documentation or error messages. |
| <quote>Can</quote> suggests ability (e.g., "I can lift that log."), |
| and <quote>might</quote> suggests possibility (e.g., "It might rain |
| today."). Using the proper word clarifies meaning and assists |
| translation. |
| </para> |
| </formalpara> |
| |
| <formalpara> |
| <title>Contractions</title> |
| <para> |
| Avoid contractions, like <quote>can't</quote>; use |
| <quote>cannot</quote> instead. |
| </para> |
| </formalpara> |
| |
| <formalpara> |
| <title>Non-negative</title> |
| <para> |
| Avoid <quote>non-negative</quote> as it is ambiguous |
| about whether it accepts zero. It's better to use |
| <quote>greater than zero</quote> or |
| <quote>greater than or equal to zero</quote>. |
| </para> |
| </formalpara> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Proper Spelling</title> |
| |
| <para> |
| Spell out words in full. For instance, avoid: |
| <itemizedlist> |
| <listitem> |
| <para> |
| spec |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| stats |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| parens |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| auth |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| xact |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| |
| <para> |
| Rationale: This will improve consistency. |
| </para> |
| |
| </simplesect> |
| |
| <simplesect> |
| <title>Localization</title> |
| |
| <para> |
| Keep in mind that error message texts need to be translated into other |
| languages. Follow the guidelines in <xref linkend="nls-guidelines"/> |
| to avoid making life difficult for translators. |
| </para> |
| </simplesect> |
| |
| </sect1> |
| |
| <sect1 id="source-conventions"> |
| <title>Miscellaneous Coding Conventions</title> |
| |
| <simplesect> |
| <title>C Standard</title> |
| <para> |
| Code in <productname>PostgreSQL</productname> should only rely on language |
| features available in the C99 standard. That means a conforming |
| C99 compiler has to be able to compile <productname>PostgreSQL</productname>, at least aside |
| from a few platform-dependent pieces. |
| </para> |
| <para> |
| A few features included in the C99 standard are, at this time, not |
| permitted to be used in core <productname>PostgreSQL</productname> |
| code. This currently includes variable length arrays, intermingled |
| declarations and code, <literal>//</literal> comments, and universal |
| character names. Reasons for that include portability and historical |
| practices. |
| </para> |
| <para> |
| Features from later revisions of the C standard or compiler-specific |
| features can be used if a fallback is provided. |
| </para> |
| <para> |
| For example, <literal>_Static_assert()</literal> and |
| <literal>__builtin_constant_p</literal> are currently used, even though |
| they are from a newer revision of the C standard and a |
| <productname>GCC</productname> extension, respectively. If |
| <literal>_Static_assert()</literal> is not available, we fall back to a |
| C99-compatible replacement that performs the same checks but emits |
| rather cryptic messages. If <literal>__builtin_constant_p</literal> is |
| not available, we simply do not use it. |
| </para> |
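| <para> |
| The fallback pattern described above can be sketched as follows. This is |
| a minimal illustration, not the actual <productname>PostgreSQL</productname> |
| macros: the names <literal>DEMO_STATIC_ASSERT</literal>, |
| <literal>demo_is_constant</literal>, <literal>demo_int_bits</literal>, and |
| <literal>demo_literal_is_constant</literal> are hypothetical, and the |
| preprocessor tests used to detect each feature are assumptions of this |
| example. |
| </para> |

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical names for illustration only; the feature tests below are
 * assumptions of this sketch.
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11 or later: the compiler reports the supplied message on failure. */
#define DEMO_STATIC_ASSERT(condition, message) \
	_Static_assert(condition, message)
#else
/*
 * C99 fallback: a negative array size forces a compile-time error, but
 * the diagnostic is cryptic and the message text is not shown.
 */
#define DEMO_STATIC_ASSERT(condition, message) \
	extern void demo_static_assert_func(int demo_failure[(condition) ? 1 : -1])
#endif

#if defined(__GNUC__)
/* GCC-compatible compilers can tell whether an expression is constant. */
#define demo_is_constant(x) __builtin_constant_p(x)
#else
/* Fallback: do without the hint and treat nothing as constant. */
#define demo_is_constant(x) 0
#endif

/* Checked at compile time; the C standard guarantees at least 16 bits. */
DEMO_STATIC_ASSERT(sizeof(int) * CHAR_BIT >= 16,
				   "int must have at least 16 bits");

/* Returns the width of int in bits. */
int
demo_int_bits(void)
{
	int bits = (int) (sizeof(int) * CHAR_BIT);

	/* Run-time counterpart of the compile-time check, shown for contrast. */
	assert(bits >= 16);
	return bits;
}

/* Returns 1 if the compiler reports the literal as constant, else 0. */
int
demo_literal_is_constant(void)
{
	return demo_is_constant(42) ? 1 : 0;
}
```

| <para> |
| Callers use the same macro names regardless of which branch was |
| selected, so only the defining header needs to know about the fallback. |
| </para> |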
| </simplesect> |
| |
| <simplesect> |
| <title>Function-Like Macros and Inline Functions</title> |
| <para> |
| Both macros with arguments and <literal>static inline</literal> |
| functions may be used. The latter are preferable if there are |
| multiple-evaluation hazards when written as a macro, as is the |
| case with |
| <programlisting> |
| #define Max(x, y) ((x) > (y) ? (x) : (y)) |
| </programlisting> |
| or when the macro would be very long. In other cases it is only |
| possible to use macros, or at least easier, for example because |
| expressions of various types need to be passed to the macro. |
| </para> |
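| <para> |
| The multiple-evaluation hazard can be seen in the following |
| self-contained sketch. Only <literal>Max</literal> comes from the text |
| above; <literal>max_int</literal>, <literal>next_value</literal>, |
| <literal>call_count</literal>, and the <literal>demo</literal> functions |
| are hypothetical names introduced for this illustration. |
| </para> |

```c
#include <assert.h>

#define Max(x, y) ((x) > (y) ? (x) : (y))

/* Inline equivalent for int: each argument is evaluated exactly once. */
static inline int
max_int(int x, int y)
{
	return (x > y) ? x : y;
}

/* Number of times next_value() has been called since the last reset. */
static int call_count = 0;

/* An argument expression with a side effect: returns 10, 20, 30, ... */
static int
next_value(void)
{
	call_count++;
	return 10 * call_count;
}

/* The macro expands its first argument twice when it is the larger one. */
static int
max_via_macro(void)
{
	call_count = 0;
	return Max(next_value(), 5);
}

/* The inline function receives an already-evaluated argument. */
static int
max_via_inline(void)
{
	call_count = 0;
	return max_int(next_value(), 5);
}

/* Compares the two variants on the same argument expression. */
void
demo_compare(void)
{
	/* Macro: next_value() runs twice, so the second result (20) is used. */
	assert(max_via_macro() == 20 && call_count == 2);

	/* Inline function: next_value() runs once and yields 10. */
	assert(max_via_inline() == 10 && call_count == 1);
}
```

| <para> |
| The inline version is tied to one argument type, which is the trade-off |
| noted above: a macro is still needed when expressions of various types |
| have to be accepted. |
| </para> |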
| <para> |
| When the definition of an inline function references symbols |
| (i.e., variables, functions) that are only available as part of the |
| backend, the function must not be visible when included from frontend |
| code. |
| <programlisting> |
| #ifndef FRONTEND |
| static inline MemoryContext |
| MemoryContextSwitchTo(MemoryContext context) |
| { |
| MemoryContext old = CurrentMemoryContext; |
| |
| CurrentMemoryContext = context; |
| return old; |
| } |
| #endif /* FRONTEND */ |
| </programlisting> |
| In this example, <literal>CurrentMemoryContext</literal>, which is only |
| available in the backend, is referenced, so the function is |
| hidden with <literal>#ifndef FRONTEND</literal>. This rule |
| exists because some compilers emit references to symbols |
| contained in inline functions even if the function is not used. |
| </para> |
| </simplesect> |
| |
| <simplesect> |
| <title>Writing Signal Handlers</title> |
| <para> |
| To be suitable to run inside a signal handler, code has to be |
| written very carefully. The fundamental problem is that, unless |
| blocked, a signal handler can interrupt code at any time. If code |
| inside the signal handler uses the same state as code outside, |
| chaos may ensue. As an example, consider what happens if a signal |
| handler tries to acquire a lock that is already held in the |
| interrupted code. |
| </para> |
| <para> |
| Barring special arrangements, code in signal handlers may only |
| call async-signal-safe functions (as defined in POSIX) and access |
| variables of type <literal>volatile sig_atomic_t</literal>. A few |
| functions in <command>postgres</command> are also deemed signal safe, most importantly |
| <function>SetLatch()</function>. |
| </para> |
| <para> |
| In most cases signal handlers should do nothing more than note |
| that a signal has arrived, and wake up code running outside of |
| the handler using a latch. An example of such a handler is the |
| following: |
| <programlisting> |
| static void |
| handle_sighup(SIGNAL_ARGS) |
| { |
| int save_errno = errno; |
| |
| got_SIGHUP = true; |
| SetLatch(MyLatch); |
| |
| errno = save_errno; |
| } |
| </programlisting> |
| <varname>errno</varname> is saved and restored because |
| <function>SetLatch()</function> might change it. If that were not done, |
| interrupted code that is currently inspecting <varname>errno</varname> might see the wrong |
| value. |
| </para> |
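| <para> |
| The division of labor between the handler and the code outside it can |
| be sketched in a standalone program. This example is an approximation |
| under stated assumptions: it uses the ISO C <function>signal()</function> |
| and <function>raise()</function> functions instead of the backend's |
| signal and latch facilities, and the names |
| <literal>got_signal</literal>, <literal>reload_count</literal>, |
| <literal>demo_handler</literal>, <literal>demo_process_pending</literal>, |
| and <literal>demo_run</literal> are hypothetical. |
| </para> |

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>

/* Set by the handler, cleared by the code running outside of it. */
static volatile sig_atomic_t got_signal = 0;

/* Counts completed reloads; touched only outside the handler. */
static int reload_count = 0;

/* The handler only records that the signal arrived. */
static void
demo_handler(int signo)
{
	int save_errno = errno;

	(void) signo;
	got_signal = 1;
	/* Backend code would call SetLatch(MyLatch) here to wake the loop. */

	errno = save_errno;
}

/* One main-loop iteration: the real work happens outside the handler. */
static void
demo_process_pending(void)
{
	if (got_signal)
	{
		got_signal = 0;
		reload_count++;		/* stand-in for, e.g., rereading configuration */
	}
}

/* Installs the handler, delivers one signal, and returns the reload count. */
int
demo_run(void)
{
	void (*previous) (int) = signal(SIGTERM, demo_handler);

	assert(previous != SIG_ERR);
	reload_count = 0;

	demo_process_pending();		/* nothing is pending yet */
	raise(SIGTERM);				/* the handler only sets the flag */
	demo_process_pending();		/* the flag is noticed and acted on here */

	signal(SIGTERM, previous);	/* restore the earlier disposition */
	return reload_count;
}
```

| <para> |
| Because the handler touches nothing but a |
| <literal>volatile sig_atomic_t</literal> flag, it cannot conflict with |
| state that the interrupted code is in the middle of changing. |
| </para> |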
| </simplesect> |
| |
| <simplesect> |
| <title>Calling Function Pointers</title> |
| |
| <para> |
| For clarity, it is preferred to explicitly dereference a function pointer |
| when calling the pointed-to function if the pointer is a simple variable, |
| for example: |
| <programlisting> |
| (*emit_log_hook) (edata); |
| </programlisting> |
| (even though <literal>emit_log_hook(edata)</literal> would also work). |
| When the function pointer is part of a structure, then the extra |
| punctuation can and usually should be omitted, for example: |
| <programlisting> |
| paramInfo->paramFetch(paramInfo, paramId); |
| </programlisting> |
| </para> |
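| <para> |
| Both calling styles are shown side by side in the following sketch. All |
| identifiers in it (<literal>demo_hook</literal>, |
| <literal>DemoParams</literal>, <literal>demo_increment</literal>, |
| <literal>demo_fetch</literal>, and <literal>demo_call_styles</literal>) |
| are hypothetical stand-ins for a real hook variable and a real callback |
| structure. |
| </para> |

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_hook_type) (int *counter);

/* A hook stored in a simple variable; NULL means no hook is installed. */
static demo_hook_type demo_hook = NULL;

/* A structure that carries a function pointer as one of its members. */
typedef struct DemoParams
{
	int (*fetch) (const struct DemoParams *params, int id);
	int base;
} DemoParams;

static void
demo_increment(int *counter)
{
	(*counter)++;
}

static int
demo_fetch(const DemoParams *params, int id)
{
	return params->base + id;
}

/* Calls through both kinds of function pointer and returns the result. */
int
demo_call_styles(void)
{
	DemoParams params = {demo_fetch, 100};
	DemoParams *paramInfo = &params;
	int counter = 0;

	demo_hook = demo_increment;

	/* Simple variable: dereference explicitly to flag the indirect call. */
	if (demo_hook)
		(*demo_hook) (&counter);
	assert(counter == 1);

	/* Structure member: the extra punctuation is omitted. */
	return paramInfo->fetch(paramInfo, counter);
}
```

| <para> |
| The explicit dereference makes it obvious at the call site that |
| <literal>demo_hook</literal> is a variable rather than an ordinary |
| function, whereas the <literal>-></literal> operator already signals |
| indirection in the structure case. |
| </para> |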
| </simplesect> |
| </sect1> |
| </chapter> |