 <!-- doc/src/sgml/sources.sgml -->

  <chapter id="source">
   <title>PostgreSQL Coding Conventions</title>

   <sect1 id="source-format">
    <title>Formatting</title>

    <para>
     Source code formatting uses 4-column tab spacing, with
     tabs preserved (i.e., tabs are not expanded to spaces).
     Each logical indentation level is one additional tab stop.
    </para>

    <para>
     Layout rules (brace positioning, etc.) follow BSD conventions.  In
     particular, curly braces for the controlled blocks of <literal>if</literal>,
     <literal>while</literal>, <literal>switch</literal>, etc. go on their own lines.
    </para>
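
    <para>
     As an illustration of the layout rules above, here is a minimal sketch
     of a function laid out in this style.  The function
     <function>sum_positive</function> is hypothetical and exists only for
     this example; indentation is shown with spaces for display, whereas
     the source tree uses tabs.
    </para>

```c
#include <assert.h>
#include <stddef.h>

/*
 * sum_positive
 *      Hypothetical helper: add up the entries of values[] that are
 *      greater than zero and report how many there were.
 */
static int
sum_positive(const int *values, size_t nvalues, int *npositive)
{
    int         sum = 0;
    size_t      i;

    *npositive = 0;
    for (i = 0; i < nvalues; i++)
    {
        if (values[i] > 0)
        {
            sum += values[i];
            (*npositive)++;
        }
    }

    return sum;
}
```

    <para>
     Note that each opening and closing brace sits on a line of its own,
     aligned with the statement that controls the block.
    </para>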

    <para>
     Limit line lengths so that the code is readable in an 80-column window.
     (This doesn't mean that you must never go past 80 columns.  For instance,
     breaking a long error message string in arbitrary places just to keep the
     code within 80 columns is probably not a net gain in readability.)
    </para>

    <para>
     To maintain a consistent coding style, do not use C++ style comments
     (<literal>//</literal> comments).  <application>pgindent</application>
     will replace them with <literal>/* ... */</literal>.
    </para>

    <para>
     The preferred style for multi-line comment blocks is
 <programlisting>
 /*
  * comment text begins here
  * and continues here
  */
 </programlisting>
     Note that comment blocks that begin in column 1 will be preserved as-is
     by <application>pgindent</application>, but it will re-flow indented comment blocks
     as though they were plain text.  If you want to preserve the line breaks
     in an indented block, add dashes like this:
 <programlisting>
     /*----------
      * comment text begins here
      * and continues here
      *----------
      */
 </programlisting>
    </para>

    <para>
     While submitted patches do not absolutely have to follow these formatting
     rules, it's a good idea to do so.  Your code will get run through
     <application>pgindent</application> before the next release, so there's no point in
     making it look nice under some other set of formatting conventions.
     A good rule of thumb for patches is <quote>make the new code look like
     the existing code around it</quote>.
    </para>

    <para>
     The <filename>src/tools</filename> directory contains sample settings
     files that can be used with the <productname>emacs</productname>,
     <productname>xemacs</productname> or <productname>vim</productname>
     editors to help ensure that they format code according to these
     conventions.
    </para>

    <para>
     The text browsing tools <application>more</application> and
     <application>less</application> can be invoked as:
 <programlisting>
 more -x4
 less -x4
 </programlisting>
     to make them show tabs appropriately.
    </para>
   </sect1>

   <sect1 id="error-message-reporting">
    <title>Reporting Errors Within the Server</title>

    <indexterm>
     <primary>ereport</primary>
    </indexterm>
    <indexterm>
     <primary>elog</primary>
    </indexterm>

    <para>
     Error, warning, and log messages generated within the server code
     should be created using <function>ereport</function>, or its older cousin
     <function>elog</function>.  The use of <function>ereport</function> is complex
     enough to require some explanation.
    </para>

    <para>
     There are two required elements for every message: a severity level
     (ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary
     message text.  In addition there are optional elements, the most
     common of which is an error identifier code that follows the SQL spec's
     SQLSTATE conventions.
     <function>ereport</function> itself is just a shell macro that exists
     mainly for the syntactic convenience of making message generation
     look like a single function call in the C source code.  The only parameter
     accepted directly by <function>ereport</function> is the severity level.
     The primary message text and any optional message elements are
     generated by calling auxiliary functions, such as <function>errmsg</function>,
     within the <function>ereport</function> call.
    </para>

    <para>
     A typical call to <function>ereport</function> might look like this:
 <programlisting>
 ereport(ERROR,
         errcode(ERRCODE_DIVISION_BY_ZERO),
         errmsg("division by zero"));
 </programlisting>
     This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill
     error).  The <function>errcode</function> call specifies the SQLSTATE error code
     using a macro defined in <filename>src/include/utils/errcodes.h</filename>.  The
     <function>errmsg</function> call provides the primary message text.
    </para>

    <para>
     You will also frequently see this older style, with an extra set of
     parentheses surrounding the auxiliary function calls:
 <programlisting>
 ereport(ERROR,
         (errcode(ERRCODE_DIVISION_BY_ZERO),
          errmsg("division by zero")));
 </programlisting>
     The extra parentheses were required
     before <productname>PostgreSQL</productname> version 12, but are now
     optional.
    </para>

    <para>
     Here is a more complex example:
 <programlisting>
 ereport(ERROR,
         errcode(ERRCODE_AMBIGUOUS_FUNCTION),
         errmsg("function %s is not unique",
                func_signature_string(funcname, nargs,
                                      NIL, actual_arg_types)),
         errhint("Unable to choose a best candidate function. "
                 "You might need to add explicit typecasts."));
 </programlisting>
     This illustrates the use of format codes to embed run-time values into
     a message text.  Also, an optional <quote>hint</quote> message is provided.
     The auxiliary function calls can be written in any order, but
     conventionally <function>errcode</function>
     and <function>errmsg</function> appear first.
    </para>

    <para>
     If the severity level is <literal>ERROR</literal> or higher,
     <function>ereport</function> aborts execution of the current query
     and does not return to the caller. If the severity level is
     lower than <literal>ERROR</literal>, <function>ereport</function> returns normally.
    </para>
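
    <para>
     The control-flow consequence can be pictured with a small stand-alone
     model.  This is only a sketch built on the standard
     <function>setjmp</function>/<function>longjmp</function> facility; the
     names <function>model_report</function>,
     <function>run_model_query</function>, and the
     <literal>MODEL_</literal> levels are invented for the example and are
     not server APIs, and the server's real recovery machinery is more
     involved.
    </para>

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical severity levels; the real values live in the server headers. */
enum
{
    MODEL_WARNING = 1,
    MODEL_ERROR = 2
};

/* Where control resumes after an aborted "query". */
static jmp_buf recovery_point;

static void
model_report(int level, const char *message)
{
    fprintf(stderr, "%s\n", message);
    if (level >= MODEL_ERROR)
        longjmp(recovery_point, 1);     /* never returns to the caller */
}

static int
run_model_query(int level)
{
    if (setjmp(recovery_point) != 0)
        return 0;               /* the report aborted the query */

    model_report(level, "something noteworthy happened");

    /* Reached only when the level was below MODEL_ERROR. */
    return 1;
}
```

    <para>
     The practical point for callers is that code placed after an
     <literal>ERROR</literal>-level report is never executed, so any cleanup
     that must happen has to be arranged before the report is issued.
    </para>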

    <para>
     The available auxiliary routines for <function>ereport</function> are:
   <itemizedlist>
    <listitem>
     <para>
      <function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier
      code for the condition.  If this routine is not called, the error
      identifier defaults to
      <literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is
      <literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the
      error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal>
      and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>.
      While these defaults are often convenient, always consider whether they
      are appropriate before omitting the <function>errcode()</function> call.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errmsg(const char *msg, ...)</function> specifies the primary error
      message text, and possibly run-time values to insert into it.  Insertions
      are specified by <function>sprintf</function>-style format codes.  In addition to
      the standard format codes accepted by <function>sprintf</function>, the format
      code <literal>%m</literal> can be used to insert the error message returned
      by <function>strerror</function> for the current value of <literal>errno</literal>.
      <footnote>
       <para>
        That is, the value that was current when the <function>ereport</function> call
        was reached; changes of <literal>errno</literal> within the auxiliary reporting
        routines will not affect it.  That would not be true if you were to
        write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s
        parameter list; accordingly, do not do so.
       </para>
      </footnote>
      <literal>%m</literal> does not require any
      corresponding entry in the parameter list for <function>errmsg</function>.
      Note that the message string will be run through <function>gettext</function>
      for possible localization before format codes are processed.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errmsg_internal(const char *msg, ...)</function> is the same as
      <function>errmsg</function>, except that the message string will not be
      translated nor included in the internationalization message dictionary.
      This should be used for <quote>cannot happen</quote> cases that are probably
      not worth expending translation effort on.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errmsg_plural(const char *fmt_singular, const char *fmt_plural,
      unsigned long n, ...)</function> is like <function>errmsg</function>, but with
      support for various plural forms of the message.
      <replaceable>fmt_singular</replaceable> is the English singular format,
      <replaceable>fmt_plural</replaceable> is the English plural format,
      <replaceable>n</replaceable> is the integer value that determines which plural
      form is needed, and the remaining arguments are formatted according
      to the selected format string.  For more information see
      <xref linkend="nls-guidelines"/>.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errdetail(const char *msg, ...)</function> supplies an optional
      <quote>detail</quote> message; this is to be used when there is additional
      information that seems inappropriate to put in the primary message.
      The message string is processed in just the same way as for
      <function>errmsg</function>.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errdetail_internal(const char *msg, ...)</function> is the same
      as <function>errdetail</function>, except that the message string will not be
      translated nor included in the internationalization message dictionary.
      This should be used for detail messages that are not worth expending
      translation effort on, for instance because they are too technical to be
      useful to most users.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errdetail_plural(const char *fmt_singular, const char *fmt_plural,
      unsigned long n, ...)</function> is like <function>errdetail</function>, but with
      support for various plural forms of the message.
      For more information see <xref linkend="nls-guidelines"/>.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errdetail_log(const char *msg, ...)</function> is the same as
      <function>errdetail</function> except that this string goes only to the server
      log, never to the client.  If both <function>errdetail</function> (or one of
      its equivalents above) and
      <function>errdetail_log</function> are used then one string goes to the client
      and the other to the log.  This is useful for error details that are
      too security-sensitive or too bulky to include in the report
      sent to the client.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errdetail_log_plural(const char *fmt_singular, const char
      *fmt_plural, unsigned long n, ...)</function> is like
      <function>errdetail_log</function>, but with support for various plural forms of
      the message.
      For more information see <xref linkend="nls-guidelines"/>.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errhint(const char *msg, ...)</function> supplies an optional
      <quote>hint</quote> message; this is to be used when offering suggestions
      about how to fix the problem, as opposed to factual details about
      what went wrong.
      The message string is processed in just the same way as for
      <function>errmsg</function>.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errhint_plural(const char *fmt_singular, const char *fmt_plural,
      unsigned long n, ...)</function> is like <function>errhint</function>, but with
      support for various plural forms of the message.
      For more information see <xref linkend="nls-guidelines"/>.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errcontext(const char *msg, ...)</function> is not normally called
      directly from an <function>ereport</function> message site; rather it is used
      in <literal>error_context_stack</literal> callback functions to provide
      information about the context in which an error occurred, such as the
      current location in a PL function.
      The message string is processed in just the same way as for
      <function>errmsg</function>.  Unlike the other auxiliary functions, this can
      be called more than once per <function>ereport</function> call; the successive
      strings thus supplied are concatenated with separating newlines.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errposition(int cursorpos)</function> specifies the textual location
      of an error within a query string.  Currently it is only useful for
      errors detected in the lexical and syntactic analysis phases of
      query processing.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errtable(Relation rel)</function> specifies a relation whose
      name and schema name should be included as auxiliary fields in the error
      report.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errtablecol(Relation rel, int attnum)</function> specifies
      a column whose name, table name, and schema name should be included as
      auxiliary fields in the error report.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errtableconstraint(Relation rel, const char *conname)</function>
      specifies a table constraint whose name, table name, and schema name
      should be included as auxiliary fields in the error report.  Indexes
      should be considered to be constraints for this purpose, whether or
      not they have an associated <structname>pg_constraint</structname> entry.  Be
      careful to pass the underlying heap relation, not the index itself, as
      <literal>rel</literal>.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errdatatype(Oid datatypeOid)</function> specifies a data
      type whose name and schema name should be included as auxiliary fields
      in the error report.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errdomainconstraint(Oid datatypeOid, const char *conname)</function>
      specifies a domain constraint whose name, domain name, and schema name
      should be included as auxiliary fields in the error report.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errcode_for_file_access()</function> is a convenience function that
      selects an appropriate SQLSTATE error identifier for a failure in a
      file-access-related system call.  It uses the saved
      <literal>errno</literal> to determine which error code to generate.
      Usually this should be used in combination with <literal>%m</literal> in the
      primary error message text.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errcode_for_socket_access()</function> is a convenience function that
      selects an appropriate SQLSTATE error identifier for a failure in a
      socket-related system call.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errhidestmt(bool hide_stmt)</function> can be called to specify
      suppression of the <literal>STATEMENT:</literal> portion of a message in the
      postmaster log.  Generally this is appropriate if the message text
      includes the current statement already.
     </para>
    </listitem>
    <listitem>
     <para>
      <function>errhidecontext(bool hide_ctx)</function> can be called to
      specify suppression of the <literal>CONTEXT:</literal> portion of a message in
      the postmaster log.  This should only be used for verbose debugging
      messages where the repeated inclusion of context would bloat the log
      too much.
     </para>
    </listitem>
   </itemizedlist>
    </para>
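
    <para>
     The reason <literal>%m</literal> is preferred over an explicit
     <literal>strerror(errno)</literal> argument can be shown with a
     stand-alone sketch.  It does not use the server's reporting code:
     <function>quote_path</function> and
     <function>format_open_failure</function> are hypothetical names, and
     the helper is assumed to overwrite <literal>errno</literal> as a side
     effect, as an auxiliary routine might.
    </para>

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for an auxiliary routine that happens to
 * overwrite errno as a side effect.
 */
static const char *
quote_path(const char *path)
{
    errno = 0;
    return path;
}

/*
 * format_open_failure
 *      Model of the %m behavior: errno is captured on entry, before any
 *      helper runs, and only the saved value is used afterwards.
 */
static void
format_open_failure(char *buf, size_t buflen, const char *path)
{
    int         saved_errno = errno;
    const char *shown_path = quote_path(path);  /* clobbers errno */

    snprintf(buf, buflen, "could not open file \"%s\": %s",
             shown_path, strerror(saved_errno));
}
```

    <para>
     Had the message been built from <literal>strerror(errno)</literal>
     after the helper ran, it would describe the wrong condition.  Saving
     the value first is what makes the reported reason reliable.
    </para>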

    <note>
     <para>
      At most one of the functions <function>errtable</function>,
      <function>errtablecol</function>, <function>errtableconstraint</function>,
      <function>errdatatype</function>, or <function>errdomainconstraint</function> should
      be used in an <function>ereport</function> call.  These functions exist to
      allow applications to extract the name of a database object associated
      with the error condition without having to examine the
      potentially-localized error message text.
      These functions should be used in error reports for which it's likely
      that applications would wish to have automatic error handling.  As of
      <productname>PostgreSQL</productname> 9.3, complete coverage exists only for
      errors in SQLSTATE class 23 (integrity constraint violation), but this
      is likely to be expanded in the future.
     </para>
    </note>

    <para>
     There is an older function <function>elog</function> that is still heavily used.
     An <function>elog</function> call:
 <programlisting>
 elog(level, "format string", ...);
 </programlisting>
     is exactly equivalent to:
 <programlisting>
 ereport(level, errmsg_internal("format string", ...));
 </programlisting>
     Notice that the SQLSTATE error code is always defaulted, and the message
     string is not subject to translation.
     Therefore, <function>elog</function> should be used only for internal errors and
     low-level debug logging.  Any message that is likely to be of interest to
     ordinary users should go through <function>ereport</function>.  Nonetheless,
     there are enough internal <quote>cannot happen</quote> error checks in the
     system that <function>elog</function> is still widely used; it is preferred for
     those messages for its notational simplicity.
    </para>

    <para>
     Advice about writing good error messages can be found in
     <xref linkend="error-style-guide"/>.
    </para>
   </sect1>

   <sect1 id="error-style-guide">
    <title>Error Message Style Guide</title>

    <para>
     This style guide is offered in the hope of maintaining a consistent,
     user-friendly style throughout all the messages generated by
     <productname>PostgreSQL</productname>.
    </para>

   <simplesect>
    <title>What Goes Where</title>

    <para>
     The primary message should be short, factual, and avoid reference to
     implementation details such as specific function names.
     <quote>Short</quote> means <quote>should fit on one line under normal
     conditions</quote>.  Use a detail message if needed to keep the primary
     message short, or if you feel a need to mention implementation details
     such as the particular system call that failed. Both primary and detail
     messages should be factual.  Use a hint message for suggestions about what
     to do to fix the problem, especially if the suggestion might not always be
     applicable.
    </para>

    <para>
     For example, instead of:
 <programlisting>
 IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
 (plus a long addendum that is basically a hint)
 </programlisting>
     write:
 <programlisting>
 Primary:    could not create shared memory segment: %m
 Detail:     Failed syscall was shmget(key=%d, size=%u, 0%o).
 Hint:       the addendum
 </programlisting>
    </para>

    <para>
     Rationale: keeping the primary message short helps keep it to the point,
     and lets clients lay out screen space on the assumption that one line is
     enough for error messages.  Detail and hint messages can be relegated to a
     verbose mode, or perhaps a pop-up error-details window.  Also, details and
     hints would normally be suppressed from the server log to save
     space. Reference to implementation details is best avoided since users
     aren't expected to know the details.
    </para>

   </simplesect>

   <simplesect>
    <title>Formatting</title>

    <para>
     Don't put any specific assumptions about formatting into the message
     texts.  Expect clients and the server log to wrap lines to fit their own
     needs.  In long messages, newline characters (\n) can be used to indicate
     suggested paragraph breaks.  Don't end a message with a newline.  Don't
     use tabs or other formatting characters.  (In error context displays,
     newlines are automatically added to separate levels of context such as
     function calls.)
    </para>

    <para>
     Rationale: Messages are not necessarily displayed on terminal-type
     displays.  In GUI displays or browsers these formatting instructions are
     at best ignored.
    </para>

   </simplesect>

   <simplesect>
    <title>Quotation Marks</title>

    <para>
     English text should use double quotes when quoting is appropriate.
     Text in other languages should consistently use one kind of quotes that is
     consistent with publishing customs and computer output of other programs.
    </para>

    <para>
     Rationale: The choice of double quotes over single quotes is somewhat
     arbitrary, but tends to be the preferred use.  Some have suggested
     choosing the kind of quotes depending on the type of object according to
     SQL conventions (namely, strings single quoted, identifiers double
     quoted).  But this is a language-internal technical issue that many users
     aren't even familiar with, it won't scale to other kinds of quoted terms,
     it doesn't translate to other languages, and it's pretty pointless, too.
    </para>

   </simplesect>

   <simplesect>
    <title>Use of Quotes</title>

    <para>
     Always use quotes to delimit file names, user-supplied identifiers, and
     other variables that might contain words.  Do not use them to mark up
     variables that will not contain words (for example, operator names).
    </para>

    <para>
     There are functions in the backend that will double-quote their own output
     as needed (for example, <function>format_type_be()</function>).  Do not put
     additional quotes around the output of such functions.
    </para>

    <para>
     Rationale: Objects can have names that create ambiguity when embedded in a
     message.  Be consistent about denoting where a plugged-in name starts and
     ends.  But don't clutter messages with unnecessary or duplicate quote
     marks.
    </para>

   </simplesect>

   <simplesect>
    <title>Grammar and Punctuation</title>

    <para>
     The rules are different for primary error messages and for detail/hint
     messages:
    </para>

    <para>
     Primary error messages: Do not capitalize the first letter.  Do not end a
     message with a period.  Do not even think about ending a message with an
     exclamation point.
    </para>

    <para>
     Detail and hint messages: Use complete sentences, and end each with
     a period.  Capitalize the first word of sentences.  Put two spaces after
     the period if another sentence follows (for English text; might be
     inappropriate in other languages).
    </para>

    <para>
     Error context strings: Do not capitalize the first letter and do
     not end the string with a period.  Context strings should normally
     not be complete sentences.
    </para>

    <para>
     Rationale: Avoiding punctuation makes it easier for client applications to
     embed the message into a variety of grammatical contexts.  Often, primary
     messages are not grammatically complete sentences anyway.  (And if they're
     long enough to be more than one sentence, they should be split into
     primary and detail parts.)  However, detail and hint messages are longer
     and might need to include multiple sentences.  For consistency, they should
     follow complete-sentence style even when there's only one sentence.
    </para>
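
    <para>
     The mechanical parts of these rules lend themselves to a simple check.
     The following is a minimal sketch of such a check, not an existing
     tool: <function>primary_message_style_ok</function> and
     <function>detail_message_style_ok</function> are hypothetical helpers.
     A leading capital is flagged only when it is followed by a lower-case
     letter, on the assumption that a message may legitimately begin with
     an upper-case SQL key word.
    </para>

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * primary_message_style_ok
 *      Hypothetical lint helper.  Returns true if msg is non-empty, does
 *      not start with a capitalized ordinary word, and does not end with
 *      a period or an exclamation point.
 */
static bool
primary_message_style_ok(const char *msg)
{
    size_t      len = strlen(msg);

    if (len == 0)
        return false;
    if (isupper((unsigned char) msg[0]) && islower((unsigned char) msg[1]))
        return false;
    if (msg[len - 1] == '.' || msg[len - 1] == '!')
        return false;
    return true;
}

/*
 * detail_message_style_ok
 *      Hypothetical lint helper.  Returns true if msg is non-empty and
 *      its last sentence ends with a period.
 */
static bool
detail_message_style_ok(const char *msg)
{
    size_t      len = strlen(msg);

    return len > 0 && msg[len - 1] == '.';
}
```

    <para>
     A check of this kind catches only the punctuation and capitalization
     rules; whether a message is factual and well worded still needs a
     human reader.
    </para>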

   </simplesect>

   <simplesect>
    <title>Upper Case vs. Lower Case</title>

    <para>
     Use lower case for message wording, including the first letter of a
     primary error message.  Use upper case for SQL commands and key words if
     they appear in the message.
    </para>

    <para>
     Rationale: It's easier to make everything look more consistent this
     way, since some messages are complete sentences and some not.
    </para>

   </simplesect>

   <simplesect>
    <title>Avoid Passive Voice</title>

    <para>
     Use the active voice.  Use complete sentences when there is an acting
     subject (<quote>A could not do B</quote>).  Use telegram style without
     subject if the subject would be the program itself; do not use
     <quote>I</quote> for the program.
    </para>

    <para>
     Rationale: The program is not human.  Don't pretend otherwise.
    </para>

   </simplesect>

   <simplesect>
    <title>Present vs. Past Tense</title>

    <para>
     Use past tense if an attempt to do something failed, but could perhaps
     succeed next time (perhaps after fixing some problem).  Use present tense
     if the failure is certainly permanent.
    </para>

    <para>
     There is a nontrivial semantic difference between sentences of the form:
 <programlisting>
 could not open file "%s": %m
 </programlisting>
 and:
 <programlisting>
 cannot open file "%s"
 </programlisting>
     The first one means that the attempt to open the file failed.  The
     message should give a reason, such as <quote>disk full</quote> or
     <quote>file doesn't exist</quote>.  The past tense is appropriate because
     next time the disk might not be full anymore or the file in question might
     exist.
    </para>

    <para>
     The second form indicates that the functionality of opening the named file
     does not exist at all in the program, or that it's conceptually
     impossible.  The present tense is appropriate because the condition will
     persist indefinitely.
    </para>

    <para>
     Rationale: Granted, the average user will not be able to draw great
     conclusions merely from the tense of the message, but since the language
     provides us with a grammar we should use it correctly.
    </para>

   </simplesect>

   <simplesect>
    <title>Type of the Object</title>

    <para>
     When citing the name of an object, state what kind of object it is.
    </para>

    <para>
     Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote>
     refers to.
    </para>

   </simplesect>

   <simplesect>
    <title>Brackets</title>

    <para>
     Square brackets are only to be used (1) in command synopses to denote
     optional arguments, or (2) to denote an array subscript.
    </para>

    <para>
     Rationale: Anything else does not correspond to widely-known customary
     usage and will confuse people.
    </para>

   </simplesect>

   <simplesect>
    <title>Assembling Error Messages</title>

    <para>
    When a message includes text that is generated elsewhere, embed it in
    this style:
 <programlisting>
 could not open file "%s": %m
 </programlisting>
    </para>

    <para>
     Rationale: It would be difficult to account for all possible error codes
     to paste this into a single smooth sentence, so some sort of punctuation
     is needed.  Putting the embedded text in parentheses has also been
     suggested, but it's unnatural if the embedded text is likely to be the
     most important part of the message, as is often the case.
    </para>

   </simplesect>

   <simplesect>
    <title>Reasons for Errors</title>

    <para>
     Messages should always state the reason why an error occurred.
     For example:
 <programlisting>
 BAD:    could not open file "%s"
 BETTER: could not open file "%s" (I/O failure)
 </programlisting>
     If no reason is known, you had better fix the code.
    </para>

   </simplesect>

   <simplesect>
    <title>Function Names</title>

    <para>
     Don't include the name of the reporting routine in the error text. We have
     other mechanisms for finding that out when needed, and for most users it's
     not helpful information.  If the error text doesn't make as much sense
     without the function name, reword it.
 <programlisting>
 BAD:    pg_strtoint32: error in "z": cannot parse "z"
 BETTER: invalid input syntax for type integer: "z"
 </programlisting>
    </para>

    <para>
     Avoid mentioning called function names, either; instead say what the code
     was trying to do:
 <programlisting>
 BAD:    open() failed: %m
 BETTER: could not open file "%s": %m
 </programlisting>
     If it really seems necessary, mention the system call in the detail
     message.  (In some cases, providing the actual values passed to the
     system call might be appropriate information for the detail message.)
    </para>

    <para>
     Rationale: Users don't know what all those functions do.
    </para>

   </simplesect>

   <simplesect>
    <title>Tricky Words to Avoid</title>

   <formalpara>
     <title>Unable</title>
    <para>
     <quote>Unable</quote> is nearly the passive voice.  It is better to use
     <quote>cannot</quote> or <quote>could not</quote>, as appropriate.
    </para>
   </formalpara>

   <formalpara>
     <title>Bad</title>
    <para>
     Error messages like <quote>bad result</quote> are really hard to interpret
     intelligently.  It's better to write why the result is <quote>bad</quote>,
     e.g., <quote>invalid format</quote>.
    </para>
   </formalpara>

   <formalpara>
     <title>Illegal</title>
    <para>
     <quote>Illegal</quote> stands for a violation of the law; the rest is
     <quote>invalid</quote>.  Better yet, say why it's invalid.
    </para>
   </formalpara>

   <formalpara>
     <title>Unknown</title>
    <para>
     Try to avoid <quote>unknown</quote>.  Consider <quote>error: unknown
     response</quote>.  If you don't know what the response is, how do you know
     it's erroneous? <quote>Unrecognized</quote> is often a better choice.
     Also, be sure to include the value being complained of.
 <programlisting>
 BAD:    unknown node type
 BETTER: unrecognized node type: 42
 </programlisting>
    </para>
   </formalpara>
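
   <para>
    In code, the advice to include the offending value usually applies to
    the default branch of a dispatch.  The sketch below is stand-alone and
    uses invented names (<literal>DemoNodeTag</literal>,
    <function>describe_node</function>); it writes the message into a
    caller-supplied buffer rather than calling the server's reporting
    functions.
   </para>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node tags, for illustration only. */
typedef enum DemoNodeTag
{
    DEMO_NODE_SCAN = 1,
    DEMO_NODE_JOIN = 2
} DemoNodeTag;

/*
 * describe_node
 *      Returns a name for a known tag.  For any other tag, returns NULL
 *      and leaves a message naming the offending value in errbuf.
 */
static const char *
describe_node(int tag, char *errbuf, size_t errlen)
{
    if (errlen > 0)
        errbuf[0] = '\0';

    switch (tag)
    {
        case DEMO_NODE_SCAN:
            return "scan";
        case DEMO_NODE_JOIN:
            return "join";
        default:
            snprintf(errbuf, errlen, "unrecognized node type: %d", tag);
            return NULL;
    }
}
```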

   <formalpara>
     <title>Find vs. Exists</title>
    <para>
     If the program uses a nontrivial algorithm to locate a resource (e.g., a
     path search) and that algorithm fails, it is fair to say that the program
     couldn't <quote>find</quote> the resource.  If, on the other hand, the
     expected location of the resource is known but the program cannot access
     it there, then say that the resource doesn't <quote>exist</quote>.  Using
     <quote>find</quote> in this case sounds weak and confuses the issue.
    </para>
   </formalpara>

   <formalpara>
     <title>May vs. Can vs. Might</title>
    <para>
     <quote>May</quote> suggests permission (e.g., "You may borrow my rake."),
     and has little use in documentation or error messages.
     <quote>Can</quote> suggests ability (e.g., "I can lift that log."),
     and <quote>might</quote> suggests possibility (e.g., "It might rain
     today.").  Using the proper word clarifies meaning and assists
     translation.
    </para>
   </formalpara>

   <formalpara>
     <title>Contractions</title>
    <para>
     Avoid contractions, like <quote>can't</quote>; use
     <quote>cannot</quote> instead.
    </para>
   </formalpara>

   <formalpara>
     <title>Non-negative</title>
    <para>
     Avoid <quote>non-negative</quote> as it is ambiguous
     about whether it accepts zero.  It's better to use
     <quote>greater than zero</quote> or
     <quote>greater than or equal to zero</quote>.
    </para>
   </formalpara>

   </simplesect>

   <simplesect>
    <title>Proper Spelling</title>

    <para>
     Spell out words in full.  For instance, avoid:
   <itemizedlist>
    <listitem>
     <para>
      spec
     </para>
    </listitem>
    <listitem>
     <para>
      stats
     </para>
    </listitem>
    <listitem>
     <para>
      parens
     </para>
    </listitem>
    <listitem>
     <para>
      auth
     </para>
    </listitem>
    <listitem>
     <para>
      xact
     </para>
    </listitem>
   </itemizedlist>
    </para>

    <para>
     Rationale: This will improve consistency.
    </para>

   </simplesect>

   <simplesect>
    <title>Localization</title>

    <para>
     Keep in mind that error message texts need to be translated into other
     languages.  Follow the guidelines in <xref linkend="nls-guidelines"/>
     to avoid making life difficult for translators.
    </para>
   </simplesect>

   </sect1>

   <sect1 id="source-conventions">
    <title>Miscellaneous Coding Conventions</title>

    <simplesect>
     <title>C Standard</title>
     <para>
      Code in <productname>PostgreSQL</productname> should only rely on language
      features available in the C99 standard. That means a conforming
      C99 compiler has to be able to compile postgres, at least aside
      from a few platform dependent pieces.
     </para>
     <para>
      A few features included in the C99 standard are, at this time, not
      permitted to be used in core <productname>PostgreSQL</productname>
      code. This currently includes variable length arrays, intermingled
      declarations and code, <literal>//</literal> comments, and universal
      character names. Reasons for that include portability and historical
      practices.
     </para>
     <para>
      Features from later revisions of the C standard or compiler specific
      features can be used, if a fallback is provided.
     </para>
     <para>
      For example, <literal>_Static_assert()</literal> and
      <literal>__builtin_constant_p</literal> are currently used, even though
      they are from a newer revision of the C standard and a
      <productname>GCC</productname> extension, respectively. If
      <literal>_Static_assert()</literal> is not available, we fall back to
      a C99-compatible replacement that performs the same checks but emits
      rather cryptic messages. If <literal>__builtin_constant_p</literal> is
      not available, we simply do not use it.
     </para>
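     <para>
      The following is a minimal sketch of how such a fallback can be
      arranged.  The macro name <literal>MY_STATIC_ASSERT</literal> and the
      <literal>__STDC_VERSION__</literal> test are assumptions made for this
      illustration only; they are not the names or the detection method used
      in the <productname>PostgreSQL</productname> sources.
 <programlisting><![CDATA[
 #include <stdio.h>

 #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
 /* Newer compilers: use the real thing, which reports the given message. */
 #define MY_STATIC_ASSERT(condition, message) \
     do { _Static_assert(condition, message); } while (0)
 #else
 /*
  * C99 fallback: a false condition produces a bit-field of negative width,
  * which is a compile-time error.  The message argument is ignored.
  */
 #define MY_STATIC_ASSERT(condition, message) \
     ((void) sizeof(struct { int static_assert_failure : (condition) ? 1 : -1; }))
 #endif

 int
 main(void)
 {
     MY_STATIC_ASSERT(sizeof(int) >= 2, "int must be at least 16 bits wide");
     printf("static assertion passed\n");
     return 0;
 }
 ]]></programlisting>
      In the fallback branch the check is still made at compile time, but
      the diagnostic complains about the bit-field width rather than quoting
      the assertion text, which is why such messages are rather cryptic.
     </para>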
    </simplesect>

    <simplesect>
     <title>Function-Like Macros and Inline Functions</title>
     <para>
      Both macros with arguments and <literal>static inline</literal>
      functions may be used. The latter are preferable if there are
      multiple-evaluation hazards when written as a macro, as is the
      case with, e.g.,
 <programlisting>
 #define Max(x, y)       ((x) > (y) ? (x) : (y))
 </programlisting>
      or when the macro would be very long. In other cases it's only
      possible to use macros, or at least easier, for example because
      expressions of various types need to be passed to the macro.
     </para>
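     <para>
      The multiple-evaluation hazard can be demonstrated with a small
      stand-alone program.  This is only an illustrative sketch; the
      function name <function>max_int</function> is hypothetical and does
      not refer to anything in the <productname>PostgreSQL</productname>
      sources.
 <programlisting><![CDATA[
 #include <stdio.h>

 #define Max(x, y)       ((x) > (y) ? (x) : (y))

 /* Inline counterpart: each argument is evaluated exactly once. */
 static inline int
 max_int(int x, int y)
 {
     return (x > y) ? x : y;
 }

 int
 main(void)
 {
     int         a = 5;
     int         b = 3;
     int         m;

     /* Expands to ((a++) > (b) ? (a++) : (b)), so a is incremented twice. */
     m = Max(a++, b);
     printf("macro:  m = %d, a = %d\n", m, a);

     a = 5;
     /* The function call evaluates a++ only once. */
     m = max_int(a++, b);
     printf("inline: m = %d, a = %d\n", m, a);

     return 0;
 }
 ]]></programlisting>
      With these inputs the macro returns 6 and leaves
      <varname>a</varname> at 7, whereas the inline function returns 5 and
      leaves <varname>a</varname> at 6.
     </para>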
     <para>
      When the definition of an inline function references symbols
      (i.e., variables, functions) that are only available as part of the
      backend, the function must not be visible when included from frontend
      code.
 <programlisting>
 #ifndef FRONTEND
 static inline MemoryContext
 MemoryContextSwitchTo(MemoryContext context)
 {
     MemoryContext old = CurrentMemoryContext;

     CurrentMemoryContext = context;
     return old;
 }
 #endif   /* FRONTEND */
 </programlisting>
      In this example, <literal>CurrentMemoryContext</literal>, which is only
      available in the backend, is referenced, and the function is thus
      hidden with a <literal>#ifndef FRONTEND</literal>. This rule
      exists because some compilers emit references to symbols
      contained in inline functions even if the function is not used.
     </para>
    </simplesect>

    <simplesect>
     <title>Writing Signal Handlers</title>
     <para>
      To be suitable to run inside a signal handler, code has to be
      written very carefully. The fundamental problem is that, unless
      blocked, a signal handler can interrupt code at any time. If code
      inside the signal handler uses the same state as code outside,
      chaos may ensue. As an example, consider what happens if a signal
      handler tries to acquire a lock that's already held in the
      interrupted code.
     </para>
     <para>
      Barring special arrangements, code in signal handlers may only
      call async-signal-safe functions (as defined in POSIX) and access
      variables of type <literal>volatile sig_atomic_t</literal>. A few
      functions in <command>postgres</command> are also deemed signal safe,
      most importantly <function>SetLatch()</function>.
     </para>
     <para>
      In most cases, signal handlers should do nothing more than note
      that a signal has arrived, and wake up code running outside of
      the handler using a latch. An example of such a handler is the
      following:
 <programlisting>
 static void
 handle_sighup(SIGNAL_ARGS)
 {
     int         save_errno = errno;

     got_SIGHUP = true;
     SetLatch(MyLatch);

     errno = save_errno;
 }
 </programlisting>
      <varname>errno</varname> is saved and restored because
      <function>SetLatch()</function> might change it. If that were not done,
      interrupted code that's currently inspecting <varname>errno</varname>
      might see the wrong value.
     </para>
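     <para>
      The same pattern can be tried outside the backend with plain POSIX
      calls.  The following stand-alone sketch is an approximation only: it
      has no latch, so the main code simply checks the flag after the signal
      has been delivered, the handler takes a plain <type>int</type>
      argument instead of <literal>SIGNAL_ARGS</literal>, and the helper
      <function>install_sighup_handler</function> is a hypothetical name
      introduced for this illustration.
 <programlisting><![CDATA[
 #define _POSIX_C_SOURCE 200809L

 #include <errno.h>
 #include <signal.h>
 #include <stdio.h>
 #include <string.h>

 /* The only state shared between the handler and the interrupted code. */
 static volatile sig_atomic_t got_SIGHUP = 0;

 static void
 handle_sighup(int signo)
 {
     int         save_errno = errno;

     (void) signo;
     got_SIGHUP = 1;             /* just note that the signal arrived */

     errno = save_errno;
 }

 /* Hypothetical helper: returns 0 on success, -1 on failure. */
 static int
 install_sighup_handler(void)
 {
     struct sigaction sa;

     memset(&sa, 0, sizeof(sa));
     sa.sa_handler = handle_sighup;
     sigemptyset(&sa.sa_mask);
     return sigaction(SIGHUP, &sa, NULL);
 }

 int
 main(void)
 {
     if (install_sighup_handler() != 0)
         return 1;

     raise(SIGHUP);              /* stand-in for a signal from outside */

     /* The real work happens here, outside of the handler. */
     if (got_SIGHUP)
     {
         got_SIGHUP = 0;
         printf("reloading configuration\n");
     }
     return 0;
 }
 ]]></programlisting>
      All nontrivial processing, here represented by the
      <function>printf()</function> call, is kept in the main code, where
      it cannot interrupt itself.
     </para>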
    </simplesect>

    <simplesect>
     <title>Calling Function Pointers</title>

     <para>
      For clarity, it is preferred to explicitly dereference a function pointer
      when calling the pointed-to function if the pointer is a simple variable,
      for example:
 <programlisting>
 (*emit_log_hook) (edata);
 </programlisting>
      (even though <literal>emit_log_hook(edata)</literal> would also work).
      When the function pointer is part of a structure, then the extra
      punctuation can and usually should be omitted, for example:
 <programlisting>
 paramInfo->paramFetch(paramInfo, paramId);
 </programlisting>
     </para>
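     <para>
      Both calling styles are shown side by side in the following
      stand-alone sketch.  All names in it (<varname>log_hook</varname>,
      <type>ParamSource</type>, <function>fetch_with_base</function>, and
      so on) are hypothetical and chosen only for this illustration.
 <programlisting><![CDATA[
 #include <stdio.h>

 typedef void (*log_hook_type) (const char *message);

 typedef struct ParamSource
 {
     int         (*fetch) (struct ParamSource *source, int id);
     int         base;
 } ParamSource;

 static void
 print_message(const char *message)
 {
     printf("log: %s\n", message);
 }

 static int
 fetch_with_base(ParamSource *source, int id)
 {
     return source->base + id;
 }

 /* A function pointer held in a simple variable. */
 static log_hook_type log_hook = print_message;

 int
 main(void)
 {
     ParamSource source;

     source.fetch = fetch_with_base;
     source.base = 100;

     /* Simple variable: dereference explicitly. */
     if (log_hook)
         (*log_hook) ("hook called");

     /* Structure member: call through the member directly. */
     printf("value: %d\n", source.fetch(&source, 7));

     return 0;
 }
 ]]></programlisting>
      The explicit dereference makes it obvious at the call site that
      <varname>log_hook</varname> is a variable rather than an ordinary
      function, while the member access in the second call already makes
      that clear.
     </para>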
    </simplesect>
   </sect1>
  </chapter>