 <!-- doc/src/sgml/plhandler.sgml -->

  <chapter id="plhandler">
    <title>Writing a Procedural Language Handler</title>

    <indexterm zone="plhandler">
     <primary>procedural language</primary>
     <secondary>handler for</secondary>
    </indexterm>

    <para>
     All calls to functions that are not written using
     the current <quote>version 1</quote> interface for compiled
     languages (this includes functions in user-defined procedural languages
     and functions written in SQL) go through a <firstterm>call handler</firstterm>
     function for the specific language.  It is the responsibility of
     the call handler to execute the function in a meaningful way, such
     as by interpreting the supplied source text.  This chapter outlines
     how a new procedural language's call handler can be written.
    </para>

    <para>
     The call handler for a procedural language is a
     <quote>normal</quote> function that must be written in a compiled
     language such as C, using the version-1 interface, and registered
     with <productname>PostgreSQL</productname> as taking no arguments
     and returning the type <type>language_handler</type>.  This
     special pseudo-type identifies the function as a call handler and
     prevents it from being called directly in SQL commands.
     For more details on C language calling conventions and dynamic loading,
     see <xref linkend="xfunc-c"/>.
    </para>

    <para>
     The call handler is called in the same way as any other function:
     it receives a pointer to a
     <structname>FunctionCallInfoBaseData</structname> <type>struct</type> containing
     argument values and information about the called function, and it
     is expected to return a <type>Datum</type> result (and possibly
     set the <structfield>isnull</structfield> field of the
     <structname>FunctionCallInfoBaseData</structname> structure, if it wishes
     to return an SQL null result).  The difference between a call
     handler and an ordinary callee function is that the
     <structfield>flinfo-&gt;fn_oid</structfield> field of the
     <structname>FunctionCallInfoBaseData</structname> structure will contain
     the OID of the actual function to be called, not of the call
     handler itself.  The call handler must use this field to determine
     which function to execute.  Also, the passed argument list has
     been set up according to the declaration of the target function,
     not of the call handler.
    </para>

    <para>
     It's up to the call handler to fetch the entry of the function from the
     <classname>pg_proc</classname> system catalog and to analyze the argument
     and return types of the called function. The <literal>AS</literal> clause from the
     <command>CREATE FUNCTION</command> command for the function will be found
     in the <literal>prosrc</literal> column of the
     <classname>pg_proc</classname> row. This is commonly source
     text in the procedural language, but in theory it could be something else,
     such as a path name to a file, or anything else that tells the call handler
     what to do in detail.
    </para>
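
    <para>
     The lookup described above can be sketched as follows.  This is only an
     illustration under stated assumptions, not a complete handler: the name
     <function>plsample_call_handler</function> is hypothetical, and the
     <quote>execution</quote> step merely reports the source text and
     returns an SQL null.
    </para>
<programlisting>
#include "postgres.h"

#include "access/htup_details.h"
#include "catalog/pg_proc.h"
#include "fmgr.h"
#include "utils/builtins.h"
#include "utils/syscache.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(plsample_call_handler);

Datum
plsample_call_handler(PG_FUNCTION_ARGS)
{
    /* fn_oid identifies the target function, not this handler */
    Oid         fn_oid = fcinfo-&gt;flinfo-&gt;fn_oid;
    HeapTuple   proctup;
    Datum       srcdatum;
    bool        isnull;
    char       *source;

    /* Fetch the target function's pg_proc row from the syscache */
    proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(fn_oid));
    if (!HeapTupleIsValid(proctup))
        elog(ERROR, "cache lookup failed for function %u", fn_oid);

    /* prosrc holds the text of the AS clause */
    srcdatum = SysCacheGetAttr(PROCOID, proctup,
                               Anum_pg_proc_prosrc, &amp;isnull);
    if (isnull)
        elog(ERROR, "null prosrc for function %u", fn_oid);
    source = TextDatumGetCString(srcdatum);

    ReleaseSysCache(proctup);

    /* A real handler would interpret "source" here (assumption: we only report it) */
    ereport(NOTICE, (errmsg("source text: %s", source)));
    pfree(source);

    /* Sets fcinfo-&gt;isnull and returns a dummy Datum */
    PG_RETURN_NULL();
}
</programlisting>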

    <para>
     Often, the same function is called many times per SQL statement.
     A call handler can avoid repeated lookups of information about the
     called function by using the
     <structfield>flinfo-&gt;fn_extra</structfield> field.  This will
     initially be <symbol>NULL</symbol>, but can be set by the call handler to point at
     information about the called function.  On subsequent calls, if
     <structfield>flinfo-&gt;fn_extra</structfield> is already non-<symbol>NULL</symbol>,
     it can be used and the information lookup step skipped.  The
     call handler must make sure that
     <structfield>flinfo-&gt;fn_extra</structfield> is made to point at
     memory that will live at least until the end of the current query,
     since an <structname>FmgrInfo</structname> data structure could be
     kept that long.  One way to do this is to allocate the extra data
     in the memory context specified by
     <structfield>flinfo-&gt;fn_mcxt</structfield>; such data will
     normally have the same lifespan as the
     <structname>FmgrInfo</structname> itself.  But the handler could
     also choose to use a longer-lived memory context so that it can cache
     function definition information across queries.
    </para>
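
    <para>
     One possible shape for this caching is shown below.  The struct
     <structname>plsample_func_info</structname> and the helper
     <function>plsample_get_func_info</function> are hypothetical names
     introduced for illustration; a real handler would store whatever
     parsed or compiled representation its language needs.
    </para>
<programlisting>
#include "postgres.h"

#include "fmgr.h"

/* Hypothetical per-function cache entry; the fields are illustrative only */
typedef struct plsample_func_info
{
    Oid         fn_oid;         /* OID of the target function */
    char       *source;         /* would hold a copy of prosrc */
} plsample_func_info;

static plsample_func_info *
plsample_get_func_info(FunctionCallInfo fcinfo)
{
    FmgrInfo   *flinfo = fcinfo-&gt;flinfo;
    plsample_func_info *info = (plsample_func_info *) flinfo-&gt;fn_extra;

    if (info == NULL)
    {
        /* First call through this FmgrInfo: build the entry in fn_mcxt */
        info = (plsample_func_info *)
            MemoryContextAllocZero(flinfo-&gt;fn_mcxt,
                                   sizeof(plsample_func_info));
        info-&gt;fn_oid = flinfo-&gt;fn_oid;
        /* ... fill in remaining fields, also allocating in fn_mcxt ... */

        /* Publish the entry only once it is fully initialized */
        flinfo-&gt;fn_extra = info;
    }

    return info;
}
</programlisting>
    <para>
     Assigning <structfield>fn_extra</structfield> last means that an error
     raised while building the entry does not leave a half-initialized
     structure behind for later calls.
    </para>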

    <para>
     When a procedural-language function is invoked as a trigger, no arguments
     are passed in the usual way, but the
     <structname>FunctionCallInfoBaseData</structname>'s
     <structfield>context</structfield> field points at a
     <structname>TriggerData</structname> structure, rather than being <symbol>NULL</symbol>
     as it is in a plain function call.  A language handler should
     provide mechanisms for procedural-language functions to get at the trigger
     information.
    </para>
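
    <para>
     As a rough sketch of how a handler might detect and service a trigger
     call, the hypothetical helper below simply passes the row through
     unchanged.  The helper name <function>plsample_trigger_handler</function>
     is an assumption made for this example; a real language would expose the
     <structname>TriggerData</structname> contents to the user's code instead.
     The module is assumed to declare <symbol>PG_MODULE_MAGIC</symbol>
     elsewhere.
    </para>
<programlisting>
#include "postgres.h"

#include "commands/trigger.h"
#include "fmgr.h"

static Datum
plsample_trigger_handler(PG_FUNCTION_ARGS)
{
    TriggerData *trigdata;
    HeapTuple   rettuple;

    /* In a trigger call, context points at a TriggerData node */
    if (!CALLED_AS_TRIGGER(fcinfo))
        elog(ERROR, "not called by trigger manager");

    trigdata = (TriggerData *) fcinfo-&gt;context;

    /*
     * For an UPDATE the proposed new row is tg_newtuple; for other events
     * the affected row is tg_trigtuple (NULL in statement-level triggers,
     * where the return value is ignored anyway).
     */
    if (TRIGGER_FIRED_BY_UPDATE(trigdata-&gt;tg_event))
        rettuple = trigdata-&gt;tg_newtuple;
    else
        rettuple = trigdata-&gt;tg_trigtuple;

    /* Trigger results are returned as a pointer Datum, never as SQL null */
    return PointerGetDatum(rettuple);
}
</programlisting>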

    <para>
     A template for a procedural-language handler written as a C extension is
     provided in <filename>src/test/modules/plsample</filename>.  This is a
     working sample demonstrating one way to create a procedural-language
     handler, process parameters, and return a value.
    </para>

    <para>
     Although providing a call handler is sufficient to create a minimal
     procedural language, there are two other functions that can optionally
     be provided to make the language more convenient to use.  These
     are a <firstterm>validator</firstterm> and an
     <firstterm>inline handler</firstterm>.  A validator can be provided
     to allow language-specific checking to be done during
     <xref linkend="sql-createfunction"/>.
     An inline handler can be provided to allow the language to support
     anonymous code blocks executed via the <xref linkend="sql-do"/> command.
    </para>

    <para>
     If a validator is provided by a procedural language, it
     must be declared as a function taking a single parameter of type
     <type>oid</type>.  The validator's result is ignored, so it is customarily
     declared to return <type>void</type>.  The validator will be called at
     the end of a <command>CREATE FUNCTION</command> command that has created
     or updated a function written in the procedural language.
     The passed-in OID is the OID of the function's <classname>pg_proc</classname>
     row.  The validator must fetch this row in the usual way, and do
     whatever checking is appropriate.
     First, call <function>CheckFunctionValidatorAccess()</function> to diagnose
     explicit calls to the validator that the user could not achieve through
     <command>CREATE FUNCTION</command>.  Typical checks then include verifying
     that the function's argument and result types are supported by the
     language, and that the function's body is syntactically correct
     in the language.  If the validator finds the function to be okay,
     it should just return.  If it finds an error, it should report that
     via the normal <function>ereport()</function> error reporting mechanism.
     Throwing an error will force a transaction rollback and thus prevent
     the incorrect function definition from being committed.
    </para>

    <para>
     Validator functions should typically honor the <xref
     linkend="guc-check-function-bodies"/> parameter: if it is turned off then
     any expensive or context-sensitive checking should be skipped.  If the
     language provides for code execution at compilation time, the validator
     must suppress checks that would induce such execution.  In particular,
     this parameter is turned off by <application>pg_dump</application> so that it can
     load procedural language functions without worrying about side effects or
     dependencies of the function bodies on other database objects.
     (Because of this requirement, the call handler should avoid
     assuming that the validator has fully checked the function.  The point
     of having a validator is not to let the call handler omit checks, but
     to notify the user immediately if there are obvious errors in a
     <command>CREATE FUNCTION</command> command.)
     While the choice of exactly what to check is mostly left to the
     discretion of the validator function, note that the core
     <command>CREATE FUNCTION</command> code only executes <literal>SET</literal> clauses
     attached to a function when <varname>check_function_bodies</varname> is on.
     Therefore, checks whose results might be affected by GUC parameters
     definitely should be skipped when <varname>check_function_bodies</varname> is
     off, to avoid false failures when reloading a dump.
    </para>
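
    <para>
     The following sketch shows one way a validator could be organized along
     these lines.  The function name <function>plsample_validator</function>
     and the restriction on set-returning functions are assumptions made for
     illustration only, and the module is assumed to declare
     <symbol>PG_MODULE_MAGIC</symbol> elsewhere.
    </para>
<programlisting>
#include "postgres.h"

#include "access/htup_details.h"
#include "catalog/pg_proc.h"
#include "fmgr.h"
#include "utils/guc.h"
#include "utils/syscache.h"

PG_FUNCTION_INFO_V1(plsample_validator);

Datum
plsample_validator(PG_FUNCTION_ARGS)
{
    Oid         funcoid = PG_GETARG_OID(0);
    HeapTuple   proctup;
    Form_pg_proc procstruct;

    /* Reject explicit calls the user could not make via CREATE FUNCTION */
    if (!CheckFunctionValidatorAccess(fcinfo-&gt;flinfo-&gt;fn_oid, funcoid))
        PG_RETURN_VOID();

    proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcoid));
    if (!HeapTupleIsValid(proctup))
        elog(ERROR, "cache lookup failed for function %u", funcoid);
    procstruct = (Form_pg_proc) GETSTRUCT(proctup);

    /* Cheap check that depends only on the pg_proc row (illustrative rule) */
    if (procstruct-&gt;proretset)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("functions in this language cannot return sets")));

    if (check_function_bodies)
    {
        /*
         * Expensive or context-sensitive checks, such as parsing prosrc,
         * belong here so that they are skipped when reloading a dump.
         */
    }

    ReleaseSysCache(proctup);

    PG_RETURN_VOID();
}
</programlisting>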

    <para>
     If an inline handler is provided by a procedural language, it
     must be declared as a function taking a single parameter of type
     <type>internal</type>.  The inline handler's result is ignored, so it is
     customarily declared to return <type>void</type>.  The inline handler
     will be called when a <command>DO</command> statement is executed specifying
     the procedural language.  The parameter actually passed is a pointer
     to an <structname>InlineCodeBlock</structname> struct, which contains information
     about the <command>DO</command> statement's parameters, in particular the
     text of the anonymous code block to be executed.  The inline handler
     should execute this code and return.
    </para>
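
    <para>
     A minimal inline handler might look like the sketch below.  The name
     <function>plsample_inline_handler</function> is hypothetical, and
     instead of executing the code block this example only reports its
     text.  The module is assumed to declare
     <symbol>PG_MODULE_MAGIC</symbol> elsewhere.
    </para>
<programlisting>
#include "postgres.h"

#include "fmgr.h"
#include "nodes/parsenodes.h"

PG_FUNCTION_INFO_V1(plsample_inline_handler);

Datum
plsample_inline_handler(PG_FUNCTION_ARGS)
{
    /* The single "internal" argument is really an InlineCodeBlock pointer */
    InlineCodeBlock *codeblock = (InlineCodeBlock *) PG_GETARG_POINTER(0);

    /* A real handler would execute source_text here (assumption: report only) */
    ereport(NOTICE,
            (errmsg("DO block source: %s", codeblock-&gt;source_text)));

    PG_RETURN_VOID();
}
</programlisting>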

    <para>
     It's recommended that you wrap all these function declarations,
     as well as the <command>CREATE LANGUAGE</command> command itself, into
     an <firstterm>extension</firstterm> so that a simple <command>CREATE EXTENSION</command>
     command is sufficient to install the language.  See
     <xref linkend="extend-extensions"/> for information about writing
     extensions.
    </para>

    <para>
     The procedural languages included in the standard distribution
     are good references when trying to write your own language handler.
     Look into the <filename>src/pl</filename> subdirectory of the source tree.
     The <xref linkend="sql-createlanguage"/>
     reference page also has some useful details.
    </para>

  </chapter>