src/test/regress/README_UPG2 - hawq - Git at Google


 					 The Upgrade Regression Test
 								(upg2)

    The upg2 test is the test in installcheck-good that tests the
    scripts used to maintain the upgrader.  It manages this by creating
    versions of the catalog that match those that existed in 3.2 and
    3.3 and performs the same catalog upgrade steps on those schemas as
    are used by the actual gpmigrator.

    This test should hopefully diff when a developer makes changes to
    the catalog.  Proper maintenance of the upg2 test will ensure that
    the migrator will keep in sync with the changes made to the
    catalog.
    __________________________________________________________________

 						  Updating the test

    So, what is proper maintenance of the upg2 test?
    What should a developer do or not do to update it?

    ================
    Things not to do:
    ================

    1) Do not update the master file to show your new diffs:

    	  This will mask problems, the test should show that the upgraded
    	  catalogs are identical.

    2) Do not update any files used to upgrade from 3.2:

       This corresponds to (regress/data/upgrade33) - don't edit
       anything in here.  The upgrade from 3.2 catalog to 3.3 catalog
       is already correct all changes should now be in terms of
       upgrading 3.3 catalog => 3.4.

    ===================
    How to make changes:
    -==================

    1) Adding new functions:

       If you add a new function (pg_proc.h) then you must also add the
       new function to:

 	  	- regress/data/upgrade34/upg2_pg_proc_toadd33.data

 		  This file is in pretty much the same format as pg_proc.h.

 		- regress/data/upgrade34/upg2_pg_depend_toadd33.data
 		  This is the pg_depend entries for the new function

 		- regress/data/upgrade34/upg2_pg_description_toadd33.data

 		  This is the description for the new function should exactly
 		  match DESCR() from pg_proc.h

    2) Adding new catalog tables:

         **NOTE WELL**: Any new tables MUST have a fixed OID for the
 		entry in pg_type.  In order to accomplish this the value must
 		be inserted into the list of upgrade types at the bottom of
 		pg_type.h AND it must be added to the list of specially
 		handled cases inside of bootparse.y.

 		If you are adding a new Shared (cross-database) table then you
 		must make sure to update the IsSharedRelation function in
 		catalog.c.

 		After chaging any of the .h files in "include/catalog"
 		(including the pg_type.h changes from the above step) you must
 		update catversion.h do a make distclean and rerun
 		gpinitsystem. Otherwise you may update the rest of the files
 		below based on stale information.

   	    - regress/data/upgrade34/upg2_pg_class_toadd33.data

 		  This file is in pretty much the same format as pg_class.h.
 		  Also includes new index entries (and sequences, views?)

   	    - regress/data/upgrade34/upg2_pg_attribute_toadd33.data

 		  Column definitions.

 		- regress/data/upgrade34/upg2_pg_depend_toadd33.data

 		  This is the pg_depend entries for the new table (and
 		  indexes).  [check for relid in refobjid]

 		- regress/data/upgrade34/upg2_pg_index_toadd33.data

 		  Only required if the new table has indices.  Note that the
 		  index names from the DECLARE_INDEX/DECLARE_UNIQUE_INDEX
 		  definitions in indexing.h.

 		- regress/data/upgrade34/upg2_pg_type_toadd33.data

 		  This is the implementation type for the new table.

   3) Altering existing catalog tables:

      This is probably the most complicated procedure, there are a
      couple rules to follow.

      You must add any new columns onto the END of a table.  Inserting
      columns into the middle of the table is not supported by our
      current upgrade strategy.

      Any changes must be able to be applied incrementally to an
      existing table that contains data.  For instance if you want to
      add a new column with a NOT NULL constraint then you must first
      add the column, then assign values for all existing rows, finally
      add the constraint.

      This is really the only case that you may have to edit either
      regress/input/upg2.source or regress/output/upg2.source.

      a) Open (regress/input/upg2.source) and find the section on
         "Build the 32 catalog".

         In this section we create a bunch of tables in the public
         schema "LIKE" the ones in the catalog schema, this is the set
         of tables that have not had schema edits between 32 and the
         current version.  If your table is in this list then you need
         to move it to the next section "don't use LIKE, as 3.2 had a
         different structure", and create the table explicitly using
         the schema of the table AS IT EXISTED IN 3.2.

         Repeat the above for every major release prior to the current
         one (currently 3.2, 3.3).

         If you are modifying a table that does not appear in the file
         then you will also need to add tests to make sure it upgrades
         properly.  This is more involved and you should ask either
         Caleb or Gavin for help.

 	 b) Edit (regress/input/loadcat34.source) or the corresponding
         file for the target release.

         You should never modify a loadcat file for a previous release.

         This should be edited with the DDL steps necessary to migrate
         from the old schema to the new schema.  Remember to reindex
         any modified tables.

      c) Edit (regress/output/upg2.source)

    	    There are really two things that need to happen here.  First
 	    you need to update the output for any changes you did in step
 	    (a) above.  This should be fairly straight forward.

      d) Have either Caleb or Gavin review the change.

         Altering upg2 to account for catalog schema changes is
         difficult and we want to make sure that nothing is being
         omitted.

      e) No really, have us review it.

         Don't make us hunt you down.

    __________________________________________________________________

 					  Structure of the upg2 test

    The primary test driver for the upg2 test is in
    (regress/input/upg2.source) this file is responsible for creating
    copies of the catalog from prior versions and testing that the
    upgrade scripts (regress/data/upgrade*/upg2_* and
    regress/input/loadcat*) properly upgrade the catalog correctly.
    This occurs in several stages.

    First it creates N skeleton catalogs, one for each prior version
    that we support upgrade FROM.  For any catalog tables that have not
    undergone schema edits we just create them "LIKE" the corresponding
    table in pg_catalog.  For catalog tables that have changed since
    that version we instead create them with the schema that they had
    in the referenced version.

    Second we load in the reference versions of those schemas.  This is
    done with COPY statements from the dumped output of the referenced
    version.  For example the 3.2 schema is loaded from
    (regress/data/upgrade32/pg_*32.data).  Note that these ".data"
    files are dumps from a particular version, so you should never
    update them, that would equate to making a catalog change in a
    released version... which you can't do.

    Third it begins the process of upgrading.  Here it will
    incrementally upgrade from each major version to the next running
    tests that the schemas match correctly along each iteration.

    Currently this means the following:

      1) Upgrade from the 3.2 schema into a 3.3 schema

         - regress/input/loadcat33.source
         - regress/data/upg2_conversion32.source

      2) Test the upgraded 3.2 schema against the loaded 3.3 schema

      3) Upgrade from the 3.3 schema into a 3.4 schema

  	    - regress/input/loadcat34.source

      4) Test the upgraded 3.3 against the schema in pg_catalog

    The structure of the loadcat*.source files is to copy the
    upg2_pg_*_toadd*.data files into the relevant catalog tables,
    perform any additional transforms, and reindex the updated
    catalogs.

	The Upgrade Regression Test
	(upg2)

	The upg2 test is the test in installcheck-good that tests the
	scripts used to maintain the upgrader. It manages this by creating
	versions of the catalog that match those that existed in 3.2 and
	3.3 and performs the same catalog upgrade steps on those schemas as
	are used by the actual gpmigrator.

	This test should hopefully diff when a developer makes changes to
	the catalog. Proper maintenance of the upg2 test will ensure that
	the migrator will keep in sync with the changes made to the
	catalog.
	__________________________________________________________________

	Updating the test

	So, what is proper maintenance of the upg2 test?
	What should a developer do or not do to update it?

	================
	Things not to do:
	================

	1) Do not update the master file to show your new diffs:

	This will mask problems, the test should show that the upgraded
	catalogs are identical.

	2) Do not update any files used to upgrade from 3.2:

	This corresponds to (regress/data/upgrade33) - don't edit
	anything in here. The upgrade from 3.2 catalog to 3.3 catalog
	is already correct all changes should now be in terms of
	upgrading 3.3 catalog => 3.4.

	===================
	How to make changes:
	-==================

	1) Adding new functions:

	If you add a new function (pg_proc.h) then you must also add the
	new function to:

	- regress/data/upgrade34/upg2_pg_proc_toadd33.data

	This file is in pretty much the same format as pg_proc.h.

	- regress/data/upgrade34/upg2_pg_depend_toadd33.data
	This is the pg_depend entries for the new function

	- regress/data/upgrade34/upg2_pg_description_toadd33.data

	This is the description for the new function should exactly
	match DESCR() from pg_proc.h

	2) Adding new catalog tables:

	NOTE WELL: Any new tables MUST have a fixed OID for the
	entry in pg_type. In order to accomplish this the value must
	be inserted into the list of upgrade types at the bottom of
	pg_type.h AND it must be added to the list of specially
	handled cases inside of bootparse.y.

	If you are adding a new Shared (cross-database) table then you
	must make sure to update the IsSharedRelation function in
	catalog.c.

	After chaging any of the .h files in "include/catalog"
	(including the pg_type.h changes from the above step) you must
	update catversion.h do a make distclean and rerun
	gpinitsystem. Otherwise you may update the rest of the files
	below based on stale information.

	- regress/data/upgrade34/upg2_pg_class_toadd33.data

	This file is in pretty much the same format as pg_class.h.
	Also includes new index entries (and sequences, views?)

	- regress/data/upgrade34/upg2_pg_attribute_toadd33.data

	Column definitions.

	- regress/data/upgrade34/upg2_pg_depend_toadd33.data

	This is the pg_depend entries for the new table (and
	indexes). [check for relid in refobjid]

	- regress/data/upgrade34/upg2_pg_index_toadd33.data

	Only required if the new table has indices. Note that the
	index names from the DECLARE_INDEX/DECLARE_UNIQUE_INDEX
	definitions in indexing.h.

	- regress/data/upgrade34/upg2_pg_type_toadd33.data

	This is the implementation type for the new table.

	3) Altering existing catalog tables:

	This is probably the most complicated procedure, there are a
	couple rules to follow.

	You must add any new columns onto the END of a table. Inserting
	columns into the middle of the table is not supported by our
	current upgrade strategy.

	Any changes must be able to be applied incrementally to an
	existing table that contains data. For instance if you want to
	add a new column with a NOT NULL constraint then you must first
	add the column, then assign values for all existing rows, finally
	add the constraint.

	This is really the only case that you may have to edit either
	regress/input/upg2.source or regress/output/upg2.source.

	a) Open (regress/input/upg2.source) and find the section on
	"Build the 32 catalog".

	In this section we create a bunch of tables in the public
	schema "LIKE" the ones in the catalog schema, this is the set
	of tables that have not had schema edits between 32 and the
	current version. If your table is in this list then you need
	to move it to the next section "don't use LIKE, as 3.2 had a
	different structure", and create the table explicitly using
	the schema of the table AS IT EXISTED IN 3.2.

	Repeat the above for every major release prior to the current
	one (currently 3.2, 3.3).

	If you are modifying a table that does not appear in the file
	then you will also need to add tests to make sure it upgrades
	properly. This is more involved and you should ask either
	Caleb or Gavin for help.

	b) Edit (regress/input/loadcat34.source) or the corresponding
	file for the target release.

	You should never modify a loadcat file for a previous release.

	This should be edited with the DDL steps necessary to migrate
	from the old schema to the new schema. Remember to reindex
	any modified tables.

	c) Edit (regress/output/upg2.source)

	There are really two things that need to happen here. First
	you need to update the output for any changes you did in step
	(a) above. This should be fairly straight forward.

	d) Have either Caleb or Gavin review the change.

	Altering upg2 to account for catalog schema changes is
	difficult and we want to make sure that nothing is being
	omitted.

	e) No really, have us review it.

	Don't make us hunt you down.

	__________________________________________________________________

	Structure of the upg2 test

	The primary test driver for the upg2 test is in
	(regress/input/upg2.source) this file is responsible for creating
	copies of the catalog from prior versions and testing that the
	upgrade scripts (regress/data/upgrade/upg2_ and
	regress/input/loadcat*) properly upgrade the catalog correctly.
	This occurs in several stages.

	First it creates N skeleton catalogs, one for each prior version
	that we support upgrade FROM. For any catalog tables that have not
	undergone schema edits we just create them "LIKE" the corresponding
	table in pg_catalog. For catalog tables that have changed since
	that version we instead create them with the schema that they had
	in the referenced version.

	Second we load in the reference versions of those schemas. This is
	done with COPY statements from the dumped output of the referenced
	version. For example the 3.2 schema is loaded from
	(regress/data/upgrade32/pg_*32.data). Note that these ".data"
	files are dumps from a particular version, so you should never
	update them, that would equate to making a catalog change in a
	released version... which you can't do.

	Third it begins the process of upgrading. Here it will
	incrementally upgrade from each major version to the next running
	tests that the schemas match correctly along each iteration.

	Currently this means the following:

	1) Upgrade from the 3.2 schema into a 3.3 schema

	- regress/input/loadcat33.source
	- regress/data/upg2_conversion32.source

	2) Test the upgraded 3.2 schema against the loaded 3.3 schema

	3) Upgrade from the 3.3 schema into a 3.4 schema

	- regress/input/loadcat34.source

	4) Test the upgraded 3.3 against the schema in pg_catalog

	The structure of the loadcat*.source files is to copy the
	upg2_pg__toadd.data files into the relevant catalog tables,
	perform any additional transforms, and reindex the updated
	catalogs.