| Enable auto-ANALYZE through autovacuum daemon on Coordinator(QD) in GPDB. |
| |
| The basic idea for only enabling auto-ANALYZE through Coordinator’s autovacuum |
| daemon is to collect pgstat info into Coordinator when executing queries. Start |
| the Coordinator’s autovacuum launcher process. Fire an autovacuum work process for |
| a database on Coordinator when the naptime reaches. Then the autovacuum worker |
| will iterate through all tables/materialized views under a specified database, |
| and execute ANALYZE for tables which reached the analyze threshold. Note the |
| ANALYZE statement issued in the autovacuum worker on Coordinator is the same as |
| executing it through query on QD. ie. The auto-ANALYZE is coordinated by the |
| coordinator and segments do not start it’s own autovacuum launcher and autovacuum |
| worker. |
| |
| Implementations |
| ================================================= |
| ### Collect pgstat on coordinator for relations. |
| Since the auto-ANALYZE is fired from Coordinator, we need to collect pgstat info |
| into coordinator. With commit: 259cb9e74cd899edaae63bf9a89a664db19c1ab1, we are |
| able to collect tuple relead pgstat table info from segments. Then the |
| auto-analyze could consider partition tables now. |
| The idea is, for writer QE, report current nest level xact tables pgstat |
| to QD through libpq at the end of a query statement. For a single |
| statement, it wouldn't operate too many tables, so the effort is really |
| small. |
| And on QD, retrieve and combine these tables' stat from the dispatch |
| result and add to current nest level xact pgstats. |
| |
| Currently auto-ANALYZE is not supported for partition root/mid-layer tables, |
| because we don't have accurate pgstat for mid-layer tables and root tables. |
| This is same with Postres. |
| There's already a discussion on Postgres upstream to enable auto-ANALYZE for |
| root/mid-layer partition tables. Once it merged into upstream, we can cherry |
| pick the patch. See: |
| https://www.postgresql.org/message-id/CAKkQ508_PwVgwJyBY=0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA@mail.gmail.com |
| |
| For relation who's storage parameter `autovacuum_enable=false` with autovacuum enabled, |
| if `auto_stat` is configured, the ANALYZE will follow `auto_stats` logic. |
| |
| And since `auto_stats` runs in same transaction that modify the table, |
| the `n_mod_since_analyze` of the table's pgstat will not set to 0 after |
| `auto_stat`'s ANALYZE. |
| We can simulate this with below query: |
| ``` |
| =# CREATE TABLE autostatstbl (id bigint); |
| CREATE TABLE |
| =# begin; |
| BEGIN |
| =*# INSERT INTO autostatstbl select i from generate_series(1, 1000) as i; |
| INSERT 0 1000 |
| =*# analyze autostatstbl; |
| ANALYZE |
| =*# end; |
| COMMIT |
| =# select analyze_count, autoanalyze_count, n_mod_since_analyze from pg_stat_all_tables where relname = 'autostatstbl'; |
| analyze_count | autoanalyze_count | n_mod_since_analyze |
| ---------------+-------------------+--------------------- |
| 1 | 0 | 1000 |
| (1 row) |
| ``` |
| This is because ANALYZE call `pgstat_report_analyze` immediately. |
| but the INSERT statement's pgstat change only gets sent out(pgstat_report_stat) |
| after commit and ready for the next query. |
| So `auto_stats` does not affect whether auto-ANALYZE will be executed |
| for the table. |
| |
| ### Make the autovacuum guc work to start the autovacuum launcher. |
| We used to disable modifying autovacuum ANALYZE related GUCs. Now, only enable |
| `autovacuum`, `autovacuum_naptime`, `autovacuum_analyze_threshold`, |
| `autovacuum_analyze_scale_factor`. |
| |
| ### Recognize databases that need vacuum(template0 only) or analyze under the autovacuum launcher. |
| As in PR https://github.com/greenplum-db/gpdb/pull/4548. We support force xid |
| wraparound VACUUM through autovacuum for template0 only. This is by checking |
| the database’s `datallowconn` field. If it’s false, it means not allowing users |
| to connect it directly. That’s also the reason why we enable force xid wraparound |
| VACUUM for template0. In `get_database_list` under autovacuum.c, it used to only |
| return the databases whos `datallowconn=false`. Note the force xid wraparound |
| VACUUM for template0 could be run on both **Coordinator** and **segments**. |
| |
| Then we reuse the `datallowconn` for auto-ANALYZE logic. If the current |
| database’s `datallowconn` sets to true, we only perform ANALYZE in autovacuum |
| workers on Coordinator. Now in `get_database_list`, we check if the |
| `datallowconn=true`, autovacuum is enabled and current autovacuum daemon is |
| Coordinator’s, we perverse the `datallowconn` in the returned database list. The new |
| logic limit only the **Coordinator’s** autovacuum daemon has the ability to fire |
| autovacuum workers on databases other than template0( so if enable autovacuum |
| on segments, it still only executes force xid wraparound on template0). |
| Also a new field `wi_for_analyze` in `WorkerInfoData` is added to pass this |
| into the autovacuum worker. Then if the target database’s `datallowconn=true` |
| for a new autovacuum worker, the worker’s `wi_for_analyze` will be set to true, |
| and the started worker will only execute ANALYZE logic. Other tables xid |
| wraparound VACUUM are ignored. |
| |
| We also optimize the logic of `do_start_worker`, it used to always choose the |
| force xid wraparound database as the target for a new autovacuum worker. Since |
| we only enable auto-ANALYZE for databases except template0, the old logic will |
| be trapped into a database who needs to execute xid wraparound. But autovacuum |
| workers never execute VACUUM on it. |
| |
| ### Change autovacuum worker’s Gp_role for ANALYZE. |
| Since auto-ANALYZE is coordinated by the Coordinator and segments do not start it’s |
| own autovacuum launcher and autovacuum worker. We need Coordinator’s autovacuum |
| workers which do auto-ANALYZE on a database have `Gp_role = GP_ROLE_DISPATCH`. |
| So it’ll do the ANALYZE on coordinator to calculate statistics with right behavior, |
| for example, acquire sample rows from segments. |
| But for VACUUM on template0, we still keep it running as `GP_ROLE_UTILITY`. |
| |
| ### Autovacuum workers for ANALYZE should not block other commands. |
| In postgres, the autovacuum workers generally don't block other commands. |
| If a process attempts to acquire a lock that conflicts with the SHARE UPDATE |
| EXCLUSIVE lock held by autovacuum, lock acquisition will interrupt the |
| autovacuum. |
| This feature should also work for the auto-ANALYZE’s autovacuum workers. |
| A test is added to ensure this(src/test/isolation2/input/autovacuum-analyze.source). |
| |
| A simple example is: |
| ```sql |
| CREATE TABLE anaabort(id int); |
| INSERT INTO anaabort select i from generate_series(1, 1000) as i; |
| |
| -- assume the autovacuum worker is executing ANALYZE on the table |
| |
| T1: BEGIN; |
| T1: LOCK TABLE anaabort in EXCLUSIVE mode; -- abort the autovacuum worker |
| T1: ... |
| T1: END; |
| |
| -- Then the autovacuum worker's ANALYZE command should abort and the |
| -- pg_statistic should not get updated. |
| SELECT count(*) FROM pg_statistic where starelid = 'anaabort'::regclass; |
| count |
| ------- |
| 0 |
| (1 row) |
| ``` |
| |
| Limitations |
| ================================================== |
| |
| ### Fully partition table support |
| As discussed, mid-layer and root partition tables are still not supported |
| for auto-ANALYZE. |
| |
| ### Clean orphan temp tables in autovacuum |
| For GPDB, the temp table schema is named as "pg_temp_<gp_session_id>" while |
| Postgres is "pg_temp_<backendid>". |
| During the autovacuum orphan temp table checking, we can not parse the |
| backendid in GPDB. So, autovacuum will failed to find the backend which using |
| the temp table. And it'll treat all temp tables as orphan. So disable clean up |
| orphan temp tables in GPDB for now. |
| |
| ### Cost-based delay |
| In Postgres, the cost-base delay feature is used to limit the VACUUM and |
| ANALYZE IO cost. Since the IO cost is not collected to coordinator, so feature |
| is not working in cluster wide. But we can still use it to limit IO cost for |
| each segment. |
| We plan to support the disk io limit in the resource group. Once it is ready, |
| we will see if the resource group can help here. |
| |
| |
| ### VACUUM under autovacuum |
| Now, GPDB only supports xid wrap around VACUUM on template0 through the |
| autovacuum. Since we still have different opinions for the auto-vacuum |
| implementation, current PR does not include the vacuum part. |