| = Virtual Tables |
| |
| Apache Cassandra 4.0 implements virtual tables (https://issues.apache.org/jira/browse/CASSANDRA-7622[CASSANDRA-7622]). |
| Virtual tables are tables backed by an API instead of data explicitly managed and stored as SSTables. |
| Apache Cassandra 4.0 implements a virtual keyspace interface for virtual tables. |
| Virtual tables are specific to each node. |
| |
| Some of the features of virtual tables are the ability to: |
| |
| * expose metrics through CQL |
| * expose YAML configuration information |
| |
| Virtual keyspaces and tables are quite different from regular tables and keyspaces: |
| |
| * Virtual tables are created in special keyspaces and not just any keyspace. |
| * Virtual tables are managed by Cassandra. Users cannot run DDL to create new virtual tables or DML to modify existing virtual tables. |
| * Virtual tables are currently read-only, although that may change in a later version. |
| * Virtual tables are local only, non-distributed, and thus not replicated. |
| * Virtual tables have no associated SSTables. |
| * Consistency level of the queries sent to virtual tables are ignored. |
| * All existing virtual tables use `LocalPartitioner`. |
| Since a virtual table is not replicated the partitioner sorts in order of partition keys instead of by their hash. |
| * Making advanced queries using `ALLOW FILTERING` and aggregation functions can be executed in virtual tables, even though in normal tables we dont recommend it. |
| |
| == Virtual Keyspaces |
| |
| Apache Cassandra 4.0 has added two new keyspaces for virtual tables: |
| |
| * `system_virtual_schema` |
| * `system_views`. |
| |
| The `system_virtual_schema` keyspace has three tables: `keyspaces`, |
| `columns` and `tables` for the virtual keyspace, table, and column definitions, respectively. |
| These tables contain schema information for the virtual tables. |
| It is used by Cassandra internally and a user should not access it directly. |
| |
| The `system_views` keyspace contains the actual virtual tables. |
| |
| == Virtual Table Limitations |
| |
| Before disccusing virtual keyspaces and tables, note that virtual keyspaces and tables have some limitations. |
| These limitations are subject to change. |
| Virtual keyspaces cannot be altered or dropped. |
| In fact, no operations can be performed against virtual keyspaces. |
| |
| Virtual tables cannot be created in virtual keyspaces. |
| Virtual tables cannot be altered, dropped, or truncated. |
| Secondary indexes, types, functions, aggregates, materialized views, and triggers cannot be created for virtual tables. |
| Expiring time-to-live (TTL) columns cannot be created. |
| Virtual tables do not support conditional updates or deletes. |
| Aggregates may be run in SELECT statements. |
| |
| Conditional batch statements cannot include mutations for virtual tables, nor can a virtual table statement be included in a logged batch. |
| In fact, mutations for virtual and regular tables cannot occur in the same batch table. |
| |
| == Virtual Tables |
| |
| Each of the virtual tables in the `system_views` virtual keyspace contain different information. |
| |
| The following table describes the virtual tables: |
| |
| [width="98%",cols="27%,73%",] |
| |=== |
| |Virtual Table |Description |
| |
| |caches |Displays the general cache information including cache name, capacity_bytes, entry_count, hit_count, hit_ratio double, |
| recent_hit_rate_per_second, recent_request_rate_per_second, request_count, and size_bytes. |
| |
| |clients |Lists information about all connected clients. |
| |
| |coordinator_read_latency |Records counts, keyspace_name, table_name, max, median, and per_second for coordinator reads. |
| |
| |coordinator_scan |Records counts, keyspace_name, table_name, max, median, and per_second for coordinator scans. |
| |
| |coordinator_write_latency |Records counts, keyspace_name, table_name, max, median, and per_second for coordinator writes. |
| |
| |cql_metrics |Metrics specific to CQL prepared statement caching. |
| |
| |disk_usage |Records disk usage including disk_space, keyspace_name, and table_name, sorted by system keyspaces. |
| |
| |gossip_info |Lists the gossip information for the cluster. |
| |
| |internode_inbound |Lists information about the inbound internode messaging. |
| |
| |internode_outbound |Information about the outbound internode messaging. |
| |
| |local_read_latency |Records counts, keyspace_name, table_name, max, median, and per_second for local reads. |
| |
| |local_scan |Records counts, keyspace_name, table_name, max, median, and per_second for local scans. |
| |
| |local_write_latency |Records counts, keyspace_name, table_name, max, median, and per_second for local writes. |
| |
| |max_partition_size |A table metric for maximum partition size. |
| |
| |rows_per_read |Records counts, keyspace_name, tablek_name, max, and median for rows read. |
| |
| |settings |Displays configuration settings in cassandra.yaml. |
| |
| |sstable_tasks |Lists currently running tasks and progress on SSTables, for operations like compaction and upgrade. |
| |
| |system_properties |Displays environmental system properties set on the node. |
| |
| |thread_pools |Lists metrics for each thread pool. |
| |
| |tombstones_per_read |Records counts, keyspace_name, tablek_name, max, and median for tombstones. |
| |
| |=== |
| |
| We shall discuss some of the virtual tables in more detail next. |
| |
| === Clients Virtual Table |
| |
| The `clients` virtual table lists all active connections (connected |
| clients) including their ip address, port, client_options, connection stage, driver |
| name, driver version, hostname, protocol version, request count, ssl |
| enabled, ssl protocol and user name: |
| |
| .... |
| cqlsh> EXPAND ON ; |
| Now Expanded output is enabled |
| cqlsh> SELECT * FROM system_views.clients; |
| |
| @ Row 1 |
| ------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| address | 127.0.0.1 |
| port | 50687 |
| client_options | {'CQL_VERSION': '3.4.6', 'DRIVER_NAME': 'DataStax Python Driver', 'DRIVER_VERSION': '3.25.0'} |
| connection_stage | ready |
| driver_name | DataStax Python Driver |
| driver_version | 3.25.0 |
| hostname | localhost |
| protocol_version | 5 |
| request_count | 16 |
| ssl_cipher_suite | null |
| ssl_enabled | False |
| ssl_protocol | null |
| username | anonymous |
| |
| @ Row 2 |
| ------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| address | 127.0.0.1 |
| port | 50688 |
| client_options | {'CQL_VERSION': '3.4.6', 'DRIVER_NAME': 'DataStax Python Driver', 'DRIVER_VERSION': '3.25.0'} |
| connection_stage | ready |
| driver_name | DataStax Python Driver |
| driver_version | 3.25.0 |
| hostname | localhost |
| protocol_version | 5 |
| request_count | 4 |
| ssl_cipher_suite | null |
| ssl_enabled | False |
| ssl_protocol | null |
| username | anonymous |
| |
| @ Row 3 |
| ------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| address | 127.0.0.1 |
| port | 50753 |
| client_options | {'APPLICATION_NAME': 'TestApp', 'APPLICATION_VERSION': '1.0.0', 'CLIENT_ID': '55b3efbd-c56b-469d-8cca-016b860b2f03', 'CQL_VERSION': '3.0.0', 'DRIVER_NAME': 'DataStax Java driver for Apache Cassandra(R)', 'DRIVER_VERSION': '4.13.0'} |
| connection_stage | ready |
| driver_name | DataStax Java driver for Apache Cassandra(R) |
| driver_version | 4.13.0 |
| hostname | localhost |
| protocol_version | 5 |
| request_count | 18 |
| ssl_cipher_suite | null |
| ssl_enabled | False |
| ssl_protocol | null |
| username | anonymous |
| |
| @ Row 4 |
| ------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| address | 127.0.0.1 |
| port | 50755 |
| client_options | {'APPLICATION_NAME': 'TestApp', 'APPLICATION_VERSION': '1.0.0', 'CLIENT_ID': '55b3efbd-c56b-469d-8cca-016b860b2f03', 'CQL_VERSION': '3.0.0', 'DRIVER_NAME': 'DataStax Java driver for Apache Cassandra(R)', 'DRIVER_VERSION': '4.13.0'} |
| connection_stage | ready |
| driver_name | DataStax Java driver for Apache Cassandra(R) |
| driver_version | 4.13.0 |
| hostname | localhost |
| protocol_version | 5 |
| request_count | 7 |
| ssl_cipher_suite | null |
| ssl_enabled | False |
| ssl_protocol | null |
| username | anonymous |
| |
| (4 rows) |
| .... |
| |
| Some examples of how `clients` can be used are: |
| |
| * To find applications using old incompatible versions of drivers before |
| upgrading and with `nodetool enableoldprotocolversions` and |
| `nodetool disableoldprotocolversions` during upgrades. |
| * To identify clients sending too many requests. |
| * To find if SSL is enabled during the migration to and from ssl. |
| * To identify all options the client is sending, e.g. APPLICATION_NAME and APPLICATION_VERSION |
| |
| The virtual tables may be described with `DESCRIBE` statement. The DDL |
| listed however cannot be run to create a virtual table. As an example |
| describe the `system_views.clients` virtual table: |
| |
| .... |
| cqlsh> DESCRIBE TABLE system_views.clients; |
| |
| /* |
| Warning: Table system_views.clients is a virtual table and cannot be recreated with CQL. |
| Structure, for reference: |
| VIRTUAL TABLE system_views.clients ( |
| address inet, |
| port int, |
| client_options frozen<map<text, text>>, |
| connection_stage text, |
| driver_name text, |
| driver_version text, |
| hostname text, |
| protocol_version int, |
| request_count bigint, |
| ssl_cipher_suite text, |
| ssl_enabled boolean, |
| ssl_protocol text, |
| username text, |
| PRIMARY KEY (address, port) |
| ) WITH CLUSTERING ORDER BY (port ASC) |
| AND comment = 'currently connected clients'; |
| */ |
| .... |
| |
| === Caches Virtual Table |
| |
| The `caches` virtual table lists information about the caches. The four |
| caches presently created are chunks, counters, keys and rows. A query on |
| the `caches` virtual table returns the following details: |
| |
| .... |
| cqlsh:system_views> SELECT * FROM system_views.caches; |
| name | capacity_bytes | entry_count | hit_count | hit_ratio | recent_hit_rate_per_second | recent_request_rate_per_second | request_count | size_bytes |
| ---------+----------------+-------------+-----------+-----------+----------------------------+--------------------------------+---------------+------------ |
| chunks | 229638144 | 29 | 166 | 0.83 | 5 | 6 | 200 | 475136 |
| counters | 26214400 | 0 | 0 | NaN | 0 | 0 | 0 | 0 |
| keys | 52428800 | 14 | 124 | 0.873239 | 4 | 4 | 142 | 1248 |
| rows | 0 | 0 | 0 | NaN | 0 | 0 | 0 | 0 |
| |
| (4 rows) |
| .... |
| |
| === CQL metrics Virtual Table |
| The `cql_metrics` virtual table lists metrics specific to CQL prepared statement caching. A query on `cql_metrics` virtual table lists below metrics. |
| |
| .... |
| cqlsh> select * from system_views.cql_metrics ; |
| |
| name | value |
| ------------------------------+------- |
| prepared_statements_count | 0 |
| prepared_statements_evicted | 0 |
| prepared_statements_executed | 0 |
| prepared_statements_ratio | 0 |
| regular_statements_executed | 17 |
| .... |
| |
| === Settings Virtual Table |
| |
| The `settings` table is rather useful and lists all the current |
| configuration settings from the `cassandra.yaml`. The encryption options |
| are overridden to hide the sensitive truststore information or |
| passwords. The configuration settings however cannot be set using DML on |
| the virtual table presently: : |
| |
| .... |
| cqlsh:system_views> SELECT * FROM system_views.settings; |
| |
| name | value |
| -------------------------------------+-------------------- |
| allocate_tokens_for_keyspace | null |
| audit_logging_options_enabled | false |
| auto_snapshot | true |
| automatic_sstable_upgrade | false |
| cluster_name | Test Cluster |
| transient_replication_enabled | false |
| hinted_handoff_enabled | true |
| hints_directory | /home/ec2-user/cassandra/data/hints |
| incremental_backups | false |
| initial_token | null |
| ... |
| ... |
| ... |
| rpc_address | localhost |
| ssl_storage_port | 7001 |
| start_native_transport | true |
| storage_port | 7000 |
| stream_entire_sstables | true |
| (224 rows) |
| .... |
| |
| The `settings` table can be really useful if yaml file has been changed |
| since startup and dont know running configuration, or to find if they |
| have been modified via jmx/nodetool or virtual tables. |
| |
| === Thread Pools Virtual Table |
| |
| The `thread_pools` table lists information about all thread pools. |
| Thread pool information includes active tasks, active tasks limit, |
| blocked tasks, blocked tasks all time, completed tasks, and pending |
| tasks. A query on the `thread_pools` returns following details: |
| |
| .... |
| cqlsh:system_views> select * from system_views.thread_pools; |
| |
| name | active_tasks | active_tasks_limit | blocked_tasks | blocked_tasks_all_time | completed_tasks | pending_tasks |
| ------------------------------+--------------+--------------------+---------------+------------------------+-----------------+--------------- |
| AntiEntropyStage | 0 | 1 | 0 | 0 | 0 | 0 |
| CacheCleanupExecutor | 0 | 1 | 0 | 0 | 0 | 0 |
| CompactionExecutor | 0 | 2 | 0 | 0 | 881 | 0 |
| CounterMutationStage | 0 | 32 | 0 | 0 | 0 | 0 |
| GossipStage | 0 | 1 | 0 | 0 | 0 | 0 |
| HintsDispatcher | 0 | 2 | 0 | 0 | 0 | 0 |
| InternalResponseStage | 0 | 2 | 0 | 0 | 0 | 0 |
| MemtableFlushWriter | 0 | 2 | 0 | 0 | 1 | 0 |
| MemtablePostFlush | 0 | 1 | 0 | 0 | 2 | 0 |
| MemtableReclaimMemory | 0 | 1 | 0 | 0 | 1 | 0 |
| MigrationStage | 0 | 1 | 0 | 0 | 0 | 0 |
| MiscStage | 0 | 1 | 0 | 0 | 0 | 0 |
| MutationStage | 0 | 32 | 0 | 0 | 0 | 0 |
| Native-Transport-Requests | 1 | 128 | 0 | 0 | 130 | 0 |
| PendingRangeCalculator | 0 | 1 | 0 | 0 | 1 | 0 |
| PerDiskMemtableFlushWriter_0 | 0 | 2 | 0 | 0 | 1 | 0 |
| ReadStage | 0 | 32 | 0 | 0 | 13 | 0 |
| Repair-Task | 0 | 2147483647 | 0 | 0 | 0 | 0 |
| RequestResponseStage | 0 | 2 | 0 | 0 | 0 | 0 |
| Sampler | 0 | 1 | 0 | 0 | 0 | 0 |
| SecondaryIndexManagement | 0 | 1 | 0 | 0 | 0 | 0 |
| ValidationExecutor | 0 | 2147483647 | 0 | 0 | 0 | 0 |
| ViewBuildExecutor | 0 | 1 | 0 | 0 | 0 | 0 |
| ViewMutationStage | 0 | 32 | 0 | 0 | 0 | 0 |
| .... |
| |
| (24 rows) |
| |
| === Internode Inbound Messaging Virtual Table |
| |
| The `internode_inbound` virtual table is for the internode inbound |
| messaging. Initially no internode inbound messaging may get listed. In |
| addition to the address, port, datacenter and rack information includes |
| corrupt frames recovered, corrupt frames unrecovered, error bytes, error |
| count, expired bytes, expired count, processed bytes, processed count, |
| received bytes, received count, scheduled bytes, scheduled count, |
| throttled count, throttled nanos, using bytes, using reserve bytes. A |
| query on the `internode_inbound` returns following details: |
| |
| .... |
| cqlsh:system_views> SELECT * FROM system_views.internode_inbound; |
| address | port | dc | rack | corrupt_frames_recovered | corrupt_frames_unrecovered | |
| error_bytes | error_count | expired_bytes | expired_count | processed_bytes | |
| processed_count | received_bytes | received_count | scheduled_bytes | scheduled_count | throttled_count | throttled_nanos | using_bytes | using_reserve_bytes |
| ---------+------+----+------+--------------------------+----------------------------+- |
| ---------- |
| (0 rows) |
| .... |
| |
| === SSTables Tasks Virtual Table |
| |
| The `sstable_tasks` could be used to get information about running |
| tasks. It lists following columns: |
| |
| .... |
| cqlsh:system_views> SELECT * FROM sstable_tasks; |
| keyspace_name | table_name | task_id | kind | progress | total | unit |
| ---------------+------------+--------------------------------------+------------+----------+----------+------- |
| basic | wide2 | c3909740-cdf7-11e9-a8ed-0f03de2d9ae1 | compaction | 60418761 | 70882110 | bytes |
| basic | wide2 | c7556770-cdf7-11e9-a8ed-0f03de2d9ae1 | compaction | 2995623 | 40314679 | bytes |
| .... |
| |
| As another example, to find how much time is remaining for SSTable |
| tasks, use the following query: |
| |
| .... |
| SELECT total - progress AS remaining |
| FROM system_views.sstable_tasks; |
| .... |
| |
| === Gossip Information Virtual Table |
| |
| The `gossip_info` virtual table lists the Gossip information for the cluster. An example query is as follows: |
| |
| .... |
| cqlsh> select address, port, generation, heartbeat, load, dc, rack from system_views.gossip_info; |
| |
| address | port | generation | heartbeat | load | dc | rack |
| -----------+------+------------+-----------+---------+-------------+------- |
| 127.0.0.1 | 7000 | 1645575140 | 312 | 70542.0 | datacenter1 | rack1 |
| 127.0.0.2 | 7000 | 1645575135 | 318 | 70499.0 | datacenter1 | rack1 |
| 127.0.0.3 | 7000 | 1645575140 | 312 | 70504.0 | datacenter1 | rack1 |
| 127.0.0.4 | 7000 | 1645575141 | 311 | 70502.0 | datacenter1 | rack1 |
| 127.0.0.5 | 7000 | 1645575136 | 315 | 70500.0 | datacenter1 | rack1 |
| |
| (5 rows) |
| .... |
| |
| === Other Virtual Tables |
| |
| Some examples of using other virtual tables are as follows. |
| |
| Find tables with most disk usage: |
| |
| .... |
| cqlsh> SELECT * FROM disk_usage WHERE mebibytes > 1 ALLOW FILTERING; |
| |
| keyspace_name | table_name | mebibytes |
| ---------------+------------+----------- |
| keyspace1 | standard1 | 288 |
| tlp_stress | keyvalue | 3211 |
| .... |
| |
| Find queries on table/s with greatest read latency: |
| |
| .... |
| cqlsh> SELECT * FROM local_read_latency WHERE per_second > 1 ALLOW FILTERING; |
| |
| keyspace_name | table_name | p50th_ms | p99th_ms | count | max_ms | per_second |
| ---------------+------------+----------+----------+----------+---------+------------ |
| tlp_stress | keyvalue | 0.043 | 0.152 | 49785158 | 186.563 | 11418.356 |
| .... |
| |
| |
| == Example |
| |
| [arabic, start=1] |
| . To list the keyspaces, enter ``cqlsh`` and run the CQL command ``DESCRIBE KEYSPACES``: |
| |
| [source, cql] |
| ---- |
| cqlsh> DESC KEYSPACES; |
| system_schema system system_distributed system_virtual_schema |
| system_auth system_traces system_views |
| ---- |
| |
| [arabic, start=2] |
| . To view the virtual table schema, run the CQL commands ``USE system_virtual_schema`` and ``SELECT * FROM tables``: |
| |
| [source, cql] |
| ---- |
| cqlsh> USE system_virtual_schema; |
| cqlsh> SELECT * FROM tables; |
| ---- |
| |
| results in: |
| |
| [source, cql] |
| ---- |
| keyspace_name | table_name | comment |
| -----------------------+---------------------------+-------------------------------------- |
| system_views | caches | system caches |
| system_views | clients | currently connected clients |
| system_views | coordinator_read_latency | |
| system_views | coordinator_scan_latency | |
| system_views | coordinator_write_latency | |
| system_views | disk_usage | |
| system_views | internode_inbound | |
| system_views | internode_outbound | |
| system_views | local_read_latency | |
| system_views | local_scan_latency | |
| system_views | local_write_latency | |
| system_views | max_partition_size | |
| system_views | rows_per_read | |
| system_views | settings | current settings |
| system_views | sstable_tasks | current sstable tasks |
| system_views | system_properties | Cassandra relevant system properties |
| system_views | thread_pools | |
| system_views | tombstones_per_read | |
| system_virtual_schema | columns | virtual column definitions |
| system_virtual_schema | keyspaces | virtual keyspace definitions |
| system_virtual_schema | tables | virtual table definitions |
| |
| (21 rows) |
| ---- |
| |
| [arabic, start=3] |
| . To view the virtual tables, run the CQL commands ``USE system_view`` and ``DESCRIBE tables``: |
| |
| [source, cql] |
| ---- |
| cqlsh> USE system_view;; |
| cqlsh> DESCRIBE tables; |
| ---- |
| |
| results in: |
| |
| [source, cql] |
| ---- |
| sstable_tasks clients coordinator_write_latency |
| disk_usage local_write_latency tombstones_per_read |
| thread_pools internode_outbound settings |
| local_scan_latency coordinator_scan_latency system_properties |
| internode_inbound coordinator_read_latency max_partition_size |
| local_read_latency rows_per_read caches |
| ---- |
| |
| [arabic, start=4] |
| . To look at any table data, run the CQL command ``SELECT``: |
| |
| [source, cql] |
| ---- |
| cqlsh> USE system_view;; |
| cqlsh> SELECT * FROM clients LIMIT 2; |
| ---- |
| results in: |
| |
| [source, cql] |
| ---- |
| address | port | connection_stage | driver_name | driver_version | hostname | protocol_version | request_count | ssl_cipher_suite | ssl_enabled | ssl_protocol | username |
| -----------+-------+------------------+------------------------+----------------+-----------+------------------+---------------+------------------+-------------+--------------+----------- |
| 127.0.0.1 | 37308 | ready | DataStax Python Driver | 3.21.0.post0 | localhost | 4 | 17 | null | False | null | anonymous |
| 127.0.0.1 | 37310 | ready | DataStax Python Driver | 3.21.0.post0 | localhost | 4 | 8 | null | False | null | anonymous |
| |
| (2 rows) |
| ---- |