src/doc/4.0-alpha3/_sources/data_modeling/data_modeling_refining.rst.txt - cassandra-website - Git at Google

 .. Licensed to the Apache Software Foundation (ASF) under one
 .. or more contributor license agreements.  See the NOTICE file
 .. distributed with this work for additional information
 .. regarding copyright ownership.  The ASF licenses this file
 .. to you under the Apache License, Version 2.0 (the
 .. "License"); you may not use this file except in compliance
 .. with the License.  You may obtain a copy of the License at
 ..
 ..     http://www.apache.org/licenses/LICENSE-2.0
 ..
 .. Unless required by applicable law or agreed to in writing, software
 .. distributed under the License is distributed on an "AS IS" BASIS,
 .. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 .. See the License for the specific language governing permissions and
 .. limitations under the License.

 .. role:: raw-latex(raw)
    :format: latex
 ..

 Evaluating and Refining Data Models
 ===================================

 Once you’ve created a physical model, there are some steps you’ll want
 to take to evaluate and refine table designs to help ensure optimal
 performance.

 Calculating Partition Size
 --------------------------

 The first thing that you want to look for is whether your tables will have
 partitions that will be overly large, or to put it another way, too
 wide. Partition size is measured by the number of cells (values) that
 are stored in the partition. Cassandra’s hard limit is 2 billion cells
 per partition, but you’ll likely run into performance issues before
 reaching that limit.

 In order to calculate the size of partitions, use the following
 formula:

 .. math:: N_v = N_r (N_c - N_{pk} - N_s) + N_s

 The number of values (or cells) in the partition (N\ :sub:`v`) is equal to
 the number of static columns (N\ :sub:`s`) plus the product of the number
 of rows (N\ :sub:`r`) and the number of of values per row. The number of
 values per row is defined as the number of columns (N\ :sub:`c`) minus the
 number of primary key columns (N\ :sub:`pk`) and static columns
 (N\ :sub:`s`).

 The number of columns tends to be relatively static, although it
 is possible to alter tables at runtime. For this reason, a
 primary driver of partition size is the number of rows in the partition.
 This is a key factor that you must consider in determining whether a
 partition has the potential to get too large. Two billion values sounds
 like a lot, but in a sensor system where tens or hundreds of values are
 measured every millisecond, the number of values starts to add up pretty
 fast.

 Let’s take a look at one of the tables to analyze the partition size.
 Because it has a wide partition design with one partition per hotel,
 look at the ``available_rooms_by_hotel_date`` table. The table has
 four columns total (N\ :sub:`c` = 4), including three primary key columns
 (N\ :sub:`pk` = 3) and no static columns (N\ :sub:`s` = 0). Plugging these
 values into the formula, the result is:

 .. math:: N_v = N_r (4 - 3 - 0) + 0 = 1N_r

 Therefore the number of values for this table is equal to the number of
 rows. You still need to determine a number of rows. To do this, make
 estimates based on the application design. The table is
 storing a record for each room, in each of hotel, for every night.
 Let's assume the system will be used to store two years of
 inventory at a time, and there are 5,000 hotels in the system, with an
 average of 100 rooms in each hotel.

 Since there is a partition for each hotel, the estimated number of rows
 per partition is as follows:

 .. math:: N_r = 100 rooms/hotel \times 730 days = 73,000 rows

 This relatively small number of rows per partition is not going to get
 you in too much trouble, but if you start storing more dates of inventory,
 or don’t manage the size of the inventory well using TTL, you could start
 having issues. You still might want to look at breaking up this large
 partition, which you'll see how to do shortly.

 When performing sizing calculations, it is tempting to assume the
 nominal or average case for variables such as the number of rows.
 Consider calculating the worst case as well, as these sorts of
 predictions have a way of coming true in successful systems.

 Calculating Size on Disk
 ------------------------

 In addition to calculating the size of a partition, it is also an
 excellent idea to estimate the amount of disk space that will be
 required for each table you plan to store in the cluster. In order to
 determine the size, use the following formula to determine the size
 S\ :sub:`t` of a partition:

 .. math:: S_t = \displaystyle\sum_i sizeOf\big (c_{k_i}\big) + \displaystyle\sum_j sizeOf\big(c_{s_j}\big) + N_r\times \bigg(\displaystyle\sum_k sizeOf\big(c_{r_k}\big) + \displaystyle\sum_l sizeOf\big(c_{c_l}\big)\bigg) +

 .. math:: N_v\times sizeOf\big(t_{avg}\big)

 This is a bit more complex than the previous formula, but let's break it
 down a bit at a time. Let’s take a look at the notation first:

 -  In this formula, c\ :sub:`k` refers to partition key columns,
    c\ :sub:`s` to static columns, c\ :sub:`r` to regular columns, and
    c\ :sub:`c` to clustering columns.

 -  The term t\ :sub:`avg` refers to the average number of bytes of
    metadata stored per cell, such as timestamps. It is typical to use an
    estimate of 8 bytes for this value.

 -  You'll recognize the number of rows N\ :sub:`r` and number of values
    N\ :sub:`v` from previous calculations.

 -  The **sizeOf()** function refers to the size in bytes of the CQL data
    type of each referenced column.

 The first term asks you to sum the size of the partition key columns. For
 this example, the ``available_rooms_by_hotel_date`` table has a single
 partition key column, the ``hotel_id``, which is of type
 ``text``. Assuming that hotel identifiers are simple 5-character codes,
 you have a 5-byte value, so the sum of the partition key column sizes is
 5 bytes.

 The second term asks you to sum the size of the static columns. This table
 has no static columns, so the size is 0 bytes.

 The third term is the most involved, and for good reason—it is
 calculating the size of the cells in the partition. Sum the size of
 the clustering columns and regular columns. The two clustering columns
 are the ``date``, which is 4 bytes, and the ``room_number``,
 which is a 2-byte short integer, giving a sum of 6 bytes.
 There is only a single regular column, the boolean ``is_available``,
 which is 1 byte in size. Summing the regular column size
 (1 byte) plus the clustering column size (6 bytes) gives a total of 7
 bytes. To finish up the term, multiply this value by the number of
 rows (73,000), giving a result of 511,000 bytes (0.51 MB).

 The fourth term is simply counting the metadata that that Cassandra
 stores for each cell. In the storage format used by Cassandra 3.0 and
 later, the amount of metadata for a given cell varies based on the type
 of data being stored, and whether or not custom timestamp or TTL values
 are specified for individual cells. For this table, reuse the number
 of values from the previous calculation (73,000) and multiply by 8,
 which gives 0.58 MB.

 Adding these terms together, you get a final estimate:

 .. math:: Partition size = 16 bytes + 0 bytes + 0.51 MB + 0.58 MB = 1.1 MB

 This formula is an approximation of the actual size of a partition on
 disk, but is accurate enough to be quite useful. Remembering that the
 partition must be able to fit on a single node, it looks like the table
 design will not put a lot of strain on disk storage.

 Cassandra’s storage engine was re-implemented for the 3.0 release,
 including a new format for SSTable files. The previous format stored a
 separate copy of the clustering columns as part of the record for each
 cell. The newer format eliminates this duplication, which reduces the
 size of stored data and simplifies the formula for computing that size.

 Keep in mind also that this estimate only counts a single replica of
 data. You will need to multiply the value obtained here by the number of
 partitions and the number of replicas specified by the keyspace’s
 replication strategy in order to determine the total required total
 capacity for each table. This will come in handy when you
 plan your cluster.

 Breaking Up Large Partitions
 ----------------------------

 As discussed previously, the goal is to design tables that can provide
 the data you need with queries that touch a single partition, or failing
 that, the minimum possible number of partitions. However, as shown in
 the examples, it is quite possible to design wide
 partition-style tables that approach Cassandra’s built-in limits.
 Performing sizing analysis on tables may reveal partitions that are
 potentially too large, either in number of values, size on disk, or
 both.

 The technique for splitting a large partition is straightforward: add an
 additional column to the partition key. In most cases, moving one of the
 existing columns into the partition key will be sufficient. Another
 option is to introduce an additional column to the table to act as a
 sharding key, but this requires additional application logic.

 Continuing to examine the available rooms example, if you add the ``date``
 column to the partition key for the ``available_rooms_by_hotel_date``
 table, each partition would then represent the availability of rooms
 at a specific hotel on a specific date. This will certainly yield
 partitions that are significantly smaller, perhaps too small, as the
 data for consecutive days will likely be on separate nodes.

 Another technique known as **bucketing** is often used to break the data
 into moderate-size partitions. For example, you could bucketize the
 ``available_rooms_by_hotel_date`` table by adding a ``month`` column to
 the partition key, perhaps represented as an integer. The comparision
 with the original design is shown in the figure below. While the
 ``month`` column is partially duplicative of the ``date``, it provides
 a nice way of grouping related data in a partition that will not get
 too large.

 .. image:: images/data_modeling_hotel_bucketing.png

 If you really felt strongly about preserving a wide partition design, you
 could instead add the ``room_id`` to the partition key, so that each
 partition would represent the availability of the room across all
 dates. Because there was no query identified that involves searching
 availability of a specific room, the first or second design approach
 is most suitable to the application needs.

 *Material adapted from Cassandra, The Definitive Guide. Published by
 O'Reilly Media, Inc. Copyright © 2020 Jeff Carpenter, Eben Hewitt.
 All rights reserved. Used with permission.*
	.. Licensed to the Apache Software Foundation (ASF) under one
	.. or more contributor license agreements. See the NOTICE file
	.. distributed with this work for additional information
	.. regarding copyright ownership. The ASF licenses this file
	.. to you under the Apache License, Version 2.0 (the
	.. "License"); you may not use this file except in compliance
	.. with the License. You may obtain a copy of the License at
	..
	.. http://www.apache.org/licenses/LICENSE-2.0
	..
	.. Unless required by applicable law or agreed to in writing, software
	.. distributed under the License is distributed on an "AS IS" BASIS,
	.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	.. See the License for the specific language governing permissions and
	.. limitations under the License.

	.. role:: raw-latex(raw)
	:format: latex
	..

	Evaluating and Refining Data Models
	===================================

	Once you’ve created a physical model, there are some steps you’ll want
	to take to evaluate and refine table designs to help ensure optimal
	performance.

	Calculating Partition Size
	--------------------------

	The first thing that you want to look for is whether your tables will have
	partitions that will be overly large, or to put it another way, too
	wide. Partition size is measured by the number of cells (values) that
	are stored in the partition. Cassandra’s hard limit is 2 billion cells
	per partition, but you’ll likely run into performance issues before
	reaching that limit.

	In order to calculate the size of partitions, use the following
	formula:

	.. math:: N_v = N_r (N_c - N_{pk} - N_s) + N_s

	The number of values (or cells) in the partition (N\ :sub:`v`) is equal to
	the number of static columns (N\ :sub:`s`) plus the product of the number
	of rows (N\ :sub:`r`) and the number of of values per row. The number of
	values per row is defined as the number of columns (N\ :sub:`c`) minus the
	number of primary key columns (N\ :sub:`pk`) and static columns
	(N\ :sub:`s`).

	The number of columns tends to be relatively static, although it
	is possible to alter tables at runtime. For this reason, a
	primary driver of partition size is the number of rows in the partition.
	This is a key factor that you must consider in determining whether a
	partition has the potential to get too large. Two billion values sounds
	like a lot, but in a sensor system where tens or hundreds of values are
	measured every millisecond, the number of values starts to add up pretty
	fast.

	Let’s take a look at one of the tables to analyze the partition size.
	Because it has a wide partition design with one partition per hotel,
	look at the ``available_rooms_by_hotel_date`` table. The table has
	four columns total (N\ :sub:`c` = 4), including three primary key columns
	(N\ :sub:`pk` = 3) and no static columns (N\ :sub:`s` = 0). Plugging these
	values into the formula, the result is:

	.. math:: N_v = N_r (4 - 3 - 0) + 0 = 1N_r

	Therefore the number of values for this table is equal to the number of
	rows. You still need to determine a number of rows. To do this, make
	estimates based on the application design. The table is
	storing a record for each room, in each of hotel, for every night.
	Let's assume the system will be used to store two years of
	inventory at a time, and there are 5,000 hotels in the system, with an
	average of 100 rooms in each hotel.

	Since there is a partition for each hotel, the estimated number of rows
	per partition is as follows:

	.. math:: N_r = 100 rooms/hotel \times 730 days = 73,000 rows

	This relatively small number of rows per partition is not going to get
	you in too much trouble, but if you start storing more dates of inventory,
	or don’t manage the size of the inventory well using TTL, you could start
	having issues. You still might want to look at breaking up this large
	partition, which you'll see how to do shortly.

	When performing sizing calculations, it is tempting to assume the
	nominal or average case for variables such as the number of rows.
	Consider calculating the worst case as well, as these sorts of
	predictions have a way of coming true in successful systems.

	Calculating Size on Disk
	------------------------

	In addition to calculating the size of a partition, it is also an
	excellent idea to estimate the amount of disk space that will be
	required for each table you plan to store in the cluster. In order to
	determine the size, use the following formula to determine the size
	S\ :sub:`t` of a partition:

	.. math:: S_t = \displaystyle\sum_i sizeOf\big (c_{k_i}\big) + \displaystyle\sum_j sizeOf\big(c_{s_j}\big) + N_r\times \bigg(\displaystyle\sum_k sizeOf\big(c_{r_k}\big) + \displaystyle\sum_l sizeOf\big(c_{c_l}\big)\bigg) +

	.. math:: N_v\times sizeOf\big(t_{avg}\big)

	This is a bit more complex than the previous formula, but let's break it
	down a bit at a time. Let’s take a look at the notation first:

	- In this formula, c\ :sub:`k` refers to partition key columns,
	c\ :sub:`s` to static columns, c\ :sub:`r` to regular columns, and
	c\ :sub:`c` to clustering columns.

	- The term t\ :sub:`avg` refers to the average number of bytes of
	metadata stored per cell, such as timestamps. It is typical to use an
	estimate of 8 bytes for this value.

	- You'll recognize the number of rows N\ :sub:`r` and number of values
	N\ :sub:`v` from previous calculations.

	- The sizeOf() function refers to the size in bytes of the CQL data
	type of each referenced column.

	The first term asks you to sum the size of the partition key columns. For
	this example, the ``available_rooms_by_hotel_date`` table has a single
	partition key column, the ``hotel_id``, which is of type
	``text``. Assuming that hotel identifiers are simple 5-character codes,
	you have a 5-byte value, so the sum of the partition key column sizes is
	5 bytes.

	The second term asks you to sum the size of the static columns. This table
	has no static columns, so the size is 0 bytes.

	The third term is the most involved, and for good reason—it is
	calculating the size of the cells in the partition. Sum the size of
	the clustering columns and regular columns. The two clustering columns
	are the ``date``, which is 4 bytes, and the ``room_number``,
	which is a 2-byte short integer, giving a sum of 6 bytes.
	There is only a single regular column, the boolean ``is_available``,
	which is 1 byte in size. Summing the regular column size
	(1 byte) plus the clustering column size (6 bytes) gives a total of 7
	bytes. To finish up the term, multiply this value by the number of
	rows (73,000), giving a result of 511,000 bytes (0.51 MB).

	The fourth term is simply counting the metadata that that Cassandra
	stores for each cell. In the storage format used by Cassandra 3.0 and
	later, the amount of metadata for a given cell varies based on the type
	of data being stored, and whether or not custom timestamp or TTL values
	are specified for individual cells. For this table, reuse the number
	of values from the previous calculation (73,000) and multiply by 8,
	which gives 0.58 MB.

	Adding these terms together, you get a final estimate:

	.. math:: Partition size = 16 bytes + 0 bytes + 0.51 MB + 0.58 MB = 1.1 MB

	This formula is an approximation of the actual size of a partition on
	disk, but is accurate enough to be quite useful. Remembering that the
	partition must be able to fit on a single node, it looks like the table
	design will not put a lot of strain on disk storage.

	Cassandra’s storage engine was re-implemented for the 3.0 release,
	including a new format for SSTable files. The previous format stored a
	separate copy of the clustering columns as part of the record for each
	cell. The newer format eliminates this duplication, which reduces the
	size of stored data and simplifies the formula for computing that size.

	Keep in mind also that this estimate only counts a single replica of
	data. You will need to multiply the value obtained here by the number of
	partitions and the number of replicas specified by the keyspace’s
	replication strategy in order to determine the total required total
	capacity for each table. This will come in handy when you
	plan your cluster.

	Breaking Up Large Partitions
	----------------------------

	As discussed previously, the goal is to design tables that can provide
	the data you need with queries that touch a single partition, or failing
	that, the minimum possible number of partitions. However, as shown in
	the examples, it is quite possible to design wide
	partition-style tables that approach Cassandra’s built-in limits.
	Performing sizing analysis on tables may reveal partitions that are
	potentially too large, either in number of values, size on disk, or
	both.

	The technique for splitting a large partition is straightforward: add an
	additional column to the partition key. In most cases, moving one of the
	existing columns into the partition key will be sufficient. Another
	option is to introduce an additional column to the table to act as a
	sharding key, but this requires additional application logic.

	Continuing to examine the available rooms example, if you add the ``date``
	column to the partition key for the ``available_rooms_by_hotel_date``
	table, each partition would then represent the availability of rooms
	at a specific hotel on a specific date. This will certainly yield
	partitions that are significantly smaller, perhaps too small, as the
	data for consecutive days will likely be on separate nodes.

	Another technique known as bucketing is often used to break the data
	into moderate-size partitions. For example, you could bucketize the
	``available_rooms_by_hotel_date`` table by adding a ``month`` column to
	the partition key, perhaps represented as an integer. The comparision
	with the original design is shown in the figure below. While the
	``month`` column is partially duplicative of the ``date``, it provides
	a nice way of grouping related data in a partition that will not get
	too large.

	.. image:: images/data_modeling_hotel_bucketing.png

	If you really felt strongly about preserving a wide partition design, you
	could instead add the ``room_id`` to the partition key, so that each
	partition would represent the availability of the room across all
	dates. Because there was no query identified that involves searching
	availability of a specific room, the first or second design approach
	is most suitable to the application needs.

	*Material adapted from Cassandra, The Definitive Guide. Published by
	O'Reilly Media, Inc. Copyright © 2020 Jeff Carpenter, Eben Hewitt.
	All rights reserved. Used with permission.*