| .. ################################################################################ |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| ################################################################################ |
| |
| |
| ========== |
| Checkpoint |
| ========== |
| |
| CheckpointConfig |
| ---------------- |
| |
| Configuration that captures all checkpointing-related settings. |
| |
| :data:`DEFAULT_MODE`: |
| |
| The default checkpoint mode: exactly once. |
| |
| :data:`DEFAULT_TIMEOUT`: |
| |
| The default timeout of a checkpoint attempt: 10 minutes. |
| |
| :data:`DEFAULT_MIN_PAUSE_BETWEEN_CHECKPOINTS`: |
| |
| The default minimum pause to be made between checkpoints: none. |
| |
| :data:`DEFAULT_MAX_CONCURRENT_CHECKPOINTS`: |
| |
| The default limit of concurrently happening checkpoints: one. |
| |
| .. currentmodule:: pyflink.datastream.checkpoint_config |
| |
| .. autosummary:: |
| :toctree: api/ |
| |
| CheckpointConfig.DEFAULT_MODE |
| CheckpointConfig.DEFAULT_TIMEOUT |
| CheckpointConfig.DEFAULT_MIN_PAUSE_BETWEEN_CHECKPOINTS |
| CheckpointConfig.DEFAULT_MAX_CONCURRENT_CHECKPOINTS |
| CheckpointConfig.is_checkpointing_enabled |
| CheckpointConfig.get_checkpointing_mode |
| CheckpointConfig.set_checkpointing_mode |
| CheckpointConfig.get_checkpoint_interval |
| CheckpointConfig.set_checkpoint_interval |
| CheckpointConfig.get_checkpoint_timeout |
| CheckpointConfig.set_checkpoint_timeout |
| CheckpointConfig.get_min_pause_between_checkpoints |
| CheckpointConfig.set_min_pause_between_checkpoints |
| CheckpointConfig.get_max_concurrent_checkpoints |
| CheckpointConfig.set_max_concurrent_checkpoints |
| CheckpointConfig.is_fail_on_checkpointing_errors |
| CheckpointConfig.set_fail_on_checkpointing_errors |
| CheckpointConfig.get_tolerable_checkpoint_failure_number |
| CheckpointConfig.set_tolerable_checkpoint_failure_number |
| CheckpointConfig.enable_externalized_checkpoints |
| CheckpointConfig.set_externalized_checkpoint_cleanup |
| CheckpointConfig.is_externalized_checkpoints_enabled |
| CheckpointConfig.get_externalized_checkpoint_cleanup |
| CheckpointConfig.is_unaligned_checkpoints_enabled |
| CheckpointConfig.enable_unaligned_checkpoints |
| CheckpointConfig.disable_unaligned_checkpoints |
| CheckpointConfig.set_alignment_timeout |
| CheckpointConfig.get_alignment_timeout |
| CheckpointConfig.set_force_unaligned_checkpoints |
| CheckpointConfig.is_force_unaligned_checkpoints |
| CheckpointConfig.set_checkpoint_storage |
| CheckpointConfig.set_checkpoint_storage_dir |
| CheckpointConfig.get_checkpoint_storage |
| ExternalizedCheckpointCleanup |
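
The setters listed above can be combined as in the following minimal sketch. The helper ``apply_checkpoint_settings`` and its parameter names are hypothetical and not part of PyFlink; it only calls setter names that appear in the list above, and it assumes durations are given in milliseconds. The 500 ms pause is an illustrative value, not a recommendation.

```python
def apply_checkpoint_settings(checkpoint_config, storage_dir):
    """Apply explicit checkpoint settings to a CheckpointConfig-like object.

    Hypothetical helper: `checkpoint_config` is assumed to expose the
    setters listed in the API summary above.
    """
    # Durations are assumed to be in milliseconds.
    # 10 minutes, the same value as the documented default timeout.
    checkpoint_config.set_checkpoint_timeout(10 * 60 * 1000)
    # Illustrative value; the documented default is no pause.
    checkpoint_config.set_min_pause_between_checkpoints(500)
    # One concurrent checkpoint, the same as the documented default.
    checkpoint_config.set_max_concurrent_checkpoints(1)
    # Where checkpoints are written, for example a file:// or hdfs:// URI.
    checkpoint_config.set_checkpoint_storage_dir(storage_dir)
    return checkpoint_config
```

With PyFlink installed, the helper would typically be called as ``apply_checkpoint_settings(env.get_checkpoint_config(), "file:///tmp/checkpoints")`` after ``env.enable_checkpointing(...)`` on a ``StreamExecutionEnvironment``; the path shown is only an example.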
| |
| |
| CheckpointStorage |
| ----------------- |
| |
| Checkpoint storage defines how a :class:`StateBackend` stores its state for fault tolerance |
| in streaming applications. Implementations store their checkpoints in different ways |
| and have different requirements and availability guarantees. |
| |
| For example, :class:`JobManagerCheckpointStorage` stores checkpoints in the memory of the |
| `JobManager`. It is lightweight and has no additional dependencies, but it is not scalable |
| and only supports small state sizes. This checkpoint storage policy is convenient for local |
| testing and development. |
| |
| :class:`FileSystemCheckpointStorage` stores checkpoints in a filesystem. On systems such as HDFS, |
| NFS drives, S3, and GCS, this storage policy supports large state sizes, on the order of many |
| terabytes, while providing a highly available foundation for streaming applications. This |
| checkpoint storage policy is recommended for most production deployments. |
| |
| **Raw Bytes Storage** |
| |
| The `CheckpointStorage` creates services for raw bytes storage. |
| |
| The raw bytes storage (through the CheckpointStreamFactory) is the fundamental service that |
| simply stores bytes in a fault-tolerant fashion. This service is used by the JobManager to |
| store checkpoint and recovery metadata and is typically also used by the keyed and operator |
| state backends to store checkpoint state. |
| |
| **Serializability** |
| |
| Implementations need to be serializable (`java.io.Serializable`) because they are distributed |
| across parallel processes (for distributed execution) together with the streaming application |
| code. |
| |
| For that reason, `CheckpointStorage` implementations are meant to act like *factories* that create |
| the proper state stores, which in turn provide access to the persistent layer. That way, the storage |
| policy can be very lightweight (containing only configuration), which makes it easier to |
| serialize. |
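
The factory idea can be illustrated with a small, self-contained Python sketch. Everything in it is hypothetical: ``DirectoryStoragePolicy`` and ``InMemoryStore`` are not PyFlink classes, and JSON stands in for Java serialization purely to show that only the configuration needs to travel between processes.

```python
import io
import json
from dataclasses import asdict, dataclass


class InMemoryStore:
    """Stand-in for a heavyweight state store (buffers, connections, ...)."""

    def __init__(self, location: str):
        self.location = location
        self._buffer = io.BytesIO()

    def write(self, data: bytes) -> int:
        return self._buffer.write(data)

    def contents(self) -> bytes:
        return self._buffer.getvalue()


@dataclass(frozen=True)
class DirectoryStoragePolicy:
    """Hypothetical storage policy that carries configuration only."""

    base_dir: str

    def create_store(self, checkpoint_id: int) -> InMemoryStore:
        # The store is created on demand, so it never has to be serialized.
        return InMemoryStore(f"{self.base_dir}/chk-{checkpoint_id}")


policy = DirectoryStoragePolicy(base_dir="/tmp/checkpoints")

# Only the small configuration object is serialized and shipped.
wire_format = json.dumps(asdict(policy))
shipped = DirectoryStoragePolicy(**json.loads(wire_format))

# The receiving side builds its own store from the configuration.
store = shipped.create_store(checkpoint_id=7)
store.write(b"state bytes")
```

Keeping the policy free of open handles or buffers is what makes the round trip trivial; the heavyweight store only ever exists on the side that uses it.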
| |
| **Thread Safety** |
| |
| Checkpoint storage implementations have to be thread-safe because multiple threads may create |
| streams concurrently. |
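
One common way to meet this requirement is to guard shared bookkeeping with a lock. The sketch below is a generic Python illustration under that assumption; ``ThreadSafeStreamFactory`` is a hypothetical class and does not reflect how Flink implements its stream factories.

```python
import io
import threading
from concurrent.futures import ThreadPoolExecutor


class ThreadSafeStreamFactory:
    """Hypothetical factory whose create_stream may be called from many threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._streams = {}

    def create_stream(self, name: str) -> io.BytesIO:
        # The lock makes the check-and-register step atomic.
        with self._lock:
            if name in self._streams:
                raise ValueError(f"stream {name!r} already exists")
            stream = io.BytesIO()
            self._streams[name] = stream
            return stream

    def stream_count(self) -> int:
        with self._lock:
            return len(self._streams)


factory = ThreadSafeStreamFactory()
names = [f"chk-{i}" for i in range(100)]

# Eight worker threads create streams concurrently.
with ThreadPoolExecutor(max_workers=8) as pool:
    streams = list(pool.map(factory.create_stream, names))
```

Each caller receives its own stream, and the registry stays consistent no matter how the threads interleave.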
| |
| .. currentmodule:: pyflink.datastream.checkpoint_storage |
| |
| .. autosummary:: |
| :toctree: api/ |
| |
| JobManagerCheckpointStorage |
| FileSystemCheckpointStorage |
| CustomCheckpointStorage |