This document defines the security model for Apache Fory deserialization. It is a public security reference for classifying deserialization behavior and deciding where validation is required. It is not a vulnerability disclosure, does not describe exploit techniques, and does not document implementation history.
The model is intentionally narrow. Fory should prevent resource and policy failures caused by untrusted input, but it should not add hot-path validation whose only effect is to enforce byte-form strictness without protecting a Fory security boundary.
This model applies to deserializing Fory binary data from untrusted or partially trusted sources.
It does not treat the semantic content of a successfully deserialized value as a Fory security boundary. A sender can always construct protocol-valid data whose value is chosen by that sender. Application authorization, object-level business rules, and domain-specific validation remain application responsibilities.
This model also does not cover trusted in-memory formats. Row format and other memory-format paths are trusted-data paths unless a runtime explicitly exposes them as untrusted deserialization APIs.
Fory deserialization should treat the encoded input as untrusted at API boundaries that accept external bytes or streams.
Fory security boundaries include:
Fory security boundaries do not include:
Deserialization code must prevent the following outcomes for untrusted input:
When a path cannot produce one of these outcomes, earlier rejection of malformed bytes is normally a correctness or interoperability choice, not a security requirement.
The following patterns are not vulnerabilities by default:
Fory may still reject malformed forms for specification strictness or interoperability. That validation should be added only when it is required by the protocol owner, is effectively free on the relevant path, or protects a security invariant listed above. Do not add protocol-layer validation solely to reject scalar byte forms whose only effect is extra decode cost.
Some read paths intentionally share handling for multiple value-bearing flags. For example, when both NotNullValue and RefValue mean that an encoded value follows, a reader may merge their hot-path handling. This is not a malformed-flag bug by itself. Treat it as a bug only if the merged handling loses required reference semantics, returns success across an explicit owner policy, or creates a resource or runtime-safety failure.
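As a rough illustration of merged value-bearing handling, the sketch below reads one flag byte and, for either value-bearing flag, shares a single decode branch while still recording the value when reference tracking requires it. The numeric flag values, the `read_int_field` function, the `seen` reference table, and the 4-byte little-endian payload are all hypothetical and chosen for the example; they are not taken from the Fory specification.

```python
import struct

# Hypothetical flag values for illustration only; not the Fory wire values.
NULL, REF, NOT_NULL_VALUE, REF_VALUE = 0, 1, 2, 3


def read_int_field(data, pos, seen):
    """Read a flag byte plus optional payload; return (value, new_pos)."""
    if pos >= len(data):
        raise ValueError("missing flag byte")
    flag = data[pos]
    pos += 1
    if flag == NULL:
        return None, pos
    if flag == REF:
        # A reference must resolve to a value that was already recorded.
        if pos >= len(data) or data[pos] >= len(seen):
            raise ValueError("invalid reference id")
        return seen[data[pos]], pos + 1
    if flag in (NOT_NULL_VALUE, REF_VALUE):
        # Merged hot path: both flags mean "an encoded value follows".
        if len(data) - pos < 4:
            raise ValueError("truncated value")
        (value,) = struct.unpack_from("<i", data, pos)
        if flag == REF_VALUE:
            # Reference semantics are preserved inside the merged branch.
            seen.append(value)
        return value, pos + 4
    raise ValueError("unknown flag")
```

In this sketch the merge is harmless because the only difference between the two flags, recording the value for later references, is still honored after the shared decode.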
Fory should not make large allocations from attacker-declared lengths before the required bytes are available or have been read exactly.
For buffer-backed input:
For stream-backed input:
The byte owner should stay byte-oriented. Buffer, reader, or read-context APIs may expose byte read and byte skip operations, but string decoding, decimal parsing, primitive-array encoding, compression modes, and collection capacity policy belong to the owning serializers.
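The rule against allocating from attacker-declared lengths can be sketched with two byte-oriented helpers. This is a minimal sketch under stated assumptions, not Fory's implementation: `read_bytes_from_buffer`, `read_bytes_from_stream`, and the `CHUNK` bound are hypothetical names introduced here. The buffer-backed helper confirms the bytes exist before copying, and the stream-backed helper grows its output only as bytes actually arrive.

```python
import io

CHUNK = 64 * 1024  # hypothetical upper bound on a single stream read


def read_bytes_from_buffer(buf, pos, declared_len):
    """Buffer-backed: verify the bytes exist before allocating a copy."""
    if declared_len < 0 or declared_len > len(buf) - pos:
        raise ValueError("declared length exceeds remaining input")
    return bytes(buf[pos:pos + declared_len]), pos + declared_len


def read_bytes_from_stream(stream, declared_len):
    """Stream-backed: allocate in bounded steps as bytes are received."""
    if declared_len < 0:
        raise ValueError("negative length")
    out = bytearray()
    remaining = declared_len
    while remaining > 0:
        chunk = stream.read(min(remaining, CHUNK))
        if not chunk:
            raise EOFError("stream ended before declared length was read")
        out += chunk
        remaining -= len(chunk)
    return bytes(out)
```

Both helpers stay byte-oriented: they return raw bytes and leave string decoding or any other interpretation to the caller, which in this model would be the owning serializer.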
Large valid collection inputs are allowed. If the input contains many encoded elements, proportional deserialization is expected.
The security requirement is to avoid disproportionate preallocation from a declared logical count before enough input bytes justify that capacity. For a non-empty container, a reader that will allocate or reserve from the declared count should call checkReadableBytes(logicalCount) or the runtime equivalent before that allocation. The check remains byte-owner-only: it does not decode the whole container, validate element semantics, or replace chunk validation. Readers that do not preallocate from the logical count may still grow proportionally as elements are actually read.
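The preallocation check described above might look like the following sketch. It assumes every encoded element occupies at least one byte, so a declared count larger than the readable byte count cannot be satisfied. `ByteReader`, `check_readable_bytes` (a Python-style stand-in for `checkReadableBytes`), and `read_byte_list` are hypothetical names for illustration.

```python
class ByteReader:
    """Hypothetical byte owner exposing only byte-level operations."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def readable_bytes(self):
        return len(self.data) - self.pos

    def check_readable_bytes(self, n):
        # Byte-owner-only check: it does not decode or validate elements.
        if n < 0 or n > self.readable_bytes():
            raise ValueError("declared count exceeds readable bytes")

    def read_byte(self):
        self.check_readable_bytes(1)
        b = self.data[self.pos]
        self.pos += 1
        return b


def read_byte_list(reader, logical_count):
    """Read a list of one-byte elements, preallocating from the count."""
    if logical_count == 0:
        return []
    # Assumption: each element needs at least one byte of input, so the
    # readable byte count bounds any satisfiable logical count.
    reader.check_readable_bytes(logical_count)
    out = [0] * logical_count  # preallocation is now justified by input size
    for i in range(logical_count):
        out[i] = reader.read_byte()
    return out
```

A declared count of one billion against a two-byte input fails at the check, before the list is allocated. A reader that appended to an empty list instead of preallocating would not need the check, since its growth would already be proportional to the bytes consumed.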
Map or collection chunk validation is security-relevant only when missing validation can cause a no-progress loop, unbounded resource growth, retained state, or success across a Fory policy boundary. Protocol-allowed chunk segmentation is normal input and is not a security issue by itself.
Skipping unknown or incompatible data is classified by concrete impact, not by whether the runtime materializes a temporary value.
Directly consuming encoded contents is useful when it is simple and owned by the current runtime path. It is not a security requirement for complex fields such as lists, sets, and maps. A runtime may materialize a value and discard it when that preserves the existing serializer ownership model.
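The two skip strategies can be contrasted in a short sketch. The encoding here (one-byte counts and one-byte length prefixes) and the function names are hypothetical and exist only to make the example self-contained. A simple field is skipped by consuming its bytes directly, while a list field is skipped by calling its existing reader and discarding the result.

```python
def _read_u8(data, pos):
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    return data[pos], pos + 1


def skip_bytes_field(data, pos):
    """Simple field: consume the encoded contents directly."""
    length, pos = _read_u8(data, pos)
    if length > len(data) - pos:
        raise ValueError("truncated field")
    return pos + length


def read_bytes_list(data, pos):
    """Serializer-owned reader for a list of length-prefixed byte strings."""
    count, pos = _read_u8(data, pos)
    items = []  # grows only as elements are actually read
    for _ in range(count):
        length, pos = _read_u8(data, pos)
        if length > len(data) - pos:
            raise ValueError("truncated element")
        items.append(data[pos:pos + length])
        pos += length
    return items, pos


def skip_bytes_list(data, pos):
    """Complex field: materialize through the owning reader, then discard."""
    _items, pos = read_bytes_list(data, pos)
    return pos
```

The materialize-and-discard path does work proportional to the encoded input and keeps the list's decoding rules in one place, which is why the temporary value is not a security concern in itself.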
For extension, dynamic, or user-owned types, the owning runtime may not always have enough information to skip without invoking a registered serializer. In that case, classify the behavior by concrete impact:
Metadata parsing is security-sensitive when it affects retained read-side state, type dispatch, or policy decisions.
Metadata readers should:
Metadata byte-form strictness alone is not a security requirement. Rejecting a metadata shape is useful only when the owner wants that strictness or when the shape changes type identity, retained state, resource use, or policy behavior.
Reference tracking is part of the wire protocol and is performance-sensitive. Readers may use sentinel values and shared value-bearing branches to keep hot paths compact.
Reference tracking validation is security-relevant when malformed input can:
Reference tracking validation is not required merely because a malformed flag is not rejected at the earliest possible byte. Lazy rejection is acceptable when the root operation still returns an error and no security invariant is violated.
Fory runtimes may intentionally use lazy error propagation. After a read records an error, later read steps may continue until the outer operation observes and returns the error.
This is acceptable when the continued work cannot:
Nested try/finally or equivalent cleanup should be added only when the outer root-operation cleanup cannot cover the state or resource owned by the nested path.
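Lazy error propagation with root-level cleanup can be sketched as follows. The `ReadContext` class, its `error` and `refs` fields, and the `deserialize_root` function are hypothetical names for this example. Once an error is recorded, later reads return a default without consuming input, and the root operation both surfaces the error and clears the per-operation state, so the nested step needs no try/finally of its own.

```python
class ReadContext:
    """Hypothetical read context with a recorded-error slot."""

    def __init__(self, data):
        self.data = data
        self.pos = 0
        self.error = None
        self.refs = []  # per-operation state, cleaned up by the root

    def read_u8(self):
        # After an error is recorded, later reads do no further work.
        if self.error is not None:
            return 0
        if self.pos >= len(self.data):
            self.error = ValueError("unexpected end of input")
            return 0
        b = self.data[self.pos]
        self.pos += 1
        return b


def read_pair(ctx):
    # Nested step: no local cleanup, because the only state it touches
    # (ctx.refs) is covered by the root operation's finally block.
    pair = (ctx.read_u8(), ctx.read_u8())
    ctx.refs.append(pair)
    return pair


def deserialize_root(ctx):
    try:
        value = read_pair(ctx)
        if ctx.error is not None:
            # The root observes the recorded error and never returns success.
            raise ctx.error
        return value
    finally:
        ctx.refs.clear()  # root cleanup runs on success and on failure
```

If `read_pair` instead owned something the root could not release, such as a resource opened only inside the nested path, that would be the case where a nested cleanup block is warranted.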
Security validation must preserve Fory hot-path performance. Do not add validation solely for strictness when it introduces:
Prefer owner-local checks that can be inlined and that already use information available in the current serializer. Do not move serializer-owned semantics into generic read-context helpers.
Use the following questions when reviewing deserialization behavior:
If the answers to the first seven questions are all no, the issue is normally not a security finding. If the validation is not effectively free, avoid adding it unless the protocol owner explicitly requires it.
Security model documents must not include exploit samples, CVE narratives, line-level vulnerability candidates, branch history, migration timelines, or cleanup plans. Keep those details in private reports, issues, or pull requests as appropriate.
Public security documentation should describe durable boundaries and invariants, not the history of how the implementation reached them.