---
title: Field Configuration
sidebar_position: 5
id: field_configuration
license: |
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
---

This page explains how to configure field-level metadata for serialization in Python.

## Overview

Apache Fory™ provides field-level configuration through:

- **`pyfory.field()`**: Configure field metadata (id, nullable, ref, ignore, dynamic)
- **Type annotations**: Control integer encoding (varint, fixed, tagged)
- **`Optional[T]`**: Mark fields as nullable

This enables:

- **Tag IDs**: Assign compact numeric IDs to reduce struct field meta size overhead
- **Nullability**: Control whether fields can be null
- **Reference Tracking**: Enable reference tracking for shared objects
- **Field Skipping**: Exclude fields from serialization
- **Encoding Control**: Specify how integers are encoded (varint, fixed, tagged)
- **Polymorphism**: Control whether type info is written for struct fields

## Basic Syntax

Use the `@dataclass` decorator with type annotations and `pyfory.field()`:

```python
from dataclasses import dataclass
from typing import Optional
import pyfory

@dataclass
class Person:
    name: str = pyfory.field(id=0)
    age: pyfory.int32 = pyfory.field(id=1, default=0)
    nickname: Optional[str] = pyfory.field(id=2, nullable=True, default=None)
```

## The `pyfory.field()` Function

Use `pyfory.field()` to configure field-level metadata:

```python
from dataclasses import dataclass
from typing import List, Optional
import pyfory

@dataclass
class User:
    id: pyfory.int64 = pyfory.field(id=0, default=0)
    name: str = pyfory.field(id=1, default="")
    email: Optional[str] = pyfory.field(id=2, nullable=True, default=None)
    friends: List["User"] = pyfory.field(id=3, ref=True, default_factory=list)
    _cache: dict = pyfory.field(ignore=True, default_factory=dict)
```

### Parameters

| Parameter         | Type             | Default   | Description                                                          |
| ----------------- | ---------------- | --------- | -------------------------------------------------------------------- |
| `id`              | `int`            | `-1`      | Field tag ID (-1 = use field name)                                   |
| `nullable`        | `bool`           | `False`   | Whether the field can be null                                        |
| `ref`             | `bool`           | `False`   | Enable reference tracking                                            |
| `ignore`          | `bool`           | `False`   | Exclude field from serialization                                     |
| `dynamic`         | `bool` or `None` | `None`    | Control whether type info is written (`None` = mode-dependent default) |
| `default`         | Any              | `MISSING` | Default value for the field                                          |
| `default_factory` | Callable         | `MISSING` | Factory function for default value                                   |

## Field ID (`id`)

Assign a numeric ID to a field to reduce the size of struct field metadata:

```python
@dataclass
class User:
    id: pyfory.int64 = pyfory.field(id=0, default=0)
    name: str = pyfory.field(id=1, default="")
    age: pyfory.int32 = pyfory.field(id=2, default=0)
```

**Benefits**:

- Smaller serialized size (numeric IDs instead of field names in metadata)
- Reduced struct field meta overhead
- Fields can be renamed without breaking binary compatibility

**Recommendation**: Configure field IDs when using compatible mode, because they reduce serialization cost.

**Notes**:

- IDs must be unique within a class
- IDs must be >= 0 (the default, -1, selects field name encoding)
- If no ID is specified, the field name is written in metadata (larger overhead)

**Without field IDs** (field names used in metadata):

```python
@dataclass
class User:
    id: pyfory.int64 = 0
    name: str = ""
```
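
The ID rules in the notes above (unique within a class, non-negative unless left at -1) can be checked before any data is serialized. The following is a minimal sketch that uses only the standard library: `tag()` is a hypothetical stand-in for `pyfory.field()` that stores the ID in `dataclasses` metadata, and `find_id_problems()` is an illustrative helper, not part of pyfory.

```python
from dataclasses import dataclass, field, fields


def tag(id=-1, **kwargs):
    """Hypothetical stand-in for pyfory.field(): keeps the ID in stdlib metadata."""
    return field(metadata={"id": id}, **kwargs)


def find_id_problems(cls):
    """Return a list of human-readable problems with the class's field IDs."""
    problems = []
    seen = {}  # id -> name of the first field that used it
    for f in fields(cls):
        fid = f.metadata.get("id", -1)
        if fid == -1:
            continue  # -1 selects field name encoding, nothing to check
        if fid < 0:
            problems.append(f"{f.name}: id must be >= 0, got {fid}")
        elif fid in seen:
            problems.append(f"{f.name}: id {fid} already used by {seen[fid]}")
        else:
            seen[fid] = f.name
    return problems


@dataclass
class User:
    id: int = tag(id=0, default=0)
    name: str = tag(id=1, default="")
    note: str = tag(default="")  # no ID: falls back to the field name


@dataclass
class Broken:
    a: int = tag(id=0, default=0)
    b: int = tag(id=0, default=0)  # duplicate ID


print(find_id_problems(User))
print(find_id_problems(Broken))
```

A check like this could run in a unit test so that a duplicated ID is caught before it reaches serialized data.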

## Nullable Fields (`nullable`)

Use `nullable=True` for fields that can be `None`:

```python
from dataclasses import dataclass
from typing import Optional
import pyfory

@dataclass
class Record:
    # Nullable string field
    optional_name: Optional[str] = pyfory.field(id=0, nullable=True, default=None)

    # Nullable integer field
    optional_count: Optional[pyfory.int32] = pyfory.field(id=1, nullable=True, default=None)
```

**Notes**:

- `Optional[T]` fields must have `nullable=True`
- Non-optional fields default to `nullable=False`
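
The pairing rule above (`Optional[T]` together with `nullable=True`) is easy to get wrong on a large class. As a rough sketch, the helper below lists `Optional[T]` fields that are not marked nullable. It assumes the flag is stored under a `"nullable"` key in stdlib `dataclasses` metadata, which is a stand-in for `pyfory.field()` and not pyfory's actual storage; `missing_nullable()` is a hypothetical name.

```python
from dataclasses import dataclass, field, fields
from typing import Optional, Union, get_args, get_origin, get_type_hints


def is_optional(tp):
    """True if tp is Optional[T], i.e. a Union that includes NoneType."""
    return get_origin(tp) is Union and type(None) in get_args(tp)


def missing_nullable(cls):
    """Names of Optional[T] fields whose metadata lacks nullable=True."""
    hints = get_type_hints(cls)
    return [
        f.name
        for f in fields(cls)
        if is_optional(hints[f.name]) and not f.metadata.get("nullable", False)
    ]


@dataclass
class Record:
    name: Optional[str] = field(default=None, metadata={"nullable": True})
    count: Optional[int] = field(default=None)  # Optional, but not marked nullable
    size: int = 0


print(missing_nullable(Record))
```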

## Reference Tracking (`ref`)

Enable reference tracking for fields that may be shared or circular:

```python
@dataclass
class RefInner:
    # Minimal inner type, added so the example is self-contained
    value: pyfory.int32 = pyfory.field(id=0, default=0)


@dataclass
class RefOuter:
    # Both fields may point to the same inner object
    inner1: Optional[RefInner] = pyfory.field(id=0, ref=True, nullable=True, default=None)
    inner2: Optional[RefInner] = pyfory.field(id=1, ref=True, nullable=True, default=None)


@dataclass
class CircularRef:
    name: str = pyfory.field(id=0, default="")
    # Self-referencing field for circular references
    self_ref: Optional["CircularRef"] = pyfory.field(id=1, ref=True, nullable=True, default=None)
```

**Use Cases**:

- Fields that may be part of a circular reference
- Objects that are referenced from more than one field

**Notes**:

- Reference tracking only takes effect when `Fory(ref=True)` is set globally
- Field-level `ref=True` and global `ref=True` must both be enabled
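
To make concrete what reference tracking preserves, here is a toy encoder and decoder written with the standard library only. It is a sketch of the general idea (repeated objects are written once and later occurrences become back-references), not Fory's wire format; `Inner`, `Outer`, `encode`, and `decode` are all illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inner:
    value: int = 0


@dataclass
class Outer:
    inner1: Optional[Inner] = None
    inner2: Optional[Inner] = None


def encode(outer, track_refs):
    """Toy encoder: emits ("obj", value), or ("ref", index) for repeats when tracking."""
    seen = {}  # id(obj) -> position of its first occurrence
    out = []
    for inner in (outer.inner1, outer.inner2):
        if inner is None:
            out.append(("null",))
        elif track_refs and id(inner) in seen:
            out.append(("ref", seen[id(inner)]))
        else:
            seen[id(inner)] = len(out)
            out.append(("obj", inner.value))
    return out


def decode(items):
    """Rebuild an Outer; ("ref", i) reuses the object decoded at position i."""
    decoded = []
    for item in items:
        if item[0] == "null":
            decoded.append(None)
        elif item[0] == "ref":
            decoded.append(decoded[item[1]])
        else:
            decoded.append(Inner(item[1]))
    return Outer(*decoded)


shared = Inner(7)
original = Outer(shared, shared)

tracked = decode(encode(original, track_refs=True))
untracked = decode(encode(original, track_refs=False))

print(tracked.inner1 is tracked.inner2)      # identity is preserved
print(untracked.inner1 is untracked.inner2)  # two equal but separate copies
```

Without tracking, the shared object is written twice and comes back as two independent copies, so a later mutation of one no longer affects the other.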

## Skipping Fields (`ignore`)

Exclude fields from serialization:

```python
@dataclass
class User:
    id: pyfory.int64 = pyfory.field(id=0, default=0)
    name: str = pyfory.field(id=1, default="")
    # Not serialized
    _cache: dict = pyfory.field(ignore=True, default_factory=dict)
    _internal_state: str = pyfory.field(ignore=True, default="")
```

## Dynamic Fields (`dynamic`)

Control whether type information is written for struct fields. This is essential for polymorphism support:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
import pyfory

class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        pass

@dataclass
class Circle(Shape):
    radius: float = 0.0

    def area(self) -> float:
        return 3.14159 * self.radius * self.radius

@dataclass
class Container:
    # Abstract class: dynamic is always True (type info written)
    shape: Shape = pyfory.field(id=0)

    # Force type info for concrete type (support runtime subtypes)
    circle: Circle = pyfory.field(id=1, dynamic=True)

    # Skip type info for concrete type (use declared type directly)
    fixed_circle: Circle = pyfory.field(id=2, dynamic=False)
```

**Default Behavior**:

| Mode        | Abstract Class | Concrete Object Types | Numeric/str/time Types |
| ----------- | -------------- | --------------------- | ---------------------- |
| Native mode | `True`         | `True`                | `False`                |
| Xlang mode  | `True`         | `False`               | `False`                |

**Notes**:

- **Abstract classes**: `dynamic` is always `True` (type info must be written)
- **Native mode**: `dynamic` defaults to `True` for object types, `False` for numeric/str/time types
- **Xlang mode**: `dynamic` defaults to `False` for concrete types
- Use `dynamic=True` when a concrete field may hold subclass instances
- Use `dynamic=False` as a performance optimization when the runtime type always matches the declared type
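
The trade-off behind `dynamic` can be illustrated with a toy model: if the concrete type travels with the data, a subclass instance survives a round trip; if it does not, the decoder can only build the declared type. The sketch below uses plain dataclasses and a hypothetical `REGISTRY`, `encode`, and `decode`; it illustrates the concept and makes no claim about how Fory encodes type info.

```python
from dataclasses import dataclass


@dataclass
class Circle:
    radius: float = 0.0


@dataclass
class UnitCircle(Circle):
    label: str = "unit"


# Hypothetical lookup table from written type name to class
REGISTRY = {cls.__name__: cls for cls in (Circle, UnitCircle)}


def encode(value, dynamic):
    """Toy encoder: with dynamic=True the concrete class name travels with the data."""
    payload = dict(vars(value))
    return (type(value).__name__, payload) if dynamic else (None, payload)


def decode(encoded, declared_type):
    """Toy decoder: use the written type if present, else the declared type."""
    type_name, payload = encoded
    cls = REGISTRY[type_name] if type_name is not None else declared_type
    # Keep only the attributes the target class knows about.
    known = {k: v for k, v in payload.items() if k in cls.__dataclass_fields__}
    return cls(**known)


value = UnitCircle(radius=1.0)
with_type = decode(encode(value, dynamic=True), declared_type=Circle)
without_type = decode(encode(value, dynamic=False), declared_type=Circle)

print(type(with_type).__name__)     # the subclass is restored
print(type(without_type).__name__)  # only the declared type is available
```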

## Integer Type Annotations

Fory provides type annotations that control how numeric fields are encoded:

### Signed Integers

```python
@dataclass
class SignedIntegers:
    byte_val: pyfory.int8 = 0    # 8-bit signed
    short_val: pyfory.int16 = 0  # 16-bit signed
    int_val: pyfory.int32 = 0    # 32-bit signed (varint encoding)
    long_val: pyfory.int64 = 0   # 64-bit signed (varint encoding)
```

### Unsigned Integers

```python
@dataclass
class UnsignedIntegers:
    # Fixed-size encoding
    u8_val: pyfory.uint8 = 0    # 8-bit unsigned (fixed)
    u16_val: pyfory.uint16 = 0  # 16-bit unsigned (fixed)

    # Variable-length encoding (default for u32/u64)
    u32_var: pyfory.uint32 = 0  # 32-bit unsigned (varint)
    u64_var: pyfory.uint64 = 0  # 64-bit unsigned (varint)

    # Explicit fixed-size encoding
    u32_fixed: pyfory.fixed_uint32 = 0  # 32-bit unsigned (fixed 4 bytes)
    u64_fixed: pyfory.fixed_uint64 = 0  # 64-bit unsigned (fixed 8 bytes)

    # Tagged encoding (includes type tag)
    u64_tagged: pyfory.tagged_uint64 = 0  # 64-bit unsigned (tagged)
```

### Floating Point

```python
@dataclass
class FloatingPoint:
    float_val: pyfory.float32 = 0.0   # 32-bit float
    double_val: pyfory.float64 = 0.0  # 64-bit double
```

### Encoding Summary

| Type                   | Encoding | Size       |
| ---------------------- | -------- | ---------- |
| `pyfory.int8`          | fixed    | 1 byte     |
| `pyfory.int16`         | fixed    | 2 bytes    |
| `pyfory.int32`         | varint   | 1-5 bytes  |
| `pyfory.int64`         | varint   | 1-10 bytes |
| `pyfory.uint8`         | fixed    | 1 byte     |
| `pyfory.uint16`        | fixed    | 2 bytes    |
| `pyfory.uint32`        | varint   | 1-5 bytes  |
| `pyfory.uint64`        | varint   | 1-10 bytes |
| `pyfory.fixed_uint32`  | fixed    | 4 bytes    |
| `pyfory.fixed_uint64`  | fixed    | 8 bytes    |
| `pyfory.tagged_uint64` | tagged   | 1-9 bytes  |
| `pyfory.float32`       | fixed    | 4 bytes    |
| `pyfory.float64`       | fixed    | 8 bytes    |

**When to Use**:

- `varint`: Best for values that are often small (default for int32/int64/uint32/uint64)
- `fixed`: Best for values that use the full range (e.g., timestamps, hashes)
- `tagged`: When type information needs to be preserved (uint64 only)
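
The size difference between `varint` and `fixed` is easiest to see by encoding a few values. The sketch below implements a generic LEB128-style unsigned varint (7 payload bits per byte) and compares it with a fixed 8-byte layout from the standard `struct` module. It is an illustration of the general technique; Fory's actual wire format may differ in detail, and `encode_uvarint` and `fixed_u64` are hypothetical helper names.

```python
import struct


def encode_uvarint(value):
    """Encode a non-negative int as a LEB128-style varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def fixed_u64(value):
    """Encode as a fixed 8-byte little-endian unsigned integer."""
    return struct.pack("<Q", value)


# Print value, varint length, fixed length for small and full-range values
for sample in (1, 300, 2**32 - 1, 2**64 - 1):
    print(sample, len(encode_uvarint(sample)), len(fixed_u64(sample)))
```

Small values fit in one or two varint bytes, while values near the top of the 64-bit range need more bytes than the fixed layout, which is why full-range data such as hashes tends to favor `fixed`.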

## Complete Example

```python
from dataclasses import dataclass
from typing import Optional, List, Dict, Set
import pyfory


@dataclass
class Document:
    # Fields with tag IDs (recommended for compatible mode)
    title: str = pyfory.field(id=0, default="")
    version: pyfory.int32 = pyfory.field(id=1, default=0)

    # Nullable field
    description: Optional[str] = pyfory.field(id=2, nullable=True, default=None)

    # Collection fields
    tags: List[str] = pyfory.field(id=3, default_factory=list)
    metadata: Dict[str, str] = pyfory.field(id=4, default_factory=dict)
    categories: Set[str] = pyfory.field(id=5, default_factory=set)

    # Unsigned integers with different encodings
    view_count: pyfory.uint64 = pyfory.field(id=6, default=0)  # varint encoding
    file_size: pyfory.fixed_uint64 = pyfory.field(id=7, default=0)  # fixed encoding
    checksum: pyfory.tagged_uint64 = pyfory.field(id=8, default=0)  # tagged encoding

    # Reference-tracked field for shared/circular references
    parent: Optional["Document"] = pyfory.field(id=9, ref=True, nullable=True, default=None)

    # Ignored field (not serialized)
    _cache: dict = pyfory.field(ignore=True, default_factory=dict)


def main():
    fory = pyfory.Fory(xlang=True, compatible=True, ref=True)
    fory.register_type(Document, type_id=100)

    doc = Document(
        title="My Document",
        version=1,
        description="A sample document",
        tags=["tag1", "tag2"],
        metadata={"key": "value"},
        categories={"cat1"},
        view_count=42,
        file_size=1024,
        checksum=123456789,
        parent=None,
    )

    # Serialize
    data = fory.serialize(doc)

    # Deserialize
    decoded = fory.deserialize(data)
    assert decoded.title == doc.title
    assert decoded.version == doc.version


if __name__ == "__main__":
    main()
```

## Cross-Language Compatibility

When serializing data to be read by other languages (Java, Rust, C++, Go), use field IDs and matching type annotations:

```python
@dataclass
class CrossLangData:
    # Use field IDs for cross-language compatibility
    int_var: pyfory.int32 = pyfory.field(id=0, default=0)
    long_fixed: pyfory.fixed_uint64 = pyfory.field(id=1, default=0)
    long_tagged: pyfory.tagged_uint64 = pyfory.field(id=2, default=0)
    optional_value: Optional[str] = pyfory.field(id=3, nullable=True, default=None)
```

## Schema Evolution

Compatible mode supports schema evolution. Configure field IDs to reduce serialization cost:

```python
# Version 1
@dataclass
class DataV1:
    id: pyfory.int64 = pyfory.field(id=0, default=0)
    name: str = pyfory.field(id=1, default="")


# Version 2: Added new field
@dataclass
class DataV2:
    id: pyfory.int64 = pyfory.field(id=0, default=0)
    name: str = pyfory.field(id=1, default="")
    email: Optional[str] = pyfory.field(id=2, nullable=True, default=None)  # New field
```

Data serialized as `DataV1` can be deserialized as `DataV2`; the new `email` field is set to `None`.
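
The reason IDs make evolution robust is that values are matched by number rather than by name. The toy model below shows this with a dict keyed by field ID: a V1 payload decodes into a V2 class that adds a field and, to illustrate the renaming benefit mentioned under Field ID, also renames one. Everything here is a standard-library sketch under that assumption; `tag`, `encode`, and `decode` are hypothetical and do not reflect Fory's wire format.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


def tag(id, **kwargs):
    """Hypothetical stand-in for pyfory.field(): keeps the ID in stdlib metadata."""
    return field(metadata={"id": id}, **kwargs)


@dataclass
class DataV1:
    id: int = tag(0, default=0)
    name: str = tag(1, default="")


@dataclass
class DataV2:
    id: int = tag(0, default=0)
    full_name: str = tag(1, default="")          # renamed, same ID
    email: Optional[str] = tag(2, default=None)  # new field


def encode(obj):
    """Toy wire format: a dict keyed by field ID rather than field name."""
    return {f.metadata["id"]: getattr(obj, f.name) for f in fields(obj)}


def decode(wire, cls):
    """Match values to fields by ID; fields absent from the wire keep their defaults."""
    names_by_id = {f.metadata["id"]: f.name for f in fields(cls)}
    kwargs = {names_by_id[i]: v for i, v in wire.items() if i in names_by_id}
    return cls(**kwargs)


wire = encode(DataV1(id=7, name="ada"))
upgraded = decode(wire, DataV2)
print(upgraded)
```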

Alternatively, omit field IDs. Field names are then written in metadata, with larger overhead:

```python
@dataclass
class Data:
    id: pyfory.int64 = 0
    name: str = ""
```

## Native Mode vs Xlang Mode

Field configuration behaves differently depending on the serialization mode:

### Native Mode (Python-only)

Native mode has **relaxed default values** for maximum compatibility:

- **Nullable**: `str` and numeric types are non-nullable by default unless `Optional` is used
- **Ref tracking**: Enabled by default for object references (except `str` and numeric types)

In native mode, you typically **don't need to configure field annotations** unless you want to:

- Reduce serialized size by using field IDs
- Optimize performance by disabling unnecessary ref tracking

```python
# Native mode: works without field configuration
@dataclass
class User:
    id: int = 0
    name: str = ""
    tags: List[str] = None  # non-str/numeric fields are nullable by default
```

### Xlang Mode (Cross-language)

Xlang mode has **stricter default values** due to type system differences between languages:

- **Nullable**: Fields are non-nullable by default (`nullable=False`)
- **Ref tracking**: Disabled by default (`ref=False`)

In xlang mode, you **need to configure fields** when:

- A field can be `None` (use `Optional[T]` with `nullable=True`)
- A field needs reference tracking for shared/circular objects (use `ref=True`)
- Integer types need specific encoding for cross-language compatibility
- You want to reduce metadata size (use field IDs)

```python
# Xlang mode: explicit configuration required for nullable/ref fields
@dataclass
class User:
    id: pyfory.int64 = pyfory.field(id=0, default=0)
    name: str = pyfory.field(id=1, default="")
    email: Optional[str] = pyfory.field(id=2, nullable=True, default=None)  # Must declare nullable
    friend: Optional["User"] = pyfory.field(id=3, ref=True, nullable=True, default=None)  # Must declare ref
```

### Default Values Summary

| Option     | Native Mode Default                                   | Xlang Mode Default |
| ---------- | ----------------------------------------------------- | ------------------ |
| `nullable` | `False` for `str`/numeric; others nullable by default | `False`            |
| `ref`      | `True` (except `str` and numeric types)               | `False`            |
| `dynamic`  | `True` (except numeric/str/time types)                | `False` (concrete) |

## Best Practices

1. **Configure field IDs**: Recommended for compatible mode to reduce serialization cost
2. **Use `Optional[T]` with `nullable=True`**: Required for nullable fields in xlang mode
3. **Enable ref tracking for shared objects**: Use `ref=True` when objects are shared or circular
4. **Use `ignore=True` for sensitive data**: Passwords, tokens, internal state
5. **Choose appropriate encoding**: `varint` for small values, `fixed` for full-range values
6. **Keep IDs stable**: Once assigned, don't change field IDs

## Options Reference

| Configuration                                | Description                          |
| -------------------------------------------- | ------------------------------------ |
| `pyfory.field(id=N)`                         | Field tag ID to reduce metadata size |
| `pyfory.field(nullable=True)`                | Mark field as nullable               |
| `pyfory.field(ref=True)`                     | Enable reference tracking            |
| `pyfory.field(ignore=True)`                  | Exclude field from serialization     |
| `pyfory.field(dynamic=True)`                 | Force type info to be written        |
| `pyfory.field(dynamic=False)`                | Skip type info (use declared type)   |
| `Optional[T]`                                | Type hint for nullable fields        |
| `pyfory.int32`, `pyfory.int64`               | Signed integers (varint encoding)    |
| `pyfory.uint32`, `pyfory.uint64`             | Unsigned integers (varint encoding)  |
| `pyfory.fixed_uint32`, `pyfory.fixed_uint64` | Fixed-size unsigned                  |
| `pyfory.tagged_uint64`                       | Tagged encoding for uint64           |

## Related Topics

- [Basic Serialization](basic-serialization.md) - Getting started with Fory serialization
- [Schema Evolution](schema-evolution.md) - Compatible mode and schema evolution
- [Cross-Language](cross-language.md) - Interoperability with Java, Rust, C++, Go