| --- |
| title: Type System |
| sidebar_position: 4 |
| id: type_system |
| license: | |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --- |
| |
| This document describes the FDL type system and how types map to each target language. |
| |
| ## Overview |
| |
| FDL provides a rich type system designed for cross-language compatibility: |
| |
| - **Primitive Types**: Basic scalar types (integers, floats, strings, etc.) |
| - **Enum Types**: Named integer constants |
| - **Message Types**: Structured compound types |
| - **Collection Types**: Lists and maps |
| - **Nullable Types**: Optional/nullable variants |
| |
| ## Primitive Types |
| |
| ### Boolean |
| |
| ```protobuf |
| bool is_active = 1; |
| ``` |
| |
| | Language | Type | Notes | |
| | -------- | --------------------- | ------------------ | |
| | Java | `boolean` / `Boolean` | Primitive or boxed | |
| | Python | `bool` | | |
| | Go | `bool` | | |
| | Rust | `bool` | | |
| | C++ | `bool` | | |
| |
| ### Integer Types |
| |
| FDL provides fixed-width signed integers (varint encoding for 32/64-bit): |
| |
| | FDL Type | Size | Range | |
| | -------- | ------ | ----------------- | |
| | `int8` | 8-bit | -128 to 127 | |
| | `int16` | 16-bit | -32,768 to 32,767 | |
| | `int32` | 32-bit | -2^31 to 2^31 - 1 | |
| | `int64` | 64-bit | -2^63 to 2^63 - 1 | |
| |
| **Language Mapping:** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | ------- | ------- | -------------- | ------- | ----- | --------- | |
| | `int8` | `byte` | `pyfory.int8` | `int8` | `i8` | `int8_t` | |
| | `int16` | `short` | `pyfory.int16` | `int16` | `i16` | `int16_t` | |
| | `int32` | `int` | `pyfory.int32` | `int32` | `i32` | `int32_t` | |
| | `int64` | `long` | `pyfory.int64` | `int64` | `i64` | `int64_t` | |
| |
| FDL provides fixed-width unsigned integers (varint encoding for 32/64-bit): |
| |
| | FDL | Size | Range | |
| | -------- | ------ | ------------- | |
| | `uint8` | 8-bit | 0 to 255 | |
| | `uint16` | 16-bit | 0 to 65,535 | |
| | `uint32` | 32-bit | 0 to 2^32 - 1 | |
| | `uint64` | 64-bit | 0 to 2^64 - 1 | |
| |
| **Language Mapping (Unsigned):** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | -------- | ------- | --------------- | -------- | ----- | ---------- | |
| | `uint8` | `short` | `pyfory.uint8` | `uint8` | `u8` | `uint8_t` | |
| | `uint16` | `int` | `pyfory.uint16` | `uint16` | `u16` | `uint16_t` | |
| | `uint32` | `long` | `pyfory.uint32` | `uint32` | `u32` | `uint32_t` | |
| | `uint64` | `long` | `pyfory.uint64` | `uint64` | `u64` | `uint64_t` | |
| |
| **Examples:** |
| |
| ```protobuf |
| message Counters { |
| int8 tiny = 1; |
| int16 small = 2; |
| int32 medium = 3; |
| int64 large = 4; |
| } |
| ``` |
| |
| **Python Type Hints:** |
| |
| Python's native `int` is arbitrary precision, so FDL uses type wrappers for fixed-width integers: |
| |
| ```python |
| from pyfory import int8, int16, int32 |
| |
| @dataclass |
| class Counters: |
| tiny: int8 |
| small: int16 |
| medium: int32 |
| large: int # int64 maps to native int |
| ``` |
| |
| ### Integer Encoding Variants |
| |
| For 32/64-bit integers, FDL uses varint encoding by default. Use explicit |
| types when you need fixed-width or tagged encoding: |
| |
| | FDL Type | Encoding | Notes | |
| | --------------- | -------- | ------------------------ | |
| | `fixed_int32` | fixed | Signed 32-bit | |
| | `fixed_int64` | fixed | Signed 64-bit | |
| | `fixed_uint32` | fixed | Unsigned 32-bit | |
| | `fixed_uint64` | fixed | Unsigned 64-bit | |
| | `tagged_int64` | tagged | Signed 64-bit (hybrid) | |
| | `tagged_uint64` | tagged | Unsigned 64-bit (hybrid) | |
| |
| ### Floating-Point Types |
| |
| | FDL Type | Size | Precision | |
| | --------- | ------ | ------------- | |
| | `float32` | 32-bit | ~7 digits | |
| | `float64` | 64-bit | ~15-16 digits | |
| |
| **Language Mapping:** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | --------- | -------- | ---------------- | --------- | ----- | -------- | |
| | `float32` | `float` | `pyfory.float32` | `float32` | `f32` | `float` | |
| | `float64` | `double` | `pyfory.float64` | `float64` | `f64` | `double` | |
| |
| **Example:** |
| |
| ```protobuf |
| message Coordinates { |
| float64 latitude = 1; |
| float64 longitude = 2; |
| float32 altitude = 3; |
| } |
| ``` |
| |
| ### String Type |
| |
| UTF-8 encoded text: |
| |
| ```protobuf |
| string name = 1; |
| ``` |
| |
| | Language | Type | Notes | |
| | -------- | ------------- | --------------------- | |
| | Java | `String` | Immutable | |
| | Python | `str` | | |
| | Go | `string` | Immutable | |
| | Rust | `String` | Owned, heap-allocated | |
| | C++ | `std::string` | | |
| |
| ### Bytes Type |
| |
| Raw binary data: |
| |
| ```protobuf |
| bytes data = 1; |
| ``` |
| |
| | Language | Type | Notes | |
| | -------- | ---------------------- | --------- | |
| | Java | `byte[]` | | |
| | Python | `bytes` | Immutable | |
| | Go | `[]byte` | | |
| | Rust | `Vec<u8>` | | |
| | C++ | `std::vector<uint8_t>` | | |
| |
| ### Temporal Types |
| |
| #### Date |
| |
| Calendar date without time: |
| |
| ```protobuf |
| date birth_date = 1; |
| ``` |
| |
| | Language | Type | Notes | |
| | -------- | --------------------------- | ----------------------- | |
| | Java | `java.time.LocalDate` | | |
| | Python | `datetime.date` | | |
| | Go | `time.Time` | Time portion ignored | |
| | Rust | `chrono::NaiveDate` | Requires `chrono` crate | |
| | C++ | `fory::serialization::Date` | | |
| |
| #### Timestamp |
| |
| Date and time with nanosecond precision: |
| |
| ```protobuf |
| timestamp created_at = 1; |
| ``` |
| |
| | Language | Type | Notes | |
| | -------- | -------------------------------- | ----------------------- | |
| | Java | `java.time.Instant` | UTC-based | |
| | Python | `datetime.datetime` | | |
| | Go | `time.Time` | | |
| | Rust | `chrono::NaiveDateTime` | Requires `chrono` crate | |
| | C++ | `fory::serialization::Timestamp` | | |
| |
| ### Any |
| |
| Dynamic value with runtime type information: |
| |
| ```protobuf |
| any payload = 1; |
| ``` |
| |
| | Language | Type | Notes | |
| | -------- | -------------- | -------------------- | |
| | Java | `Object` | Runtime type written | |
| | Python | `Any` | Runtime type written | |
| | Go | `any` | Runtime type written | |
| | Rust | `Box<dyn Any>` | Runtime type written | |
| | C++ | `std::any` | Runtime type written | |
| |
| **Notes:** |
| |
| - `any` always writes a null flag (same as `nullable`) because values may be empty; codegen treats `any` as nullable even without `optional`. |
| - Allowed runtime values are limited to `bool`, `string`, `enum`, `message`, and `union`. Other primitives (numeric, bytes, date/time) and list/map are not supported; wrap them in a message or use explicit fields instead. |
| - `ref` is not allowed on `any` fields (including repeated/map values). Wrap `any` in a message if you need reference tracking. |
| - The runtime type must be registered in the target language schema/IDL registration; unknown types fail to deserialize. |
| |
| ## Enum Types |
| |
| Enums define named integer constants: |
| |
| ```protobuf |
| enum Priority [id=100] { |
| LOW = 0; |
| MEDIUM = 1; |
| HIGH = 2; |
| CRITICAL = 3; |
| } |
| ``` |
| |
| **Language Mapping:** |
| |
| | Language | Implementation | |
| | -------- | --------------------------------------- | |
| | Java | `enum Priority { LOW, MEDIUM, ... }` | |
| | Python | `class Priority(IntEnum): LOW = 0, ...` | |
| | Go | `type Priority int32` with constants | |
| | Rust | `#[repr(i32)] enum Priority { ... }` | |
| | C++ | `enum class Priority : int32_t { ... }` | |
| |
| **Java:** |
| |
| ```java |
| public enum Priority { |
| LOW, |
| MEDIUM, |
| HIGH, |
| CRITICAL; |
| } |
| ``` |
| |
| **Python:** |
| |
| ```python |
| class Priority(IntEnum): |
| LOW = 0 |
| MEDIUM = 1 |
| HIGH = 2 |
| CRITICAL = 3 |
| ``` |
| |
| **Go:** |
| |
| ```go |
| type Priority int32 |
| |
| const ( |
| PriorityLow Priority = 0 |
| PriorityMedium Priority = 1 |
| PriorityHigh Priority = 2 |
| PriorityCritical Priority = 3 |
| ) |
| ``` |
| |
| **Rust:** |
| |
| ```rust |
| #[derive(ForyObject, Debug, Clone, PartialEq, Default)] |
| #[repr(i32)] |
| pub enum Priority { |
| #[default] |
| Low = 0, |
| Medium = 1, |
| High = 2, |
| Critical = 3, |
| } |
| ``` |
| |
| **C++:** |
| |
| ```cpp |
| enum class Priority : int32_t { |
| LOW = 0, |
| MEDIUM = 1, |
| HIGH = 2, |
| CRITICAL = 3, |
| }; |
| FORY_ENUM(Priority, LOW, MEDIUM, HIGH, CRITICAL); |
| ``` |
| |
| ## Message Types |
| |
| Messages are structured types composed of fields: |
| |
| ```protobuf |
| message User [id=101] { |
| string id = 1; |
| string name = 2; |
| int32 age = 3; |
| } |
| ``` |
| |
| **Language Mapping:** |
| |
| | Language | Implementation | |
| | -------- | ----------------------------------- | |
| | Java | POJO class with getters/setters | |
| | Python | `@dataclass` class | |
| | Go | Struct with exported fields | |
| | Rust | Struct with `#[derive(ForyObject)]` | |
| | C++ | Struct with `FORY_STRUCT` macro | |
| |
| ## Collection Types |
| |
| ### List (repeated) |
| |
| The `repeated` modifier creates a list: |
| |
| ```protobuf |
| repeated string tags = 1; |
| repeated User users = 2; |
| ``` |
| |
| **Language Mapping:** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | ----------------- | --------------- | ------------ | ---------- | ------------- | -------------------------- | |
| | `repeated string` | `List<String>` | `List[str]` | `[]string` | `Vec<String>` | `std::vector<std::string>` | |
| | `repeated int32` | `List<Integer>` | `List[int]` | `[]int32` | `Vec<i32>` | `std::vector<int32_t>` | |
| | `repeated User` | `List<User>` | `List[User]` | `[]User` | `Vec<User>` | `std::vector<User>` | |
| |
| **List modifiers:** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | -------------------------- | ---------------------------------------------- | --------------------------------------- | ----------------------- | --------------------- | ----------------------------------------- | |
| | `optional repeated string` | `List<String>` + `@ForyField(nullable = true)` | `Optional[List[str]]` | `[]string` + `nullable` | `Option<Vec<String>>` | `std::optional<std::vector<std::string>>` | |
| | `repeated optional string` | `List<String>` (nullable elements) | `List[Optional[str]]` | `[]*string` | `Vec<Option<String>>` | `std::vector<std::optional<std::string>>` | |
| | `ref repeated User` | `List<User>` + `@ForyField(ref = true)` | `List[User]` + `pyfory.field(ref=True)` | `[]User` + `ref` | `Arc<Vec<User>>`\* | `std::shared_ptr<std::vector<User>>` | |
| | `repeated ref User` | `List<User>` | `List[User]` | `[]*User` + `ref=false` | `Vec<Arc<User>>`\* | `std::vector<std::shared_ptr<User>>` | |
| |
| \*Use `[(fory).thread_safe_pointer = false]` to generate `Rc` instead of `Arc` in Rust. |
| |
| ### Map |
| |
| Maps with typed keys and values: |
| |
| ```protobuf |
| map<string, int32> counts = 1; |
| map<string, User> users = 2; |
| ``` |
| |
| **Language Mapping:** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | -------------------- | ---------------------- | ----------------- | ------------------ | ----------------------- | -------------------------------- | |
| | `map<string, int32>` | `Map<String, Integer>` | `Dict[str, int]` | `map[string]int32` | `HashMap<String, i32>` | `std::map<std::string, int32_t>` | |
| | `map<string, User>` | `Map<String, User>` | `Dict[str, User]` | `map[string]User` | `HashMap<String, User>` | `std::map<std::string, User>` | |
| |
| **Key Type Restrictions:** |
| |
| Map keys should be hashable types: |
| |
| - `string` (most common) |
| - Integer types (`int8`, `int16`, `int32`, `int64`) |
| - `bool` |
| |
| Avoid using messages or complex types as keys. |
| |
| ## Nullable Types |
| |
| The `optional` modifier makes a field nullable: |
| |
| ```protobuf |
| message Profile { |
| string name = 1; // Required |
| optional string bio = 2; // Nullable |
| optional int32 age = 3; // Nullable integer |
| } |
| ``` |
| |
| **Language Mapping:** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | ----------------- | ---------- | --------------- | --------- | ---------------- | ---------------------------- | |
| | `optional string` | `String`\* | `Optional[str]` | `*string` | `Option<String>` | `std::optional<std::string>` | |
| | `optional int32` | `Integer` | `Optional[int]` | `*int32` | `Option<i32>` | `std::optional<int32_t>` | |
| |
| \*Java uses boxed types with `@ForyField(nullable = true)` annotation. |
| |
| **Default Values:** |
| |
| | Type | Default Value | |
| | ------------------ | ------------------- | |
| | Non-optional types | Language default | |
| | Optional types | `null`/`None`/`nil` | |
| |
| ## Reference Types |
| |
| The `ref` modifier enables reference tracking: |
| |
| ```protobuf |
| message TreeNode { |
| string value = 1; |
| ref TreeNode parent = 2; |
| repeated ref TreeNode children = 3; |
| } |
| ``` |
| |
| **Use Cases:** |
| |
| 1. **Shared References**: Same object referenced from multiple places |
| 2. **Circular References**: Object graphs with cycles |
| 3. **Large Objects**: Avoid duplicate serialization |
| |
| **Language Mapping:** |
| |
| | FDL | Java | Python | Go | Rust | C++ | |
| | ---------- | -------- | ------ | ---------------------- | ----------- | ----------------------- | |
| | `ref User` | `User`\* | `User` | `*User` + `fory:"ref"` | `Arc<User>` | `std::shared_ptr<User>` | |
| |
| \*Java uses `@ForyField(ref = true)` annotation. |
| |
| Rust uses `Arc` by default; set `ref(thread_safe = false)` in FDL (or |
| `[(fory).thread_safe_pointer = false]` in protobuf) to use `Rc`. Use |
| `ref(weak = true)` in FDL (or `[(fory).weak_ref = true]` in protobuf) with `ref` |
| to generate weak pointer types: `ArcWeak`/`RcWeak` in Rust and |
| `fory::serialization::SharedWeak<T>` in C++. Java/Python/Go ignore `weak_ref`. |
| |
| ## Type Compatibility Matrix |
| |
| This matrix shows which type conversions are safe across languages: |
| |
| | From → To | bool | int8 | int16 | int32 | int64 | float32 | float64 | string | |
| | ----------- | ---- | ---- | ----- | ----- | ----- | ------- | ------- | ------ | |
| | **bool** | ✓ | ✓ | ✓ | ✓ | ✓ | - | - | - | |
| | **int8** | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - | |
| | **int16** | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | - | |
| | **int32** | - | - | - | ✓ | ✓ | - | ✓ | - | |
| | **int64** | - | - | - | - | ✓ | - | - | - | |
| | **float32** | - | - | - | - | - | ✓ | ✓ | - | |
| | **float64** | - | - | - | - | - | - | ✓ | - | |
| | **string** | - | - | - | - | - | - | - | ✓ | |
| |
| ✓ = Safe conversion, - = Not recommended |
| |
| ## Best Practices |
| |
| ### Choosing Integer Types |
| |
| - Use `int32` as the default for most integers |
| - Use `int64` for large values (timestamps, IDs) |
| - Use `int8`/`int16` only when storage size matters |
| |
| ### String vs Bytes |
| |
| - Use `string` for text data (UTF-8) |
| - Use `bytes` for binary data (images, files, encrypted data) |
| |
| ### Optional vs Required |
| |
| - Use `optional` when the field may legitimately be absent |
| - Default to required fields for better type safety |
| - Document why a field is optional |
| |
| ### Reference Tracking |
| |
| - Use `ref` only when needed (shared/circular references) |
| - Reference tracking adds overhead |
| - Test with realistic data to ensure correctness |
| |
| ### Collections |
| |
| - Prefer `repeated` for ordered sequences |
| - Use `map` for key-value lookups |
| - Consider message types for complex map values |