Fluss is a streaming storage built for real-time analytics which can serve as the real-time data layer for Lakehouse architectures.
It bridges the gap between streaming data and the data Lakehouse by enabling low-latency, high-throughput data ingestion and processing while seamlessly integrating with popular compute engines like Apache Flink, with Apache Spark and StarRocks coming soon.
Fluss supports streaming reads and writes with sub-second latency and stores data in a columnar format, enhancing query performance and reducing storage costs. It offers flexible table types, including append-only Log Tables and updatable PrimaryKey Tables, to accommodate diverse real-time analytics and processing needs.
With built-in replication for fault tolerance, horizontal scalability, and advanced features like high-QPS lookup joins and bulk read/write operations, Fluss is ideal for powering real-time analytics, AI/ML pipelines, and streaming data warehouses.
Fluss (German: river, pronounced /flus/) enables streaming data continuously converging, distributing and flowing into lakes, like a river 🌊
The following is a list of (but not limited to) use-cases that Fluss shines ✨: