| <!--- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| # Introduction |
| |
| DataFusion is an extensible query execution framework, written in |
| Rust, that uses [Apache Arrow](https://arrow.apache.org) as its |
| in-memory format. |
| |
| DataFusion supports SQL and a DataFrame API for building logical query |
| plans, an extensive query optimizer, and a multi-threaded parallel |
| execution engine for processing partitioned data sources |
| such as CSV and Parquet files extremely quickly. |
| |
| ## Use Cases |
| |
| DataFusion is used to build modern, fast, and efficient data |
| pipelines, ETL processes, and database systems that need the |
| performance of Rust and Apache Arrow and want to offer their users |
| the convenience of an SQL interface or a DataFrame API. |
| |
| ## Why DataFusion? |
| |
| - _High Performance_: Leveraging Rust and Arrow's memory model, DataFusion is very fast. |
| - _Easy to Connect_: Being part of the Apache Arrow ecosystem (Arrow, Parquet and Flight), DataFusion works well with the rest of the big data ecosystem. |
| - _Easy to Embed_: Allowing extension at almost any point in its design, DataFusion can be tailored for your specific use case. |
| - _High Quality_: Extensively tested, both by itself and with the rest of the Arrow ecosystem, DataFusion can be used as the foundation for production systems. |