***************************************************************************
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
***************************************************************************
Apache MRQL 0.9.6-incubating
============================
Apache MRQL (pronounced miracle) is a query processing and optimization
system for large-scale, distributed data analysis. MRQL (the MapReduce
Query Language) is an SQL-like query language for large-scale data
analysis on a cluster of computers. The MRQL query processing system
can evaluate MRQL queries in four modes:
* in Map-Reduce mode using Apache Hadoop,
* in BSP mode (Bulk Synchronous Parallel mode) using Apache Hama,
* in Spark mode using Apache Spark,
* in Flink mode using Apache Flink.
The MRQL query language is powerful enough to express most common data
analysis tasks over many forms of raw in-situ data, such as XML and
JSON documents, binary files, and CSV documents. MRQL is more powerful
than other current high-level MapReduce languages, such as Hive and
PigLatin, since it can operate on more complex data and supports more
powerful query constructs, thus eliminating the need for explicit
MapReduce code. With MRQL, users can express complex data analysis
tasks, such as PageRank, k-means clustering, and matrix factorization,
using SQL-like queries exclusively, while the MRQL query processing
system compiles these queries to efficient Java code.
General Info
============
For the latest information about MRQL, please visit our website at:
http://mrql.incubator.apache.org/
and our wiki, at:
http://wiki.apache.org/mrql/
Getting Started
===============
Installation instructions and a quick tutorial:
http://wiki.apache.org/mrql/GettingStarted
To build MRQL using Maven, run 'mvn clean install'. To validate the
installation, run 'mvn -DskipTests=false clean install', which runs the
queries in 'tests/queries' in each of the following modes: in-memory,
local Hadoop, local Hama, local Spark, and local Flink.
Useful mailing lists
====================
1. user@mrql.incubator.apache.org - For usage questions and general
discussion. To subscribe, send an empty email to
user-subscribe@mrql.incubator.apache.org.
2. dev@mrql.incubator.apache.org - For discussions about code, design, and
features. To subscribe, send an empty email to
dev-subscribe@mrql.incubator.apache.org.
3. commits@mrql.incubator.apache.org - For monitoring commits to the source
repository. To subscribe, send an empty email to
commits-subscribe@mrql.incubator.apache.org.