blob: b3f91bc413da3d64e7058b10e5f62b1263aacd90 [file] [log] [blame]
~~
~~ Licensed to the Apache Software Foundation (ASF) under one
~~ or more contributor license agreements. See the NOTICE file
~~ distributed with this work for additional information
~~ regarding copyright ownership. The ASF licenses this file
~~ to you under the Apache License, Version 2.0 (the
~~ "License"); you may not use this file except in compliance
~~ with the License. You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing,
~~ software distributed under the License is distributed on an
~~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~~ KIND, either express or implied. See the License for the
~~ specific language governing permissions and limitations
~~ under the License.
~~
Lens server components
Lens server by default comes up with an embedded Jersey server. Lens server is composed of multiple services
which are responsible for different functions. The key services inside Lens are Session Management, Query life cycle management,
Schema management, Data availability management, Query schedule management (in progress), Metrics Registry, Besides, it also offers
a simple UI for the users.
Here is the diagram describing all the components:
[/images/serverdesign.png] Lens Server components
* Lens services
** Session Management
All operations in Lens are performed in the context of a session. This would essentially be the first
thing a client would need before any operation can be be performed in the lens system. Session management
component facilitates creation, session specific configuration management, session resource management
within a session and termination of a session. Session info is periodically written out a file (could be a
local file or a file on HDFS) and the same is read during Lens restart. Frequency of this flush to persistent storage is
configurable. Sessions are kept alive for a configurable period (24 hours by default) since last activity on
the session.
** Schema Management
This component allows for cubes, fact tables, dimension tables and storages to be managed. While Storages can
be managed only by the administrator, other objects can be managed directly by the lens users. Lens currently
extends and uses Hive Metastore for managing lens objects. Lens server has to be configured appropriately to point
to the hive metastore end point. See {{{../resource_MetastoreResource.html#path__metastore_storages.html}Storages API}}
and {{{../resource_MetastoreResource.html#path__metastore_storages_-storage-.html}Storage API}} for details about
storage administration.
** Data Availability Management
Once schema is registered with Lens through the Schema management API, data can be made available in the form of partitions.
The Lens data availability management (logical component) manages the partitions available for each fact and maintains an
active timeline of data. This can be consulted by the query execution and managment components. Like the Schema management
component, Lens currently uses the Hive metastore for data availability management as well.
** Query life cycle management
Servicing user queries over cubes, fact, dimension or native tables across different storages by choosing the best storage
and execution engine is the most critical capability of the lens. Lens maintains state of queries submitted and tracks the progress
through the system and also maintains a history of queries submitted and the its status. Unique query handle is issued to each
query submitted and this can be used to track status, retrieve results or kill. Details of query maintained in the history can be
used for performing various analysis on the actual usage.
** Query schedule management (in progress)
Lens should allow for users to schedule their queries with some periodicity. This scheduling capability should allow users to
express the time schedule. Additionally Lens server would apply gating criteria based on data availability. Also any throttling
that needs to be effected would be considered before a scheduled query moves into running state.
** Metrics Registry
Lens today includes support for instrumenting various functions at a server level and at a query level through gauges and metrics.
This can be used to understand the performance characteristics of the lens server and a single query.
** Pluggable query drivers
Lens server allows pluggable execution drivers for running queries. Available execution engines are Hive and JDBC.
For configuring Hive as an execution engine, administrator should configure the HiveServer2 end point. More details on
configuring multiple drivers will be covered in {{{./config-server.html} configuration guide}}