//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
Apache Unomi gathers information about users' actions and processes and stores it through Unomi services. The
collected information can then be used to personalize content, derive insights on user behavior, categorize user
profiles into segments along user-definable dimensions, or feed algorithms that act upon it.
=== Items and types
Unomi structures the information it collects using the concept of `Item` which provides the base information (an
identifier and a type) the context server needs to process and store the data. Items are persisted according to their
type (structure) and identifier (identity). This base structure can be extended, if needed, using properties in the
form of key-value pairs.
These properties are further defined by the `Item`’s type definition, which makes the `Item`’s structure and
semantics explicit. By defining new types, users specify which properties (including the type of values they accept) are
available to items of that specific type.
Unomi defines default value types: `date`, `email`, `integer` and `string`, all pretty self-explanatory. While you can
think of these value types as "primitive" types, it is possible to extend Unomi by providing additional value types.
Additionally, most items are associated with a scope, which is a concept that Unomi uses to group together related
items. A given scope is represented in Unomi by a simple string identifier and usually represents an application or set
of applications from which Unomi gathers data, depending on the desired analysis granularity. In the context of web
sites, a scope could, for example, represent a site or family of related sites being analyzed. Scopes allow clients
accessing the context server to filter data to only see relevant data.
_Base `Item` structure:_
[source,json]
----
{
  "itemType": <type of the item>,
  "scope": <scope>,
  "itemId": <item identifier>,
  "properties": <optional properties>
}
----
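As a rough illustration, the base structure and scope-based filtering can be modelled in a few lines of Python. The `Item` class and the `filter_by_scope` helper are hypothetical names used only for this sketch; they are not part of Unomi or of any Unomi client library.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional


@dataclass
class Item:
    """Hypothetical in-memory mirror of the base Item structure."""
    itemType: str
    itemId: str
    scope: Optional[str] = None
    properties: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialize with the same field names as the JSON structure above.
        return json.dumps(asdict(self), sort_keys=True)


def filter_by_scope(items: List[Item], scope: str) -> List[Item]:
    """Keep only the items that belong to the given scope."""
    return [item for item in items if item.scope == scope]


# Two pages collected from two different (made-up) scopes.
items = [
    Item(itemType="page", itemId="home", scope="ACMESPACE"),
    Item(itemType="page", itemId="news", scope="OTHERSITE"),
]
visible = filter_by_scope(items, "ACMESPACE")
print([item.itemId for item in visible])
print(visible[0].to_json())
```

A client that only cares about one site would apply this kind of filter so that data from unrelated scopes never reaches it.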
Some types can be defined dynamically at runtime by calling the REST API, while other extensions are provided via Unomi
plugins. Part of extending Unomi, therefore, is a matter of defining new types and specifying which kind of Unomi
entity (e.g. profiles) they can be assigned to. For example, the following JSON document can be passed to Unomi to
declare a new property type identified (and named) `tweetNb`, tagged with the `social` tag, targeting profiles and
using the `integer` value type.
_Example JSON type definition:_
[source,json]
----
{
  "itemId": "tweetNb",
  "itemType": "propertyType",
  "metadata": {
    "id": "tweetNb",
    "name": "tweetNb",
    "systemTags": ["social"]
  },
  "target": "profiles",
  "type": "integer"
}
----
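When several property types have to be declared, the same document can be generated rather than written by hand. The following is a minimal sketch, assuming a hypothetical `make_property_type` helper; the check against the four default value types is an assumption of this example and would need to be extended if additional value types are installed.

```python
import json
from typing import Any, Dict, List

# Default value types named in the text; plugins may add more.
DEFAULT_VALUE_TYPES = {"date", "email", "integer", "string"}


def make_property_type(type_id: str, value_type: str,
                       tags: List[str], target: str = "profiles") -> Dict[str, Any]:
    """Build a property type definition shaped like the JSON example (hypothetical helper)."""
    if value_type not in DEFAULT_VALUE_TYPES:
        raise ValueError(f"unknown value type: {value_type}")
    return {
        "itemId": type_id,
        "itemType": "propertyType",
        # The example uses the same string as identifier and name.
        "metadata": {"id": type_id, "name": type_id, "systemTags": list(tags)},
        "target": target,
        "type": value_type,
    }


tweet_nb = make_property_type("tweetNb", "integer", ["social"])
print(json.dumps(tweet_nb, indent=2))
```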
____
Unomi defines a built-in scope (called `systemscope`) that clients can use to share data across scopes.
____
=== Events
Users' actions are conveyed from clients to the context server using events. The information an event carries depends
on what is collected and on how users interact with the observed systems, but at a minimum an event provides a type, a
scope, and source and target items. Events are also timestamped. Conceptually, an event can be seen as a
sentence, the event's type being the verb, the source the subject and the target the object.
_Event structure:_
[source,json]
----
{
  "eventType": <type of the event>,
  "scope": <scope of the event>,
  "source": <Item>,
  "target": <Item>,
  "properties": <optional properties>
}
----
Source and target can be any Unomi item but are not limited to them. In particular, as long as they can be described
using properties and Unomi’s type mechanism and can be processed either natively or via extension plugins, source and
target can represent just about anything. Events can also be triggered as part of Unomi’s internal processes, for example
when a rule is triggered.
Events are sent to Unomi from client applications using the JSON format. A typical page view event from a web site
could look something like the following:
_Example page view event:_
[source,json]
----
{
  "eventType": "view",
  "scope": "ACMESPACE",
  "source": {
    "itemType": "site",
    "scope": "ACMESPACE",
    "itemId": "c4761bbf-d85d-432b-8a94-37e866410375"
  },
  "target": {
    "itemType": "page",
    "scope": "ACMESPACE",
    "itemId": "b6acc7b3-6b9d-4a9f-af98-54800ec13a71",
    "properties": {
      "pageInfo": {
        "pageID": "b6acc7b3-6b9d-4a9f-af98-54800ec13a71",
        "pageName": "Home",
        "pagePath": "/sites/ACMESPACE/home",
        "destinationURL": "http://localhost:8080/sites/ACMESPACE/home.html",
        "referringURL": "http://localhost:8080/",
        "language": "en"
      },
      "category": {},
      "attributes": {}
    }
  }
}
----
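A client could assemble such an event programmatically before sending it. The sketch below only builds and prints the JSON document; `make_item` and `make_event` are hypothetical helpers introduced for illustration, and the page properties are abbreviated compared with the example above.

```python
import json
from typing import Any, Dict, Optional


def make_item(item_type: str, item_id: str, scope: str,
              properties: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a minimal item dictionary (hypothetical helper)."""
    item: Dict[str, Any] = {"itemType": item_type, "scope": scope, "itemId": item_id}
    if properties:
        item["properties"] = properties
    return item


def make_event(event_type: str, scope: str,
               source: Dict[str, Any], target: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble an event: the type is the verb, the source the subject, the target the object."""
    return {"eventType": event_type, "scope": scope, "source": source, "target": target}


site = make_item("site", "c4761bbf-d85d-432b-8a94-37e866410375", "ACMESPACE")
page = make_item(
    "page", "b6acc7b3-6b9d-4a9f-af98-54800ec13a71", "ACMESPACE",
    {"pageInfo": {"pageName": "Home", "pagePath": "/sites/ACMESPACE/home"}},
)
view_event = make_event("view", "ACMESPACE", site, page)
print(json.dumps(view_event, indent=2))
```

Read as a sentence, this event says: the site (subject) viewed (verb) the Home page (object).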
=== Profiles
By processing events, Unomi progressively builds a picture of who the user is and how they behave. This knowledge is
embedded in a `Profile` object. A profile is an `Item` with any number of properties and optional segments and scores.
Unomi provides default properties to cover common data (name, last name, age, email, etc.) as well as default segments
to categorize users. Unomi users are, however, free and even encouraged to create additional properties and segments to
better suit their needs.
Unlike other Unomi items, profiles are not part of a scope since we want to be able to track the associated user
across applications. For this reason, data collected for a given profile in a specific scope is still available to any
scoped item that accesses the profile information.
It is interesting to note that there is not necessarily a one-to-one mapping between users and profiles, as users can be
captured across applications and different observation contexts. As identifying information might not be available in
all contexts in which data is collected, resolving profiles to a single physical user can become complex because
physical users are not observed directly. Rather, their portrait is progressively patched together and made clearer as
Unomi captures more and more traces of their actions. Unomi will merge related profiles as soon as collected data
permits positive association between distinct profiles, usually as a result of the user performing some identifying
action in a context where the user hadn’t already been positively identified.
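To make the idea of merging concrete, here is a deliberately simplified sketch. It is not Unomi's merge implementation: the use of `email` as the identifying property, the `merge_profiles` function and the "first value seen wins" policy are all assumptions made for this example.

```python
from typing import Any, Dict, List


def merge_profiles(profiles: List[Dict[str, Any]], key: str = "email") -> List[Dict[str, Any]]:
    """Merge profiles that share the same value for an identifying property."""
    merged: Dict[Any, Dict[str, Any]] = {}
    anonymous: List[Dict[str, Any]] = []
    for profile in profiles:
        identity = profile["properties"].get(key)
        if identity is None:
            # Nothing identifies this profile yet, so it stays separate.
            anonymous.append(profile)
        elif identity not in merged:
            # First profile seen for this identity becomes the master copy.
            merged[identity] = {"itemId": profile["itemId"],
                                "properties": dict(profile["properties"])}
        else:
            # Later profiles only contribute properties the master lacks.
            for name, value in profile["properties"].items():
                merged[identity]["properties"].setdefault(name, value)
    return list(merged.values()) + anonymous


profiles = [
    {"itemId": "p1", "properties": {"email": "jane@example.com", "firstName": "Jane"}},
    {"itemId": "p2", "properties": {"lastVisitedPage": "/home"}},
    {"itemId": "p3", "properties": {"email": "jane@example.com", "city": "Geneva"}},
]
result = merge_profiles(profiles)
print(result)
```

Profiles `p1` and `p3` collapse into one because both carry the same identifying value, while `p2` remains a separate, still anonymous profile until more data is collected.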
=== Sessions
A session represents a time-bounded interaction between a user (via their associated profile) and a Unomi-enabled
application. It captures the sequence of actions the user performed while the session was active. For this reason,
events are associated with the session during which they occurred. In the context of web applications, sessions are
usually linked to HTTP sessions.
=== Segments
Segments are used to group profiles together, and are based on conditions that are evaluated against profiles to
determine whether they belong to the segment. This also means that a profile may enter or leave a segment when its
properties change, making segments a highly dynamic concept.
Here is an example of a simple segment definition registered using the REST API:
[source]
----
curl -X POST http://localhost:8181/cxs/segments \
  --user karaf:karaf \
  -H "Content-Type: application/json" \
  -d @- <<'EOF'
{
  "metadata": {
    "id": "leads",
    "name": "Leads",
    "scope": "systemscope",
    "description": "You can customize the list below by editing the leads segment.",
    "readOnly": true
  },
  "condition": {
    "type": "booleanCondition",
    "parameterValues": {
      "operator": "and",
      "subConditions": [
        {
          "type": "profilePropertyCondition",
          "parameterValues": {
            "propertyName": "properties.leadAssignedTo",
            "comparisonOperator": "exists"
          }
        }
      ]
    }
  }
}
EOF
----
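The dynamic nature of segments can be sketched as follows. This is only an illustration of the idea, not how Unomi evaluates segments internally: `matches_leads_segment` and `update_segments` are hypothetical functions, and storing the result in a `segments` list on a plain dictionary is an assumption of the example.

```python
from typing import Any, Dict, Set


def matches_leads_segment(profile: Dict[str, Any]) -> bool:
    """Mimic the `exists` check on `properties.leadAssignedTo` (illustrative only)."""
    return profile.get("properties", {}).get("leadAssignedTo") is not None


def update_segments(profile: Dict[str, Any]) -> Set[str]:
    """Recompute the profile's segments from its current properties."""
    segments = set(profile.get("segments", []))
    if matches_leads_segment(profile):
        segments.add("leads")
    else:
        segments.discard("leads")
    profile["segments"] = sorted(segments)
    return segments


profile = {"itemId": "p1", "properties": {}}
before = update_segments(profile)           # property missing: not a lead
profile["properties"]["leadAssignedTo"] = "sales-team"
entered = update_segments(profile)          # property now exists: enters the segment
del profile["properties"]["leadAssignedTo"]
left = update_segments(profile)             # property removed: leaves the segment
print(before, entered, left)
```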
For more details on conditions and how segments are built from them, see the next section.
=== Conditions
Conditions are a central notion in Apache Unomi, as they are used as the basis for multiple other objects.
Conditions may be used as parts of:
- Segments
- Rules
- Queries
- Campaigns
- Goals
- Profile filters
A condition is composed of two basic elements:
- a condition type identifier
- a list of parameter values for the condition, that can be of any type, and in some cases may include sub-conditions
A condition type identifier is a string that uniquely identifies a condition type. Example condition types
include `booleanCondition`, `eventTypeCondition`, `eventPropertyCondition`, and so on. Plugins may add new
condition types that implement whatever logic is needed. The parameter values are simply named objects that
configure the condition. In the case of a `booleanCondition`, for example, one of the parameter values is
an `operator` that contains a value such as `and` or `or`, and a second parameter value called `subConditions`
contains a list of conditions to evaluate with that operator. The result of a condition is always a boolean
value of true or false.
Apache Unomi provides quite a lot of built-in condition types, including boolean types that make it possible to
compose conditions using operators such as `and`, `or` or `not`. Composition is an essential element of building more
complex conditions.
Here is an example of a complex condition:
[source,json]
----
{
  "condition": {
    "type": "booleanCondition",
    "parameterValues": {
      "operator": "or",
      "subConditions": [
        {
          "type": "eventTypeCondition",
          "parameterValues": {
            "eventTypeId": "sessionCreated"
          }
        },
        {
          "type": "eventTypeCondition",
          "parameterValues": {
            "eventTypeId": "sessionReassigned"
          }
        }
      ]
    }
  }
}
----
As we can see in the above example, we use the boolean `or` condition to check whether the event type is `sessionCreated`
or `sessionReassigned`.
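Because conditions nest, evaluating them is naturally recursive. The toy evaluator below handles only the two condition types used in the example and is a sketch of the principle, not Unomi's condition evaluation code; the `evaluate` function and the dictionary representation of events are assumptions.

```python
from typing import Any, Dict


def evaluate(condition: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Evaluate a condition tree against an event (toy evaluator, two types only)."""
    ctype = condition["type"]
    params = condition["parameterValues"]
    if ctype == "eventTypeCondition":
        return event.get("eventType") == params["eventTypeId"]
    if ctype == "booleanCondition":
        # Recurse into each sub-condition, then combine with the operator.
        results = [evaluate(sub, event) for sub in params["subConditions"]]
        if params["operator"] == "and":
            return all(results)
        if params["operator"] == "or":
            return any(results)
        raise ValueError(f"unsupported operator: {params['operator']}")
    raise ValueError(f"unsupported condition type: {ctype}")


session_condition = {
    "type": "booleanCondition",
    "parameterValues": {
        "operator": "or",
        "subConditions": [
            {"type": "eventTypeCondition", "parameterValues": {"eventTypeId": "sessionCreated"}},
            {"type": "eventTypeCondition", "parameterValues": {"eventTypeId": "sessionReassigned"}},
        ],
    },
}
print(evaluate(session_condition, {"eventType": "sessionCreated"}))
print(evaluate(session_condition, {"eventType": "view"}))
```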
For a more complete list of available conditions, see the <<Built-in conditions>> reference section.
=== Rules
image::unomi-rule-engine.png[Unomi Rule Engine]
Apache Unomi has a built-in rule engine that is one of the most important components of its architecture. Every time
an event is received by the server, it is evaluated against all the rules and the ones matching the incoming event will
be executed. You can think of a rule as a structure that looks like this:
    when
        conditions
    then
        actions
Basically when a rule is evaluated, all the conditions in the `when` part are evaluated and if the result matches
(meaning it evaluates to `true`) then the actions will be executed in sequence.
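The when/then structure can be sketched in a few lines. This is a conceptual model only: the `Rule` class, the `process_event` function and the rule names are hypothetical, and conditions and actions are reduced to plain Python callables.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


@dataclass
class Rule:
    name: str
    condition: Callable[[Event], bool]      # the `when` part
    actions: List[Callable[[Event], None]]  # the `then` part


def process_event(event: Event, rules: List[Rule]) -> List[str]:
    """Run every rule whose condition matches; return the names of the fired rules."""
    fired = []
    for rule in rules:
        if rule.condition(event):
            for action in rule.actions:  # actions run in sequence
                action(event)
            fired.append(rule.name)
    return fired


log: List[str] = []
rules = [
    Rule("greet-new-session",
         lambda e: e["eventType"] == "sessionCreated",
         [lambda e: log.append("tag profile"), lambda e: log.append("notify CRM")]),
    Rule("count-page-views",
         lambda e: e["eventType"] == "view",
         [lambda e: log.append("increment counter")]),
]
fired = process_event({"eventType": "sessionCreated"}, rules)
print(fired, log)
```

Only the first rule fires for a `sessionCreated` event, and its two actions run in the order in which they were declared.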
The real power of Apache Unomi comes from the fact that `conditions` and `actions` are fully pluggable and that plugins
may implement new conditions and/or actions to perform any task. You can imagine conditions checking incoming event data
against third-party systems or even against authentication systems, and actions pulling data from or pushing data to
third-party systems.
For example, the Salesforce CRM connector is simply a set of actions that pull and push data into the CRM. It is then
just a matter of setting up the proper rules with the proper conditions to determine when and how the data will be
pulled or pushed into the third-party system.
==== Actions
Actions are executed by rules in sequence, and an action is only executed once the previous action has finished
executing. If an action throws an exception, it is logged and the execution sequence continues, except in the
case of a runtime exception (such as a `NullPointerException`).
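The error-handling behavior described above can be approximated as follows. Python has no checked exceptions, so this sketch uses a hypothetical `ActionError` class to stand in for the recoverable kind and lets every other exception play the role of a Java runtime exception; none of these names come from Unomi.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger("rules")


class ActionError(Exception):
    """Hypothetical recoverable failure, standing in for a checked Java exception."""


def run_actions(actions: List[Callable[[Dict[str, Any]], None]],
                event: Dict[str, Any]) -> int:
    """Run actions in order; log recoverable errors, let anything else abort the sequence."""
    completed = 0
    for action in actions:
        try:
            action(event)
            completed += 1
        except ActionError as error:
            # Recoverable: log it and move on to the next action.
            logger.warning("action %s failed: %s", action.__name__, error)
    return completed


def failing_webhook(event: Dict[str, Any]) -> None:
    raise ActionError("webhook unreachable")


def set_property(event: Dict[str, Any]) -> None:
    event["handled"] = True


def broken_action(event: Dict[str, Any]) -> None:
    raise RuntimeError("unexpected bug")


event: Dict[str, Any] = {}
completed = run_actions([failing_webhook, set_property], event)
print(completed, event)
```

Passing `broken_action` ahead of `set_property` would instead raise `RuntimeError` out of `run_actions`, and the remaining actions would not run.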
Actions are implemented as Java classes, and as such can perform any kind of task, such as calling webhooks,
setting profile properties, extracting data from the incoming request (for example resolving a location from an IP
address), or even pulling data from and pushing data to third-party systems such as a CRM server.
Apache Unomi also comes with built-in actions. You may find the list of built-in actions in the <<Built-in actions>> section.
=== Request flow
Here is an overview of how Unomi processes incoming requests to the `ContextServlet`.
image::unomi-request.png[Unomi request overview]