Aggregations
Introduction
RESTHeart manages aggregation operations: both aggregation pipelines and map reduce functions are supported.
“Aggregations operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result.”
In both cases only inline output type is supported, i.e. no result is written to the DB server.
The aggrs collection metadata
In RESTHeart, not only documents but also dbs and collections have properties. Some properties are metadata, i.e. they have a special meaning for RESTheart that influences its behavior.
The collection metadata property aggrs
allows to declare aggregation
operations and bind them to given URI.
Aggregation operations need to be defined as collection metadata. It is not possible to execute an aggregation via a query parameter and this is by design: clients are not able to execute arbitrary aggregation operations but only those defined (and tested) by the developers.
aggrs
is an array of pipeline or mapReduce objects.
aggregation pipeline metadata object format
pipeline object format
{
"type":"pipeline",
"uri": <uri>,
"stages": [
"<stage_1>",
"<stage_2>",
...
],
"allowDiskUse": boolean
}
Property | Description | Mandatory |
---|---|---|
type | for aggregation pipeline operations is "pipeline" | yes |
uri | specifies the URI when the operation is bound under the path /<db>/<collection>/_aggrs |
yes |
stages | the MongoDB aggregation pipeline stages. For more information refer to https://docs.mongodb.org/manual/core/aggregation-pipeline/ |
yes |
MongoDB does not allow to store fields with names starting with $ or containing dots (.), see Restrictions on Field Names on MongoDB documentation.
In order to allow storing stages with dollar prefixed operators or using the dot notation (to refer to properties of subdocuments), RESTHeart automatically and transparently escapes the properties keys as follows:
- the $ prefix is “underscore escaped”, e.g.
$exists
is stored as_$exists
- if the dot notation has to be used in a key name, dots are replaced
with :: e.g.
SD.prop
is stored asSD::prop
In RESTHeart 1.x, these escapes are not managed automatically: the developer had to explicitly use them; starting from version 2.0 this is not needed anymore.
mapReduce metadata object format
mapReduce object format
{
"type":"mapReduce",
"uri":"<uri>",
"map": "<map_function>",
"reduce": "<reduce_function>",
"query": "<query>"
}
Property | Description | Mandatory |
---|---|---|
type | for aggregation pipeline operations is "mapReduce" | yes |
uri | specifies the URI when the operation is bound under /<db>/<collection>/_aggrs path. | yes |
map | the map function For more information refer to https://docs.mongodb.org/manual/core/map-reduce/ |
yes |
reduce | the reduce function | yes |
query | the filter query | no |
Examples
The following requests upsert a collection defining two aggregation operations:
- aggregation operation test_ap bound at
/db/ao_test/_aggrs/test_ap
- map reduce operation test_mr bound at
/db/ao_test/_aggrs/test_mr
PUT /db/ao_test { "aggrs" : [
{ "stages" : [ { "_$match" : { "name" : { "_$var" : "n" } } },
{ "_$group" : { "_id" : "$name",
"avg_age" : { "_$avg" : "$age" }
} }
],
"type" : "pipeline",
"uri" : "test_ap"
},
{ "map" : "function() { emit(this.name, this.age) }",
"query" : { "name" : { "$var" : "n" } },
"reduce" : "function(key, values) { return Array.avg(values) }",
"type" : "mapReduce",
"uri" : "test_mr"
}
] }
Note between the _links
collection property the URIs of the
aggregation operations.
GET /db/ao_test
HTTP/1.1 200 OK
...
{
"_links": {
...,
"test_ap": {
"href": "/db/ao_test/_aggrs/test_ap"
},
"test_mr": {
"href": "/db/ao_test/_aggrs/test_mr"
}
},
....
}
Passing variables to aggregation operations
The query parameter avars
allows to pass variables to the aggregation
operations.
For example, the previous example aggregations both use a variable named
“n”. If the variable is not passed via the avars
qparam, the request
fails.
GET /test/ao_test/_aggrs/test_ap
HTTP/1.1 400 Bad Request
...
{
"_exceptions": [
{
"exception": "org.restheart.hal.metadata.QueryVariableNotBoundException",
"exception message": "variable n not bound",
...
}
]
}
Passing the variable n, the request succeeds:
GET /test/ao_test/_aggrs/test_ap?avars={"n":1}
HTTP/1.1 200 OK
...
Variables in stages or query
Variables can be used in aggregation pipeline stages and map reduce query as follows:
{ "$var": "<var_name>" }
In case of map reduce operation previous example, the variable was used to filter the documents to have the name property matching the variable n:
{
"query": { "name": { "$var": "n" } },
...
}
Variables in map or reduce functions
Variables are passed also to map and reduce javascript functions
where the variable $vars
can be used. For instance:
PUT /db/ao_test { "aggrs" : [
{ "map" : "function() { var minage = JSON.parse($vars).minage; if (this.age > minage ) { emit(this.name, this.age); }; }",
"reduce" : "function(key, values) { return Array.avg(values) } }",
"type" : "mapReduce",
"uri" : "test_mr"
}
] }
HTTP/1.1 201 Created
...
Note the map function; JSON.parse($vars)
allows to access the
variables passed with the query parameter avars
function() {
var minage = JSON.parse($vars).minage; // <-- here we get minage from avars qparam
if (this.age > minage ) { emit(this.name, this.age); }
};
Security Informations
By default RESTHeart makes sure that the aggregation variables passed as query parameters hasn’t got inside MongoDB operators.
This behaviour is required to protect data from undesiderable malicious query injection.
Even though is highly discouraged, is possible to disable this check by editing the following property into restheart.yml
configuration file.
### Security
# Check if aggregation variables use operators. allowing operators in aggregation variables
# is risky. requester can inject operators modifying the query
aggregation-check-operators: true