Map-Reduce to Aggregation Pipeline
An aggregation pipeline provides better performance and usability than a map-reduce operation.
Map-reduce operations can be rewritten using aggregation pipeline
operators, such as $group, $merge, and others.
For map-reduce operations that require custom functionality, MongoDB
provides the $accumulator and $function aggregation operators starting
in version 4.4. Use these operators to define custom aggregation
expressions in JavaScript.
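As a minimal sketch of what a custom JavaScript expression looks like, the stage below uses $function to compute a field. The field names `price` and `isLarge` and the threshold of 100 are hypothetical, chosen only for illustration. A built-in operator such as $gt would be preferable for a check this simple, so the JavaScript body only shows the shape of the operator.

```javascript
// Hypothetical stage: flags documents whose `price` exceeds a threshold.
// Field names and the threshold of 100 are illustrative assumptions.
const flagLargeOrders = {
  $addFields: {
    isLarge: {
      $function: {
        // `body` is the JavaScript to run; `args` supplies its parameters.
        body: function (price) { return price > 100; },
        args: ["$price"],
        lang: "js"
      }
    }
  }
};

// Runs only inside mongosh, where `db` is defined (MongoDB 4.4 or later).
if (typeof db !== "undefined") {
  db.orders.aggregate([flagLargeOrders]);
}
```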
Map-reduce expressions can be rewritten as shown in the following sections.
Map-Reduce to Aggregation Pipeline Translation Table
The table is only an approximate translation. For instance, the table
shows an approximate translation of mapFunction using the $project stage.
However, the mapFunction logic may require additional stages, for example
when the logic iterates over an array. In that case, the aggregation
pipeline includes an $unwind stage and a $project stage.
The emits field in $project may be named something else. The field name
emits was chosen for visual comparison.
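The array case can be sketched as follows. The map function and the `items` and `sku` field names are hypothetical, and the two stages are only an approximate counterpart of the function, in keeping with the caveat above.

```javascript
// Hypothetical map function: emits one (sku, 1) pair per element of `items`.
const mapFunction = function () {
  this.items.forEach(function (item) {
    emit(item.sku, 1);
  });
};

// Approximate pipeline counterpart: $unwind yields one document per array
// element, then $project shapes each one into an `emits` key/value document.
// $literal keeps the constant 1 from being read as a field-inclusion flag.
const mapStages = [
  { $unwind: "$items" },
  { $project: { emits: { k: "$items.sku", v: { $literal: 1 } } } }
];
```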
Map-Reduce | Aggregation Pipeline |
---|---|
db.collection.mapReduce(
   <mapFunction>,
   <reduceFunction>,
   {
      query: <queryFilter>,
      sort: <sortOrder>,
      limit: <number>,
      finalize: <finalizeFunction>,
      out: <collection>
   }
)
|
db.collection.aggregate( [
   { $match: <queryFilter> },
   { $sort: <sortOrder> },
   { $limit: <number> },
   { $project: { emits: { k: <expression>, v: <expression> } } },
   { $unwind: "$emits" },
   { $group: {
      _id: "$emits.k",
      value: { $accumulator: {
         init: <initCode>,
         accumulate: <reduceFunction>,
         accumulateArgs: [ "$emits.v" ],
         merge: <reduceFunction>,
         finalize: <finalizeFunction>,
         lang: "js" } }
   } },
   { $out: <collection> }
] )
|
db.collection.mapReduce(
   <mapFunction>,
   <reduceFunction>,
   {
      query: <queryFilter>,
      sort: <sortOrder>,
      limit: <number>,
      finalize: <finalizeFunction>,
      out: { replace: <collection>, db: <db> }
   }
)
|
db.collection.aggregate( [
   { $match: <queryFilter> },
   { $sort: <sortOrder> },
   { $limit: <number> },
   { $project: { emits: { k: <expression>, v: <expression> } } },
   { $unwind: "$emits" },
   { $group: {
      _id: "$emits.k",
      value: { $accumulator: {
         init: <initCode>,
         accumulate: <reduceFunction>,
         accumulateArgs: [ "$emits.v" ],
         merge: <reduceFunction>,
         finalize: <finalizeFunction>,
         lang: "js" } }
   } },
   { $out: { db: <db>, coll: <collection> } }
] )
|
db.collection.mapReduce(
   <mapFunction>,
   <reduceFunction>,
   {
      query: <queryFilter>,
      sort: <sortOrder>,
      limit: <number>,
      finalize: <finalizeFunction>,
      out: { merge: <collection>, db: <db> }
   }
)
|
db.collection.aggregate( [
   { $match: <queryFilter> },
   { $sort: <sortOrder> },
   { $limit: <number> },
   { $project: { emits: { k: <expression>, v: <expression> } } },
   { $unwind: "$emits" },
   { $group: {
      _id: "$emits.k",
      value: { $accumulator: {
         init: <initCode>,
         accumulate: <reduceFunction>,
         accumulateArgs: [ "$emits.v" ],
         merge: <reduceFunction>,
         finalize: <finalizeFunction>,
         lang: "js" } }
   } },
   { $merge: {
      into: { db: <db>, coll: <collection> },
      on: "_id",
      whenMatched: "replace",
      whenNotMatched: "insert"
   } }
] )
|
db.collection.mapReduce(
   <mapFunction>,
   <reduceFunction>,
   {
      query: <queryFilter>,
      sort: <sortOrder>,
      limit: <number>,
      finalize: <finalizeFunction>,
      out: { reduce: <collection>, db: <db> }
   }
)
|
db.collection.aggregate( [
   { $match: <queryFilter> },
   { $sort: <sortOrder> },
   { $limit: <number> },
   { $project: { emits: { k: <expression>, v: <expression> } } },
   { $unwind: "$emits" },
   { $group: {
      _id: "$emits.k",
      value: { $accumulator: {
         init: <initCode>,
         accumulate: <reduceFunction>,
         accumulateArgs: [ "$emits.v" ],
         merge: <reduceFunction>,
         finalize: <finalizeFunction>,
         lang: "js" } }
   } },
   { $merge: {
      into: { db: <db>, coll: <collection> },
      on: "_id",
      whenMatched: [
         { $project: {
            value: { $function: {
               body: <reduceFunction>,
               args: [
                  "$_id",
                  [ "$value", "$$new.value" ]
               ],
               lang: "js"
            } }
         } }
      ],
      whenNotMatched: "insert"
   } }
] )
|
db.collection.mapReduce(
   <mapFunction>,
   <reduceFunction>,
   {
      query: <queryFilter>,
      sort: <sortOrder>,
      limit: <number>,
      finalize: <finalizeFunction>,
      out: { inline: 1 }
   }
)
|
db.collection.aggregate( [
   { $match: <queryFilter> },
   { $sort: <sortOrder> },
   { $limit: <number> },
   { $project: { emits: { k: <expression>, v: <expression> } } },
   { $unwind: "$emits" },
   { $group: {
      _id: "$emits.k",
      value: { $accumulator: {
         init: <initCode>,
         accumulate: <reduceFunction>,
         accumulateArgs: [ "$emits.v" ],
         merge: <reduceFunction>,
         finalize: <finalizeFunction>,
         lang: "js" } }
   } }
] )
|
Examples
Various map-reduce expressions can be rewritten using aggregation
pipeline operators, such as $group, $merge, and others, without requiring
custom functions. However, for illustrative purposes, the following
examples provide both alternatives.
Example 1
Consider a map-reduce operation on the orders collection that groups
by cust_id and calculates the sum of price for each cust_id.
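A sketch of such an operation is shown below. The source names neither the functions nor the output collection, so `mapFunction1`, `reduceFunction1`, and `map_reduce_example` are placeholder names, and the standard `Array.prototype.reduce` is used for the sum.

```javascript
// Map function: emits one (cust_id, price) pair per order document.
const mapFunction1 = function () {
  emit(this.cust_id, this.price);
};

// Reduce function: sums all prices emitted for a single cust_id.
const reduceFunction1 = function (keyCustId, valuesPrices) {
  return valuesPrices.reduce(function (total, price) {
    return total + price;
  }, 0);
};

// Runs only inside mongosh, where `db` is defined.
// "map_reduce_example" is a hypothetical output collection name.
if (typeof db !== "undefined") {
  db.orders.mapReduce(mapFunction1, reduceFunction1, {
    out: "map_reduce_example"
  });
}
```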
Alternative 1: (Recommended) You can rewrite the operation into an aggregation pipeline without translating the map-reduce functions to equivalent pipeline stages.
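One way this rewrite might look, using only built-in operators, is sketched below. The output collection name `agg_alternative_1` and the result field name `value` are assumptions, picked to parallel the naming used elsewhere in this section.

```javascript
// $group with $sum replaces both the map and the reduce function.
const pipelineAlt1 = [
  { $group: { _id: "$cust_id", value: { $sum: "$price" } } },
  // "agg_alternative_1" is an assumed output collection name.
  { $out: "agg_alternative_1" }
];

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.orders.aggregate(pipelineAlt1);
}
```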
Alternative 2: (For illustrative purposes only) The aggregation
pipeline can also translate the individual map-reduce functions,
using $accumulator to define custom functions. Such a pipeline has
three stages:
First, the $project stage outputs documents with an emit field. The emit field is a document with the fields:
- key, which contains the cust_id value for the document
- value, which contains the price value for the document
Then, the $group stage uses the $accumulator operator to add the emitted values.
Finally, the $out stage writes the output to the collection agg_alternative_2. Alternatively, you could use $merge instead of $out.
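The three stages described above can be sketched as one pipeline. The bodies of the `init`, `accumulate`, and `merge` functions are assumptions that reproduce a running sum. Only the stage order, the `emit` field layout, and the collection name `agg_alternative_2` come from the text.

```javascript
const pipelineAlt2 = [
  // Stage 1: shape each order into an emit document (key, value).
  { $project: { emit: { key: "$cust_id", value: "$price" } } },
  // Stage 2: add up the emitted values per key with custom JavaScript.
  { $group: {
      _id: "$emit.key",
      value: { $accumulator: {
        init: function () { return 0; },
        accumulate: function (state, value) { return state + value; },
        accumulateArgs: ["$emit.value"],
        // merge combines two partial sums.
        merge: function (state1, state2) { return state1 + state2; },
        lang: "js"
      } }
  } },
  // Stage 3: write the results to the output collection.
  { $out: "agg_alternative_2" }
];

// Runs only inside mongosh, where `db` is defined (MongoDB 4.4 or later).
if (typeof db !== "undefined") {
  db.orders.aggregate(pipelineAlt2);
}
```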
Example 2
Consider a map-reduce operation on the orders collection that groups
by the items.sku field and calculates the number of orders and the
total quantity ordered for each sku. The operation then calculates
the average quantity per order for each sku value and merges the
results into the output collection.
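A sketch of such an operation follows. The function names and the output collection `map_reduce_example2` are placeholders not given in the source. The `ord_date` filter is taken from the stage description later in this example.

```javascript
// Map: emit one (sku, { count, qty }) pair per element of the items array.
const mapFunction2 = function () {
  for (let idx = 0; idx < this.items.length; idx++) {
    const key = this.items[idx].sku;
    const value = { count: 1, qty: this.items[idx].qty };
    emit(key, value);
  }
};

// Reduce: add up the counts and quantities emitted for one sku.
const reduceFunction2 = function (keySKU, countObjVals) {
  const reducedVal = { count: 0, qty: 0 };
  for (let idx = 0; idx < countObjVals.length; idx++) {
    reducedVal.count += countObjVals[idx].count;
    reducedVal.qty += countObjVals[idx].qty;
  }
  return reducedVal;
};

// Finalize: derive the average quantity per order from the reduced totals.
const finalizeFunction2 = function (key, reducedVal) {
  reducedVal.avg = reducedVal.qty / reducedVal.count;
  return reducedVal;
};

// Runs only inside mongosh, where `db` is defined.
// "map_reduce_example2" is a hypothetical output collection name.
if (typeof db !== "undefined") {
  db.orders.mapReduce(mapFunction2, reduceFunction2, {
    out: { merge: "map_reduce_example2" },
    query: { ord_date: { $gte: new Date("2020-03-01") } },
    finalize: finalizeFunction2
  });
}
```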
Alternative 1: (Recommended) You can rewrite the operation into an aggregation pipeline without translating the map-reduce functions to equivalent pipeline stages.
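One possible form of this rewrite, using only built-in operators, is sketched below. The output collection name `agg_alternative_3` and the intermediate field name `orders_ids` are assumptions. Counting distinct order `_id` values with $addToSet is one way to obtain the number of orders per sku.

```javascript
const pipelineAlt3 = [
  // Keep only orders on or after the cutoff date.
  { $match: { ord_date: { $gte: new Date("2020-03-01") } } },
  // One document per element of the items array.
  { $unwind: "$items" },
  // Total quantity and the set of distinct orders for each sku.
  { $group: {
      _id: "$items.sku",
      qty: { $sum: "$items.qty" },
      orders_ids: { $addToSet: "$_id" }
  } },
  // Derive count and avg from the grouped totals.
  { $project: {
      value: {
        count: { $size: "$orders_ids" },
        qty: "$qty",
        avg: { $divide: ["$qty", { $size: "$orders_ids" }] }
      }
  } },
  // "agg_alternative_3" is an assumed output collection name.
  { $merge: {
      into: "agg_alternative_3",
      on: "_id",
      whenMatched: "replace",
      whenNotMatched: "insert"
  } }
];

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.orders.aggregate(pipelineAlt3);
}
```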
Alternative 2: (For illustrative purposes only) The aggregation
pipeline can also translate the individual map-reduce functions,
using $accumulator to define custom functions. Such a pipeline has
five stages:
The $match stage selects only those documents with ord_date greater than or equal to new Date("2020-03-01").
The $unwind stage breaks down the document by the items array field to output a document for each array element. For example, an order whose items array holds two elements produces two output documents, one per element.
The $project stage outputs documents with an emit field. The emit field is a document with the fields:
- key, which contains the items.sku value
- value, which contains a document with the qty value and a count value
The $group stage uses the $accumulator operator to add the emitted count and qty and calculate the avg field.
Finally, the $merge stage writes the output to the collection agg_alternative_4. If an existing document has the same key _id as the new result, the operation overwrites the existing document. If there is no existing document with the same key, the operation inserts the document.
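The five stages described above can be sketched as one pipeline. The function bodies inside $accumulator are assumptions that follow the description (add count and qty, then compute avg). The stage order, field names, and the collection name `agg_alternative_4` come from the text.

```javascript
const pipelineAlt4 = [
  { $match: { ord_date: { $gte: new Date("2020-03-01") } } },
  { $unwind: "$items" },
  // $literal keeps the constant 1 from being read as a field-inclusion flag.
  { $project: { emit: {
      key: "$items.sku",
      value: { count: { $literal: 1 }, qty: "$items.qty" }
  } } },
  { $group: {
      _id: "$emit.key",
      value: { $accumulator: {
        init: function () { return { count: 0, qty: 0 }; },
        accumulate: function (state, value) {
          state.count += value.count;
          state.qty += value.qty;
          return state;
        },
        accumulateArgs: ["$emit.value"],
        // merge combines two partial states.
        merge: function (state1, state2) {
          return {
            count: state1.count + state2.count,
            qty: state1.qty + state2.qty
          };
        },
        // finalize adds the average quantity per order.
        finalize: function (state) {
          state.avg = state.qty / state.count;
          return state;
        },
        lang: "js"
      } }
  } },
  { $merge: {
      into: "agg_alternative_4",
      on: "_id",
      whenMatched: "replace",
      whenNotMatched: "insert"
  } }
];

// Runs only inside mongosh, where `db` is defined (MongoDB 4.4 or later).
if (typeof db !== "undefined") {
  db.orders.aggregate(pipelineAlt4);
}
```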