Comprehensions
Comprehensions provide a powerful way of defining new collections in terms of some other data, and are the primary way of writing checks. Comprehensions were briefly introduced above. Here we provide further details.
Before diving into the notation, let's look at the essential ideas of comprehensions as they occur in everyday, informal language. Sometimes, we need to talk about collections of things without actually listing them out explicitly. For example, if you are talking about a particular car, you could hear someone talk about:
- The prices of the parts in this car that are made of steel and weigh over 500 grams
This phrase is referring to a specific collection of things (some specific prices) without actually listing them out.
Of course, you could list these prices out explicitly, such as $100, $2.30, and so on, but this would be a very
inefficient way of communicating.
Instead, in this phrase, the collection of prices is expressed using the following ingredients:
- A condition that defines a set of things, namely the set of things that satisfy the condition. In this example, the condition is that the thing must be a part of this car, must be made of steel, and must weigh over 500 grams.
- The attribute of the things of interest. In the example above the attribute is price.
A comprehension is just a precise notation for expressing the above sorts of ideas.
For example, we could write the above collection of prices in comprehension notation as follows,
where we assume that we can refer to a variable thisCar to stand for "this car" in the statements above,
and that the car and parts have attributes as needed for the example:
   foreach part in thisCar.parts where part.material == "steel" && part.weightInGrams > 500 select part.price
// |------- Qualifier 1 -------| |--------------------- Qualifier 2 ----------------------|
// |------------------------------------ Condition --------------------------------------| |-- Attribute --|
Let's dissect this example. The comprehension has a condition and an attribute.
The condition is itself built out of any number of qualifiers. In this example, there are two qualifiers.
The first qualifier defines a condition, namely membership in the thisCar.parts collection, and it also gives a name
to stand for any item satisfying this condition, namely part.
Giving each item satisfying the condition a name is useful in order to specify further conditions on the items.
In fact, the second qualifier, which starts with the where keyword, does exactly this:
it requires that the matching parts also satisfy some conditions on materials and weight, using the name for the thing,
part, just introduced in the first qualifier in order to express this additional condition.
Finally, the attribute indicates that the things in the collection are the prices of the satisfying parts.
Let's use some sample data to illustrate the above ideas and see how this comprehension works in detail.
Suppose we have the following parts, which we've named P1, P2 and P3 in order to easily refer to them.
We define these parts as NQE declarations, so that we can use them in writing the above comprehension.
We also define thisCar that has exactly these parts:
P1 = {name: "muffler", material: "steel", weightInGrams: 5000, price: 17};
P2 = {name: "door handle", material: "plastic", weightInGrams: 300, price: 1};
P3 = {name: "front seat", material: "fabric", weightInGrams: 5000, price: 86};
thisCar = {parts: [P1, P2, P3]};
With these declarations, we can try out the query above. For example, you can copy and run the following to try out our example:
P1 = {name: "muffler", material: "steel", weightInGrams: 5000, price: 17};
P2 = {name: "door handle", material: "plastic", weightInGrams: 300, price: 1};
P3 = {name: "front seat", material: "fabric", weightInGrams: 5000, price: 86};
thisCar = {parts: [P1, P2, P3]};
foreach part in thisCar.parts
where part.material == "steel" && part.weightInGrams > 500
select {price: part.price}
The main query here is a slightly-modified form of the original example. Specifically, the only change is that the select expression is a record with the price as its single field. We make this change to satisfy a technical requirement on top-level queries. See Section "Top-level Expression" for details on this.
After the first qualifier, namely foreach part in thisCar.parts, we have a set of satisfying assignments for part.
They are simply all the parts. We can list the satisfying assignments in a table, with one column for each variable
introduced in the qualifiers so far. In this case, we have a single variable, part, so the table has a single
column for this variable:
| part |
|---|
| {name: "muffler", material: "steel", weightInGrams: 5000, price: 17} |
| {name: "door handle", material: "plastic", weightInGrams: 300, price: 1} |
| {name: "front seat", material: "fabric", weightInGrams: 5000, price: 86} |
The second qualifier imposes the condition part.material == "steel" && part.weightInGrams > 500, which limits the set
of satisfying assignments to:
| part |
|---|
| {name: "muffler", material: "steel", weightInGrams: 5000, price: 17} |
Finally, the select computes the attribute value for each satisfying part, of which there is only one:
| part | attribute |
|---|---|
| {name: "muffler", material: "steel", weightInGrams: 5000, price: 17} | {price: 17} |
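The same three steps can be mimicked outside NQE to make the intermediate tables concrete. The following is only a Python analogue, not NQE itself: the dictionary layout and the names this_car, after_foreach, and after_where are assumptions made for this sketch.

```python
# Python analogue of the NQE example; the dicts mirror the P1..P3 records.
P1 = {"name": "muffler", "material": "steel", "weightInGrams": 5000, "price": 17}
P2 = {"name": "door handle", "material": "plastic", "weightInGrams": 300, "price": 1}
P3 = {"name": "front seat", "material": "fabric", "weightInGrams": 5000, "price": 86}
this_car = {"parts": [P1, P2, P3]}

# Qualifier 1 (foreach): every part is a satisfying assignment for `part`.
after_foreach = [part for part in this_car["parts"]]

# Qualifier 2 (where): keep only the assignments that meet the condition.
after_where = [
    part
    for part in after_foreach
    if part["material"] == "steel" and part["weightInGrams"] > 500
]

# select: compute the attribute for each remaining assignment.
result = [{"price": part["price"]} for part in after_where]
print(result)  # [{'price': 17}]
```

Each intermediate list corresponds to one of the tables above: three rows after the foreach, one row after the where, and one attribute record after the select.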
A Second Example: Two Variables
By using more qualifiers, comprehensions can define more complex collections. For example, suppose we wanted to refer to
the total price of pairs of parts that have a part-subpart relationship where the parent part and child parts are made
of different materials. If we assume that each part has a parent field that specifies the name of its parent (if any),
then we could express that precisely with the following comprehension:
foreach parent in thisCar.parts
foreach child in thisCar.parts
where parent.name == child.parent
where parent.material != child.material
select {totalPrice: parent.price + child.price}
Here the condition is about two parts that satisfy some specific relationship to each other. Therefore, the set of
things that satisfy the condition is a set of pairs of parts. The names introduced for each part (parent and child),
are useful when expressing the conditions. Again, the attribute expression indicates that the set consists of the total
price for each pair of parts satisfying the given condition.
We can illustrate this more complex query by seeing how each qualifier changes the set of satisfying assignments to
variables. For this example, let's suppose that we extend the parts with the parent field:
P1 = {name: "muffler", material: "steel", price: 17, parent: ""};
P2 = {name: "door handle", material: "plastic", price: 1, parent: "door"};
P3 = {name: "door", material: "steel", price: 10, parent: ""};
P4 = {name: "muffler widget", material: "steel", price: 2, parent: "muffler"};
thisCar = {parts: [P1, P2, P3, P4]};
After the first qualifier introduces the parent variable, the set of satisfying assignments consists of all the parts:
| parent |
|---|
| {name: "muffler", material: "steel", price: 17, parent: ""} |
| {name: "door handle", material: "plastic", price: 1, parent: "door"} |
| {name: "door", material: "steel", price: 10, parent: ""} |
| {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} |
After the second qualifier introduces the child variable, the set of satisfying assignments consists of all combinations of
parts, since no filtering has been applied to parent or child yet:
| parent | child |
|---|---|
| {name: "muffler", material: "steel", price: 17, parent: ""} | {name: "muffler", material: "steel", price: 17, parent: ""} |
| {name: "muffler", material: "steel", price: 17, parent: ""} | {name: "door handle", material: "plastic", price: 1, parent: "door"} |
| {name: "muffler", material: "steel", price: 17, parent: ""} | {name: "door", material: "steel", price: 10, parent: ""} |
| {name: "muffler", material: "steel", price: 17, parent: ""} | {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} |
| {name: "door handle", material: "plastic", price: 1, parent: "door"} | {name: "muffler", material: "steel", price: 17, parent: ""} |
| {name: "door handle", material: "plastic", price: 1, parent: "door"} | {name: "door handle", material: "plastic", price: 1, parent: "door"} |
| {name: "door handle", material: "plastic", price: 1, parent: "door"} | {name: "door", material: "steel", price: 10, parent: ""} |
| {name: "door handle", material: "plastic", price: 1, parent: "door"} | {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} |
| {name: "door", material: "steel", price: 10, parent: ""} | {name: "muffler", material: "steel", price: 17, parent: ""} |
| {name: "door", material: "steel", price: 10, parent: ""} | {name: "door handle", material: "plastic", price: 1, parent: "door"} |
| {name: "door", material: "steel", price: 10, parent: ""} | {name: "door", material: "steel", price: 10, parent: ""} |
| {name: "door", material: "steel", price: 10, parent: ""} | {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} |
| {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} | {name: "muffler", material: "steel", price: 17, parent: ""} |
| {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} | {name: "door handle", material: "plastic", price: 1, parent: "door"} |
| {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} | {name: "door", material: "steel", price: 10, parent: ""} |
| {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} | {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} |
The third qualifier limits this set of assignments to just those where the child.parent matches parent.name:
| parent | child |
|---|---|
| {name: "muffler", material: "steel", price: 17, parent: ""} | {name: "muffler widget", material: "steel", price: 2, parent: "muffler"} |
| {name: "door", material: "steel", price: 10, parent: ""} | {name: "door handle", material: "plastic", price: 1, parent: "door"} |
The fourth qualifier then limits this to just the one case where the parent material is different from the child material:
| parent | child |
|---|---|
| {name: "door", material: "steel", price: 10, parent: ""} | {name: "door handle", material: "plastic", price: 1, parent: "door"} |
Finally, the select clause computes the attribute for each satisfying assignment:
| parent | child | attribute |
|---|---|---|
| {name: "door", material: "steel", price: 10, parent: ""} | {name: "door handle", material: "plastic", price: 1, parent: "door"} | {totalPrice: 11} |
You can continue this way, by using more qualifiers to express conditions on any number of items. In every case, the condition specifies a set of assignments to variables defined in the condition. If there are 3 variables introduced in the condition, then each satisfying assignment provides a value for the 3 variables, and so on.
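To see how the number of satisfying assignments shrinks from qualifier to qualifier, the two-variable example can be traced in ordinary code. This is a Python sketch of the semantics described above rather than NQE; the list names pairs, related, and mixed are made up for illustration, and the data follows the P1 to P4 declarations.

```python
from itertools import product

P1 = {"name": "muffler", "material": "steel", "price": 17, "parent": ""}
P2 = {"name": "door handle", "material": "plastic", "price": 1, "parent": "door"}
P3 = {"name": "door", "material": "steel", "price": 10, "parent": ""}
P4 = {"name": "muffler widget", "material": "steel", "price": 2, "parent": "muffler"}
parts = [P1, P2, P3, P4]

# Two foreach qualifiers: all (parent, child) combinations, 4 x 4 = 16 rows.
pairs = list(product(parts, parts))

# First where: the child's parent field must name the parent part.
related = [(p, c) for p, c in pairs if p["name"] == c["parent"]]

# Second where: the two materials must differ.
mixed = [(p, c) for p, c in related if p["material"] != c["material"]]

# select: one record per surviving assignment.
result = [{"totalPrice": p["price"] + c["price"]} for p, c in mixed]
print(len(pairs), len(related), len(mixed), result)
# 16 2 1 [{'totalPrice': 11}]
```

Each tuple in these lists plays the role of one table row: it supplies a value for every variable introduced so far.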
How Qualifiers Find Satisfying Assignments to Variables
In some cases, qualifiers can be reordered, but in general the order of qualifiers is important. Specifically, qualifiers are read from left-to-right. This matters because:
- Every qualifier can only make use of variables in scope at the qualifier.
- A foreach qualifier introduces a new variable, which is in scope in the subsequent qualifiers. Similarly, let and group-by qualifiers, introduced in the next section, also introduce variables that are in scope in the subsequent qualifiers.
That means that this is legal:
foreach part in thisCar.parts
where part.material == "steel" && part.weightInGrams > 500
select part.price
but this is not:
where part.material == "steel" && part.weightInGrams > 500
foreach part in thisCar.parts
select part.price
because the first qualifier refers to a variable part which is not in scope at the first qualifier.
On the other hand, in some cases, the order does not matter. For example:
foreach part1 in thisCar.parts
foreach part2 in thisCar.parts
where part1 != part2 && part1.weightInGrams == part2.weightInGrams
select part1.price + part2.price
could be rewritten as:
foreach part2 in thisCar.parts
foreach part1 in thisCar.parts
where part1 != part2 && part1.weightInGrams == part2.weightInGrams
select part1.price + part2.price
(first two qualifiers swapped) without affecting the meaning.
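As a rough check of this claim, the two orderings can be compared in a Python analogue. The sketch assumes the three parts from the first example and a hypothetical helper named matches. It compares the results as sorted lists, because the text above only says the meaning is unaffected and says nothing about result order.

```python
parts = [
    {"name": "muffler", "weightInGrams": 5000, "price": 17},
    {"name": "door handle", "weightInGrams": 300, "price": 1},
    {"name": "front seat", "weightInGrams": 5000, "price": 86},
]

def matches(part1, part2):
    """The where condition: two different parts with the same weight."""
    return part1 != part2 and part1["weightInGrams"] == part2["weightInGrams"]

# foreach part1 ... foreach part2 ...
original = [
    p1["price"] + p2["price"] for p1 in parts for p2 in parts if matches(p1, p2)
]

# foreach part2 ... foreach part1 ... (first two qualifiers swapped)
swapped = [
    p1["price"] + p2["price"] for p2 in parts for p1 in parts if matches(p1, p2)
]

print(sorted(original) == sorted(swapped))  # True
```

The swap is harmless here because neither foreach refers to the variable introduced by the other; the where qualifier, which uses both variables, still comes after both.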
Additional Forms of Qualifiers
So far, we have introduced two forms of qualifiers: foreach and where qualifiers. There are two further forms of
qualifiers that you can use.
Let Qualifier
A let qualifier introduces a new variable in terms of existing variables. This allows you to write more readable
comprehensions by naming intermediate values. For example, we can avoid the duplicated x + y expression in the
following query:
foreach x in [1, 2, 3]
foreach y in [3, 4, 1]
where x + y > 4
select x + y
by defining a variable z to stand for that value:
foreach x in [1, 2, 3]
foreach y in [3, 4, 1]
let z = x + y
where z > 4
select z
As another example, you could write the following to give a short name to the device’s platform’s OS:
foreach device in network.devices
let os = device.platform.os
select {os: os}
Note: let is only allowed after the first foreach clause.
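For readers who think in terms of general-purpose languages, the effect of let in the x + y example can be approximated in Python. This is an analogy, not NQE: the single-element loop `for z in [x + y]` is a common Python idiom standing in for let, and the names without_let and with_let are illustrative.

```python
xs = [1, 2, 3]
ys = [3, 4, 1]

# Without let: the expression x + y is written twice.
without_let = [x + y for x in xs for y in ys if x + y > 4]

# With let: bind z once per (x, y) assignment, then filter and select on z.
with_let = [z for x in xs for y in ys for z in [x + y] if z > 4]

print(with_let)  # [5, 5, 6, 6, 7]
```

Both forms produce the same values; naming the intermediate value only removes the duplication.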
Group-By Qualifier
The final additional form of qualifier is the group-by qualifier. This qualifier allows you to refer to groups of items that share some common property.
Let's look at an example to illustrate the idea. Suppose we want to determine the set of part weights (and the count of parts for that weight) for which there is more than 1 part price. This set can be expressed with the following comprehension:
foreach part in thisCar.parts
group part.price as pricesForWeight by part.weightInGrams as weight
where length(pricesForWeight) > 1
select {weight: weight, count: length(pricesForWeight)}
The group-by qualifier, which is the second qualifier here, introduces two new variables, pricesForWeight and weight. Each
assignment to these two variables will correspond to a group of parts that all have the same value
for part.weightInGrams. For each group, the assignment to the weight variable will be the weight of the parts in the
group, given by the part.weightInGrams expression applied to any of the parts in the group, and the pricesForWeight
variable will have the collection of prices (given by applying part.price expression to all the parts in the group)
for the group.
Let's look at an example evaluation to see how this works. Consider our initial collection of parts above, which were:
P1 = {name: "muffler", material: "steel", weightInGrams: 5000, price: 17};
P2 = {name: "door handle", material: "plastic", weightInGrams: 300, price: 1};
P3 = {name: "front seat", material: "fabric", weightInGrams: 5000, price: 86};
thisCar = {parts: [P1, P2, P3]};
Then after the first qualifier, the satisfying assignments to variables are:
| part |
|---|
| {name: "muffler", material: "steel", weightInGrams: 5000, price: 17} |
| {name: "door handle", material: "plastic", weightInGrams: 300, price: 1} |
| {name: "front seat", material: "fabric", weightInGrams: 5000, price: 86} |
Next, after the group-by qualifier, the satisfying assignments consist of the following assignments to the weight
and pricesForWeight variables:
| weight | pricesForWeight |
|---|---|
| 300 | [1] |
| 5000 | [17, 86] |
We see that for each weight value in the satisfying assignments occurring after the first qualifier, we have one
assignment with that weight value. For each assignment, the pricesForWeight variable is assigned the list of prices
of parts with that weight value. Also, note that the satisfying assignments after the group-by qualifier no longer
include assignments to variables occurring prior to the qualifier (in this case, just the part variable).
Then, the third qualifier places an additional condition on the assignments, namely that
length(pricesForWeight) > 1. The assignments are then filtered to:
| weight | pricesForWeight |
|---|---|
| 5000 | [17, 86] |
Finally, the select clause includes the weight and total length of pricesForWeight for each group:
| weight | pricesForWeight | attribute |
|---|---|---|
| 5000 | [17, 86] | {weight: 5000, count: 2} |
In general, the group-by qualifier has the
form group valueExpression as valuesVariable by keyExpression as keyVariable. Each variable assignment produced by this
group-by corresponds to a possible value, v, of keyExpression as applied to the variable assignments satisfying
the previous qualifiers. For each such value v, the assignment puts v into keyVariable, and puts
into valuesVariable the collection of valueExpression values for the assignments where keyExpression equals v.
You can find further usages of the group statement under the Examples tab.
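The grouping step can also be pictured as building a dictionary from key to collected values. The following Python sketch mirrors the weight example under that assumption. It is not NQE, the names groups and after_group are invented for the illustration, and sorting the groups by weight is a choice made here only so the output matches the table order above.

```python
parts = [
    {"name": "muffler", "material": "steel", "weightInGrams": 5000, "price": 17},
    {"name": "door handle", "material": "plastic", "weightInGrams": 300, "price": 1},
    {"name": "front seat", "material": "fabric", "weightInGrams": 5000, "price": 86},
]

# group part.price as pricesForWeight by part.weightInGrams as weight
groups = {}
for part in parts:
    groups.setdefault(part["weightInGrams"], []).append(part["price"])

# After grouping, only weight and pricesForWeight remain; `part` is gone.
after_group = [
    {"weight": weight, "pricesForWeight": prices}
    for weight, prices in sorted(groups.items())
]

# where length(pricesForWeight) > 1
after_where = [g for g in after_group if len(g["pricesForWeight"]) > 1]

# select {weight: weight, count: length(pricesForWeight)}
result = [
    {"weight": g["weight"], "count": len(g["pricesForWeight"])} for g in after_where
]
print(result)  # [{'weight': 5000, 'count': 2}]
```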
Select Distinct
The select clause of a comprehension can use the distinct keyword. When this is used, the resulting list contains the
unique values from the list. In other words, the overall list is de-duplicated. For example:
foreach x in [1, 2, 1, 1]
select distinct {a: x}
will evaluate to the list [{a: 1}, {a: 2}], despite the repeated appearance of 1 in the list which x is ranging
over.
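A small Python analogue shows the de-duplication step. It is a sketch only: it keeps the first occurrence of each record, which reproduces the result stated above, but whether NQE defines the order of distinct results this way is an assumption of the sketch.

```python
xs = [1, 2, 1, 1]

# select {a: x} without distinct: one record per value of x.
selected = [{"a": x} for x in xs]

# select distinct {a: x}: drop records that have already been seen.
distinct = []
for record in selected:
    if record not in distinct:
        distinct.append(record)

print(distinct)  # [{'a': 1}, {'a': 2}]
```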
Nested Comprehensions
Comprehensions can appear at the top-level, or they can appear nested within another expression. However, when a comprehension is nested, it must be enclosed in parentheses. For example, the following expression is legal:
foreach d in network.devices
select {
device: d.name,
ifaceNames: (foreach iface in d.interfaces select iface.name)
}
The above returns a list of records with device and ifaceNames fields. The device field has the device name, while
the ifaceNames field has the names of all of the device’s interfaces. The value of this latter field is computed via a
nested comprehension, which is enclosed in parentheses.
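The shape of the result may be easier to picture with sample data. The Python sketch below imitates the nested query using a made-up network value; the device and interface names are hypothetical and are not taken from any real network model.

```python
# Hypothetical sample data standing in for network.devices.
network = {
    "devices": [
        {"name": "router-1", "interfaces": [{"name": "eth0"}, {"name": "eth1"}]},
        {"name": "switch-1", "interfaces": [{"name": "ge-0/0/0"}]},
    ]
}

# Outer comprehension over devices; the inner one plays the role of the
# parenthesized nested comprehension.
result = [
    {
        "device": d["name"],
        "ifaceNames": [iface["name"] for iface in d["interfaces"]],
    }
    for d in network["devices"]
]
print(result)
```

Each outer record carries its own list, so the nested comprehension is evaluated once per device.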
Technical Restrictions
A comprehension always begins with foreach ... and ends with a single select ... expression.