⚡  ElasticsearchBook.com is crafted by Jozef Sorocin.

In the previous chapter we discussed the usefulness of bucket_script aggregations, which allow for per-bucket computations. Combining them with a scripted_metric aggregation opens up further practical applications.

Let me illustrate.

Use Case: Device Fleet Health

I have a fleet of devices, each of which posts a message to ES every 10 minutes in the form of:

{
    "deviceId": "unique-device-id",
    "timestamp": "2021-01-19 06:54:00",
    "message" : "morning ping at 06:54 AM"
}

I'm trying to get a sense of the health of this fleet by finding devices that haven't reported anything in a given period of time. What I dream of is getting:

  1. the total count of distinct deviceIds seen in the last 7 days
  2. the total count of deviceIds NOT seen in the last hour
  3. the IDs of the devices that stopped reporting (→ reported in the last 2 hours but not in the last hour)
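
To make these three targets concrete before turning to aggregations, here is a minimal plain-Python sketch that computes them over a handful of in-memory pings, taking 5 PM on Jan 20, 2021 as "now". The sample data, the function names (`seen_since`, `fleet_health`), and the reading of item 2 as "seen in the last 7 days but not in the last hour" are assumptions made for illustration. This is a reference for the expected results, not how Elasticsearch evaluates anything.

```python
from datetime import datetime, timedelta

# Hypothetical sample pings as (deviceId, timestamp) pairs, invented for illustration.
PINGS = [
    ("device-a", "2021-01-20 16:50:00"),  # healthy: seen in the last hour
    ("device-b", "2021-01-20 15:30:00"),  # stopped: seen 1-2 hours ago only
    ("device-c", "2021-01-18 09:00:00"),  # silent for days, but within 7 days
    ("device-d", "2021-01-10 12:00:00"),  # older than 7 days, ignored
    ("device-a", "2021-01-20 15:10:00"),  # earlier ping from the healthy device
]

FMT = "%Y-%m-%d %H:%M:%S"  # matches the timestamp layout of the sample message


def seen_since(pings, now, window):
    """Return the set of deviceIds with at least one ping in [now - window, now]."""
    start = now - window
    return {dev for dev, ts in pings if start <= datetime.strptime(ts, FMT) <= now}


def fleet_health(pings, now):
    last_7d = seen_since(pings, now, timedelta(days=7))
    last_2h = seen_since(pings, now, timedelta(hours=2))
    last_1h = seen_since(pings, now, timedelta(hours=1))
    return {
        "seen_last_7d": len(last_7d),                    # target 1
        "not_seen_last_1h": len(last_7d - last_1h),      # target 2 (assumed baseline: 7 days)
        "stopped_reporting": sorted(last_2h - last_1h),  # target 3
    }


print(fleet_health(PINGS, datetime(2021, 1, 20, 17, 0, 0)))
# {'seen_last_7d': 3, 'not_seen_last_1h': 2, 'stopped_reporting': ['device-b']}
```

Note that targets 2 and 3 are both set differences, which is why per-bucket arithmetic and a custom scripted metric are a natural fit on the Elasticsearch side.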

Approach: Bucket Scripts & Scripted Metrics

For the examples that follow, let's assume it's exactly 5 PM on Jan 20, 2021.

1. The total count of distinct deviceIds seen in the last 7 days

We're going to use a [range filter](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html) to restrict the timestamp, plus a [cardinality aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html) to obtain the unique device count. In pseudo-code:

"last7d": {
  "filter":
    "range": "2021-01-13 <= timestamp <= 2021-01-20"
  "aggs":
    "cardinality": "on the field deviceId"
}
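
The pseudo-code above could translate into a request body along these lines. This is a hedged sketch in Python that only builds and prints the JSON. It assumes that `timestamp` is mapped as a `date` and `deviceId` as a `keyword` (with default dynamic mapping you would likely target `deviceId.keyword` instead). The helper name `last_7d_request` and the sub-aggregation name `uniq_device_count` are hypothetical. The `format` parameter is included because the sample timestamps use a space rather than the ISO `T` separator.

```python
import json
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"       # Python layout of the sample timestamps
ES_FMT = "yyyy-MM-dd HH:mm:ss"  # the same layout in Elasticsearch date-format syntax


def last_7d_request(now):
    """Build a search body counting distinct deviceIds seen in the 7 days before `now`."""
    start = now - timedelta(days=7)
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "last7d": {
                # filter aggregation: keep only documents inside the time window
                "filter": {
                    "range": {
                        "timestamp": {
                            "gte": start.strftime(FMT),
                            "lte": now.strftime(FMT),
                            "format": ES_FMT,
                        }
                    }
                },
                # sub-aggregation: approximate distinct count of deviceId
                "aggs": {
                    "uniq_device_count": {  # hypothetical name
                        "cardinality": {"field": "deviceId"}
                    }
                },
            }
        },
    }


body = last_7d_request(datetime(2021, 1, 20, 17, 0, 0))
print(json.dumps(body, indent=2))
```

Keep in mind that the cardinality aggregation returns an approximate count, so for very large fleets the number may deviate slightly from the exact one.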