5. Relevance, Scoring, & Sorting

⚡ ElasticsearchBook is crafted by Jozef Sorocin and powered by:

Spatialized.io (Elasticsearch & Google Maps consulting)
in cooperation with Garages-Near-Me.com (Effortless parking across Germany)

Relevance is Relative

Relevance is arguably the most important part of search, but there are dozens of strategies for approaching it and just as many factors that affect it.

Everyone wants a better search but "success" means different things to different people:

as few search-as-you-type keystrokes as possible
more clicks on the first search result
increased usage of the search box in general, etc.

This topic is too broad to cover here, so instead of going into the different techniques I'll refer you to this insightful article (section "Relevancy").

Scouting for Scores

You'll have noticed by now that the Search API response typically includes the _score attribute inside each retrieved hit. When nothing influences scoring (a plain match_all, for instance), each hit has a score of 1.0. Otherwise, the score depends on which queries matched a given doc and how good the match was.

How good the match was introduces the concept of similarity scoring. Since v5.x, scoring in Elasticsearch has been governed by default by an algorithm called Okapi BM25, which is explained here in great detail.
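As a rough orientation, the textbook form of BM25 for a document D and a query Q looks like the sketch below. The symbols are my own shorthand rather than anything Elasticsearch exposes: f(t, D) is how often the term t occurs in D, |D| is the field length, avgdl is the average field length across the index, N is the number of docs, and n(t) is the number of docs containing t. Elasticsearch defaults to k1 = 1.2 and b = 0.75. Constant factors may differ between Lucene versions, so treat this as an approximation of what the explain output reports.

```latex
\mathrm{score}(D, Q) = \sum_{t \in Q} \mathrm{IDF}(t) \cdot
  \frac{f(t, D)\,(k_1 + 1)}
       {f(t, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}

\mathrm{IDF}(t) = \ln\left(1 + \frac{N - n(t) + 0.5}{n(t) + 0.5}\right)
```

In plain words: rare terms weigh more (IDF), repeated terms help but with diminishing returns (k1), and long fields are penalized relative to the average (b).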

Now, when you're completely lost as to why ES assigned a given score to a given doc, or wonder why the response hits are ordered the way they are, you can count on the explain search parameter (a per-request relative of the Explain API) to provide a great deal of feedback:

POST index_name/_search?explain=true
{
  "query": {
    "simple_query_string": {
      "query": "abc"
    }
  }
}
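If you only care about a single document, the dedicated Explain API may be the handier option. A minimal sketch, assuming a doc with the hypothetical ID 1 exists in index_name:

```
# Explain how (and whether) doc "1" matches the query.
# The ID "1" is a placeholder; use one of your own doc IDs.
GET index_name/_explain/1
{
  "query": {
    "simple_query_string": {
      "query": "abc"
    }
  }
}
```

The response states whether the doc matched at all and, if it did, breaks the score down into its individual components.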


Ordering & Sorting

By default, hits are ordered by their scores in descending order.
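To override the default order, you can pass an explicit sort. The sketch below assumes a hypothetical created_at date field: it sorts by that field first and only falls back to the score as a tie-breaker.

```
# "created_at" is a made-up field name; swap in a sortable field from your mapping.
POST index_name/_search
{
  "query": {
    "simple_query_string": {
      "query": "abc"
    }
  },
  "sort": [
    { "created_at": { "order": "desc" } },
    "_score"
  ]
}
```

Keep in mind that when you sort purely by fields and leave _score out of the sort, scores are not computed unless you also set track_scores to true.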

<aside> 🔑 If you ever need to randomize the search results (→ do the "opposite" of scoring), you can use a function_score query with the random_score function. On the other hand, if you need to assign a constant score, use the constant_score query.

</aside>
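Here's how those two queries could look in practice. This is a sketch under assumptions: the seed value is arbitrary, the status field and its published value are made up for illustration, and the field parameter of random_score assumes a reasonably recent Elasticsearch version.

```
# 1) Randomized order: the random score replaces the query score.
#    A fixed seed keeps the order reproducible between requests.
POST index_name/_search
{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "random_score": { "seed": 42, "field": "_seq_no" },
      "boost_mode": "replace"
    }
  }
}

# 2) Constant score: every doc matching the filter gets a score equal to "boost".
POST index_name/_search
{
  "query": {
    "constant_score": {
      "filter": { "term": { "status": "published" } },
      "boost": 1.2
    }
  }
}
```

Dropping the seed (and the field) from the first request should yield a different order on each request.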