04 Apr 2018

Relevance Overview

Relevance Configuration

In our ranking section, we discussed how Algolia’s relevance is based on a tie-breaking algorithm: certain criteria, or “rules”, are used to sort and bucket results together, and to break ties between equal matches. We also described how you can add your own business metrics as additional ranking criteria.

There is yet another kind of relevance, a textual relevance, that consists of defining attributes and text-based rules that fine-tune the search, affecting the engine’s choice of which objects to return. These include typos, prefixes, plurals, stop words, and other such text-based criteria that Algolia uses to enhance relevancy.

Putting it all together, Algolia uses:

  • a tie-breaking algorithm + key metrics, like sales_count, popularity, and customer_rating, to help define business relevancy
  • and various text-based criteria + choosing the attributes that you want to search in, such as the name of a product, the brand, keywords, and description, which create textual relevancy
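
As a rough illustration of the tie-breaking idea, the sketch below sorts records on a tuple of criteria, so a later criterion only matters when every earlier one is tied. It is not Algolia's implementation: the records and the typo_count field are hypothetical, and only two criteria are modelled.

```python
# Hypothetical records: typo_count stands in for a textual criterion,
# sales_count for a business metric.
records = [
    {"name": "A", "typo_count": 0, "sales_count": 10},
    {"name": "B", "typo_count": 1, "sales_count": 500},
    {"name": "C", "typo_count": 0, "sales_count": 75},
]


def tie_break_key(record):
    # Tuples compare left to right, so sales_count is only consulted
    # when typo_count is equal. Negating it puts bigger values first.
    return (record["typo_count"], -record["sales_count"])


ranked = sorted(records, key=tie_break_key)
ranked_names = [r["name"] for r in ranked]
print(ranked_names)
```

Here B has by far the most sales but still ranks last, because it loses on the earlier criterion.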

Let’s go over these features in detail.

Configuring Textual Relevance

Textual relevance is built by setting searchableAttributes (formerly called attributesToIndex), either in the Dashboard or via the API.

Searchable attributes are what you can control to influence three aspects of textual relevance within the engine:

  • Declare the attributes of your records that you want to make searchable
  • Order these attributes by importance
  • Declare if the order of the words inside the attribute matters or not

You’ll see that the attribute rule within the Ranking Formula is directly tied to these, and the searchable attributes defined in an index are unique to each dataset.

Define the attributes you want to be searchable

Consider the example of an index with the following records:

    [
      {
        "name": "John Doe",
        "company": "Acme",
        "title": "Developer",
        "url": "http://www.acme.com",
        "previous_titles": ["Web Developer", "Intern"]
      },
      {
        "name": "Jane Dawson",
        "company": "John & Bill",
        "title": "Engineer",
        "url": "http://www.johnandbill.com",
        "previous_titles": ["Junior Engineer", "Student"]
      }
    ]

You might want to search in the name, company, title, and previous_titles attributes, but not in url. In this case, you would list all of these attributes except url in the searchableAttributes setting to make them searchable.
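
To make the effect concrete, here is a minimal sketch that only looks at the attributes declared as searchable. The matching_attributes helper is hypothetical and uses plain substring matching, which is a simplification of what the engine does.

```python
records = [
    {"name": "John Doe", "company": "Acme", "title": "Developer",
     "url": "http://www.acme.com",
     "previous_titles": ["Web Developer", "Intern"]},
    {"name": "Jane Dawson", "company": "John & Bill", "title": "Engineer",
     "url": "http://www.johnandbill.com",
     "previous_titles": ["Junior Engineer", "Student"]},
]

# url is deliberately left out, so it is never searched.
searchable_attributes = ["name", "company", "title", "previous_titles"]


def matching_attributes(record, query, searchable):
    """Return the searchable attributes whose text contains the query."""
    hits = []
    for attr in searchable:
        value = record.get(attr, "")
        # Treat array attributes and plain strings the same way.
        values = value if isinstance(value, list) else [value]
        if any(query.lower() in str(v).lower() for v in values):
            hits.append(attr)
    return hits


print(matching_attributes(records[0], "acme", searchable_attributes))
print(matching_attributes(records[1], "johnandbill", searchable_attributes))
```

The second query only occurs in the url attribute, so it matches nothing.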

Order the searchable attributes by order of importance

What will happen if we type “John”? This word is present in both objects: the first record’s name is “John Doe” and the second record’s company is “John & Bill”. For the matching “John Doe” record to appear first, you need to place the name attribute above the company attribute in the list. The higher an attribute is in the list, the more important it is. You can also declare two or more attributes with the same level of importance by placing them in the same string, separated by a comma. In the settings for this example, company and title are considered to be of the same importance.
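
The ordering rule can be sketched as follows. The attribute_rank helper is hypothetical: it returns the position of the first entry in the list that contains the query, and treats comma-separated attributes as one level. Real attribute ranking involves more than this.

```python
records = [
    {"name": "Jane Dawson", "company": "John & Bill", "title": "Engineer"},
    {"name": "John Doe", "company": "Acme", "title": "Developer"},
]

# name is most important; company and title share the second level.
searchable_attributes = ["name", "company,title"]


def attribute_rank(record, query, searchable):
    """Position of the best matching entry (lower is better), or None."""
    for position, entry in enumerate(searchable):
        for attr in entry.split(","):
            if query.lower() in str(record.get(attr, "")).lower():
                return position
    return None


matches = [r for r in records
           if attribute_rank(r, "john", searchable_attributes) is not None]
ranked = sorted(
    matches, key=lambda r: attribute_rank(r, "john", searchable_attributes))
ranked_names = [r["name"] for r in ranked]
print(ranked_names)
```

“John Doe” matches in name (position 0) and “Jane Dawson” only in company (position 1), so “John Doe” comes first.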

Declaring the order of words

For each attribute we also have an additional setting: ordered or unordered. With ordered, words matching at the beginning of a given attribute are considered more important than words matching further along in that attribute.

For instance, the object iPhone 7 will be ranked higher than Case for iPhone for the query “iphone”, because the word is in the first position of the attribute (instead of the third position).

On the other hand, if you had specified unordered for the same attribute, then iPhone 7 and Case for iPhone would be tied. This is because the unordered directive tells the engine to not prioritize the position within the attribute - all that matters is that the search term is found in the attribute.
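
A small sketch of the difference, assuming a simplified model where the score is just the zero-based position of the first word that starts with the query. The position_score helper is hypothetical, not the engine's actual scoring.

```python
def position_score(text, query, ordered=True):
    """Lower is better. None means the query does not match at all."""
    for index, word in enumerate(text.lower().split()):
        if word.startswith(query.lower()):
            # With unordered, every matching position counts the same.
            return index if ordered else 0
    return None


titles = ["Case for iPhone", "iPhone 7"]

ordered_scores = [position_score(t, "iphone", ordered=True) for t in titles]
unordered_scores = [position_score(t, "iphone", ordered=False) for t in titles]
print(ordered_scores, unordered_scores)
```

With ordered, “iPhone 7” scores 0 and “Case for iPhone” scores 2, so “iPhone 7” wins. With unordered, both score 0 and the tie is left to later criteria.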

In most cases, it is recommended to put array-type attributes and attributes longer than an average of 5 words as unordered.

Here’s how to set the searchable attributes for this example:

    searchableAttributes: ['name', 'company,title', 'unordered(previous_titles)']

unordered(previous_titles) tells the engine to treat previous_titles as a searchable attribute and also makes it unordered.

Configuring Business Relevance

Alongside textual relevance, business relevance is an important factor in calculating relevancy. It is configured through Custom Ranking, which is also covered in more detail elsewhere in the docs.

Business metrics, like most-sold products, most-favorited videos, or other numerical or boolean values that your data contains, can be defined in the Algolia Ranking Formula. These attributes are used to promote business relevant results to the top of a results set.

Custom Ranking

To communicate your business metrics to the engine, you set them in the customRanking setting. You can use any numerical or boolean value that represents the popularity or importance of your records.

It can be a raw value like the number of sales, views or likes. It can also be a computed value such as an overall popularity score or a computed user rating.

Let’s take an example:

    [
      {
        "name": "iPhone 5",
        "units_sold": 20
      },
      {
        "name": "iPhone 6",
        "units_sold": 10
      },
      {
        "name": "iPhone 7",
        "units_sold": 200
      }
    ]

If we use the units_sold attribute in our Custom Ranking, and type the query “iPhone”, we’ll get the following results: iPhone 7 will be first, followed by iPhone 5 and iPhone 6.

You can decide whether you want the sort to be descending (bigger values appear first in the results) or ascending (smaller values appear first in the results).
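
The sketch below mimics that choice locally. It assumes the desc(attribute) and asc(attribute) notation used in customRanking entries. The parse_custom_ranking and apply_custom_ranking helpers are hypothetical and only handle a single entry, with no textual criteria applied beforehand.

```python
import re

records = [
    {"name": "iPhone 5", "units_sold": 20},
    {"name": "iPhone 6", "units_sold": 10},
    {"name": "iPhone 7", "units_sold": 200},
]


def parse_custom_ranking(entry):
    """Split 'desc(units_sold)' into ('units_sold', True)."""
    match = re.fullmatch(r"(asc|desc)\((\w+)\)", entry)
    if match is None:
        raise ValueError(f"unsupported custom ranking entry: {entry!r}")
    direction, attribute = match.groups()
    return attribute, direction == "desc"


def apply_custom_ranking(items, entry):
    attribute, descending = parse_custom_ranking(entry)
    return sorted(items, key=lambda r: r[attribute], reverse=descending)


desc_names = [r["name"] for r in apply_custom_ranking(records, "desc(units_sold)")]
asc_names = [r["name"] for r in apply_custom_ranking(records, "asc(units_sold)")]
print(desc_names)
print(asc_names)
```

The descending sort reproduces the order described above: iPhone 7, then iPhone 5, then iPhone 6.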


Adjusting Precision

While which metrics you choose to include and their relative order are important, you have an additional lever for further influencing ranking: the precision (that is, granularity) of the numeric metrics.

For instance, you might want to include “recency” as a factor when ranking your products. If you were to add the raw date timestamp to your custom ranking, results would almost never tie because the number is so precise. This means that any secondary metrics you included would not get a chance to come into play. Instead, you can map the timestamp to a “recency boost”: for example, items released in the last day are boosted “1”, those released in the past week are boosted “2”, and so on.
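
One way to compute such a boost before indexing is sketched below. The one-day and one-week buckets come from the example above. The 30-day bucket, the fallback value, and the recency_boost name are assumptions added for illustration.

```python
from datetime import datetime, timedelta


def recency_boost(released_at, now):
    """Map a release date to a coarse bucket (1 = most recent)."""
    age = now - released_at
    if age <= timedelta(days=1):
        return 1
    if age <= timedelta(days=7):
        return 2
    if age <= timedelta(days=30):  # assumed extra bucket
        return 3
    return 4  # assumed fallback for everything older


now = datetime(2018, 4, 4, 12, 0)
boosts = [
    recency_boost(datetime(2018, 4, 4, 0, 0), now),
    recency_boost(datetime(2018, 4, 1, 0, 0), now),
    recency_boost(datetime(2018, 3, 20, 0, 0), now),
    recency_boost(datetime(2017, 1, 1, 0, 0), now),
]
print(boosts)
```

Because smaller boosts mean more recent items in this mapping, the attribute would be sorted in ascending order. All items released in the same week now tie, which lets the next metric decide.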

This principle applies across any numeric attributes. Similarly, you would want to map a pageviews attribute to a more reasonable scale. Oftentimes, applying a logarithm is the easiest way to achieve this effect.
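
As a sketch of the logarithm approach, the hypothetical helper below reduces a raw pageviews count to its order of magnitude. The base-10 choice and the + 1 offset (to handle zero views) are assumptions. A different base gives finer or coarser buckets.

```python
import math


def pageviews_bucket(pageviews):
    """Reduce a raw count to its order of magnitude."""
    # + 1 keeps the logarithm defined when pageviews is 0.
    return int(math.log10(pageviews + 1))


buckets = [pageviews_bucket(n) for n in (0, 1200, 4800, 150000)]
print(buckets)
```

Records with 1,200 and 4,800 pageviews land in the same bucket, so they tie on this metric and later criteria can break the tie.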
