Navigating Elasticsearch: The Ultimate Guide to Effective Pagination

Pagination is one of the first concepts to learn when working with Elasticsearch. And it's not just about pagination, but effective pagination. There are several ways to do this, but the one I prefer is the sort and search_after method.

Before going into the example, let's look into what these terms mean.

  1. sort: This attribute sorts the search results on one or more fields in the data, each in either ascending or descending order.

  2. search_after: This key marks the starting point for the next request, that is, the position after which the next set of results is fetched.

Let's illustrate this with an example.

When the first request is made, the sort fields are passed in the request body as shown in the code block below.

{
  "size": 100,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "field_name_1": {
        "order": "asc"
      }
    },
    {
      "field_name_2": {
        "order": "desc"
      }
    }
  ]
}

The first request returns a response like the one shown below. Each hit contains its sort values as an array, with one entry per sort field when sorting was performed on multiple fields.

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 100,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "index_name",
        "_type": "_doc",
        "_id": "1",
        "_score": null,
        "_source": {
          "field_name_1": "1.00",
          "field_name_2": "100.00"
        },
        "sort": [ "1.00", "100.00" ]
      },
      {
        "_index": "index_name",
        "_type": "_doc",
        "_id": "2",
        "_score": null,
        "_source": {
          "field_name_1": "2.00",
          "field_name_2": "99.00"
        },
        "sort": [ "2.00", "99.00" ]
      },
      // More documents with incremental values for "field_name_1" and "field_name_2" sorted in the specified order
      {
        "_index": "index_name",
        "_type": "_doc",
        "_id": "100",
        "_score": null,
        "_source": {
          "field_name_1": "100.00",
          "field_name_2": "1.00"
        },
        "sort": [ "100.00", "1.00" ]
      }
    ]
  }
}

To fetch the next set of results, take the sort values from the last document in the response and pass them as the search_after parameter in the body of the next request. The sort clause itself must stay the same from one request to the next.

This process continues until all the documents in the particular index are retrieved, that is, until a request comes back with no hits.
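Here is a minimal sketch in Python of that one step, assuming the response is already parsed into a dict with the shape shown above. The function name next_request_body is my own and is not part of any Elasticsearch client.

```python
import copy
from typing import Any, Dict, Optional


def next_request_body(
    body: Dict[str, Any], response: Dict[str, Any]
) -> Optional[Dict[str, Any]]:
    """Build the request body for the next page (hypothetical helper).

    Returns None when the response has no hits, meaning there is
    nothing left to fetch.
    """
    hits = response["hits"]["hits"]
    if not hits:
        return None

    # Copy the previous body so the query and sort clauses stay identical.
    next_body = copy.deepcopy(body)
    # The sort values of the last hit become the starting point of the next page.
    next_body["search_after"] = hits[-1]["sort"]
    return next_body
```

The previous body is copied rather than modified, so the same query and sort are reused and only search_after changes between requests.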

Here is an example illustrating the same.

If the first request returns 100 hits, numbered 1 to 100, the sort values from the 100th (last) document are used as the value of search_after in the subsequent query.

{
  "size": 100,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "field_name_1": {
        "order": "asc"
      }
    },
    {
      "field_name_2": {
        "order": "desc"
      }
    }
  ],
  "search_after": ["100.00", "1.00"]
}

In this way, we can retrieve all the documents from the given Elasticsearch index.
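The whole loop can be sketched in Python as follows. This is only an illustration under a few assumptions: search is a hypothetical function you supply that sends a request body to your cluster and returns the parsed response as a dict, the name iterate_all_hits is my own, and the sort values are assumed to be unique per document so that no hit is skipped between pages.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def iterate_all_hits(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    sort: List[Dict[str, Any]],
    page_size: int = 100,
    query: Optional[Dict[str, Any]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit, one page at a time, using sort + search_after.

    `search` is a caller-supplied (hypothetical) function that takes a
    request body and returns the parsed Elasticsearch response.
    """
    body: Dict[str, Any] = {
        "size": page_size,
        "query": query if query is not None else {"match_all": {}},
        "sort": sort,
    }
    while True:
        response = search(body)
        hits = response["hits"]["hits"]
        if not hits:
            # An empty page means everything has been retrieved.
            return
        yield from hits
        # Start the next page after the last document of this page.
        body = {**body, "search_after": hits[-1]["sort"]}
```

Usage might look like `for hit in iterate_all_hits(my_search, sort=[{"field_name_1": {"order": "asc"}}])`, where my_search is whatever function talks to your cluster. Passing the search function in keeps the loop independent of any particular client library.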

Let me know your comments and thoughts below :)