Shard selection in distributed collaborative search engines
A search engine consists of nodes, each responsible for one or a few parts of the index (called shards). Many search engines today are distributed systems, in which several nodes collaborate in handling search operations. One such distributed search engine is ElasticSearch, which has recently become popular for medium- and large-scale search.
In ElasticSearch, all nodes and shards generally process every query, which can result in high latency. Shard selection addresses this by sending each query only to the shards that are relevant to it. Limiting the number of nodes that participate in each query can improve scalability.
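To make the idea concrete, here is a minimal sketch of shard selection in Python. It is not one of the algorithms used in the plugin described here: the scoring rule (summing query-term frequencies per shard), the function names `score_shard` and `select_shards`, and the sample statistics are all assumptions chosen only for illustration.

```python
from collections import Counter


def score_shard(query_terms, shard_term_counts):
    """Sum the frequencies of the query terms in one shard.

    This is a deliberately simple relevance proxy (an assumption for
    this sketch), not a published shard selection algorithm.
    """
    return sum(shard_term_counts.get(term, 0) for term in query_terms)


def select_shards(query, shard_stats, k=2):
    """Return the ids of at most k shards that look relevant to the query."""
    terms = query.lower().split()
    scored = [(score_shard(terms, stats), shard_id)
              for shard_id, stats in shard_stats.items()]
    # Drop shards that contain none of the query terms, then rank the rest
    # by descending score (ties broken by ascending shard id).
    ranked = sorted((pair for pair in scored if pair[0] > 0),
                    key=lambda pair: (-pair[0], pair[1]))
    return [shard_id for _, shard_id in ranked[:k]]


# Hypothetical per-shard term statistics: shard id -> term frequencies.
shard_stats = {
    0: Counter({"football": 40, "league": 12}),
    1: Counter({"python": 55, "search": 8}),
    2: Counter({"search": 30, "engine": 25}),
    3: Counter({"recipe": 60}),
}

selected = select_shards("search engine", shard_stats, k=2)
print(selected)  # [2, 1]
```

In this example only two of the four shards receive the query; the other two do no work at all. In practice the selected ids would then be used to route the search, for instance through the `preference` parameter of the search request (a value such as `_shards:2,1`), though how a plugin wires this in is outside the scope of this sketch.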
This lab sample develops a shard selection plugin for ElasticSearch. The plugin, called SAFE, implements four shard selection algorithms and supports all ElasticSearch query types.