A relevance model is what a search engine uses to rank the documents in a search result, i.e. how it finds the document you are looking for. An axiomatic analysis of relevance models asks two questions: how does a relevance model work, and why does it work? Findwise attended the ICTIR 2013 conference in Copenhagen, where the axiomatic analysis of relevance models was one of the recurring topics.
A relevance model is expressed as a mathematical function of a set of input variables, so looking at the formula alone rarely answers those two questions. Axiomatic analysis aims to break the formula down, isolate each of its components and analyze them one at a time, with the goal of improving ranking performance.
The idea is to formulate a set of axioms, meaning laws that a relevance model should abide by. One of the more obvious axioms, from a purely statistical point of view, relates to term frequency (TF): all else being equal, a document d1 in which the query terms occur more often than in some other document d2 should be assigned a higher relevance than d2. These are called axioms because they should be relevance truths: statements that are obvious and that everyone can agree on. Other examples of axioms are that very long documents should be penalized, simply because they have a higher probability of containing any given word, and that terms occurring in many documents should contribute less to the relevance than rarer terms.
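To make the three axioms above concrete, here is a minimal Python sketch that tests a toy scoring function against each of them. The scoring formula, the corpus and all function names are assumptions made up for illustration; this is not the formula of any particular search engine or of any paper presented at the conference.

```python
import math
from collections import Counter

# Hypothetical toy corpus of tokenized documents (not from the article).
CORPUS = [
    ["search", "engine", "ranks", "documents"],
    ["search", "search", "ranks", "documents"],
    ["search", "engine", "ranks", "documents", "and", "many", "other", "words"],
    ["axiom", "engine", "the", "documents"],
]


def score(query, doc, corpus=CORPUS):
    """Toy relevance score: sum of TF * IDF, divided by sqrt(document length)."""
    if not doc:
        return 0.0
    tf = Counter(doc)
    n_docs = len(corpus)
    total = 0.0
    for term in query:
        # Document frequency: number of documents containing the term.
        df = sum(1 for d in corpus if term in d)
        # Smoothed IDF, so terms found in every document still count a little.
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0
        total += tf[term] * idf
    return total / math.sqrt(len(doc))


def tf_axiom_holds(score_fn=score):
    # Same length, but CORPUS[1] contains the query term twice.
    return score_fn(["search"], CORPUS[1]) > score_fn(["search"], CORPUS[0])


def length_axiom_holds(score_fn=score):
    # Same term frequency, but CORPUS[2] is twice as long.
    return score_fn(["search"], CORPUS[0]) > score_fn(["search"], CORPUS[2])


def idf_axiom_holds(score_fn=score):
    # "axiom" occurs in one document, "documents" occurs in all of them.
    doc = CORPUS[3]
    return score_fn(["axiom"], doc) > score_fn(["documents"], doc)


if __name__ == "__main__":
    for check in (tf_axiom_holds, length_axiom_holds, idf_axiom_holds):
        print(check.__name__, check())
```

Each check accepts any scoring function with the same signature, so a different relevance model can be passed in and tested against the same axioms.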
From an Enterprise Search perspective, these axioms do not have to be general relevance truths; they can instead be adapted to your organization and your users. Here we see a shift from purely statistics-based axioms towards metadata-based ones, e.g. which fields and which sources are more relevant than others. A very simple example is that a hit in the title is more relevant than a hit in the body. Weights like these are easy to configure in most search engines, e.g. Apache Solr.
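The title-versus-body axiom can be sketched in a few lines of Python. The field names, the weights and the function below are hypothetical and only show the principle of per-field weighting; they are not how any specific engine computes its score.

```python
# Hypothetical per-field weights: a title hit counts three times a body hit.
FIELD_WEIGHTS = {"title": 3.0, "body": 1.0}


def field_weighted_score(query_terms, doc, field_weights=FIELD_WEIGHTS):
    """Sum the query-term hits in each field, scaled by that field's weight."""
    total = 0.0
    for field, weight in field_weights.items():
        # Naive whitespace tokenization, good enough for the illustration.
        tokens = doc.get(field, "").lower().split()
        hits = sum(tokens.count(term.lower()) for term in query_terms)
        total += weight * hits
    return total


# Two made-up documents with one hit each, in different fields.
title_hit = {"title": "relevance axioms", "body": "an introduction to ranking"}
body_hit = {"title": "ranking basics", "body": "a note on relevance in search"}

if __name__ == "__main__":
    query = ["relevance"]
    print(field_weighted_score(query, title_hit))
    print(field_weighted_score(query, body_hit))
```

In Apache Solr the same idea is typically expressed as per-field boosts, for example through the `qf` parameter of the eDisMax query parser (`qf=title^3 body`), rather than in application code.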
This concept is useful and interesting because it allows you not only to modify and improve existing relevance models but also to create new ones from scratch. The process can also be automated using Machine Learning algorithms, which leaves us with the task of finding the optimal set of axioms. Can you think of axioms that apply to your organization, your users' information needs and the content that is made searchable?