Enterprise Search Europe 2014 – Short Review

ESE Summit

At the end of April  a third edition of Enterprise Search Europe conference took place.  The venue was Park Plaza Victoria Hotel in London. Two-day event was dedicated to widely understood search solutions. There were two tracks covering subjects relating to search management, big data, open source technologies, SharePoint and as always –  the future of search. According to the organizer’ information, there were 30 experts presenting their knowledge and experience in implementation search systems and making content findable. It was  opportunity to get familiar with lots of case studies focused on relevancy, text mining, systems architecture and even matching business requirements. There were also speeches on softer skills, like making  decisions or finding good  employees.

In a word, ESE 2014 summit was great chance to meet highly skilled professionals with competence in business-driven search solutions. Representatives from both specialized consulting companies and universities were present there. Even second day started from compelling plenary session about the direction of enterprise search. Presentation contained two points of view: Jeff Fried, CTO in BA-Insight and Elaine Toms, Professor of Information Science, University of Sheffield. From industrial aspect analyzing user behavior,  applying world knowledge or improving information structure is a  real success. On the other hand, although IR systems are currently in mainstream, there are many problems: integration is still a challenge, systems working rules are unclear, organizations neglect investments in search specialists. As Elaine Toms explained, the role of scientists is to restrain an uncertainty by prototyping and forming future researchers. According to her, major search problems are primitive user interfaces and too few systems services. What is more, data and information often become of secondary importance, even though it’s a core of every search engine.


Despite of many interesting presentations, particularly one caught my attention. It was “Collaborative Search” by Martin White, Conference Chair and Managing Director in Intranet Focus. The subject was current condition of enterprise search and  requirements which such systems will have to face in the future. Martin White is convinced that limited users satisfaction is mainly fault of poor content quality and insufficient information management. Presentation covered  absorbing results of various researches. One of them, described in “Characterizing and Supporting Cross-Device Search Tasks” document, was analysis of commercial search engine logs in order to find behavior patterns associated with cross device searching. Switching between devices can be a hindrance because of device multiplicity. That is why each user needs to remember both what he was searching and what has already been found. Findings show that there are lots of opportunities to handle information seeking more effectively in multi-device world. Saving and re-instating user session, using time between switching devices to get more results or making use of behavioral, geospatial data to predict task resumption are just a few examples of ideas.

Despite everything, the most interesting part of Martin White’s presentation was dedicated to Collaborative Information Seeking (CIS).

Collaborative Information Seeking

It is natural that difficult and complex tasks forced people to work together. Collaboration in information retrieval helps to use systems more effectively. This idea concentrate on situations when people should cooperate to seek information or sense-make. In fact, CIS covers on the one hand elements connected with organizational behavior or making decision, on the other – evolution of user interface and designing systems of immediate data processing. Furthermore, Martin White considers CIS context to be focused around the complex queries, “second phase” queries, results evaluation or ranking algorithms. This concept is able to bring the highest values in the domains like chemistry, medicine and law.

During the CIS exploration some definitions appeared:  collaborative information retrieval, social searching, co-browsing, collaborative navigation, collaborative information behavior, collaborative information synthesis.  My intention is to introduce some of them.

"Collaborative Information Seeking", Chirag Shah

1. “Collaborative Information Seeking”, Chirag Shah

Collaborative Information Retrieval (CIR) extends traditional IR for the purposes of many users. It supports scenarios when problem is complicated and when seeking common information is a need. To support groups’ actions, it is crucial to know how they work, what are their strengths and weaknesses. In general, it might be said that such system could be an overlay on search engine re-ranking results, based on users community knowledge. In agreement with Chirag Shah, the author of “Collaborative Information Seeking” book, there are some examples of systems where workgroup’s queries and related results are captured and used to filtering more relevant information for particular user. One of the most absorbing case is SearchTogether – interface designed for collaborative web search, described by Meredith R. Morris and Eric Horvitz. It allows to work both synchronously and asynchronously. History of queries, page metadata and annotations serve as information carrier for user. There had been implemented an automatic and manual division of labor. One of its feature was recommending pages to another information seeker. All sessions and past findings were persisted and stored for future collaborative searching.

Despite of many efforts made in developing such systems, probably none of them has been widely adopted. Perhaps it was caused partly by its non-trivial nature, partly by lack of concept how to integrate them with other parts of collaboration in organizations.

Another ideas associated with CIS are Social Search and Collaborative Filtering. First one is about how social interactions could help in searching together. What is interesting,  despite of rather weak ties between people in social networks, their enhancement may be already observed in collaborative networks. Second definition referred to provide more relevant search results based on user past behavior, but also community of users displaying similar interests. It is noteworthy that it is an example of asynchronous interaction, because its value is based on past actions – in contrast with CIS where emphasis is laid to active users communication. Collaborative Filtering has been applied in many domains: industry, financial, insurance or web. At present the last one is most common and it’s used in e-commerce business. CF methods make a base for recommender systems predicting users preferences. It is so broad topic, that certainly deserves a separate article.

CIS Barriers

Regardless of all these researches, CIS is facing many challenges nowadays. One of them is information security in the company. How to struggle out of situation when team members do not have the same security profile or when some person cannot even share with others what has been found? Discussed systems cannot be only created for information seeking, but also they need to  provide managing security, support situations when results were not found because of permissions or situations when it is necessary to view a new document created in cooperation process. If it is not enough, there are various organization’s barriers hindering CIS idea. They are divided into categories – organizational, technical, individual, and team. They consist of things such as organization culture and structure, multiple and un-integrated systems, individual person perception or varied conflicts appeared during team work. Barriers and their implications have been described in detail in document “Barriers to Collaborative Information Seeking in Organizations” by Arvind Karunakaran and Madhu Reddy.

Collaborative information seeking is exciting field of research and one of the search trend. Another absorbing topic is gamification adopting in IR systems. This is going to be a subject of my next article.

Impressions from Findability Day 2013

We at Findwise host Findability Day to raise awareness of the importance of enterprise search and search in business, big data and to share best practices in implementation and management as well as inform about technology developments. Attending and being part of Findability Day this year was a real energy boost for all of us at Findwise. We were about 200 attendants all with focus on search and big data.

On stage, we had some very inspiring presentations. Martin White explaining the journey of search and pointing out its future direction showed how the principles of search have been around for decades. What we use it for and how we approach it is key along with enabling technology. Martin has also written a blog post about his impressions from the event, read it here.

Ravi Mynampaty of Harvard Business School showed how the search journey has evolved within Harvard Business School. One take away was the importance of step by step implementation and ensuring satisfied stakeholders along the way. Christian Finstad of Meltwater explained how they connected business values with technology to convince their clients. I think an internal decision within an organization needs similar argumentation in order to win acceptance.

Johan Johansson gave a thorough presentation about the search project at the Municipality of Norrköping. This was a tight budget project with strong deliverables. One thing to remember from Johan was his talk about “you need to try it out yourself – do the most common searches and experience it”.

DJ Skillman from Splunk, Troels Walsted Hansen from Microsoft and Daniel Bergqvist from Google gave some interesting insights into new technologies. How Splunk can be used to just harvest every imaginable data type, just bring it in and worry about using it later. How Google want to align enterprise search with consumer search and Microsofts Facebook inspired developments within graph search.

We also had some great breakout sessions with Jonas Berg, Svensk Byggtjänst who showed us their partner search application, Martin Öhléen, SKF who talked about mobility, Sebastian Forseland, Husqvarna who gave an expert lecture on master data management and Niclas Lillman&Nicklas Eriksson, Scania who talked about their journey towards a common search solution for all their knowledge workers.

If you weren’t there or if you just want to see it again we have posted videos of the presentations and most of the slides here.

We would like to thank everybody who came to the event – you made it a real success. A big thanks also goes out to our sponsors Google and Splunk who made this event possible.

The networking possibilities at the location were great and really demonstrated how the Search industry is growing.  We are very happy with the event but there is of course always room for improvement for next year. Make sure to be there!

A look at European Conference on Information Retrieval (ECIR) 2012

European Conference on Information Retrieval

The 34th European Conference on Information Retrieval was held  1-5 April 2011, in the lovely but crowded city of Barcelona, Spain. The core conference attracted over 100 attendees, with a total of 35 accepted full papers, 28 posters, and 7 demos being presented. As opposed to the previous year, which had 2 parallel sessions, this year’s conference included a single running session. The accepted papers covered a diverse range of topics, and were divided into query representation, blog and online-community search, semi-structured retrieval, applications, evaluation, retrieval models, classification, categorisation and clustering, image and video retrieval, and systems efficiency.

The best paper award went to Guido Zuccon, Leif Azzopardi, Dell Zhang and Jun Wang for their work entitled “Top-k Retrieval using Facility Location Analysis” and presented by Leif Azzopardi during the retrieval models session. The authors propose using facility location analysis taken from the discipline of operations research to address the top-k retrieval problem of finding “the optimal set of k documents from a number of relevant documents given the user’s query”.

Meanwhile, “Predicting IMDB Movie Ratings using Social Media” by Andrei Oghina, Mathias Breuss, Manos Tsagkias and Maarten de Rijke won the best poster award. With a different goal from the best paper, the authors of the poster experiment with a prediction model for rating movies using a set of qualitative and quantitative features extracted from the stream of two social media channels, YouTube and Twitter. Their findings show that the highest predictive performance is obtained by combining features from both channels, and propose as future work to include other social media channels.

Workshop Days

The conference was preceded by a full day of workshops and tutorials running in parallel. I attended two workshops: Information Retrieval Over Query Sessions (SIR) during the morning and Task-Based and Aggregated Search (TBAS) in the afternoon. The second workshop ended with an interactive discussion. A third, full-day workshop was Searching 4 Fun!.

Industry Day

The last day was the Industry Day. Only 2 papers here, plus 5 oral contributions, and around 50 attendees. A strong focus of the talks given at the industry day was on opinion-mining: four of the six participating companies/institutions presented work on sentiment analysis and opinion mining from social media streams. Jussi Karlgren, from Gavagai, argued that sentiment analysis from social media can be used by companies for example in finding reviews or comments made about their product or service, analyse their market position, and predict price movements. Rianne Kaptein, from Oxyme, backed this up by adding that businesses are interested by what the consumers say about their brand, products or campaigns on social media streams. Furthermore, Hugo Zaragoza from Websays identified two basic needs inside a company: a need for help in reading so that someone can act, and a need for help in explaining so that it can convince. Very interesting topic indeed, and research in this direction will advance as companies become more aware of the business gains from opinion mining of social media.

Overall, ECIR 2012 was a very inspiring conference. It also seemed a very friendly conference, offering many opportunities to network with the fellow attendees. Despite that, several participants said that the number of attendees at this year’s conference has decreased in comparison with previous years. The workshops and the core conference gave me the impression that it has a strong focus on young researchers, as many of the accepted contributions had a student as a first author and presenter at the conference. The fact that there was only one session running at a time was a good decision in my opinion, as the attendees were not forced to miss presentations. Nevertheless, the workshops and tutorials were running in parallel, and although the proceedings of the workshops will be made freely available, I still feel that I missed something that day. The industry day was very exciting, offering the opportunity to share ideas between academia and industry. However, there were not so many presentations, and the topics were not as diverse. I propose that next year Findwise will be among the speakers at the Industry track!

ECIR 2013 will be held in Moscow, Russia, between 24-28 March. See you there!