Gamification in Information Retrieval

My last article was mainly about Collaborative Information Seeking (CIS), one of the trends in enterprise search. Another interesting topic is the use of game mechanics in CIS systems. I first came across this idea at the previously mentioned ESE 2014 conference, and interest is now so high that this year a dedicated workshop, GamifIR (Gamification for Information Retrieval), took place in Amsterdam. The IR community debated what benefits game techniques can bring to IR tasks. The workshop covered gamified tasks in the context of searching, natural language processing, analyzing user behavior and collaborating. The last of these was discussed in the article “Enhancing Collaborative Search Systems Engagement Through Gamification”, which Martin White mentioned in his great presentation about search trends at the last ESE summit.

Gamification is the use of game elements in a non-game environment. Its goal is to improve the motivation of customers or employees to use a service. In the case of Information Retrieval, this means, for example, encouraging people to find information more efficiently. It is quite intuitive, because competition is an inherent part of human nature. Businesses noticed long ago that higher engagement, activating new users, establishing interaction between them and rewarding effort lead to measurable results, even if the quality of the data provided by users could be higher. Such game elements include leaderboards, levels, badges, achievements, time or resource limits, challenges and many others. There are even documented design patterns and models covering gameplay, components, game practices and processes. Such rules are essential, because a virtual badge has no value until users assign value to it.

Collaborative Information Seeking is an idea suited to people cooperating on a complex task that requires finding specific information. Systems of this kind support teamwork, coordinate actions and improve communication in many different ways, using various mechanisms. At first glance, gamification seems perfectly suited to CIS projects. Seekers become more social, and a feeling of competence fosters actions which in turn are rewarded.

The most important thing is to know why we need a gamified system and what benefits we will get. The next step is to understand the fundamental elements of a game and find out how to adapt them to the IR case. In their article “Enhancing Collaborative Search Systems Engagement Through Gamification”, researchers from the universities of Granada and Holguin listed proposals for how to gamify a CIS system. Based on their suggestions, I think the essential point is to prepare a highly sociable environment for seekers. Every player (seeker) needs a personal profile which stores previous achievements and can be customized. Constant feedback on progress, a list of successful members, time limits and all kinds of widgets that keep the spirit of competition alive are important for motivating users and building loyalty. It is worth remembering that points collected for achieving goals need to be converted into virtual values which distinguish the most active players. It is crucial to establish clear and fair rules, because information seeking with such elements is often fun, and that fun must not be ruined.
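To make the profile, points and ranking ideas more concrete, here is a minimal sketch in Python. It is only an illustration: the class name, the level thresholds and the badge name are my own assumptions and are not taken from the cited article.

```python
from dataclasses import dataclass, field

# Hypothetical point thresholds for each level; not taken from the cited article.
LEVELS = [(0, "Novice"), (50, "Seeker"), (150, "Expert")]


@dataclass
class SeekerProfile:
    """Personal profile that stores a seeker's points and badges."""
    name: str
    points: int = 0
    badges: list = field(default_factory=list)

    def reward(self, points, badge=None):
        """Add points for a completed goal and optionally grant a badge."""
        self.points += points
        if badge and badge not in self.badges:
            self.badges.append(badge)

    @property
    def level(self):
        """Return the highest level whose threshold the seeker has reached."""
        current = LEVELS[0][1]
        for threshold, title in LEVELS:
            if self.points >= threshold:
                current = title
        return current


def leaderboard(profiles, top=3):
    """Rank seekers by points, highest first."""
    return sorted(profiles, key=lambda p: p.points, reverse=True)[:top]


anna = SeekerProfile("anna")
ben = SeekerProfile("ben")
anna.reward(60, badge="first-answer")
ben.reward(20)
ranking = [p.name for p in leaderboard([anna, ben])]
```

Converting raw points into named levels is one simple way to turn collected points into a "virtual value" that distinguishes active players; a real system would also need the clear and fair rules mentioned above.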

Researchers from Finnish universities, who published the article “Does Gamification Work?”, broke the problem of gamification down into components and studied them thoroughly. Their conclusion was that gamification can work, but there are some weaknesses: the outcome depends on the context being gamified and on the quality of the users. Probably the main problem is the lack of knowledge about which elements really provide benefits.

Gamification can be treated as a new way to deal with complex data structures. The limitations of data analysis can be offset by mechanisms which increase user activity in the Information Retrieval process. Moreover, such a concept may lead to higher-quality data, because people are more motivated. I believe Collaborative Information Seeking, gamification and similar ideas are among the ways to improve the search experience by helping people become better searchers, rather than just by tuning algorithms.

Enterprise Search Europe 2014 – Short Review

ESE Summit

At the end of April, the third edition of the Enterprise Search Europe conference took place. The venue was the Park Plaza Victoria Hotel in London. The two-day event was dedicated to search solutions in the broad sense. There were two tracks covering subjects relating to search management, big data, open source technologies, SharePoint and, as always, the future of search. According to the organizer, 30 experts presented their knowledge and experience in implementing search systems and making content findable. It was an opportunity to get familiar with many case studies focused on relevancy, text mining, systems architecture and even matching business requirements. There were also talks on softer skills, such as making decisions or finding good employees.

In a word, the ESE 2014 summit was a great chance to meet highly skilled professionals with competence in business-driven search solutions. Representatives of both specialized consulting companies and universities were present. The second day started with a compelling plenary session about the direction of enterprise search. It presented two points of view: that of Jeff Fried, CTO at BA-Insight, and that of Elaine Toms, Professor of Information Science at the University of Sheffield. From the industry perspective, analyzing user behavior, applying world knowledge and improving information structure are real successes. On the other hand, although IR systems are now mainstream, many problems remain: integration is still a challenge, the rules by which systems work are unclear, and organizations neglect investment in search specialists. As Elaine Toms explained, the role of scientists is to reduce uncertainty by prototyping and by educating future researchers. According to her, the major search problems are primitive user interfaces and too few system services. What is more, data and information often become of secondary importance, even though they are the core of every search engine.

Trends

Among the many interesting presentations, one in particular caught my attention. It was “Collaborative Search” by Martin White, Conference Chair and Managing Director of Intranet Focus. Its subject was the current condition of enterprise search and the requirements such systems will have to face in the future. Martin White is convinced that limited user satisfaction is mainly the fault of poor content quality and insufficient information management. The presentation covered fascinating results from various studies. One of them, described in the paper “Characterizing and Supporting Cross-Device Search Tasks”, analyzed commercial search engine logs in order to find behavior patterns associated with cross-device searching. Switching between devices can be a hindrance, because each user needs to remember both what he was searching for and what has already been found. The findings show that there are many opportunities to handle information seeking more effectively in a multi-device world. Saving and restoring the user session, using the time between device switches to fetch more results, and using behavioral and geospatial data to predict task resumption are just a few examples.

Nevertheless, the most interesting part of Martin White’s presentation was dedicated to Collaborative Information Seeking (CIS).

Collaborative Information Seeking

It is natural that difficult and complex tasks force people to work together. Collaboration in information retrieval helps people use systems more effectively. The idea concentrates on situations in which people need to cooperate to seek information or make sense of it. In fact, CIS covers, on the one hand, elements connected with organizational behavior and decision making and, on the other, the evolution of user interfaces and the design of systems for immediate data processing. Furthermore, Martin White considers the CIS context to be focused on complex queries, “second phase” queries, results evaluation and ranking algorithms. The concept can bring the greatest value in domains like chemistry, medicine and law.

While exploring CIS, I came across a number of related terms: collaborative information retrieval, social searching, co-browsing, collaborative navigation, collaborative information behavior and collaborative information synthesis. My intention is to introduce some of them.

"Collaborative Information Seeking", Chirag Shah


Collaborative Information Retrieval (CIR) extends traditional IR to serve many users at once. It supports scenarios in which the problem is complicated and several people need to find the same information. To support a group’s actions, it is crucial to know how the group works and what its strengths and weaknesses are. In general, such a system could be an overlay on a search engine that re-ranks results based on the knowledge of the user community. According to Chirag Shah, the author of the book “Collaborative Information Seeking”, there are examples of systems in which a workgroup’s queries and related results are captured and used to filter more relevant information for a particular user. One of the most interesting cases is SearchTogether, an interface designed for collaborative web search, described by Meredith R. Morris and Eric Horvitz. It allows people to work both synchronously and asynchronously. Query history, page metadata and annotations carry information between users. Both automatic and manual division of labor were implemented. One of its features was recommending pages to another information seeker. All sessions and past findings were persisted for future collaborative searching.
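The idea of an overlay that re-ranks search engine results using the group's knowledge could look roughly like the sketch below. It is not how SearchTogether or any other named system works; the click-count signal, the `weight` parameter and the assumption that engine scores lie between 0 and 1 are all mine, chosen only for illustration.

```python
from collections import Counter


def rerank(results, group_clicks, weight=0.5):
    """Re-rank (doc_id, engine_score) pairs using the group's click history.

    group_clicks maps doc_id -> how often group members opened that document.
    weight is a hypothetical tuning knob for how much the group signal counts.
    Engine scores are assumed to be in the range 0..1.
    """
    max_clicks = max(group_clicks.values(), default=0)

    def combined(item):
        doc_id, score = item
        # Normalise clicks to 0..1 so they are comparable with engine scores.
        boost = group_clicks.get(doc_id, 0) / max_clicks if max_clicks else 0.0
        return (1 - weight) * score + weight * boost

    return sorted(results, key=combined, reverse=True)


results = [("doc-a", 0.9), ("doc-b", 0.7), ("doc-c", 0.6)]
clicks = Counter(["doc-c", "doc-c", "doc-b"])
reranked = [doc_id for doc_id, _ in rerank(results, clicks)]
```

In this toy example the document the group opened most often moves to the top even though the engine ranked it last. With no click history at all, the original engine order is kept.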

Despite the many efforts made to develop such systems, probably none of them has been widely adopted. Perhaps this is partly due to their non-trivial nature and partly due to the lack of a concept for integrating them with other forms of collaboration in organizations.

Other ideas associated with CIS are Social Search and Collaborative Filtering. The first is about how social interactions can help people search together. Interestingly, although ties between people in social networks are rather weak, they can already be seen to strengthen in collaborative networks. The second refers to providing more relevant search results based on a user’s past behavior, as well as on that of a community of users with similar interests. It is noteworthy that this is an example of asynchronous interaction, because its value is based on past actions, in contrast with CIS, where the emphasis is on active communication between users. Collaborative Filtering has been applied in many domains: industry, finance, insurance and the web. At present the last one is the most common, used mainly in e-commerce. CF methods form the basis of recommender systems that predict user preferences. It is such a broad topic that it certainly deserves a separate article.
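As a rough illustration of Collaborative Filtering, here is a minimal user-based sketch: items a user has not rated are scored by the ratings of other users, weighted by how similar those users are. The data, the function names and the choice of cosine similarity are assumptions made for this example; production recommender systems are considerably more elaborate.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, ratings, top=2):
    """Score items the user has not rated by similarity-weighted ratings of others."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Hypothetical ratings: ann's taste matches bob's much more closely than cat's.
ratings = {
    "ann": {"book": 5, "lamp": 1},
    "bob": {"book": 5, "lamp": 1, "desk": 4},
    "cat": {"book": 1, "lamp": 5, "sofa": 5},
}
suggestions = recommend("ann", ratings)
```

Because the recommendation is computed entirely from past ratings, nobody needs to be online at the same time, which is the asynchronous character described above.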

CIS Barriers

Despite all this research, CIS faces many challenges. One of them is information security within the company. What should happen when team members do not have the same security profile, or when someone is not even allowed to share what they have found with others? The systems discussed cannot be built for information seeking alone; they also need to manage security and support situations in which results were not found because of permissions, or in which it is necessary to view a new document created during the cooperation. As if that were not enough, there are various organizational barriers hindering CIS. They fall into four categories: organizational, technical, individual and team. They include things such as organizational culture and structure, multiple un-integrated systems, individual perception and the various conflicts that arise during teamwork. The barriers and their implications are described in detail in the paper “Barriers to Collaborative Information Seeking in Organizations” by Arvind Karunakaran and Madhu Reddy.
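The security problem can be sketched as a filter that decides which results a whole team may see together. This is only one possible policy (share a document only if every team member has access), and the group names, access lists and function names are hypothetical.

```python
def visible_to(user_groups, document_acl):
    """A document is visible if the user belongs to at least one allowed group."""
    return bool(user_groups & document_acl)


def shareable_results(results, team, acls, memberships):
    """Split results into those every team member may see and those withheld."""
    shared, withheld = [], []
    for doc_id in results:
        if all(visible_to(memberships[m], acls[doc_id]) for m in team):
            shared.append(doc_id)
        else:
            withheld.append(doc_id)
    return shared, withheld


# Hypothetical security profiles: eva may read legal documents, tom may not.
memberships = {"eva": {"staff", "legal"}, "tom": {"staff"}}
acls = {"handbook": {"staff"}, "contract": {"legal"}}
shared, withheld = shareable_results(
    ["handbook", "contract"], ["eva", "tom"], acls, memberships
)
```

Keeping the withheld list separate lets a system tell the team that something was found but cannot be shared, instead of silently dropping it.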

Collaborative information seeking is an exciting field of research and one of the current search trends. Another fascinating topic is the adoption of gamification in IR systems. This is going to be the subject of my next article.

Speaking about Search as a Service @ PROMISE Technology Transfer day, want to meet up?

Tomorrow morning I leave Gothenburg to attend the PROMISE Technology Transfer day @ CeBIT 2013 in Hanover, Germany.

The event is a workshop introducing its participants to methodologies for the systematic evaluation and monitoring of search engines, and for discussing future trends and requirements for the next generation of information access systems. In other words, it is right up our alley at Findwise.

As Director of Research at Findwise I will speak about Search as a Service. If you are at the event or just nearby I would be happy to meet up and have a chat.  I will be around from Tuesday March 5 until Thursday March 7. Feel free to email me, henrik.strindberg@findwise.com or give me a call at +46709443905.

Hope to see you there!

SLTC 2012 in retrospect – two cutting-edge components

The 4th Swedish Language Technology Conference (SLTC) was held in Lund on 24-26 October 2012.
It is a biennial event organized by prominent research centres in Sweden.
The conference is, therefore, an excellent venue to exchange ideas with Swedish researchers in the field of Natural Language Processing (NLP), as well as to present one's own research and keep up to date with the state of the art in most areas of Text Analytics (TA).

This year Findwise participated in two tracks – in a workshop and in the main conference.
As the area of Search Analytics (SA) is very important to us, we decided to be proactive and applied to organize a workshop on “Exploratory Query Log Analysis” in connection with the main conference. The application was granted and the workshop was very successful. It gathered researchers who work on SA from very different perspectives, from using deep Machine Learning to discover users’ intent to looking at query logs as a totally new genre. I will follow up on that in another post. All the contributions to the workshop will also be uploaded to our research page.

As for the main conference, we had two papers accepted for presentation. The first one dealt with the topic of document summarization – both single and multidocument summarization
(http://www.slideshare.net/findwise/extractive-document-summarization-an-unsupervised-approach).
The second paper was about detecting Named Entities in Swedish
(http://www.slideshare.net/findwise/identification-of-entities-in-swedish).

These two papers presented de facto state-of-the-art results for Swedish in both document summarization and Named Entity Recognition (NER). For the former task, there is no standard corpus for evaluating summarization systems, there are few previous results, and there are only a few other systems, which made it infeasible to compare our own system with others. Thus, we have contributed two things to research in document summarization: a Swedish corpus based on featured Wikipedia articles to be used for evaluation, and a system based on unsupervised Machine Learning which, by relying on domain boosting, achieves state-of-the-art results for English and Swedish. Our system can be further improved by relying on our enhanced NER and coreference resolution modules.

As for the NER paper, our Entity recognition system for Swedish achieves 74.0% F-score, which is 4% higher than another study presented simultaneously at SLTC (http://www.ling.su.se/english/nlp/tools/stagger). Both systems were evaluated on the same corpus, which is considered a de facto standard for evaluation of different NLP resources for Swedish. The unlabelled score (i.e. no fine-grained division of classes but just entity vs non-entity) of our system achieved 91.3% F-score (93.1% Precision and 89.6% Recall). When identifying people, the Findwise NER system achieves 78.1% Precision and 90.5% Recall (83.9% F-score).
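For readers who want to check how precision, recall and F-score relate, here is a small sketch. It assumes the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the text above does not state explicitly.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the balanced F1 measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted above, in percent.
unlabelled = f_score(93.1, 89.6)
people = f_score(78.1, 90.5)
```

Recomputing from the rounded precision and recall values gives about 91.3 for the unlabelled score and about 83.8 for people, so the last digit can differ slightly from a figure computed on unrounded counts.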

So, what did we take home from the conference? We were really happy to see that the tools we develop for our customers are not mediocre but of very high quality, and represent the state of the art in Swedish NLP. We actively share our results and our corpora for research purposes. Findwise showed keen interest in cooperating with other researchers in developing better tools and systems in the area of NLP and Text Analytics. And this, I think, is a huge bonus for all our current and prospective customers: we actively follow current trends in the research community and cooperate with researchers, and our products incorporate the latest findings in the field, which lets us offer both high quality and cutting-edge technology.

As we continuously improve our products, we have also released a Polish NER, and work has started on Danish and Norwegian ones. More NLP components will soon be available for demo and testing on our research page.

Presentation: Enterprise Search and Findability in 2013

This was presented on 8 November at the J. Boye 2012 Conference in Aarhus, Denmark, by Kristian Norling.

Presentation Summary

There is a lot of talk about social, big data, cloud, digital workplace and semantic web. But what about search, is there anything interesting happening within enterprise search and findability? Or is enterprise search dead?

In the spring of 2012, we conducted a global survey on Enterprise Search and Findability. The resulting report, based on the survey answers, tells us what the leading practitioners are doing and gives guidance on what you can do to make your organisation’s enterprise search and findability better in 2013.

This presentation will give you a sneak peek into the near future and trends of enterprise search, based on data from the survey and on what the leaders who are satisfied with their search solutions do.

Topics on Enterprise Search

  • Help me! Content overload!
  • The importance of context
  • Digging for gold with search analytics
  • What has trust to do with enterprise search?
  • Social search? Are you serious?
  • Oh, and that mobile thing

The Enterprise Search and Findability Report 2012 is ready

No strategy, no budget, no resources. This is the common scenario for enterprise search and findability in many organisations today. Still, enterprise search is considered a critical success factor in 75% of the organisations that responded to the global survey that ran from March to May this year.

The Enterprise Search and Findability Report 2012 is now ready for download.

The Enterprise Search and Findability Report 2012 shows that 60% of the respondents find it very or moderately hard to find the right information. Only 11% stated that it is fairly easy to search for information, and as few as 3% consider it very easy to find the desired information. This shows that there is still a large untapped potential for any organisation to get great value from investing in enterprise search. For a relatively small investment, preferably in personnel, it is possible to make search a lot better. The survey also reveals that organisations that are very satisfied with their search have a (larger) budget, more resources and work systematically with search analysis.

Figure. What is your primary goal for utilising search technology in your organisation?

The primary goals for using search are to accelerate retrieval of known information sources (91%) and to improve the re-use of content, that is, information and knowledge (72%). This indicates that search within organisations is often used as a discovery tool for what is already known. Looking over the next three years, as many as 77% think that the amount of information in the organisation will increase. This means that every year it will be even more important to be able to find the right information, so enterprise search is still very much needed, as stated in the following great presentations (on video): Why Business Success Depends on Enterprise Search (by Martin White of Intranet Focus) and The Enterprise Search Market – What should be on your radar? (by Alan Pelz-Sharpe of 451 Research).

Download the full report.

Video and results from the Enterprise Search and Findability Survey

More than 200 people, primarily from Europe and North America, responded to the Enterprise Search and Findability survey, providing a unique insight into how search is currently being managed, or rather is not being managed, in the best interest of the organisation. However, to get a deeper understanding of how search is used and managed on a regular basis in an enterprise context, search vendors and integrators have been excluded from this report, resulting in 170 unique responses from 28 countries.

A few findings

The survey has shown that the majority of the respondents find it difficult to find relevant information within the organization. To be more precise, 59.5% of the respondents expressed that it is very or moderately hard to find the right information. Only 11.2% stated that it is fairly easy to search for information, and as few as 2.8% consider it very easy to find the desired information. The ease of finding the right information clearly has a connection with the size of the organization. In organizations with less than 1000 employees, 30.9% of the respondents feel that it is moderately or very hard to find the right information, while the corresponding percentage for organizations with 1001 or more employees is 77.3%.

The larger the organisation, the more information, and the more cumbersome it is to search and use the right information at the right time.

Video

Kristian Norling presents the findings from the global survey on Enterprise Search implementations. Presented at Enterprise Search Summit #ESS12 in New York, Enterprise Search Europe #ESEU in London, #IKS Workshop in Salzburg and Findability Day 2012 #findday12 in Stockholm. The slides are available here.

Presentation: Results from the Enterprise Search and Findability Survey

This is a mashup of the presentations made at Enterprise Search Summit in New York, US on the 15th of May 2012 and at Enterprise Search Europe in London on the 30th of May 2012.

Global results are presented with numbers only; results from Europe and North America are clearly stated as such.

If you are interested in participating in the survey next year, please sign up. All sign-ups will receive this year’s report.

Sign up for the Enterprise Search and Findability Survey 2013!

Architecture of Search Systems and Measuring the Search Effectiveness

Lecture given on 19 April 2012 at the Warsaw University of Technology. This is the 9th lecture in the regular master's course “Introduction to text mining”.


A look at European Conference on Information Retrieval (ECIR) 2012

European Conference on Information Retrieval

The 34th European Conference on Information Retrieval was held on 1-5 April 2012 in the lovely but crowded city of Barcelona, Spain. The core conference attracted over 100 attendees, with a total of 35 accepted full papers, 28 posters and 7 demos being presented. As opposed to the previous year, which had 2 parallel sessions, this year’s conference ran as a single session. The accepted papers covered a diverse range of topics, and were divided into query representation, blog and online-community search, semi-structured retrieval, applications, evaluation, retrieval models, classification, categorisation and clustering, image and video retrieval, and systems efficiency.

The best paper award went to Guido Zuccon, Leif Azzopardi, Dell Zhang and Jun Wang for their work entitled “Top-k Retrieval using Facility Location Analysis”, presented by Leif Azzopardi during the retrieval models session. The authors propose using facility location analysis, taken from the discipline of operations research, to address the top-k retrieval problem of finding “the optimal set of k documents from a number of relevant documents given the user’s query”.

Meanwhile, “Predicting IMDB Movie Ratings using Social Media” by Andrei Oghina, Mathias Breuss, Manos Tsagkias and Maarten de Rijke won the best poster award. With a different goal from the best paper, the authors of the poster experiment with a prediction model for rating movies using a set of qualitative and quantitative features extracted from the stream of two social media channels, YouTube and Twitter. Their findings show that the highest predictive performance is obtained by combining features from both channels, and they propose including other social media channels as future work.

Workshop Days

The conference was preceded by a full day of workshops and tutorials running in parallel. I attended two workshops: Information Retrieval Over Query Sessions (SIR) during the morning and Task-Based and Aggregated Search (TBAS) in the afternoon. The second workshop ended with an interactive discussion. A third, full-day workshop was Searching 4 Fun!.

Industry Day

The last day was the Industry Day, with only 2 papers, plus 5 oral contributions, and around 50 attendees. The talks had a strong focus on opinion mining: four of the six participating companies and institutions presented work on sentiment analysis and opinion mining from social media streams. Jussi Karlgren, from Gavagai, argued that companies can use sentiment analysis of social media to, for example, find reviews or comments about their product or service, analyse their market position and predict price movements. Rianne Kaptein, from Oxyme, backed this up by adding that businesses are interested in what consumers say about their brand, products or campaigns on social media streams. Furthermore, Hugo Zaragoza from Websays identified two basic needs inside a company: a need for help in reading so that someone can act, and a need for help in explaining so that it can convince. It is a very interesting topic indeed, and research in this direction will advance as companies become more aware of the business gains from opinion mining of social media.

Overall, ECIR 2012 was a very inspiring conference. It also seemed a very friendly conference, offering many opportunities to network with fellow attendees. That said, several participants noted that the number of attendees had decreased in comparison with previous years. The workshops and the core conference gave me the impression of a strong focus on young researchers, as many of the accepted contributions had a student as first author and presenter. Running only one session at a time was a good decision in my opinion, as the attendees were not forced to miss presentations. The workshops and tutorials, however, ran in parallel, and although the workshop proceedings will be made freely available, I still feel that I missed something that day. The industry day was very exciting, offering the opportunity to share ideas between academia and industry. However, there were not many presentations, and the topics were not very diverse. I propose that next year Findwise be among the speakers at the Industry track!

ECIR 2013 will be held in Moscow, Russia, on 24-28 March. See you there!