Event-related data – the buzzword at ECIR 2013

One of the major trends at the 35th annual European Conference on Information Retrieval was event-related data. The conference took place between the 24th and 27th of March this year in a snowy Moscow, Russia. It attracted around 300 participants from all over the globe, 3 of them findwizards. While ECIR 2013 provided talks on a large variety of topics from across the field, event-related data was definitely a buzzword.

The keynote speaker opening the second day of the conference was Rutgers University assistant professor and Mahaya Inc. CTO Mor Naaman. In his talk, Mr Naaman let the following image explain why Mahaya Inc. is in business.


The past two papal elections.

The image above clearly shows that the way people act at events has changed considerably in the past few years: nowadays everyone is a reporter, and their stories can be found on social media. Using platforms such as Twitter, Facebook and YouTube as data sources, Naaman’s company creates products which not only extract but also synchronize event coverage. One interesting feature in their latest product is the synchronization of video clips, which lets a user easily switch views when watching video footage of, for example, a concert. An arguably even stronger feature of this use of social media is that news and event footage can reach the world even if no press is present at the scene. Slides from this inspiring talk can be found here.

Another presentation the same day displayed promising results in the task of automatic event detection. Using machine learning algorithms, a team of researchers from Hanover, Germany, has designed a system for detecting and summarizing entity-related events from Wikipedia edit history data. The basic idea is that when a Wikipedia article is edited by a large number of users in a short period of time, this can mark an important event concerning the subject of the article. More information about this research can be found here.
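To make the edit-burst idea concrete, here is a minimal sketch in Python. It is not the researchers' system, which relies on machine learning; it only illustrates the underlying signal using a simple sliding window. The function name, the `(article, user, timestamp)` input format, the window length and the editor threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def detect_edit_bursts(edits, window=timedelta(hours=24), min_editors=3):
    """Return (article, window_start) pairs where at least `min_editors`
    distinct users edited the same article within one `window`.

    `edits` is an iterable of (article, user, timestamp) tuples.
    This input format is hypothetical, not a real Wikipedia API.
    """
    by_article = defaultdict(list)
    for article, user, timestamp in edits:
        by_article[article].append((timestamp, user))

    bursts = []
    for article, events in by_article.items():
        events.sort()  # chronological order
        start = 0
        in_burst = False
        for end in range(len(events)):
            # Move the left edge until the span fits inside one window.
            while events[end][0] - events[start][0] > window:
                start += 1
            editors = {user for _, user in events[start:end + 1]}
            if len(editors) >= min_editors:
                if not in_burst:
                    # Report each burst once, at the start of its window.
                    bursts.append((article, events[start][0]))
                    in_burst = True
            else:
                in_burst = False
    return bursts
```

A real system would also have to filter out bot edits and vandalism reverts, and would probably compare the edit rate against each article's normal activity rather than use a fixed threshold.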

The last day of the conference opened with a presentation from Jimmy Lin of Twitter. His talk centered on the importance of fast real-time indexing in social media platform architecture. One of the strengths of Twitter is presenting the users with information about events as they happen. As an example of this he used the earthquake that hit the eastern USA in 2011. Tweets from locations closer to the epicenter of the earthquake reached Twitter users in New York City before the actual quake did. I have to admit “Twitter, faster than earthquakes” is a pretty good slogan.

So whether it’s using social media data to let people (re)visit events, automatic event detection in openly edited encyclopedias, making sure your indexing is fast enough to let your users cover events as they happen, or something else, event-based data seems to be one of the driving forces in the field of IR at the moment.

European Conference on Information Retrieval

The 34th European Conference on Information Retrieval was held 1-5 April 2012 in the lovely but crowded city of Barcelona, Spain. The core conference attracted over 100 attendees, with a total of 35 accepted full papers, 28 posters, and 7 demos being presented. Unlike the previous year, which had 2 parallel sessions, this year’s conference ran as a single session. The accepted papers covered a diverse range of topics, and were divided into query representation, blog and online-community search, semi-structured retrieval, applications, evaluation, retrieval models, classification, categorisation and clustering, image and video retrieval, and systems efficiency.

The best paper award went to Guido Zuccon, Leif Azzopardi, Dell Zhang and Jun Wang for their work entitled “Top-k Retrieval using Facility Location Analysis”, presented by Leif Azzopardi during the retrieval models session. The authors propose using facility location analysis, taken from the discipline of operations research, to address the top-k retrieval problem of finding “the optimal set of k documents from a number of relevant documents given the user’s query”.
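For readers unfamiliar with facility location, the following Python sketch shows one generic way to turn it into a document selection procedure. It is not the formulation from the paper, whose details are not covered here. It assumes a precomputed document-to-document similarity matrix and uses the textbook greedy heuristic: repeatedly pick the document that most improves how well every relevant document is "served" by its closest selected document. The function name and the matrix input are assumptions for illustration.

```python
def greedy_facility_location(similarity, k):
    """Greedily pick up to k document indices that maximise
    sum_i max_{j in selected} similarity[i][j].

    `similarity` is a square list of lists with non-negative entries;
    similarity[i][j] says how well document j represents document i.
    """
    n = len(similarity)
    selected = []
    # Best similarity of each document to anything selected so far.
    best_cover = [0.0] * n
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal improvement in coverage if document j is added.
            gain = sum(
                max(similarity[i][j] - best_cover[i], 0.0) for i in range(n)
            )
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [
            max(best_cover[i], similarity[i][best_j]) for i in range(n)
        ]
    return selected
```

With two tight clusters of near-duplicate documents and k = 2, this heuristic picks one document from each cluster rather than two from the same one, which is the kind of diversity a facility location view of top-k retrieval is meant to encourage.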

Meanwhile, “Predicting IMDB Movie Ratings using Social Media” by Andrei Oghina, Mathias Breuss, Manos Tsagkias and Maarten de Rijke won the best poster award. With a different goal from the best paper, the authors of the poster experiment with a prediction model for rating movies using a set of qualitative and quantitative features extracted from the stream of two social media channels, YouTube and Twitter. Their findings show that the highest predictive performance is obtained by combining features from both channels, and they propose including other social media channels as future work.

Workshop Days

The conference was preceded by a full day of workshops and tutorials running in parallel. I attended two workshops: Information Retrieval Over Query Sessions (SIR) during the morning and Task-Based and Aggregated Search (TBAS) in the afternoon. The second workshop ended with an interactive discussion. A third, full-day workshop was Searching 4 Fun!.

Industry Day

The last day was the Industry Day, with only 2 papers plus 5 oral contributions, and around 50 attendees. A strong focus of the talks given at the industry day was opinion mining: four of the six participating companies/institutions presented work on sentiment analysis and opinion mining from social media streams. Jussi Karlgren, from Gavagai, argued that companies can use sentiment analysis from social media to, for example, find reviews or comments made about their product or service, analyse their market position, and predict price movements. Rianne Kaptein, from Oxyme, backed this up by adding that businesses are interested in what consumers say about their brand, products or campaigns on social media streams. Furthermore, Hugo Zaragoza from Websays identified two basic needs inside a company: a need for help in reading so that someone can act, and a need for help in explaining so that someone can convince. A very interesting topic indeed, and research in this direction will advance as companies become more aware of the business gains from opinion mining of social media.

Overall, ECIR 2012 was a very inspiring conference. It also seemed a very friendly conference, offering many opportunities to network with fellow attendees. Even so, several participants said that the number of attendees at this year’s conference had decreased in comparison with previous years. The workshops and the core conference gave me the impression that ECIR has a strong focus on young researchers, as many of the accepted contributions had a student as first author and presenter. Running only one session at a time was a good decision in my opinion, as the attendees were not forced to miss presentations. The workshops and tutorials, however, ran in parallel, and although the proceedings of the workshops will be made freely available, I still feel that I missed something that day. The industry day was very exciting, offering the opportunity to share ideas between academia and industry. However, there were not many presentations, and the topics were not very diverse. I propose that next year Findwise will be among the speakers at the Industry track!

ECIR 2013 will be held in Moscow, Russia, between 24-28 March. See you there!