The European Conference on Information Retrieval (ECIR) 2011 took place in Dublin last week, 18-21 April. In this blog post I will highlight some of the papers and talks from the conference that caught my attention, and complement my impressions with what other attendees said.
First, I was intrigued by the session on evaluation for IR, and especially the topic of Crowdsourcing. In my opinion, the paper "A Methodology for Evaluating Aggregated Search Results", which also won the best student paper award, was among the most pedagogically presented ones. It deals with the task of incorporating search results from a number of different sources, called verticals, into Web search results. Using only a small number of human judgements for a given query, the authors present a way to evaluate any possible permutation of verticals in the result presentation. I think this methodology should be adopted in the world of Enterprise search, since it is exactly there that we crawl, index and present information from a number of different sources: Web, databases, fileshares, etc. The prerequisites are minimal and low cost, but the return value, the user experience, seems quite high.
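To make the idea concrete, here is a toy sketch (my own illustration, not the paper's actual metric): given a handful of human relevance judgements per vertical for one query, every permutation of verticals can be scored with a positional discount, so the most relevant verticals are rewarded for appearing early.

```python
from itertools import permutations

# Hypothetical judgements for one query: average graded relevance per vertical.
judgements = {"web": 0.9, "news": 0.6, "images": 0.3, "video": 0.1}

def score(ordering):
    # Discounted cumulative gain over vertical slots: relevance divided by rank.
    return sum(rel / (pos + 1) for pos, rel in
               enumerate(judgements[v] for v in ordering))

# Enumerate all orderings and pick the highest-scoring presentation.
best = max(permutations(judgements), key=score)
print(best)  # -> ('web', 'news', 'images', 'video')
```

With only four judgements we can rank all 24 possible presentations, which is exactly the low-cost, high-value trade-off that appeals to me.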
Amazon Mechanical Turk, or the "Artificial Artificial Intelligence", the marketplace for Crowdsourcing, provides a way to perform evaluation, relevance assessment or any other task requiring human judgements, for a ridiculously small sum of money. Leaving ethical issues aside, two papers at the conference presented ways to utilize this service for IR tasks.
Evgeniy Gabrilovich from Yahoo! Research, who won the Karen Spärck Jones Award for 2010, gave a very interesting keynote talk on Computational Advertising. Until now, it had never struck me how hard advertising in Information Retrieval systems actually is. I liked one of his points on the future of ads: by using product feeds, one can automatically create product descriptions via Text Summarization and Natural Language Generation and index these, thus avoiding bid words.
Another interesting and very pedagogically presented paper was about the gensim package by Radim Řehůřek. I definitely think we can use it in some of our projects. In general, text categorization and IR for social networks were the dominant tracks. In one of the social network tracks, Oscar Täckström presented a neat way of discovering fine-grained sentiment when only coarse-grained supervision is available. It really made me want to try it for any of our customers where sentiment analysis is required.
Thorsten Joachims, the last of the keynote speakers, gave a very inspiring talk on The Value of User Feedback. He put forward the idea of designing retrieval systems for feedback. Instead of just looking at the click logs post factum, one can think of a system which uses click feedback to learn, thus creating a better ranker for a given query and a given user need. Within a single session, we can use click feedback to disambiguate the query and deliver results on the fly that are of immediate benefit to the users.
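A toy illustration of in-session feedback (my own sketch, not Joachims' actual system): after a click, re-rank the remaining results by boosting documents that share title terms with the clicked result, which nudges an ambiguous query toward the sense the user meant.

```python
def rerank(results, clicked_title, boost=1.0):
    """Boost results whose titles overlap with the clicked result's title."""
    clicked_terms = set(clicked_title.lower().split())
    def adjusted(r):
        overlap = len(clicked_terms & set(r["title"].lower().split()))
        return r["score"] + boost * overlap
    return sorted(results, key=adjusted, reverse=True)

# An ambiguous query ("jaguar"): one click on the animal sense promotes it.
results = [
    {"title": "jaguar car review", "score": 1.0},
    {"title": "jaguar animal habitat", "score": 0.9},
]
reranked = rerank(results, clicked_title="jaguar animal facts")
print(reranked[0]["title"])  # -> jaguar animal habitat
```

Even this crude term-overlap signal shows the appeal of the idea: the feedback is collected and exploited inside the very session that produced it.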
I have undoubtedly missed other interesting presentations: with two parallel sessions and several workshops, there was a limit to what I could devour. What surprised me, though, was that there were very few papers from industry. We try to solve exactly the same problems and tackle the same issues as academia. We at Findwise have constantly flagged the huge benefit of good, relevant Metadata for achieving better search performance, which was also touched upon in the paper "Topic Classification in Social Media using Metadata from Hyperlinked Objects".
It was really great to visit Dublin and attend ECIR 2011. It was an inspiring conference, and I do believe that at the next ECIR we at Findwise can be on the podium, sharing our knowledge and hands-on experience in Enterprise search and IR.