Web crawling is the last resort

Data source analysis is one of the crucial parts of an enterprise search deployment project. The quality of search results depends strongly on the quality of the indexed data. For web-based sources, there are two basic ways of reaching the data: internal and external. The internal method reads the data directly from where it is stored, such as a database, files on a filesystem or an API. Depending on requirements, either all documents are read or only those matching some criteria. The external technique reads the rendered HTML via HTTP, the same way human users do. Reaching further documents (so-called content discovery) is achieved by following hyperlinks present in the content or by using a sitemap. This method is called web crawling.
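
Content discovery by following hyperlinks can be sketched in a few lines. This is only a minimal illustration using the Python standard library: the page content and the intranet.example host are made up, and a real crawler would also need fetching, politeness rules and deduplication across pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the URL of the page.
            self.links.append(urljoin(self.base_url, href))


def discover_links(base_url, html):
    """Return same-host links found in `html`, in document order, without duplicates."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    host = urlparse(base_url).netloc
    found = []
    for link in parser.links:
        if urlparse(link).netloc == host and link not in found:
            found.append(link)
    return found


# Hypothetical page content, used only for illustration.
page = (
    '<a href="/news">News</a> '
    '<a href="about.html">About</a> '
    '<a href="http://other.example/">Other</a>'
)
print(discover_links("http://intranet.example/index.html", page))
```

Each discovered link would then be fetched and processed the same way, which is why a single starting URL is enough to begin with.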

Crawling, in contrast to reading the source directly, does not require particular preparation. In a minimal variant, just a starting URL is required and that’s it. Content encoding is detected automatically, and off-the-shelf components extract text from the HTML. Web crawling may appear to be a quick and easy way to collect content to be indexed. But after deeper analysis, it turns out to have multiple serious drawbacks.


Query Completion with Apache Solr

There are plenty of names for this functionality: query completion, suggestions, auto-complete, auto-suggest, word completion, type-ahead and maybe some more. Even if we can point out slight differences between them (suggestions can be based on your indexed documents or on external input such as users’ queries), from a technical point of view it’s all about the same thing: proposing a query to the end user.

Early Google Suggest from 2008. Source: http://www.wpromote.com/blog/4-things-in-08-that-changed-the-face-of-search/

 

Google introduced its suggest feature 8 years ago, in 2008. Users got used to query completion, and nowadays it’s a common feature of all mature search engines, e-commerce platforms and even internal enterprise search solutions.

Suggestions help to navigate users through a web portal, let them discover relevant content and recommend popular phrases (and thus search results). In e-commerce they are even more important, because well-implemented query completion can raise the conversion rate and, ultimately, increase sales revenue. Query completion should never lead to zero results, yet this mistake is made frequently.

And as many names as there are for this feature, there are as many ways to build it. Still, implementing well-working query completion is not a trivial task. Software like Apache Solr doesn’t solve the whole problem. Building auto-suggestions is also about data (what should we present to users?), its quality (e.g. when we want to suggest other users’ queries), the order of suggestions (we got dozens of matches, but we can show only 5; which are the most important?) and design (user experience and similar).

Going back to the technology: query completion can be built in a couple of ways with Apache Solr. You can use mechanisms like facets, terms, the dedicated suggest component or just a regular query (e.g. with the dismax parser).

Take a look at Suggester. It’s very easy to run. You just need to configure searchComponent and requestHandler. Example:

<searchComponent name="suggester" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggester1</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="weightField">popularity</str>
    <str name="suggestAnalyzerFieldType">text</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
  </lst>
  <arr name="components">
    <str>suggester</str>
  </arr>
</requestHandler>

SuggestComponent is a ready-to-use implementation responsible for serving suggestions based on commands and queries. It’s an efficient solution, among other reasons because it works on a structure separate from the main index, and that structure is kept in memory. There are some basic settings, like the field used for autocompletion or the text analysis chain. The lookupImpl parameter defines how terms in the index are matched. There are about 10 algorithms with different purposes. Probably the most popular are:

  • AnalyzingLookupFactory (finds matches based on prefix),
  • FuzzyLookupFactory (finds matches with misspellings),
  • AnalyzingInfixLookupFactory (finds matches anywhere in the text),
  • BlendedInfixLookupFactory (combines matches based on prefix and infix lookup).

You need to choose the one which fulfills your requirements. The second important parameter is dictionaryImpl, which defines how indexed suggestions are stored. Again, you can choose between a couple of implementations, e.g. DocumentDictionaryFactory (stores terms, weights and an optional payload) or HighFrequencyDictionaryFactory (works when very common terms overwhelm others; you can set up a proper threshold).

There are plenty of other settings you can use to customize your suggester. SuggestComponent is a good start and probably covers many cases, but like everything it has limitations, e.g. you can’t easily filter the results.

Example execution:

http://localhost:8983/solr/index/suggest?wt=json&suggest.dictionary=suggester1&suggest.q=lond

suggestions: [
  { term: "london" },
  { term: "londonderry" },
  { term: "londoño" },
  { term: "londoners" },
  { term: "londo" }
]
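
For calling the handler from an application, a small client-side sketch may help. It only builds the request URL and reads terms from an already parsed JSON response. The helper names are made up, the dictionary name is the one configured in the searchComponent above (suggester1), and the nested response shape is an assumption based on the usual SuggestComponent output, so verify it against your Solr version.

```python
from urllib.parse import urlencode


def build_suggest_url(base_url, dictionary, prefix):
    """Build a request URL for the /suggest handler (hypothetical helper)."""
    params = {"wt": "json", "suggest.dictionary": dictionary, "suggest.q": prefix}
    return base_url + "/suggest?" + urlencode(params)


def extract_terms(response, dictionary, prefix):
    """Read suggested terms from a parsed response.

    Assumed shape: response["suggest"][dictionary][prefix]["suggestions"],
    where every suggestion is a dict with a "term" key.
    """
    entry = response.get("suggest", {}).get(dictionary, {}).get(prefix, {})
    return [s["term"] for s in entry.get("suggestions", [])]


url = build_suggest_url("http://localhost:8983/solr/index", "suggester1", "lond")
print(url)

# Hand-written sample in the assumed shape, not real Solr output.
sample = {
    "suggest": {
        "suggester1": {
            "lond": {
                "numFound": 2,
                "suggestions": [
                    {"term": "london", "weight": 10, "payload": ""},
                    {"term": "londonderry", "weight": 3, "payload": ""},
                ],
            }
        }
    }
}
print(extract_terms(sample, "suggester1", "lond"))
```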

Another way to build query completion is to use mechanisms like faceting, terms or highlighting.

An example of query completion (QC) built on facets:

http://localhost:8983/solr/index/select?q=*:*&facet=on&facet.field=title_keyword&facet.mincount=1&facet.contains=lon&rows=0&wt=json

title_keyword: [
  "blonde bombshell", 2,
  "12-pounder long gun", 1,
  "18-pounder long gun", 1,
  "1957 liga española de baloncesto", 1,
  "1958 liga española de baloncesto", 1
]

Please notice that here we have used the facet.contains parameter, so the query also matches in the middle of a phrase. It is a plain substring match rather than a regular expression (facet.contains.ignoreCase makes it case-insensitive). Additionally, the Solr response contains a count for every suggestion.
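
Because facet values come back as a flat list that alternates term and count, a client usually has to pair them up before display. A minimal sketch, assuming the flat layout shown above (the function name is illustrative):

```python
def facet_pairs(flat):
    """Turn a flat [term, count, term, count, ...] list into (term, count) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


# Values copied from the facet response shown above.
title_keyword = [
    "blonde bombshell", 2,
    "12-pounder long gun", 1,
    "18-pounder long gun", 1,
    "1957 liga española de baloncesto", 1,
    "1958 liga española de baloncesto", 1,
]

# Show the most frequent values first; ties keep their original order.
suggestions = sorted(facet_pairs(title_keyword), key=lambda pair: -pair[1])
for term, count in suggestions:
    print(term, count)
```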

TermsComponent (returns indexed terms and the number of documents which contain each term) and highlighting (originally meant to emphasize fragments of documents that match the user’s query) can also be used, as presented below.

Terms example:

<searchComponent name="terms" class="solr.TermsComponent"/>
<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
    <bool name="distrib">false</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>

http://localhost:8983/solr/index/terms?terms.fl=title_general&terms.prefix=lond&terms.sort=index&wt=json

title_general: [
  "londinium",
  "londo",
  "london",
  "london's",
  "londonderry"
]

Highlighting example:

http://localhost:8983/solr/index/select?q=title_ngram:lond&fl=title&hl=true&hl.fl=title&hl.simple.pre=&hl.simple.post=

title_ngram: [
  "londinium",
  "londo",
  "london",
  "london's",
  "londonderry"
]

You can also do auto-complete with a usual full-text query. It has lots of advantages: Lucene scoring works, and you get filtering, boosts, matching across many fields and the whole Lucene/Solr query syntax. Take a look at this eDisMax example:

http://localhost:8983/solr/index/select?q=lond&qf=title_ngram&fl=title&defType=edismax&wt=json

docs: [
  { title: "Londinium" },
  { title: "London" },
  { title: "Darling London" },
  { title: "London Canadians" },
  { title: "Poultry London" }
]

The secret is the analyzer chain, whether you base your solution on facets, a query or SuggestComponent. Depending on the effect you want to achieve with your QC, you need to index the data in the right way. Sometimes you may want to suggest single terms, another time whole sentences or product names. If you want to suggest e.g. letter by letter, you can use the Edge N-Gram Filter. Example:

<fieldType name="text_ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

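An n-gram is a sequence of n items (the size depends on the given range) taken from a given sequence of text; the Edge N-Gram Filter produces only the n-grams anchored at the beginning of a term. Example: the term Findwise with minGramSize = 1 and maxGramSize = 10 will be indexed as: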

F
Fi
Fin
Find
Findw
Findwi
Findwis
Findwise

With text indexed this way, you can easily achieve functionality where the user sees the suggestions change after each letter.
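
To see what the filter produces, the edge n-gram expansion can be imitated in a few lines of Python. This is only an illustration of the idea, not Solr's implementation; the function name and defaults are chosen for this sketch.

```python
def edge_ngrams(term, min_gram=1, max_gram=50):
    """Return the prefixes of `term` whose length lies between min_gram and max_gram."""
    longest = min(max_gram, len(term))
    return [term[:size] for size in range(min_gram, longest + 1)]


grams = edge_ngrams("Findwise", min_gram=1, max_gram=10)
print(grams)

# A typed prefix matches when it equals one of the indexed grams.
print("Find" in grams)
```

Since every prefix is stored at index time, the query side can stay a simple exact term match, which is why the query analyzer above has no n-gram filter.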

Another case is the ability to complete word after word (like Google does). It isn’t trivial, but you can try a shingle structure. Shingles are similar to n-grams, but they work on whole words. Example: Searching is really awesome with minShingleSize = 2 and maxShingleSize = 3 will be indexed as:

Searching is
Searching is really
is really
is really awesome
really awesome
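
The same list can be reproduced with a short Python sketch of word shingling. It is an illustration only: tokenization is a plain whitespace split, single words are left out to match the list above, and the function name is made up.

```python
def shingles(text, min_size=2, max_size=3):
    """Return word n-grams (shingles) of min_size to max_size words, in text order."""
    words = text.split()
    result = []
    for start in range(len(words)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(words):
                result.append(" ".join(words[start:start + size]))
    return result


for shingle in shingles("Searching is really awesome", min_size=2, max_size=3):
    print(shingle)
```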

Example of Shingle Filter:

<fieldType name="text_shingle" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="10" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

What if your users could use QC which supports synonyms? Then they could type e.g. an abbreviation and find the full suggestion (NYC -> New York City, UEFA -> Union Of European Football Associations). It’s easy, just use the Synonym Filter in your text field:

<fieldType name="text_synonym" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
  </analyzer>
</fieldType>

And then just do a query:

http://localhost:8983/solr/index/select?defType=edismax&fl=title&q=nyc&qf=title_synonym&wt=json

docs: [
  { title: "New York City" },
  { title: "New York New York" },
  { title: "Welcome to New York City" },
  { title: "City Club of New York" },
  { title: "New York" }
]

Another very similar example concerns language support and matching suggestions regardless of the form of the terms. It can be especially valuable for languages with rich grammar rules and declension. In the same way as the Synonym Filter is used, we can configure a stemming or lemmatization filter, e.g. for English (remember to apply the language filter at both index and query time), and expand the matching suggestions.

As you can see, there are many ways to run query completion. You need to choose the right mechanism and text analysis based on your own limitations and on what you want to achieve.

There are also other topics connected with preparing a type-ahead solution. You need to consider performance issues, which mostly center on response time and memory consumption. How many requests will QC generate? You can assume at least 3 times more than your regular search service. You can handle the traffic growth by optimizing Solr caches or by installing separate Solr instances dedicated to the suggestion service. If you create n-grams, shingles or similar structures, be aware that your index size will increase. Remember that if you decide to use facets or highlighting to provide suggestions, both of these mechanisms put a heavy load on your CPU.

In my opinion, the most challenging issue is choosing a data source for the query completion mechanism. Should you suggest parts of your documents (like titles, keywords, authors)? Or use NLP algorithms to extract meaningful phrases from your content? Maybe parse search/application logs and use the most popular user queries (be careful: filter out rubbish and normalize user input)? I believe the answer is YES to all. Suggestions should be diversified (to lead your users to a wide range of search resources) and should come from a variety of sources. More than likely, you will have to do some hard work when processing documents; remember that data cleaning is crucial.
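
As an illustration of the log-based source, the sketch below normalizes raw user queries and keeps only those seen often enough. The log entries, the min_count threshold and the function names are assumptions made for this example; real cleaning usually needs more rules (stop lists, profanity filters, length limits).

```python
from collections import Counter


def normalize(query):
    """Lowercase the query and collapse repeated whitespace."""
    return " ".join(query.lower().split())


def popular_queries(log_lines, min_count=2):
    """Count normalized queries and drop rare ones, which are often typos or noise."""
    counts = Counter(normalize(line) for line in log_lines if line.strip())
    return [(query, count) for query, count in counts.most_common() if count >= min_count]


# Hypothetical log entries, used only for illustration.
log = ["London", "london ", "  LONDON hotels", "london hotels", "asdfgh", "london"]
print(popular_queries(log))
```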

Similarly, you need to consider different strategies for ordering the proposed suggestions. It’s good to show them in alphanumeric order (while still respecting scoring!), but you can’t stop there. The specific thing about QC is that the application can return hundreds of matches, while you can present only 5 or 10 of them. That’s why you need to promote the suggestions that occur most often in the index or are the most popular among users. Further enhancements can include personalizing query completion, using geographical coordinates or implementing security trimming (you see only the suggestions you are allowed to see).
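
One possible ordering strategy, sketched under simple assumptions: every match carries a popularity number (for example a document count or a query count), the most popular matches win, and ties fall back to alphabetical order. The data and names below are hypothetical.

```python
def top_suggestions(matches, limit=5):
    """Order (text, popularity) pairs by popularity, then alphabetically, and keep `limit`."""
    ranked = sorted(matches, key=lambda match: (-match[1], match[0]))
    return [text for text, _ in ranked[:limit]]


# Hypothetical matches for the prefix "lond" with made-up popularity values.
matches = [
    ("londonderry", 12),
    ("london", 250),
    ("londo", 3),
    ("londinium", 12),
    ("london's", 40),
    ("londoners", 7),
]
print(top_suggestions(matches, limit=5))
```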

I’m sure that this blog post doesn’t exhaust the subject of building query completion, but I hope I brought the topic closer and showed the complexity of such a task. There are many different dimensions you need to handle, like the data source of your suggestions, choosing the right indexing structure, performance issues, ranking, or even UX and design (how would you like to present hints: as simple text or with graphics/images? Would you like to divide suggestions into categories? Do you always want to show a result page after a suggestion is clicked, or maybe redirect to a particular landing page?).

A search engine like Apache Solr is a tool, but you still need an application with the whole business logic on top of it. Do you want prefix matching and infix matching? Support for typos and synonyms? Suggestions letter by letter or word by word? Security requirements or advanced ranking to propose the best tips to your users? These and even more questions need to be thought through to deliver successful query completion.

How it all began: a brief history of Intranet Search

According to sources, the intranet was born between 1994 and 1996, which is true prehistory from an IT systems point of view. The history of the intranet is bound up with the development of the Internet, the global network. The idea of the WWW, proposed in 1989 by Tim Berners-Lee and others, whose aim was to enable connection and access to many various sources, became the prototype for the first internal networks. The goal of the intranet was to increase employees’ productivity through easier access to documents, their faster circulation and more effective communication. Although access to information was always a crucial matter, the intranet in fact offered many more functionalities, e.g. e-mail, group work support, audio-video communication, and searching of texts or personal data.

Overload of information

Over the years, the content placed on WWW servers became more important than other intranet components. First, managing increasingly complicated software and the required hardware led to the development of new specializations. Second, paradoxically, the ease of publishing information became a source of serious problems. There was too much information; documents were partly outdated, duplicated, and without a homogeneous structure or hierarchy. Difficulties in content management and the lack of people responsible for this process led to a situation in which the end user was not able to reach the desired piece of information, or doing so required too much effort.

Google to the rescue

As early as 1998, Gartner published a document which described this state of the Internet as a “Wild West”. On the Internet, this problem was being solved by Yahoo or Google, which became a global leader in information search. In internal networks, it had to be addressed by rules for publishing information and by CMS and Enterprise Search software. In many organizations the struggle for easier access to information is still ongoing; in others it has only just begun.


And the Search approached

It was the search engine that had the greatest impact on how the intranet is perceived. On the one hand, the search engine is directly responsible for realizing the basic assumptions of knowledge management in the company. On the other, it is the main source of complaints and frustration among users of internal networks. There are many reasons for this status quo: wrong or unreadable search results, missing documents, security problems and poor access to some resources. What are the consequences of such a situation? First and foremost, they can be observed in high labor costs (duplication of tasks, lower quality, wasted time, less efficient cooperation) as well as in lost business opportunities. It must not be forgotten that search engine problems often overshadow the use of the intranet as a whole.

How to measure efficiency?

In 2002, Nielsen Norman Group consultants estimated that the productivity difference between employees using the best and the worst corporate network is about 43%. On the other hand, the annual Enterprise Search and Findability Survey report shows that while almost 60% of companies underline the high importance of information searching for their business, nearly 45% of employees have problems finding information.

Leaving aside the comfort and satisfaction of employees, the natural effect of implementing and improving Enterprise Search solutions is financial benefit. Contrary to popular belief, the return on investment and the savings from reaching information faster can be fully calculated. Preparing such calculations is not easy, though. The first step is to estimate the time employees spend searching for information, to calculate what percentage of searches end in failure and how long it takes to perform a task without the necessary materials. It should be pointed out that findings of companies such as IDC or AIIM show that office workers set aside at least 15-35% of their working hours for searching for necessary information.

Problems with searching are rarely connected with technical issues. The search engines currently on the market are mature products, regardless of the type of technology (commercial or open source). Usually, it is a matter of a default installation and of leaving the system untouched after taking it “out of the box”. Each search engine deployment is different because it deals with different document collections. Another thing is that users’ expectations and business requirements are changing continually. In conclusion, ensuring good-quality search is an unremitting process.
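
The first step of the cost estimate described in this section (time spent searching and the share of searches that fail) can be turned into simple arithmetic. The sketch below is only an illustration: the number of employees, the hourly cost, the yearly hours and the 30% failure rate are invented inputs, and only the 15% search share comes from the lower bound quoted above.

```python
def annual_search_cost(employees, hourly_cost, hours_per_year, search_percent, failure_percent):
    """Estimate the yearly cost of searching and the part spent on failed searches."""
    total = employees * hours_per_year * search_percent * hourly_cost / 100
    wasted = total * failure_percent / 100
    return total, wasted


# Hypothetical company: 100 employees, 1600 working hours a year, cost of 40 per hour.
total, wasted = annual_search_cost(
    employees=100,
    hourly_cost=40,
    hours_per_year=1600,
    search_percent=15,   # lower bound of the 15-35% range mentioned above
    failure_percent=30,  # assumed share of searches that end without a result
)
print(f"Yearly cost of searching: {total:.0f}")
print(f"Of which spent on failed searches: {wasted:.0f}")
```

Replacing the assumed inputs with measured values gives a baseline against which the savings of an improved search solution can be compared.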

Knowledge workers’ main tool?

The intranet has become a comprehensive tool used to accomplish a company’s goals. It supports employees’ commitment and effectiveness, internal communication and knowledge sharing. However, its main task is to find information, which is often hidden in a stack of documents or dispersed among various data sources. Equipped with a search engine, the intranet has become an invaluable working tool in practically all sectors, especially in specific departments such as customer service or administration.

So, how is your company’s access to information?


This text is an introduction to a series of articles dedicated to intranet search. Subsequent articles are intended to deal with: the function of the search engine in an organization, the benefits of using Enterprise Search, requirements for an information search system, the most frequent errors and obstacles in implementations, and systems architecture.

Google Instant – Can a Search Engine Predict What We Want?

On September 8th Google released a new feature for their search engine: Google Instant.
If you haven’t seen it yet, there is an introduction on YouTube that is worth spending 1:41 minutes on.

Simply put, Google Instant is a new way of displaying results and helping users find information faster. As you type, results will be presented in the background. In most cases it is enough to type two or three characters and the results you expect are already right in front of you.


The Swedish site Prisjakt has been using this for years, helping the users to get a better precision in their searches.

At Google you have previously been guided by “query suggestion”, i.e. you got suggestions of what others have searched for before, a function also used by other search engines such as Bing (called Type Ahead). Google Instant takes it one step further.

When looking at what the blog community has to say about the new feature it seems to split the users in two groups; you either hate it or love it.

So, what are the consequences? From an end-user perspective, we will most likely stop typing if something interesting appears that draws our attention. The result?
The search results shown at the very top will generate more traffic, search will be more personalized over time and we will most probably become better at phrasing our queries.

From an advertising perspective, this will most likely affect the way people work with search engine optimization. Some experts, like Steve Rubel, claim that Google Instant will make SEO irrelevant, whereas others, like Matt Cutts, think it will change people’s behavior in a positive way over time and explain why.

What Google is doing is something that they constantly do: change the way we consume information. So what is the next step?

CNN summarizes what Eric Schmidt, the CEO of Google, says:

“The next step of search is doing this automatically. When I walk down the street, I want my smartphone to be doing searches constantly: ‘Did you know … ?’ ‘Did you know … ?’ ‘Did you know … ?’ ‘Did you know … ?’ ”

Schmidt said at the IFA consumer electronics event in Berlin, Germany, this week.

“This notion of autonomous search — to tell me things I didn’t know but am probably interested in — is the next great stage, in my view, of search.”

Do you agree? Can we predict what the users want from search? Is this the sort of functionality that we want to use on the web and behind the firewall?

Findability in Customer Service Search

We have previously introduced Findability by Findwise, involving solutions that make optimal use of search technology to support and strengthen the business of our customers. In a series of blog posts we will present how findability solutions can be deployed within different parts of your organisation. Initially I will focus on how efficient implementation of search technology, by a good customer service search, can improve your customer service offering.

Ultimately, the goal of most customer service interactions is to increase customer satisfaction and thereby improve customer retention in a cost efficient way. In times when the amount of available information increases by the minute, one key success factor is to provide both customer service agents and customers with quick and easy access to relevant information. A findability solution based on state-of-the-art search technology and optimised along the findability dimensions will fuel your customer service search offering in two primary ways:

  1. Improved support to customer service agents
  2. Improved online customer service

Example of customer service search

Improved support to customer service agents

While more traditional customer service interaction solutions tend to be based on a knowledge database that needs to be built and maintained, a Findability solution is more dynamic in nature and is based on a search index created from the already existing data residing in corporate systems. In other words, the solution makes optimal use of existing information and systems to support customer service agents in accessing relevant information. The positive effects are illustrated by the case study below.

Case study: Telecom call centre

Findwise implemented a findability solution at a call centre for a large Swedish mobile operator. The solution introduced the powerful ability to search in the most important information source, which had previously been accessible only via tree-structure navigation.

The graph below presents the result of a test performed by the call centre agents to evaluate the new search function. The test encompassed a number of tasks in which the agents compared using the search functionality to the traditional navigation, in terms of both level of difficulty and time consumption in finding desired information. The graph shows that the agents found the search function very helpful, making the information both easier and less time consuming to find.


The most evident effects of improved support and information access via search technology are:

  • Reduced handling time
  • Higher first time resolution
  • Reduced Tier-2 escalations
  • Increased customer service agent satisfaction
  • Increased agent productivity
  • Less training needed to introduce new agents

In a white paper, Google has also pinpointed, and quantified, the above benefits of implementing a Findability solution in call centre operations, in this case fuelled by the Google Search Appliance (GSA) search platform. For example, Google states that handling time can be reduced by up to 20% on average and that it is possible to save up to 25% on training costs for each new call centre agent. The full article is available here.

Improved online customer service

Naturally a Findability solution can also improve your online customer service offering. Below I have outlined three solution elements that will help drive customer self-service and thereby deflect issues from being forwarded to the customer service organisation.

Improved search functionality

As in the case of agent support, a powerful search functionality that provides relevant information from all required sources in a user-friendly way will increase the ability of customer self-resolution.

Personalised user interface

Using the power of an enterprise search platform, you can dynamically customise the self-service experience to the individual and the incident, to simplify and speed up the process of finding answers.

Dynamic FAQ

Self-service can also be fuelled by providing a relevant and updated FAQ section. The information can be made dynamic and include answers to the most recent questions by using both query log information, i.e. what users are searching for, and call centre comments as input to the FAQs.

For many enterprises, self-service is seen as the solution that can provide customers with the support they need while significantly reducing customer service costs. However, self-service must do more than just cut costs. When customers perceive self-service as simply a means to shift interaction costs onto their shoulders, it can reduce customer satisfaction. Customers need a self-service experience that provides them with higher levels of interaction convenience and information availability, faster issue resolution and more personalised interactions. A Findability solution including the above elements provides that.

The most evident effects of an improved online customer service offering gained from the use of search technology and search analytics are:

  • Fewer incoming calls/e-mails
  • Increased customer satisfaction
  • Increased browser-to-buyer conversion rate
  • Increased knowledge of user interests and behaviour (to fuel additional sales)

Visit our website to learn more about findability solutions that make our customers truly benefit from state-of-the-art search technology.

Search in SharePoint 2010

This week there has been a lot of buzz about Microsoft’s launch of SharePoint 2010 and Office 2010. Since SharePoint 2007 has been the quickest growing server product in the history of Microsoft, the expectations on SharePoint 2010 are tremendous. There are also great expectations for search in SharePoint 2010.

Apart from a great deal of possibilities when it comes to content creation, collaboration and networking, easy business intelligence etc., the launch also holds another promise: that of even better capabilities for search in SharePoint 2010 (with the integration of FAST).

Since Microsoft acquired FAST in 2008, there has been a lot of speculation about what future SharePoint versions may include in terms of search. And since Microsoft announced that they will drop their Linux and UNIX versions in order to focus on higher innovation speed, Microsoft customers are expecting something out of the ordinary. At an early stage it was also clear that Microsoft is eager to take market share in the growing internet business market.

So, simply put, the solutions that Microsoft now provides in terms of search are solutions for Business Productivity (where the truly sophisticated search capabilities are available if you have Enterprise CAL licenses, i.e. you pay for the number of users you have) and Internet Sites (where the pricing is based on the number of servers). These can then be used in a number of scenarios, all dependent on the business and end-user needs.

Microsoft has chosen to describe it like this:

  • “Foundation” is, briefly put, basic SharePoint search (Site Search).
  • “Standard” adds collaboration features to the “Foundation” edition and allows it to tie into repositories outside of SharePoint.
  • “Enterprise” adds a number of capabilities, previously only available through FAST licenses, such as contextual search (recognition of departments, names, geographies etc.), the ability to tag metadata to unstructured content, more scalability etc.

I’m not going to go into detail, rather just conclude that the more Microsoft technology the company or organization already uses, the more benefits it will gain from investing in SharePoint search capabilities.

And just to be clear: non-SharePoint versions (stand-alone) of FAST are still available, even though they are not promoted as intensely as the SharePoint ones.

Apart from Microsoft’s overview above, Microsoft TechNet provides a more in-depth description of the features and functionality from both an end-user and an administrator point of view.

We look forward to describing the features and functions in more detail in our upcoming customer cases. If you have any questions for our SharePoint or FAST search specialists, don’t hesitate to post them here on the blog. We’ll make sure you get all the answers.

Search and Accessibility

Västra Götalands regionen has introduced a new search solution that Findwise created together with Netrelations, where both search and accessibility are important. We have also blogged about it earlier (see How to create better search – VGR leads the way). One important part of the creation of this solution was to create an interface that is accessible to everyone.

Today the web offers access to information and interaction for people around the world. But many sites today have barriers that make it difficult, and sometimes even impossible, for people with different disabilities to navigate and interact with the site. It is important to design for accessibility, so that no one is excluded because of their disabilities.

Web accessibility means that people with disabilities can perceive, understand, navigate, interact with and contribute to the Web. But web accessibility is not only for people who use screen readers, as is often portrayed. It is also for people with poor eyesight who need to increase the text size, and for people with cognitive disabilities. Web accessibility can also benefit people without disabilities, such as those using a slow Internet connection, using a mobile phone to access the web, or having a broken arm. Even such a thing as using a web browser without JavaScript because of company policy can be a disability on the web and should be considered when designing websites.

So how do you build accessible websites?

One of the easiest things is to make sure that the XHTML validates. This means that the code is correct, adheres to the latest standard from the W3C (World Wide Web Consortium) and is semantically correct, i.e. that the different parts of the website use the correct HTML ”tags” in the correct context. For example, the most important heading of a page should be marked up with ”h1” and the second most important with ”h2” (important, among other things, when making websites accessible for people using screen readers).

It is also important that a site can easily be navigated by keyboard only, so that people who cannot use a mouse can still access the site. Here it is important to test in which order the different elements of the web page are selected when using the keyboard to navigate through the page. One thing that is often overlooked is that a site is often inaccessible for people with cognitive disabilities because it contains content that uses complex words, sentences or structure. By making content less complex and more structured, it will be readable for everyone.

Examples from VGR

In the search application at VGR, elements in the interface that use JavaScript will only be shown if the user has a browser with JavaScript enabled. This removes any situations where elements do not do anything because JavaScript is turned off. The interface will still be usable, but you will not get all functionality. The VGR search solution also works well with only the keyboard, and there is a handy link that takes the user directly to the results. This way the user can skip unwanted information and navigation.

How is accessibility related to findability?

http://www.flickr.com/photos/morville/4274260576/in/set-72157623208480316/


Accessibility is important for findability because it is about making search solutions accessible and usable for everyone. The need to find information is not less important if you are blind, if you have a broken arm or if you have dyslexia. If you cannot use a search interface you cannot find the information you need.

“what you find changes who you become” -Peter Morville

In his book Search Patterns, Peter Morville visualizes this in the ”user experience honeycomb”. As can be seen in the picture, accessibility is as much a part of the user experience as usability or findability, and a search solution will be less usable without any of them.