Elastic{ON} 2017 – breaking all the records!

Elastic{ON} 2017 draws 2200 participants to Pier 48 during these somewhat chilly San Francisco days in March. That is close to a 40% increase from the 1600 or so participants last year, in line with the growing interest in the Elastic Stack and Elastic's commercial success.

From Findwise – we are a team of 4 Findwizards, networking, learning and reporting.

Shay Banon, the creator of Elasticsearch and Elastic's CTO, delivers both the opening and closing keynotes. It is apparent that the transition of the CEO role from Steven Schuurman has already started.


2016 in retrospect, with the future in mind

Elastic reached 100 million downloads in 2016 and has managed to land approximately 4000 paying subscription customers out of this installed base to date. Many of the presentations during the conference center on new functionality that is being developed and will be released freely to the open source community. Other functionality goes into the commercial X-Pack subscriptions. Some X-Pack functionality is available for free under the Basic subscription level, which only requires registration.

Most presentations center on search-powered analytics, with fewer on regular free-text search. Elasticsearch and the Elastic Stack have their main use cases within logging, analytics and various applications where they act as a data platform or middle layer, with classic search use cases as a strong sidekick.

A strong focus on analytics

There are 22 sponsors at the event, most of them offering cloud-based monitoring or machine learning services. IBM, the platinum sponsor, is promoting its Bluemix cloud services for cognitive Watson functionality and uses the conference to reach out to the predominantly developer-focused audience.

Prelert was acquired in September last year and is now being integrated into the Elastic Stack as the Machine Learning component, used for unsupervised anomaly detection to provide operational log insights. Together with the new modular Beats architecture and various Kibana improvements, it seems apparent that Elastic is chasing the huge market Splunk currently controls within logging and analytics.

Elasticsearch SQL – giving BI what it needs

Elasticsearch SQL will give the search engine SQL capability, just like Solr got with its parallel SQL interface. Elasticsearch is becoming more and more of a "data platform", and increasingly a competitor to HPE Vertica and Amazon Redshift, as it hits a sweet-spot use case where a combination of fast data loading and extreme scalability is needed and the trade-offs of limited functionality (such as the lack of JOIN operations) are acceptable. With SQL support, the platform can be used with existing visualization tools such as Tableau, and the user base expands since many people in the Business Intelligence sector know SQL by heart.
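The feature was only previewed at the conference, but as a rough illustration, a minimal sketch of what querying Elasticsearch with SQL over REST could look like is shown below; the _sql endpoint, the weblogs index and its fields are assumptions for the sake of the example, not the announced API.

import requests

# Assumed SQL endpoint; adjust to the Elasticsearch version you run
ES_SQL_URL = "http://localhost:9200/_sql?format=txt"

# Hypothetical "weblogs" index with "status" and "bytes" fields
sql = """
    SELECT status, COUNT(*) AS hits, AVG(bytes) AS avg_bytes
    FROM weblogs
    GROUP BY status
    ORDER BY hits DESC
"""

# format=txt returns a plain-text result table, much like a SQL console would
response = requests.post(ES_SQL_URL, json={"query": sql})
print(response.text)

This is the kind of statement a BI tool such as Tableau could issue behind the scenes once SQL support is in place.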

Fast and simple Beats is music to our ears

Beats will become modular in the next release, and more Beats modules will be created, either by Elastic or by the open source and commercial community. This simplifies connectivity to various data sources and adds standardized dashboards per data source, which will increase simplicity and speed of implementation.

Heartbeat is a new Beat (with a beautiful name!) that sends pings to check that services are alive and functioning.

Kibana goes international

Kibana is maturing, with some key updates coming soon: a time series visual builder that gives graphical guidance when building dashboards, Kibana Canvas for custom dynamic reports and slide-show presentations with live data, and a GUI frontend translated into various languages.

There is a new tile service for maps, so instead of relying on external map services, Elastic now has control over the maps functionality. The service can be used free of charge but requires registration (Basic subscription) to use all 18 zoom levels.


 

To conclude, we have had three good days with exciting product news and lots of interesting meetings at what could very well be the biggest show for search and search-driven analytics right now! Be sure to see us at next year's Elastic{ON} again. If not before, see you then!

 

From San Francisco with love,

/Andreas, Christian, Joar and Peter

Digital wizardry for customers & employees – the next elements

A reflection on the Mobile World Congress topics: mobility, digitalisation, IoT, the Fourth Industrial Revolution and sustainability

Commerce has always been conversational; today it is digital. Organisations are asking how to organise clean, effective data for an open digital conversation.

Digitalisation's aim is to answer customer/consumer-centric demands effectively (with relevant and related data) and in an efficient manner. [For the remainder of the article, read consumer and customer interchangeably.]

This essentially means joining the dots between clean data and information, and being able to recognise the main and most consumer-valuable use cases, be it common transaction behaviour or negating their most painful user experiences.

This includes treading the fine line between offering "intelligent" information (intelligent in terms of relevance and context) to the consumer effectively and not seeming freaky or stalker-like. The latter is dealt with by forming a digital conversation in which the consumer understands that their information is only being used for their own needs or wants.

While clean, related data from the many multi-channel customer touch-points forms the basis of an agile digital organisation, it is the combination of significant data analysis insight into user demand and behaviour (clicks, log analysis etc.), machine learning and sensible prediction that forms the basis of artificial intelligence. Artificial intelligence, broken down, is essentially resultant action based on inferences from knowing certain information: the elementary deduction of Dr Watson, but done by computers.

This new digital data basis means being able to take data from what were previously data silos and combine it effectively in a meaningful way, for a valuable purpose. While the tag of Big Data grows weary in a generalised context, the key is picking the data and information that give relevant answers to the most valuable questions or, in consumer speak, get a question answered or a job done effectively.

Digitalisation (and the artificial intelligence that follows it) obviously relies on computer automation, but it still requires some thoughtful human input. Important steps in the move towards digitalisation include:

  • Content and data inventory: the cleansing of data and information;
  • Its architecture (information modelling, content analysis, automatic classification and annotation/tagging);
  • Data analysis in combination with text analysis (or NLP, natural language processing, for the more abundant unstructured data and content), the latter to put flesh on the bone as it were, adding meaning and context;
  • Information governance: the process of being responsible for the collection, proper storage and use of important digital information (now made less ignorable by new citizen-centric data laws (GDPR) and the need for data agility or anonymisation of data);
  • Data/system interoperability: which data formats, structures and standards are most appropriate for you? Which kinds of data store suit which collections (relational databases, linked/graph data, data lakes etc.)?
  • Language/cultural interoperability: letting people with different perspectives access the same information topics using their own terminology;
  • Interoperability for the future also means being able to link everything in your business ecosystem for collaboration, in- and outbound conversations, endless innovation and sustainability;
  • IoT, or the Internet of Things, is making the physical world digital and adding further to the interlinked network, soon to be superseded by the AoT (the Analysis of Things);
  • Newer steps of machine learning (learning about consumer preferences and behaviour etc.) and artificial intelligence (providing seemingly impossibly relevant information, clever decision-making and/or seamless user experiences).

The fusion of technologies continues as the lines between the physical, digital and biological spheres blur, with developments in the immersive Internet such as Augmented Reality (AR) and Virtual Reality (VR).

The next elements are here already: semantic (‘intelligent’) search, virtual assistants, robots, chat bots… with 5G around the corner to move more data, faster.

Progress within mobility paves the way for a more sustainable world for all of us (UN Sustainable Development Goals), with a future based on participation. In emerging markets we are seeing giant leaps in societal change. Rural areas now have access to the vast human resources of knowledge to drive service innovation, e.g. through free access to Wikipedia on cheap mobile devices and open campuses. Gender equality advances with changed monetary and mobile financial practices, and blockchain offers the means to rise to the challenge of interoperability. We have to address the open paradigm (e.g. Open Data) and the participation economy when building the next elements: shared experience and an information commons. This also comes back to the intertwingled digital workplace and the practices for moving into new cloud-based arenas.

Some last remarks on the telecom industry: it is loaded with acronyms and, for laymen in the area, sometimes a maze to navigate and make sense of.

So are these steps straightforward, or is the reality still a potential headache for your organisation? 

Contact Findwise now to ease the process, before your competitor does 😉
/Fredric Landqvist and Peter Voisey

What will happen in the information sector in 2017?

As we look back at 2016, we can say that it has been an exciting and groundbreaking year that has changed how we handle information. Let’s look back at the major developments from 2016 and list key focus areas that will play an important role in 2017.

3 trends from 2016 that lay the basis for 2017

Cloud

There has been a massive shift towards cloud, not only using the cloud for hosting services but building on top of cloud-based services. This has affected all IT projects, especially in the Enterprise Search market, where Google decided to discontinue GSA and replace it with the cloud-based Springboard. More official information on Springboard is still to be published in writing, but reach out to us if you are keen on hearing about the latest developments.

There are clear reasons why search is moving towards the cloud, two of the main ones being machine learning and the sheer amount of data. We have an astonishing amount of information available, and the cloud is simply the best way to handle this overflow. Development in the cloud is faster, the cloud gives practically unlimited processing power, and the latest developments are available in the cloud at an affordable price.

Machine learning

One area that has taken huge steps forward is machine learning. It is nowadays being used in everyday applications. Google wrote a very informative blog post about how they use cloud machine learning in various scenarios. But Google is not alone in this space; today, everyone is doing machine learning. A very welcome development was the formation of the Partnership on AI by Amazon, Google, Facebook, IBM and Microsoft.

We have seen how machine learning helps us in many areas. One good example is health care, where IBM Watson managed to find a rare type of leukemia in 10 minutes. This type of expert assistance is becoming more common. While we know there is still a long way to go before AI becomes smarter than human beings, we are taking leaps forward, as seen when DeepMind's AlphaGo beat a human champion at the complex board game Go.

Internet of Things

Another important area is IoT. In 2016, most IoT projects have, in addition to consumer solutions, touched industry: smart cities, energy utilization and connected cars. Companies have realized that they can now track any physical object, with the benefits of being able to service machines before they break, streamline or build better services, or even create completely new business based on data knowledge. On the consumer side, 2016 showed how IoT has become mainstream, with the unfortunate effect of poorly secured devices being used for massive attacks.

 

3 predictions for key developments happening in 2017

As we move towards the year 2017, we see that these trends from 2016 have positive effects on how information will be handled. We will have even more data and even more effective ways to use it. Here are three predictions for how we will see the information space evolve in 2017.

Insight engine

Our collaboration with computers is changing. For decades, we have given tasks to computers and waited for their answers. This is slowly changing: we are starting to collaborate with computers and even expect them to take the initiative. The developments behind this are in machine learning and human language understanding. We no longer only index information and search it with free text; nowadays, we can build computers that understand information. This information includes everything from IoT data points to human-created documents and data from other AI systems. This enables building an insight engine that can help us formulate the right question, or even give us insight in answer to a question we never asked. This will revolutionize how we handle our information and how we interact with our user interfaces.

We will see virtual private assistants that users will be happy to use and train so that they can help us use information like never before in our daily lives. Google Now, in its current form, is merely the first step towards something like this, proactively bringing information to the user.

Search-driven analytics

The way we use and interact with data is changing. With information collected about pretty much anything, we have almost any information right at our fingertips and need effective ways to learn from it, in real time. In 2017, we will see a shift away from classic BI systems towards search-driven evolutions of them. We already have Kibana dashboards with Timelion, and ThoughtSpot, but these are only the first examples of how search is revolutionizing how we interact with data. Advanced analytics available to anyone within the organization, with answers and predictions delivered directly in graphs and diagrams, is what 2017 insights will be all about.

Conversational UIs

We have seen the rise of chatbots in 2016. In 2017, this trend will also shape how we interact with enterprise systems. A smart conversational user interface builds on the same foundations as an enterprise search platform: it is highly personalized, contextually smart, and builds its answers from information in various systems and in many forms.

Imagine discussing future business focus areas with a machine that challenges our ideas and backs everything with data-based facts. Imagine your enterprise search responding to your query with a question, asking you to specify what you are actually trying to achieve.

 

What are your thoughts on the future development?

How do you see 2017 changing the way we interact with our information? Comment or reach out in other ways to discuss this further, and have a Happy Year 2017!

 

Written by: Ivar Ekman

How to improve search relevance using machine learning and statistics – Apache Solr Learning to Rank

In search, relevance denotes how well a retrieved document or set of documents meets the information need of the user. Natural languages, synonymy, homonymy, term frequency, norms, relevance models, boosting, performance, subjectivity… the reasons why search relevance remains a hard problem are many. This article deals with how machine learning and search statistics can improve relevance, using the Learning to Rank plugin that will be included in an upcoming version of Solr. If you want more information than is provided in this blog post, be sure to visit our website or contact us!

Background

Consider an intranet search solution where users can be divided into two groups: developers and sales.
Search and click statistics are collected, and the following picture illustrates a specific search, performed 569 times by the users, with the click statistics for each document.

Example of a search with click statistics

As you can see, the top search hit, whose score is computed from term frequency, inverse document frequency and field-length norm, is less relevant (got fewer clicks) than documents with lower scores. Instead of manually tweaking the relevance model and the different field boosts, which by tweaking for a specific query will probably decrease the global relevance for other queries, we will try to learn from the users' click statistics to automatically generate a relevance model.

Architecture, features and model

Architecture

Using search and click statistics as training data, the search engine can learn, from input features and with a ranking model, the probability that a document is relevant for a user.

Search architecture with a ranking training model

Features

With the Solr Learning to Rank plugin, features can be defined using standard Solr queries.
For my specific example, I will choose the following features:
– originalScore: the original score computed by Solr
– queryMatchTitle: a boolean stating whether the user query text matches the document title
– queryMatchDescription: a boolean stating whether the user query text matches the document description
– isPowerPoint: a boolean stating whether the document type is PowerPoint

[
   {
     "name": "isPowerPoint",
     "class": "org.apache.solr.ltr.feature.SolrFeature",
     "params": { "fq": ["{!terms f=filetype}.pptx"] }
   },
   {
     "name": "originalScore",
     "class": "org.apache.solr.ltr.feature.OriginalScoreFeature",
     "params": {}
   },
   {
     "name": "queryMatchTitle",
     "class": "org.apache.solr.ltr.feature.SolrFeature",
     "params": { "q": "{!field f=title}${user_query}" }
   },
   {
     "name": "queryMatchDescription",
     "class": "org.apache.solr.ltr.feature.SolrFeature",
     "params": { "q": "{!field f=description}${user_query}" }
   }
]

Training a ranking model

From the click statistics, a training set (X, y), where X is the feature input vector and y is a boolean stating whether the user clicked a document or not, can be generated and used to compute a linear model using regression, which will output the weight of each feature.

Example of statistics:

{
    "q": "i3",
    "docId": "91075403",
    "userId": "507f1f77bcf86cd799439011",
    "clicked": 0,
    "score": 3.43
},
{
    "q": "i3",
    "docId": "82034458",
    "userId": "507f1f77bcf86cd799439011",
    "clicked": 1,
    "score": 3.43
},
{
    "q": "coucou",
    "docId": "1246732",
    "userId": "507f1f77bcf86cd799439011",
    "clicked": 0,
    "score": 3.42
}

Training data generated:
X= (originalScore, queryMatchTitle, queryMatchDescription, isPowerPoint)
y= (clicked)

{
    "originalScore": 3.43,
    "queryMatchTitle": 1,
    "queryMatchDescription": 1,
    "isPowerPoint": 1,
    "clicked": 0
},
{
    "originalScore": 3.43,
    "queryMatchTitle": 0,
    "queryMatchDescription": 1,
    "isPowerPoint": 0,
    "clicked": 1
},
{
    "originalScore": 3.42,
    "queryMatchTitle": 1,
    "queryMatchDescription": 0,
    "isPowerPoint": 1,
    "clicked": 0
}
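The training step itself is not shown here, but as a rough sketch of the idea, the weights could be fitted with an ordinary regression from a library such as scikit-learn; the rows below are the hypothetical examples above, and any regression method that outputs one weight per feature would do.

from sklearn.linear_model import LinearRegression

FEATURES = ["originalScore", "queryMatchTitle", "queryMatchDescription", "isPowerPoint"]

# Training rows generated from the click statistics above
training_rows = [
    {"originalScore": 3.43, "queryMatchTitle": 1, "queryMatchDescription": 1, "isPowerPoint": 1, "clicked": 0},
    {"originalScore": 3.43, "queryMatchTitle": 0, "queryMatchDescription": 1, "isPowerPoint": 0, "clicked": 1},
    {"originalScore": 3.42, "queryMatchTitle": 1, "queryMatchDescription": 0, "isPowerPoint": 1, "clicked": 0},
]

X = [[row[f] for f in FEATURES] for row in training_rows]  # feature vectors
y = [row["clicked"] for row in training_rows]              # click labels

model = LinearRegression().fit(X, y)

# One weight per feature, ready to be plugged into the Solr model definition below
print(dict(zip(FEATURES, model.coef_)))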

Once the training is completed and the different features weight computed, the model can be sent to Solr using the following format:

{
    "class":"org.apache.solr.ltr.model.LinearModel",
    "name":"myModelName",
    "features":[
        { "name": "originalScore"},
        { "name": "queryMatchTitle"},
        { "name": "queryMatchDescription"},
        { "name": "isPowerPoint"},
    ],
    "params":{
        "weights": {
            "originalScore": 0.6783,
            "queryMatchTitle": 0.4833,
            "queryMatchDescription": 0.7844,
            "isPowerPoint": 0.321
      }
    }
}

Run a Rerank Query

The Solr LTR plugin makes it easy to apply the re-rank model to the result list by adding rq={!ltr model=myModelName reRankDocs=25} to the query.
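As a minimal sketch, such a re-ranked query could be issued like this from Python; the core name, field list and the use of efi.user_query (external feature information feeding the ${user_query} placeholder in the feature definitions) are assumptions for illustration.

import requests

# Hypothetical Solr core name
SOLR_SELECT_URL = "http://localhost:8983/solr/intranet/select"

def rerank_search(user_query, model="myModelName", rows=10):
    """Run a query and let the LTR plugin re-rank the top documents."""
    params = {
        "q": user_query,
        "rows": rows,
        "fl": "id,title,score",
        # Re-rank the top 25 documents with the uploaded model; efi.* passes the
        # raw user query to features such as queryMatchTitle and queryMatchDescription.
        "rq": f"{{!ltr model={model} reRankDocs=25 efi.user_query='{user_query}'}}",
    }
    return requests.get(SOLR_SELECT_URL, params=params).json()

print(rerank_search("i3"))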

Personalization of the search result

If your statistics data includes information about users, specific re-rank models can be trained for different user groups.
In my example, I trained one model for the developer group and one for the sales representatives.

Dev model:

{
    "class":"org.apache.solr.ltr.model.LinearModel",
    "name":"devModel",
    "features":[
        { "name": "originalScore"},
        { "name": "queryMatchTitle"},
        { "name": "queryMatchDescription"},
        { "name": "isPowerPoint"},
    ],
    "params":{
        "weights": {
            "originalScore": 0.6421,
            "queryMatchTitle": 0.4561,
            "queryMatchDescription": 0.5124,
            "isPowerPoint": 0.017
      }
    }
}

Sales model:

{
    "class":"org.apache.solr.ltr.model.LinearModel",
    "name":"salesModel",
    "features":[
        { "name": "originalScore"},
        { "name": "queryMatchTitle"},
        { "name": "queryMatchDescription"},
        { "name": "isPowerPoint"},
    ],
    "params":{
        "weights": {
            "originalScore": 0.712,
            "queryMatchTitle": 0.582,
            "queryMatchDescription": 0.243,
            "isPowerPoint": 0.623
      }
    }
}

From the statistics data, the system learnt that a PowerPoint document is more relevant for a sales representative than for a developer.

Developer search

Developer search with re-ranking

Sales representative search

Sales representative search with re-ranking

To conclude, with a search system continuously trained on a flow of statistics, not only will the search relevance be more customized and personalized to your users, but it will also adapt automatically as user behavior changes.

If you want more information, have further questions or need help please visit our website or contact us!

The Solr LTR plugin is to be released soon: https://github.com/bloomberg/lucene-solr/tree/master-ltr-plugin-release/solr/contrib/ltr

Swedish language support (natural language processing) for IBM Content Analytics (ICA)

Findwise has now extended the NLP (natural language processing) in ICA to include both support for Swedish PoS tagging and Swedish sentiment analysis.

IBM Content Analytics with Enterprise Search (ICA) has its strength in natural language processing (NLP), which is achieved in the UIMA pipeline. From a Swedish perspective, one concern with ICA has always been its lack of NLP for Swedish. Previously, the Swedish support in ICA consisted only of dictionary-based lemmatization (word: "sprang" -> lemma: "springa"). However, for a number of other languages ICA has also provided part-of-speech (PoS) tagging and sentiment analysis. One of the benefits of the PoS tagger is its ability to disambiguate words that belong to multiple classes (e.g. "run" can be both a noun and a verb) as well as to assign tags to words that are not found in the dictionary. Furthermore, the PoS tagger is crucial when it comes to improving entity extraction, which is important when a deeper understanding of the indexed text is needed.
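ICA's tagger is proprietary, but the disambiguation idea can be illustrated with an open-source tagger such as NLTK; this is purely an illustration of PoS tagging in general, not ICA's API.

import nltk
from nltk import pos_tag, word_tokenize

# Fetch the tokenizer and tagger models on first run
# (newer NLTK versions may name these punkt_tab and averaged_perceptron_tagger_eng)
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

# The same surface form "run" is tagged differently depending on context:
# a verb in the first sentence, a noun in the second.
print(pos_tag(word_tokenize("They run the tests every night")))
print(pos_tag(word_tokenize("The morning run was refreshing")))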

Findwise has now extended the NLP in ICA to include both support for Swedish PoS tagging and Swedish sentiment analysis. The two images below show simple examples of the PoS support.

Example where ICA uses NLP to analyse the string "ICA är en produkt som klarar entitetsextrahering" ("ICA is a product capable of entity extraction")
Example where ICA uses NLP to analyse the string "Watson deltog i jeopardy" ("Watson participated in Jeopardy")

The question is: how can this extended functionality be used?

IBM uses ICA and its NLP support together with several of its products. The Jeopardy-playing computer Watson may be the most famous example, even if it is not a real product. Watson used NLP in its UIMA pipeline when analyzing data from sources such as Wikipedia and IMDb.

One product which leverages ICA and its NLP capabilities is Content and Predictive Analytics for Healthcare. This product helps doctors determine which action to take for a patient, given the patient's medical record and symptoms. By also leveraging the predictive analytics from SPSS, it is possible to suggest the next action for the patient.

ICA can also be connected directly to IBM Cognos or SPSS, where ICA is the tool which creates structure from unstructured data. By using the NLP or sentiment analytics in ICA, structured data can be extracted from text documents. This data can then be fed to IBM Cognos, SPSS or non-IBM products such as Splunk.

ICA can also be used on its own as a text miner or a search platform, but in many cases ICA delivers its maximum value together with other products. ICA is a product which helps enrich data by creating structure from unstructured data. The processed data can then be used by other products which normally work with structured data.

Why search and Findability are critical for the customer experience and NPS on websites

To achieve a high NPS (Net Promoter Score), the customer experience (CX) is crucial, and a critical factor behind a positive customer experience is the ease of doing business. For companies that interact with their customers through the web (which ought to be almost every company these days), this of course implies a need for good Findability and search on the website, so that visitors can find what they are looking for without effort.

The concept of NPS was created by Fred Reichheld and his colleagues at Bain and Co, who increasingly recognized that measuring customer satisfaction on its own wasn't enough to draw conclusions about customer loyalty. After some research together with Satmetrix, they came up with a single question that they deemed the only relevant one for predicting business success: "How likely are you to recommend company X to a friend or colleague?" Depending on the answer to that single question, on a scale of 0 to 10, the respondent is considered one of the following:

  • Promoters (score 9–10): loyal enthusiasts who will keep buying and refer others
  • Passives (score 7–8): satisfied but unenthusiastic customers
  • Detractors (score 0–6): unhappy customers who can damage the brand through negative word of mouth

The Net Promoter Score model

The idea is that Promoters, the loyal, enthusiastic customers who love doing business with you, are worth far more to your company than Passives or Detractors. To obtain the actual NPS score, the percentage of Detractors is deducted from the percentage of Promoters.
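As a minimal worked example of that calculation (the survey responses below are made up):

def net_promoter_score(ratings):
    """NPS = % promoters (9-10) minus % detractors (0-6), from 0-10 ratings."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)

# 5 promoters, 3 passives and 2 detractors out of 10 responses gives an NPS of 30
print(net_promoter_score([10, 9, 9, 10, 9, 8, 7, 8, 5, 3]))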

How the customer experience drives NPS

Several studies indicate four main drivers behind NPS:

  • Brand relationship
  • Experience of / satisfaction with product offerings (features; relevance; pricing)
  • Ease of doing business (simplicity; efficiency; reliability)
  • Touch point experience (the degree of warmth and understanding conveyed by front-line employees)

According to 'voice of the customer' research conducted by the British customer experience consultancy Cape Consulting, the ease of doing business and the touch point experience account for 60% of the Net Promoter Score, with some variation between industry sectors. Both factors are directly correlated with how easy it is for customers to find what they are looking for on the web and how easily front-line employees can find the right information to help and guide the customer.

Successful companies devote much attention to the user experience on their websites, but when trying to figure out how most visitors will behave, website owners tend to overlook the search function. Hence, visitors who are unfamiliar with the design struggle to find the product or information they are looking for, causing unnecessary frustration; quite possibly the customer or potential customer runs out of patience with the company.

Ideally, Findability on a company website or ecommerce site is a state where desired content is displayed immediately, without any effort at all. Product recommendations based on the behavior of previous visitors are one example, but they have limitations and require a large set of data to be accurate. When a visitor has a very specific query, a long-tail search, accuracy becomes even more important because there is no such thing as a close-enough answer. Imagine a visitor to a logistics company website looking for information about delivery times from one city to another, an ecommerce site where the visitor has found the right product but wants to know the company's return policy before making a purchase, or a visitor to a hospital's website looking for contact details for a specific department. Examples like these are situations where there is only one correct answer, and failure to deliver that answer in a simple and reliable manner will negatively impact the customer experience and probably create a frustrated visitor who might leave the site and look at the competition instead.

Investing in search has positive impacts on NPS and the bottom line

Google has taught people how to search and what to expect from a search function. Step one is to create a user-friendly search function on your website, but then you must actively maintain the master data, business rules, relevance models and zero-result hits to make sure the customer experience stays aligned. Also, take a look at the keywords and phrases your visitors use when searching. This is useful business intelligence about your customers, and it can also indicate what type of information you should highlight on your website. Achieving good Findability on your website requires more than just the right technology and modern website design. It is an ongoing process that, successfully managed, can have a huge impact on the customer experience and your NPS, which means your investment in search will generate positive results on your bottom line.
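As a minimal sketch of the kind of search analytics mentioned above, assuming a hypothetical CSV export of the search log with query and num_results columns, the most frequent zero-result queries can be surfaced like this:

import csv
from collections import Counter

zero_hits = Counter()
with open("search_log.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        if int(row["num_results"]) == 0:
            zero_hits[row["query"].strip().lower()] += 1

# Frequent zero-result queries are good candidates for new synonyms,
# business rules or content to highlight on the site.
for query, count in zero_hits.most_common(10):
    print(f"{count:5d}  {query}")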

More posts on this topic will follow.

/Olof Belfrage

Predictive Analytics World 2012

At the end of November 2012, top predictive analytics experts, practitioners, authors and business thought leaders met in London at the Predictive Analytics World conference. The intimate nature of the conference, combined with the great variety of experience brought by over 60 attendees and speakers, made for a unique opportunity to dive into the topic from a Findwise perspective.

Dive into Big Data

In the opening keynote, presented by programme chairman Dr Geert Verstraeten, we heard about ways to increase the impact of predictive analytics. Unsurprisingly, a lot of the buzz is about embracing Big Data. As analysts have more and more data to process, their need for new tools is obvious. But business will cherish Big Data platforms only if it sees the value behind them. Thus, in my opinion, before everything else that has an impact on successful Big Data analytics, we should consider improving business-oriented communication. Even the most valuable data has no value if you can't convince decision makers that it's worth digging into.

But being able to clearly present benefits is not everything. Analysts must strive to create specific indicators and variables that are empirically measurable. Choose the right battles. As Gregory Piatetsky (data mining and predictive analytics expert) said: more data beats better algorithms, but better questions beat more data.

Finally, aim for impact. If you have a call center and want to persuade customers not to cancel your services, then it's not wise just to call everyone. But it might also not be wise to call everyone you predict to have a high risk of leaving. Even if as a result you lose fewer clients, there might be a large group of customers who will leave only because of the call. Such customers may also be predicted. And as you split high-risk clients into "persuadable" ones and "touchy" ones, you are able to fully leverage your analytics potential.

Find it exciting

The greatest thing about Predictive Analytics World 2012 was how diverse the presentations were. Many successful business cases from a large variety of domains and a lot of inspiring speeches make it hard not to get at least a bit excited about predictive analytics.

The cases ranged from banking and financial scenarios to sports training and performance prediction for a rugby team (if you like at least one of baseball, predictive analytics or Brad Pitt, I recommend watching the film Moneyball), not to mention a case study about reducing youth unemployment in England. But there are two particular presentations I would like to say a word about.

The first of them was a case study on predicting investor behavior in the first social media sentiment-based hedge fund, presented by Alexander Farfuła, Chief Data Scientist at MarketPsy Capital LLC. I find it very interesting because it shows how powerful Big Data can be. By using a massive amount of social media data (e.g. Twitter), they managed to predict a lot of global market behavior in certain industries. That is the essence of Big Data: harnessing a large number of small information chunks that are useless alone to get a useful big picture.

The second was presented by Martine George, Head of Marketing Analytics & Research at BNP Paribas Fortis in Belgium. She gave a really great presentation about developing and growing teams of predictive analysts. As the topic is a lively one at Findwise, and probably in every company interested in analytics and Big Data, I was pleased to learn so much and to talk about it in person later on.

Big (Data) Picture

The day after the conference, John Elder from Elder Research led an excellent workshop. What was really nice is that we concentrated on the concepts, not the equations. It was like a semester in one day: a big picture that can be digested into technical knowledge over time. The most valuable general conclusion was twofold:

  • Leverage: an incremental improvement will matter! When your turnover can be counted in millions of dollars, even half a percent of savings means large additional revenue.
  • Low-hanging fruit: there is a lot to gain from what nobody else has tried yet. That includes reaching for new kinds of data (text data, social media data) and daring to make use of it in new, cool ways with tools that weren't there a couple of years ago.

Plateau of Productivity

In conclusion, I would say that predictive analytics has matured into one of the most useful disciplines on the market. As in the famous Gartner Hype Cycle, predictive analytics has reached the Plateau of Productivity. Though often thankless, requiring lots of resources, money and time, it can offer your company a successful future.

Tutorial: Optimising Your Content for Findability

This tutorial was given by Kristian Norling on the 6th of November at the J. Boye 2012 conference in Aarhus, Denmark.

Findability and Your Content

As the amount of content continues to increase, new approaches are required to provide good user experiences. Findability has been introduced as a new term among content strategists and information architects and is most easily explained as:

“A state where all information is findable and an approach to reaching that state.”

Search technology is readily used to make information findable, but as many have realized technology alone is unfortunately not enough. To achieve findability additional activities across several important dimensions such as business, user, information and organisation are needed.

Search engine optimisation is one aspect of findability, and many of the principles from SEO work in an intranet or website search context. This is sometimes called Enterprise Search Engine Optimisation (ESEO). Getting findability to work well for your website or intranet is a difficult task that needs continuous work. It requires stamina, persistence, endurance, patience and of course time and money (resources).

Tutorial Topics

In this tutorial you will take a deep dive into the many aspects of findability, with some good practices on how to improve findability:

  • Enterprise Search Engines vs Web Search
  • Governance
  • Organisation
  • User involvement
  • Optimise content for findability
  • Metadata
  • Search Analytics

Brief Outline

We will start with some very brief theory, then move on to real examples, and also talk about what the organisations that are most satisfied with their findability do.

Experience level

Participants should have some intranet/website experience. A basic understanding of HTML and some previous work with content management will make your tutorial experience even better. It is a bonus if you have done some Search Engine Optimisation (SEO) for public websites.

Analyzing the Voice of Customers with Text Analytics

Understanding what your customers think about your company, your products and your service can be done in many different ways. Today, companies regularly analyze sales statistics and customer surveys, and conduct market analysis. But to get the whole picture of the voice of the customer, we need to consider the information that is not captured in a structured way in databases or questionnaires.

I attended the Text Analytics Summit earlier this year in London and was introduced to several real-life implementations of how text analytics tools and techniques are used to analyze text in different ways. There were applications for text analytics within the pharmaceutical industry, defense and intelligence as well as other industries, but most common at the conference were the case studies within customer analytics.

For a few years now, the social media space has boomed as a platform for all kinds of human interaction and communication, and analyzing the unstructured information found on Twitter and Facebook can give corporations deeper insight into how their customers experience their products and services. But there is also plenty of text-based information within an organization that holds valuable insights about its customers, for instance notes taken in customer service centers and emails sent by customers. By combining social media information with the internally available information, a company can get a more detailed understanding of its customers.

In its most basic form, text analytics tooling can analyze how different products are perceived by different customer groups. With sentiment analysis, a marketing or product development department can understand whether the products are perceived in a positive, negative or just neutral manner. But the analysis could also be combined with other data, such as marketing campaign data, where traditional structured analysis would be combined with the textual analysis.

At the text analytics conference, several exciting solutions were presented, for example a European telecom company that used voice-of-customer analysis to listen in on the customer 'buzz' about their broadband internet services and get early warnings when customers were annoyed with the performance of the service, before they started phoning customer service. This analysis had become a part of the Quality of Service work at the company.

With the emergence of social media, and as more and more communication is done digitally, the tools and techniques for text analytics have improved, and we are now starting to see very real business cases outside the universities. This is very promising for the adoption of text analytics within commercial industries.

Video: Search Analytics in Practice

Search Analytics in Practice from Findwise on Vimeo.

This presentation is about how to use search analytics to improve the search experience. A small investment in time and effort can really improve the search on your intranet or website. You will get practical advice on what metrics to look at and what actions can be taken as a result of the analysis.

The video is in Swedish: "Sökanalys i praktiken" ("Search Analytics in Practice").

The presentation was recorded in Gothenburg on the 4th of May 2012.

The presentation featured in the video:

Search Analytics in Practice
