Big Data is a Big Challenge

Big Data is also a Big Challenge for a number of companies that would like to be ahead of the competition. I think Findwise can help a lot, both with technical expertise in text analytics and search technology and with putting Big Data to use in a business.

During the last days of February I had the pleasure of attending the IDG Big Data conference in Warsaw, Poland. It brought together plenty of people from both vendors and industry who shared interesting insights on the topic. In general, the conference was dominated by big vendors that want to be associated with Big Data. IBM, SAS, SAP, and Teradata provided a great deal of marketing information on their software products and capabilities around Big Data. Interestingly, every single presentation had its own definition of what Big Data is. This is probably because everybody is looking for the definition that best fits their own products.

From my perspective it was very nice to hear that everyone agrees text analytics and search components are of great importance in any Big Data solution. In many analytical applications (both predictive and deductive), and in the analysis of social media at scale, one must use advanced linguistic techniques to retrieve and structure the data streams. This came across especially strongly in the IBM and SAS presentations.

A couple of companies revealed what they have already achieved with so-called Big Data. Orange and T-Mobile presented their approach to extending traditional business intelligence to harness Big Data. They want to go beyond the standard data collected in transaction databases and tap into all the information they have from calls (answered and unanswered), SMS, data transmission logs, etc. Telecom companies consider this kind of information a good source of data about their clients.

But the most interesting sessions were held by companies that openly shared their experience of evolving their Big Data solutions, based mainly on open source software. For example, Adam Kawa from Spotify showed how they built their platform on a Hadoop cluster, growing from a single server to a few hundred today. To me that seems like a good way to grow and to adapt easily to changing business needs and external conditions.

Nasza Klasa, a Polish Facebook competitor, gave a very good presentation on several challenges connected with Big Data solutions, which can serve as a summary of this post:

  1. Lack of legal regulations – Currently there are no clear regulations on how the data may be used and how to make money out of it. This is especially important for social portals, where all our personal information might be used for different kinds of analysis and sold in aggregated or non-aggregated form. But the laws might change soon, and the business would change with them.
  2. Big Data is a bit like research – It is hard to predict the return on investment in Big Data, as it is a novelty but also a very powerful tool. For many who are looking into it the challenge is internal: convincing executives to invest in something that is still rather vague.
  3. Lack of data scientists – Even if there are tools for operating on Big Data, there is a huge shortage of skilled people, the Big Data operators. These are neither IT people nor developers but rather open-minded people with a good mathematical background who are able to understand and find patterns in a constantly growing stream of varied structured and unstructured information.

As I stated at the beginning of this post, Big Data is also a Big Challenge for a number of companies that would like to be ahead of the competition. I truly believe we at Findwise can help a lot in this area: we have both the technical expertise and the experience of putting Big Data to use in a business.

Architecture of Search Systems and Measuring the Search Effectiveness

Lecture given on the 19th of April 2012 at the Warsaw University of Technology. This is the 9th lecture in the regular master's-level course “Introduction to text mining”.
