A Health Care Information Commons Vision: from frozen assets to liquid gold

This is the second post in a series (1), unpacking interoperability in the healthcare system. The basis in this post is semantic and technical interoperability, hence a systemic overview.

The future of health care relies on the improved flow of captured patient health information across the whole care continuum. This means a shared information system linking systems and devices from participating health care organisations while maintaining patient privacy and security standards. Such a realization would not only enhance the clinician and patient experience but also enable faster treatment and better care coordination for patients.

Information Commons is an information system, …, that exists to produce, conserve, and preserve information for current and future generations.

 A seamless and secure hub, heavily-linked, providing point-of-care access to critical patient data and care decision support information for the delivery of timely care, reducing the duplication of tests and procedures.

All in all, this has to be built upon a participatory community paradigm, where clinicians, policy makers and leaders, and patients share a vision to create an interoperable information space – that is sustainable, regardless of previous lock-in mechanisms set by different technical, and semantic standards, vendors and process and policy making.

Healthcare Information Commons

How do we create a interoperability climate?

 Changes for interoperability lie in the development of new pilots with strong collaboration. They are generally more successful where they are based on patient or illness groups, value-orientated, open and scalable. Post requirements phase, iteration based on early adopters’ feedback can identify the need for improvements and enhancements around the relevancy, format and visual display of data and information, the usability of the solution and provide insight into workflow impact. The Information Commons is also a good arena for clinicians to share positive anecdotes from their experiences upon which scalable pilots can be expanded.

Such developed infrastructure and services can also support or be leveraged by other national or regional health initiatives.

Technical Layers of interoperability

Interoperability can cover many layers but at its basis would be an interoperable access layer that integrates and securely shares clinical data from multiple sources giving one point of access. The user interface (GUI) could then provide and display data and information based on stakeholder users and medical/situational context.

Such a layer would have to accommodate and support various data from the distributed system of actors, aligning both to open standards while at the same time being plastic enough in design and instantiation.

Interoperability not only covers the sharing of information but also its usage. This may include added functionality by the EHR vendor themselves or the creation of further value-adding knowledge layers that can take advantage of both structured and (the untapped wealth of) unstructured data within EHRs.

Findwise in its EU funded KConnect project is doing just that. It is currently collecting use case studies from Jönköping (RJI/Qulturum) in order to create a pilot solution for clinicians to take advantage of ‘hidden’ textual data.

Questions of interoperability also lie in the physical user experience of the systems themselves. Should the basic layer provided by EHR vendors be open to include value-added software from other parties, should it be embedded or be made into another GUI? Which ultimately is best for the clinician workflow and the agility of software solutions in supporting new value-based outcomes and reiteration for improvements in efficiency and effectiveness?

Semantic Transformer

The annotations made in the healthcare systems across different domains, all have very similar outset, but lack coherent interoperable mechanism to work smoothly outside the local context. On a international, and national and regional level there should be services that acts as the electric grid to provide society with energy to be used in many contexts. A semantic grid that host controlled vocabularies within the domain, but also share practices and processes. With the use of open standards these could bridge across organisational boundaries and help clean the current messy Healthcare information space.

The healthcare information commons, do not per se have to be one system, but rather an interoperable set of services/systems that share standards to be able to exchange information and data. Very similar to they way Internet and linked data work today –  not restricted by walled gardens. The governance of the commons, should be a matter of public services, with sustainable resources and open governance agenda that can invite participation and engagement. No single actor in the network, be it a large hospital, private caretaker or regional public governing body will be able take care of this single-handedly. It should be a true “commons” undertaking!

The infusion of the Information Commons into everyday healthcare provisioning use cases with semantic transformer applications could be in several modalities: finding and acting upon information or contributing in the local context.

In the data entry or capture point, there will be options to add semantic layers and attributes to the type of content and data provisioned. An easy way to illustrate this, is the emerging use of schema.org templated entities and properties for the MedicalTypes, MedicalConditions, Drugs, Guidelines, Codes from controlled vocabularies like SnoMedCT, Mesh, ICD10 and the like.

 Analogously using digital cameras from smartphones or other devices, means that the user might add “some” metadata or tags about the picture. Devices and sensors add more layers of granularity with attributes that most end-users, never see or bother about. These extra resource descriptions, will interplay with cloud based services as Google Photos – where different algorithms reformat, package the content into new forms, as contextual albums, scenes and so forth.

 A set of semantic transformer application layers should be intertwingled with the Healthcare Information Commons. Firstly to make easy linkages between data sets – as the Web of Data scenarios and Linked Data propose –  but also to  provide smarter integration points in back-end supporting processes in the Healthcare systems where more private and locked-in data-sets exist about the patient conditions, treatments and drugs etc.

 The semantic transformer applications could both be open api:s developed by the community for the commons, but also could be commercial applications provided by line-of-business specialist software vendors. As long as all of these layers, are compliant with the open standards!

For such legacy systems as EHR , and off-the-shelf healthcare applications and business applications that are semantically impaired, these semantic transformer applications could work as a repair-kit for already old broken systems. Consequently there would be no need to overhaul all legacy software within the caretaker’s organisation. A kind of smoother migration path to interoperability.

There also exists the need for semantic interoperability between the contextual patient information within the EHR and the provision of clinical decision support information. This could be in the form of internal medical guidelines and best practices, or from external resources such as medical journals or clinical trial reports.

The KConnect project are providing semantic annotation and semantic search services in different languages for clinicians and researchers to access the very latest in medical literature. This is achievable by semantically annotating required medical information (EHRs, guidelines, journals etc) and having the semantic search engine take full advantage of known key medical entities/concepts and their relationships.

Through the indexing of new information about drug usage, best practices, guidelines, new clinical trials and journals, clinicians then access up-to-date relevant information whenever they need.

In the near future to maximise both clinician and patient user engagement with EHRs, different uses and views of the EHR will have to be driven by suitable context and stakeholder semantics.

Shared Decision making

When moving into valued-based health care and outcome measurement, (as presented here by Sveus), it is critical that all actors participate on a connected level field, so that communication between healthcare practitioners and patients and their social networks works.  This includes the need for shared norms and definitions as well as systems to support the decision making – and obviously a harmonised set of metrics to measure outcomes.

As presented by Peter Ubel, in his talks and recent book on Critical Decisions, it is key that we are able to share a common view between the clinician and the patient. All practitioners share jargon that do not always communicate well to the receiver. Hence there are plenty of communication breakdowns recorded in the everyday practices, leading to “malpractice” in the worst cases for the patient. In the last couple of decades, there has been a shift in power relations between healthcare professionals and patients and their families. Patient empowerment is a good thing, but if things get lost in translation, there is the risk that critical decisions are not fully supported.

With a Healthcare Information Commons pool of resources, there lays opportunities to guide patients and practitioners in their critical decision making. But also to strengthen the learning and innovation within the communities of practice, with open feedback loops to the pool.

Privacy & Security upfront

Just as data interoperability can be seen as the sharing of data, data security can be seen as the sharing of data in the right way and data privacy seen as the sharing of data with the right person in the right way. We are naturally concerned as to who may be using our data and want to be able to control its use.

The boundary between citizens’ App data and their medical data is blurring rapidly as App developments and sensors continue to provide new and different data that the individual, health care and clinical research can capitalise on in the effort to move towards better wellbeing and more value-based healthcare.

While data privacy and security have become the headline darlings of the media, they can often be distractors of innovation, often masking the true benefits of the flow of information. Just as with physical assets there are best practices for data misuse prevention, protection and policing. The majority of misuse or abuse of personal data is more often caused by human error and misjudgement than by the failure of technology.

Data interoperability can be better supported when services have clear guidelines to inform citizens as to who, when and how their data is shared, for what purpose and the available steps to alter said process. A better informed public would then see more free data resources being used for clinical research e.g. the Million Hearts initiative in the US where citizen data is being used to lower heart attacks and strokes.

Open regulations, collaboration and co-ordination along with risk assessment and protection practices such as encryption, anonymisation and de-identification, all can go a long way to allowing secure data interoperability, be it personal or aggregated data. IT has the potential too of rule-based access and forensic data access reports. No system can be made fool-proof, however precautions and the presence of well-designed data breach response plan are achievable.

Obviously we do not want all our healthcare records to be open in the air for anybody to use or read, as little as we want our financial records to be in the open. Privacy is really key! The means with the Information Commons should work with aggregated data. Not the singular set of records for one patient.

Patient security derives the need to a more free flow of data between actor systems. The medical conditions and contexts sets the standards for sharing, where extracts or segments should be possible to share aligned with privacy policies.

Future real-life experience exposé

Having a recent Swedish report on diabetes care and outcome measurement in mind. It makes sense, to illustrate the case of a diabetes patient living and acting in Göteborg, West of Sweden. They have a medical condition, being a lifelong journey with an endocrine system out of order. This has a great impact on the patient’s everyday life, and diabetes related complications. With good life balance to training, exercise and eating habits, it is possible to keep the glucose patterns in such a way that your life expectancy will equal to anybody else.

The use of personal choices to trigger improved behavior, gives the person options to chose selected wellbeing (e.g. Weight Watchers), fitness (e.g. Runkeeper) and health monitoring applications. In most cases these are closed down ecosystems, e.g. iOS included Health app, with options to share in social-media (about your progress, in terms of eating well, or improve your personal training). Many Life Science corporations are developing medical condition / disease area / treatment specific Health monitoring applications (e.g. FreeStyle Libre from Abbot for improving Glucose Monitoring) that clinicians recommend during patient consultations.

For clinical researchers there are ecosystem specific toolkits, like the open-sourced Apple Research Kit.  The existence of a closed ecosystem naturally makes it more problematic to share and exchange data. In this space a Open Standards based on the idea Information Commons makes sense too – where semantic translators could improve the transmission of data from one closed ecosystem to another, without privacy infringement.

A Personal Health Record (PHR) , is a health record where health data and information related to the care of a patient is maintained by the patient

In a future more seamlessly interoperable world, the citizen / patient should be provided one-secure-access point to his/hers health account, e.g. in Sweden 1177 and Mina Vårdkontakter and Hälsa för mig.

The outstanding question: How to get interoperability between PHR and Wellbeing, Fitness and Health apps where it is easy to share vital data bits in a sound manner?

In this scene, open standards should be applied to create a make-do semantic transformation.

Lastly – interoperability within the Professional Clinician Workplace?

The statements and real-life stories from the trenches in any clinical workplace, show a mess of supporting information systems. EHRs that by no means either cooperate or interoperate. Many clinicians realise that they have to do data provision into a handful of systems with significant double manual workload. This comes with risks, given the stressful environment, and many “malpractice” incidents can arise from this workplace disorder.

Each system support its part of the process. While some software suites try to close-down into one-system to ‘rule them all paradigm,’ they still barely lean upon any open standards and they lack semantic and structured ways for the use of data and information outside of the supporting system’s narrow scope.

 A diabetes nurse (post patient consultation) has to enter data into more than 10 different areas, including quality assurance and measurement systems e.g. NDR in Sweden. In some cases there have been integrated point-to-point solutions put in place, but mostly this is not the case and so unnecessary frustration is created.

In every intervention where clinicians and patients communicate, regardless of it being online, remote, on-site, there should be opportunities to tap into the Healthcare Information Commons space. With the potential to find recent new medical treatments, emerging standards/guidelines, breaking news for clinicians as well as patient-oriented and formatted communications. In the best of worlds, semantic translator applications will bridge between ecosystems inside the personal health space as well as into the workplace environment for clinicians – helping, guiding and improving all dimensions of interoperability.

Concluding remarks

Having value-based Healthcare and Outcome Measurement domain as a specific health care change driver, will push the use of standards on all levels to the limit. In the following blog post in this series, the ambition is to unpack information governance, since the data ownership and trust also have to be ironed out. And as stated by Prof Michael E. Porter, the capture of data to do proper Outcome Measurement is one of the major road-blocks ahead. The orchestration of all resources and governance still have to be unfolded. Happily some building blocks to the Healthcare Information Commons have emerged, so we do not need to reinvent the wheel:

  • Wikimedia realm “commons“- with all entries of semantic useful data in wikidata.org
  • Standard Sets for Medical Conditions by international collaboration at ICHOM, and in Sweden Sveus. Standards from Hl7 FHIR, W3C and Web of Data / Semantic Web. The Swedish National Board of Health and Welfare, have an embroic information structure (not in semantic machine readible, RDF, format). Information intermediaries like Google have settle for simple schemas for health and medicin.
  • Open Innovation, and the “open” paradigm, will change evidence based medicine, Bad Pharma and Science on a sociatal level, as stated by Ben Goldacre (TED) where we as patient together with clinicians are able to question treatments based on open data, and improve quality to Healthcare Information Commons.
  • The technology stack with smarter devices, sensors and things, along with Internet anywhere with cognitive computing and computational knowledge on-top of the commons will bring forward semantic translators.
  • New leaps in collaborative work and development with the use of the notebook theme, language and platform agnostic ways.

Making sense, defrosting health data into liguid gold improving healthcare for all.

For more information on Findwise research, please visit KConnect and Orios (Open Standards)


View Fredric Landqvist's LinkedIn profileFredric Landqvist research blog
View Peter Voisey's LinkedIn profilePeter Voisey

Wagon Trains to the Cloud

This is the first post in a series(2, 3, 4, 5, 6, 7) on the challenges organisations face when they move from having online content and tools hosted firmly on their estate to renting space in the cloud.  We will help you to consider the options and guide you on the steps you need to take.

In this first post we show you  the most common challenges that you are likely to face and how you may overcome these.

A fast migration path, to become tenants in a cloud apartment housing unfolds a set of business critical issues that have to be mitigated:

  • Wayfinding in a maze of content buckets and social habitats.
  • Emerging digital Ghost Towns due to lack of information governance.
  • Digital Landfills without organising principles for information and data.
  • Digital Litter with little or no governance or principles for ownership, with redundant, outdated and trivial (ROT) content.
  • With no strategy or plan, erodes any possibility to positive business outcome from moving to the clouds.

WagonTrn.jpg
WagonTrn” by Tillman at en.wikipedia – Transferred from en.wikipedia by SreeBot. Licensed under Public domain via Wikimedia Commons.

The way forward is to settle a sustainable information architecture, that supports an information environment in constant flux. With information and data interoperable on any platform, everywhere, anytime and on any device.

You need to show how everything is managed and everyone fits together.  A governance framework can help do this.  It can show who is responsible for the intranet, what their responsibilities are and fit with the strategy and plan.  Making it available to everyone on the intranet helps their understanding of how it is managed and supports the business.

The main point is to have a governance framework and information architecture with the same scope to avoid gaps in content being managed or not being found.

Both need to be in harmony and included in any digital strategy.  This avoids competing information architectures and governance frameworks being created by different people that causes people to have inconsistent experiences not finding that they need and using alternative, less efficient, ways in future to find what they need to help with their work.

Background

Building huts, houses and villages is an emerging social construction. As humans we coordinate our common resources, tools and practices. A habitat populated by people needs housekeeping rules with available resources for cooking, cleaning, social life and so on. Routines that defines who does what task and by when in order to keep everything ok.

A framework with governing principles that set out roles and responsibilities along with standards that set out the expected level of quality and quantity of each task that everyone is engaged and complies with, is similar to how the best intranets and digital workplaces are managed.

In the early stages with a small number of habitats the rules for coordination are pretty simple, both for shared resources between the groups and pathways to connect them. The bigger a village gets, it taxes the new structures to keep things smooth. When we move ahead into mega cities with 20+ million people living close, it boils down to a general overarching plan and common infrastructures, but you also need local networked communities, in order to find feasible solutions for living together.

Like villages and mega cities there is a need for consistency that helps everyone to work and live together.  Whenever you go out you know that there are pavements to walk on, roads for driving, traffic lights that we stop at when they turn red and signs to help us show the easiest way to get to our destination.

Sustainable architecture and governance creates a consistent user experience. A well structured information architecture that is aligned with a clear governance framework sets out roles and responsibilities. Publishing standards based on business needs that supports the publishers follow them. This means wherever content is published, whether it is accredited or collaborative, it will appear to be consistent to people and located where they expect it to be.  This encourages a normal way to move through a digital environment with recognizable headings and consistently placed search and other features.

This allegori, fits like a glove when moving into large enterprise-wide shared spaces for collaboration. Whether it is cloud based, on-premises or a mix thereof. The social constructions and constraints still remain the same. As an IT-services on tap, cloud, has certainly constraints for a flexible and adjustable habitual construction to be able to host as many similar habitats as possible. But offers a key solutions to instantly move into! Tenants share the same apartment building (Sharepoint online).

When the set of habitats grow, navigation in this maze becomes a hazard for most of us. Wayfinding in a digital mega city, is extremely difficult. To a large extent, enterprises moving into collaboration suites suffer from the same stigma. Regardless if it is SharePoint, IBM Connections, Google Apps for Work, or a similar setting. It is not a discussion of which type of house to choose, but rather which architecture and plan that work in the emerging environment.

Information Architecture for Digital Habitats

If one leans upon linked-data,  linked-open-data, and emerging semantic web and web of data standards, there are a set of very simple guidelines that one should adhere to when building a Digital Village or Mega City. The 5 stars, our beacon of light!

All collections and shared spaces, should have persistent URI:s, which is the fourth star in the ladder. When it comes to the third star of non-proprietary formats it obviously becomes a bit tricky, since i.e. MS Sharepoint and MS Office like to encourage their own format to things. But if one add resource descriptions to collections and artifacts using Dublin Core elements, it will be possible to connect different types of matter. With feasible and standardised resource descriptions it will be possible to add schemas and structures, that can tell us a little bit more about the artifacts or collection thereof. Hence the option to adhere to the second star. The first star, will inside the corporate setting become key to connect different business units, areas with open licenses and with restrictions to internal use only and in some cases open for other external parties.

Linking data-sets, that is collections or habitats, with different artifacts is the fifth star. This is where it all starts to make sense, enabling a connected digital workplace. Building a city plan, with pathways, traffic signals and rules, highways, roads, neighborhoods  and infrastructural services and more. In other words, placemaking!

Placemaking is a multi-faceted approach to the planning, design and management of public spaces. Placemaking capitalizes on a local community’s assets, inspiration, and potential, with the intention of creating public spaces that promote people’s health, happiness, and well being.

We will cover more about how this applies to Office 365 and SharePoint in our next post.

Please join our Live Stream on YouTube the 20th November 8.30AM – 10AM Central European Time
View Fredric Landqvist's LinkedIn profileFredric Landqvist research blog
View Mark Morrell's LinkedIn profileMark Morell intranet-pioneer

Enterprise-Linked-Data and the Connected Digital Workplace

The emerging hyper-connected and agile enterprises of today are stigmatised by their IS/IT-legacy, so the question is: Will emerging web and semantic technologies and practices undo this stigma?

The Shift

Semantic Technologies and Linked-Open-Data (LOD) have evolved since Tim Berners-Lee introduced their basic concepts, and they are now part of everyday business on the Internet, thanks mainly due to their uptake by information and data-run companies like Google, social networks like Facebook and large content sites, like Wikipedia. The enterprise information landscape is ready to be enhanced by semantic web, to increase findability and usability. This change will enable a more agile digital workplace where members of staff can use cloud based services, anywhere, anytime on any device, in combination with the set of legacy systems backing their line-of-business. All in all, more efficient organising principles for information and data.

The Corporate Information Landscape of today

In everyday workplace we use digital tools to cope with the tasks at hand. These tools have been set into action to address meta models to structure the social life dealing with information and data. The legacy of more than 60 years of digital records keeping, has left us in an extremely complex environment, where most end-users have a multitude of spaces where they are supposed to contribute. In many cases their information environment lacks interoperability.

A good, or rather bad example of this, is the electronic health records (EHR) of a hospital, where several different health professionals try to codify their on-going work in order to make better informed decisions regarding the different medical treatments. While this is a good thing, it is heavily hampered with closed-down silos of data that do not work in conjunction with the new more agile work practices. It is not uncommon to have more than 20 different information systems employed to do provisioning during a workday.

The information systems architecture, in any organisation or enterprise, may comprise of home-grown legacy systems from the past, or bought off-the-shelf software suites and extremely complex enterprise-wide information systems like ERP, BI, CRM and the like. The connections between these information systems (or integration points) often resemble “spaghetti” syndrome, point-to-point. The work practice for many IT professionals is to map this landscape of connections and information flows, using for example Enterprise Architecture models. Many organisations use information integration engines, like enterprise-service-bus applications, or master data applications, as means to decouple the tight integration and get away from the proprietary software lock-in.

On top of all these schema-based, structured data, information systems, lies the social and collaborative layer of services, with things like intranet (web based applications), document management, enterprise wide social networks (e.g. Yammer) and collaborative platforms (e.g SharePoint) and more obviously e-mail, instant messaging and voice/video meeting applications. All of these platforms and spaces where one  carries out work tasks, have either semi-structured (document management) or unstructured data.

Wayfinding

A matter of survival in the enterprise information environment, requires a large dose of endurance, and skills. Many end-users get lost in their quest to find the relevant data when they should be concentrating on making well-informed decisions. Wayfinding is our in-built adaptive way of coping with the unexpected and dealing with it. Finding different pathways and means to solve the issues. In other words … Findability.

Outside-in and Inside-Out

Today most organisations and enterprises workers act on the edge of the corporate landscape – in network conversations with customers, clients, patients/citizens, partners, or even competitors, often employing means not necessarily hosted inside the corporate walls. On the Internet we see newly emerging technologies become used and adapted at a faster rate and in a more seamless fashion than the existing cumbersome ones of the internal information landscape. So the obvious question raised in all this flux is: why can’t our digital workplace (the inside information landscape) be as easy to use and to find things / information as in the external digital landscape? Why do I find knowledgeable peers in communities of practice more easily outside than I do on the inside? Knowledge sharing on the outpost of the corporate wall is vivid, and truly passionate whereas inside it is pretty stale and lame to say the least.

Release the DATA now

Aggregate technologies, such as Business Intelligence and Datawarehouse, use a capture, clean-up, transform and load mechanism (ETL) from all the existing supporting information systems. The problem is that the schemas and structures of things do not compile that easily. Different uses and contexts make even the most central terms difficult to unleash into a new context. This simply does not work. The same problem can be seen in the enterprise search realm where we try to cope with both unstructured or semi-structured data. One way of solving all this is to create one standard that all the others have to follow and including a least common denominator combined with master data management. In some cases this can work, but often the set of failures fromsuch efforts are bigger than those arising from trying to squeeze an enterprise into a one-size-fits-all mega-matrix ERP-system.

Why is that? you might ask, from the blueprint it sounds compelling. Just align the business processes and then all data flows will follow a common path. The reality unfortunately is way more complex because any organisation comprises of several different processes, practices, professions and disciplines. These all have a different perspectives of the information and data that is to be shared. This is precisely why we have so many applications in the first place! To what extent are we able to solve this with emerging semantic technologies? These technologies are not a silver bullet, far from it! The Web however shows a very different way of integration thinking, with interoperability and standards becoming the main pillars that all the other things rely on. If you use agreed and controlled vocabularies and standards, there is a better chance of actually being able to sort out all the other things.

Remember that most members of staff, work on the edges of the corporate body, so they have to align themselves to the lingo from all the external actor-networks and then translate it all into codified knowledge for the inside.

Semantic Interoperability

Today most end-users use internet applications and services that already use semantic enhancements to bridge the gap between things, without ever having to think about such things. One very omnipresent social network is Facebook, that relies upon the FOAF (Friend-of-a-Friend) standard for their OpenGraph. Using a Graph to connect data, is the very corner stone of linked-data and the semantic web. A thing (entity) has descriptive properties, and relations to other entities. One entity’s property might be another entity in the Graph. The simple relationship subject-predicate-object. Hence from the graph we get a very flexible and resilient platform, in stark contrast to the more traditional fixed schemas.

The Semantic Web and Linked-Data are a way to link different data sets that may grow from a multitude of schemas and contexts into one fluid interlinked experience. If all internal supporting systems or at least the aggregate engines could simply apply a semantic texture to all the bits and bytes flowing around, it could well provide a solution to the area where other set ups have failed. Remember that these linked-data sets are resilient by nature.

There is a  set of controlled vocabularies (thesauri, ontologies and taxonomies) that capture all the of topics, themes and entities that make up the world. These vocabs have to some extent already been developed, classified and been given sound resource descriptors (RDF). The Linked-Open-Data clouds are experiencing a rapid growth of meaningful expressions. WikiData, dbPedia, Freebase and many more ontologies have a vast set of crispy and useful data that when intersected with internal vocabularies, can make things so much easier. A very good example of such useful vocabularies, are the ones developed by professional information science people is that of the Getty Institute’s recently released thesari for AAT (Arts and Architecture), CONA (Cultural Object Authority) and TGN (Geographical Names). These are very trustworthy resources, and using linked-data anybody developing a web or mobile app can reuse their namespace for free and with high accuracy. And the same goes for all the other data-sets in the linked-open-data cloud. Many governments have declared open data as the main innovation space in which to release their things, under the realm of the “Commons”.

Inaddition to this, all major search engines have agreed on a set of very simple-to-use schemas captured in the www.schema.org world. These schemas have been very well received from their very inception by the webmaster community. All of these are feeding into the Google Knowledge Graph and all the other smart-things (search-enabled) we are using daily.

From the corporate world, these Internet mega-trends, have, or should have, a big impact on the way we do information management inside the corporate walls. This would be particularly the case if the siloed repositories and records were semantically enhanced from their inception (creation), for subsequent use and archiving. We would then see more flexible and fluid information management within the digital workplace.

The name of the game is interoperability at every level: not just technical device specifics, but interoperability at the semantic level and at the level we use governing principles for how we organise our data and information, regardless of their origin.

Stepping down, to some real-life examples

In the law enforcement system in any country, there is a set of actor-networks at play: the police, attorneys, courts, prisons and the like. All of them work within an inter-organisational process from capturing a suspect, filing a case, running a court session, judgement, sentencing and imprisonment; followed at the end by a reassimilated member of society.  Each of these actor-networks or public agencies have their own internal information landscape with supporting information systems, and they all rely on a coherent and smooth flow of information and data between each other. The problem is that while they may use similar vocabularies, the contexts in which they are used may be very different due to their different responsibilities and enacted environment (laws, regulations, policies, guidelines, processes and practices) when looking from a holistic perspective.

IA LOD Innovation

A way to supersede this would be to infuse semantic technologies and shared controlled vocabularies throughout, so that the mix of internal information systems could become interoperable regardless of the supporting information system or storage type. In such a case linked-open-data and semantic enhancements could glue and bridge the gaps to form one united composite, managed by just one individual’s record keeping. In such a way, the actual content would not be exposed, rather a metadata schema would be employed to cross any of the previously existing boundaries.

This is a win-win situation, as semantic technologies and any linked-open-data tinkering use the shared conversation (terms and terminologies) that already exists within the various parts of the process. While all parts cohere to the semantic layers, there is no need to reconfigure  internal processes or apply other parties’ resource descriptions and elements. In such a way only parts of schemas are used that are context specific for a given part of a process, and so allowing the lingo of the related practices and professions to be aligned.

This is already happening in practice in the internal workplace environment of an existing court, where a shared intranet is based on such organising principles as already mentioned, uses applied sound and pragmatic information management practices and metadata standards like Dublin Core and Common Vocabularies –  all of which are infused in Content Provisioning.

For the members of staff, working inside a court setting, this is a major improvement, as they use external databases everyday to gain insights in order to carry out their duties. And when the internal workplace uses such a set up, their knowledge sharing can grow –  leading to both improved wayfinding and findability.

Yet another interesting case, is a service company that operates on a global scale. They are an authoritative resource in their line-of-business, maintaining a resource of rules and regulations that have become a canonical reference. By moving into a new expanded digital workplace environment (internet, extranet and intranet) and using semantic enhancement and search, they get a linked-data set that can be used by clients, competitors and all others working within their environment. At the same time their members of staff can use the very same vocabularies to semantically enhance their provision of information and data into the different information systems internally.

The last example is an industrial company with a mix of products within their line-of-business. They have grown through M&A over the years, and ended up in a dead-end mess of information systems that do not interoperate at all. A way to overcome the effect of past mergers and aquisitions, was to create an information governance framework. Applying it  with MDM and semantic search they were able to decouple data and information, and as a result making their workplace more resilient in a world of constant flux.

One could potentially apply these pragmatic steps to any line of business, since most themes and topics have been created and captured by the emerging semantic web and linked-data realm. It is only a matter of time before more will jump on this bandwagon in order to take advantage of changes that have the ability to make them a canonical reference, and a market leader. Just think of the film industry’s IMDB.

A final thought: Are the vendors ready and open-minded enough to alter their software and online services in order to realise this outlined future enterprise information landscape?

For more information please read these online resources, or go for the executive brief video clip:
Enterprise-Linked-Data
http://testing.rachaelkalicun.info/led_book/led-contents.html

Exec Brief

Europeana brief for memory institutions using linked-open-data:
http://en.wikipedia.org/wiki/File:Linked-open-data-Europeana-video.ogv

Linked-Open-Data network Sweden 2014 presentation:
http://livingarchives.mah.se/2014/03/linked-data-2014/
and Fredric’s talk about semantic enhanced citizen participation and slides.

The future linked-data enterprise, from Intranätverk conference in Göteborg, in May 2014
Fredric Landqvist and Kerstin Forsbergs’s talk, and slides.

Uncover hidden insights using Information Retrieval and Social Media

Arjen de Vries’ talk at ESSIR 2013 (Granada, Spain) highlighted the opportunities and difficulties in using Information Retrieval (IR) and social media to both make sense of unstructured data that a computer cannot easily make sense of by itself and realise deeper hidden information.

Today social media has become more important as interactions on sites have developed to include user-generated content of all types from ratings, comments and experiences to uploaded images, blogs and videos. Users now not only consume content and products but also co-create, interact with other users and help categorise content through means of tagging or hash-tags. All of which may result in ‘data consumption’ traces.

Some of these social media platforms like Twitter provide access to a variety of data such as user profiles, their connections with other users, their shared or published content and even how they react to each other’s content through comments and ratings. Analysis of this data can provide new insights.

One example is a case from CIW research. They calculated a top music artist popularity chart for each of the following different music sites: EchoNest, Last.fm and Spotify. Further research was then done for the band The Black Keys, where it was found that their popularity did not vary over time for either Last.fm or Spotify. However when using bit.ly data from tweets about the band for the very same period, it was found that interest in them rocketed after their Grammy win in the States, information that was not apparent from the previous research.

Using social media data to enrich information

The challenge that remains for IR research however is that these social media platforms vary in functionality. What they let users do will often determine the usefulness of the resultant data. For example, YouTube and Flickr will only let the up-loader tag their own content, while the film site IMDb allows tagging but the tags are not registered personally, rather they go into a pool. Arjen cited ‘Red Hot Chilli Peppers’ as a simple example of the usefulness of such social media data in allowing disambiguation either through implicit metadata from a user comment about eating chillies and/or information derived from the organisational data from say Flickr where a picture of red hot chilli peppers is grouped with other pictures of fruit and vegetables.

The key point here is that researchers often bemoan the fact that they do not always have access to log server files. Social media data left by users about content or objects can at times provide a richer representation in matching an information need and the response to that need. The potential benefits here for Information Retrieval are several:

  • The expanded content representation
  • The reduction in the ‘vocabulary gap(s)’ between content creator, indexers and information seekers
  • The increase in diversity of view on the same content
  • And the opportunity to make better assumptions about a user’s context and the variety of contexts that may exist

Allowing all users to tag all available content improves retrieval tasks

Where information about users on media sites is ‘open,’ (sometimes it is not, sites like Facebook and Linkedin are notoriously difficult to retrieve data from) there is the ability to discover which user labels what item with what word, and in some cases, even what rating they give. In essence many new sources play the role of anchor texts, be they tags, ratings, tweets, comments or reviews. The standard triangle of user, tag and item allows a unifying research approach based on random walks and can answer many questions. The talk emphasised that the area clearly has ample opportunity for researchers to make their mark.

One example of the research potential was shown in the case where LibraryThing.com was used to detect synonyms. The website allows users to keep a record of books they have read, tag, rate and comment on them. From these connections a synonym detector was created, allowing the example query word of ‘humour,’ to have a proposed list of synonyms created that included: ‘humor (US English), funny, humorous and British humour.

Analysis of this type of research has shown that allowing all users to tag all available content improves retrieval tasks, and that combining tags and ratings may both improve search and recommendation tasks even though there are some cases where lost relations between user, tag and item may occur.

The takeaway from this talk was that there is no one means or approach in retrieving information from Social Media due to the many complexities involved – not least because the limitations of user interactivity in some platforms but also due to limitations in the usage of data accessed. As Arjen demonstrated though, it is possible in certain cases to be innovative in the collection of rewarding data – an approach that may be utilized more and more, particularly as users/customers move closer to product and service providers through increased online interactions.

Architecture of Search Systems and Measuring the Search Effectiveness

Lecture made at the 19th of April 2012, at the Warsaw University of Technology. This is the 9th lecture in the regular course for master grade studies, “Introduction to text mining”.

View more presentations from Findwise