Alfresco CMS permission hooking

Integration between Alfresco and Findwise i3

At Findwise we have been working for quite a while on a custom AMP (Alfresco Module Package, essentially a neatly packaged extension to Alfresco) that provides a clean integration between Alfresco and Findwise i3.

Alfresco is essentially an open source SharePoint written in Java. It’s a CMS (Content Management System) whose basic function is to let end users import their documents and make them easy to manage and find later on.

Findwise i3 is built on top of a number of open source products, such as Solr and Elasticsearch. Its purpose is to provide a clean, pluggable, pipelined framework for loading data in at one end, making it searchable and applying search aspects to the documents, so that users can search intelligently on subsets of the data. It is also a pluggable framework for presenting a web-based search frontend, where the defined search aspects can be selected along with all other criteria, and where search results are nicely formatted and organized.

So why integrate the two?

There are many users of Alfresco around the world, and there are also many users of Findwise search functionality. As the Findwise framework makes lots of different data searchable across an organization, it makes sense to also be able to search data stored in Alfresco, as just one of these many sources. That’s the background for the project and why we set out to integrate the two.

Plugging into Alfresco’s repository (which stores documents) is fairly easy, because the developers of Alfresco have prepared hooks one can use. Thus it’s possible to be notified whenever a document is added to Alfresco, when its content is changed, or when it’s deleted. On each of these occasions we want to let Findwise i3 know, so it can update its databases live and thus always provide relevant, current data for the end users. As said, this is fairly straightforward and documented in the Alfresco documentation.

However, there is another scenario for which no nice solution exists in Alfresco: hooking into permission changes on documents stored in Alfresco.

Why do we need information about permission changes?

Because Findwise i3 of course respects the IT security roles and settings defined in a company’s infrastructure. It wouldn’t be cool to have a document protected by ownership or security roles within Alfresco, but at the same time fully accessible via the Findwise search API. So we need to be informed about all permission changes made to each and every document hosted within Alfresco. All permission requests and settings within Alfresco go through the PermissionService interface (the permissionService implementation).

So one could choose to replace or override that class with one of our own. That would mean digging deep into Alfresco’s configuration, with the risk that whenever a minor update of Alfresco is installed, our overridden class has to be configured into Alfresco once again. That’s basically a mess and should be avoided at all costs. We would much rather have the AMP package, for which simple tools exist to reinstall it into an Alfresco installation, hook the permission events as well.

The preferred way to do that is to create a method interceptor. That’s actually a Spring thing, and as Alfresco is built upon Spring, it makes sense to use one. The interceptor does exactly what its name says: it intercepts all method calls on a particular class or interface for which it has been registered. The easy part is writing the interceptor; the difficult part (because it’s not very well described) is getting it registered from within the AMP itself.

The following is skeleton code for a method interceptor:

[Screenshot: skeleton code for the method interceptor]
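As a rough illustration of the interception idea, here is a minimal, dependency-free sketch. It is not the AMP's actual code: it uses the JDK's built-in dynamic proxies instead of Spring AOP so that it can run on its own, and `DemoPermissionService`, `InMemoryPermissionService` and `PermissionChangeInterceptor` are hypothetical names invented for the example. In a Spring-based AMP the interceptor would more likely implement `org.aopalliance.intercept.MethodInterceptor`, but the shape is the same: delegate the call, then react to the methods that change permissions.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, cut-down stand-in for Alfresco's PermissionService interface.
interface DemoPermissionService {
    void setPermission(String nodeRef, String authority, String permission, boolean allow);
    boolean hasPermission(String nodeRef, String permission);
}

// Trivial in-memory target so the sketch can run on its own.
class InMemoryPermissionService implements DemoPermissionService {
    private final Set<String> granted = new HashSet<>();

    @Override
    public void setPermission(String nodeRef, String authority, String permission, boolean allow) {
        String key = nodeRef + "|" + permission;
        if (allow) {
            granted.add(key);
        } else {
            granted.remove(key);
        }
    }

    @Override
    public boolean hasPermission(String nodeRef, String permission) {
        return granted.contains(nodeRef + "|" + permission);
    }
}

// Intercepts every call on the interface and records the mutating ones.
class PermissionChangeInterceptor implements InvocationHandler {
    private final DemoPermissionService target;
    private final List<String> notifications = new ArrayList<>();

    PermissionChangeInterceptor(DemoPermissionService target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Object result;
        try {
            // Always delegate to the real service first.
            result = method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the service's own exception unchanged
        }
        String name = method.getName();
        boolean mutating = name.startsWith("set") || name.startsWith("delete") || name.startsWith("clear");
        if (mutating && args != null && args.length > 0) {
            // A real integration would notify the search index here (assumption).
            notifications.add(name + ":" + args[0]);
        }
        return result;
    }

    List<String> notifications() {
        return notifications;
    }

    // Wraps the target in a proxy that routes all interface calls through this interceptor.
    DemoPermissionService proxy() {
        return (DemoPermissionService) Proxy.newProxyInstance(
                DemoPermissionService.class.getClassLoader(),
                new Class<?>[] {DemoPermissionService.class},
                this);
    }
}

class PermissionInterceptorSketch {
    public static void main(String[] args) {
        PermissionChangeInterceptor interceptor =
                new PermissionChangeInterceptor(new InMemoryPermissionService());
        DemoPermissionService service = interceptor.proxy();
        service.setPermission("node-1", "GROUP_EVERYONE", "Read", true);
        service.hasPermission("node-1", "Read"); // read-only call, not recorded
        System.out.println(interceptor.notifications());
    }
}
```

Callers keep talking to the interface and never see the interceptor, which is what makes this approach less invasive than replacing the service class itself.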

And this is how to register it in your AMP’s service-context.xml

[Screenshot: interceptor registration in service-context.xml]

When you register your AMP with Alfresco using the usual apply_amps script, your new interceptor class will spring into life when a method on the PermissionService interface is called. It’s actually pretty simple; the hard part is figuring out the registration process and providing the correct case and naming for the PermissionService. We spent quite a lot of time searching for this on Google and reading the Alfresco documentation, and only found fragments that didn’t show the whole picture.

We hope our findings are helpful to others.

Written by: Kim Bo Madsen, Consultant Findwise

Plan for General Data Protection Regulation (GDPR)

Another new regulation from the EU? Will this affect us? It seems so complex. Can’t we sit back and wait for the first fine to come and then act if necessary?

We have to care and act – start planning now!

I think we have to care and act now. Start planning now so you get it right. The GDPR is a good thing. This is not another EU thing about the right size of a strawberry or how bendy a banana may be. This is about the fact that all individuals should feel safe giving their personal information to businesses. Cyber security is a good thing; not protecting our data and our customers’ data is a bad thing for us. Credit card numbers and personal data leak from companies worldwide, with large business risks. Companies don’t just face fines or reputational damage: they can have their permission to issue credit cards and other financial services products withdrawn by the regulator, and responsible employees can face imprisonment. We can only guess whether a company will need to be GDPR compliant to be allowed to compete in a bidding process.

What is the General Data Protection Regulation (GDPR)?

The General Data Protection Regulation (GDPR) is a new legal framework approved by the European Union (EU) to strengthen and unify data protection of personal information. GDPR will replace the current data protection directive (in Sweden Personuppgiftslagen, PUL) and applies from 25 May 2018.

Who is affected?

GDPR has global reach and applies to all companies worldwide that process personal data of European Union citizens.

Identify personal data and protect it

GDPR defines personal data broadly. Organisations need to fully understand what information they have, where it is located and how it was collected. Discover, classify and manage all information, both structured and unstructured, and secure it.

Data breach notifications

GDPR requires organisations to notify the local data protection authority of a data breach within 72 hours after discovery.

Do you have the right to store this information? Explicit consent

Personal data should be gathered under strict conditions. Organisations need to ask for consent to collect personal data and they need to be clear about how they will use the information.

The right of access

Individuals will have the right to obtain access to their personal data and other supplementary information in a portable format. You must provide a copy of the information free of charge. GDPR also gives individuals the right to have personal data corrected if it is inaccurate or incomplete.

The right to be forgotten

GDPR also introduces the right to be forgotten, or erased. Data is not to be held any longer than absolutely necessary, and it should not be used for any purpose other than the one it was originally collected for.

Penalties and fines

Companies that fail to protect customer data adequately will face significant fines of up to €20m or up to 4% of global annual turnover, whichever is higher. This should be a serious incentive for companies to start preparing now.

First steps to GDPR compliance

  1. Create awareness and allocate resources
    The first step is to make sure that your organisation is aware of the new EU legislation and what it means for you. How will your business be affected by the new regulation? You need to allocate enough resources and make sure you involve decision-makers and stakeholders in your organisation. Last, but not least, start today!
  2. Content Inventory
    The second step is to discover and classify all your information to identify exactly what types of personally identifiable data you have, where you have it and how it is collected.

Findwise can assist you in this process. Please contact Maria Sunnefors and visit our website for more information.

Want to read more?

Read more about the GDPR at Datainspektionen (in Swedish) or at the ICO.

First Meteor Meetup this year

Yesterday we arranged a really successful Meetup to start this year’s Meteor activities.

This year’s first Meteor Meetup in Gothenburg: 18 people showed up, more than expected and more than had registered beforehand on meetup.com, which is a first! So I can tell you that it was really successful right from the beginning.

Pizza and beverages were served at the Findwise Gothenburg office. Big thanks to Sebastian Ilves and Benjamin Lilland from Devkittens for helping out with arrangements.

We started with a short presentation about what has happened in the community and in Gothenburg since the last meetup last year. Here is the presentation with some newly added facts.

Then we carried on with demonstrations of some apps we have built at Findwise with Meteor, to mention a few:

  • Burnout – A search driven problem finder for websites using crawl-techniques and a search engine backend to deliver diagnostics to the Meteor front end application which handles user sessions and user specific data.
  • Keybox – A brilliant and quick app, which sprang out of a problem with key access across many different environments. The app helps distribute server access to people, just like you manage your keys in GitHub, but for servers.
  • Signatures – Also a search driven app that helps Findwise staff to generate their email signatures by themselves, using the search to gather data from the company active directory.

Between a couple of in-house app demonstrations we invited people from the community to demo what they have built or are working on.

Patrik Göthe first demoed an iPhone app built with Meteor for painting vectors with SVG, almost like you paint with the pen tool in Photoshop; you can also change the background hue of the artboard you’re painting on. The original idea was to enable people to get nicely colored background images for their phone, with a hue one could control just by a touch-drag event.

Patrik demoed another application for people who do live coding. The app was reworked from being only a Meteor app in the browser into a desktop app. You can open code files and divide the code into chunks. In addition, you can keep the app running in the background and paste each chunk with a short command on the Mac while you are presenting and live coding a piece of software.

Andreas Rolén from GBG Startup Hack was here and demoed an app that can be used for hackathons or competitions. You can log in to the app and adjust the teams, scores etc. from your phone, and updates are then visible live on the website and on screens mounted in the hackathon location.

In total I think we got to see 10 apps demoed. As if that wasn’t enough, Robin Lindh Nilsson and Johan Carlberg caught everyone’s attention when we all suddenly were playing their game on our own laptops and live on the TV. This was fun and exciting to say the least. Here’s a link to the game: http://globbyonline.diamonde.se/

Robin also demoed his conversion app that has been live for a while now – you can convert anything on this app: http://www.convertercentral.com/

Finally Mickaël Delaunay held a presentation about the best way to publish your app on your own server, which was very professional and very well done.

So thanks for this great meetup, I hope we see more like this soon again!

Using search technologies to create apps that even leave Apple impressed

At Findwise we love to see how we can use the power of search technologies in ways that go beyond the typical search box application.

One thing that has exploded in the last few years is of course apps on smartphones and tablets. It’s no longer enough to store your knowledge in databases that are kept behind locked doors. Professionals of today want instant access to knowledge and information right where they are, whether they are working on the factory floor or showcasing new products for customers.

When you think of enterprise search today, you should consider it a central hub of knowledge rather than just a classical search page on the intranet. Once an enterprise search solution is in place and information from different places has been normalized and indexed in one place, there really are no limits to what you can do with the information.

By building this central hub of knowledge it’s simple to make that knowledge available for other applications and services within or outside of the organization. Smartphone and tablet applications are one great example.

Integrating mobile apps with search engine technologies works really well for four reasons:

  • It’s fast. Search engines can find the right information using advanced queries or filtering options in a very short time, almost regardless of how big the index is.
  • It’s lightweight. The information handled by the device should only be what is needed by the device, no more, no less.
  • It’s easy to work with. Most search engine technologies provide a simple REST interface that’s easy to integrate with.
  • A unified interface for any content. If the content already is indexed by the enterprise search solution, then you use the same interface to access any kind of information.
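To make the "lightweight" and "simple REST interface" points a bit more concrete, here is a minimal sketch of how a mobile backend might build a query URL. It assumes a Solr-style `select` endpoint and its standard `q`, `fq`, `fl`, `rows` and `wt` parameters; the host, core name and field names are hypothetical and not taken from any actual Findwise solution. It needs Java 10 or later for `URLEncoder.encode(String, Charset)`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SearchUrlSketch {

    // Builds a Solr-style query URL that asks only for the fields the device needs.
    static String buildQueryUrl(String baseUrl, String query, String filter, String fields, int rows) {
        return baseUrl + "/select"
                + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)    // free-text query
                + "&fq=" + URLEncoder.encode(filter, StandardCharsets.UTF_8)  // filter query
                + "&fl=" + URLEncoder.encode(fields, StandardCharsets.UTF_8)  // field list keeps the response small
                + "&rows=" + rows                                             // cap the number of hits
                + "&wt=json";                                                 // JSON response
    }

    public static void main(String[] args) {
        // Hypothetical host, core and field names, for illustration only.
        String url = buildQueryUrl(
                "https://search.example.com/solr/products",
                "product brochure", "type:pdf", "id,title", 5);
        System.out.println(url);
    }
}
```

Restricting the field list and the row count is one way to keep the payload down to what the device actually displays.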

We are working together with a large Swedish manufacturing company that has transformed itself from a traditional industrial company into a knowledge engineering company over the last years. I think it’s safe to say that Findwise has been a big part of that journey by helping them create their enterprise search solution.

And of course, since we love new challenges, we have also helped them create a few mobile apps. In particular there are two different apps that we have helped out with:

  • A portable product brochures archive. The main use case is quick and easy access to product information for sales reps when visiting customers.
  • A mobile web app that you get to if you scan QR-codes printed on the package.

And even more recently the tech giant Apple has noticed how the apps make the day-to-day work of employees easier.

Results from Enterprise Search and Findability Survey 2013

Although data is growing rapidly, information is still difficult to find in most organizations. That is the most obvious conclusion from our annual global Enterprise Search and Findability Survey. 64 percent of respondents from organizations with more than 1000 employees say it is difficult to find the right information internally. This is surprising, since 79 percent think it is of high importance. But there is some good news too: 36 percent of respondents have a search strategy in place and 38 percent plan to implement one.

A recent report by the European Union’s Joint Research Centre called “Enterprise Search in the European Union: A Techno-economic Analysis” found two main reasons for adopting a strategy for Enterprise Search: the growth in data generation and, more worryingly, the fact that this huge amount of information is largely unstructured. An estimated 80% of the information stored is either unstructured or has no adequate metadata for the needs of employees.

The survey findings show the need for enterprises to adopt enterprise search solutions to overcome the burden of information overload facing the knowledge workers of today’s organizations. Not finding information, or even worse, finding the wrong information is still the reality for most organizations. Companies may be sitting on a considerable stock of digital assets but still be unable to capture value from them.

Get budget for search

McKinsey & Company recently wrote an article, “Measuring the full impact of digital capital”, saying that the need for growth and competitiveness will force companies to build strong digital capabilities. For those concerned about how to get funding for search, I want to point out that the benefits of this investment are clear: improved quality in business decisions, increased efficiency and a more harmonized global offering are direct results. Much happier employees are a positive side effect.

Read more about the survey and download the report for free here!

Speaking about Search as a Service @ PROMISE Technology Transfer day, want to meet up?

Tomorrow morning I leave Gothenburg to attend the PROMISE Technology Transfer day @ CeBIT 2013 in Hanover, Germany.

The event is a workshop introducing its participants to methodologies for the systematic evaluation and monitoring of search engines, and for discussing future trends and requirements for the next generation of information access systems. In other words, it is right up our alley at Findwise.

As Director of Research at Findwise I will speak about Search as a Service. If you are at the event or just nearby I would be happy to meet up and have a chat.  I will be around from Tuesday March 5 until Thursday March 7. Feel free to email me, henrik.strindberg@findwise.com or give me a call at +46709443905.

Hope to see you there!

SLTC 2012 in retrospect – two cutting-edge components

The 4th Swedish Language Technology Conference (SLTC) was held in Lund on 24-26 October 2012.
It is a biennial event organized by prominent research centres in Sweden.
The conference is, therefore, an excellent venue to exchange ideas with Swedish researchers in the field of Natural Language Processing (NLP), as well as to present our own research and be updated on the state of the art in most areas of Text Analytics (TA).

This year Findwise participated in two tracks – in a workshop and in the main conference.
As the area of Search Analytics (SA) is very important to us, we decided to be proactive and sent an application to organize a workshop on the topic of “Exploratory Query Log Analysis” in connection with the main conference. The application was granted and the workshop was very successful. It gathered researchers who work in the area of SA from very different perspectives, from utilizing deep Machine Learning to discover users’ intent, to looking at query logs as a totally new genre. I will do a follow-up on that in another post. All the contributions to the workshop will also be uploaded on our research page.

As for the main conference, we had two papers accepted for presentation. The first one dealt with the topic of document summarization – both single and multidocument summarization
(http://www.slideshare.net/findwise/extractive-document-summarization-an-unsupervised-approach).
The second paper was about detecting Named Entities in Swedish
(http://www.slideshare.net/findwise/identification-of-entities-in-swedish).

These two papers presented de facto state-of-the-art results for Swedish, both in document summarization and in Named Entity Recognition (NER). As for the former task, there is no standard corpus for evaluating summarization systems, there are few previous results, and there are only a few other systems, which made it infeasible to compare our own system with others. Thus, we have contributed two things to the research in document summarization: a Swedish corpus based on featured Wikipedia articles to be used for evaluation, and a system based on unsupervised Machine Learning which, by relying on domain boosting, achieves state-of-the-art results for English and Swedish. Our system can be further improved by relying on our enhanced NER and Coreference resolution modules.

As for the NER paper, our Entity recognition system for Swedish achieves 74.0% F-score, which is 4% higher than another study presented simultaneously at SLTC (http://www.ling.su.se/english/nlp/tools/stagger). Both systems were evaluated on the same corpus, which is considered a de facto standard for evaluation of different NLP resources for Swedish. The unlabelled score (i.e. no fine-grained division of classes but just entity vs non-entity) of our system achieved 91.3% F-score (93.1% Precision and 89.6% Recall). When identifying people, the Findwise NER system achieves 78.1% Precision and 90.5% Recall (83.9% F-score).
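As a quick sanity check on how these figures relate, assuming the reported F-score is the balanced F1 measure (the harmonic mean of precision P and recall R), the unlabelled result can be reproduced from its precision and recall:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.931 \times 0.896}{0.931 + 0.896}
    = \frac{1.668352}{1.827}
    \approx 0.913
```

This matches the reported 91.3% for the unlabelled score.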

So, what did we take home from the conference? We were really happy to see that the tools we develop for our customers are not something mediocre but rather something that is of very high quality and is the state of the art in Swedish NLP. We actively share our results and our corpora for research purposes. Findwise showed keen interest in cooperating with other researchers in developing better tools and systems in the area of NLP and Text Analytics. And this I think is a huge bonus to all our current and prospective customers: we actively follow the current trends in the research community and cooperate with researchers, and our products incorporate the latest findings in the field, which lets us deliver both high quality and cutting-edge technology.

As we continuously improve our products, we have also released a Polish NER, and some work has been initiated on Danish and Norwegian ones. More NLP components will soon be available for demo and testing on our research page.

Enterprise Search in Practice: A Presentation of Survey Results and Areas for Expert Guidance

The Enterprise Search in Practice presentation has two main focuses. First, to present some interesting and sometimes rather contradictory findings from the Enterprise Search and Findability Survey 2012. Second, to introduce a holistic approach to implementing search technology, involving five different aspects that are all important in order to succeed and to reach findability rather than just the ability to search.

Presented at Gilbane Conference 2012 in Boston, USA, on the 28th of November by Mattias Ellison.

Presentation: Enterprise Search and Findability in 2013

This was presented 8 November at J. Boye 2012 Conference in Aarhus, Denmark, by Kristian Norling.

Presentation Summary

There is a lot of talk about social, big data, cloud, digital workplace and semantic web. But what about search, is there anything interesting happening within enterprise search and findability? Or is enterprise search dead?

In the spring of 2012, we conducted a global survey on Enterprise Search and Findability. The resulting report, based on the answers from the survey, tells us what the leading practitioners are doing and gives guidance on what you can do to make your organisation’s enterprise search and findability better in 2013.

This presentation will give you a sneak peek into the near future and trends of enterprise search, based on data from the survey and on what the leaders that are satisfied with their search solutions do.

Topics on Enterprise Search

  • Help me! Content overload!
  • The importance of context
  • Digging for gold with search analytics
  • What has trust to do with enterprise search?
  • Social search? Are you serious?
  • Oh, and that mobile thing

Tutorial: Optimising Your Content for Findability

This tutorial was given on the 6th of November at the J. Boye 2012 conference in Aarhus, Denmark, by Kristian Norling.

Findability and Your Content

As the amount of content continues to increase, new approaches are required to provide good user experiences. Findability has been introduced as a new term among content strategists and information architects and is most easily explained as:

“A state where all information is findable and an approach to reaching that state.”

Search technology is readily used to make information findable, but as many have realized, technology alone is unfortunately not enough. To achieve findability, additional activities across several important dimensions such as business, user, information and organisation are needed.

Search engine optimisation is one aspect of findability, and many of the principles from SEO work in an intranet or website search context. This is sometimes called Enterprise Search Engine Optimisation (ESEO). Getting findability to work well for your website or intranet is a difficult task that needs continuous work. It requires stamina, persistence, endurance, patience and of course time and money (resources).

Tutorial Topics

In this tutorial you will take a deep dive into the many aspects of findability, with some good practices on how to improve findability:

  • Enterprise Search Engines vs Web Search
  • Governance
  • Organisation
  • User involvement
  • Optimise content for findability
  • Metadata
  • Search Analytics

Brief Outline

We will start with some very brief theory, then use real examples and also talk about what the organisations that are most satisfied with their findability do.

Experience level

Participants should have some intranet/website experience. A basic understanding of HTML, along with some previous work with content management, will make your tutorial experience even better. It is a bonus if you have done some Search Engine Optimisation (SEO) for public websites.