Pragmatic or spontaneous – What are the most common personal qualities in IT-job ads?

Open Data Analytics

At Findwise we regularly have companywide hackathons with different themes. The latest theme was insights in open data, which I personally find very interesting.

Our group chose to fetch data from Arbetsförmedlingen (the Swedish Public Employment Service), where ten years of job ads are available. There are about 4 million job ads in total for this time period, so there is plenty of material to analyze.

To enable ad hoc analysis, we started off by extracting the competences and personal traits mentioned in the job ads. This allows us to spot trends in competences over time, compare regions or correlate competences and traits. Lots of possibilities.

 

Personal qualities and IT competences

As an IT-ninja I find it most exciting to focus on jobs, competences and traits within the IT industry. A lot is happening, and it is of course easy for me to relate to this area. A report from Almega suggests that there is a huge demand for IT competences in the coming years and gives many examples of technical skills that are in short supply. What is rarely addressed is which personality types are connected to these specific competences. We’re able to answer this interesting question from our data:

 

What personal traits commonly accompany the competences that are in demand?


Figure 1 – Relevant work titles, competences and traits for the search term “big data”

 

The most wanted personal traits are, in general, “social, driven, passionate, communicative”. All these results should of course be taken with a grain of salt, since a few staffing and general IT consulting companies account for a large share of the job ads within IT. But we can also look at a single competence and answer the question:

 

What traits are more common with this competence than in general? (Making the question a bit more specific.)

Some examples of competences in demand are system architecture, support and JavaScript. The most outstanding traits for system architecture are sharp, quality oriented and experienced. It can always be discussed whether experienced is a trait (although our model thinks so), but it makes sense in any case, since system architecture tends to be more common in senior roles. For support we find traits such as service oriented, happy and nice, which is not unexpected. Lastly, for job ads requiring JavaScript competence, personal traits such as quality oriented, quality aware and creative are the most predominant.

 

Differences between Stockholm and Gothenburg

Or let’s have a look at the geographical differences between Sweden’s two largest cities when it comes to personal qualities in IT job ads. In Gothenburg there is a stronger correlation with the traits spontaneous, flexible and curious, while Stockholm correlates with traits such as sharp, pragmatic and delivery-focused.

 

What suits your personality best?

You could also look at it the other way around and start with the personal traits to see which jobs and competences are meant for you. If you are analytical, then jobs such as controller or accountant could suit you. If you are an optimist, then job coach or guidance counselor seems to be a good fit. We created a small application where you can type in competences or personal traits and get job suggestions this way. Try it out here!

 

Learn more about Open Data Analytics

In addition, we’re hosting a breakfast seminar on December 12th where we’ll use the open data from Arbetsförmedlingen to show a process for making more data-driven decisions. More information and registration here (the seminar will be held in Swedish).

 

Author: Henrik Alburg, Data Scientist

SharePoint optimized – part 2, Search power

Last week I wrote a post about how I fixed CSOM code in order to speed up query execution. The final result was not that bad, though still not good enough:

  • 0.8s for fetching ~500 subsites
  • 6.5s for recursively fetching the whole hierarchy of ~900 subsites

My aim is to fetch the whole subsite hierarchy within a time that is reasonable to wait (1-2s in total).

In this post I show you how to achieve it – we can fetch the whole subsite hierarchy in less than 2 seconds!
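
The details are behind the Continue reading link, but as a rough illustration of one search-based approach (not necessarily the exact one described in the full post): instead of walking the site hierarchy with recursive CSOM calls, a single query against the search index can return all webs at once. A minimal CSOM sketch, assuming the SharePoint CSOM search assemblies, a hypothetical authenticated ClientContext named context and a placeholder site URL:

using System;
using Microsoft.SharePoint.Client;
using Microsoft.SharePoint.Client.Search.Query;

// One search query instead of a recursive CSOM traversal of the subsite tree.
// 'context' is assumed to be an authenticated ClientContext for the root site.
var query = new KeywordQuery(context)
{
    QueryText = "contentclass:STS_Web path:https://tenant.sharepoint.com/sites/intranet",
    RowLimit = 500,            // page through results if more subsites exist
    TrimDuplicates = false
};
query.SelectProperties.Add("Title");
query.SelectProperties.Add("Path");
query.SelectProperties.Add("ParentLink");  // lets us rebuild the hierarchy client-side

var executor = new SearchExecutor(context);
var results = executor.ExecuteQuery(query);
context.ExecuteQuery();

foreach (var row in results.Value[0].ResultRows)
{
    Console.WriteLine("{0} -> {1}", row["ParentLink"], row["Path"]);
}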

Continue reading

Summary from Enterprise Search and Discovery Summit 2017

This year at the Enterprise Search and Discovery Summit, Findwise was represented by us – search experts Simon Stenström and Amelia Andersson. With over a thousand attendees at the event, we enjoyed the company of many peers. Let’s stay in touch for inspiration and to create magic over the Atlantic – you know who you are!


Amelia Andersson and Simon Stenström, search experts from Findwise

 

Back to the event: We opened the Enterprise Search track with our talk on how you can improve your search solutions by taking several aspects of relevance into account. (The presentation can be found in full here; unfortunately there is no video.) If you want to know more about how to improve relevancy, feel free to contact us or download the free guide on Improved search relevancy.

A few themes kept recurring during the Enterprise Search track: machine learning and NLP, bots and digital assistants, statistics and logs, and GDPR. We’ve summarized our main takeaways from these topics below.

 

Machine learning and NLP

Machine learning and NLP were the unchallenged buzzwords of the conference. Everybody wants to do it, some have already started working with it, and some provided products for working with it. Unfortunately, not a lot of concrete examples of how organizations are actually using machine learning were presented, giving us the feeling that few organizations are there yet. We’re at the forefront!

 

Bots, QA systems and digital assistants

Everyone is walking around with Siri or Google Assistant in their pocket, but our enterprise search solutions still don’t make use of this. Panels discussed voice-based search (TV remote controls that can search content across all TV channels to set the right channel, a demo of Amazon Alexa providing answers about simple procedures for medical treatments, etc.), pointing out that voice-to-text now works well enough (at least in English) to use in many mobile use cases.

But bots can of course be used without voice input. A few different examples of using bots in a dialogue setting were shown. One of the most exciting demos showed a search-engine-powered bot that used facet values to ask questions and narrow down what information the user was looking for.

 

Statistics and logs

Collect logs! And when you’ve done that: use them! A clear theme was how logs are stored, displayed and used. Knowledge management systems where content creators can monitor how users find their information inspired us to consider dashboards for intranet content creators as well. If we can help our content creators understand how their content is found, maybe they will be encouraged to use better metadata and wording, or to create the information their users are missing.

 

GDPR

Surprisingly, GDPR is not only a “European thing” but will have a global impact when the legislation takes effect in May. American companies will have to look at how they handle the personal information of their EU customers. This statement took many attendees by surprise, and there were many worried questions about what would be considered non-compliant with GDPR.

 

We’ve had an exciting time in Washington and can happily say that we are bringing back inspiration and new experiences to our customers and colleagues at Findwise. On the same subject, a couple of weeks ago some of our fellow experts at Findwise wrote the report “In search for Insight”, addressing the new trends (machine learning, NLP, etc.) in enterprise search. Make sure to get your copy of the report if you are interested in this area.

Most of the presentations from Enterprise Search and Discovery Summit can be found here.

 

Authors: Amelia Andersson and Simon Stenström, search experts from Findwise

SharePoint optimized – part 1, CSOM calls

An intranet home page should contain all the information needed on a daily basis. In fact, many companies use the home page as a traffic node where everybody comes just to find a navigation link pointing to another part of the intranet. At my current company, Findwise, we do that too. However, one of our components, which lets us quickly navigate through intranet sites, has been getting slower and slower year by year. Currently its loading time is almost 10 seconds! I decided to fix it, or even rebuild it if needed – especially since a few weeks ago, at the ShareCon 365 conference, I talked about the SharePoint Framework in a search-driven architecture and described the customer case of PGNiG Termika, who save about 600k PLN (~$165,000) per year thanks to their information accessibility improvements (information access time dropped from 5-10 minutes to 1-2 seconds).

In this post I want to show you what the problem was, how I fixed it, and how my fix cut the component’s loading time by a factor of six!

Continue reading

Microsoft Ignite 2017 – from a Search and Findability perspective

Microsoft Ignite – the biggest Microsoft conference in the world. 700+ sessions, insights and roadmaps from industry leaders, and deep dives and live demos on the products you use every day. And yes, Findwise was there!

But how do you summarize a conference with more than 700 different sessions?

Well – you focus on one subject (search and findability in this case) and then you collaborate with some of the most brilliant and experienced people around the world within that subject. Add a little bit of your own knowledge – and the result is this Podcast.

Enjoy!

Expert Panel Shares Highlights and Opportunities in Microsoft’s Latest Announcements


Do you want to know more about Findwise and Microsoft? Find out how you can make SharePoint and Office 365 more powerful than ever before.

 

SharePoint Framework in SharePoint Search Driven Architecture

On 16.10.2017 I had the privilege of being one of the speakers at ShareCon365. I gave a technical talk where I showed how to build SharePoint Framework (SPFx) apps in a search-driven architecture. If you attended my talk, you are probably interested in the materials, which you can find here: My presentation materials.

If you were not… then keep reading 🙂

Continue reading

3 easy ways to integrate external data sources with SharePoint Online

Introduction

SharePoint Online provides powerful tools for searching through various types of data. At Findwise we have worked with Microsoft search applications since the beginning of the FAST era. If you have questions about, or need help with, integration of external sources, feel free to write me a couple of lines: lukasz.wojcik@findwise.com

 

Let’s get started! First, you must provide some content to SharePoint.

Here are some solutions you can choose from to feed SharePoint Online with your data:

  • Pushing data to SharePoint list using RESTful service
  • Using Business Connectivity Service
  • Using custom connector in hybrid infrastructure

 

Pushing data to SharePoint list using RESTful service

The simplest method of putting data into SharePoint is to write it directly to SharePoint lists.

SharePoint Online exposes a REST API which can be used to manipulate lists.

The following steps will guide you through pushing data to SharePoint lists.

1. No token, no ride

First things first: in order to perform any manipulation in SharePoint, you must obtain an access token.

To do so, you must follow these steps:

  1. Handle page load event
  2. In the load event handler, read one of the following request parameters:
    • AppContext
    • AppContextToken
    • AccessToken
    • SPAppToken
  3. Create a SharePointContextToken from previously retrieved token using JsonWebSecurityTokenHandler
  4. Get the access token string using OAuth2S2SClient
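
For illustration, here is a minimal sketch of these steps, assuming the TokenHelper class that the SharePoint Add-in (provider-hosted) project template generates – it wraps JsonWebSecurityTokenHandler and OAuth2S2SClient for you:

using System;
using System.Web.UI;
// TokenHelper is generated by the SharePoint Add-in project template.

public partial class Default : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Steps 1-2: read the context token that SharePoint passed in the request
        // (SPAppToken, AppContext, AppContextToken or AccessToken).
        string contextTokenString = TokenHelper.GetContextTokenFromRequest(Request);

        // Step 3: validate it and turn it into a SharePointContextToken
        // (TokenHelper uses JsonWebSecurityTokenHandler internally).
        SharePointContextToken contextToken =
            TokenHelper.ReadAndValidateContextToken(contextTokenString, Request.Url.Authority);

        // Step 4: exchange it for an access token string
        // (TokenHelper uses OAuth2S2SClient internally).
        Uri sharePointUrl = new Uri(Request.QueryString["SPHostUrl"]);
        string accessToken =
            TokenHelper.GetAccessToken(contextToken, sharePointUrl.Authority).AccessToken;

        // accessToken is what later goes into the "Authorization: Bearer <access token>" header.
    }
}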

2. Know your list

By the time you want to manipulate your list, you probably know its name, but you may not know its ID.

So, if you want to retrieve lists, you should call a GET method:

/_api/Web/lists

with header:

Authorization=Bearer <access token>

with content type:

application/atom+xml;type=entry

and accept header:

application/atom+xml
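
For illustration, a minimal sketch of that GET call using plain HttpWebRequest (any HTTP client will do); the site URL is a placeholder, and the list names and IDs are then read from the returned Atom XML:

using System;
using System.IO;
using System.Net;

// Minimal sketch: retrieve all lists of a site via the SharePoint REST API.
// 'accessToken' is assumed to come from the previous step; the URL is a placeholder.
string siteUrl = "https://yourtenant.sharepoint.com/sites/yoursite";
string accessToken = "<access token>";

var request = (HttpWebRequest)WebRequest.Create(siteUrl + "/_api/Web/lists");
request.Method = "GET";
request.Headers.Add("Authorization", "Bearer " + accessToken);
request.ContentType = "application/atom+xml;type=entry";
request.Accept = "application/atom+xml";

using (var response = (HttpWebResponse)request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string atomXml = reader.ReadToEnd();   // contains the list titles and IDs
    Console.WriteLine(atomXml);
}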

3. Add entries to the list

Once you finally retrieve your list, you are ready to actually push your data.

There are a few additional steps that need to be taken in order to execute the POST request that adds items to the list:

  1. Get the context info by calling the POST method:
    /_api/contextinfo
  2. Get the form digest from the received context info XML
  3. Get the list item entity type full name from the list data by calling the GET method:
    /_api/Web/lists(guid'<list ID>')
  4. Form the query string used to add a new item to the list:
    {'__metadata':{'type':'<list item entity type full name>'}, 'Title':'<new item name>'}
  5. Post the new item by calling the POST method:
    /_api/Web/lists(guid'<list ID>')/Items

with headers:

Authorization=Bearer <access token>
X-RequestDigest=<form digest>

with content type:

application/json;odata=verbose

and accept header:

application/json;odata=verbose

  6. Write the byte array created from the query string to the request stream.

That’s all, you’ve just added an entry to your list.
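
For illustration, a condensed sketch of steps 1-6 with plain HttpWebRequest (error handling and proper XML parsing omitted; the site URL, list ID and entity type name are placeholders, and the access token comes from the first step):

using System;
using System.IO;
using System.Net;
using System.Text;

// Minimal sketch: add one item to a SharePoint list via the REST API.
string siteUrl = "https://yourtenant.sharepoint.com/sites/yoursite";   // placeholder
string accessToken = "<access token>";
string listId = "<list ID>";
string entityTypeFullName = "<list item entity type full name>";

// Steps 1-2: get the form digest from /_api/contextinfo.
var digestRequest = (HttpWebRequest)WebRequest.Create(siteUrl + "/_api/contextinfo");
digestRequest.Method = "POST";
digestRequest.ContentLength = 0;
digestRequest.Headers.Add("Authorization", "Bearer " + accessToken);
string formDigest;
using (var response = digestRequest.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string xml = reader.ReadToEnd();
    // Crude extraction of the <d:FormDigestValue> element from the response XML.
    formDigest = xml.Split(new[] { "FormDigestValue>" }, StringSplitOptions.None)[1]
                    .Split('<')[0];
}

// Step 4: build the item body (step 3, reading the entity type name, uses the same GET pattern as above).
string body = "{'__metadata':{'type':'" + entityTypeFullName + "'},'Title':'My new item'}";
byte[] bodyBytes = Encoding.UTF8.GetBytes(body);

// Steps 5-6: POST the item to the list and write the body bytes to the request stream.
var itemRequest = (HttpWebRequest)WebRequest.Create(
    siteUrl + "/_api/Web/lists(guid'" + listId + "')/Items");
itemRequest.Method = "POST";
itemRequest.Headers.Add("Authorization", "Bearer " + accessToken);
itemRequest.Headers.Add("X-RequestDigest", formDigest);
itemRequest.ContentType = "application/json;odata=verbose";
itemRequest.Accept = "application/json;odata=verbose";
itemRequest.ContentLength = bodyBytes.Length;
using (var stream = itemRequest.GetRequestStream())
{
    stream.Write(bodyBytes, 0, bodyBytes.Length);
}
itemRequest.GetResponse().Close();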

A full example code can be found here:

https://github.com/OfficeDev/SharePoint-Add-in-REST-OData-BasicDataOperations

 

 

Using Business Connectivity Service

SharePoint can gather searchable data by itself in a process called crawling. Crawling fetches all data from a specified content source and indexes its metadata.

There are various possible content sources that SharePoint can crawl using its built-in mechanisms, such as:

  • SharePoint Sites
  • Web Sites
  • File Shares
  • Exchange Public Folders
  • Line of Business Data
  • Custom Repository

For the first four types of content you can specify multiple start addresses – base paths where the crawling process starts looking for data to index.

SharePoint Sites include all SharePoint Server and SharePoint Foundation sites available at the addresses specified as start addresses.

Web Sites include any sites available over the Internet.

File Shares include files available via FTP or SMB protocols.

Exchange Public Folders include messages, discussions and collaborative content in Exchange servers.

Line of Business Data and Custom Repository rely on custom-made connectors that can provide any type of data. These are described under the third method of connecting external data below.

To use the first four types of content, all you have to do is specify the addresses where the crawling process should start. You can also define a crawl schedule, which will automatically start indexing data at the specified times.

There are two types of crawling:

  • Full – slower, indexes all encountered data, replacing any already existing data with the new version
  • Incremental – faster, compares the dates of encountered data and existing data and indexes the data only if the existing data is outdated

Though these methods are very simple and easy to use, they provide very limited flexibility. If you need a more customized way of storing searchable data in SharePoint, you should use the more advanced technique of creating a Business Data Connectivity model, which is described below.

 

 

Using custom connector in hybrid infrastructure

Business Connectivity Service is a powerful extension, but to get the most out of it you must put some effort into preparing the Business Data Connectivity model, which defines the structure of the data you want to be able to search through.

1. Create Business Data Connectivity Model

There are two simple ways to create the Business Data Connectivity model:

  • Using Microsoft SharePoint Designer
  • Using Microsoft Visual Studio

The Business Data Connectivity model is in fact stored in an XML file, so there is a third way of creating the model – the hard way – editing the file manually.
Although editing the Business Data Connectivity model file is not as easy as using the visual designers, in many cases it is the only way to add more advanced functionality, so it is advisable to get familiar with the model file structure.

Both the Microsoft SharePoint Designer and Microsoft Visual Studio methods involve connecting to a SharePoint On-Premise installation where the model is deployed. After deployment, the model needs to be exported to a package which can be installed on the destination site.

1.1. Create Business Data Connectivity Model using Microsoft SharePoint Designer

The simplest  way to get started with Business Data Connectivity Model is to:

  • Run Microsoft SharePoint Designer
  • Connect to the destination site
  • Select External Content Types from the Navigation pane
  • Select External Content Type button from the External Content Types ribbon menu

SharePoint Designer allows you to choose the external data source from:

  • .NET assembly
  • Database connection
  • WCF web-service

The advantage of this method is that the model is automatically created from data discovered in the data source.
For example, if you choose a database connection as the source of your data, the designer allows you to pick database entities (such as tables, views, etc.) as a source and guides you through adding the operations you want to be performed during the search process.
The saved model is automatically deployed to the connected site and is ready to use.

The disadvantage of this method is that only simple data types are supported, and you won’t be able to add operations that provide functionality such as downloading attachments or checking user permissions to view the searched elements;
adding the parts responsible for these functionalities to the model file manually may therefore be required.

1.2. Create Business Data Connectivity Model using Microsoft Visual Studio

In order to use Visual Studio to create the Business Data Connectivity Model, you must run the environment on a system with SharePoint installed.
To create the Business Data Connectivity Model you must take a few steps:

  • Run Visual Studio with administrative privileges
  • Create a new project and select SharePoint Project from the SharePoint group of either the Visual C# or Visual Basic templates
  • Select Farm Solution and connect to your SharePoint site
  • Add a new item to your newly created SharePoint project and select Business Data Connectivity Model

Your new BDC model can now be designed either in the built-in SharePoint BDC Designer or in the built-in XML editor, but only in one of them at a time.

The advantage of designing the model in the visual designer is that all defined methods are automatically generated in the corresponding service source code.
Once the project is built, it can be deployed directly to the connected destination site with a single click.

The disadvantage, however, is that you must define all the fields of your business data yourself and also create the corresponding business model class.
You must also provide the connection to the external system, such as a database, yourself.
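
For orientation, this is roughly what such a (hypothetical) pair of classes could look like – the Visual Studio BDC item template generates a similar entity class plus a service class with static ReadItem/ReadList methods, which you then fill with your own data access code:

using System.Collections.Generic;

// Hypothetical business model class: one property per field defined in the BDC model.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
}

// Hypothetical service class referenced by the BDC model. The method names must match
// the methods declared in the model (a "specific finder" and a "finder" in BDC terms).
public static class CustomerService
{
    public static Customer ReadItem(int id)
    {
        // Here you would query your external system (e.g. a database) by key.
        return new Customer { Id = id, Name = "Sample customer", City = "Gothenburg" };
    }

    public static IEnumerable<Customer> ReadList()
    {
        // Here you would return all (or a filtered set of) items from the external system.
        return new List<Customer>
        {
            new Customer { Id = 1, Name = "Sample customer", City = "Gothenburg" },
            new Customer { Id = 2, Name = "Another customer", City = "Stockholm" }
        };
    }
}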

While this method is very convenient when deploying the solution to SharePoint On-Premise, you must bear in mind that SharePoint Online doesn’t allow the additional .NET assemblies that often come along with the model when you create a SharePoint project containing a Business Data Connectivity Model.

2. Export Business Data Connectivity Model

Once the model is created it needs to be exported to a package that can be installed on a destination site.

If you created your model in SharePoint Designer, you can just right-click on the model and select Export BDC model.

If you created your model in Visual Studio, you can export it by selecting the Publish command from the Build menu. The simplest way to save the package is to select the file system as the destination and then specify where the package file should be saved.

3. Import Business Data Connectivity Model into the destination system

Once you have an installation package, you can import it as a solution in your SharePoint site settings.

To do so, navigate to the site settings and then to the solutions gallery, where you can click the Upload solution button and select the installation package.

 

Since SharePoint Online doesn’t allow using your own code as a data connector, you can use a hybrid infrastructure, which involves using the Business Data Connectivity Model on the SharePoint Online side and the .NET assembly containing all the logic on a correlated SharePoint On-Premise side. The logic provides all the necessary connections to data sources, formats the data and performs any other customer-required processing.

 

 

 

Conclusion

As you can see, integrating external data seems to be pretty simple and straightforward, but it still takes some effort to do it properly.

In future posts I’ll cover the methods described above in more detail, with some examples.

Time of intelligence: from Lucene/Solr revolution 2017

Lucene/Solr revolution 2017 has ended, with me, Daniel Gómez Villanueva, and Tomasz Sobczak from Findwise on the spot.

First of all, I would like to thank LucidWorks for such a great conference, gathering this talented community and engaging companies all together. I would also like to thank all the companies reaching out to us. We will see you all very soon.

Some takeaways from Lucene/Solr revolution 2017

The conference basically met all of my expectations, especially when it comes to the session talks. They gave ideas, inspired, and reflected the capabilities of Solr and what a competent platform it is when it comes to search and analytics.

So, what is the key takeaway from this year’s conference? As usual, the talks about relevance attracted the largest audiences, showing that it is still a concern for search experts and companies out there. What is different in this year’s relevance talks compared to previous years is that if you want to achieve better results, you need to add intelligent layers on top of or into your platform. It is no longer lucrative or profitable to spend time tuning field weights and boosts to satisfy end users. The talk from The Home Depot, “User Behaviour Driven Intelligent Query Re-ranking and Suggestion”, “Learning to rank with Apache Solr and bees” from Bloomberg, and “An Intelligent, Personalized Information Retrieval Environment” from Sandia National Laboratories are just a few examples of the many talks showing how intelligence comes to the rescue and lets us achieve what is desired.

Get smarter with Solr

Even if we want to use what is provided out of the box by Solr, we need to be smarter. “A Multifaceted Look at Faceting – Using Facets ‘Under the Hood’ to Facilitate Relevant Search” by LucidWorks shows how they use faceting techniques to extract keywords, understand query language and rescore documents. “Art and Science Come Together When Mastering Relevance Ranking” by Wolters Kluwer is another example, where they change and tune Solr’s default similarity model and apply advanced index-time boosting techniques to achieve better results. All of this shows that we need to be smarter when it comes to relevance engineering. The time of tuning and tweaking is over. It is the time of intelligence – human intelligence, if I may call it that.

Thanks again to LucidWorks and the amazing Solr Community. See you all next year. If not sooner.

Writer: Mohammad Shadab, search consultant / head of Solr and Fusion solutions at Findwise

SharePoint Framework vs SharePoint apps vs SharePoint solutions

  • “I’m so confused with all this SharePoint Framework, apps, solutions… I just wanted to develop for SharePoint!”
  • “What can I use SharePoint Framework (SPFx) for? Can I use it for branding?”
  • “When should I pick a custom SharePoint solution over SharePoint Framework or a SharePoint App?”
  • “How can I get elevated privileges in a SharePoint-hosted app?”

During our years of experience in SharePoint development we’ve seen these questions many times. They were asked by our clients’ IT developers, by users of tech blogs and forums, and also by ourselves (yes, we’re learning all the time!). Since you’re here, we assume that you are a little confused too – but don’t worry. We know how you feel. That’s why we’ve created this post.

At Findwise we work with improving the SharePoint experience on a daily basis. If you want to know more about our areas and offers visit our site.

Continue reading

Web crawling is the last resort

Data source analysis is one of the crucial parts of an enterprise search deployment project. The quality of search engine results strongly depends on the quality of the indexed data. In the case of web-based sources, there are two basic ways of reaching the data: internal and external. The internal method involves reading the data directly from its storage place, such as a database, file system or API. Depending on requirements, either documents matching some criteria are read, or all documents are read. The external technique relies on reading the rendered HTML content via HTTP, the same way it is read by human users. Reaching further documents (so-called content discovery) is achieved by following hyperlinks present in the content or by using a sitemap. This method is called web crawling.

Crawling, in contrast to direct source reading, does not require any particular preparation. In a minimal variant, just a starting URL is required and that’s it. Content encoding is detected automatically, and off-the-shelf components extract the text from the HTML. Web crawling may therefore appear to be a quick and easy way to collect content for indexing. But after deeper analysis, it turns out to have multiple serious drawbacks.
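
To make the external approach concrete, here is a deliberately naive sketch, assuming HttpClient and a regex for link extraction (a real crawler would use a proper HTML parser, respect robots.txt, throttle requests and handle relative links, encodings and duplicates):

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Naive web crawling sketch: fetch a page, extract links, follow them breadth-first.
class NaiveCrawler
{
    static async Task Main()
    {
        var client = new HttpClient();
        var visited = new HashSet<string>();
        var queue = new Queue<string>();
        queue.Enqueue("https://example.com/");           // starting URL (placeholder)

        while (queue.Count > 0 && visited.Count < 100)   // hard limit, just for the sketch
        {
            string url = queue.Dequeue();
            if (!visited.Add(url)) continue;             // skip already visited pages

            string html;
            try { html = await client.GetStringAsync(url); }
            catch (HttpRequestException) { continue; }   // skip unreachable pages

            // Here the page content would be handed over to text extraction and indexing.
            Console.WriteLine("{0} ({1} characters)", url, html.Length);

            // Content discovery: follow absolute links found in href attributes.
            foreach (Match m in Regex.Matches(html, "href=\"(https?://[^\"]+)\""))
            {
                queue.Enqueue(m.Groups[1].Value);
            }
        }
    }
}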

Continue reading