linked data cloud

Linked Data Cloud

This article was originally published June 1st, 2012 in my old blog.

Few weeks ago I gave an introduction to Semantic Web at the Sao Paulo Ruby User Group (GURU, in portuguese). Since Semantic Web is a very broader term that entail tons of technologies and tools, I decided to focus more on explaining the single main reason that make companies apply it in their projects:

Data Integration

We, as software engineers and product managers, face a lot of problems, specially in big companies, when we need to integrate data coming from several sources. Anytime I needed to do this, I realize that I kept rumbling to myself: "Why we don’t have only one universal metadata for our data? How we let this happen?"

The big triumph provided by the Semantic Web set of technologies, specifications and tools is to enable you, as owner of highly variable data and metadata, to organize it in a way that you can find any information and derive knowledge as easily as a SQL-like query. This is possible because we start to represent the data in the simplest way possible: a triple (subject -> predicate -> object). For example:

  • Abril_Engineering_blog > is_owned_by > Abril
  • Abril_Engineering_blog > is_hosted_by > Tumblr
  • Luis_Cipriani > work_at > Abril

By establishing relationships with all your data based on a controlled and known metadata (better known as ontology), you create the most flexible format for representing a data, and from this point, you can derive anything the quality of your data would allow.

At Abril we have a perfect environment to use this strategy for data integration, since we produce a lot of content that we are able to extract structure, such as, people, events, venues, articles, user comments, etc. Some projects are on their way and others are about to start. We hope to be talking about it sooner.

Meanwhile, check the presentation slides below, it has more detail about the technologies and cases:

You might also like

A System of systems to Publish Digital Content

Alexandria is a microservice architecture built to empower content publishing at Abril, a media company in Brazil.

Fearless HTTP requests abuse

A series of talks and tutorials about Web Caching and HTTP

HTTP API security guide

Talk with a guide for concerned developers about HTTP API available authentication mechanisms. Presented at QCon São Paulo.