3. Core concepts

This section introduces the main concepts and terms used within the platform.

LinkedPipes Applications consists of multiple services packaged in Docker containers that interact within a Docker Compose environment. As part of the [LinkedPipes] suite, we rely on two additional software services that are responsible for discovering, retrieving, and transforming your Linked Data sources so they can be visualized. It is recommended to familiarize yourself with the conceptual description of these services, since it provides a better understanding of what exactly happens during the Data Preparation workflow.
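To make the setup more tangible, the sketch below checks whether the two services answer HTTP requests from inside the Compose network. The service hostnames, ports, and paths are illustrative assumptions, not the platform's actual configuration.

```python
import requests

# Hypothetical service addresses inside the docker-compose network;
# the real hostnames, ports, and status paths may differ.
SERVICES = {
    "discovery": "http://discovery:9000/",
    "etl": "http://etl:8080/",
}

def check_services():
    """Report whether each containerised service answers HTTP requests."""
    for name, url in SERVICES.items():
        try:
            response = requests.get(url, timeout=5)
            print(f"{name}: reachable (HTTP {response.status_code})")
        except requests.ConnectionError:
            print(f"{name}: not reachable - is the container running?")

if __name__ == "__main__":
    check_services()
```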

Discovery

LinkedPipes Discovery is a service used by LinkedPipes Applications to discover whether the provided Linked Data sources can be processed and visualized by the platform. After a successful request is sent to the Discovery service, it executes a session that we refer to as a Discovery session. Upon successful completion of a Discovery session, the service generates files in JSON-LD format, which we refer to as Pipelines. A Pipeline describes how ETL needs to extract the whole Linked Data set and store the data in a graph database called Virtuoso, which is yet another component used by the LinkedPipes Applications platform.

The diagram above recaps the core functionality of Discovery. If you would like to learn more about the LinkedPipes Discovery project, refer to its official GitHub repository.
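As a rough illustration of this flow, the sketch below starts a Discovery session over HTTP and collects the resulting Pipelines. The endpoint paths, payload shape, and response fields are assumptions made for illustration and do not mirror the Discovery API exactly.

```python
import requests

DISCOVERY_URL = "http://discovery:9000"  # assumed base URL of the Discovery service

def run_discovery(data_source_iris):
    """Start a Discovery session for the given Linked Data sources (illustrative only)."""
    # Hypothetical endpoint: submit the sources to analyse.
    start = requests.post(
        f"{DISCOVERY_URL}/discovery/start",
        json={"sources": data_source_iris},
        timeout=30,
    )
    start.raise_for_status()
    session_id = start.json()["id"]

    # Hypothetical endpoint: fetch the generated Pipelines (JSON-LD)
    # once the session has completed.
    pipelines = requests.get(
        f"{DISCOVERY_URL}/discovery/{session_id}/pipelines",
        timeout=30,
    )
    pipelines.raise_for_status()
    return pipelines.json()

# Example: ask Discovery whether a SPARQL endpoint can be visualized.
# print(run_discovery(["https://dbpedia.org/sparql"]))
```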

ETL

LinkedPipes ETL is an Extract-Transform-Load (ETL) tool for Linked Data. It runs so-called Pipelines, which are defined data transformation processes. We refer to any execution of such a Pipeline as a Pipeline execution.

The diagram above recaps the core functionality of ETL. If you would like to learn more about the LinkedPipes ETL project, refer to its official project documentation.
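A Pipeline execution can likewise be triggered over HTTP. The sketch below shows one way a client could submit a Pipeline to the ETL service and poll the execution until it finishes; the endpoint paths and response field names are assumptions for illustration, so consult the ETL documentation for the exact API.

```python
import time
import requests

ETL_URL = "http://etl:8080"  # assumed base URL of the ETL service

def execute_pipeline(pipeline_iri):
    """Start a Pipeline execution and wait for it to finish (illustrative only)."""
    # Hypothetical endpoint: ask the ETL to execute the given Pipeline.
    start = requests.post(
        f"{ETL_URL}/resources/executions",
        params={"pipeline": pipeline_iri},
        timeout=30,
    )
    start.raise_for_status()
    execution_iri = start.json()["iri"]

    # Poll the execution until it reaches a terminal state (field names assumed).
    while True:
        status = requests.get(execution_iri, timeout=30).json()
        if status.get("status") in ("FINISHED", "FAILED"):
            return status
        time.sleep(2)

# Example with a hypothetical Pipeline IRI produced by a Discovery session:
# execute_pipeline("http://etl:8080/resources/pipelines/example")
```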

What’s next?

As you might understand at this point, both services pursue relatively simple and straightforward goals. Discovery checks whether the user's Linked Data sources can be visualized, while ETL does the heavy lifting with the discovered data: it extracts, transforms, and loads the parsed data in RDF format into the database. The underlying implementations of both services are complex codebases, but since the intention of this chapter is simply to familiarize you with their core concepts, this is about everything you need to know about them. Please note that the upcoming chapters assume that you have read this section and understand the responsibilities of these components.