Answers to the most important questions about Open Data
Every change raises questions, and so does the topic of Open Data: “What is Open Data anyway, and what’s behind schema.org?”
In order to work together towards the goal of available, open tourism data in Germany, it is necessary to define and understand the relevant terms. For a basic understanding, we have compiled answers to the most important questions.
Content is either produced in-house or purchased. In both cases, it is advisable to agree with the authors that the resulting content (or “data”) can be used without restriction in terms of time and content.
Accordingly, care must be taken to ensure that data is preferably licensed under CC0 or, alternatively, under CC-by.
Further information can be found in the “Sample contract for a commissioned production with photographers”.
Users are all companies, such as global players, startups, higher-level organizations and partners, as well as any other actors inside and outside tourism who use the data to create products and services for end users.
The move to Open Data means that tenders and content purchase contracts will have to be revised in order to obtain the appropriate rights of use. It will be important to explain this change to existing partners, because until now an unrestricted right of use has come at a price. Some existing partners can be expected to hold back for the time being or to charge more, but new suppliers will also come forward on terms comparable to the current ones. Over time, existing partners are likely to return to their original prices.
Structured data is data that is labeled according to a given de facto standard. Schema.org is the best known standard on which the major search engines have agreed.
GovData and the Open Data project for German tourism are in contact with each other. The initiatives differ in the way the data is prepared. GovData provides the data in tabular form. The Open Data project for German tourism will prepare and publish its data in the form of a knowledge graph. In this way, the data is given an initial, use-oriented preparation before it is distributed.
All data with information content in all content formats (images, texts, videos, podcasts, etc.) are of interest: points such as POIs/restaurants, lines such as tours/trails, areas such as regions/ski areas, dynamic information such as opening times/snow heights, statistics such as visitor flows/occupancy rates and other content such as stories, recipes or people.
GovData is the data portal for Germany. It publishes open data from the administrations of the federal government, the federal states and the municipalities.
Existing content should be moved to the CC0 license wherever possible if it is strategically and operationally relevant.
Intellectual property in data is regulated differently under different national laws. For this reason, standard licences have been created for the digital world that indicate which conditions are attached to handling the data and how openly it may be used.
The best known licensing system is called Creative Commons. The most useful license types that have become established for actors who want to use Open Data are:
- CC0: all rights are waived, the data can be used as desired, i.e. completely open.
- CC-by: when using the data, reference must be made to the author.
- CC-by-sa: when using the data, reference must be made to the author. The work may only be distributed under the same license.
All other types of licenses are too limited, make usage very difficult and are not very useful.
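The obligations attached to the three license types above can be sketched as a small lookup table. This is only an illustration and not legal advice; the names `LICENSE_TERMS` and `obligations` are hypothetical and were chosen for this example.

```python
# Hypothetical lookup table summarising the three license types listed above.
LICENSE_TERMS = {
    "CC0": {"attribution": False, "share_alike": False},
    "CC-by": {"attribution": True, "share_alike": False},
    "CC-by-sa": {"attribution": True, "share_alike": True},
}


def obligations(license_id: str) -> list[str]:
    """Return the obligations a user of the data has to meet."""
    terms = LICENSE_TERMS[license_id]  # raises KeyError for any other license
    result = []
    if terms["attribution"]:
        result.append("name the author")
    if terms["share_alike"]:
        result.append("distribute under the same license")
    return result


for license_id in LICENSE_TERMS:
    print(license_id, obligations(license_id) or "no obligations")
```

A content management system could store such an identifier with each content item so that users of the data can check the conditions automatically.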
Further information can be found in the article “The legal implications of Open Data”.
Linked Open Data is the most far-reaching form of opening up data on the Internet. Linked data is part of the Linked Open Data Cloud and can thus be found and used by stakeholders.
Data and content may be used for purposes that cannot be controlled. However, this risk exists for any content published on the Internet, and experience shows that in practice it is very small. In our view, the possibilities and opportunities offered by Open Data far outweigh the risks.
Open Data concerns content of the data types mentioned above (such as POIs, tours, regional information, dynamic information, etc.). The GDPR (German: DSGVO) protects personal data, which may only be published with consent. Contact data can therefore be problematic.
Data must be held in separate fields in the content management system and described using a uniform ontology such as schema.org. For example, a list of weekdays and opening hours typed into a text editor is not sufficient, because a machine cannot interpret it reliably. Weekdays must be marked up explicitly as weekdays, and times as times. Depending on the content management system, the data is therefore entered according to predefined structures so that it can then be output in a user-friendly manner. The system must also make it possible to assign usage rights separately for each content field and format.
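The opening-hours example can be made concrete with a minimal sketch that uses the schema.org type `OpeningHoursSpecification` and its properties `dayOfWeek`, `opens` and `closes`. The helper name `to_opening_hours_specification`, the place name and the times are assumptions made up for illustration.

```python
import json

# Illustrative opening hours; the values are invented for this example.
opening_hours = [
    (["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"], "09:00", "18:00"),
    (["Saturday"], "10:00", "14:00"),
]


def to_opening_hours_specification(entries):
    """Turn (weekdays, opens, closes) tuples into schema.org markup."""
    return [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": [f"https://schema.org/{day}" for day in days],
            "opens": opens,
            "closes": closes,
        }
        for days, opens, closes in entries
    ]


poi = {
    "@context": "https://schema.org",
    "@type": "TouristInformationCenter",
    "name": "Example Tourist Information",  # hypothetical name
    "openingHoursSpecification": to_opening_hours_specification(opening_hours),
}

print(json.dumps(poi, indent=2))
```

Because weekdays and times sit in separate, typed fields, a machine can answer a question such as "is it open on Saturday at 11:00?" without parsing free text.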
Since open data is available to everyone via interfaces, the data can be used to develop new digital products and services that can be useful to the general public in everyday life or on vacation.
Existing KPIs (Key Performance Indicators) such as visits and page views no longer apply if content and data can be used freely by third-party providers. New KPIs need to be developed here.
Since structured open data can in principle be used by anyone, it generally achieves a greater reach. In addition, search engines rate structured open data as being of higher quality.
Initial reference projects show that players with structured open data benefit from significantly more organic hits on search engines. This means moving away from the idea that content only has to be available on your own website. This is especially true for content types that are already listed in the Google Search Gallery. There, Google shows how such content can be given prominent placement on the search results page, which is a clear signal that correct markup of these content types generates more visibility rather than less.
Freely available data is an essential prerequisite for the further development of new technologies such as conversational interfaces, mobility and the Internet of Things, as well as the development of new tourism offers and services. The user and guest will benefit from this.
Data must be provided with appropriate open licenses so that every market participant (e.g. global player, startup, service provider) can use it with as few restrictions as possible. To be usable, the data must be made available on the Internet as structured and linked data. Open Data can then be interpreted and used automatically.
Open Data creates the preconditions for the digital transformation towards artificial intelligence. In this way, we are safeguarding Germany as a tourism location, strengthening its competitiveness, and promoting tourism regions in particular as well as digital innovations in tourism and beyond. Last but not least, we increase the brand presence of Destination Germany at home and abroad.
The term Knowledge Graph was coined by Google for the knowledge base used by Google and its services to enrich the results of its search engine with information related to the search term from a variety of sources. The information is displayed to users in an info box next to the search results or output via Google Voice Assistant.
The term Knowledge Graph has since become established for comparable products from other providers as well. A Knowledge Graph is therefore a knowledge base in network form, similar to a semantic network, in which individual items of knowledge and their descriptions are placed in a semantic relationship to one another. Knowledge graphs can be the basis for many artificial intelligence applications. The best known Knowledge Graph is Google’s Knowledge Graph.
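The idea of a knowledge base in network form can be illustrated with a very small sketch that stores statements as (subject, predicate, object) triples. It is a toy model and not the technology used by any particular knowledge graph; all identifiers, labels and the helper `match` are invented for this example.

```python
# A knowledge graph stored as (subject, predicate, object) triples.
# All identifiers and facts below are invented for illustration.
triples = {
    ("poi:castle", "type", "TouristAttraction"),
    ("poi:castle", "name", "Example Castle"),
    ("poi:castle", "containedInPlace", "region:valley"),
    ("tour:ridge-trail", "type", "Trail"),
    ("tour:ridge-trail", "containedInPlace", "region:valley"),
    ("region:valley", "type", "AdministrativeArea"),
}


def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the given pattern; None is a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Everything located in the region, regardless of its type.
in_valley = [s for s, _, _ in match(predicate="containedInPlace", obj="region:valley")]
print(in_valley)
```

The query follows relationships rather than table columns, which is what allows a castle and a trail to be found together even though they are different kinds of things.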
In order to make all structured tourism data available in one place, e.g. for artificial intelligence applications or other applications, the LMOs, the GNTB and a large number of tourism service providers in Germany would like to create their own open tourism knowledge graph.
In a first step, the data from the local and regional “data silos” must be uniformly labelled or this uniformity must be realised via interfaces.
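One way to picture this first step is a mapping from the field names of each local system to a shared vocabulary. The sketch below assumes two hypothetical source systems, `region_a` and `region_b`, whose field names are invented; only the target names `name`, `description` and `url` are schema.org properties.

```python
# Hypothetical field names used by two regional systems, mapped to
# schema.org property names. The source field names are invented.
FIELD_MAPPINGS = {
    "region_a": {"titel": "name", "beschreibung": "description", "webseite": "url"},
    "region_b": {"title": "name", "teaser": "description", "link": "url"},
}


def normalise(record: dict, source: str) -> dict:
    """Rename the fields of one record to the shared vocabulary."""
    mapping = FIELD_MAPPINGS[source]
    # Fields without a mapping are dropped rather than guessed.
    return {mapping[key]: value for key, value in record.items() if key in mapping}


record_a = {"titel": "Example Museum", "webseite": "https://example.org/museum"}
record_b = {"title": "Example Trail", "teaser": "A short walk.", "internal_id": 7}

print(normalise(record_a, "region_a"))
print(normalise(record_b, "region_b"))
```

After this step, records from both systems carry the same labels and can be loaded into one common store.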
In the next step, the data is made available via a central graph database (Knowledge Graph). A graph database provides the data in a semantically unambiguous, structured and powerful form for artificial intelligence applications. Interested users can then use the data from the Knowledge Graph.
Rights of use
Structured data must follow a standard markup language for use in semantic contexts. An established ontology for this is “schema.org”. Schema.org is an initiative of the major search engines Bing, Google, Yahoo! and Yandex. It provides a description system so that data can be provided in a particular structure. So within schema.org there are schemas that can be used to describe different types of data (about a hotel, an event, a POI, etc.).
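As a minimal sketch of what such a schema looks like in practice, the following builds a description of a hotel with the schema.org types `Hotel`, `PostalAddress` and `GeoCoordinates` and prints it as JSON-LD. The function name `describe_hotel` and all values are assumptions made up for illustration.

```python
import json


def describe_hotel(name, street, postal_code, city, latitude, longitude):
    """Build a schema.org description of a hotel as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Hotel",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "postalCode": postal_code,
            "addressLocality": city,
            "addressCountry": "DE",
        },
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": latitude,
            "longitude": longitude,
        },
    }


# All values are invented for illustration.
hotel = describe_hotel("Example Hotel", "Musterstraße 1", "12345", "Musterstadt", 50.0, 10.0)
print(json.dumps(hotel, indent=2, ensure_ascii=False))
```

An event or a POI would be described in the same way with a different `@type` and the properties defined for that type.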
Open Data are all data sets that are made accessible and usable without restriction for further dissemination and further use in the interest of the general public.