Privacy, Data Quality & More in Data Spaces
The European H2020 projects MOSAICrOWN and TRAPEZE joined forces to organise a workshop entitled “Privacy, data quality & more in Data Spaces” during the European Big Data Value Forum 2021 held on 1st December 2021. The workshop attracted 72 participants.
Rigo Wenning from ERCIM/W3C introduced the speakers and the objectives of the workshop, which addressed leading-edge research on data markets, data spaces and privacy-related issues. In the European Union, data processing is subject to rules such as the GDPR as well as to constraints arising from business imperatives. The workshop presented solutions for issues that arise when data is shared or monetized, together with a possible architecture for interoperability and data management in data markets and data spaces. As an example of the research advances made in this area, a use case on intelligent connected vehicles was presented. The workshop also covered advances in policy management and protection techniques, and the standardisation of linked data to overcome interoperability issues.
Pierangela Samarati from Università degli Studi di Milano, coordinator of the MOSAICrOWN project (Multi-Owner data Sharing for Analytics and Integration respecting Confidentiality and OWNer control), gave an overview of the architecture developed by the project. A central concern of the architecture is data wrapping and security. She explained how data wrapping provides protection by removing the visibility of data during storage and collaborative computations, and how this is achieved through intelligent indexing and an authorisation model.
Data sanitization and anonymization, another important part of the MOSAICrOWN architecture, was presented by Stefano Paraboschi from the University of Bergamo. He explained that privacy metrics can be based on different privacy definitions and outlined the difficulties that arise when data needs to be anonymised. The presented solution applies Mondrian, a multidimensional anonymization algorithm, within the Apache Spark framework, an engine for large-scale data analytics.
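To give a feel for how a Mondrian-style method works, the following is a minimal single-machine sketch in plain Python. It is not the MOSAICrOWN implementation and does not use Apache Spark. It assumes numeric quasi-identifiers, strict median splits and a k-anonymity requirement; the function names `mondrian` and `generalize` and the sample records are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# A record holds only numeric quasi-identifier values, e.g. (age, postcode).
Record = Tuple[float, ...]


def mondrian(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Recursively split records so that every partition keeps >= k records."""
    records = list(records)
    if len(records) < 2 * k:
        return [records]  # too small to yield two partitions of size >= k

    # Try dimensions from the widest to the narrowest value range.
    # (Simplification: ranges are not normalised across dimensions.)
    dims = sorted(
        range(len(records[0])),
        key=lambda d: max(r[d] for r in records) - min(r[d] for r in records),
        reverse=True,
    )
    for d in dims:
        ordered = sorted(records, key=lambda r: r[d])
        median = ordered[len(ordered) // 2][d]
        left = [r for r in ordered if r[d] < median]
        right = [r for r in ordered if r[d] >= median]
        if len(left) >= k and len(right) >= k:
            return mondrian(left, k) + mondrian(right, k)
    return [records]  # no dimension allows a valid split


def generalize(partition: List[Record]) -> List[Tuple[Tuple[float, float], ...]]:
    """Replace each value by the (min, max) range observed in its partition."""
    ranges = tuple(
        (min(r[d] for r in partition), max(r[d] for r in partition))
        for d in range(len(partition[0]))
    )
    return [ranges for _ in partition]


# Hypothetical sample data: (age, postcode)
sample = [
    (25, 10001), (27, 10002), (31, 10005), (35, 10010),
    (42, 10020), (47, 10021), (51, 10030), (58, 10031),
]
partitions = mondrian(sample, k=2)
released = [row for part in partitions for row in generalize(part)]
```

Every record in `released` shares its generalized ranges with at least one other record, which is the k-anonymity property for k = 2. A distributed variant would have to compute the ranges and medians across workers, which this sketch does not attempt.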
Pierre-Antoine Champin from ERCIM/W3C presented how the RDF-star draft standard bridges the gap between linked data and property graphs in the context of the MOSAICrOWN project. He introduced the concepts of Linked Data and Property Graphs and demonstrated with examples how RDF-star reduces the impedance mismatch between them.
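The mismatch can be illustrated with a small Python sketch that serialises a property-graph edge, including its edge properties, as Turtle-star text using the quoted-triple syntax `<< s p o >>` from the RDF-star draft. This is an illustration only, not code from the talk: the `Edge` class, the `ex:` prefix and the example data are hypothetical, and the prefix declaration is omitted from the output.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Edge:
    """A property-graph edge: a labelled link that carries its own properties."""
    source: str
    label: str
    target: str
    properties: Dict[str, object] = field(default_factory=dict)


def literal(value: object) -> str:
    """Render a Python value as a Turtle literal (booleans, integers, strings)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    # Anything else is emitted as a plain string literal.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def edge_to_turtle_star(edge: Edge, prefix: str = "ex") -> str:
    """Emit the edge as an asserted triple plus annotations on the quoted triple."""
    triple = f"{prefix}:{edge.source} {prefix}:{edge.label} {prefix}:{edge.target}"
    lines = [f"{triple} ."]  # assert the edge itself
    for key, value in edge.properties.items():
        # Each edge property becomes a statement about the quoted triple.
        lines.append(f"<< {triple} >> {prefix}:{key} {literal(value)} .")
    return "\n".join(lines)


# Hypothetical example: a driver, a vehicle and a property on the edge.
edge = Edge("alice", "drives", "car42", {"since": 2019})
turtle_star = edge_to_turtle_star(edge)
print(turtle_star)
```

In plain RDF, attaching `since` to the edge would require reification or an intermediate node. With a quoted triple, the edge property attaches directly to the statement, which is closer to how a property graph models it.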
Aidan O’Mahony from the OCTO Research Office at Dell Technologies concluded the workshop with a presentation of the use case “Intelligent Connected Vehicles” developed in MOSAICrOWN. He provided insight into the architecture for an automotive scenario involving data owners (drivers) who ingest their data into the data market, consumers who access data in the data market, and the data market provider, which offers storage and computation services to data owners and consumers. In this scenario, RDF-star is applied to intelligently connect vehicles.
At the end of the workshop, the presenters had the opportunity to answer questions raised during the online sessions.
The speakers at the EBDVF workshop. From top left: Pierangela Samarati, Stefano Paraboschi, Rigo Wenning, Pierre-Antoine Champin, Aidan O’Mahony and Piero Bonatti.