Artificial Intelligence and Machine Learning
Trusted research environments (TREs) provide secure access to highly sensitive data for research. TREs operate manual checks on outputs to ensure there is no disclosure risk. Machine learning (ML) models require large amounts of data; when that data is personal, a TRE is a well-established way to manage it. However, ML models present disclosure risks that differ in both type and scale from traditional research outputs.
ML use in TREs creates a risk of releasing raw or disclosive data from the safe environment. Once an ML model leaves the TRE, uses of the algorithm are uncontrolled: there is insufficient oversight of what the model is capable of, who can use it outside the TRE, and how it could be used to re-identify individuals within the training data.
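To make the re-identification risk concrete, the sketch below shows a simple threshold-based membership inference attack, assuming scikit-learn. The dataset, model, and threshold are illustrative only and are not drawn from any specific TRE tooling; the point is that an overfitted model can reveal whether a given record was in its training data.

```python
# Minimal membership inference sketch: an overfitted model assigns
# systematically higher confidence to records it was trained on,
# which an attacker with query access can exploit.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

def true_class_confidence(model, X, y):
    """Probability the model assigns to each record's true class."""
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

train_conf = true_class_confidence(model, X_train, y_train)
test_conf = true_class_confidence(model, X_test, y_test)

# Threshold attack: guess "member of training data" when the model's
# confidence exceeds a cutoff (0.9 is an arbitrary illustrative choice).
threshold = 0.9
tpr = (train_conf > threshold).mean()  # members correctly flagged
fpr = (test_conf > threshold).mean()   # non-members wrongly flagged
print(f"Attack true positive rate: {tpr:.2f}, false positive rate: {fpr:.2f}")
# A large gap between the two rates indicates membership leakage.
```

A gap between the attack's true and false positive rates means the released model itself discloses information about individuals, even though no raw data left the TRE.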
Output SDC and AI models
Artificial intelligence and machine learning models present risks for output checking that differ from those of traditional statistical outputs.
The GRAIMATTER project provided initial guidance and automated tools. These were extended and simplified as part of the SACRO project, which added further guidelines for data service staff.
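The exact interfaces of those tools are not reproduced here; the sketch below is a hypothetical illustration of the kind of automated pre-release check they perform. The function name, threshold, and generalisation-gap heuristic are all assumptions for illustration, not the GRAIMATTER or SACRO API.

```python
# Hypothetical pre-release check: flag models whose train/test accuracy
# gap suggests memorisation of the training data. A large generalisation
# gap is a common proxy for the membership inference risk shown above.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def preliminary_disclosure_check(model, X_train, y_train, X_test, y_test,
                                 max_gap=0.05):
    """Return a verdict string based on the model's generalisation gap.

    max_gap is an illustrative policy threshold a TRE might set.
    """
    gap = model.score(X_train, y_train) - model.score(X_test, y_test)
    if gap > max_gap:
        return f"REVIEW: generalisation gap {gap:.3f} exceeds {max_gap}"
    return f"OK: generalisation gap {gap:.3f} within {max_gap}"

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print(preliminary_disclosure_check(model, X_train, y_train, X_test, y_test))
```

In practice such checks run alongside human review: an automated flag prompts output-checking staff to look more closely before a model is released from the TRE.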
This is a rapidly developing field, and as the technology changes, so will the guidelines and tools needed to maintain safe research and output checking. The SDC-REBOOT community network is currently coordinating this ongoing development.
You can also join the SDC-REBOOT mailing list: