{"id":19118,"date":"2024-02-02T17:43:36","date_gmt":"2024-02-02T16:43:36","guid":{"rendered":"https:\/\/www.intellias.com\/?p=19118"},"modified":"2024-07-03T17:32:43","modified_gmt":"2024-07-03T15:32:43","slug":"how-to-train-an-ai-with-gdpr-limitations","status":"publish","type":"blog","link":"https:\/\/intellias.com\/how-to-train-an-ai-with-gdpr-limitations\/","title":{"rendered":"\u200bGDPR and AI: Balancing Privacy and Innovation"},"content":{"rendered":"
We are living through the data big bang, in which the number of bytes of data we collectively create is a 30-digit number. This is good, as data is the raw material for innovation, so long as we can harness, systematize, and analyze it. What makes mastering these colossal data streams possible is artificial intelligence (AI): hardly anything else can digest such enormous piles of data and derive meaningful information from them as quickly as AI algorithms.<\/p>\n
But what data are algorithms going to analyze? How much? What for? In 2016, the European Union adopted a regulation that answers these questions to some extent. However, it is also a game-changer for AI and machine learning (ML) development.<\/p>\n
By leveraging the capabilities of AI and ML, tech companies and research institutions create new materials, discover drugs, detect fraud<\/a>, protect crops<\/a>, and more. We encounter AI algorithms in our daily lives, too: in email filters, personalized product offerings, music suggestions, and digital assistants<\/a>.<\/p>\n Another prominent result of the synergy between AI and data is smart city technology, which aims to address common urban challenges such as optimizing traffic and parking, managing emergencies, preventing vandalism, and ensuring public safety. In other words, it aims to create social harmony through technology.<\/p>\n An example of smart city technology is the social credit scoring system being implemented in China. The country\u2019s government uses an AI-powered system capable of comparing vast amounts of data with official databases and developing knowledge based on this analysis. Most of the data is gathered from traditional sources such as financial, criminal, and government records and registry offices, as well as from third-party sources such as online credit platforms.<\/p>\n Systems like the Chinese social credit scoring system do bring benefits for citizens and make urban services more efficient. However, such vast access to data raises concerns about privacy, bias, and political interference. The alliance of AI and big data has brought data subjects\u2019 privacy rights and freedoms to the fore.<\/p>\n While China was expanding the use of personal data for urban governance, the European Union was passing regulations to limit data use.<\/p>\n The European Union adopted its Data Protection Directive in 1995, long before people started to share their data online. After years of discussions and preparations, the European Parliament replaced this directive by adopting the General Data Protection Regulation (GDPR) in April 2016. 
With the GDPR, the EU aimed to harmonize data privacy laws across all its member countries, safeguard data being transferred abroad, and provide individuals with more control over their personal data. The GDPR sets high standards for privacy and applies to data that, either alone or in combination with other data, can identify a person. By setting clear guidelines for data protection, the European Union also aims to foster innovation and economic growth. Companies are encouraged to develop new, privacy-centric technologies and business models, which can lead to new opportunities and market growth.<\/p>\n The EU’s data protection laws have a global impact. The GDPR has been the starting point for data protection laws in countries outside the EU and in individual US states. For example, the California Consumer<\/a> Privacy Act was signed in June 2018, becoming the first comprehensive consumer privacy law enacted by a US state. The GDPR\u2019s extraterritorial reach has led many non-EU countries to adopt similar standards, effectively raising the bar for data protection worldwide.<\/p>\n The GDPR became enforceable in May 2018, affecting Europe-based companies and all companies processing and holding the personal data of those residing in the EU. The tech industry took issue with the stringent rules, as the regulation touches on the two main pillars of artificial intelligence and machine learning.<\/p>\n First, it raises the bar for data security, and AI development is closely tied to personal data. The GDPR imposes strict obligations on companies that collect and process any personal data. Most AI-based systems require large volumes of information to train and learn from, and these training datasets often include personal data. The GDPR\u2019s impact on AI and machine learning development<\/a> is therefore inevitable.<\/p>\n Second, the regulation explicitly addresses \u201cautomated individual decision-making\u201d and profiling. 
Under Article 22, a person has the right not to be subject to a decision based solely on automated processing, including profiling, if that decision produces legal effects concerning them or similarly significantly affects them. Automated individual decision-making here covers decisions an AI makes without any human intervention. Profiling means the automated processing of personal data to evaluate certain things about the data subject. For instance, an AI system might analyze a user\u2019s credit card history to identify the user\u2019s spending patterns.<\/p>\n The GDPR has six data protection principles at its core. According to a report by the Norwegian Data Protection Authority, artificial intelligence and data protection face four challenges associated with these principles.<\/p>\n The GDPR fairness principle addresses fair processing of personal data. In other words, data must be processed with respect for the data subject\u2019s interests. The regulation also obliges a data controller to take measures to prevent discriminatory effects on individuals. It\u2019s no secret that many AI systems are trained using biased data<\/a> or that their algorithmic models contain certain biases. As a result, AI systems can discriminate on the basis of race, gender, health, religion, or ideology. To comply with the GDPR, companies have to learn how to mitigate those biases in their AI systems.<\/p>\n The purpose limitation principle of the GDPR states that a data subject has to be informed about the purpose of data collection and processing<\/a>. Only then can a person choose whether to consent to the processing. However, AI systems sometimes use information that\u2019s a side product of the original data collection. For instance, an AI application can use social media data to calculate a user\u2019s insurance rate. The GDPR states that data can be processed further if the further purpose is compatible with the original. 
If it isn\u2019t, the data controller should get additional consent from the data subject. But this principle has a few exceptions.<\/p>\n Further data processing is considered compatible with the original purpose if it\u2019s connected to scientific, historical, or statistical research<\/a>. Herein lies a problem, since there\u2019s no clear definition of scientific research. This means that in some cases, AI development may be considered such research. The rule of thumb is that when the AI model is static and already deployed, the purpose of its data collection can\u2019t be regarded as research.<\/p>\n The GDPR data minimization principle limits the degree of intrusion into a data subject\u2019s privacy. It ensures that the data collected fits the purpose of the project. Collected information should be adequate, relevant, and limited to what is necessary. These requirements encourage developers to think through the application of their AI models. Engineers have to determine what data, and how much of it, is necessary for a project. Sometimes, this can be a challenge, because it\u2019s not always possible to predict how and what a model will learn from data. Developers should continuously reassess the type and minimum quantity of training data required to fulfill the data minimization principle<\/a>.<\/p>\n The GDPR aims to ensure that individuals have the power to decide which of their information is used by third parties. This means that data controllers have to be open and transparent about their actions. They should give the owners of personal information a detailed description of what they\u2019re doing with it. Unfortunately, with AI systems, this may be hard to do.<\/p>\n That\u2019s because many AI models are essentially a black box<\/a>. It\u2019s not always clear how the model makes decisions. 
This makes it difficult to explain an AI\u2019s complicated processes to an everyday user. Naturally, when AI is not entirely transparent, the question of liability arises.<\/p>\n Under the GDPR, a data subject has the right to an explanation of an automated decision, so data controllers have to figure out ways to give one.<\/p>\n The European Union wants to lead the way with responsible AI by introducing a comprehensive legal framework to regulate the ethical use of artificial intelligence. The EU AI Act<\/a> emphasizes reducing bias and ensuring human control over automation. The Act aims to ensure that AI systems are safe and transparent and that they respect EU standards on fundamental rights and values. The framework could impact AI use as significantly as the GDPR affected personal data processing.<\/p>\n Who will be affected by the new EU AI regulations?<\/p>\n Essentially, most businesses will be.<\/p>\n These regulations will likely impact any business that uses AI, including companies based in or operating within the EU that work with AI or use AI-embedded components. Even companies that do not develop their own AI systems but use systems with AI components must comply with the new rules.<\/p>\n The emergence of generative AI has brought forward significant challenges in ensuring the quality, reliability, and ethical use of its outputs. The EU AI Act is poised to address these challenges by categorizing AI systems based on their risk level. This law directly impacts generative AI technologies, such as ChatGPT and deepfake generators, requiring them to meet the Act\u2019s requirements<\/a>, including clear disclosure that content has been artificially created.<\/p>\n The EU AI Act mandates transparency for AI systems that interact with humans and create text, images, videos, and other types of content. The AI Act could require generative AI applications to use high-quality and ethically sourced data, in line with data protection laws, especially the GDPR. 
The EU emphasizes the ethical use of data, in line with GDPR principles. AI companies will need to ensure that their generative models are trained on ethically sourced, non-biased, and GDPR-compliant datasets.<\/p>\n Generative AI models should be designed so that they do not produce illegal content. Summaries of the copyrighted data used to train AI algorithms should be published. General-purpose AI models that pose high risks, such as GPT-4, would be thoroughly evaluated.<\/p>\n The Act also requires high-risk generative AI systems to adhere to strict standards for data quality and human oversight, which poses new challenges for AI companies.<\/p>\n In 2024, Intellias announced a strategic partnership with 2021.AI<\/a>, a key player in applied AI technology, to navigate the challenges presented by the EU AI Act\u2019s focus on high-risk systems and its potential impact on innovation. This alliance will help address the issues enterprises face in adopting AI technologies<\/a>.<\/p>\n Intellias and 2021.AI will cooperate to build technology-centric Governance, Risk, and Compliance (GRC) solutions that will help businesses adhere to evolving regulations. These solutions will offer an effective approach to meeting standards and foster AI-driven innovation within a dynamic regulatory landscape.<\/p>\nHow does the GDPR impact AI and machine learning?<\/h2>\n
\n<\/p>\nWhat challenges arise from GDPR limitations on AI?<\/h2>\n
\n
\n
\n
\n
Navigating the EU AI Act’s View of Generative AI<\/h2>\n
\n