How do you assess your AI’s training data?

    Jon

    Generative AI tools are very much in the limelight at the moment. If your social media feed is anything like mine, you’ve been seeing a lot of ChatGPT, Stable Diffusion, Midjourney, DALL-E, and others, alongside the usual talk of them being revolutionary new tools across a number of sectors.

    To work, these tools train on vast amounts of data pulled from the web, and this training data has led to some controversy, including a graphic artist objecting to her copyrighted original works being used as training data and the misogynistic portraits often produced by Lensa.

    Having good-quality training data for an AI system is one of the key challenges mentioned by stakeholders in BSI’s research, with one interviewee telling us that, especially when buying in an AI system, one must understand exactly what data has gone into building it. Without this understanding, the resulting risks could compromise the safety and robustness of the system’s output.
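    As a rough illustration of the kind of first-pass checks this might involve, the sketch below assumes a tabular dataset loaded with pandas; the file name and the "label" column are hypothetical placeholders rather than a reference to any specific system.

        # A minimal first-pass audit of a tabular training set.
        # Assumes pandas is installed; "training_data.csv" and the
        # "label" column are hypothetical placeholders.
        import pandas as pd

        df = pd.read_csv("training_data.csv")

        # Completeness: proportion of missing values per column.
        print("Missing values per column:")
        print(df.isna().mean().sort_values(ascending=False))

        # Duplicates: exact repeats can silently skew training.
        print(f"Duplicate rows: {df.duplicated().sum()}")

        # Balance: a heavily skewed label distribution is an early
        # warning sign for biased model behaviour.
        print("Label distribution:")
        print(df["label"].value_counts(normalize=True))

    Checks like these only surface the obvious gaps; they are no substitute for reviewing where the data came from and whether it can legally and ethically be used.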

    If you have experience of training or procuring AI, how do you assess the quality of the data used to train your AI system? What criteria do you use, and how much of a concern are the potential biases and/or legal and ethical risks the data may contain?

    If you’ve used AI in your work—or indeed one of the generative AI tools mentioned above—how did you find it? Was the quality of the training data a consideration?
