
AI Bias: Cultural Stereotypes in AI Writing

From September to December 2024, I interned at the Global Nomad Group’s Content Creation Lab, where I co-designed a course exploring the topic of AI bias. During our initial research, we analysed samples of ChatGPT-generated writing and were taught to identify any biases in them. This experience compelled me to experiment, too: I prompted ChatGPT to write a poem about my culture, Punjabi, without specifying whether I’m from the Pakistani (West) or Indian (East) side of the region. There isn’t much difference between Punjabi culture on either side, except that West Punjab is a Muslim-majority area while East Punjab is a Sikh-majority area. The poem contained multiple references to Sikh customs and festivals, and little about Islam, confirming that AI can be prejudiced. But how is that possible?

This happens because AI “learns” by identifying and analysing recurring patterns in the large training datasets fed into its system. The AI begins to “recognise” which kinds of inputs appear together and which are related to specific outcomes. Datasets may include historical records, annotated images, audio files, and publicly available content from the internet; because humans source, write, or record this content, datasets can contain linguistic, racial, and gender-related prejudices, whether through under-representation or misrepresentation.
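
To make “pattern learning” concrete, here is a minimal sketch in Python. The four-sentence corpus is invented for illustration: because three of the four sentences pair “festival” with “vaisakhi”, a simple frequency-counting model ends up reproducing that imbalance, just as a large model reproduces the skew in its web-scale data.

```python
from collections import Counter

# A toy "training set" mimicking skewed source text; these four
# sentences are invented for illustration, not real data.
corpus = [
    "punjabi festival vaisakhi",
    "punjabi festival vaisakhi",
    "punjabi festival vaisakhi",
    "punjabi festival eid",
]

# "Learning" here is just counting which word follows "festival".
counts = Counter()
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        if prev == "festival":
            counts[nxt] += 1

# The model's "knowledge" mirrors the imbalance in its data: it
# would complete "festival" with "vaisakhi" 75% of the time.
total = sum(counts.values())
for word, n in counts.most_common():
    print(f"P({word} | festival) = {n / total:.2f}")
```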


Types of Bias in AI

Under-representation occurs when certain categories, such as cultural demographics, are absent or insufficiently sampled in the training datasets. If a category is under-sampled, the AI does not have enough information about it to draw satisfactory conclusions, so the model resorts to stereotyping. Even if a large subset of the population has been sampled, the data can still be misrepresentative: it may be labelled incorrectly or skewed to favour one group over another.

In other words, representation is not the same as cultural responsiveness.

Bias can be introduced at any point, from the creation of an AI model to its usage. The three main forms of AI bias are:

  • Sampling bias
    • due to outdated, homogeneous training datasets that are not representative of the full target population.
  • Algorithmic bias
    • due to systemic errors, meaning consistent, repeatable inaccuracies in the AI model’s design during the development stage.
  • Interaction bias
    • due to human interactions with AI, where a person might introduce new biases or amplify existing stereotypes if the AI model actively “learns” based on live user behaviour and feedback.

We’re already seeing harmful real-world examples of AI bias in many fields, including:

  • Human Resources
    • Due to a phenomenon known as bias from association, AI-generated text might link higher-paying professions with men, causing inequality in staff hiring (a toy illustration follows this list).
  • Healthcare
    • In one widely reported case, an algorithm used by US hospitals predicted patients’ needs from their past healthcare costs; because less money has historically been spent on Black patients, the system systematically underestimated how sick they were.
  • Criminal justice
    • Facial recognition technologies that utilise AI perpetuate racial biases against people of colour, who are 10 to 100 times more likely to be misidentified as criminals than those with Eurocentric features, whose pictures are primarily used to train the models. This leads to wrongful arrests, impacting civil rights.
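
As a rough illustration of bias from association, the sketch below counts pronoun–profession co-occurrences in a handful of invented snippets. The gender skew is deliberate, mimicking the imbalances found in real web-scraped text; a model trained on such data “learns” a statistical association, not a fact about people.

```python
from collections import Counter

# Hypothetical training snippets; the skew is deliberate,
# to mimic imbalances found in real web text.
snippets = [
    "he is a surgeon", "he is a surgeon", "he is a surgeon",
    "she is a surgeon",
    "she is a nurse", "she is a nurse", "she is a nurse",
    "he is a nurse",
]

# Count how often each profession appears alongside each pronoun.
pairs = Counter()
for snippet in snippets:
    words = snippet.split()
    pronoun = words[0]   # "he" or "she" in these snippets
    job = words[-1]      # "surgeon" or "nurse"
    pairs[(pronoun, job)] += 1

# These counts are the "association" a model would absorb: surgeons
# are mostly "he" and nurses mostly "she", purely because of the data.
for (pronoun, job), n in sorted(pairs.items()):
    print(f"{pronoun!r} + {job!r}: {n}")
```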

In a similar vein, bias can creep into AI-generated creative writing, sometimes so subtly that you wouldn’t notice it. Most AI models are trained on Western data sources, around 90% of them in American English, so, unless specifically prompted, they won’t generate stories or poems that incorporate non-Western cultural perspectives, settings, or character names. And when they do, the portrayals often magnify harmful racial stereotypes, even when the descriptors seem positive.

Using biased AI models poses risks in science writing and journalism, too. In 2024, Cosmos magazine, an Australian publication, experimented with AI to generate explainers on scientific topics. Their article “What happens to our bodies after death?” contained inaccurate phrasing that lacked nuance because the AI operated on black-and-white statistical correlations.

Misinformation erodes the public’s trust in legitimate healthcare policies and peer-reviewed research. Combined with the fact that AI might overlook non-Western developments, this limits one’s exposure to science’s full potential. And what about the news? In an era of clickbait YouTube thumbnails, differentiating fact from fiction is already difficult, and the rise of AI creates an echo chamber of similar content. When AI generates images depicting gender or racial stereotypes to accompany news stories, it fuels a viewer’s confirmation bias, giving their long-held false beliefs a thumbs-up.


Solutions to Mitigate AI Bias

So what can we do about this problem? While no one can create an AI model that’s completely unbiased, just as no person can be unbiased, there are steps we can take to mitigate bias. For starters, since Write the World’s AI tool, Clara, is Socratic, it’s less prone to bias than a generative AI chatbot: it doesn’t produce its own content, so it can’t fabricate any, and it encourages reflection through open-ended questioning.

Here are other potential solutions:

  • Programmers can start from scratch by collecting diverse data that includes all of the population’s demographics, preventing under-representation. They can also re-weight the data, adjusting each group’s proportional importance so that the algorithm gives minority groups equal consideration (a minimal sketch follows this list).
  • Programmers can design AI models to include fairness metrics. One example is demographic parity, which ensures that members of sensitive groups have an equal probability of receiving a positive outcome; for instance, if an HR team at Company A uses AI during recruitment and men have a 35% chance of being selected for a leadership role, then women should have a 35% chance too (a simple parity check appears after this list).
  • Experts can conduct regular, independent audits of AI models, scanning for biases that can then be rectified through technical and organisational interventions. This holds AI developers accountable, ensuring that the systems they create comply with relevant benchmarks.
  • Governments must write and advocate for detailed legislation, like the White House’s Blueprint for an AI Bill of Rights, that protects their citizens. Such laws would make ethical guidelines compulsory and ensure transparency. For example, companies could be mandated to use model cards, which are short documents provided with AI models that list their intended uses, training data, and any risks of bias.
  • The teams that companies build to design AI tools must be diverse, not only in race or gender, but also in profession: engineers, social scientists, and ethicists must work together with AI developers. Cross-functional teams are more effective, because each member can focus on a different aspect of the AI model, spotting unintended biases the others might not have noticed. Where an engineer is trained to rectify systemic biases, an ethicist might look at the broader moral implications.
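
Here is a rough sketch of what re-weighting can look like in practice; the group labels and the 9:1 imbalance below are invented for illustration. Inverse-frequency weights scale each group so that all groups contribute equally to training.

```python
from collections import Counter

# Hypothetical group labels for a sensitive attribute; group "B" is
# under-represented 9:1, as often happens in scraped datasets.
groups = ["A"] * 900 + ["B"] * 100

# Inverse-frequency re-weighting: each group's examples are scaled so
# that every group contributes equally to the training loss overall.
counts = Counter(groups)
n_groups = len(counts)
weights = {g: len(groups) / (n_groups * c) for g, c in counts.items()}

print(weights)  # {'A': 0.56, 'B': 5.0}, approximately
# Each "B" example now counts roughly nine times as much as an "A"
# example, offsetting the 9:1 imbalance in the raw data.
```

This is the same “balanced” class-weight formula that libraries such as scikit-learn offer as a built-in option.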
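
And here is a minimal demographic-parity check, using made-up hiring decisions that mirror the 35% example above: an auditor compares selection rates across groups and flags any gap.

```python
# Made-up hiring decisions from a hypothetical AI screening tool:
# 7 of 20 men and 4 of 20 women are selected (1 = selected).
decisions = (
    [{"gender": "man", "selected": 1}] * 7
    + [{"gender": "man", "selected": 0}] * 13
    + [{"gender": "woman", "selected": 1}] * 4
    + [{"gender": "woman", "selected": 0}] * 16
)

def selection_rate(records, gender):
    """Fraction of applicants of a given gender who were selected."""
    group = [r for r in records if r["gender"] == gender]
    return sum(r["selected"] for r in group) / len(group)

rate_men = selection_rate(decisions, "man")      # 0.35
rate_women = selection_rate(decisions, "woman")  # 0.20

# Demographic parity holds when the rates match; the gap is a simple
# number an audit could track and require to stay near zero.
print(f"men: {rate_men:.2f}, women: {rate_women:.2f}, "
      f"gap: {abs(rate_men - rate_women):.2f}")
```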


AI’s predominant usage is marginalising those who were already treated like shadows, people like Dr. Joy Buolamwini: AI-powered facial analysis systems from major tech companies, such as Microsoft, wouldn’t recognise her dark-toned face until she put on a white mask. But implementing the methods listed above will help ensure that AI benefits everyone, not just a white, Western audience.


Author Bio

My name is Iffah Shamim. I’m a 19-year-old Pakistani on a gap year after graduating high school. I moved to the UAE when I was 4, lived there for 14 years, and am now temporarily based in Sialkot, Pakistan. I write poetry in my free time, focusing on traditional forms. Having fallen in love with creative writing in 6th grade, I can’t imagine how dull my life would be without the beauty that words hold. But I’m an academic at heart: I love exploring the intersection of literature with other fields, particularly AI and science.


