Interview

AI in Science Communication: Undermining Trust and Reproducing Stereotypes?

23 April 2026

Interview with Valentina Nachtigall and Maximilian Krug

Two members of the working group © eventfotograf.in

How do young people perceive and react to AI-generated scientific content? Do AI-generated images of scientists reinforce stereotypes? The postdoctoral researchers Maximilian Krug and Valentina Nachtigall conducted two studies to explore the role and effects of using generative AI in science communication. Their first study with university students aimed to understand how AI-generated content affects trust and acceptance. In a second study, the researchers investigated how secondary school students engage with AI-generated images of scientists. In this interview, they explain their approach and share central findings and interpretations.

Maximilian Krug is a communication studies scholar at the University of Duisburg-Essen; Valentina Nachtigall is an educational psychologist at Ruhr University Bochum. They have been collaborating in the cross-university working group ‘Artificially Intelligent Communication of Science’ (AICOS), which was funded by the PostdocLab programme from autumn 2024 until spring 2026.

With your first study, you wanted to find out how disclosing the use of artificial intelligence (AI) in science communication affects recipients’ trust. What was your central research question?

Maximilian Krug:

Our central research question was very simple: Does openly stating that AI was used in science communication change how much people trust the communicator? We developed this question because AI is increasingly used to produce scientific content, and many institutions now emphasise transparency about AI use. At the same time, research in psychology and on human-AI interaction suggests that people often react with scepticism to algorithms, even when the information itself is accurate. What was missing, however, was evidence from science communication research specifically. 

What do we generally know about the role that trust plays in science communication? Which factors influence the credibility of scientists and of scientific content?

Maximilian Krug:

Trust is fundamental in science communication because most people cannot verify scientific claims themselves. Instead, they rely on trust to decide whether information is credible and acceptable. Research shows that credibility depends less on the content alone and more on how trustworthy the source appears. With regard to individual scientists, key factors include perceived expertise, honesty, good intentions, and authenticity. Regarding scientific content, credibility is related to who presents the information, how transparent they are, and the format in which the information is delivered.

For your study, you decided to combine an experimental setting with conversation analysis – how exactly was the study designed? How can you assess trust or credibility? 

Maximilian Krug:

We designed a controlled experiment in which sixty-two university students watched short scientific videos that were identical in content but differed in how the presenter was framed (as human or as AI-generated) and in whether an AI avatar of the same researcher was used. This allowed us to examine the effect of AI attribution in isolation while keeping all other variables constant. After watching the video, participants discussed it in small groups. Instead of asking them directly how much they trusted the presenter, we analysed these discussions using conversation analysis. We assessed trust and credibility based on how participants talked about the presenter: did they align with, question, or distance themselves from the presenter’s claims and authority?

What were your main findings? Were there significant differences between the participant groups? 

Maximilian Krug:

The main finding was that any kind of ‘AI-ness’, whether in the labelling or in the use of an avatar, significantly reduced credibility. Participants trusted the presenter only when neither an AI avatar nor an AI label was used. Furthermore, there were distinct differences between groups. Videos with an AI label, regardless of whether the presenter was actually AI-generated or human, were evaluated more negatively than unlabelled videos. Credibility was lowest when an AI avatar was both visibly artificial and explicitly labelled. Overall, the results show that perceived artificiality and AI disclosure have more influence on trust than the quality of the scientific message itself.

Graphic with a quotation from the interview and a photo of Maximilian Krug

Did anything in the course of the study surprise you? For example, was anything contrary to your hypotheses?

Maximilian Krug:

One surprising result was that a human presenter who was falsely labelled as AI was trusted more than an actual AI-generated presenter that was correctly labelled. We expected the mislabelling itself to be most damaging, but the data showed that visible artificiality, combined with an AI label, led to the lowest perceived credibility.

One methodological challenge – and at the same time one of the most innovative aspects of the study – was how to measure trust. Most research relies on questionnaires, which capture what people say they think, but not how trust actually plays out in real situations. Studying participants’ group discussions allowed us to observe trust as an interactional phenomenon: how people perceive a presenter, how confidently they treat the information, and whether they display scepticism, hesitation, or acceptance in real time. Applying conversation analysis in an experimental setting is still rare in communication research. Combining interactional data with an experimental setting allowed us to keep strict control over the stimulus while capturing fine-grained, socially meaningful indicators of credibility that would be invisible in standard surveys.

What conclusions do you draw from these results? What does this mean for science communication in practice? 

Maximilian Krug:

The key conclusion is that trust in science communication is strongly shaped by how communicators are framed, not just by what they say. Disclosing AI involvement can undermine credibility, even when the scientific content is accurate and unchanged. Our study suggests that, at least for now, visibly AI-generated presenters may not be the best choice for science communication. A sounder approach may be to keep human scientists visible and accountable as communicators, even if AI tools were partly used to produce the video.

Let us shift our focus to images: AI image generators are known to reproduce or even reinforce stereotypes, particularly in the dimensions of gender and ethnicity. How does this manifest itself in images of scientists? And why is this problematic? 

Valentina Nachtigall:

In AI-generated images, scientists are predominantly depicted as white men. This reinforces a long-standing stereotype. Science education research shows that students’ conceptions of scientists are similarly stereotyped. This is particularly problematic with regard to children and young people because such stereotypes can influence how they develop interests, attitudes, and self-concepts towards science. When young people do not find people they can identify with represented in science, this can reduce their sense of belonging and self-efficacy in these fields. This particularly affects female students and students from diverse cultural backgrounds.

In a recent publication based on a 2022 study, I examined how advanced students at a German secondary school (grades 10 to 13) conceptualise natural scientists in contrast to educational scientists, and how these conceptions relate to their study interest and academic self-concept. The study found that stereotypical and rather negative images of scientists were associated with lower interest, weaker self-efficacy beliefs, and a weaker academic self-concept in these disciplines. This suggests that biased representations, including those produced by AI, may reinforce motivational barriers.

In your second study, you sought to find out how secondary school students in Germany engage with AI-generated images of scientists. What was the design of this study?

Valentina Nachtigall:

In this exploratory study with 48 pupils in the 10th grade of a secondary school in the Ruhr area in Germany, we manipulated two factors: first, the labelling of the images’ source, and second, the diversity features of the depicted scientists. We presented the images either as AI-generated or as sourced from the internet. Importantly, the images themselves were identical across conditions; one half actually originated from the internet and the other half were generated by AI. As for diversity, the low-diversity condition included only male scientists, whereas the high-diversity condition additionally included female scientists and people of colour. Crossing the two factors resulted in four experimental conditions. Students were asked to imagine selecting an image for a school presentation. They rated each image and indicated their preferences. Afterwards, they discussed their choices with peers. We audio-recorded these discussions and analysed how the students reasoned about image selection and source credibility.

Were there differences in how the secondary school students evaluated the images? Did knowledge about images being AI-generated influence this evaluation? 

Valentina Nachtigall:

There were no quantitative differences between the four conditions in how the students rated the images. This may suggest that pupils uncritically accept stereotypical and AI-generated images. However, the study was exploratory and the sample was rather small, which limits statistical power. Qualitatively, we did find differences, but these were independent of our experimental conditions and the labelling: most students confronted with images labelled as sourced from the internet recognised that some of them were AI-generated. Thus, the labelling we implemented for standardisation did not work.

Our qualitative analysis revealed four types of engagement with AI-generated images: students engaged either critically, by questioning the authenticity of the depicted scientists, or uncritically, by accepting the AI-generated images. Within the critical and the uncritical types, we could further distinguish between students who discussed stereotypical attributes of the depicted scientists and those who did not. Based on the analysis of the discussions, the majority of students were categorised as one of the two critical types. Interestingly, pupils who engaged uncritically with AI-generated images rated their AI-related self-efficacy as significantly higher than students who engaged critically with the images. This might indicate that these students tend to overestimate their actual competences.

Graphic with a quotation from the interview and a photo of Valentina Nachtigall

Based on your interpretation of the study’s results, what conclusions do you draw regarding how AI literacy and the ability to engage critically with AI should be addressed as educational goals in school settings?

Valentina Nachtigall:

We were quite surprised that the majority of students recognised the AI-generated images and engaged critically with them. This finding does not support the common concern that pupils rely too heavily on AI-generated output and accept it uncritically, at least when it comes to images of scientists. Our further findings suggest that, in order to understand the educational implications of biased AI outputs, we must pay attention not only to the technological properties of AI systems, but also to the ways learners perceive, interpret, and reflect on such content.

The finding that uncritical students probably overestimate their own competence, however, raises concerns and should be addressed in school education. Students should be supported in developing AI literacy and in becoming aware of the limitations and risks of AI-generated outputs. Strengthening these human capabilities is just as important as improving the AI technologies themselves. Moreover, our findings indicate that stereotypical attributes of the depicted scientists, although recognised and discussed, were largely accepted and rarely prompted critical questioning or rejection of the images. Independent of AI literacy, students’ stereotypical conceptions of scientists should therefore also be addressed in school education, so that students are not discouraged from pursuing science by inaccurate and stereotypical images of who scientists are.

Based on your results in both studies, which related research questions need further investigation? Are you thinking about a follow-up study? 

Maximilian Krug:

One key issue arising from the study with university students is whether the negative effects of AI labelling in science communication depend on context, such as the topic, audience, or communication goal. It is also important to examine whether repeated exposure to AI-generated science content reduces scepticism over time. Methodologically, future research should investigate whether interaction-based measures of trust, such as conversation analysis, align with or diverge from traditional survey measures across different settings.


Valentina Nachtigall:

The study with secondary school students raises questions about how to support the development of genuine AI literacy and a more accurate self-assessment of students’ own competences. We are currently preparing data collection for a follow-up study that aims to address precisely these questions.

Taken together, both studies raise broader questions about how to promote adaptive and context-sensitive attitudes toward AI-generated outputs. Science communication is an important task for which scientists often lack sufficient resources, and AI could provide support. However, ways need to be found to ensure that recipients can trust and accept AI-generated outputs that are created or supervised by scientists and that present valid content. This may, for example, be a question of appropriate labelling. In other contexts, however, where the validity of outputs is uncertain and where scientists have not controlled the content, recipients should engage critically. This creates a dilemma and raises important questions for future research.

Dr Maximilian Krug

University of Duisburg-Essen | Institute for Communication Studies | Multimodal Communication, Social Interaction & Technology Group


In his current research, Maximilian Krug examines the organisational structures of conflicts in interpersonal communication. On the one hand, he focuses on rejection and resistance in multimodal face-to-face interactions; on the other, he explores incivility and reactance in online communication. He is also involved in science communication and investigates how scientific findings can be presented effectively to the public.

Website

https://www.uni-due.de/kowi/mukom/mkrug.php

Dr Valentina Nachtigall

Ruhr University Bochum | Institute of Educational Research | Educational Psychology and Technology Research Group


Dr Valentina Nachtigall is a postdoctoral researcher in the Educational Psychology and Technology Research Group at Ruhr University Bochum. During her doctoral and postdoctoral studies (Ruhr University Bochum) and various international research stays (University of Washington, USA; University of Wisconsin-Madison, USA; Cyprus University of Technology, Cyprus), she has worked on different types of authentically contextualised learning of scientific ways of thinking and working, and has researched the conditions, mechanisms, and effects of authentic learning through an empirical and conceptual lens. For one of her current research projects, which focuses on investigating and measuring students' conceptions of scientists in the social sciences and humanities, she received a grant from the German Research Foundation (DFG). The main goal of her research is to explore effective ways to support learners in developing appropriate conceptions of research and scientists.

Website

https://www.pe.ruhr-uni-bochum.de/erziehungswissenschaft/pp/team/nachtigall.html.de

Photos: © eventfotograf.in