BSides Cape Town 2025

You Are an Expert CFP Submitter: Prompting AI to Hallucinate
2025-12-06, Track 2

Large Language Models (LLMs) are trained on vast amounts of internet data, which means their understanding of what it means to be an expert is largely shaped by how that word is used in forums, blogs, and online arguments. Let's be honest, this isn't always positive. Despite this, we constantly preface our prompts with phrases like “You are a senior pentester…” or “You are the world’s best developer…”, thinking we’re nudging the LLM in the right direction.

But what if we’re actually setting it up to fail?

In this lightning talk, we’ll explore how the way we frame prompts contributes to the very problems we complain about: hallucinations, overconfidence, and incorrect output. More importantly, can we get better results by prompting LLMs to think more like real experts? The kind who research, collaborate, and aren’t afraid to say “I don’t know”?


LLMs don’t just reflect data, they reflect how we ask them questions. And lately, we’ve been outsourcing our critical thinking to AI with dangerously confident prompt patterns. Leveraging real-world examples from implementations I had to test and provide feedback on, this talk unpacks:

  1. “You are an expert…” - where did this come from?
    -- A quick origin of warm-up prompts and the culture around them.
  2. The “Expert” problem in LLM training data
    -- How the word “expert” is used online (and why that’s a problem).
  3. How real experts behave (and why LLMs don’t)
    -- Nuance, uncertainty, domain limitation - the traits of real experts.
  4. Prompting with humility
    -- Examples of how LLM responses shift when prompted differently
  5. Can warm-up prompts reduce hallucinations?
    -- Practical experiments with AI graders and assistants, showing how better prompts yield better outcomes.
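The contrast between points 3-5 can be sketched as two warm-up styles side by side. A minimal illustration (the prompt wording and the OpenAI-style message schema are assumptions for this sketch, not the talk's actual prompts):

```python
# Hypothetical sketch: an overconfident "expert" warm-up versus a
# humility-oriented one. Wording is illustrative, not prescriptive.

OVERCONFIDENT = "You are the world's best pentester. Answer decisively."

HUMBLE = (
    "You are an experienced pentester. State how confident you are, "
    "note what you would verify first, and say 'I don't know' when unsure."
)


def build_messages(system_prompt: str, question: str) -> list[dict]:
    """Assemble a chat-style message list (OpenAI-style schema assumed)."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


if __name__ == "__main__":
    question = "Is this login form vulnerable to SQL injection?"
    for warmup in (OVERCONFIDENT, HUMBLE):
        print(build_messages(warmup, question))
```

Only the system message changes between the two runs; comparing the resulting answers for hedging, caveats, and admitted uncertainty is the kind of experiment point 5 describes.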

As this is a lightning talk, the expectation is to spend 2-3 minutes on each of these topics. The talk will be kept practical and relevant, drawing on real-world examples I have seen and tested while implementing AI auto-graders.

Key Takeaways:
* LLMs are only as "thoughtful" as the prompts we give them.
* The common “You are an expert…” warm-up may reinforce overconfident, hallucinated answers.
* Real experts are nuanced, cautious, collaborative, and okay with uncertainty; our prompt warm-ups should model those traits.
* Better prompt hygiene = better AI output. The responsibility isn’t just on the model, it’s also on the human at the keyboard.

Passionate about cybersecurity, helping upskill others, and generally getting involved in the cybersecurity community!