Published: May 31, 2024
Updated: Jun 3, 2024

Can AI Write Jokes? Comedians Put LLMs to the Test

A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians
By Piotr Wojciech Mirowski, Juliette Love, Kory W. Mathewson, Shakir Mohamed

Summary

Can AI write jokes that actually make you laugh? A new study put large language models such as ChatGPT and Bard in the hot seat by asking professional comedians to use them as comedy writing tools. The results suggest the robots still have a lot to learn about humor. Comedians at the Edinburgh Festival Fringe and online took part in workshops where they used LLMs to generate jokes, setups, and even entire scenes. Some found the AI helpful for generating initial ideas and structures, but the overall consensus was that the jokes themselves fell flat. Many described the LLM-generated humor as bland, generic, and lacking the incisiveness of human-written material. One comedian quipped that the jokes were like "cruise ship comedy from the 1950s, but a bit less racist."

The study also highlighted the challenges of AI moderation and safety filtering. Comedians found that these filters often stifled creativity and prevented them from exploring edgier or more offensive themes, which are often key elements of successful comedy. Participants also raised concerns about bias and representation, noting that LLMs struggled to generate material that reflected diverse perspectives and often defaulted to stereotypical characters and narratives.

AI may not be ready to take the stage as a stand-up comedian just yet, but the study offers valuable insights into the potential and limitations of LLMs as creative writing tools. It also raises important questions about the future of AI in the arts and the ethical considerations surrounding its use. As the technology evolves, it remains to be seen whether future iterations can overcome these limitations and collaborate with humans to create funny, original comedy.

Questions & Answers

What technical limitations did the AI comedy study reveal about LLM content moderation and safety filters?
The study identified that LLM safety filters created significant technical constraints in comedy generation. These AI moderation systems operated through pre-trained parameters that automatically filtered potentially offensive or controversial content, often blocking essential elements of comedy writing. The process worked in three main steps: 1) Initial content screening against predetermined guidelines, 2) Automatic rejection of content containing flagged terms or themes, and 3) Forced redirection toward 'safer' alternatives. For example, when comedians attempted to write edgy political satire, the systems would often default to producing more sanitized, generic jokes about everyday observations.
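The three steps described above can be illustrated with a toy filter. This is only a sketch under loud assumptions: the term list, the fallback prompt, and the names `FLAGGED_TERMS`, `SAFE_FALLBACK`, `screen`, and `moderate` are all hypothetical, and production moderation systems generally rely on learned classifiers rather than word lists. Nothing here reflects how any specific vendor's filter is implemented.

```python
from dataclasses import dataclass

# Hypothetical flagged themes, chosen only for illustration.
FLAGGED_TERMS = {"politician", "religion", "scandal"}
# Hypothetical "safer" request used when the original is rejected.
SAFE_FALLBACK = "Write a light observational joke about everyday life."


@dataclass
class FilterResult:
    allowed: bool        # True if the original prompt passed screening
    prompt: str          # the prompt that would actually reach the model
    matched: list[str]   # flagged terms found during screening


def screen(prompt: str) -> list[str]:
    """Step 1: screen the request against the predetermined term list."""
    words = {w.strip(".,!?\"'").lower() for w in prompt.split()}
    return sorted(words & FLAGGED_TERMS)


def moderate(prompt: str) -> FilterResult:
    matched = screen(prompt)
    if not matched:
        return FilterResult(True, prompt, matched)
    # Step 2: reject the original request.
    # Step 3: redirect toward a safer alternative.
    return FilterResult(False, SAFE_FALLBACK, matched)
```

In this sketch a request for political satire never reaches the model as written, which mirrors the comedians' complaint: the redirect silently swaps an edgy premise for a generic one.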
How is AI changing the landscape of creative writing and content generation?
AI is transforming creative writing by offering new tools for ideation and content development. It serves as a collaborative partner that can generate initial drafts, suggest plot points, and help overcome writer's block. The main benefits include increased productivity, endless idea generation, and the ability to explore multiple creative directions quickly. For instance, writers can use AI to brainstorm story concepts, create character backgrounds, or generate dialogue variations. However, as shown in the comedy study, AI still lacks the nuanced understanding and emotional intelligence needed for truly sophisticated creative work, making it best suited as a supplementary tool rather than a replacement for human creativity.
What are the potential applications of AI in entertainment and performing arts?
AI is increasingly finding applications in entertainment and performing arts as a creative assistance tool. It can help with script writing, music composition, visual effects generation, and performance analysis. The key advantages include streamlined pre-production processes, enhanced creative exploration, and new forms of interactive entertainment. For example, theaters could use AI to test different versions of scripts, musicians might use it for arrangement suggestions, and filmmakers could employ it for preliminary storyboarding. However, as demonstrated by the comedy study, AI works best when supporting human creators rather than replacing them, particularly in areas requiring emotional connection and cultural understanding.

PromptLayer Features

  1. Testing & Evaluation
     The study evaluated AI-generated comedy through professional comedian feedback, suggesting a need for systematic testing frameworks.
Implementation Details
Set up A/B testing pipelines comparing different prompt variations for humor generation, establish scoring metrics based on comedian feedback, create regression tests for joke quality
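As a rough sketch of what such a pipeline might look like, the snippet below compares two prompt variants by mean comedian rating and adds a simple regression check. It uses only the Python standard library and does not call any PromptLayer API. The ratings, variant names, and the functions `score`, `pick_winner`, and `passes_regression` are hypothetical, invented for illustration.

```python
from statistics import mean

# Hypothetical comedian ratings (1-5) for jokes produced by two prompt variants.
ratings = {
    "variant_a": [2, 3, 2, 1, 3],
    "variant_b": [3, 4, 3, 4, 2],
}


def score(variant_ratings: list[int]) -> float:
    """Scoring metric: mean comedian rating for one prompt variant."""
    return mean(variant_ratings)


def pick_winner(all_ratings: dict[str, list[int]]) -> str:
    """A/B comparison: return the variant with the highest mean rating."""
    return max(all_ratings, key=lambda name: score(all_ratings[name]))


def passes_regression(new_score: float, baseline: float, tolerance: float = 0.25) -> bool:
    """Regression check: fail if the new score drops more than `tolerance` below baseline."""
    return new_score >= baseline - tolerance
```

With samples this small, a difference in means is only suggestive. A real comparison would need more ratings and a significance test before one prompt is declared better.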
Key Benefits
• Quantifiable measurement of joke effectiveness
• Systematic comparison of different prompt approaches
• Consistent quality tracking across model versions
Potential Improvements
• Integrate subjective humor ratings system
• Add demographic-aware testing metrics
• Develop specialized comedy evaluation frameworks
Business Value
Efficiency Gains
Reduce time spent manually reviewing comedy output
Cost Savings
Minimize testing resources through automated evaluation
Quality Improvement
More consistent and reliable humor generation
  2. Analytics Integration
     The study revealed issues with bias and stereotypes in AI comedy, indicating a need for performance monitoring and pattern analysis.
Implementation Details
Deploy analytics tracking for bias detection, monitor prompt performance across different comedy styles, analyze usage patterns for successful vs failed jokes
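One minimal way to picture this kind of monitoring is a per-style aggregation over logged outputs. The sketch below assumes each log record carries a comedy style, whether the joke landed, and whether a human reviewer flagged it as stereotyped. The record layout, the sample data, and the `summarize` function are all hypothetical and are not part of any real analytics API.

```python
from collections import defaultdict

# Hypothetical log records: (comedy style, joke landed?, flagged as stereotyped?)
records = [
    ("observational", True, False),
    ("observational", False, False),
    ("satire", False, True),
    ("satire", False, True),
    ("satire", True, False),
]


def summarize(log):
    """Aggregate per-style success rate and stereotype-flag rate."""
    counts = defaultdict(lambda: {"total": 0, "landed": 0, "flagged": 0})
    for style, landed, flagged in log:
        bucket = counts[style]
        bucket["total"] += 1
        bucket["landed"] += int(landed)
        bucket["flagged"] += int(flagged)
    return {
        style: {
            "success_rate": c["landed"] / c["total"],
            "flag_rate": c["flagged"] / c["total"],
        }
        for style, c in counts.items()
    }
```

Tracking the flag rate alongside the success rate makes it easier to see whether a style that performs poorly is also the one where stereotyped output clusters.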
Key Benefits
• Real-time bias detection in generated content
• Pattern identification in successful comedy
• Performance tracking across different contexts
Potential Improvements
• Add sentiment analysis capabilities
• Implement diversity metrics
• Create comedy-specific success indicators
Business Value
Efficiency Gains
Faster identification of problematic output
Cost Savings
Reduced need for manual content review
Quality Improvement
Better alignment with audience preferences and cultural sensitivity
