Published: Aug 1, 2024
Updated: Aug 1, 2024

Can Devs Prompt LLMs? An Experiment in Code Documentation

Can Developers Prompt? A Controlled Experiment for Code Documentation Generation
By Hans-Alexander Kruse, Tim Puhlfürß, Walid Maalej

Summary

Imagine a world where documenting code is as easy as chatting with an AI. That's the promise of large language models (LLMs). But can developers actually harness their power effectively? A new study investigated how well developers, both professionals and students, could prompt LLMs to generate useful code documentation. The researchers built two VS Code extensions powered by GPT-4. One allowed developers to freely write their own prompts (ad-hoc prompting), while the other used a pre-defined, optimized prompt (few-shot prompting).

The results? Professionals, with their deeper Python knowledge, were more successful at ad-hoc prompting, often by simply including keywords like "Docstring." However, students struggled without the guidance of a pre-defined prompt, often producing lengthy, less readable explanations. Overall, both groups preferred the simplicity and efficiency of the few-shot prompting tool. It generated concise, readable documentation in a consistent format. However, developers in both groups saw the generated documentation as a starting point, emphasizing the iterative nature of documentation.

This research has important implications for AI-powered tools. It suggests that developers need more support in prompt engineering, either through better interfaces or educational resources. Additionally, generating multiple documentation versions at once could spark developer creativity. The study also highlights the importance of human-centered metrics when evaluating code documentation quality, as automated measures often miss the nuances of what makes documentation useful for developers. This points to a future where AI assists, but doesn’t replace, the developer's role in creating clear, concise, and helpful documentation.

Questions & Answers

What were the key technical differences between ad-hoc and few-shot prompting implementations in the VS Code extensions?
The study implemented two distinct prompting approaches in VS Code extensions using GPT-4. Ad-hoc prompting allowed developers complete freedom in crafting prompts, essentially providing a blank canvas for interaction with the LLM. Few-shot prompting, conversely, utilized a pre-defined, optimized prompt template that had been refined for documentation generation. The key technical distinction lay in the prompt structure: ad-hoc prompting required developers to explicitly specify their documentation needs, while few-shot prompting came with built-in examples and formatting guidelines that helped maintain consistency. Professional developers succeeded with ad-hoc prompting by including specific keywords like 'Docstring,' while students benefited more from the structured approach of few-shot prompting.
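To make the structural difference concrete, here is a minimal sketch of how the two prompt styles might be assembled before being sent to a model. The function names, the instruction wording, and the example pairs are all hypothetical illustrations; the study's actual prompt template is not reproduced in this summary.

```python
# Hypothetical example pairs (code, docstring); NOT the study's actual examples.
FEW_SHOT_EXAMPLES = [
    (
        "def add(a, b):\n    return a + b",
        '"""Return the sum of a and b."""',
    ),
    (
        "def is_even(n):\n    return n % 2 == 0",
        '"""Return True if n is even, otherwise False."""',
    ),
]


def build_ad_hoc_prompt(user_text: str, code: str) -> str:
    """Ad-hoc: the developer's free-form request is sent together with the code."""
    return f"{user_text}\n\n{code}"


def build_few_shot_prompt(code: str) -> str:
    """Few-shot: a fixed instruction and worked examples precede the target code."""
    parts = ["Write a concise Python docstring for the final function."]
    for example_code, example_doc in FEW_SHOT_EXAMPLES:
        parts.append(f"Code:\n{example_code}\nDocstring:\n{example_doc}")
    # The prompt ends with an open "Docstring:" slot for the model to complete.
    parts.append(f"Code:\n{code}\nDocstring:")
    return "\n\n".join(parts)


target = "def area(w, h):\n    return w * h"
ad_hoc = build_ad_hoc_prompt("Explain this code", target)
few_shot = build_few_shot_prompt(target)
```

In the ad-hoc case, output quality depends entirely on what the developer types (a vague request such as "Explain this code" gives the model no format to follow), whereas the few-shot prompt carries the expected format with it regardless of who invokes it.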
How can AI help improve code documentation in software development?
AI can significantly streamline the code documentation process by automatically generating clear, consistent documentation from existing code. The main benefits include time savings, reduced documentation debt, and more standardized documentation across projects. For example, developers can use AI tools to quickly generate initial documentation drafts, which they can then review and refine, rather than starting from scratch. This is particularly valuable in large development teams where maintaining consistent documentation standards can be challenging. However, AI serves as an assistant rather than a replacement, helping developers create better documentation while still requiring human oversight for accuracy and context.
What are the main advantages of using AI-powered documentation tools in software development workflows?
AI-powered documentation tools offer several key advantages in modern software development. They provide immediate time savings by automating the initial documentation draft, ensure consistency in documentation style across teams, and reduce the cognitive load on developers. These tools are particularly beneficial for large projects where maintaining comprehensive documentation can be overwhelming. Real-world applications include automatic generation of API documentation, code comments, and function descriptions. The tools can also help maintain documentation currency as code evolves, though human oversight remains important for ensuring accuracy and contextual relevance.

PromptLayer Features

  1. Prompt Management
     The study's comparison between ad-hoc and few-shot prompting directly relates to prompt versioning and template management.
Implementation Details
Create versioned prompt templates for code documentation, with separate versions for novice and expert users
Key Benefits
• Standardized documentation formats across teams
• Easier onboarding for new developers
• Version control for prompt improvements
Potential Improvements
• Add role-based prompt templates
• Implement collaborative prompt editing
• Create documentation-specific prompt libraries
Business Value
Efficiency Gains
40% faster documentation process through standardized prompts
Cost Savings
Reduced training and review time for documentation practices
Quality Improvement
More consistent and maintainable documentation across projects
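The idea of versioned templates with separate variants for novice and expert users can be sketched as a small in-memory registry. This is an illustrative sketch in plain Python, not PromptLayer's API; the class names, the audience labels, and the template wording are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PromptTemplate:
    audience: str  # hypothetical labels such as "novice" or "expert"
    version: int   # 1-based, assigned on publish
    text: str      # expected to contain a {code} placeholder


@dataclass
class PromptRegistry:
    """In-memory store that keeps every published version per audience."""

    _templates: dict = field(default_factory=dict)

    def publish(self, audience: str, text: str) -> PromptTemplate:
        versions = self._templates.setdefault(audience, [])
        template = PromptTemplate(audience, len(versions) + 1, text)
        versions.append(template)
        return template

    def latest(self, audience: str) -> PromptTemplate:
        return self._templates[audience][-1]

    def get(self, audience: str, version: int) -> PromptTemplate:
        return self._templates[audience][version - 1]


registry = PromptRegistry()
registry.publish("novice", "Write a short docstring with one usage example for:\n{code}")
registry.publish("expert", "Write a terse docstring for:\n{code}")
registry.publish("expert", "Write a terse docstring listing Args and Returns for:\n{code}")

# Fill the newest expert template with the code to document.
prompt = registry.latest("expert").text.format(code="def area(w, h):\n    return w * h")
```

Keeping older versions retrievable through `get` is what allows a prompt improvement to be compared against, or rolled back to, its predecessor.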
  2. Testing & Evaluation
     The paper's emphasis on human-centered metrics aligns with the need for comprehensive prompt testing.
Implementation Details
Set up automated testing pipelines with both automated and human evaluation metrics
Key Benefits
• Quantitative quality assessment
• Rapid iteration on prompt performance
• Consistent evaluation across different developer levels
Potential Improvements
• Implement automated readability scoring
• Add comparative testing between prompt versions
• Develop custom documentation metrics
Business Value
Efficiency Gains
50% reduction in documentation review time
Cost Savings
Decreased documentation maintenance costs through early quality detection
Quality Improvement
Higher documentation accuracy and usefulness scores
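A pipeline that pairs automated and human evaluation could start from something like the sketch below. The automated measures here (word count and average sentence length) are deliberately crude stand-ins chosen for illustration; they are not the metrics used in the study, and the 1-to-5 rating scale is likewise an assumption.

```python
import re
from statistics import mean


def automated_metrics(doc: str) -> dict:
    """Crude automated proxies: total words and average words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", doc) if s.strip()]
    words = doc.split()
    return {
        "word_count": len(words),
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }


def evaluate(doc: str, human_ratings: list[int]) -> dict:
    """Combine the automated proxies with the mean human rating (assumed 1-5 scale)."""
    report = automated_metrics(doc)
    report["mean_human_rating"] = mean(human_ratings) if human_ratings else None
    return report


report = evaluate("Return the area. Multiplies width by height.", [4, 5, 3])
```

Reporting both kinds of score side by side, rather than folding them into a single number, keeps visible the cases where a short, well-formed docstring still rates poorly with the developers who have to use it.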
