Published: Jul 23, 2024 | Updated: Jul 23, 2024

Can AI Grade Your Code? TAMIGO and the Future of Teaching

TAMIGO: Empowering Teaching Assistants using LLM-assisted viva and code assessment in an Advanced Computing Class
By Anishka, Diksha Sethi, Nipun Gupta, Shikhar Sharma, Srishti Jain, Ujjwal Singhal, and Dhruv Kumar (IIIT-Delhi)

Summary

Grading student code is a time-consuming task for teaching assistants (TAs). What if AI could help? Researchers explored this idea by creating TAMIGO, an LLM-powered tool designed to assist TAs with code and viva assessments in a university-level distributed systems course. TAMIGO helped TAs generate targeted viva questions and provided feedback on student answers and code submissions. The results were promising: TAMIGO crafted relevant questions and delivered balanced, constructive feedback. However, the LLM-generated feedback wasn't always reliable; it sometimes hallucinated or produced inaccuracies, and it occasionally struggled to align with the predefined grading rubrics.

The TAs' experiences were mixed. Some found TAMIGO helpful for streamlining the assessment process, while others felt it added extra steps. This research highlights the potential of AI to transform educational support roles, but it also underscores the need for continued refinement to ensure accuracy, rubric alignment, and seamless integration into existing workflows. While not a perfect replacement for human TAs, TAMIGO offers a glimpse into a future where AI handles the heavy lifting of grading, freeing educators to focus on higher-level teaching and student interaction.
🍰 Interested in building your own agents?
PromptLayer provides the tools to manage and monitor prompts with your whole team. Get started for free.

Questions & Answers

How does TAMIGO's technical architecture enable it to assess student code and generate viva questions?
TAMIGO utilizes Large Language Models (LLMs) to analyze student code submissions and generate assessment content. The system follows a structured process: First, it processes the submitted code and compares it against predefined rubrics. Then, it generates targeted viva questions based on the specific implementation details in the student's code. The system also provides feedback by analyzing code structure, functionality, and adherence to distributed systems principles. For example, when assessing a distributed system implementation, TAMIGO might generate questions about the student's chosen synchronization mechanisms or their approach to handling network failures.
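The paper doesn't release TAMIGO's source, but the question-generation step described above can be sketched in a few lines. The snippet below is an illustrative Python sketch assuming an OpenAI-style chat API; the model name, prompt wording, and the generate_viva_questions helper are hypothetical, not TAMIGO's actual implementation.

```python
# Illustrative sketch only: the paper does not release TAMIGO's code.
# Assumes an OpenAI-style chat API; the model name, prompt wording, and
# the generate_viva_questions helper are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_viva_questions(student_code: str, rubric: str, n: int = 3) -> str:
    """Ask an LLM for viva questions targeted at one student's submission."""
    prompt = (
        "You are a teaching assistant for a distributed systems course.\n"
        f"Grading rubric:\n{rubric}\n\n"
        f"Student submission:\n{student_code}\n\n"
        f"Write {n} viva questions that probe this student's specific design "
        "choices (e.g., synchronization mechanisms, failure handling)."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the paper's model choice may differ
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```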
What are the main benefits of using AI in educational assessment?
AI in educational assessment offers several key advantages for both educators and students. It primarily saves time by automating repetitive grading tasks, allowing teachers to focus more on personalized instruction and mentoring. AI systems can provide instant feedback, enabling students to learn from their mistakes immediately rather than waiting days for grades. In practical applications, AI can grade multiple-choice tests, evaluate written assignments, and even assess coding projects. For instance, in programming courses, AI can check code functionality, style, and efficiency automatically, providing consistent and objective evaluation across large class sizes.
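To make the "check code functionality automatically" point concrete (this is not taken from the paper), here is a minimal autograder sketch that runs an instructor's hidden pytest suite against a student submission; the directory layout and the grade_submission helper are assumptions.

```python
# Minimal autograder sketch (not from the paper): run a hidden pytest suite
# against a student submission and report whether all tests pass.
import subprocess
import sys

def grade_submission(submission_dir: str, test_file: str) -> bool:
    """Return True if the student's code passes the instructor's test suite."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", test_file, "-q"],
        cwd=submission_dir,   # run inside the student's submission directory
        capture_output=True,
        text=True,
        timeout=60,           # guard against infinite loops in student code
    )
    print(result.stdout)
    return result.returncode == 0
```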
How is AI transforming the future of education?
AI is revolutionizing education by introducing personalized learning experiences and automated assessment tools. It helps create adaptive learning paths that adjust to each student's pace and learning style, while also assisting teachers with administrative tasks like grading and feedback generation. The technology enables 24/7 learning support through chatbots and intelligent tutoring systems. For example, AI can analyze student performance patterns to identify areas where they need additional help, recommend relevant resources, and provide immediate feedback on assignments. This transformation is making education more efficient, accessible, and tailored to individual needs.

PromptLayer Features

  1. Testing & Evaluation
TAMIGO's varying performance in feedback generation highlights the need for robust testing and evaluation frameworks.
Implementation Details
Set up A/B testing comparing LLM outputs against human-graded benchmarks; implement regression testing for rubric alignment; and create scoring metrics for feedback quality.
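A minimal sketch of the regression-testing idea, assuming human-graded benchmark scores are stored as a JSON mapping of submission IDs to points. The data format, tolerance value, and check_rubric_alignment helper are all assumptions, not the paper's method.

```python
# Hedged sketch: compare LLM-assigned rubric scores against a human-graded
# benchmark. The JSON format, tolerance, and helper name are assumptions.
import json

TOLERANCE = 1.0  # maximum allowed deviation per submission, in points

def check_rubric_alignment(benchmark_path: str, llm_scores: dict) -> float:
    """Return the fraction of submissions where the LLM stays within tolerance."""
    with open(benchmark_path) as f:
        human_scores = json.load(f)  # e.g. {"submission_17": 8.5, ...}
    aligned = sum(
        1
        for sid, human in human_scores.items()
        if abs(llm_scores.get(sid, float("inf")) - human) <= TOLERANCE
    )
    return aligned / len(human_scores)

# Example CI gate: fail the suite if alignment drops below a chosen threshold.
# assert check_rubric_alignment("benchmark.json", llm_scores) >= 0.95
```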
Key Benefits
• Systematic evaluation of LLM feedback accuracy
• Early detection of hallucinations and inconsistencies
• Quantifiable measurement of rubric alignment
Potential Improvements
• Integrate automated rubric compliance checks
• Develop customized evaluation metrics for educational feedback
• Implement confidence scoring for generated responses
Business Value
Efficiency Gains
Reduce time spent manually validating AI feedback by 60%
Cost Savings
Decrease error correction overhead by identifying problematic outputs early
Quality Improvement
Ensure 95% alignment with grading standards through systematic testing
  2. Workflow Management
The need for streamlined integration of AI feedback into existing TA grading processes.
Implementation Details
Create templated workflows for different assessment types; implement version tracking for prompt improvements; and establish feedback review pipelines.
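One way to realize "templated workflows with version tracking" is sketched below. This is plain illustrative Python, not PromptLayer's actual API; the PromptTemplate and AssessmentWorkflow classes are invented for the example.

```python
# Illustrative only: not PromptLayer's actual API.
from dataclasses import dataclass, field

@dataclass
class PromptTemplate:
    name: str      # e.g. "viva", "code_feedback"
    version: int   # bump on every prompt improvement
    template: str

    def render(self, **kwargs: str) -> str:
        return self.template.format(**kwargs)

@dataclass
class AssessmentWorkflow:
    """Keeps one current (highest-version) template per assessment type."""
    templates: dict = field(default_factory=dict)

    def register(self, t: PromptTemplate) -> None:
        # Retain only the newest version of each named template.
        current = self.templates.get(t.name)
        if current is None or t.version > current.version:
            self.templates[t.name] = t

    def build_prompt(self, assessment_type: str, **kwargs: str) -> str:
        return self.templates[assessment_type].render(**kwargs)

wf = AssessmentWorkflow()
wf.register(PromptTemplate("viva", 1, "Ask 3 viva questions about:\n{code}"))
wf.register(PromptTemplate("viva", 2, "Ask 3 targeted viva questions about:\n{code}"))
print(wf.build_prompt("viva", code="def replicate(state): ..."))  # uses v2
```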
Key Benefits
• Standardized assessment processes
• Traceable feedback generation steps
• Simplified TA workflow integration
Potential Improvements
• Add conditional logic for different assignment types
• Implement feedback revision tracking
• Create collaborative review workflows
Business Value
Efficiency Gains
Streamline grading workflow by 40% through automated orchestration
Cost Savings
Reduce TA time investment through optimized processes
Quality Improvement
Ensure consistent assessment quality across different TAs and assignments
