Published: Jul 23, 2024 | Updated: Jul 23, 2024

Can AI Grade Your Code? TAMIGO and the Future of Teaching

TAMIGO: Empowering Teaching Assistants using LLM-assisted viva and code assessment in an Advanced Computing Class
By Anishka, Diksha Sethi, Nipun Gupta, Shikhar Sharma, Srishti Jain, Ujjwal Singhal, and Dhruv Kumar (IIIT-Delhi)

Summary

Grading student code is a time-consuming task for teaching assistants (TAs). What if AI could help? Researchers explored this idea by creating TAMIGO, an LLM-powered tool designed to assist TAs with code and viva assessments in a university-level distributed systems course. TAMIGO helped TAs generate targeted viva questions and provided feedback on student answers and code submissions. The results were promising: TAMIGO crafted relevant questions and delivered balanced, constructive feedback. However, the LLM-generated feedback wasn't always reliable; it sometimes hallucinated or produced inaccuracies, and it occasionally struggled to align with the predefined grading rubrics.

The TAs' experiences were mixed. Some found TAMIGO helpful for streamlining the assessment process, while others felt it added extra steps. This research highlights the potential of AI to transform educational support roles, but it also underscores the need for continued refinement to ensure accuracy, rubric alignment, and seamless integration into existing workflows. While not a perfect replacement for human TAs, TAMIGO offers a glimpse into a future where AI handles the heavy lifting of grading, freeing educators to focus on higher-level teaching and student interaction.
🍰 Interested in building your own agents?
PromptLayer provides the tools to manage and monitor prompts with your whole team. Get started for free.

Questions & Answers

How does TAMIGO's technical architecture enable it to assess student code and generate viva questions?
TAMIGO utilizes Large Language Models (LLMs) to analyze student code submissions and generate assessment content. The system follows a structured process: First, it processes the submitted code and compares it against predefined rubrics. Then, it generates targeted viva questions based on the specific implementation details in the student's code. The system also provides feedback by analyzing code structure, functionality, and adherence to distributed systems principles. For example, when assessing a distributed system implementation, TAMIGO might generate questions about the student's chosen synchronization mechanisms or their approach to handling network failures.
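The paper doesn't release TAMIGO's source, but the question-generation step described above can be sketched in a few lines. The snippet below is an illustrative Python sketch assuming an OpenAI-style chat API; the model name, prompt wording, and the generate_viva_questions helper are hypothetical, not TAMIGO's actual implementation.

```python
# Illustrative sketch only: the paper does not release TAMIGO's code.
# Assumes an OpenAI-style chat API; the model name, prompt wording, and
# the generate_viva_questions helper are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_viva_questions(student_code: str, rubric: str, n: int = 3) -> str:
    """Ask an LLM for viva questions targeted at one student's submission."""
    prompt = (
        "You are a teaching assistant for a distributed systems course.\n"
        f"Grading rubric:\n{rubric}\n\n"
        f"Student submission:\n{student_code}\n\n"
        f"Write {n} viva questions that probe this student's specific design "
        "choices (e.g., synchronization mechanisms, failure handling)."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the paper's model choice may differ
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```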
What are the main benefits of using AI in educational assessment?
AI in educational assessment offers several key advantages for both educators and students. It primarily saves time by automating repetitive grading tasks, allowing teachers to focus more on personalized instruction and mentoring. AI systems can provide instant feedback, enabling students to learn from their mistakes immediately rather than waiting days for grades. In practical applications, AI can grade multiple-choice tests, evaluate written assignments, and even assess coding projects. For instance, in programming courses, AI can check code functionality, style, and efficiency automatically, providing consistent and objective evaluation across large class sizes.
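To make the "check code functionality automatically" point concrete (this is not taken from the paper), here is a minimal autograder sketch that runs an instructor's hidden pytest suite against a student submission; the directory layout and the grade_submission helper are assumptions.

```python
# Minimal autograder sketch (not from the paper): run a hidden pytest suite
# against a student submission and report whether all tests pass.
import subprocess
import sys

def grade_submission(submission_dir: str, test_file: str) -> bool:
    """Return True if the student's code passes the instructor's test suite."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", test_file, "-q"],
        cwd=submission_dir,   # run inside the student's submission directory
        capture_output=True,
        text=True,
        timeout=60,           # guard against infinite loops in student code
    )
    print(result.stdout)
    return result.returncode == 0
```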
How is AI transforming the future of education?
AI is revolutionizing education by introducing personalized learning experiences and automated assessment tools. It helps create adaptive learning paths that adjust to each student's pace and learning style, while also assisting teachers with administrative tasks like grading and feedback generation. The technology enables 24/7 learning support through chatbots and intelligent tutoring systems. For example, AI can analyze student performance patterns to identify areas where they need additional help, recommend relevant resources, and provide immediate feedback on assignments. This transformation is making education more efficient, accessible, and tailored to individual needs.

PromptLayer Features

  1. Testing & Evaluation
TAMIGO's varying performance in feedback generation highlights the need for robust testing and evaluation frameworks.
Implementation Details
Set up A/B testing comparing LLM outputs against human-graded benchmarks; implement regression testing for rubric alignment; and create scoring metrics for feedback quality.
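A minimal sketch of the regression-testing idea, assuming human-graded benchmark scores are stored as a JSON mapping of submission IDs to points. The data format, tolerance value, and check_rubric_alignment helper are all assumptions, not the paper's method.

```python
# Hedged sketch: compare LLM-assigned rubric scores against a human-graded
# benchmark. The JSON format, tolerance, and helper name are assumptions.
import json

TOLERANCE = 1.0  # maximum allowed deviation per submission, in points

def check_rubric_alignment(benchmark_path: str, llm_scores: dict) -> float:
    """Return the fraction of submissions where the LLM stays within tolerance."""
    with open(benchmark_path) as f:
        human_scores = json.load(f)  # e.g. {"submission_17": 8.5, ...}
    aligned = sum(
        1
        for sid, human in human_scores.items()
        if abs(llm_scores.get(sid, float("inf")) - human) <= TOLERANCE
    )
    return aligned / len(human_scores)

# Example CI gate: fail the suite if alignment drops below a chosen threshold.
# assert check_rubric_alignment("benchmark.json", llm_scores) >= 0.95
```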
Key Benefits
• Systematic evaluation of LLM feedback accuracy
• Early detection of hallucinations and inconsistencies
• Quantifiable measurement of rubric alignment
Potential Improvements
• Integrate automated rubric compliance checks
• Develop customized evaluation metrics for educational feedback
• Implement confidence scoring for generated responses
Business Value
Efficiency Gains
Reduce time spent manually validating AI feedback by 60%
Cost Savings
Decrease error correction overhead by identifying problematic outputs early
Quality Improvement
Ensure 95% alignment with grading standards through systematic testing
  2. Workflow Management
The need for streamlined integration of AI feedback into existing TA grading processes.
Implementation Details
Create templated workflows for different assessment types; implement version tracking for prompt improvements; and establish feedback review pipelines.
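One way to realize "templated workflows with version tracking" is sketched below. This is plain illustrative Python, not PromptLayer's actual API; the PromptTemplate and AssessmentWorkflow classes are invented for the example.

```python
# Illustrative only: not PromptLayer's actual API.
from dataclasses import dataclass, field

@dataclass
class PromptTemplate:
    name: str      # e.g. "viva", "code_feedback"
    version: int   # bump on every prompt improvement
    template: str

    def render(self, **kwargs: str) -> str:
        return self.template.format(**kwargs)

@dataclass
class AssessmentWorkflow:
    """Keeps one current (highest-version) template per assessment type."""
    templates: dict = field(default_factory=dict)

    def register(self, t: PromptTemplate) -> None:
        # Retain only the newest version of each named template.
        current = self.templates.get(t.name)
        if current is None or t.version > current.version:
            self.templates[t.name] = t

    def build_prompt(self, assessment_type: str, **kwargs: str) -> str:
        return self.templates[assessment_type].render(**kwargs)

wf = AssessmentWorkflow()
wf.register(PromptTemplate("viva", 1, "Ask 3 viva questions about:\n{code}"))
wf.register(PromptTemplate("viva", 2, "Ask 3 targeted viva questions about:\n{code}"))
print(wf.build_prompt("viva", code="def replicate(state): ..."))  # uses v2
```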
Key Benefits
• Standardized assessment processes
• Traceable feedback generation steps
• Simplified TA workflow integration
Potential Improvements
• Add conditional logic for different assignment types
• Implement feedback revision tracking
• Create collaborative review workflows
Business Value
Efficiency Gains
Streamline grading workflow by 40% through automated orchestration
Cost Savings
Reduce TA time investment through optimized processes
Quality Improvement
Ensure consistent assessment quality across different TAs and assignments
