code_trans_t5_large_source_code_summarization_python_multitask_finetune

SEBIS

CodeTrans T5-Large model for Python code summarization, with 220M parameters and multi-task training on 13 supervised tasks. It achieves a BLEU score of 13.24 on Python code summarization.

Property                 Value
Parameters               220M
Architecture             T5-Large
Training Infrastructure  TPU Pod V3-8
BLEU Score (Python)      13.24

What is code_trans_t5_large_source_code_summarization_python_multitask_finetune?

This is a specialized code transformation model based on the T5-Large architecture, designed for Python code summarization. It was trained with a multi-task approach spanning 13 supervised software-development tasks and 7 unsupervised datasets, then fine-tuned for Python code summarization.

Implementation Details

The model uses an encoder-decoder architecture with a custom SentencePiece vocabulary. Training proceeded in two phases: 500,000 steps of multi-task training with a batch size of 4096, followed by 100 steps of fine-tuning on Python-specific data. Optimization used AdaFactor with an inverse square-root learning-rate schedule.

  • Multi-task training on 13 supervised software development tasks
  • Custom SentencePiece vocabulary model
  • Sequence length of 512 tokens
  • TPU Pod V3-8 infrastructure for training
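The inverse square-root schedule mentioned above can be sketched as follows. The 10,000-step warmup constant is an assumption borrowed from the usual T5 defaults, not a value stated in this card:

```python
import math

def inverse_sqrt_lr(step, warmup_steps=10_000):
    """T5-style inverse square-root schedule.

    Holds the rate at 1/sqrt(warmup_steps) during warmup,
    then decays it as 1/sqrt(step).
    """
    return 1.0 / math.sqrt(max(step, warmup_steps))

# Constant during warmup, then decaying:
print(inverse_sqrt_lr(1))       # 0.01
print(inverse_sqrt_lr(40_000))  # 0.005
```

With these assumed defaults, the learning rate stays at 0.01 for the first 10,000 steps and shrinks smoothly afterwards, which is why long schedules like the 500,000-step pretraining phase remain stable without manual decay milestones.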

Core Capabilities

  • Generate natural language descriptions for Python functions
  • Process both tokenized and untokenized Python code
  • Achieve competitive BLEU scores (13.24 for Python)
  • Support for fine-tuning on related Python code tasks

Frequently Asked Questions

Q: What makes this model unique?

The model's uniqueness lies in its comprehensive multi-task training approach, combining 13 supervised tasks with 7 unsupervised datasets, making it particularly robust for Python code understanding and summarization.

Q: What are the recommended use cases?

The model excels at generating documentation for Python functions and can be fine-tuned for related code tasks. It performs optimally with tokenized Python code but can handle untokenized code as well.
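As a rough illustration of what "tokenized Python code" can mean, the sketch below flattens a function into space-separated tokens using Python's standard tokenize module. The helper name and the exact filtering rules are illustrative assumptions, not the preprocessing shipped with this model:

```python
import io
import tokenize

def tokenize_python(source):
    """Flatten Python source into space-separated tokens, dropping
    comments and purely structural tokens (newlines, indentation)."""
    skip = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)
    tokens = []
    reader = io.StringIO(source).readline
    for tok_type, tok_text, _, _, _ in tokenize.generate_tokens(reader):
        if tok_type in skip:
            continue
        tokens.append(tok_text)
    return " ".join(tokens)

code = "def add(a, b):\n    return a + b  # sum\n"
print(tokenize_python(code))  # def add ( a , b ) : return a + b
```

A preprocessing step along these lines produces the whitespace-separated token stream the model was fine-tuned on, while untokenized code can still be passed in directly at some cost in summary quality.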
