BART-Large-MNLI Yahoo Answers

  • Author: joeddav
  • Base Model: facebook/bart-large-mnli
  • Task: Zero-shot Topic Classification
  • Model Hub: Hugging Face

What is bart-large-mnli-yahoo-answers?

This model is a version of BART-Large-MNLI fine-tuned on Yahoo Answers topic classification data. It is designed for zero-shot classification: it can predict whether a topic label applies to a given text sequence, even for labels it has not seen during training. To do this, it frames classification as a Natural Language Inference (NLI) problem.
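As a quick illustration, the model can be used through Hugging Face's zero-shot classification pipeline. The input text and candidate labels below are invented for demonstration and are not taken from the model card.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint into the zero-shot classification pipeline
classifier = pipeline("zero-shot-classification",
                      model="joeddav/bart-large-mnli-yahoo-answers")

# Illustrative input and candidate topic labels
sequence = "How do I cite a website in APA format?"
candidate_labels = ["Education & Reference", "Sports", "Health"]

result = classifier(sequence, candidate_labels)
print(result["labels"][0], round(result["scores"][0], 3))  # top label and its score
```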

Implementation Details

The model builds on the BART-Large-MNLI architecture and follows the NLI-based approach to zero-shot classification: each input sequence is treated as a premise and each candidate label is converted into a hypothesis using the template "This text is about {}." During fine-tuning, only 5 of the 10 Yahoo Answers labels were used, so that zero-shot performance on the held-out labels could be measured (a sketch of the premise/hypothesis mechanics follows the list below).

  • Trained on specific categories: Society & Culture, Health, Computers & Internet, Business & Finance, and Family & Relationships
  • Achieves F1 scores of 0.68 and 0.72 for unseen and seen labels respectively
  • Implements a 30% probability adjustment for seen labels to balance predictions
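
The sketch below shows how the premise/hypothesis framing works with the standard transformers sequence-classification API. The input text is invented, converting the NLI scores to a label probability by dropping the neutral logit is the commonly used recipe for this model family rather than necessarily the card's exact evaluation procedure, and the entailment index is read from the model config instead of being hard-coded.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "joeddav/bart-large-mnli-yahoo-answers"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# The premise is the text to classify; the hypothesis is built from the template
premise = "How do I get rid of a computer virus?"  # illustrative text
hypothesis = "This text is about {}.".format("Computers & Internet")

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Convert the NLI logits into the probability that the label applies,
# ignoring the neutral class
entail_idx = model.config.label2id.get("entailment", 2)
contra_idx = model.config.label2id.get("contradiction", 0)
probs = logits[0, [contra_idx, entail_idx]].softmax(dim=0)
print(f"P(label applies) = {probs[1].item():.3f}")
```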

Core Capabilities

  • Zero-shot topic classification without requiring examples of new labels
  • Flexible hypothesis template system for classification tasks
  • Support for both single-label and multi-label classification (see the example after this list)
  • Easy integration with Hugging Face's transformers library
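
As a sketch of multi-label use (the text and labels are again invented), the pipeline's multi_label flag scores each candidate label independently, so several labels can receive high probabilities at once.

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="joeddav/bart-large-mnli-yahoo-answers")

sequence = "My laptop overheats when I run my small business accounting software."
labels = ["Computers & Internet", "Business & Finance", "Sports"]

# multi_label=True scores each label on its own entailment-vs-contradiction axis
result = classifier(sequence, labels, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```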

Frequently Asked Questions

Q: What makes this model unique?

The model's ability to classify topics it hasn't seen during training makes it particularly valuable for real-world applications where new categories frequently emerge. Its fine-tuning on Yahoo Answers data while maintaining zero-shot capabilities sets it apart from traditional classification models.

Q: What are the recommended use cases?

The model is best suited for topic classification, especially when the set of categories is unknown or evolving. For optimal performance, use the hypothesis template "This text is about {}.", since this is the template that was used during fine-tuning; the snippet below shows how to pass it.
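
A minimal example of passing that template through the pipeline's hypothesis_template argument (the input text is invented):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="joeddav/bart-large-mnli-yahoo-answers")

result = classifier(
    "What is the best way to save for my kids' college fund?",  # illustrative text
    candidate_labels=["Business & Finance", "Family & Relationships", "Health"],
    hypothesis_template="This text is about {}.",  # template used during fine-tuning
)
print(result["labels"][0])
```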
