HTML-Pruner-Llama-1B

zstanjj

A 1.24B-parameter LLaMA model optimized for efficient HTML content pruning in RAG systems, featuring a two-step block-tree pruning approach.

Property         Value
Parameter Count  1.24B parameters
License          Apache 2.0
Paper            HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
Base Model       meta-llama/Llama-3.2-1B

What is HTML-Pruner-Llama-1B?

HTML-Pruner-Llama-1B is a specialized language model for RAG (Retrieval-Augmented Generation) systems that work with retrieved HTML rather than plain text. This 1.24B-parameter model implements a two-step HTML pruning approach that shortens the content passed downstream while preserving its semantic information.

Implementation Details

The model is part of a two-step, block-tree-based HTML pruning strategy: an embedding model first scores the blocks of the tree, and a generative model that works on block paths then refines the selection. Before pruning, a Lossless HTML Cleaning process removes redundant structures while preserving semantic information.
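
The cleaning step described above can be illustrated with a rough sketch. It uses only Python's standard `html.parser` and drops scripts, styles, comments, and attributes while keeping tags and visible text. The class name `HtmlCleaner`, the helper `clean_html`, and the exact set of dropped tags are assumptions made for illustration; the rules actually applied by HtmlRAG may differ.

```python
import re
from html import escape
from html.parser import HTMLParser

# Assumption: these tags never carry answer-relevant text, so their content is dropped.
DROP_TAGS = {"script", "style"}
# Assumption: void tags are discarded entirely in this simplified sketch.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class HtmlCleaner(HTMLParser):
    """Re-emits HTML without scripts, styles, comments, or attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a dropped tag

    def handle_starttag(self, tag, attrs):
        if tag in DROP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag not in VOID_TAGS:
            self.out.append(f"<{tag}>")  # attributes are discarded

    def handle_endtag(self, tag):
        if tag in DROP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth:
            return
        text = re.sub(r"\s+", " ", data)  # collapse whitespace runs
        if text.strip():
            self.out.append(escape(text, quote=False))

    # Comments are dropped because handle_comment is left as the default no-op.


def clean_html(raw: str) -> str:
    """Hypothetical helper: return a cleaned copy of the input HTML."""
    cleaner = HtmlCleaner()
    cleaner.feed(raw)
    cleaner.close()
    return "".join(cleaner.out)
```

Calling `clean_html` on a page fragment returns the same tag structure and text with the non-content parts removed, which is the input a block-tree builder would then work on.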

  • Two-Step Block-Tree-Based HTML Pruning architecture
  • Lossless HTML Cleaning capability
  • Built on LLaMA architecture with BF16 tensor type
  • Optimized for context windows up to 60 tokens
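
As a minimal sketch of the two-step idea, the code below prunes a list of blocks twice under a shrinking budget: a coarse pass followed by a finer pass over the survivors. Everything here is an assumption for illustration: blocks are represented as hypothetical `(path, text)` pairs, word counts stand in for token counts, and `overlap_score` is a stand-in for both the embedding model and the generative model used in the actual method.

```python
from typing import Callable, List, Tuple

Block = Tuple[str, str]  # (path in the block tree, block text); hypothetical representation
Scorer = Callable[[str, str], float]


def overlap_score(query: str, text: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the block."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def prune(blocks: List[Block], query: str, budget: int, scorer: Scorer) -> List[Block]:
    """Greedily keep the highest-scoring blocks whose total word count fits the budget."""
    ranked = sorted(range(len(blocks)),
                    key=lambda i: scorer(query, blocks[i][1]), reverse=True)
    kept, used = set(), 0
    for i in ranked:
        cost = len(blocks[i][1].split())  # word count as a rough proxy for tokens
        if used + cost <= budget:
            kept.add(i)
            used += cost
    return [blocks[i] for i in sorted(kept)]  # restore document order


def two_step_prune(blocks: List[Block], query: str,
                   coarse_budget: int, fine_budget: int,
                   coarse_scorer: Scorer, fine_scorer: Scorer) -> List[Block]:
    """Step 1: coarse pruning over all blocks. Step 2: finer pruning over the survivors."""
    survivors = prune(blocks, query, coarse_budget, coarse_scorer)
    return prune(survivors, query, fine_budget, fine_scorer)
```

The design point is that the cheap scorer sees the whole document, while the more expensive scorer only sees what survives the first pass.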

Core Capabilities

  • Efficient HTML content pruning while maintaining semantic meaning
  • Block-tree structure analysis and optimization
  • Integration with various block rankers for the first pruning step (the lexical BM25 ranker and the BGE and E5-Mistral embedding models)
  • Competitive performance across multiple QA datasets (ASQA, HotpotQA, NQ, TriviaQA)
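
BM25, listed above, is a lexical ranking function rather than a neural embedding model, but it can fill the same block-scoring role. The following is a small self-contained sketch of Okapi BM25 applied to the text of HTML blocks. The function name `bm25_scores`, the whitespace tokenization, and the default `k1` and `b` values are assumptions chosen for illustration, not settings taken from the model card.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document (here: one HTML block's text) against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: number of blocks that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(query.lower().split())

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Non-negative IDF variant, as used by Lucene.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Sorting block indices by these scores gives a ranking that a budget-based pruning step can consume in place of embedding similarities.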

Frequently Asked Questions

Q: What makes this model unique?

This model is specialized for HTML rather than plain text. Its two-step pruning approach shortens retrieved HTML while keeping its structure intact, and the accompanying HtmlRAG paper reports that feeding pruned HTML to the generator outperforms plain-text RAG baselines on QA benchmarks such as ASQA, HotpotQA, NQ, and TriviaQA.

Q: What are the recommended use cases?

The model is suited to RAG and question-answering systems that retrieve HTML documents, and to applications that need to shorten HTML content while preserving its semantic structure. It is particularly useful where the retrieved content must fit within a limited context length.
