tqc-PandaReach-v1

Maintained by: sb3

TQC PandaReach-v1 Model

Property        Value
Research Paper  arxiv.org/abs/2106.13687
Framework       stable-baselines3
Mean Reward     -2.30 ± 0.78
Environment     PandaReach-v1

What is tqc-PandaReach-v1?

tqc-PandaReach-v1 is a reinforcement learning model that uses the Truncated Quantile Critics (TQC) algorithm for robotic control. It is trained on the PandaReach-v1 environment, which simulates a Franka Emika Panda robot arm performing a reaching task. The model is built with the stable-baselines3 library and trained through the RL Zoo framework.

Implementation Details

The model uses a MultiInputPolicy with two hidden layers of 64 units each and a HER (Hindsight Experience Replay) buffer, which relabels stored transitions with goals that were actually reached so that sparse rewards still provide a learning signal. The main hyperparameters for the reaching task are:

  • Batch size: 256, with a replay buffer size of 1,000,000
  • Learning rate: 0.001, with learning starting after 1,000 steps
  • Gamma (discount factor): 0.95
  • Uses TimeFeatureWrapper, which appends the remaining episode time to each observation
  • Normalizes observations
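
The settings listed above could be collected into a single configuration. The following is a minimal sketch, assuming the `TQC` class from `sb3_contrib` and `HerReplayBuffer` from `stable_baselines3`. The dictionary keys mirror those constructors' argument names, and `build_model` is a hypothetical helper for illustration, not part of RL Zoo. Environment wrapping and normalization are not shown.

```python
# Hyperparameters taken from the list above; key names mirror sb3_contrib.TQC arguments.
HYPERPARAMS = {
    "policy": "MultiInputPolicy",
    "batch_size": 256,
    "buffer_size": 1_000_000,
    "learning_rate": 1e-3,
    "learning_starts": 1000,
    "gamma": 0.95,
    "policy_kwargs": {"net_arch": [64, 64]},  # two hidden layers of 64 units
}

# HER settings; only the "future" strategy is stated in this document.
HER_KWARGS = {"goal_selection_strategy": "future"}


def build_model(env):
    """Hypothetical helper: build an untrained TQC agent for a goal-conditioned env.

    The imports are local so this snippet can be loaded without the RL
    libraries installed. `env` is assumed to be a Dict-observation
    environment such as PandaReach-v1.
    """
    from sb3_contrib import TQC
    from stable_baselines3 import HerReplayBuffer

    return TQC(
        env=env,
        replay_buffer_class=HerReplayBuffer,
        replay_buffer_kwargs=HER_KWARGS,
        **HYPERPARAMS,
    )
```

Calling `build_model(env).learn(total_timesteps=...)` would then start training; the number of timesteps is not given in this document, so it is left open here.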

Core Capabilities

  • Reaches target positions with a simulated robotic arm (mean reward -2.30 ± 0.78 on PandaReach-v1)
  • Goal-conditioned learning using HER with the "future" goal selection strategy
  • Normalized observations for more stable learning
  • Multi-input policy that handles dictionary observation spaces
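
To make the "future" strategy concrete, here is a minimal, library-free sketch of the relabeling idea: each stored transition has its goal replaced by a goal that was actually achieved at the same or a later step of the same episode, and the reward is recomputed against that new goal. The transition layout, the `relabel_future` and `sparse_reward` names, and the 0.05 distance threshold are assumptions for illustration. The stable-baselines3 `HerReplayBuffer` does this internally, and its exact index conventions may differ.

```python
import math
import random


def sparse_reward(achieved, goal, threshold=0.05):
    """Return 0.0 when the achieved position is within `threshold` of the goal, else -1.0.

    The threshold value is an assumption made for this sketch.
    """
    return 0.0 if math.dist(achieved, goal) < threshold else -1.0


def relabel_future(episode, rng, threshold=0.05):
    """Relabel each transition with a goal achieved at the same or a later step.

    `episode` is a list of dicts with keys 'achieved_goal' (the position
    reached after the step), 'desired_goal', and 'reward'.
    """
    relabeled = []
    last = len(episode) - 1
    for t, step in enumerate(episode):
        future_t = rng.randint(t, last)  # both ends inclusive
        new_goal = episode[future_t]["achieved_goal"]
        relabeled.append({
            **step,
            "desired_goal": new_goal,
            "reward": sparse_reward(step["achieved_goal"], new_goal, threshold),
        })
    return relabeled


# Toy 4-step episode that never reaches its original goal at x = 1.0.
episode = [
    {"achieved_goal": (0.1 * (t + 1), 0.0, 0.0),
     "desired_goal": (1.0, 0.0, 0.0),
     "reward": -1.0}
    for t in range(4)
]
relabeled = relabel_future(episode, random.Random(0))
```

In the toy episode every original reward is -1.0, so the agent would see no success at all. After relabeling, the final transition is guaranteed to count as a success, because the only goal available to it is the position it reached.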

Frequently Asked Questions

Q: What makes this model unique?

This model combines TQC with a HER replay buffer and environment wrappers (TimeFeatureWrapper and observation normalization), a combination suited to goal-conditioned reaching tasks. Its hyperparameters were tuned for this environment.
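
The two observation-processing steps mentioned here can be illustrated without any RL library. The sketch below is an assumption-laden approximation, not the stable-baselines3 implementation: `RunningNormalizer` tracks a running mean and variance and clips the result, and `add_time_feature` appends the fraction of the episode that remains. The class and function names, the clip value of 10.0, and the 50-step episode length in the usage lines are all hypothetical.

```python
import math


class RunningNormalizer:
    """Normalize observations with a running mean and variance (Welford's method)."""

    def __init__(self, size, clip=10.0, eps=1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations per dimension
        self.clip = clip        # assumed clipping range for normalized values
        self.eps = eps          # avoids division by zero for constant dimensions

    def update(self, obs):
        """Fold one observation into the running statistics."""
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, obs):
        """Return the observation shifted, scaled, and clipped to [-clip, clip]."""
        out = []
        for i, x in enumerate(obs):
            var = self.m2[i] / self.count if self.count > 0 else 1.0
            z = (x - self.mean[i]) / math.sqrt(var + self.eps)
            out.append(max(-self.clip, min(self.clip, z)))
        return out


def add_time_feature(obs, step, max_steps):
    """Append the remaining-time fraction: 1.0 at the start, 0.0 at the final step."""
    return list(obs) + [1.0 - step / max_steps]


norm = RunningNormalizer(size=2)
for sample in [(0.0, 10.0), (2.0, 10.0), (4.0, 10.0)]:
    norm.update(sample)

centered = norm.normalize((2.0, 10.0))          # the running mean maps to zeros
start_obs = add_time_feature((0.5,), 0, 50)     # remaining-time feature is 1.0
end_obs = add_time_feature((0.5,), 50, 50)      # remaining-time feature is 0.0
```

Normalization keeps input scales comparable across dimensions, and the time feature lets the policy distinguish early steps from late ones in a fixed-length episode.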

Q: What are the recommended use cases?

The model is specifically designed for robotic arm reaching tasks in simulation environments. It's ideal for research in robotic control, particularly when working with Franka Emika Panda robot simulations or similar reaching tasks requiring precise end-effector control.
