Research Papers
Published
Jun 28, 2024
Can AI Follow Orders? New Benchmark Tests LLMs' Ability to Multitask
Xinyi Chen|Baohao Liao|Jirui Qi|Panagiotis Eustratiadis|Christof Monz|Arianna Bisazza|Maarten de Rijke
Published
Jun 28, 2024
Can AI Really Use Tools? A New Benchmark Reveals the Truth
Yuxiang Zhang|Jing Chen|Junjie Wang|Yaxin Liu|Cheng Yang|Chufan Shi|Xinyu Zhu|Zihao Lin|Hanwen Wan|Yujiu Yang|Tetsuya Sakai|Tian Feng|Hayato Yamana
Published
Jun 28, 2024
Unlocking Biomedical Secrets: A New Dataset for AI
Chen Tang|Bohao Yang|Kun Zhao|Bo Lv|Chenghao Xiao|Frank Guerin|Chenghua Lin
Published
Jun 28, 2024
BMW's AI Agents: Automating Tasks Through Teamwork
Noel Crawford|Edward B. Duffy|Iman Evazzade|Torsten Foehr|Gregory Robbins|Debbrata Kumar Saha|Jiya Varma|Marcin Ziolkowski
Published
Jun 28, 2024
The Curious Case of AI's Language Confusion
Kelly Marchisio|Wei-Yin Ko|Alexandre Bérard|Théo Dehaze|Sebastian Ruder
Published
Jun 28, 2024
The Hidden Threat of AI Finetuning: Can We Secure It?
Danny Halawi|Alexander Wei|Eric Wallace|Tony T. Wang|Nika Haghtalab|Jacob Steinhardt
Published
Jun 28, 2024
Can AI Write Secure Smart Contracts? An Investigation
Siddhartha Chatterjee|Bina Ramamurthy
Published
Jun 28, 2024
Unlocking the Power of Text: How EVF-SAM Enables Text-Prompted Segmentation
Yuxuan Zhang|Tianheng Cheng|Rui Hu|Lei Liu|Heng Liu|Longjin Ran|Xiaoxin Chen|Wenyu Liu|Xinggang Wang
Published
Jun 28, 2024
Beyond True or False: Why AI Fact-Checkers Need Molecular Facts
Anisha Gunjal|Greg Durrett