Author
Snowflake AI Research
May 02, 2024

Moving Beyond MTEB and BEIR: Snowflake AI Research Joins Forces with the University of Waterloo to Evolve RAG and Retrieval Benchmarks


To accurately answer business questions using LLMs, companies must augment models with their own data. Retrieval-augmented generation (RAG) is a popular solution to this problem, as it integrates the organization’s factual, real-time data into the prompt for the LLM. While the adoption of RAG has increased, an open question remains: How do enterprises know how effective their system is?

Due to growing interest in improving retrieval quality, open and collaboratively developed benchmarks such as BEIR, MTEB, and MS MARCO have made it easier to compare and evaluate the surge of new retrieval systems. These benchmarks evolved from collections of independent data sets in well-studied workloads, and we, along with many other retrieval experts, used them to quantify the performance of the Arctic embed models. As we continue to develop more advanced and efficient retrieval that enables enterprises to talk to their data, it’s crucial to ensure that the benchmarking data sets represent these use cases directly. Building on Snowflake’s broadly used data cloud, we aim to openly and collaboratively support the evolution of retrieval benchmarks to propel the industry forward.

To help the broader ecosystem continue to improve performance, we’re thrilled to announce a collaboration between Snowflake and a team of retrieval experts from the University of Waterloo, led by Professor Jimmy Lin and renowned for its research prowess. Together, we’re embarking on a mission to build the next generation of retrieval evaluation benchmarks to better understand and evaluate how RAG agents perform.

“As a researcher, I’m thrilled to collaborate with Snowflake on this joint mission to build an improved representation of real-world retrieval applications,” said Prof. Lin. “The expertise in practical enterprise AI from Snowflake, combined with our academic insights, promises to unlock new frontiers in AI innovation.”

At Snowflake, we aim to empower our customers to get the most out of their enterprise data. From efficient and scalable elastic computing, to the best tools and frameworks to talk to your data, we strive to deliver insights quickly, accurately, and efficiently. With the growth of RAG-like systems and workflows, it quickly became apparent that we must qualify and quantify how well these systems perform. 

As with all prior benchmarks, metrics and tasks become saturated, and the gap between improvements on a leaderboard and improvements in real-world use begins to widen. In our work on our open source embedding model family, Snowflake Arctic embed, we found MTEB crucial to quick iteration and qualification, but we saw a growing gap between improvements on existing benchmarks and improvements on our internal benchmarks.

Our collaboration is not about creating novel retrieval models. It’s about creating novel open-source data sets and tasks to revolutionize the field. We’re fostering a community-driven approach to research and development, a strategy that promises to bring about exciting and groundbreaking changes.

  • TREC RAG: Using the past experiences of Professor Lin and Snowflake’s own Dr. Daniel Campos in creating world-class benchmarks and data sets, this RAG track focuses on understanding and evaluating the quality of cited and grounded generation and how it is influenced by the quality of retrieval, generation mode and use case.
  • BEIR v2 (Benchmarking Evaluation of Information Retrieval): Building on Nandan Thakur’s experience in building the first BEIR benchmark and expertise with commercial search systems, we seek to create a new and improved retrieval benchmark that is more representative of the workloads people use embedding models for. 

We’re not just excited about this journey; we’re thrilled. Thrilled to shape the future of information retrieval and AI with the University of Waterloo, Professor Jimmy Lin and his brilliant researchers. Stay tuned for updates on our progress and the breakthroughs that will emerge from this collaboration. We’re confident that they will be nothing short of remarkable!

Join us at the Snowflake Data Cloud Summit in San Francisco this June to learn more about our AI research.
