OUNLP: Natural Language Processing Lab
University of Oklahoma, School of Computer Science

AgentBeats SDK, Agentic AI MOOC, and OUNLP Work on Multi-Agent Evaluation

This post highlights our participation in the AgentX–AgentBeats Competition and the Berkeley Agentic AI MOOC, both of which focus on building reliable, verifiable multi-agent systems. The OUNLP lab is contributing by developing agentic evaluation pipelines grounded in real-world tasks and reproducible verifiers.

AgentBeats: A Standardized Platform for Agent Evaluation

The AgentBeats SDK, developed by Sierra, provides a unified framework for testing and evaluating multi-agent systems. It introduces structured agent roles and deterministic verifiers that allow researchers to run reproducible experiments over complex tasks.

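AgentBeats uses two core agent roles: one agent generates artifacts for a task, and a second agent (the evaluating, or "green", agent) independently verifies their correctness.

This design mirrors real-world engineering workflows, where the component that produces an artifact is kept separate from the component that checks it.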
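
The split between a generating role and a verifying role can be sketched in a few lines of Python. This is an illustrative sketch only and does not use the AgentBeats SDK: `Verdict`, `run_episode`, `toy_generator`, and `toy_verifier` are hypothetical names invented for this example, and the generator returns a fixed string where a real agent would call a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Verdict:
    """Outcome reported by the verifying role."""
    passed: bool
    reason: str


def run_episode(task: str,
                generate: Callable[[str], str],
                verify: Callable[[str, str], Verdict]) -> Verdict:
    """Ask the generating role for an artifact, then hand it to the verifier."""
    artifact = generate(task)
    return verify(task, artifact)


def toy_generator(task: str) -> str:
    # Hypothetical stand-in: a real agent would call a model here.
    return "<h1>Hello</h1>" if "heading" in task else ""


def toy_verifier(task: str, artifact: str) -> Verdict:
    # Deterministic check: the same artifact always yields the same verdict.
    if artifact.startswith("<h1>") and artifact.endswith("</h1>"):
        return Verdict(True, "artifact is wrapped in an h1 element")
    return Verdict(False, "expected an h1 element")


verdict = run_episode("render a heading", toy_generator, toy_verifier)
print(verdict.passed, verdict.reason)
```

Because the verifier is a pure function of the task and the artifact, rerunning the same episode gives the same verdict, which is the property that makes results comparable across runs.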

Insights from the Berkeley Agentic AI MOOC

Across the MOOC, invited speakers from OpenAI, DeepMind, Microsoft, Berkeley RDI, and Sierra emphasized principles required for dependable agentic systems:

These lessons directly inform how our lab approaches agent design and benchmarking.

OUNLP Project: Agentifying the Design2Code Pipeline

Our lab is building a green-agent-powered evaluation system for the Design2Code framework—a visual-to-code pipeline that translates webpage images or sketches into responsive HTML/CSS.

Our agentic integration includes:

This transforms Design2Code from a generative model into a fully verifiable agentic task, suitable for research, benchmarking, and competition submissions.
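
To illustrate what a reproducible check on generated HTML could look like, here is a minimal sketch that compares the tag structure of a generated page against a reference page using only the Python standard library. It is not the Design2Code metric suite and not our lab's verifier: the names `structure_score` and `verify_page`, the tag-overlap measure, and the `0.8` threshold are all assumptions chosen for the example.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each HTML start tag appears in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1


def tag_counts(html: str) -> Counter:
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.tags


def structure_score(reference_html: str, generated_html: str) -> float:
    """Multiset overlap of tags: 1.0 for identical tag counts, lower otherwise."""
    ref, gen = tag_counts(reference_html), tag_counts(generated_html)
    overlap = sum((ref & gen).values())  # per-tag minimum
    total = sum((ref | gen).values())    # per-tag maximum
    return overlap / total if total else 1.0


def verify_page(reference_html: str, generated_html: str,
                threshold: float = 0.8) -> bool:
    # Hypothetical pass/fail rule; the threshold is an assumption.
    return structure_score(reference_html, generated_html) >= threshold


reference = "<div><h1>Title</h1><p>Body</p></div>"
generated = "<div><h1>Title</h1></div>"
print(structure_score(reference, generated))
print(verify_page(reference, generated))
```

A structural check like this ignores layout, text, and styling, so it would only be one signal among several in a full evaluation; its appeal is that it is deterministic and cheap to rerun.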

Looking Forward

As we continue through the MOOC and competition:

Agentic AI is rapidly evolving from single-shot prompting toward dependable, autonomous systems. Platforms like AgentBeats give us a testbed to study, measure, and improve these emerging capabilities—and OUNLP is excited to be part of this development.