RagaAI Catalyst Integrates with NVIDIA NeMo Agent Toolkit: Building Reliable AI Agents from Day One

Sugandha Sharma (GenAI Architect at NVIDIA), Nitai (Head of Product at RagaAI)

Aug 27, 2025

Introduction:

In today’s fast-moving world of AI agents, reliability and visibility aren’t just nice-to-haves; they’re essential. By bringing together RagaAI Catalyst, our platform for monitoring, evaluating and testing AI, and the NVIDIA NeMo Agent Toolkit for building enterprise-ready agentic AI systems, developers now have one place to build, verify, and improve agents from the first version to production.

Making AI agents dependable is not easy. They often use multi-step reasoning, search and retrieval, and different tools to get their work done. While powerful, this also means more chances for small mistakes, like a broken API, an outdated document, or a wrong search result, to grow into bigger problems.

The solution is to watch the agent’s performance closely at every step, so you can spot problems early and fix them before they affect users. 

Before we get into how the RagaAI and NVIDIA NeMo Agent Toolkit integration makes this possible, let’s look at what each platform offers.

RagaAI Catalyst:

RagaAI Catalyst is an automated testing platform designed for AI applications powered by LLMs, RAG, and agentic workflows. It helps teams quickly spot and resolve issues such as hallucinations, safety risks, and inefficient prompting, so they can build safer, more reliable agents.

What Makes It Stand Out:

  • Real-Time Metrics: Provides deep insights into prompt quality, hallucination risks, context alignment, response safety, and more. These diagnostics help you pinpoint exactly what needs fixing.

  • Highly Accurate Evaluation: Delivers evaluation results that closely match human judgment, making the feedback trustworthy.

  • Built‑In Safety Guardrails: Automatically catches and blocks unsafe or problematic content such as PII leaks, biased or toxic responses, and prompt injection attacks.

  • Structured Prompt & Model Experimentation: Lets your team systematically A/B test different prompts, model variations, or hyperparameters.


Fig 1: RagaAI Catalyst Overview

NVIDIA NeMo Agent Toolkit:

The NVIDIA NeMo Agent Toolkit is an open-source, framework-agnostic library for building, profiling, and scaling AI agents and agentic systems. It works with popular frameworks such as LangChain, LlamaIndex, and CrewAI, so teams can keep their existing stack and extend it.

Key strengths include:

  • Framework-agnostic & modular: build reusable, composable agent workflows.

  • Profiling & observability: track latency, cost, and performance with OpenTelemetry support.

  • Evaluation & optimisation: test agent accuracy, swap tools/models, and optimise compute spend.

  • MCP support: connect to or expose tools via the Model Context Protocol.

  • Enterprise-ready: integrates with NVIDIA NIM, FastAPI, and RAG pipelines for production use.

The toolkit therefore lets enterprise agents connect to diverse data sources and tools across multiple frameworks. The following diagram provides a high-level architectural overview of the toolkit, showing how agents, plugins, workflows, and user interfaces interact within the system.


Fig 2: Agent-Oriented System Architecture

The Integration Advantage:

When building AI agents, one of the biggest challenges isn’t just getting them to work; it’s making sure they’re safe, reliable, and observable once they’re running. Traditionally, this means adding custom code for logging, stitching together monitoring scripts, and juggling multiple dashboards.

With the NeMo Agent Toolkit, that complexity disappears. The toolkit now natively supports RagaAI Catalyst, so for agentic AI developers the evaluation and observability layer is already built in, with no extra code required.

Enabling Catalyst takes just two simple steps:

  1. Export environment variables (as shown in Notebook 1).

  2. Run your agent with a Catalyst-enabled YAML config file (also shown in Notebook 1).

That’s it. Once these are set up, everything your agent does is automatically captured and streamed into Catalyst, from prompts and model responses to evaluation metrics and error traces, and it all shows up in the Catalyst dashboard in real time.

 Fig 3: Unified Agent Monitoring in Catalyst
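The integration itself needs no custom code, but an optional preflight check can make the two steps above easier to verify. The sketch below is illustrative only: the environment variable names (`CATALYST_ACCESS_KEY`, `CATALYST_SECRET_KEY`, `CATALYST_ENDPOINT`), the helper functions, and the layout of the tracing section are assumptions rather than something this post specifies, so confirm them against Notebook 1 before relying on them.

```python
import os

# Assumed variable names; confirm the exact names in Notebook 1.
REQUIRED_VARS = ("CATALYST_ACCESS_KEY", "CATALYST_SECRET_KEY", "CATALYST_ENDPOINT")


def missing_catalyst_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def catalyst_tracing_section(project, dataset):
    """Build a dict mirroring an assumed Catalyst tracing section of the config.

    The key names here are hypothetical placeholders for illustration;
    the authoritative layout is the config file shown in Notebook 1.
    """
    return {
        "general": {
            "telemetry": {
                "tracing": {
                    "catalyst": {
                        "_type": "catalyst",
                        "project": project,
                        "dataset": dataset,
                        # Placeholders meant to be resolved from the environment.
                        "endpoint": "${CATALYST_ENDPOINT}",
                        "access_key": "${CATALYST_ACCESS_KEY}",
                        "secret_key": "${CATALYST_SECRET_KEY}",
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    missing = missing_catalyst_vars()
    if missing:
        print("Set these before running the agent:", ", ".join(missing))
    else:
        print("Catalyst environment variables are set.")
```

Running the script before launching the agent simply reports which of the assumed variables are still unset; it does not contact Catalyst or validate the credentials.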

The result? A plug-and-play workflow where Catalyst seamlessly connects to existing NeMo Agent Toolkit agentic systems. Without touching a single line of custom code, you instantly gain access to:

  • Evaluation metrics that track how your agent is performing.

  • Quality and safety checks to catch issues like hallucinations or retrieval drift.

  • Prompt observability so you can trace every step of your agent’s reasoning.

This is what makes the integration powerful: it’s not an afterthought but native by design. NeMo Agent Toolkit helps you build the agent; Catalyst helps ensure that agent is trustworthy and production-ready, all in one unified environment.

Conclusion:

The integration of NeMo Agent Toolkit and RagaAI Catalyst represents a significant advancement in developing and managing high-performance AI agents. By combining NeMo Agent Toolkit’s powerful agent-building tools with Catalyst’s deep observability and quality checks, teams can work toward accuracy, reliability, and safety at every stage. With a unified workflow, comprehensive testing, and continuous monitoring, developers are equipped to deliver AI agents that meet the highest standards: faster, with fewer risks, and with greater confidence in every deployment.

Getting Started:

The following cookbook notebooks show how to use RagaAI Catalyst with the NVIDIA NeMo Agent Toolkit.

Notebook 1: Calculator Use Case (RagaAI_Catalyst_NeMo_calculator.ipynb): Demonstrates a calculator workflow built using the NeMo Agent Toolkit, with observability and evaluation powered by Catalyst.

Notebook 2: Alert Triage Use Case (RagaAI_Catalyst_NeMo_alert_triage_agent.ipynb): An automated alert triage agentic system built using the NeMo Agent Toolkit, with Catalyst providing monitoring and performance assessment.

Subscribe to our newsletter to never miss an update

Get Started With RagaAI®

Book a Demo

Schedule a call with AI Testing Experts
