NVIDIA AI-Q Adds Deep Research Skill to Agent Harnesses

NVIDIA packages research pipeline as reusable agent skill

NVIDIA released AI-Q as an open-source blueprint that exposes a dedicated deep research backend as a portable "skill" that general-purpose agent frameworks can delegate to. Agent harnesses like Claude Code, Codex, and LangChain Deep Agents can now submit research tasks (multi-document synthesis, decision briefs, long-horizon analysis) to a running AI-Q server and receive structured reports with source attribution, rather than attempting synthesis themselves.

The skill ships with install scripts for three major harness platforms. Claude Code loads repo-local skills from `.claude/skills/`; Codex from a configured skills directory; OpenCode from `~/.config/opencode/skills/`. Once installed, phrases like "research the regulatory landscape across our internal policy docs and produce a memo" route through the skill, which submits a job to the AI-Q server, polls for completion, and returns a cited output.
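
The submit, poll, and return loop described above can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: `run_research_job`, `FakeResearchServer`, the `state` values, and the payload keys are all hypothetical names chosen for the example, and the real AI-Q server's endpoints and response schema may differ.

```python
import time
from typing import Callable, Dict


def run_research_job(
    submit: Callable[[str], str],
    get_status: Callable[[str], Dict],
    prompt: str,
    poll_interval_s: float = 5.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Submit a research task, poll until it finishes, return the final payload."""
    job_id = submit(prompt)
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(job_id)
        if status["state"] == "completed":
            return status
        if status["state"] == "failed":
            raise RuntimeError(f"research job {job_id} failed: {status.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"research job {job_id} did not finish in {timeout_s}s")
        sleep(poll_interval_s)


class FakeResearchServer:
    """In-memory stand-in for a research backend (hypothetical, for illustration)."""

    def __init__(self, polls_until_done: int = 2):
        self._remaining = polls_until_done
        self.prompts: Dict[str, str] = {}

    def submit(self, prompt: str) -> str:
        job_id = f"job-{len(self.prompts) + 1}"
        self.prompts[job_id] = prompt
        return job_id

    def get_status(self, job_id: str) -> Dict:
        # Report "running" for the first few polls, then a finished report.
        if self._remaining > 0:
            self._remaining -= 1
            return {"state": "running"}
        return {
            "state": "completed",
            "report": "Memo text goes here.",
            "citations": [{"source": "policy-doc-1"}],
        }


if __name__ == "__main__":
    server = FakeResearchServer()
    result = run_research_job(
        server.submit, server.get_status,
        "research the regulatory landscape", poll_interval_s=0.01,
    )
    print(result["state"], len(result["citations"]))
```

Passing `submit` and `get_status` as callables keeps the loop independent of any particular transport, so the same function can wrap whatever client the installed skill actually uses.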

The second part of the release adds first-class Model Context Protocol (MCP) support so AI-Q can authenticate against enterprise data sources without standing up a parallel retrieval stack. Three authentication patterns are documented: unauthenticated MCP servers (simplest case), service-account MCP auth (preferred for CI and shared enterprise sources), and forwarding the signed-in AI-Q user's bearer token (when downstream APIs already trust that identity). Tokens are captured at job-submit time and restored inside async workers, so long-running research jobs preserve user identity context. Token refresh mid-job is not yet supported; jobs that exceed the access token's time-to-live will fail on auth-required calls.

Data sovereignty and auditability matter more than architectural elegance

Agent harnesses are built for orchestration, not research. When agents attempt multi-source synthesis without a dedicated backend, they produce inconsistent results on tasks requiring enterprise data, long-horizon planning, or citation accuracy. More critically, the agent harness gains direct access to sensitive source documents, which is unacceptable in regulated industries.

AI-Q inverts the risk: the research pipeline runs where the data is, reads enterprise data, performs retrieval and synthesis, and emits only the cited output. Raw documents never leave the controlled environment. This is the concrete win for teams in healthcare, financial services, government, and defense. The agent harness sees a single high-level capability and never touches the underlying sources.

Auditability ships as a pipeline feature, not a compliance retrofit. AI-Q reports include source attribution, and the underlying NeMo Agent Toolkit emits OpenTelemetry traces. Compliance teams can inspect which sources were retrieved, how they were used, and how the final cited answer was produced.

Teams can also choose their model path: Nemotron reasoning models handle planning and synthesis, while frontier-model routers handle tasks needing additional capability. Open models can run on-premises as NVIDIA NIM microservices, and teams can disable frontier-model routing entirely to meet strict compliance requirements. The same evaluation harnesses used for internal benchmarking (FreshQA, Deep Research Bench, DeepSearchQA) ship with the blueprint, so teams can measure quality on their own data.

Start with deployment, not experimentation

AI-Q runs on Docker Compose or Helm, so the same blueprint works on a developer laptop, an on-premises Kubernetes cluster, or an air-gapped data center. For teams in regulated industries, the deployment choice is the architectural choice: pick the environment where your data lives, spin up the server there, and expose it to your agent harness through the installed skill.

Begin by spinning up the AI-Q server in your data environment (Docker Compose or Helm from the GitHub repository). Then map your enterprise data sources as MCP servers, starting with the authentication pattern that matches your existing access controls (service account for shared sources, bearer token forwarding if your API gateway already trusts the AI-Q user). Test MCP connectivity against your actual source systems, not dummy data.

Once the server is running, install the skill into your agent framework (three commands per harness type). Verify that your agent can submit a research task and receive a report with citations. Only then wire in your first enterprise use case. The evaluation harnesses ship with the blueprint; run them against your own data to establish a baseline for research quality before you declare the pipeline production-ready.
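
The verification step can be automated with a small smoke check before any enterprise use case is wired in. The sketch below assumes a hypothetical payload shape (a `report` string plus a `citations` list whose entries carry a `source` field); the real AI-Q report format may differ, so the keys would need adjusting to match what the server actually returns.

```python
from typing import Any, Dict, List


def smoke_check_report(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the report looks usable."""
    problems: List[str] = []

    text = payload.get("report")
    if not isinstance(text, str) or not text.strip():
        problems.append("report text is missing or empty")

    citations = payload.get("citations")
    if not isinstance(citations, list) or not citations:
        problems.append("no citations returned")
    else:
        # Every citation should point at some identifiable source.
        for i, citation in enumerate(citations):
            if not isinstance(citation, dict) or not citation.get("source"):
                problems.append(f"citation {i} has no source")

    return problems


if __name__ == "__main__":
    sample = {"report": "Summary of findings.", "citations": [{"source": "doc-17"}]}
    print(smoke_check_report(sample))
```

A check like this only confirms that citations are present and well-formed. It says nothing about whether they are accurate, which is what the bundled evaluation harnesses are for.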

Dell has validated AI-Q on its infrastructure and published a reference architecture for on-premises multi-agent research workflows in regulated industries like financial services, public sector, and manufacturing.

