Goldman Sachs has spent the past six months working with Anthropic to co-develop autonomous AI agents that automate a growing set of operational roles within the bank, Goldman CIO Marco Argenti said in an exclusive interview first reported by CNBC and later picked up by Reuters.
The work centers on two early use cases that sit directly on the critical path of revenue and risk: accounting for trades and transactions, plus client vetting and onboarding. In both areas, the constraint rarely comes from a single system. It comes from the combination of high-volume documents, structured data, strict rules, exception handling, and human judgment loops. That mix is exactly where agentic systems tend to compound value once they reach production reliability.
Why do these two functions matter first?
Trade accounting and reconciliation is a machine-made problem that still consumes significant human time. Data arrives late, formats differ across venues and counterparties, and exceptions require investigation across multiple systems. A capable agent can pull supporting artifacts, apply the firm's reconciliation logic, generate an audit trail, and route edge cases to the right operator with a clean summary and a recommended action.
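To make the shape of that workflow concrete, here is a minimal sketch of the matching-and-exception step. Everything in it is illustrative: the record fields, the tolerance, and the break categories are hypothetical, not Goldman's actual reconciliation logic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TradeRecord:
    # Hypothetical minimal trade representation for illustration.
    trade_id: str
    quantity: int
    price: float

def reconcile(internal, counterparty, price_tolerance=0.01):
    """Match internal records to counterparty confirms.

    Returns (matched_ids, exceptions), where each exception pairs a
    trade ID with a short break summary an operator could act on.
    """
    confirms = {t.trade_id: t for t in counterparty}
    matched, exceptions = [], []
    for rec in internal:
        confirm = confirms.get(rec.trade_id)
        if confirm is None:
            exceptions.append((rec.trade_id, "missing counterparty confirm"))
        elif rec.quantity != confirm.quantity:
            exceptions.append(
                (rec.trade_id,
                 f"quantity break: {rec.quantity} vs {confirm.quantity}"))
        elif abs(rec.price - confirm.price) > price_tolerance:
            exceptions.append(
                (rec.trade_id,
                 f"price break: {rec.price} vs {confirm.price}"))
        else:
            matched.append(rec.trade_id)
    return matched, exceptions
```

In a real agentic deployment, the exception tuples would be enriched with supporting documents and a recommended action before being routed to a human queue; the point of the sketch is only that the matching logic itself is deterministic and auditable.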
Client vetting and onboarding carry a different kind of complexity. The bank must synthesize due diligence inputs, validate identities and entities, screen against policy requirements, and move cases through approval workflows. Compressing cycle time here improves client experience and accelerates time to revenue, while also tightening consistency in policy application. Reuters notes Goldman expects the agents to reduce the time required for these processes.
From copilots to digital coworkers
This initiative signals a shift from assistant-style tooling toward agents that execute end-to-end tasks. Argenti described the concept as a “digital co-worker” for process-intensive professions inside the firm, with early agents built on Anthropic’s Claude model.
That framing matters because it changes what a successful deployment looks like. A copilot improves an individual’s output. An agent changes throughput, queue dynamics, and operating cost per unit of work. It also forces the organization to define ownership, escalation paths, and measurable service levels.
The playbook behind the partnership
Two details stand out in the reporting.
First, Goldman embedded Anthropic engineers with its teams for co-development. That usually indicates the bank is optimizing for production-grade integration rather than a demo layer.
Second, the bank previously tested an autonomous coding agent called Devin with its engineering workforce, then expanded its focus after seeing Claude perform strongly beyond coding.
Put together, the story looks like a deliberate sequencing strategy:
- Start where productivity gains are easy to observe, like coding support.
- Validate model reliability on complex reasoning tasks.
- Move into regulated, high-consequence workflows where cycle time and consistency create clear ROI.
- Expand into additional scaled roles once governance and controls stabilize.
What this means for headcount, vendors, and operating models
Goldman leadership has publicly discussed using generative AI as part of a multi-year reorganization, with an emphasis on managing headcount growth while modernizing how work gets done. Coverage across major outlets highlights banks aiming for productivity gains first, then selective restructuring as systems mature.
In the Reuters summary of the CNBC reporting, Argenti characterized job impact as an early-stage question rather than an immediate outcome. The more immediate implication is capacity injection: faster processing, fewer bottlenecks, and improved client turnaround.
A second-order implication is vendor pressure. Argenti suggested that as AI matures, Goldman may reduce reliance on certain third-party providers used today. That is a direct line to the broader debate about AI-driven disruption across software and services categories.
Governance becomes the real differentiator
Agent deployments in finance rise and fall on governance, not model demos.
A bank-level agent program typically requires:
- Clear permissioning and data access rules. Agents require least-privilege access, durable identity, and continuous monitoring.
- Deterministic audit trails. Every action must be attributable, replayable, and explainable to internal risk stakeholders.
- Exception management. Agents create value by efficiently handling the long tail of edge cases, which requires deliberate human-escalation design and feedback loops.
- Model risk management and evaluation. Continuous testing against policy, drift, and failure modes becomes part of operations, similar to existing controls around models used in risk and pricing.
These controls turn agent adoption from a technology project into an operating model shift.
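The first two controls above, least-privilege permissioning and a deterministic audit trail, can be sketched as a thin enforcement layer around every agent action. The policy table, agent names, and action strings below are entirely hypothetical; a production system would use an append-only store rather than an in-memory list.

```python
import json
import time

# Hypothetical policy: which actions each agent identity may perform.
POLICY = {
    "recon-agent": {"read_trades", "write_recon_report"},
    "onboarding-agent": {"read_kyc_docs", "update_case_status"},
}

# Illustrative stand-in for an append-only, replayable audit store.
audit_log = []

def execute(agent_id, action, payload):
    """Enforce least privilege, then record a deterministic audit entry."""
    permitted = action in POLICY.get(agent_id, set())
    audit_log.append({
        "ts": time.time(),
        "agent": agent_id,
        "action": action,
        # Canonical JSON keeps entries byte-stable for replay/comparison.
        "payload": json.dumps(payload, sort_keys=True),
        "permitted": permitted,
    })
    if not permitted:
        raise PermissionError(f"{agent_id} may not perform {action}")
    return {"status": "ok", "action": action}
```

Note that the denied attempt is logged before the exception is raised: an audit trail that only records successes cannot answer the "what did the agent try to do" questions risk stakeholders will ask.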
The timeline signal
Argenti said Goldman expects to launch the agents soon, while keeping a specific date confidential. That phrasing usually signals the work is past pure experimentation and entering the stage where rollout depends on integration testing, controls, and readiness for operational ownership.
Where Goldman could expand next
In the same reporting thread, Argenti referenced potential future agent areas such as pitchbook production and employee surveillance-related tasks. Whether or not those become priority areas, the direction is consistent: once an agent can reliably execute multi-step reasoning on messy enterprise inputs, any workflow with heavy documentation, repeatable rules, and measurable cycle times becomes a candidate.
Strategic takeaway for financial services leaders
Goldman’s Anthropic build suggests a near-term pattern that other banks and insurers will likely copy:
- Target a narrow set of high-volume workflows tied to revenue and risk.
- Embed implementation talent to reach production quality.
- Treat agents as a service layer with SLAs, controls, and ownership.
- Use early wins to justify expansion and rationalize vendor spend.
The organizations that move fastest here typically decide on an operating model first, then tooling.