The Actual Intelligentsia: agentic

Tuesday, July 28, 2026

The HTTP of AI Shopping: Inside UCP, MCP, and the Open Protocol Stack

In a nutshell (TL;DR)...

This document explores the shift toward "Agentic Commerce," where AI agents independently execute shopping tasks. This ecosystem is powered by three key open protocols: MCP for live data access, UCP for transaction management (carts, identity, and payments), and A2A for dynamic negotiations between software agents. These open standards are vital for a friction-free, competitive future in retail.

In a previous post I talked about the rise of Agentic Commerce and how it will transform the way we shop online. I asked you to imagine asking your Personal AI Assistant: "Find me a waterproof hiking jacket in olive green under $200, apply my store rewards, and ship it to my apartment by Friday."

Five years ago, that prompt would yield a list of web links. Today, the AI Agent doesn't just find the jacket; it checks real-time inventory, verifies your loyalty tier, negotiates dynamic discounts, and executes the purchase without you ever loading a checkout page.

I’ve also delved into the magic sauce that makes this happen, but let’s bring the main components together in one post and understand what is at play: an underlying stack of open protocols, a collection of open-source technical standards forming what experts are calling the "HTTP of Agentic Commerce".

If you want to understand how software will trade billions of dollars over the next decade, you need to understand the protocols making it happen: MCP, UCP, and A2A.

1. The Context Layer: Model Context Protocol (MCP)

Before an AI agent can buy anything, it needs access to real-time information. Language models are inherently isolated; they don't natively know if a store has 3 items left in stock or if a price dropped five minutes ago.

Enter Model Context Protocol (MCP), open-sourced by Anthropic.

MCP acts as a universal adapter between AI models and external data tools. Think of it as USB-C for AI: instead of developers writing custom API connectors for every single product database, store backend, or CRM, MCP provides a standardized format for agents to query live data.

What MCP Handles in E-Commerce:

Live Inventory Checks: Quoting real-time stock counts across multiple warehouse locations.
Spec Parsing: Extracting structured technical specifications (e.g., precise dimensions, fabric weight, voltage) from unstructured databases.
Contextual Inputs: Feeding user preferences, sizing profiles, and past purchase histories securely to the AI model.
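To make the "universal adapter" idea concrete, here is a minimal sketch of what an agent's request to an MCP server can look like on the wire. MCP messages follow JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `check_inventory` and its arguments are hypothetical; a real server advertises its own tools and argument schemas.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1, "check_inventory", {"sku": "JKT-OLIVE-M", "warehouse": "east"}
)
print(request)
```

The point of the sketch is that the envelope stays the same whichever store or database sits behind the server; only the tool name and arguments change.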

2. The Commerce Layer: Universal Commerce Protocol (UCP)

While MCP provides data context, it isn't built to orchestrate end-to-end retail transactions. Querying an API for stock is easy; creating a multi-item cart, applying promotional codes, initiating identity verification, and managing payments requires a formal commerce standard.

To solve this, Google teamed up with retail and infrastructure giants—including Shopify, Etsy, Target, Walmart, Visa, and Stripe—to release the Universal Commerce Protocol (UCP).

UCP defines an open-source communication specification for e-commerce operations. It doesn't replace existing merchant platforms like Shopify or WooCommerce; it gives AI agents a set of standard functional primitives to talk directly to them.

Core UCP Primitives:

Catalog & Discovery: Exposing structured, real-time product catalogs directly to AI crawlers.
Cart Management: Enabling agents to create, modify, and calculate sub-totals for multi-item carts programmatically.
Identity & Loyalty Linking: Recognizing that "John Doe" is a Gold Tier member at a store, applying member pricing automatically without forcing a manual login on a website.
Checkout & Orchestration: Tokenizing payments and completing transactions securely while keeping the merchant as the official Merchant of Record.
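As a rough illustration of the Cart Management and Loyalty Linking primitives above, the sketch below models a cart an agent could build programmatically. It is not the UCP schema: the `Cart` and `LineItem` classes, the field names, and the 10% member discount are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class LineItem:
    sku: str
    unit_price: Decimal
    quantity: int


@dataclass
class Cart:
    items: list[LineItem] = field(default_factory=list)
    # Fraction taken off for loyalty members, e.g. 0.10 for 10% (assumed value).
    member_discount: Decimal = Decimal("0")

    def add(self, sku: str, unit_price: str, quantity: int = 1) -> None:
        self.items.append(LineItem(sku, Decimal(unit_price), quantity))

    def subtotal(self) -> Decimal:
        gross = sum((i.unit_price * i.quantity for i in self.items), Decimal("0"))
        # Apply member pricing, then round to cents.
        return (gross * (1 - self.member_discount)).quantize(Decimal("0.01"))


cart = Cart(member_discount=Decimal("0.10"))
cart.add("JKT-OLIVE-M", "189.00")
cart.add("SOCK-WOOL", "24.50", quantity=2)
print(cart.subtotal())  # 214.20
```

Using `Decimal` rather than floats keeps the money arithmetic exact, which matters once an agent is totalling carts without a human checking the numbers.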

3. The Negotiation Layer: Agent-to-Agent Protocol (A2A)

The ultimate evolution of agentic commerce isn't just a buyer bot interacting with a static website backend—it's software negotiating with software.

Agent-to-Agent (A2A) protocols establish rules for buyer agents (representing consumers) and seller agents (representing brands) to interact dynamically.

How A2A Dynamics Work in Practice:

Dynamic Bundling: A buyer agent requests a camera, lens, and tripod. The brand's seller agent calculates real-time margins and responds: "If you buy all three together, I can offer an instant $45 bundle discount."
Inventory Clearing: A merchant's AI agent notices excess seasonal inventory and dynamically grants targeted discounts to consumer agents searching for deals in that specific category.
Automated Terms Negotiation: Negotiating bulk order delivery timelines or return window extensions for enterprise purchases.
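The Dynamic Bundling scenario above can be sketched as a small pricing rule on the seller agent's side. Everything here is an assumption for illustration: the prices and costs, the 20% margin floor, and the $45 cap are invented, and a real seller agent would exchange these offers as A2A messages rather than calling a local function.

```python
def bundle_offer(items: dict, min_margin: float = 0.20, max_discount: float = 45.0) -> float:
    """Return the largest capped discount that keeps the bundle margin at or above min_margin.

    items maps a product name to a (list_price, unit_cost) pair.
    """
    price = sum(p for p, _ in items.values())
    cost = sum(c for _, c in items.values())
    # Lowest bundle price that still satisfies (price - cost) / price >= min_margin.
    floor_price = cost / (1 - min_margin)
    discount = max(0.0, min(max_discount, price - floor_price))
    return round(discount, 2)


# Hypothetical catalog: name -> (list price, unit cost).
catalog = {
    "camera": (900.0, 650.0),
    "lens": (400.0, 280.0),
    "tripod": (120.0, 70.0),
}
offer = bundle_offer(catalog)
print(f"Bundle discount offered: ${offer:.2f}")  # Bundle discount offered: $45.00
```

When the margin is already thin, the same rule returns zero, which is how a seller agent would politely decline to discount.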

The Big Shift: Why Open Protocols Win

Why does this open architectural stack matter so much? Because walled gardens create friction, and friction kills conversion.

If every AI assistant required a proprietary integration to buy from every merchant, only massive platforms (like Amazon) would survive. Open standards like UCP, MCP, and A2A democratize the landscape. They allow a boutique clothing store running on a standard e-commerce platform to sell to a consumer using ChatGPT, Gemini, or a standalone personal AI assistant seamlessly.

The web was built on open standards like HTTP and HTML. The era of agentic commerce is being built on data, transaction, and negotiation protocols. Brands that adopt these standards early won't just keep up; they will be the first ones discovered when software starts doing the shopping.

Tuesday, June 16, 2026

Agentic Commerce: The Virtual Procurement Revolution

In a nutshell (TL;DR)...

Agentic AI is transforming B2B procurement by automating routine sourcing and negotiation, shifting operations from reactive to predictive purchasing, and leveraging structured data to allow human experts to focus on strategic oversight through a "Human-in-the-Loop" approach, ultimately reducing operational friction.

The B2B Revolution: Meet Your New Virtual Procurement Department

Earlier this year I posted about the rise of Agentic Commerce as the future of online shopping and how it is taking over the consumer experience. We imagined a digital assistant booking a flight to Barcelona or effortlessly picking out the perfect pair of trail running shoes for a weekend getaway. It is an exciting vision, but while consumer applications get most of the spotlight, a much quieter, and arguably much more impactful, revolution is taking place in the business-to-business (B2B) sector.

Welcome to the era of Agentic Commerce in B2B, where AI is moving from a simple data-crunching tool to an active, independent participant in your supply chain.

For years, B2B purchasing has involved a lot of heavy lifting: tracking down supplier catalogs, manually comparing specifications across sprawling spreadsheets, managing endless email chains for quotes, and reacting to sudden inventory shortages. Today, Agentic AI is stepping in to change that dynamic entirely, acting as a "Virtual Procurement Department" for trading and manufacturing companies.

From now on, when I talk about “AI Agents” I am referring to the growing number of software vendors offering this as a solution. They range from the likes of ORO Labs or LevelPath, which have built their platforms from the ground up with artificial intelligence as the foundation, to the likes of Tropic, which lean into AI for highly targeted areas of procurement such as specialized software (SaaS) purchasing, contract intelligence, or complex e-auctions.

So let’s take a closer look at how platforms like this are transforming B2B commerce from a tedious chore into a highly predictive, streamlined operation.

Say Hello to Your Autonomous Sourcing Analyst

In the traditional B2B model, sourcing new components or managing supplier requests requires a significant amount of manual labor. Procurement teams spend hours sending out Requests for Quotations (RFQs), waiting for responses, and then lining up the data to make a choice.

With the introduction of Agentic Commerce, AI agents can take over this routine work. Instead of merely organizing data, these agents act like intelligent analysts. When a business needs a component, the agent can autonomously search through supplier databases, gather bids, and compare terms. It doesn't just look at the bottom-line price, either. The agent can evaluate lead times, review the supplier's quality history, and select the most cost-effective solution. It can even negotiate discounts within predefined rules and automatically approve orders that fall under specific budget limits.
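Here is a minimal sketch of how such an agent might weigh bids on more than price and then apply a budget guardrail. The weights (50% price, 20% lead time, 30% quality), the supplier names, and the approval limit are assumptions chosen for illustration, not values from any vendor's platform.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    supplier: str
    unit_price: float
    lead_time_days: int
    quality_score: float  # 0.0 (poor) to 1.0 (excellent)


def score(bid: Bid, max_price: float, max_lead: int) -> float:
    # Lower price and lead time score higher; weights are illustrative.
    return (
        0.5 * (1 - bid.unit_price / max_price)
        + 0.2 * (1 - bid.lead_time_days / max_lead)
        + 0.3 * bid.quality_score
    )


def choose_bid(bids: list, quantity: int, auto_approve_limit: float):
    max_price = max(b.unit_price for b in bids)
    max_lead = max(b.lead_time_days for b in bids)
    best = max(bids, key=lambda b: score(b, max_price, max_lead))
    total = best.unit_price * quantity
    # Orders above the limit are held for a human rather than placed.
    status = "auto_approved" if total <= auto_approve_limit else "needs_human_approval"
    return best, status


bids = [
    Bid("Supplier A", 10.0, 14, 0.90),
    Bid("Supplier B", 9.0, 30, 0.60),
    Bid("Supplier C", 12.0, 7, 0.95),
]
best, status = choose_bid(bids, quantity=100, auto_approve_limit=5000.0)
print(best.supplier, status)  # Supplier A auto_approved
```

Note that the cheapest bid (Supplier B) does not win here: its long lead time and weaker quality history drag its score down.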

Moving from Reactive to Predictive Purchasing

If you have ever had to deal with the headache of a sudden supply shortage, you know that reactive purchasing is stressful. Traditionally, businesses order new stock based on historical sales data and a little bit of guesswork, which often leads to either costly overstocking or frustrating "out of stock" scenarios.

Agentic AI shifts the paradigm from reactive to predictive. By analyzing historical data, current sales velocity, and market trends, AI agents can accurately forecast future demand. Furthermore, these agents can continuously monitor inventory levels and automatically trigger restocking orders at the perfect moment to prevent supply chain bottlenecks. The system knows not just what you need to order, but exactly when and in what quantity, ensuring operations run smoothly in the background.
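The "when and in what quantity" logic can be sketched with the classic reorder-point formula: expected demand over the supplier's lead time, plus a safety buffer. The sales figures, lead time, and safety stock below are made-up numbers, and the simple average stands in for whatever forecasting model a real platform would use.

```python
from statistics import mean


def reorder_point(daily_demand: float, lead_time_days: int, safety_stock: int) -> float:
    """Stock level at which a new order should be placed."""
    return daily_demand * lead_time_days + safety_stock


def should_reorder(on_hand: int, recent_daily_sales: list, lead_time_days: int, safety_stock: int) -> bool:
    # Naive forecast: assume future demand matches the recent daily average.
    forecast = mean(recent_daily_sales)
    return on_hand <= reorder_point(forecast, lead_time_days, safety_stock)


recent_sales = [18, 22, 20, 19, 21]  # units sold per day (hypothetical)
print(reorder_point(mean(recent_sales), lead_time_days=7, safety_stock=30))  # 170
print(should_reorder(150, recent_sales, lead_time_days=7, safety_stock=30))  # True
print(should_reorder(200, recent_sales, lead_time_days=7, safety_stock=30))  # False
```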

Taming the Messy Supplier Data

Of course, for an AI agent to make smart purchasing decisions or negotiate with suppliers, it needs perfectly structured and accurate data. In the B2B world, supplier data is notoriously messy, often arriving in unstructured formats like massive PDFs or inconsistent spreadsheets.

To support an AI-based procurement platform, a Product Information Management (PIM) system acts as the critical intelligence hub. For an AI procurement agent to accurately analyze supplier bids, forecast demand, and execute purchases autonomously, it must base its decisions on accurate, highly structured data. PIM systems can even detect inconsistencies or missing attributes and fill them into the right fields. Ultimately, by ensuring that supplier records are accurate, consistent, and complete at the point of ingestion, a PIM system provides the trusted data foundation necessary for a B2B procurement platform to operate autonomously and reliably.
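A small sketch of the kind of check that happens at the point of ingestion: flagging missing or mistyped attributes before a record reaches the procurement agent. The required fields and the sample record are hypothetical, and a real PIM system would apply a far richer schema.

```python
# Hypothetical minimum schema for a supplier record: field name -> expected type.
REQUIRED_FIELDS = {
    "sku": str,
    "unit_price": float,
    "lead_time_days": int,
    "currency": str,
}


def validate_record(record: dict) -> list:
    """Return a list of problems found; an empty list means the record is clean."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        value = record.get(name)
        if value is None or value == "":
            problems.append(f"missing: {name}")
        elif not isinstance(value, expected_type):
            problems.append(f"wrong type: {name}")
    return problems


# A price typed as text and no currency, as often happens with spreadsheet imports.
messy = {"sku": "X1", "unit_price": "9.50", "lead_time_days": 14}
print(validate_record(messy))  # ['wrong type: unit_price', 'missing: currency']
```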

What Happens to the Human Experts?

With agents sending RFQs, negotiating prices, and monitoring inventory, it is natural to wonder where human professionals fit into this new landscape. The good news is that the rise of agentic AI is not about eliminating the procurement team; it is about elevating them.

This brings us to the concept of the "Human-in-the-Loop" (HITL). While AI agents are fantastic at analyzing data and executing routine purchasing tasks, they lack the nuanced judgment, ethical grounding, and relationship-building skills required for high-stakes decisions. In an agentic B2B environment, human experts transition from tactical execution (like manual data entry) to strategic oversight.

For example, an agent might handle 90% of routine supplier restocking autonomously. But if it encounters an unprecedented supply chain disruption, a wildly out-of-budget price hike, or a scenario that requires complex ethical reasoning, it is programmed to pause and escalate the issue to a human manager. Humans remain in control, defining the agent's objectives, setting the financial guardrails, and managing the exceptions.

A Smarter Way to Do Business

The B2B revolution powered by Agentic Commerce is ultimately about removing friction. By handling the drudgery of data processing, supplier negotiations, and inventory tracking, AI agents free up human professionals to focus on what they do best: building strategic partnerships, exploring new market opportunities, and driving innovation.

The future of B2B is predictive, highly automated, and incredibly efficient. And with a Virtual Procurement Department working tirelessly in the background, businesses can look forward to a much smoother ride.

Tuesday, May 26, 2026

The Architecture of Human-in-the-Loop Agentic Governance

In a nutshell (TL;DR)...

The shift to autonomous 'agentic' AI requires mandatory Human-in-the-Loop (HITL) governance, which acts as a foundational layer for ethics, operations, and strategy. HITL prevents catastrophic 'confident mistakes' from probabilistic models, ensures accountability in regulated industries, and handles subjective decisions. Best practices involve setting clear intervention triggers (like high-risk actions or low confidence) and using 'Context Memos' to keep human experts efficient. Properly designed, this hybrid system automates routine volume while safely scaling output, allowing humans to focus on strategic oversight and continuous learning.

The Hybrid Workforce: Why Human-in-the-Loop is the Secret to Agentic AI Success

Back in April while I rambled about the evolution of Prompt Engineering, I made mention of the concept of keeping the “human-in-the-loop”, so I decided to look into the importance of this aspect of AI and here’s what I found…

Artificial Intelligence is advancing in leaps and bounds, shifting from models that simply answer questions to "agentic" systems that proactively plan, use tools, and execute multi-step workflows. With this newfound autonomy, a critical question arises: if an AI can operate independently, what happens to the human?

The reality is that as AI systems become more capable of taking action, the need for human oversight does not disappear, it transforms. Human-in-the-Loop (HITL) is no longer just a mechanism for quality control or data labeling; it is a foundational layer of ethical, operational, and strategic governance.

Here is a deep dive into why retaining the human-in-the-loop is essential for agentic processes, the best practices for designing these interactions, and how to ensure this hybrid approach actually saves you time rather than creating more work.

Why Human-in-the-Loop Matters for Agentic AI

When AI simply provided recommendations, humans were the primary decision-makers, a paradigm known as "AI-in-the-Loop". In the agentic era, AI drives the execution, making it a true "Human-in-the-Loop" system in which humans supervise, validate, or act as an escalation authority. Retaining this human oversight is non-negotiable for several reasons:

Preventing "Confident Mistakes": Large Language Models (LLMs) are probabilistic, meaning they can generate outputs that look highly structured and logical but are entirely hallucinated. If an agent is empowered to modify infrastructure, update databases, or execute financial transactions, a hallucinated action could be disastrous. Think of an AI calculating your Tax Returns…
Navigating Subjectivity and Ethics: AI agents operate on logic and data, but the real world operates on context and ethics. An agent might make a decision that is technically correct but culturally inappropriate, heavily biased, or lacking in empathy.
Ensuring Accountability and Compliance: In regulated industries like healthcare, finance, or law, you cannot simply say "the model decided". Human oversight is often a legal requirement to ensure that every sensitive action has a traceable human approver.

Best Practices for Designing Agentic HITL Processes

Integrating humans into an autonomous workflow requires careful design. If you bombard a human reviewer with every minor agent decision, you defeat the purpose of automation. The goal is to design for episodic, conditional intervention rather than continuous manual oversight. Let’s consider some best practices for architecting these systems…

1. Define Clear Intervention Triggers

Agents should be programmed to know their own limits and pause execution when they hit specific thresholds. Best-in-class workflows set triggers for:

Low Confidence: The agent halts if its statistical confidence in a decision falls below a preset benchmark.
High-Risk Actions: Any action that is irreversible, like permanently deleting data, executing a high-value trade, or sending an external email, should automatically trigger a pause for human approval.
Novelty (Black Swan Events): If the agent encounters an "out-of-distribution" scenario that wasn't in its training data, it must escalate the issue to a human problem-solver.
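The three triggers above can be sketched as a single check that runs before the agent acts. The field names and the 0.8 confidence threshold are assumptions for illustration; how an agent estimates its own confidence or detects novelty is a separate and much harder problem.

```python
from dataclasses import dataclass


@dataclass
class ProposedAction:
    name: str
    confidence: float   # agent's own estimate, 0.0 to 1.0
    irreversible: bool  # e.g. deleting data or sending an external email
    seen_before: bool   # False when the scenario looks out-of-distribution


def intervention_reasons(action: ProposedAction, min_confidence: float = 0.8) -> list:
    """Return the triggers that fired; an empty list means the agent may proceed."""
    reasons = []
    if action.confidence < min_confidence:
        reasons.append("low_confidence")
    if action.irreversible:
        reasons.append("high_risk")
    if not action.seen_before:
        reasons.append("novelty")
    return reasons


routine = ProposedAction("restock_order", 0.95, irreversible=False, seen_before=True)
risky = ProposedAction("delete_customer_data", 0.70, irreversible=True, seen_before=True)
print(intervention_reasons(routine))  # []
print(intervention_reasons(risky))    # ['low_confidence', 'high_risk']
```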

2. Structure the "Four Dimensions" of Oversight

To prevent fragmented and inconsistent human involvement, HITL should be treated as a structured, decoupled system component. This involves defining four key dimensions:

WHEN (Intervention Conditions): The exact criteria that trigger human involvement.
WHO (Role Resolution): Routing the approval to the correct domain expert (e.g., a financial manager for a budget approval versus a compliance officer for a regulatory check).
WHAT (Interaction Semantics): Clarifying what the human needs to do—approve, reject, modify, or simply monitor.
WHERE (Communication Channel): Meeting the human where they work. Urgent approvals might route to Slack or SMS, while lower-priority reviews might sit in an email or dedicated dashboard.
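One way to keep the four dimensions decoupled from the agent's own logic is to hold them in a small routing table. This is only a sketch: the trigger names, roles, and channels are hypothetical examples based on the ones mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OversightRule:
    when: str   # intervention condition that fires the rule
    who: str    # role that should review
    what: str   # what the reviewer is asked to do
    where: str  # channel used to reach them


# Hypothetical rules; a real deployment would load these from configuration.
RULES = [
    OversightRule("budget_exceeded", "financial_manager", "approve_or_reject", "slack"),
    OversightRule("regulatory_check", "compliance_officer", "approve_or_reject", "dashboard"),
    OversightRule("routine_audit", "team_lead", "monitor", "email"),
]


def route(trigger: str) -> Optional[OversightRule]:
    """Find the rule for a trigger, or None if no human involvement is defined."""
    for rule in RULES:
        if rule.when == trigger:
            return rule
    return None


rule = route("budget_exceeded")
print(rule.who, rule.where)  # financial_manager slack
```

Because the rules live in data rather than in the agent's code, changing who approves what does not require touching the workflow itself.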

3. Provide a "Context Memo"

When an agent pauses to ask for help, it shouldn't just dump raw JSON or endless chat logs on the human reviewer. Instead, the agent should generate a concise "Context Memo" explaining what it is trying to achieve, why it paused, and exactly what decision it needs the human to make. This drastically reduces the cognitive load on the human expert.
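A Context Memo does not need to be elaborate. The sketch below assembles one from the three questions just described plus a list of options; the function name, labels, and the example scenario are my own invention.

```python
def context_memo(goal: str, pause_reason: str, decision_needed: str, options: list) -> str:
    """Build a short, plain-text memo for the human reviewer."""
    lines = [
        f"Goal: {goal}",
        f"Why I paused: {pause_reason}",
        f"Decision needed: {decision_needed}",
        "Options: " + ", ".join(options),
    ]
    return "\n".join(lines)


memo = context_memo(
    goal="Restock 500 units of part X1 before Friday",
    pause_reason="Best quote is 40% above the approved budget",
    decision_needed="Approve the higher price or wait for new quotes",
    options=["approve", "reject", "request new quotes"],
)
print(memo)
```

Four lines a reviewer can read in ten seconds will usually beat a full transcript, which can still be attached for anyone who wants the detail.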

4. Implement Modular HITL Design Patterns

Leverage established design patterns depending on the task:

Interrupt & Resume: The agent pauses mid-workflow, waits for a human to click approve/reject, and then resumes execution (ideal for access control or financial ops).
Human-as-a-Tool: The agent treats the human as just another API or tool. If it gets confused, it "calls" the human tool to ask a clarifying question.
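The Interrupt & Resume pattern maps neatly onto a Python generator: the workflow yields a question, sits paused, and picks up again once the answer is sent back in. This is a toy sketch with a hypothetical payment workflow; production agent frameworks persist the paused state so the wait can last hours rather than milliseconds.

```python
def payment_workflow(amount: float):
    steps = ["validated invoice"]
    # Interrupt: hand control back to the caller and wait for a decision.
    approved = yield f"Approve payment of ${amount:.2f}?"
    # Resume: continue from exactly where the workflow paused.
    steps.append("payment sent" if approved else "payment cancelled")
    return steps


def run(workflow, decision: bool):
    """Drive the workflow to its pause point, then resume it with the human's answer."""
    question = next(workflow)
    try:
        workflow.send(decision)
    except StopIteration as finished:
        return question, finished.value
    raise RuntimeError("workflow paused again; this sketch handles a single approval")


question, steps = run(payment_workflow(1250), decision=True)
print(question)  # Approve payment of $1250.00?
print(steps)     # ['validated invoice', 'payment sent']
```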

Ensuring the Benefit: Efficiency vs. Doing It Yourself

A common objection to implementing HITL is: "If I have to review the AI’s work, doesn't that take just as much time as doing the task myself?"

Without proper design, it absolutely can. However, when deployed correctly, the hybrid human-AI model is vastly more efficient and scalable than manual labor. Here is how you ensure the ROI of a HITL system:

Automate the Volume, Humanize the Exceptions

In a well-tuned system, the AI agent autonomously handles 90% of routine requests flawlessly. The human is only looped in for the 10% of "corner cases" that are highly complex or ambiguous. You are scaling your output by 10x without increasing your risk profile.
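The 10x figure follows from simple arithmetic, under the assumption that reviewing an escalated case takes a person about as long as handling it manually would. If a reviewer can deal with 50 cases a day (a made-up number) and only 10% of cases reach them, the combined system clears 500.

```python
def total_throughput(human_capacity: int, escalated_percent: int) -> float:
    """Cases the human-plus-agent system can clear when only a share reaches the human."""
    return human_capacity * 100 / escalated_percent


print(total_throughput(human_capacity=50, escalated_percent=10))  # 500.0
print(total_throughput(human_capacity=50, escalated_percent=25))  # 200.0
```

The second line shows how sensitive the gain is to tuning: if a quarter of cases get escalated, the multiplier drops from 10x to 4x.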

Factor in the Cost of Catastrophe

The momentary delay of a human hitting "pause" or "approve" is negligible compared to the astronomical costs of an autonomous error such as a regulatory fine, a data breach, or a ruined customer relationship.

Turn Feedback into Continuous Learning

A human's response to an agent should not just be a one-time binary "yes" or "no." Through techniques such as Reinforcement Learning from Human Feedback (RLHF), human corrections can be fed back into the model in later training rounds. Each logged intervention becomes a training signal, so over time the agent becomes more likely to handle that specific edge case autonomously.

Conclusion

The evolution of agentic AI is not leading us toward a world without humans; it is leading us toward a world of super-powered humans. By shifting the human role from tactical execution to strategic oversight and exception handling, organizations can safely harness the incredible speed and scale of autonomous agents while remaining firmly grounded in human values, ethics, and common sense. The most successful AI workflows of the future won't be the ones that eliminate humans, they will be the ones that know exactly when to ask them for help.

Tuesday, May 19, 2026

The Rise of Swarm Intelligence and Agentic AI Architecture

In a nutshell (TL;DR)...

The AI industry is rapidly shifting from the copilot model (Generative AI) to Agentic AI (autonomous execution of complex workflows) using Swarm Intelligence. This new architecture replaces monolithic models by distributing tasks across specialized, collaborative sub-agents (e.g., Research, Execution, and Critique Agents). This multi-agent orchestration enables planning, debating, and self-correction, drastically increasing reliability and allowing for end-to-end task completion, such as autonomously building and testing software applications.

Throwing back to my post a few weeks ago where I suggested the end of Prompt Engineering, one topic that cropped up was “Swarm Intelligence”. I took a wee look at what that might mean in the world of AI…

From Copilots to Swarm Intelligence: How Autonomous Agents are Redefining AI

For the past few years, our relationship with Artificial Intelligence has been defined by the "copilot" model. In this paradigm, AI acts as a highly capable but passive assistant: you prompt it to draft an email, write a snippet of code, or summarize a document, and it generates a response. It was a revolutionary step, but it still required a human to manually drive every interaction, piece together the outputs, and execute the final task.

Today, that era is rapidly fading. The industry has decisively shifted from Generative AI (creating content) to Agentic AI (executing workflows). We are no longer just interacting with conversational copilots; we are deploying autonomous agents capable of planning, verifying, and executing complex, multi-step workflows end-to-end.

At the heart of this transformation is a radical change in how AI systems are architected: the death of the monolithic model and the rise of "Swarm Intelligence."

The Death of the "Single God Model"

Previously, the prevailing approach was to rely on a "Single God Model"—one massive, monolithic AI expected to handle everything from creative writing to complex mathematics and code deployment. However, forcing a single model to act as a jack-of-all-trades inevitably led to bottlenecks, logical breakdowns, and "hallucinations," especially when managing long-horizon tasks that require deep reasoning.

To solve this, the industry pivoted to Swarm Intelligence (or multi-agent orchestration). Instead of relying on one model to do it all, tasks are distributed across a network of specialized sub-agents that work collaboratively. By dividing responsibilities, these agents emulate real-world human teams, communicating, debating, and self-correcting to achieve a shared objective.

In a typical swarm architecture, a complex problem is broken down and assigned to specialized roles:

The Research Agent: Dedicated to information gathering. It navigates external databases, scrapes the web, or searches internal documents to pull the exact context needed.
The Execution Agent: The "doer" of the group. This agent takes the research and uses tools to take action, whether that means writing a script, drafting a comprehensive report, or configuring a server.
The Critique (or Evaluator) Agent: The quality control layer. This agent independently reviews the Execution Agent's output, running tests, analyzing for logical flaws, and providing structured feedback for iterative refinement before any human ever sees the result.

Working in concert, these specialized sub-agents drastically reduce hallucination rates and solve problems that would overwhelm a single model.

A Tangible Example: Building Software with Agent Swarms

To understand how this looks in practice, let's look at Vibe Coding, which I discussed previously: the process of building software applications through natural language rather than manual typing.

Imagine you want to build a full-stack Customer Relationship Management (CRM) application. In the old "copilot" days, you would prompt an AI to write the frontend code, copy-paste it, prompt it again for the database schema, manually wire them together, and spend hours debugging the inevitable integration errors.

Under a multi-agent orchestration platform (like Emergent or ChatDev), the process looks entirely different. You simply provide the high-level goal: "Build a CRM with a contact list, a pipeline view, and a database."

From there, the swarm takes over:

The Meta-Planner Agent receives your goal and breaks it down into a hierarchical task list, delegating work to subordinate agents.
The Design/Frontend Agent starts building the user interface components (like the contact list and pipeline dashboard).
The Backend/Execution Agent simultaneously spins up the database schema and writes the API routes to connect to the frontend.
The Critique/Testing Agent acts as an adversarial reviewer. It generates unit tests against the new code. If a database query fails or a security vulnerability is detected, the Critique Agent sends the error log directly back to the Execution Agent with instructions on how to fix it.

This multi-agent debate and refinement loop, where agents critique each other to expose errors and enforce self-correction, continues autonomously until the tests pass. The system ultimately delivers a fully functional, deployed application. You didn't write the code, nor did you have to guide the AI step-by-step; you acted as the high-level director while the swarm managed the execution.
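The refinement loop can be sketched in a few lines. Both "agents" below are hard-coded stand-ins rather than real models: the Execution Agent deliberately ships a bug in its first draft, and the Critique Agent runs one test against it. The function names and the three-round limit are assumptions for illustration, not how Emergent or ChatDev work internally.

```python
from typing import Optional


def execution_agent(feedback: Optional[str]) -> str:
    # Stand-in for a code-writing model: the first draft has a bug,
    # and any feedback triggers the corrected version.
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def critique_agent(source: str) -> Optional[str]:
    """Run a unit test against the draft; return an error report, or None if it passes."""
    namespace = {}
    exec(source, namespace)  # acceptable here only because the source is our own stub
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should return 5"
    return None


def run_swarm(max_rounds: int = 3):
    feedback = None
    for round_number in range(1, max_rounds + 1):
        draft = execution_agent(feedback)
        feedback = critique_agent(draft)
        if feedback is None:
            return draft, round_number  # tests pass, loop ends
    raise RuntimeError("tests still failing after max_rounds")


final_code, rounds = run_swarm()
print(f"Passed after {rounds} round(s)")  # Passed after 2 round(s)
```

The round limit matters: without it, two agents that never converge would loop forever, which is exactly the kind of case that should be escalated to a human.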

The Future: Agent Meshes and Scalable Oversight

The shift toward Swarm Intelligence provides a framework for true reliability. By assigning agents to constantly verify and critique work, businesses can deploy AI with built-in guardrails against cascading errors. Pre-internet me says “That’s the theory anyway!”

Looking ahead, we will see the rise of standardized "agent meshes": interconnected networks of agents that securely handle planning, memory, tool routing, and supervision across entire enterprise workflows. As these agentic systems mature, they will fade into the background infrastructure of our daily work, evolving from simple assistants you chat with into highly productive digital teammates that autonomously bring your ideas to life.