Tuesday, March 24, 2026

The Model Context Protocol (MCP): Bridging AI and Actionable Data

My day job has recently introduced a new concept for me to understand (like I need more new concepts in my daily life): the Model Context Protocol (MCP)...

What is MCP?

MCP is an open standard designed to unify how AI assistants and large language models (LLMs) connect with external data sources, tools, and environments. An MCP Server acts as a secure gateway or bridge between the AI application (the client) and external systems, such as databases, file systems, or APIs. It is frequently compared to a "USB-C port for AI," as it provides a universal, standardized interface for plugging external capabilities into AI systems.

For example, this is incredibly useful if you are building a chatbot for your organisation and want your AI assistant to have access to your internal customer support database, to use as a knowledge base for resolving common issues and answering questions about your company (i.e. it provides context to your AI service). Without it, you would somehow have to expose all that information to the underlying LLM, which is just not gonna happen.

An MCP server exposes three core primitives to AI applications:

  • Tools: Executable functions that the AI can actively call to perform actions, such as writing to a database, executing a web search, or modifying a file.

  • Resources: Passive, read-only data sources that provide the AI with context, such as database schemas, API documentation, or your customer support database.

  • Prompts: Reusable instruction templates that help structure interactions and guide the AI through specific workflows. These prompts are refined by the architect of the MCP server to give you more meaningful responses, saving you the time of generating and testing them from scratch.
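To make that concrete, here's a minimal sketch of a server exposing all three primitives, using the official MCP Python SDK's FastMCP helper (the support-desk names and data are invented for illustration):

```python
# A minimal MCP server exposing all three primitives.
# Requires the official MCP Python SDK: pip install mcp
# The support-desk data below is invented for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("support-kb")

@mcp.tool()
def search_support_tickets(keyword: str) -> str:
    """Tool: an action the AI can actively invoke."""
    # A real server would query your ticketing system here.
    return f"3 open tickets mention '{keyword}'"

@mcp.resource("support://faq")
def faq() -> str:
    """Resource: passive, read-only context for the AI."""
    return "Q: How do I reset my password?\nA: Use the self-service portal."

@mcp.prompt()
def triage(issue: str) -> str:
    """Prompt: a reusable instruction template."""
    return f"You are a support triage assistant. Classify this issue: {issue}"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```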

Why Would You Need an MCP Server?

  • To Eliminate Fragmented Integrations: Before MCP, developers had to write custom API integrations for every single external tool or system an AI needed to access. By implementing an MCP server, developers can build an integration once and grant the AI access to a vast, standardized ecosystem of resources without maintaining dozens of custom codebases.

  • To Enable Safe Action and Execution: LLMs are limited to the data they were trained on and lack built-in environments to safely execute code or make network requests on their own. An MCP server acts as a controlled execution layer. It keeps sensitive elements like API keys hidden from the model while the server handles the actual safe execution of tasks.

  • For Dynamic Tool Discovery: Unlike static API specifications (like OpenAPI) that must be pre-loaded into an LLM, MCP allows AI applications to query servers at runtime to dynamically discover what tools and resources are currently available.

  • To Ensure Security and Access Control: MCP servers are designed with enterprise security in mind, utilizing OAuth 2.1 for authentication and centralizing permissions management. This ensures that AI applications only interact with authorized data and that user-specific contexts are strictly respected so data does not leak between users.

  • For Portability Across Applications: Because MCP is vendor-agnostic and model-agnostic, you can build a toolset once via an MCP server and plug it into any compatible AI application or IDE—such as Claude Desktop, Cursor, Windsurf, or LangChain—without needing to rewrite the integration.

  • To Support Agentic Workflows: MCP facilitates conversational, multi-turn interactions through real-time updates and streaming (using Server-Sent Events). This allows AI agents to dynamically interact with multiple data sources, handle intermediate steps, and maintain persistent context over complex, multi-step tasks.

Why use MCP rather than an API?

While both APIs and MCPs aid in communication between systems, their core audiences, mechanisms, and philosophies differ significantly. A helpful way to frame the difference is that APIs connect machines, whereas MCP connects intelligence to machines.

Here are the primary differences between the two:

Target Audience and Optimization

  • APIs are built for human developers to write code against, optimizing software-to-software communication.

  • MCP is built specifically for AI models to streamline agentic interactions where an AI needs to reason about the data it receives.

Static vs. Dynamic Discovery

  • APIs rely on static contracts that must be pre-loaded, read, and manually interpreted to formulate requests.

  • MCP features dynamic discovery. An AI agent can query an MCP server at runtime to ask, "What tools can you offer?", and the server will automatically respond with a structured list of available tools, their descriptions, and parameter schemas. This means the AI always has an up-to-date view of its capabilities without needing manually updated documentation.
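Here's roughly what that runtime discovery looks like from the client side with the MCP Python SDK (the server command is a placeholder):

```python
# Dynamic discovery: ask a running MCP server what tools it offers.
# Uses the official MCP Python SDK; "my_server.py" is a placeholder.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    params = StdioServerParameters(command="python", args=["my_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            for tool in result.tools:
                # Each tool arrives with a description and a parameter schema
                print(tool.name, "-", tool.description)

asyncio.run(main())
```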

Security and Execution

  • APIs are exposed over the network and assume the caller can securely manage tokens, headers, and request formatting. However, AI models do not have built-in execution environments and cannot safely hold secrets like API keys.

  • MCP introduces a secure intermediary layer. The AI model never sees API keys or sensitive URLs. Instead, the AI asks the MCP server to use a specific tool, and the MCP server validates the input, securely executes the API call using its own hidden credentials, and returns only the safe results. MCP also standardizes security governance, utilizing protocols like OAuth 2.1 to ensure the AI only accesses data the user has explicitly authorized.
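As a sketch of that intermediary layer in practice: the tool below keeps its API key server-side, so the model only ever sees the result (the weather endpoint and environment variable are invented for illustration):

```python
# The secure intermediary in miniature: the model calls get_weather,
# but only the server ever touches the API key. The endpoint URL and
# WEATHER_API_KEY environment variable are invented for illustration.
import os

import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
def get_weather(city: str) -> str:
    """Fetch current weather for a city via an upstream REST API."""
    api_key = os.environ["WEATHER_API_KEY"]  # hidden from the model
    resp = httpx.get(
        "https://api.example-weather.invalid/v1/current",
        params={"q": city, "key": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    # Return only the safe, useful result to the AI
    return f"{city}: {data['temp_c']}°C, {data['condition']}"
```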

Granularity and Abstraction

  • APIs typically expose granular, entity-based endpoints (e.g., /users or /weather).

  • MCP servers are less granular and focus on driving broader use cases. An MCP server exposes high-level capabilities (e.g., get_weather or get_open_support_issues). A single MCP tool might execute several underlying REST API calls to gather all the necessary context for the AI.

LLM-Native Features

  • APIs are generally stateless request-response mechanisms.

  • MCP supports multi-turn, long-lived sessions (often using Server-Sent Events) that allow an AI agent to have back-and-forth interactions with a tool. Furthermore, MCP includes AI-specific features like sampling, which allows the MCP server to leverage the LLM's reasoning abilities. For example, an MCP server could fetch open issues and then use sampling to ask the LLM to filter them by "highest security impact"—a subjective analysis that a traditional REST API cannot natively perform.
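Here's a hedged sketch of that sampling flow with the MCP Python SDK; exact signatures can vary by SDK version, and the issue list is a hard-coded stand-in for a real tracker:

```python
# Sampling: a tool that borrows the client LLM's judgement mid-task.
# Sketch based on the MCP Python SDK; exact signatures can vary by
# version, and fetch_open_issues is a stand-in for a tracker query.
from mcp.server.fastmcp import Context, FastMCP
from mcp.types import SamplingMessage, TextContent

mcp = FastMCP("issues")

def fetch_open_issues() -> list[str]:
    # Stand-in for a real issue-tracker query
    return ["#12 SQL injection in login", "#15 typo in README", "#18 expired TLS cert"]

@mcp.tool()
async def top_security_issue(ctx: Context) -> str:
    """Fetch open issues, then ask the LLM to rank them by security impact."""
    issues = fetch_open_issues()
    question = "Which of these open issues has the highest security impact?\n" + "\n".join(issues)
    result = await ctx.session.create_message(
        messages=[SamplingMessage(role="user", content=TextContent(type="text", text=question))],
        max_tokens=200,
    )
    content = result.content
    return content.text if isinstance(content, TextContent) else str(content)
```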

Output Formatting

  • APIs return machine-readable data, such as raw JSON payloads and database entity IDs.

  • MCP is designed to return data optimized for an LLM's context window, often formatting responses as human-readable Markdown with fully hydrated entity names instead of raw IDs.

How secure is my data behind an MCP Server?

Because the Model Context Protocol (MCP) acts as a bridge between untrusted, model-generated inputs and sensitive external systems, a single weak point can turn that bridge into a pathway for exploitation. Securing an MCP deployment requires a "shared responsibility" model, where the server stands as a fortified wall protecting resources, and the client acts as a vigilant gatekeeper ensuring the AI does not overstep its bounds.

Academic research breaks down MCP threats into four main categories: malicious developers, external attackers, malicious users, and security flaws. In practice, these manifest as prompt injection, command execution, token theft, excessive permissions, and unverified endpoints.

To protect yourself, you must implement strict safeguards across both MCP servers and MCP clients, and quite honestly 99.9% of it goes straight over my head. It can be made “dead secure” is all I’ll say on the matter.

How does MCP make my life easier?

So let’s list out a few scenarios where an MCP server would make sense. At the end of the day it sits in the background and makes interaction with AI more meaningful, as the AI has access to more capabilities and more of the context of the organisation you are talking to.

Software Development and Debugging

The AI coding assistant is greatly enhanced by using MCP to connect directly to local filesystems and version control systems like Git or GitHub. Instead of manually pasting code snippets into a chat, the AI can securely browse your local files, read repository code, search codebases, review pull requests, and even commit changes directly within environments like Cursor or Claude Desktop.

Automated Travel Planning

The true power of multi-server MCP architecture shines here by combining multiple disparate services into one workflow. By connecting a Travel Server, a Weather Server, and a Calendar Server, an AI agent can autonomously read your calendar to find available dates, check destination weather forecasts, search and book flights, and automatically add the itinerary to your schedule while emailing you a confirmation.

Workflow and Communication Automation

AI can connect seamlessly to platforms like Slack, Gmail, or Google Drive. An AI assistant can search through your team's Slack history to pull project context, summarize past decisions, and automatically draft and send emails based on a simple natural language request, all without you needing to switch tabs.

Data Analysis and Visualization

MCP allows AI models to connect directly to SQL databases, Google Sheets, or financial APIs. The AI can read raw data like customer feedback or stock market history, execute complex queries, and instantly generate interactive charts or analytical dashboards. For instance, an AI can use the Alpha Vantage MCP server to fetch 10 years of historical coffee prices and immediately plot an interactive visual graph for you.

Enterprise Knowledge Management

A multi-agent MCP setup can fully automate a Training Management System, using specialized MCP agents to ingest uploaded PDF documents, extract key learning objectives, generate structured course modules, and create custom multiple-choice assessments without manual human intervention.

Ultimately, the core benefit of MCP in these scenarios is that it transforms AI from a passive text generator into an active, context-aware participant. By utilizing standardized tools, resources, and prompts, you gain a modular, secure way to grant AI access to your personal and business data without needing to write custom integrations for every single application.

That was a big chunk of stuff I learned this week… What next?


Monday, March 16, 2026

A Tale of Two Commerce Protocols

In previous posts I discussed the advent of Agentic Commerce and how that is primed to become the new way to shop for products online.

To make the AI platforms aware of your brand presence and product information, there are a number of strategies and techniques, specifically GEO (Generative Engine Optimization) and AEO (Answer Engine Optimization), that can lead the AI bots to prefer your brand and recommend your products within the many conversations that customers are now having with AI applications.

GEO is a broad strategy involving a number of techniques that change how you write product content and optimize your websites, so that AI will pick you first as the authoritative source for answers within the Agentic Commerce experience.

Very recently a couple of new developments have emerged that both sound like they’re attempting to answer a similar question: namely OpenAI’s Agentic Commerce Protocol (or ACP) and Google’s Universal Commerce Protocol (or UCP).

OpenAI’s ACP is an open, cross-platform protocol designed to enable shopping and payments directly within AI assistants, independent of any single platform or user interface. It allows AI agents to discover products via merchant-provided feeds, surface accurate pricing and availability, and autonomously initiate checkouts on the user's behalf without redirecting them to an external website.

The checkout process uses secure, delegated payment tokens (which are single-use, time-bound, and amount-restricted), while ensuring that the merchant retains full control over settlement, refunds, chargebacks, and compliance. The first implementation of this protocol is the Instant Checkout experience within ChatGPT.

Google’s UCP is a new open standard designed to establish a common language that allows AI agents, businesses, and payment providers to work together across the entire shopping journey, from product discovery to post-purchase support. It also has massive industry endorsement, with the likes of Etsy, Shopify, Best Buy, and Walmart (US) either implementing AI agents or already live with them.

While it is designed to be compatible with other agentic protocols, UCP is initially rolling out exclusively on Google-owned surfaces, such as Search AI Mode, Google Shopping, and the Gemini App. It enables shoppers to buy from eligible retailers directly during product discovery without leaving Google, utilizing Google Pay for seamless transactions while the retailer remains the seller of record.

Why ACP/UCP are More Helpful Than AIO/GEO

While Artificial Intelligence Optimization (AIO) and Generative Engine Optimization (GEO) are critical strategies, they are fundamentally focused on top-of-funnel visibility. AIO and GEO ensure that an AI model correctly parses, embeds, and cites your brand as the "source material" when answering a user's question. However, simply getting found is only the first step of the commerce journey.

ACP and UCP are arguably more helpful because they bridge the gap between discovery and execution, transforming the entire commercial funnel:

  • Moving from Recommendation to Action: AIO/GEO might prompt an AI to recommend your product, but the user still has to navigate to your site, browse, add to cart, and manually checkout. ACP and UCP grant the AI "agency" to act on the user's intent and execute the purchase directly within the conversational interface.

  • Frictionless Shopping: Traditional e-commerce is linear and rigid (search → browse → filter → product page → cart → checkout). ACP and UCP collapse these steps into a natural dialogue, drastically reducing friction and lowering cart abandonment.

  • Capturing Immediate Revenue: By allowing shoppers to move from intent to purchase without breaking context or leaving the app, these protocols turn high-intent discovery moments directly into revenue.

In short, AIO and GEO help AI talk about your product, but ACP and UCP allow AI to buy your product on the customer's behalf.

Which one to choose?

Both OpenAI's Agentic Commerce Protocol (ACP) and Google's Universal Commerce Protocol (UCP) share the same overarching goal: to reduce friction in the shopping journey by allowing AI agents to handle product discovery and checkout seamlessly, without redirecting the user to an external website.

However, they differ significantly in their execution environments, how they handle payments, and their initial scope.

OpenAI’s Agentic Commerce Protocol (ACP)

  • Design & Environment: ACP is an open, cross-platform protocol built to enable shopping and payments directly within AI assistants. It is designed to be independent of any single platform, user interface, or distribution surface. Currently, its primary implementation is the "Instant Checkout" experience inside ChatGPT.

  • Payment Mechanism: ACP initiates checkout on the user's behalf using delegated payment tokens. These tokens are highly secure because they are single-use, time-bound, and amount-restricted.

  • Merchant Role: In this model, merchants maintain complete control over the transactional backend, retaining responsibility for settlement, refunds, chargebacks, and compliance.

Google’s Universal Commerce Protocol (UCP)

  • Design & Environment: UCP is pitched as a new open standard designed to support the entire shopping lifecycle, from product discovery and buying to post-purchase support. However, unlike ACP's cross-platform focus, UCP is initially being rolled out exclusively across Google-owned surfaces, including Search AI Mode, Google Shopping, and the Gemini App.

  • Payment Mechanism: Instead of delegated tokens, UCP leverages Google Pay to complete transactions natively during product discovery, with PayPal support planned for the future.

  • Additional Features: Alongside UCP, Google launched a feature called "Business Agent," which allows retailers to engage shoppers conversationally and enable direct purchases right within Google Search.

The Core Differences

  • Where the Shopping Happens: ACP enables agent-led commerce primarily across the OpenAI ecosystem as a standalone destination, while UCP currently focuses on reducing checkout friction specifically within Google's massive search and discovery surfaces.

  • Coexistence Over Competition: Google designed UCP to be compatible with other agent-to-agent standards and protocols. This means the two protocols are not necessarily meant to replace one another, but rather to coexist. UCP helps convert high-intent shoppers who are actively searching on Google, while ACP opens the door to new demand where AI chat assistants act as the shopping destination.

So it’s not like the old VHS/Beta video wars of the 80s. The question isn’t which protocol “wins”; it’s whether your product data (and infrastructure) is ready to feed both. The reality is that you may need to support a multi-protocol ecosystem, just like supporting Apple Pay, Google Pay, and PayPal today. We are entering a multi-agent, multi-protocol world where structured product data is the "source code" of commerce.


Tuesday, March 10, 2026

Prompting Techniques explained

 

The core differences between prompting techniques lie in how they structure instructions, the volume of examples provided, the specific mechanism used to trigger reasoning, and how they define the model's interaction with external information.

Instruction vs. Example Based Techniques

The most fundamental distinction exists between prompts that rely solely on description and those that utilize pattern recognition through examples.


Zero-Shot Prompting

This is the simplest technique, relying entirely on a description of the task without providing any examples. It depends on the model's pre-existing training data to understand instructions like "Classify movie reviews".

One-Shot and Few-Shot Prompting

These techniques differ from zero-shot by providing demonstrations. One-shot provides a single example to help the model imitate a task, while few-shot provides multiple examples (generally three to five) to establish a pattern. The core difference here is that few-shot prompting conditions the model to follow a specific output structure or reasoning style for the current inference, rather than relying solely on its general training. For classification tasks, mixing up the order of classes in few-shot examples is recommended to prevent the model from overfitting to a specific sequence.
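As a quick illustration, here's the difference expressed as plain prompt strings (the reviews are invented):

```python
# Zero-shot vs. few-shot: the only difference is the worked examples.
zero_shot = """Classify this movie review as POSITIVE, NEUTRAL or NEGATIVE.

Review: The plot dragged but the acting was superb.
Sentiment:"""

few_shot = """Classify each movie review as POSITIVE, NEUTRAL or NEGATIVE.

Review: An instant classic, I cried twice.
Sentiment: POSITIVE

Review: Two hours of my life I will never get back.
Sentiment: NEGATIVE

Review: It was a film. It had a beginning, a middle and an end.
Sentiment: NEUTRAL

Review: The plot dragged but the acting was superb.
Sentiment:"""
```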


Contextual and Persona-Based Techniques

These techniques differ in which aspect of the model's generation they primarily influence: its fundamental purpose, its immediate knowledge base, or its stylistic voice.

System Prompting

This sets the "big picture" context and defines the model's overarching purpose and capabilities (e.g., defining the model as a code translator). It is often used to enforce safety or specific output requirements like JSON formats.

Contextual Prompting

Unlike system prompting, which is broad, contextual prompting provides immediate, task-specific background information necessary for the current interaction.

Role Prompting

While system prompting defines *what* the model does, role prompting defines *who* the model is. It assigns a specific character or identity (e.g., "act as a travel guide" or "act as a confrontational debater") to frame the output's tone, style, and personality.
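In chat-style APIs these usually land in different parts of the request; here is a rough sketch following the common system/user message convention (not any one vendor's API):

```python
# How system, role, and contextual prompting typically map onto a
# chat-style request (generic message layout, not any vendor's API).
messages = [
    {
        # System prompt: big-picture purpose plus output requirements;
        # role prompting ("You are a veteran code reviewer") often rides along here.
        "role": "system",
        "content": "You are a veteran code reviewer. Reply only in JSON.",
    },
    {
        # Contextual prompt: task-specific background for this interaction
        "role": "user",
        "content": "Context: this file is part of a payments service.\n"
                   "Review the snippet below for error handling issues.",
    },
]
```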


Reasoning and Logic Techniques

Several techniques are designed to improve performance on complex tasks by altering the model's cognitive process. The differences lie in the structure of that process—whether it is linear, abstract, or branching.

Chain of Thought (CoT)

This technique forces the model to generate intermediate reasoning steps before providing a final answer. It differs from standard prompting by breaking down the "black box" of the model's processing into a linear sequence of thoughts. It is particularly effective for math or logic tasks where a direct answer might fail.
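The classic trigger is a single appended line, for example:

```python
# Chain of Thought: one appended line elicits intermediate reasoning steps.
question = ("When I was 3 years old, my partner was 3 times my age. "
            "I am now 20 years old. How old is my partner?")
cot_prompt = question + "\nLet's think step by step."
```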

Step-Back Prompting

Unlike CoT, which works through the specific details immediately, step-back prompting asks the model to first answer a high-level, general question related to the task. This abstraction allows the model to retrieve relevant principles and background knowledge *before* applying them to the specific problem, reducing errors rooted in specific details.

Tree of Thoughts (ToT)

While CoT follows a single linear path, ToT allows the model to explore multiple reasoning paths simultaneously. It generalizes CoT by maintaining a "tree" where the model can branch out to explore different possibilities, making it superior for tasks requiring exploration rather than just linear execution.


Consensus & Action-Based Techniques

These advanced techniques differ by introducing verification mechanisms or external interactions.

Self-Consistency: This technique addresses the limitation of a single reasoning path in CoT. It involves submitting the same prompt multiple times (often with a higher temperature to encourage diversity) and selecting the most consistent answer (majority voting). It essentially prioritizes the *reliability* of the reasoning over a single attempt.
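The voting loop is easy to sketch; call_llm below is a hypothetical stand-in for whichever model client you use:

```python
# Self-consistency: sample several reasoning paths, keep the majority answer.
from collections import Counter

def call_llm(prompt: str, temperature: float) -> str:
    """Hypothetical stand-in; swap in your model client of choice."""
    raise NotImplementedError

def self_consistent_answer(prompt: str, n: int = 5) -> str:
    # Higher temperature encourages diverse reasoning paths
    answers = [call_llm(prompt + "\nLet's think step by step.", temperature=0.9)
               for _ in range(n)]
    # In practice you'd first parse the final answer out of each reasoning trace
    return Counter(answers).most_common(1)[0][0]
```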

ReAct (Reason & Act): This paradigm differentiates itself by allowing the model to interact with the outside world. It combines reasoning with the ability to perform actions, such as querying external APIs or search engines. It operates in a "Thought-Action-Observation" loop, whereas other techniques rely solely on the model's internal parameters.
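The loop itself is simple to sketch; call_llm, parse_action, and run_tool below are hypothetical stand-ins for your model client and tool layer:

```python
# ReAct in miniature: a Thought-Action-Observation loop.
# call_llm, parse_action, and run_tool are hypothetical stand-ins.
def react(question: str, max_turns: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        step = call_llm(transcript + "Thought:")  # model reasons, may request an action
        transcript += f"Thought:{step}\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        if "Action:" in step:
            tool, arg = parse_action(step)        # e.g. ("search", "capital of Fiji")
            observation = run_tool(tool, arg)     # touch the outside world
            transcript += f"Observation: {observation}\n"
    return "No answer within the turn budget"
```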


Structural Frameworks

Finally, there are differences in how users are advised to organize prompts conceptually:

The Rhetorical Approach: Focuses on the rhetorical situation, explicitly defining the audience, author ethos, pathos (emotional appeal), and logos (logic).

The C.R.E.A.T.E. Framework: A specific acronym-based structure (Character, Request, Examples, Additions, Type, Extras) that emphasizes treating the AI as a distinct "character".

The Structured Approach: Emphasizes a formulaic breakdown: Role and Goal, Context, Task, and Reference content.

Wednesday, March 4, 2026

Prompt Engineering and its Wily Ways

This week I’ll take some time out from AIO and talk about some basics that I’ve been getting to grips with in my day job, particularly over the last year. Prompt Engineering has appeared from nowhere, and the more I dig in, the more I find that there is just a ton of techniques and methods that can really make a difference in what you get back from AI. Sure you can treat it just like a Google search, but it can do a whole lot more…

What is an AI Prompt?

In the context of generative AI, a “prompt” is most often text, though it can also take other modes like images or voice commands, provided to an AI model to elicit a specific response or prediction. It serves as the primary interface for interacting with Large Language Models (LLMs), acting as a form of "coding in English" where the user defines the task, context, and constraints for the AI to process.

In other words, it's not just something you'd type into a Google search; it can be a whole lot more. It might review your resume and rewrite it in a particular manner, summarize a website article, or even produce something out of a hat like a unique poem or story.

Why Take Time to Develop Them?

It’s easy to use this like a standard Google search and that’s totally fine too. However, you can really unleash the power of AI by investing some time in “prompt engineering”, which is described as more of an art than a science, often requiring experience and intuition to master. This iterative process is necessary for several reasons:

To Ensure Accuracy. LLMs function as prediction engines, generating the next most likely text based on their training data. Without a high-quality prompt to guide this prediction, the model may produce ambiguous, inaccurate, or irrelevant outputs.

It forces you to write very accurate instructions to ensure a more predictable result, which is good practice in all walks of life.

To Navigate Sensitivity. Models are highly sensitive to word choice, tone, structure, and context; even small differences in phrasing or formatting can lead to significantly different results.

To Define Boundaries. A well-developed prompt helps the user understand the model's capabilities and limitations, allowing them to improve safety and reduce the likelihood of "hallucinations" (fabricated information). AI can lie very effectively, so don't give it a fraction of a chance to do it.

To Optimize Resources. Poorly designed prompts can lead to excessive token generation, which increases latency and computational costs. Refined prompts can enforce conciseness and specific output structures (like JSON) that make the data more usable.

Ultimately it’s always best to be absolutely clear on what you are asking the AI to do, giving it no possibility to go off and get creative with its answer.

Prompt Design

Designing high-quality prompts is an iterative process that blends art and engineering. The best practices for prompt engineering can be categorized into structural frameworks, instructional strategies, technical configuration, and process management.

Structural Frameworks

To maximize effectiveness, prompts should follow a logical structure rather than being a loose collection of sentences. Several frameworks are recommended:

The Structured Approach

This formula involves four key components:

    1.  Role and Goal - Broadly describe the aim and the persona the model should adopt.

    2.  Context - Provide background information.

    3.  Task - Make expectations explicit and detailed.

    4.  Reference Content - Supply the data or text the AI needs to process.
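Put together, a prompt following this formula might look like this (the scenario is invented):

```python
# The four components assembled into one prompt (scenario invented).
prompt = """Role and Goal: You are a seasoned travel editor writing for budget travellers.

Context: The reader has 4 days in Lisbon in March and a budget of 400 euros,
with flights and accommodation already paid for.

Task: Produce a day-by-day itinerary. Keep each day to 3 bullet points,
include one free activity per day, and end with a total estimated cost.

Reference content:
- The reader does not want to rent a car.
- The reader is vegetarian."""
```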

The C.R.E.A.T.E. Framework

A mnemonic for drafting prompts that stands for Character (role), Request (specific task), Examples, Additions (style/POV refinements), Type of Output, and Extras (context/reference text).

The Rhetorical Approach

This focuses on the "rhetorical situation," defining the audience, context, author ethos (credentials), pathos (desired emotional response), logos (logical points), and arrangement.

Instructional Strategies

How you phrase your request significantly impacts the model's performance.

Be Specific and Simple

Simplicity is a design principle; if a prompt is confusing to a human, it will likely confuse the AI model. You must be specific about the desired output to ensure the model focuses on what is relevant. Leave as little to interpretation as possible.

Use Instructions Over Constraints

It is generally more effective to give positive instructions (telling the model what to do) rather than constraints (telling it what not to do). Constraints should be reserved for safety purposes or specific formatting limits.

Provide Examples (Few-Shot)

Giving the model one or more examples (input and output pairs) is highly effective. It acts as a teaching tool, allowing the model to imitate the desired pattern, style, and tone. This is as simple as laying out a plain-text example with a heading and a block of body text followed by some bullet points; it will use that format in its response. We will explore prompting techniques in my next post.

Tip: For classification tasks, use at least six examples and mix up the classes (e.g., positive, negative, neutral) to prevent the model from overfitting to a specific order.

Break Tasks Down

For complex requests, split the task into smaller steps. For instance, instruct the model to first extract factual claims and then as a second prompt, verify them, rather than doing both in one pass.

Define the Role

Assigning a specific persona (e.g., "Technical Product Manager", "News Anchor", or "Industry Journalist") helps frame the output's voice and focus its expertise.


Formatting and Syntax

The physical layout and syntax of the prompt help the model parse intent.


Use Clear Syntax

Utilize punctuation, headings, and section markers (like `---` or XML tags) to differentiate between instructions, context, and reference data.

Combat Recency Bias

Models can be influenced more heavily by information at the end of a prompt. It is often helpful to repeat instructions at the end of the prompt or place the primary instructions before the data content.

Prime the Output (Cues)

You can "jumpstart" the model's response by providing the first few words of the desired output. For example, ending a prompt with "Here is a bulleted list of key points:" guides the model to immediately start listing items.

Structured Output (JSON/XML)

Requesting output in specific formats like JSON limits hallucinations and creates structured data that is easier to integrate into applications. For the real techies out there, if the JSON output is truncated or malformed, libraries like json-repair can help salvage the data.
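A quick hedged sketch with the json-repair package (pip install json-repair), assuming a response got cut off mid-array:

```python
# Salvaging a truncated model response with the json-repair package.
# pip install json-repair
from json_repair import repair_json

truncated = '{"movies": [{"title": "Heat", "year": 1995}, {"title": "Ronin"'
fixed = repair_json(truncated)  # closes the dangling brackets for us
print(fixed)  # e.g. {"movies": [{"title": "Heat", "year": 1995}, {"title": "Ronin"}]}
```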


Technical Configuration

Beyond the text, model settings play a crucial role in the output quality.


Temperature and Top-P (controlling randomness)

These are known as hyper-parameters and the difference between them is quite subtle.

The temperature parameter is used in language models to control the randomness of the generated text. It controls how much the model should take into account low-probability words when generating the next token in the sequence. For tasks requiring factual accuracy (like math or code), set the temperature to 0 or a very low number. For creative tasks, higher temperatures (e.g., 0.9) encourage diversity.

The top_p parameter can also be used to control the randomness of the outputs. Top_p sampling, also called nucleus sampling, sets a probability threshold (the default value is 1 in most APIs). This threshold represents the proportion of the probability distribution to consider for the next word. In other words, it selects the top words from the probability distribution, those with the highest probabilities that add up to the given threshold.

For example, if we set a top_p of 0.05, the model, once it has generated the probability distribution, will only consider the highest-probability tokens whose probabilities sum to 5%. It then randomly selects the next token from within that 5% pool, weighted by likelihood. The effect of top_p sampling is highly correlated with the quality and size of the dataset used to train the model; for subjects covered by large, high-quality training data, the answers do not change that much when you modify the value of top_p.

It is generally recommended to alter only one of these parameters (Temperature or Top-P) at a time, not both.

Note: Don't ask me to repeat that after a few beers.


Token Limits

Be mindful of output length. Generating excessive tokens increases cost and latency. You can control this via configuration settings or by explicitly instructing the model to be concise (e.g. "Explain it in a tweet-length message").


Process Management

Prompt engineering is rarely perfect on the first try.


Iterate and Document

You should document every version of your prompt, including the model used, temperature settings, and the resulting output. This helps in debugging and refining performance over time. Keep them in a Google doc or simple text file.

Experiment with Variables

Use variables (e.g. `{city}`) in your prompts to make them dynamic and reusable across different inputs.
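For example:

```python
# A reusable prompt template with a {city} variable.
template = "You are a travel guide. Tell me one surprising fact about {city}."
for city in ["Amsterdam", "Madrid", "Hong Kong"]:
    prompt = template.format(city=city)
    # send `prompt` to your model of choice here
    print(prompt)
```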

Collaborate

Have multiple people attempt to design prompts for the same goal; variance in phrasing can lead to discovering more effective techniques.


Next up...

In the next post I will try and outline some prompting techniques, of which there are many.

Beyond the Prompt: Vibe Coding

Previously, I explored a provocative reality: the era of manual, meticulous "prompt engineering" is coming to an end. The days of...