Tuesday, March 31, 2026

Stop Guessing, Start Directing: From Zero-Shot to Few-Shot Guide for AI Precision

When I started using AI almost a year ago, I found the whole thing utterly amazing. It took me quite a while to realise that the answer wasn’t just something it had found on the web; it had actually generated that answer for me. I was using it like Google Search, and it’s much more powerful than that.

Then reality sets in. After issuing a command, five seconds later my screen is filled with a response that is… technically correct, but completely useless. It’s too wordy, the tone is wrong, it hallucinates facts and, worst of all, it sounds like a robot trying too hard to be human. There are so many tell-tale signs of AI-generated text: those long “em dashes”, and nearly every paragraph summarised with bullet points. Don’t worry, I get rid of all my bullet points before posting my blogs ;-)

It took me a while to realise that continually asking AI to ‘regenerate’ an answer was effectively asking it to roll the dice again and again. Then I stumbled across Zero-Shot, One-Shot and Few-Shot Prompting…

Learning to make use of these techniques will be the single most powerful shift you can make to your workflow: knowing when to use Zero-Shot Prompting (your quick-and-dirty command), when One-Shot Prompting (the Goldilocks technique) will do, and when to switch to Few-Shot Prompting (giving the AI a template to follow).

If you want the AI to stop guessing and start mimicking your brand’s voice, your logic, or your formatting, you need to master these techniques and understand when each one is most appropriate. In 2026, the era of treating AI like a magic 8-ball is over. We are now in the era of structured prompting.

So let’s try to understand these techniques a little better…

Zero-Shot : The "Quick & Dirty" Method

Zero-shot is the "Google Search" that I was unwittingly using when I started working with AI, and I’m sure everyone else started here too. It’s built for speed, intuition, and broad strokes. You aren't teaching the AI; you are tapping into its existing massive library of patterns.

If you’re interested in the “sciency” bit, Zero-shot relies on “Global Probability”. When you ask for a "legal summary," the AI looks at the trillions of words it was trained on and predicts what a "standard" legal summary looks like. It’s essentially playing a high-stakes game of "predict the next word" based on general consensus.

What is it good for?…

After spending an initial period of asking it to create a poem about pirates lost in a garden centre and writing a story about a grubby bear with a gambling addiction, I really found it useful for brainstorming and providing me with a list of ideas, such as:

  • Suggesting 10 titles for a blog article.

  • Summarize this 40-page PDF into 5 bullet points.

  • Broad Fact-Finding such as "What were the three primary causes of the French Revolution?". These types of prompts lead to those Google Search AI Overviews which provide a deeper, more direct answer than sifting through loads of websites and Wikipedia articles.

  • Translate this menu into conversational Italian.

The Danger Zone

Once you’ve been around the block reading AI-generated text, you’ll start spotting it instantly, and Zero-Shot is the worst offender for generic or clichéd blocks of text, mostly due to this "Global Probability" mechanism.

For the same reason, it is more likely to hallucinate a "plausible-sounding" answer when it doesn't know the fact, especially without examples to anchor it.

Lastly, if you need the data in a specific format like JSON or CSV, Zero-Shot will almost always prepend "Here is your data!" chatter that breaks your code.
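To make that last failure mode concrete, here's a minimal, defensive sketch of what downstream code often ends up doing: stripping the chatty preamble off a Zero-Shot reply before parsing it. The `reply` string is an invented example of the typical failure, not output from any real model.

```python
import json

def extract_json(raw: str):
    """Pull the first JSON object out of a chatty model reply.

    A defensive sketch: real replies vary wildly, so this only
    handles the common 'preamble + JSON' failure mode by slicing
    from the first '{' to the last '}'.
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in reply")
    return json.loads(raw[start:end + 1])

# A typical Zero-Shot reply: correct data, wrapped in chat.
reply = 'Sure! Here is your data:\n{"size": "small", "ingredients": ["cheese"]}'
data = extract_json(reply)
```

If you control the prompt, the better fix is the one this post is about: show the model the exact output shape you want, so you don't need to scrape it afterwards.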

One-Shot : The "Goldilocks" Technique

So, onwards and upwards. Sometimes you don’t need a whole training set; you just need to clear up the confusion. One-shot is providing exactly one example. It’s the most efficient way to define a "style" or "format" without cluttering your context window.

Note : A context window refers to the amount of text (measured in tokens) that a Large Language Model (LLM) can process or "remember" at one time.

One-shot acts as a structural anchor. While Zero-shot leaves the AI guessing about your preferred format, a single example removes 90% of that ambiguity. It’s particularly effective for high-performing 2026 models like Gemini 3 Flash or GPT-5, which are now sensitive enough to pivot their entire behavior based on a single data point.

What is it good for?…

  • If you want the output in a specific JSON structure or bullet-point style, you can define the exact format you want just by providing a verbatim example.

  • Provide one previous email you wrote so the AI can mimic your specific tone and level of formality.

  • Or just to be a little more obscure, maybe you’re translating English to "Legal-Speak" where one example shows the level of complexity that you are trying to achieve.
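A One-Shot prompt is really just careful string assembly: one verbatim example, then the real task in the same shape. Here's a sketch of that structure; the customer-support lines are invented for illustration, and no API call is made, we're only building the prompt string.

```python
# One-Shot prompt shape: instruction, one example pair, then the real task
# laid out identically so the model's completion slots into the pattern.
example_input = "Server down again?!"
example_output = (
    "Thanks for flagging this. We're investigating and will "
    "update you within the hour."
)

task_input = "Why was I charged twice?"

prompt = (
    "Rewrite customer messages as polite support replies.\n\n"
    f"Example:\nCustomer: {example_input}\nReply: {example_output}\n\n"
    f"Now do the same:\nCustomer: {task_input}\nReply:"
)
```

Note that the prompt ends mid-pattern, on "Reply:", which nudges the model to complete the pattern rather than chat about it.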

Few-Shot : The "Pattern-Match" Powerhouse

If Zero-shot is “suck it and see”, Few-shot is a 1-on-1 coaching session. You are providing a "mini-dataset" within the prompt, forcing the AI to ignore its global averages and follow your specific logic.

What is it good for?...

In 2026, models now have massive "context windows" (their short-term memory). Few-shot works because the AI prioritizes the patterns found inside the prompt over the patterns it learned during training. You are essentially creating a temporary "Custom GPT" for that single chat.

Why "Three" is the Magic Number

  • One Example is a suggestion (the AI might think it's a fluke).

  • Two Examples create a line (a basic direction).

  • Three Examples create a pattern. Once the AI sees a pattern repeated three times, its mathematical confidence in mimicking that pattern skyrockets.

Pro-Tip: "Diverse Few-Shotting"

Don’t just give three identical examples. Give three different versions of a success.

  • Example 1: Short sentence success.

  • Example 2: Long, complex paragraph success.

  • Example 3: Success with an "edge case" (like a negative or a question).
    This teaches the AI the boundaries of your request, not just the middle.
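The diverse-examples idea scales naturally if you keep your examples as data and assemble the prompt from them. A sketch, assuming a sentiment-classification task of my own invention (the reviews and labels are illustrative, not from any dataset):

```python
# Build a Few-Shot prompt from a small, diverse "mini-dataset":
# one short example, one long one, and one edge case (a question).
examples = [
    ("Great service, will return!", "positive"),  # short
    ("The food took an hour to arrive and was cold, although the "
     "waiter was apologetic throughout.", "negative"),  # long and mixed
    ("Is it normal to wait 45 minutes for a starter?", "negative"),  # edge case
]

def build_prompt(examples, new_input):
    """Lay out each example identically, then the unanswered case."""
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_prompt(examples, "Lovely decor but the coffee was undrinkable.")
```

Keeping examples in a list also makes it trivial to swap them per task, which is effectively the "temporary Custom GPT" idea mentioned above.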

The "Shot" Summary

Here’s a wee summary to maybe give you a rule of thumb:

| Technique | Method | Accuracy | Token Cost | Best For... |
|---|---|---|---|---|
| Zero-Shot | Just a command. | ⭐⭐ | 🟢 Lowest | General knowledge, brainstorming |
| One-Shot | Command + 1 Example. | ⭐⭐⭐ | 🟡 Low | Setting a specific format or tone |
| Few-Shot | Command + 3-5 Examples. | ⭐⭐⭐⭐⭐ | 🔴 Higher | Logic, complex classification, data clean-up |


Some Examples (The "Secret Sauce")

Now I will attempt to summarize with some solid examples to make it less abstract and academic…

Zero-shot prompting

This involves giving the model a direct instruction to perform a task without providing any examples.


Example : Translate this sentence from French to English: 'Bonjour le monde'.


Where it succeeds : Zero-shot succeeds because it is highly efficient for simple, well-understood tasks that the model has frequently encountered during its training, such as straightforward translations like this.

Where it fails : It often falls short when a task requires a specific output structure or when the prompt involves ambiguity, as the model is left guessing the desired format without a pattern to follow.

One-Shot Prompting

One-shot prompting enhances the zero-shot approach by providing exactly one input-output example before presenting the actual request.


Example : Translate the following sentence. Example: 'Salut' → 'Hello'. Now translate: 'Bonjour' → ?.


Where it succeeds : This technique is ideal when the model needs a specific format or context to understand a fairly simple task, giving it a basic starting point to imitate.

Where it fails : One-shot prompting struggles with nuanced tasks because a single example cannot fully capture the range of possible edge cases or complex formatting rules.

Few-Shot Prompting

Few-shot prompting provides multiple examples (typically two to five) to help the model recognize patterns and learn in-context.


Example : Parse a customer's pizza order into valid JSON

EXAMPLE 1 : I want a small pizza with cheese, tomato sauce, and pepperoni.

JSON Response: { "size": "small", "type": "normal", "ingredients": ["cheese", "tomato sauce", "pepperoni"] }

EXAMPLE 2 : Can I get a large pizza with tomato sauce, basil and mozzarella?

JSON Response: { "size": "large", "type": "normal", "ingredients": ["tomato sauce", "basil", "mozzarella"] }

EXAMPLE 3 : Now, I would like a large pizza, with the first half cheese and mozzarella. And the other half tomato sauce, ham and pineapple.

JSON Response: { "size": "large", "type": "normal", "one-half-ingredients": ["cheese", "mozzarella"], "second-half-ingredients": ["tomato sauce", "ham", "pineapple"] }
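Once replies start coming back in this shape, it pays to validate them before your code trusts them. A minimal sketch, assuming the reply follows the basic single-topping pattern from the examples above (`parse_order` and its required keys are my own illustration, not part of any standard):

```python
import json

def parse_order(model_reply: str) -> dict:
    """Parse and sanity-check a pizza-order reply before use.

    A defensive sketch: the required keys mirror the basic examples
    above; half-and-half orders would need their own check.
    """
    order = json.loads(model_reply)
    for key in ("size", "type", "ingredients"):
        if key not in order:
            raise ValueError(f"missing key: {key}")
    return order

reply = ('{"size": "small", "type": "normal", '
         '"ingredients": ["cheese", "tomato sauce", "pepperoni"]}')
order = parse_order(reply)
```

Validation like this is what turns a Few-Shot prompt from a neat demo into something you can safely automate against.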


Where it succeeds : Few-shot prompting dramatically succeeds where zero-shot and one-shot fail by enforcing strict structural patterns (like generating JSON, YAML, or bulleted lists) and teaching the model how to handle varied, nuanced inputs. It allows the model to learn entirely new concepts in-context, such as successfully using a made-up word in a sentence after seeing a few examples of how it is done.

Where it fails : Few-shot prompting hits its limits when dealing with complex, multi-step reasoning or arithmetic tasks. For instance, providing multiple examples of whether a group of odd numbers adds up to an even number might still result in the model returning an incorrect answer for a new list of numbers. Because standard few-shot prompting only shows the final answer rather than the process of getting there, the model fails to learn the underlying logic. To succeed where few-shot fails, you must transition to Chain-of-Thought (CoT) prompting, which provides examples that break the problem down into intermediate reasoning steps. I will delve into CoT prompting in a future post.

Conclusion (and a wee challenge)

The difference between basic AI usage and true mastery often comes down to context.

Use Zero-Shot when you are exploring, brainstorming, or doing a task so simple that it’s almost impossible to mess up. It’s built for speed.

But when reliability, predictability, and precise formatting matter—especially if you are automating workflows—you must use Few-Shot. By providing just three curated examples, you anchor the model's logic, eliminate "AI-isms," and ensure consistent results.

A wee challenge for this week:

  1. Take the last prompt you wrote that gave you a generic, frustrating result.

  2. Structure that same task as a Few-Shot prompt, providing the AI with three examples of what a perfect response looks like.

  3. Compare the outputs.

Tuesday, March 24, 2026

The Model Context Protocol (MCP): Bridging AI and Actionable Data

My day job has recently introduced a new concept for me to understand (like I need more new concepts in my daily life): the Model Context Protocol (MCP)...

What is MCP?

MCP is an open standard designed to unify how AI assistants and large language models (LLMs) connect with external data sources, tools, and environments. An MCP Server acts as a secure gateway or bridge between the AI application (the client) and external systems, such as databases, file systems, or APIs. It is frequently compared to a "USB-C port for AI," as it provides a universal, standardized interface for plugging external capabilities into AI systems.

For example, this is incredibly useful if you are building a chatbot for your organisation and want your AI Assistant to have access to your internal customer support database, using it as a knowledge base for resolving common issues and answering questions about your company (i.e. it’s providing context to your AI service). Without it, you would somehow have to expose all that information to the larger LLM, which is just not gonna happen.

An MCP server exposes three core primitives to AI applications:

  • Tools: Executable functions that the AI can actively call to perform actions, such as writing to a database, executing a web search, or modifying a file.

  • Resources: Passive, read-only data sources that provide the AI with context, such as database schemas, API documentation, or your customer support database.

  • Prompts: Reusable instruction templates that help structure interactions and guide the AI through specific workflows. These prompts are refined by the architect of the MCP server to give you more meaningful responses, saving you time generating and testing them from scratch.

Why Would You Need an MCP Server?

  • To Eliminate Fragmented Integrations: Before MCP, developers had to write custom API integrations for every single external tool or system an AI needed to access. By implementing an MCP server, developers can build an integration once and grant the AI access to a vast, standardized ecosystem of resources without maintaining dozens of custom codebases.

  • To Enable Safe Action and Execution: LLMs are limited to the data they were trained on and lack built-in environments to safely execute code or make network requests on their own. An MCP server acts as a controlled execution layer. It keeps sensitive elements like API keys hidden from the model while the server handles the actual safe execution of tasks.

  • For Dynamic Tool Discovery: Unlike static API specifications (like OpenAPI) that must be pre-loaded into an LLM, MCP allows AI applications to query servers at runtime to dynamically discover what tools and resources are currently available.

  • To Ensure Security and Access Control: MCP servers are designed with enterprise security in mind, utilizing OAuth 2.1 for authentication and centralizing permissions management. This ensures that AI applications only interact with authorized data and that user-specific contexts are strictly respected so data does not leak between users.

  • For Portability Across Applications: Because MCP is vendor-agnostic and model-agnostic, you can build a toolset once via an MCP server and plug it into any compatible AI application or IDE—such as Claude Desktop, Cursor, Windsurf, or LangChain—without needing to rewrite the integration.

  • To Support Agentic Workflows: MCP facilitates conversational, multi-turn interactions through real-time updates and streaming (using Server-Sent Events). This allows AI agents to dynamically interact with multiple data sources, handle intermediate steps, and maintain persistent context over complex, multi-step tasks.

Why use MCP rather than an API?

While both APIs and MCPs aid in communication between systems, their core audiences, mechanisms, and philosophies differ significantly. A helpful way to frame the difference is that APIs connect machines, whereas MCP connects intelligence to machines.

Here are the primary differences between the two:

Target Audience and Optimization

  • APIs are built for human developers to write code against, optimizing software-to-software communication.

  • MCP is built specifically for AI models to streamline agentic interactions where an AI needs to reason about the data it receives.

Static vs. Dynamic Discovery

  • APIs rely on static contracts that must be pre-loaded, read, and manually interpreted to formulate requests.

  • MCP features dynamic discovery. An AI agent can query an MCP server at runtime to ask, "What tools can you offer?", and the server will automatically respond with a structured list of available tools, their descriptions, and parameter schemas. This means the AI always has an up-to-date view of its capabilities without needing manually updated documentation.
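MCP messages travel as JSON-RPC 2.0, so the discovery exchange above has a concrete wire shape. Here's a sketch of a `tools/list` request and a plausible response; the `get_weather` tool entry is an invented example, not any real server's output.

```python
# Sketch of MCP dynamic discovery on the wire (JSON-RPC 2.0).
# The agent asks "what can you do?" and gets back structured tool
# descriptions, including a JSON Schema for each tool's parameters.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_weather",
                "description": "Fetch the current forecast for a city.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ]
    },
}

# The agent can now enumerate its capabilities at runtime,
# with no hand-maintained documentation in the loop.
tool_names = [tool["name"] for tool in response["result"]["tools"]]
```

Because the schema arrives alongside the tool name, the model can formulate a valid call without ever having seen the server's documentation.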

Security and Execution

  • APIs are exposed over the network and assume the caller can securely manage tokens, headers, and request formatting. However, AI models do not have built-in execution environments and cannot safely hold secrets like API keys.

  • MCP introduces a secure intermediary layer. The AI model never sees API keys or sensitive URLs. Instead, the AI asks the MCP server to use a specific tool, and the MCP server validates the input, securely executes the API call using its own hidden credentials, and returns only the safe results. MCP also standardizes security governance, utilizing protocols like OAuth 2.1 to ensure the AI only accesses data the user has explicitly authorized.

Granularity and Abstraction

  • APIs typically expose granular, entity-based endpoints (e.g., /users or /weather).

  • MCPs are less granular and focus on driving broader use cases. An MCP server exposes high-level capabilities (e.g., get_weather or get_open_supportIssues). A single MCP tool might execute several underlying REST API calls to gather all the necessary context for the AI.
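That "one tool, several API calls" idea (and the LLM-friendly output discussed below) can be sketched in a few lines. Everything here is a stand-in: `fetch_user` and `fetch_tickets` mimic granular REST calls, and the data is invented for illustration.

```python
# Stand-ins for two granular REST endpoints (e.g. /users, /tickets).
def fetch_user(user_id):
    return {"id": user_id, "name": "Ada"}

def fetch_tickets(user_id):
    return [{"id": 7, "status": "open", "title": "Login fails"},
            {"id": 9, "status": "closed", "title": "Typo on invoice"}]

def get_open_support_issues(user_id):
    """One high-level, MCP-style tool call that aggregates several
    underlying API calls and returns LLM-friendly text with hydrated
    names rather than raw entity IDs."""
    user = fetch_user(user_id)
    open_tickets = [t for t in fetch_tickets(user_id)
                    if t["status"] == "open"]
    lines = [f"Open issues for {user['name']}:"]
    lines += [f"- #{t['id']}: {t['title']}" for t in open_tickets]
    return "\n".join(lines)

summary = get_open_support_issues(42)
```

The model sees "Open issues for Ada: ..." rather than two raw JSON payloads it would have to join itself, which is exactly the abstraction gap MCP tools are meant to close.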

LLM-Native Features

  • APIs are generally stateless request-response mechanisms.

  • MCP supports multi-turn, long-lived sessions (often using Server-Sent Events) that allow an AI agent to have back-and-forth interactions with a tool. Furthermore, MCP includes AI-specific features like sampling, which allows the MCP server to leverage the LLM's reasoning abilities. For example, an MCP server could fetch open issues and then use sampling to ask the LLM to filter them by "highest security impact"—a subjective analysis that a traditional REST API cannot natively perform.

Output Formatting

  • APIs return machine-readable data, such as raw JSON payloads and database entity IDs.

  • MCP is designed to return data optimized for an LLM's context window, often formatting responses as human-readable Markdown with fully hydrated entity names instead of raw IDs.

How secure is my data behind an MCP Server?

Because the Model Context Protocol (MCP) acts as a bridge between untrusted, model-generated inputs and sensitive external systems, a single weak point can turn that bridge into a pathway for exploitation. Securing an MCP deployment requires a "shared responsibility" model, where the server stands as a fortified wall protecting resources, and the client acts as a vigilant gatekeeper ensuring the AI does not overstep its bounds.

Academic research breaks down MCP threats into four main categories: malicious developers, external attackers, malicious users, and security flaws. In practice, these manifest as prompt injection, command execution, token theft, excessive permissions, and unverified endpoints.

To protect yourself, you must implement strict safeguards across both MCP servers and MCP clients and quite honestly 99.9% of it goes straight over my head. It can be “dead secure” is what I’ll say on the matter.

How does MCP make my life easier?

So let’s list out a few scenarios where an MCP Server would make sense. At the end of the day it sits in the background and makes interaction with AI more meaningful as it has access to more capabilities and context of the organisation you are talking to.

Software Development and Debugging

The AI coding assistant is greatly enhanced by using MCP to connect directly to local filesystems and version control systems like Git or GitHub. Instead of manually pasting code snippets into a chat, the AI can securely browse your local files, read repository code, search codebases, review pull requests, and even commit changes directly within environments like Cursor or Claude Desktop.

Automated Travel Planning

The true power of multi-server MCP architecture shines here by combining multiple disparate services into one workflow. By connecting a Travel Server, a Weather Server, and a Calendar Server, an AI agent can autonomously read your calendar to find available dates, check destination weather forecasts, search and book flights, and automatically add the itinerary to your schedule while emailing you a confirmation.

Workflow and Communication Automation

AI can connect seamlessly to platforms like Slack, Gmail, or Google Drive. An AI assistant can search through your team's Slack history to pull project context, summarize past decisions, and automatically draft and send emails based on a simple natural language request, all without you needing to switch tabs.

Data Analysis and Visualization

MCP allows AI models to connect directly to SQL databases, Google Sheets, or financial APIs. The AI can read raw data like customer feedback or stock market history, execute complex queries, and instantly generate interactive charts or analytical dashboards. For instance, an AI can use the Alpha Vantage MCP server to fetch 10 years of historical coffee prices and immediately plot an interactive visual graph for you.

Enterprise Knowledge Management

A multi-agent MCP setup can be entirely automated for a Training Management System that can use specialized MCP agents to automatically ingest uploaded PDF documents, extract key learning objectives, generate structured course modules, and create custom multiple-choice assessments without manual human intervention.

Ultimately, the core benefit of MCP in these scenarios is that it transforms AI from a passive text generator into an active, context-aware participant. By utilizing standardized tools, resources, and prompts, you gain a modular, secure way to grant AI access to your personal and business data without needing to write custom integrations for every single application.

That was a big chunk of stuff I learned this week… What next?


The End of Prompt Sorcery: Why We Are Engineering Systems, Not Sentences in 2026

  Now this post might seem like a complete contradiction! Previously, I have been waxing lyrical on all sorts of prompting techniques from Z...