This week I’ll take some time out from AIO and talk about some basics that I’ve been getting to grips with in my day job, particularly over the last year. Prompt engineering has appeared from nowhere, and the more you dig in, the more you find there is a ton of techniques and methods that can really make a difference in what you get back from AI. Sure, you can treat it just like a Google search, but it can do a whole lot more…
What is an AI Prompt?
In the context of generative AI, a “prompt” is the input, most often text but sometimes other modes like images or voice commands, provided to an AI model to elicit a specific response or prediction. It serves as the primary interface for interacting with Large Language Models (LLMs), acting as a form of "coding in English" where the user defines the task, context, and constraints for the AI to process.
In other words, it's not just something you'd type into a Google search; it can be a whole lot more. It might review your resume and rewrite it in a particular style, summarize a website article, or even pull something out of a hat like a unique poem or story.
Why Take Time to Develop Them?
It’s easy to use this like a standard Google search, and that’s totally fine too. However, you can really unleash the power of AI by investing some time in “prompt engineering”, which is described as more of an art than a science, often requiring experience and intuition to master. This iterative process is necessary for several reasons:
To Ensure Accuracy. LLMs function as prediction engines, generating the next most likely text based on their training data. Without a high-quality prompt to guide this prediction, the model may produce ambiguous, inaccurate, or irrelevant outputs.
It forces you to write very accurate instructions to ensure a more predictable result and this is a good practice for all walks of life.
To Navigate Sensitivity. Models are highly sensitive to word choice, tone, structure, and context; even small differences in phrasing or formatting can lead to significantly different results.
To Define Boundaries. A well-developed prompt helps the user understand the model's capabilities and limitations, allowing them to improve safety and reduce the likelihood of "hallucinations" (fabricated information). AI can lie very effectively, so don't give it a fraction of a chance to do it.
To Optimize Resources. Poorly designed prompts can lead to excessive token generation, which increases latency and computational costs. Refined prompts can enforce conciseness and specific output structures (like JSON) that make the data more usable.
Ultimately it’s always best to be absolutely clear on what you are asking the AI to do, giving it no possibility to go off and get creative with its answer.
Prompt Design
Designing high-quality prompts is an iterative process that blends art and engineering. The best practices for prompt engineering can be categorized into structural frameworks, instructional strategies, technical configuration, and process management.
Structural Frameworks
To maximize effectiveness, prompts should follow a logical structure rather than being a loose collection of sentences. Several frameworks are recommended:
The Structured Approach
This formula involves four key components:
1. Role and Goal - Broadly describe the aim and the persona the model should adopt.
2. Context - Provide background information.
3. Task - Make expectations explicit and detailed.
4. Reference Content - Supply the data or text the AI needs to process.
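The four components above can be assembled into a single prompt string. A minimal sketch in Python; the persona, context, task, and reference text here are all invented examples:

```python
# Build a prompt from the four parts of the structured approach.
role_and_goal = "You are a hiring manager reviewing resumes for a software role."
context = "The company is a small startup that values clear, concise writing."
task = "Rewrite the summary section below in at most three sentences."
reference = "Summary: Experienced developer with 10 years in web technologies..."

# Blank lines between sections help the model separate them.
prompt = "\n\n".join([role_and_goal, context, task, reference])
print(prompt)
```

The same layout works just as well typed directly into a chat window; the code simply makes the four-part structure explicit.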
The C.R.E.A.T.E. Framework
A mnemonic for drafting prompts that stands for Character (role), Request (specific task), Examples, Additions (style/POV refinements), Type of Output, and Extras (context/reference text).
The Rhetorical Approach
This focuses on the "rhetorical situation," defining the audience, context, author ethos (credentials), pathos (desired emotional response), logos (logical points), and arrangement.
Instructional Strategies
How you phrase your request significantly impacts the model's performance.
Be Specific and Simple
Prompts should be concise but explicit about the desired output; vague instructions invite vague answers.
Use Instructions Over Constraints
It is generally more effective to give positive instructions (telling the model what to do) rather than constraints (telling it what not to do). Constraints should be reserved for safety purposes or specific formatting limits.
Provide Examples (Few-Shot)
Giving the model one or more examples (input and output pairs) is highly effective. It acts as a teaching tool, allowing the model to imitate the desired pattern, style, and tone. This can be as simple as laying out a plain-text example with a heading, a block of body text, and some bullet points; the model will use that format in its response. We'll be exploring prompting techniques further in my next post.
Tip: For classification tasks, use at least six examples and mix up the classes (e.g., positive, negative, neutral) to prevent the model from overfitting to a specific order.
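A few-shot classification prompt might be built like this. Following the tip above, the six example labels are deliberately mixed rather than grouped; all the review texts are invented:

```python
# Six labeled examples with the classes interleaved (positive, negative,
# neutral) so the model doesn't overfit to a particular order.
examples = [
    ("The battery lasts all day, love it.", "positive"),
    ("Stopped working after a week.", "negative"),
    ("It does what the box says.", "neutral"),
    ("Shipping was slow and the box was damaged.", "negative"),
    ("Best purchase I've made this year.", "positive"),
    ("It's fine, nothing special.", "neutral"),
]

shots = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
prompt = (
    "Classify each review as positive, negative, or neutral.\n\n"
    f"{shots}\n\n"
    "Review: The screen is gorgeous but the speakers are weak.\n"
    "Sentiment:"
)
```

Note the prompt ends mid-pattern ("Sentiment:"), which nudges the model to complete it with just a label.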
Break Tasks Down
For complex requests, split the task into smaller steps. For instance, instruct the model to first extract factual claims and then as a second prompt, verify them, rather than doing both in one pass.
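The extract-then-verify pattern can be sketched as two chained calls. `ask_model` is a stand-in for whatever LLM client you use; here it is stubbed out so the shape of the pipeline is clear:

```python
def ask_model(prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    return "(model response)"

def fact_check(article: str) -> str:
    # Pass 1: extract the factual claims only.
    claims = ask_model(
        "List every factual claim in the text below, one per line.\n\n" + article
    )
    # Pass 2: verify the extracted claims against the original text.
    verdicts = ask_model(
        "For each claim below, state whether it is supported by the text, "
        "and why.\n\nClaims:\n" + claims + "\n\nText:\n" + article
    )
    return verdicts
```

Each pass gets a narrow, single-purpose prompt, which is usually easier to debug than one prompt doing both jobs.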
Define the Role
Assigning a specific persona (e.g. "Technical Product Manager", "News Anchor", or "Industry Journalist") helps frame the output's voice and area of expertise.
Formatting and Syntax
The physical layout and syntax of the prompt help the model parse intent.
Use Clear Syntax
Utilize punctuation, headings, and section markers (like `---` or XML tags) to differentiate between instructions, context, and reference data.
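For instance, a prompt using `---` markers to separate the instruction, the context, and the data might look like this (the newsletter scenario is invented):

```python
# Delimiters make it unambiguous which part is the instruction and which
# part is merely material to be processed.
prompt = """Summarize the article for a general audience.
---
CONTEXT: The summary will appear in a weekly email newsletter.
---
ARTICLE:
(paste the article text here)
"""
print(prompt)
```

XML-style tags such as `<article>…</article>` serve the same purpose and are equally well understood by most models.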
Combat Recency Bias
Models can be influenced more heavily by information at the end of a prompt. It is often helpful to repeat instructions at the end of the prompt or place the primary instructions before the data content.
Prime the Output (Cues)
You can "jumpstart" the model's response by providing the first few words of the desired output. For example, ending a prompt with "Here is a bulleted list of key points:" guides the model to immediately start listing items.
Structured Output (JSON/XML)
Requesting output in specific formats like JSON limits hallucinations and creates structured data that is easier to integrate into applications. For the real techies out there, if the JSON output is truncated or malformed, libraries like json-repair can help salvage the data.
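A sketch of asking for JSON and validating the reply with the standard library; the schema, sentence, and simulated model reply are all invented (a real reply could be truncated, which is where something like json-repair earns its keep):

```python
import json

prompt = (
    "Extract the person's name and age from the sentence below. "
    'Respond with only JSON in the form {"name": str, "age": int}.\n\n'
    "Alice turned 30 last Tuesday."
)

reply = '{"name": "Alice", "age": 30}'  # simulated model output
data = json.loads(reply)  # raises an error if the JSON is malformed
print(data["name"], data["age"])
```

Validating immediately after the call means a malformed response fails loudly at the boundary instead of corrupting data downstream.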
Technical Configuration
Beyond the text, model settings play a crucial role in the output quality.
Temperature and Top-P (controlling randomness)
These are known as hyper-parameters and the difference between them is quite subtle.
The temperature parameter is used in language models to control the randomness of the generated text. It controls how much the model should take into account low-probability words when generating the next token in the sequence. For tasks requiring factual accuracy (like math or code), set the temperature to 0 or a very low number. For creative tasks, higher temperatures (e.g., 0.9) encourage diversity.
The top_p parameter can also be used to control the randomness of the outputs. Top_p sampling, also called nucleus sampling, sets a probability threshold (the default value is 1 in the API). This threshold represents the proportion of the probability distribution to consider for the next word: the model keeps only the top words, in order of probability, whose probabilities add up to the given threshold.
For example, with a top_p of 0.05, once the model has generated the probability distribution it will only consider the highest-probability tokens that together sum to 5%. It then randomly selects the next token from among those tokens, weighted by likelihood. The effect of top_p is highly correlated with the quality and size of the dataset used to train the model; in well-covered subjects like machine learning, where training data is large and high quality, the answers don't change much as you vary top_p.
It is generally recommended to alter only one of these parameters (Temperature or Top-P) at a time, not both.
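The two knobs can be made concrete with a small standard-library sketch of next-token sampling. The token scores here are invented; real models work over vocabularies of tens of thousands of tokens, but the mechanics are the same:

```python
import math
import random

def sample(logits, temperature=1.0, top_p=1.0):
    # Temperature rescales scores before the softmax: low values sharpen
    # the distribution, high values flatten it.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    total = sum(math.exp(v) for v in scaled.values())
    probs = {tok: math.exp(v) / total for tok, v in scaled.items()}

    # Top-p (nucleus) keeps only the most likely tokens whose cumulative
    # probability reaches the threshold, then samples from that nucleus.
    kept, running = [], 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept.append((tok, p))
        running += p
        if running >= top_p:
            break
    tokens, weights = zip(*kept)
    return random.choices(tokens, weights=weights)[0]

logits = {"the": 5.0, "a": 3.0, "banana": 0.5}
# Near-zero temperature plus a tight top-p is effectively greedy decoding.
print(sample(logits, temperature=0.01, top_p=0.1))  # → "the"
```

With temperature near zero the highest-scoring token dominates, so the nucleus collapses to a single word; raise either knob and "banana" starts getting a look-in.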
Note: Don't ask me to repeat that after a few beers.
Token Limits
Be mindful of output length. Generating excessive tokens increases cost and latency. You can control this via configuration settings or by explicitly instructing the model to be concise (e.g. "Explain in a tweet-length message").
Process Management
Prompt engineering is rarely perfect on the first try.
Iterate and Document
You should document every version of your prompt, including the model used, temperature settings, and the resulting output. This helps in debugging and refining performance over time. Keep them in a Google doc or simple text file.
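If a Google Doc feels too manual, a few lines of Python will keep the same record as a CSV. The filename and column layout here are just one possible scheme:

```python
import csv
import datetime

def log_attempt(path, prompt, model, temperature, output):
    # Append one row per prompt attempt: date, model, settings, text, result.
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [datetime.date.today().isoformat(), model, temperature, prompt, output]
        )

log_attempt("prompt_log.csv", "Summarize: ...", "some-model", 0.2, "(output)")
```

Over time the log doubles as a regression suite: rerun old prompts against a new model version and compare the columns.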
Experiment with Variables
Use variables (e.g. `{city}`) in your prompts to make them dynamic and reusable across different inputs.
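A template with a `{city}` variable can be reused across inputs with nothing fancier than `str.format` (the travel-guide wording is just an example):

```python
# One template, many prompts: the placeholder is filled in per request.
template = "You are a travel guide. Describe three attractions in {city}."

for city in ["Amsterdam", "Kyoto"]:
    prompt = template.format(city=city)
    print(prompt)
```

Most LLM tooling (prompt libraries, playgrounds) supports the same idea natively, but plain string formatting is often all you need.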
Collaborate
Have multiple people attempt to design prompts for the same goal; variance in phrasing can lead to discovering more effective techniques.
