Advanced Prompt Techniques: Few-Shot Learning, Chain-of-Thought, and Self-Critique
In our previous episodes, we mastered the “universal formula” for prompt construction and core principles. Now we enter the expert territory, exploring advanced techniques that can produce qualitative leaps in AI performance.
This episode goes beyond achieving “good” results—we’re pursuing extremely precise, reliable, and efficient outputs that rival human expert performance.
The Failure That Started Everything
Imagine you’ve mastered clear instructions and role-playing, yet when you ask an AI to “write a project weekly report following our company’s unique format,” it still fails. Why? Because it cannot understand “unique format” from thin air.
The critical question: When “explaining clearly” itself becomes difficult, how do we communicate with the model? The answer: Don’t just describe with words—show it directly through examples.
Technique 1: Few-Shot Learning — The Power of “Learning by Example”
The Science Behind Few-Shot Learning
Few-shot learning leverages the model's in-context learning capabilities.
This is essentially example-based programming—instead of writing explicit rules, we demonstrate the desired behavior through carefully crafted examples.
When to Use Few-Shot Learning
- Fixed-format tasks: JSON, XML generation
- Style mimicry: Writing in specific tones or formats
- Complex rule following: Tasks with intricate, hard-to-verbalize requirements
- Domain-specific outputs: Industry jargon or specialized formats
Practical Example: Sentiment Analysis with Structured Output
Let’s see few-shot learning in action for a sentiment analysis task that requires specific JSON formatting:
**Input**: "The product is great, but shipping was too slow."
**Output**: `{"sentiment": "mixed", "product": "positive", "logistics": "negative"}`

**Input**: "This was a perfect shopping experience."
**Output**: `{"sentiment": "positive", "product": "positive", "logistics": "positive"}`

**Input**: "Screen has defects, customer service won't help."
**Output**: `{"sentiment": "negative", "product": "negative", "customer_service": "negative"}`

**New Input**: "Phone battery life is excellent, but packaging was damaged."
**Model Output**: `{"sentiment": "mixed", "product": "positive", "packaging": "negative"}`
Notice how the model learned to:
- Use the “mixed” sentiment category for conflicting aspects
- Break down feedback into specific components
- Apply consistent JSON formatting
- Infer new categories (“packaging”) when needed
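The demonstration pattern above is, at its core, plain string assembly. Here is a minimal sketch that builds such a few-shot prompt from the example pairs shown earlier; the resulting string would then be sent to whatever completion API you use (the API call itself is omitted, and the pairing format is just one reasonable layout, not a requirement of any particular provider):

```python
# Example (input, output) pairs taken from the sentiment-analysis demonstrations above.
EXAMPLES = [
    ('The product is great, but shipping was too slow.',
     '{"sentiment": "mixed", "product": "positive", "logistics": "negative"}'),
    ('This was a perfect shopping experience.',
     '{"sentiment": "positive", "product": "positive", "logistics": "positive"}'),
    ("Screen has defects, customer service won't help.",
     '{"sentiment": "negative", "product": "negative", "customer_service": "negative"}'),
]

def build_few_shot_prompt(new_input: str) -> str:
    """Concatenate the demonstrations, then append the new input for the model to complete."""
    parts = [f'Input: "{inp}"\nOutput: {out}' for inp, out in EXAMPLES]
    parts.append(f'Input: "{new_input}"\nOutput:')
    return "\n\n".join(parts)

prompt = build_few_shot_prompt("Phone battery life is excellent, but packaging was damaged.")
print(prompt)
```

Because the prompt ends with a bare `Output:`, the model's most natural continuation is another JSON object in the established format.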
Best Practices for Few-Shot Prompting
Diversity is Key: Include examples that cover edge cases and variations.
Quality over Quantity: 3-5 well-chosen examples often outperform 10 mediocre ones.
Order Matters: Place your strongest, clearest examples first to establish the pattern.
Technique 2: Complex Chain-of-Thought and Self-Consistency
Beyond “Step by Step”
While simple "think step by step" prompts work for basic problems, complex mathematical or logical challenges require structured, multi-layered reasoning.
Advanced CoT Prompting Structure
For complex problems, specify the reasoning framework explicitly:
Q: [Complex word problem]
Solve this systematically:
1. **Define variables**: Clearly identify what each variable represents
2. **List known conditions**: Extract all given information
3. **Establish equations**: Translate conditions into mathematical relationships
4. **Solve step-by-step**: Show each algebraic manipulation
5. **Verify reasonableness**: Check if the answer makes logical sense
Self-Consistency: The Voting Mechanism
Self-consistency is a decoding strategy that significantly improves reasoning accuracy. Instead of taking the first answer, the technique:
- Generates multiple reasoning paths (due to model randomness)
- Compares final answers across different reasoning chains
- Selects the most frequent answer through majority voting
Research shows self-consistency can improve performance by 17.9% on GSM8K math problems and 11.0% on SVAMP reasoning tasks.
Implementation Example
Prompt: "A train travels 120 miles in 2 hours, then 180 miles in 3 hours. What's the average speed for the entire journey? Think step by step."
**Reasoning Path 1**:
Total distance = 120 + 180 = 300 miles
Total time = 2 + 3 = 5 hours
Average speed = 300/5 = 60 mph

**Reasoning Path 2**:
First segment: 120 miles ÷ 2 hours = 60 mph
Second segment: 180 miles ÷ 3 hours = 60 mph
Total journey: 300 miles ÷ 5 hours = 60 mph

**Reasoning Path 3**:
Distance₁ = 120, Time₁ = 2
Distance₂ = 180, Time₂ = 3
Average = (Distance₁ + Distance₂)/(Time₁ + Time₂) = 300/5 = 60 mph
**Consensus Answer**: 60 mph (3/3 agreement)
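The voting step itself is simple to implement. The sketch below takes the final answers from several sampled chains and majority-votes over them; the `paths` dictionaries are a hypothetical stand-in for the output of repeated model calls at nonzero temperature:

```python
from collections import Counter

def self_consistent_answer(reasoning_paths):
    """Majority-vote over the final answers of several sampled reasoning chains."""
    answers = [path["answer"] for path in reasoning_paths]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)

# These three paths mirror the worked train-speed example above (illustrative structure).
paths = [
    {"chain": "total distance over total time", "answer": "60 mph"},
    {"chain": "per-segment speeds agree",       "answer": "60 mph"},
    {"chain": "weighted average formula",       "answer": "60 mph"},
]
answer, agreement = self_consistent_answer(paths)
print(answer, agreement)  # 60 mph 1.0
```

In practice the hard part is extracting a comparable final answer from each free-form chain; a strict output format (e.g. "Final answer: X") makes the voting step reliable.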
Technique 3: Self-Critique and Verification
The Internal Editor Approach
Self-critique prompting asks the model to critically examine its own output after generation. This technique effectively catches errors that occur due to "momentary lapses" in reasoning.
Core Self-Critique Patterns
Basic Verification:
- “Check your work.”
- “Review the above response and identify any potential errors.”
- “Is this answer consistent with the facts provided?”
Structured Self-Review:
After providing your answer, please:
1. **Accuracy Check**: Verify all calculations and facts
2. **Logic Review**: Ensure reasoning steps follow logically
3. **Completeness Assessment**: Confirm all parts of the question are addressed
4. **Alternative Perspective**: Consider if there's another valid interpretation
Self-Verification in Practice
Self-verification uses a dual-process approach:
- Forward Reasoning: Generate initial answer with CoT
- Backward Verification: Use the answer to predict original conditions
Example:
**Initial Problem**: "Jackie has 10 apples. Adam has 8 apples. How many more apples does Jackie have than Adam?"
**Forward Answer**: "Jackie has 2 more apples than Adam."
**Backward Verification**: "If Jackie has X apples, Adam has 8 apples, and Jackie has 2 more apples than Adam, what is X?"
**Verification Answer**: "X = 8 + 2 = 10 apples"
**Consistency Check**: ✓ Original condition confirmed
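The consistency check at the end is ordinary arithmetic, which the sketch below makes explicit; the numbers come from the apples example, and in a real pipeline the backward answer would come from a second model call rather than being computed locally:

```python
def backward_verify(forward_answer: int, known_adam: int, original_jackie: int) -> bool:
    """Reconstruct the original condition from the answer and compare with the given value."""
    reconstructed_jackie = known_adam + forward_answer
    return reconstructed_jackie == original_jackie

# Forward answer: Jackie has 2 more apples than Adam.
print(backward_verify(forward_answer=2, known_adam=8, original_jackie=10))  # True
```

If the reconstruction disagrees with the given condition, the forward answer is flagged for regeneration.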
Code Generation with Self-Critique
**Round 1 Prompt**: "Write a Python function to calculate factorial."
**Round 1 Output**:
```python
def factorial(n):
    return 1 if n == 0 else n * factorial(n-1)
```

**Round 2 Prompt**: "Review this function. Are there any edge cases it doesn't handle correctly?"
**Round 2 Output**: "Yes, this function doesn't handle negative numbers. It would cause infinite recursion. I should add a check for n < 0."

**Improved Version**:
```python
def factorial(n):
    if n < 0:
        raise ValueError("Factorial is not defined for negative numbers")
    return 1 if n == 0 else n * factorial(n-1)
```
Technique 4: Structured Output and Tool Integration
Directed Output Formatting
Force the model to output in specific, parseable formats for seamless integration with other systems:
"Analyze this customer feedback and output your response as a JSON object with keys 'sentiment', 'urgency_level', 'department', and 'suggested_action'."
Tool Use Simulation
Advanced prompting can make models understand when external tools are needed and generate structured requests:
**Prompt**: "What is the square root of 2024 multiplied by pi? Use a calculator if needed."
**Model Output**:
```json
{
  "tool": "calculator",
  "operation": "sqrt(2024) * pi",
  "reasoning": "This requires precise mathematical calculation beyond mental math capabilities"
}
```
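A minimal dispatcher for such a request might look like the sketch below. The `run_tool` function and the single hard-coded operation are illustrative assumptions; a production system would route on the `tool` field and evaluate `operation` with a safe expression parser rather than matching strings:

```python
import json
import math

def run_tool(request_json: str) -> float:
    """Dispatch a tool request emitted by the model; only a calculator is wired up here."""
    req = json.loads(request_json)
    if req["tool"] != "calculator":
        raise ValueError(f"unknown tool: {req['tool']}")
    # This sketch handles only the sqrt(...) * pi shape from the example above.
    if req["operation"] == "sqrt(2024) * pi":
        return math.sqrt(2024) * math.pi
    raise ValueError("unsupported operation")

model_output = ('{"tool": "calculator", "operation": "sqrt(2024) * pi", '
                '"reasoning": "needs precise math"}')
result = run_tool(model_output)
print(round(result, 2))  # ≈ 141.34
```

The tool's numeric result is then fed back into the conversation so the model can phrase the final answer.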
Comprehensive Case Study: Building an Advanced AI Assistant
Let’s combine all techniques to create a sophisticated policy analysis assistant:
The Multi-Technique Prompt
**Role**: You are a senior policy analyst for a technology think tank.
**Task**: Analyze the following tech policy question using our structured approach.
**Response Framework**:
a) **Executive Summary** (2-3 sentences)
b) **Key Points** (bulleted list)
c) **Underlying Assumptions** (what premises does this analysis rest on?)
d) **Follow-up Research Questions** (3 specific queries for deeper investigation)
**Self-Review Process**:
After your analysis, please:
1. Review for potential bias or missing perspectives
2. Verify factual claims against your knowledge
3. Suggest specific search queries to validate recent developments
**Question**: [Insert complex policy question here]
Why This Works
- Role-playing establishes expertise and perspective
- Structured framework ensures comprehensive coverage
- Self-review catches errors and biases
- Tool integration (search suggestions) extends capabilities
Advanced Combination Strategies
Sequential Technique Application
**Step 1**: Use few-shot learning to establish format
**Step 2**: Apply complex CoT for reasoning
**Step 3**: Implement self-consistency for verification
**Step 4**: Add self-critique for final review
Parallel Technique Integration
**Simultaneous Application**:
- Few-shot examples within CoT demonstrations
- Self-critique questions embedded in the reasoning process
- Tool use suggestions integrated throughout
Common Pitfalls and Avoidance Strategies
The Context Window Trap
**Problem**: Too many examples can exceed the model's context window.
**Solution**: Use representative examples, not exhaustive ones.
The Overthinking Paradox
**Problem**: Excessive self-critique leads to analysis paralysis.
**Solution**: Limit critique rounds to 1-2 iterations.
The Consistency Illusion
**Problem**: Self-consistency might reinforce systematic errors.
**Solution**: Combine with external validation when possible.
Iterative Optimization Framework
The REFINE Cycle
- **R**un initial prompt with few-shot examples
- **E**valuate output quality and consistency
- **F**ine-tune examples and instructions
- **I**mplement self-critique mechanisms
- **N**avigate edge cases and errors
- **E**nhance with additional techniques as needed
Tool Recommendations for Advanced Prompting
Development Tools
- Prompt versioning: Track iterations and performance
- A/B testing: Compare technique combinations
- Output analysis: Measure consistency and accuracy
Evaluation Metrics
- Accuracy: Correctness of final answers
- Consistency: Agreement across multiple runs
- Efficiency: Token usage vs. quality trade-offs
- Robustness: Performance on edge cases
From Techniques to Thinking Patterns
Advanced prompt engineering transforms AI from a simple question-answering machine into a sophisticated reasoning partner capable of:
- Complex workflow execution
- Self-monitoring and error correction
- Adaptive problem-solving strategies
- Integration with external tools and systems
The Responsibility Factor
With great power comes great responsibility. These techniques can also generate:
- More sophisticated misinformation
- Harder-to-detect reasoning errors
- Complex biases embedded in multi-step processes
Always apply ethical guidelines and validation procedures when deploying advanced techniques in production systems.
Looking Ahead: Real-World Applications
We’ve now mastered all the “weapons” in the prompt engineering arsenal. In our next episode, we’ll enter the practical battlefield, diving deep into specific domains like programming, writing, marketing, and research to see how experts combine these techniques to solve real-world problems.
The journey from novice to expert is complete—now it’s time to apply these skills where they matter most.
References
- Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., & Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv preprint arXiv:2203.11171.
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Chi, E., Le, Q., & Zhou, D. (2022). Chain of Thought Prompting Elicits Reasoning in Large Language Models. arXiv preprint arXiv:2201.11903.
- Weng, Y., Zhu, M., Xia, F., Li, B., He, S., Liu, K., & Zhao, J. (2022). Large Language Models are Better Reasoners with Self-Verification. arXiv preprint arXiv:2212.09561.
- Huang, J., Gu, S. S., Hou, L., Wu, Y., Wang, X., Yu, H., & Han, J. (2022). Large Language Models Can Self-Improve. arXiv preprint arXiv:2210.11610.
- Madaan, A., Tandon, N., Gupta, P., Hallinan, S., Gao, L., Wiegreffe, S., … & Clark, P. (2023). Self-Refine: Iterative Refinement with Self-Feedback. arXiv preprint arXiv:2303.17651.