AI
Prompt Engineering
AI/ML Core v1.0.0
Prompt Engineering
Overview
Effective prompt engineering is critical for reliable LLM outputs. This skill covers prompt design patterns, templating, few-shot learning, chain-of-thought reasoning, and output formatting.
Key Concepts
Prompt Structure
┌─────────────────────────────────────────────────────────────┐
│ Prompt Components │
├─────────────────────────────────────────────────────────────┤
│ │
│ System Prompt (Role & Behavior): │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ "You are a helpful assistant that..." │ │
│ │ • Define persona and expertise │ │
│ │ • Set behavioral constraints │ │
│ │ • Specify output format │ │
│ │ • Establish safety guidelines │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ Context (Background): ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ • Relevant documents or data │ │
│ │ • Previous conversation history │ │
│ │ • User preferences or settings │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ Few-Shot Examples: ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ User: <example input 1> │ │
│ │ Assistant: <example output 1> │ │
│ │ User: <example input 2> │ │
│ │ Assistant: <example output 2> │ │
│ └─────────────────────────────────────────────────────┘ │
│ │ │
│ User Input: ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ The actual user query or instruction │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ Techniques: │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ • Chain-of-Thought: "Think step by step" │ │
│ │ • Self-Consistency: Multiple reasoning paths │ │
│ │ • Tree-of-Thought: Explore multiple branches │ │
│ │ • ReAct: Reasoning + Acting interleaved │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Best Practices
1. Be Specific and Explicit
Ambiguity leads to inconsistent outputs.
2. Use Structured Output Formats
Request JSON, XML, or markdown for parsing.
3. Include Examples
Few-shot examples dramatically improve accuracy.
4. Break Down Complex Tasks
Use chain-of-thought for reasoning tasks.
5. Version and Test Prompts
Treat prompts as code with testing.
Code Examples
Example 1: Prompt Template System
@Service
public class PromptTemplateService {

    /** Registered templates keyed by template name; concurrent so reads are safe during reloads. */
    private final Map<String, PromptTemplate> templates = new ConcurrentHashMap<>();

    private final Mustache.Compiler mustache = Mustache.compiler();

    /** Registers the built-in templates at startup. In a real system these would come from files. */
    @PostConstruct
    public void loadTemplates() {
        // Load templates from files
        templates.put("code-review", PromptTemplate.builder()
            .name("code-review")
            .version("1.2.0")
            .systemPrompt("""
            You are an expert code reviewer with deep knowledge of {{language}}.
            Review code for:
            - Bugs and potential issues
            - Security vulnerabilities
            - Performance problems
            - Code style and best practices
            Provide actionable feedback with specific line references.
            Format your response as JSON matching this schema:
            {
            "summary": "Brief overall assessment",
            "issues": [
            {
            "severity": "critical|high|medium|low",
            "line": <number>,
            "description": "Issue description",
            "suggestion": "How to fix"
            }
            ],
            "positives": ["Good practices observed"]
            }
            """)
            .userTemplate("""
            Please review this {{language}} code:
            ```{{language}}
            {{code}}
            ```
            {{#focusAreas}}
            Focus particularly on: {{focusAreas}}
            {{/focusAreas}}
            """)
            .build());

        templates.put("summarize", PromptTemplate.builder()
            .name("summarize")
            .version("1.0.0")
            .systemPrompt("""
            You are a skilled summarizer. Create concise, accurate summaries
            that capture the key points while preserving important details.
            Adjust length based on the target audience and purpose.
            """)
            .userTemplate("""
            Summarize the following text in {{style}} style:
            Target length: {{length}}
            Audience: {{audience}}
            Text to summarize:
            {{text}}
            """)
            .build());
    }

    /**
     * Looks up a registered template by name.
     *
     * @throws TemplateNotFoundException if no template is registered under {@code name}
     */
    public PromptTemplate getTemplate(String name) {
        PromptTemplate template = templates.get(name);
        if (template == null) {
            throw new TemplateNotFoundException("Prompt template not found: " + name);
        }
        return template;
    }

    /**
     * Renders the named template's system and user prompts with the supplied variables.
     * Default variables declared on the template are applied first and may be overridden
     * by the caller; declared required variables are validated before rendering.
     *
     * @throws TemplateNotFoundException if the template does not exist
     * @throws IllegalArgumentException  if a declared required variable is missing
     */
    public CompiledPrompt compile(String templateName, Map<String, Object> variables) {
        PromptTemplate template = getTemplate(templateName);
        // Honor the template's declared defaults/required variables; previously these
        // PromptTemplate fields were silently ignored.
        Map<String, Object> resolved = withDefaults(template, variables);
        requireVariables(template, resolved);
        String systemPrompt = mustache.compile(template.getSystemPrompt())
            .execute(resolved);
        String userPrompt = mustache.compile(template.getUserTemplate())
            .execute(resolved);
        return new CompiledPrompt(systemPrompt, userPrompt, template.getVersion());
    }

    /** Merges the template's default variables (if any) under the caller-supplied ones. */
    private static Map<String, Object> withDefaults(PromptTemplate template,
                                                    Map<String, Object> variables) {
        Map<String, Object> merged = new java.util.HashMap<>();
        if (template.getDefaultVariables() != null) {
            merged.putAll(template.getDefaultVariables());
        }
        merged.putAll(variables); // caller-supplied values win over defaults
        return merged;
    }

    /** Fails fast when a template declares required variables the caller did not provide. */
    private static void requireVariables(PromptTemplate template, Map<String, Object> variables) {
        List<String> required = template.getRequiredVariables();
        if (required == null) {
            return;
        }
        for (String name : required) {
            if (!variables.containsKey(name)) {
                throw new IllegalArgumentException(
                    "Missing required variable '" + name + "' for template " + template.getName());
            }
        }
    }
}
// A named, versioned prompt definition rendered by PromptTemplateService.
// NOTE(review): Lombok @Data/@Builder generate accessors and the builder from the
// field declarations below, so the declared fields are part of the public API.
@Data
@Builder
class PromptTemplate {
// Registry key, e.g. "code-review"; also used in error messages.
private String name;
// Semantic version of the prompt text, carried into CompiledPrompt for traceability.
private String version;
// Mustache source for the system (role/behavior) message.
private String systemPrompt;
// Mustache source for the user message; may reference {{variables}}.
private String userTemplate;
// Names of variables that must be supplied before rendering.
private List<String> requiredVariables;
// Fallback values for optional template variables.
private Map<String, Object> defaultVariables;
}
// Immutable result of rendering a template: fully substituted system and user prompts,
// plus the source template's version for logging/regression tracking.
record CompiledPrompt(String systemPrompt, String userPrompt, String version) {}
Example 2: Few-Shot Learning
@Service
public class FewShotPromptBuilder {

    /** Curated input/output example pairs keyed by task name. */
    private final Map<String, List<Example>> exampleLibrary = new HashMap<>();

    public FewShotPromptBuilder() {
        // Classification examples
        exampleLibrary.put("sentiment-analysis", List.of(
            new Example(
                "This product exceeded all my expectations! Absolutely love it.",
                "{\"sentiment\": \"positive\", \"confidence\": 0.95, \"aspects\": [\"quality\"]}"
            ),
            new Example(
                "Terrible customer service. Never buying from them again.",
                "{\"sentiment\": \"negative\", \"confidence\": 0.90, \"aspects\": [\"service\"]}"
            ),
            new Example(
                "It's okay. Works as expected but nothing special.",
                "{\"sentiment\": \"neutral\", \"confidence\": 0.70, \"aspects\": [\"functionality\"]}"
            )
        ));
        // Code generation examples
        exampleLibrary.put("sql-generation", List.of(
            new Example(
                "Find all users who signed up in the last 30 days",
                "SELECT * FROM users WHERE created_at >= NOW() - INTERVAL '30 days'"
            ),
            new Example(
                "Count orders by status",
                "SELECT status, COUNT(*) as count FROM orders GROUP BY status"
            ),
            new Example(
                "Get top 10 customers by total purchase amount",
                """
                SELECT u.id, u.name, SUM(o.total) as total_spent
                FROM users u
                JOIN orders o ON u.id = o.user_id
                GROUP BY u.id, u.name
                ORDER BY total_spent DESC
                LIMIT 10
                """
            )
        ));
    }

    /**
     * Builds a message list of system prompt + up to {@code numExamples} few-shot
     * user/assistant pairs + the actual user input.
     *
     * @throws IllegalArgumentException if {@code task} has no registered examples
     */
    public List<Message> buildFewShotPrompt(String task, String userInput, int numExamples) {
        List<Example> examples = exampleLibrary.get(task);
        if (examples == null) {
            throw new IllegalArgumentException("Unknown task: " + task);
        }
        List<Message> messages = new ArrayList<>();
        // Add system prompt
        messages.add(Message.system(getSystemPromptForTask(task)));
        // Add few-shot examples
        int count = Math.min(numExamples, examples.size());
        for (int i = 0; i < count; i++) {
            Example example = examples.get(i);
            messages.add(Message.user(example.input()));
            messages.add(Message.assistant(example.output()));
        }
        // Add actual user input
        messages.add(Message.user(userInput));
        return messages;
    }

    /**
     * Like {@link #buildFewShotPrompt} but selects the examples most semantically
     * similar to {@code userInput}, ranked by embedding cosine similarity.
     *
     * @throws IllegalArgumentException if {@code task} has no registered examples
     */
    public List<Message> buildDynamicFewShot(
            String task,
            String userInput,
            EmbeddingService embeddingService,
            int numExamples) {
        List<Example> examples = exampleLibrary.get(task);
        if (examples == null) {
            // Mirror buildFewShotPrompt: fail explicitly rather than NPE on stream().
            throw new IllegalArgumentException("Unknown task: " + task);
        }
        // Find most relevant examples using embeddings
        float[] queryEmbedding = embeddingService.embed(userInput);
        List<ScoredExample> scoredExamples = examples.stream()
            .map(ex -> {
                float[] exEmbedding = embeddingService.embed(ex.input());
                double similarity = cosineSimilarity(queryEmbedding, exEmbedding);
                return new ScoredExample(ex, similarity);
            })
            .sorted((a, b) -> Double.compare(b.score(), a.score()))
            .limit(numExamples)
            .toList();
        List<Message> messages = new ArrayList<>();
        messages.add(Message.system(getSystemPromptForTask(task)));
        for (ScoredExample scored : scoredExamples) {
            messages.add(Message.user(scored.example().input()));
            messages.add(Message.assistant(scored.example().output()));
        }
        messages.add(Message.user(userInput));
        return messages;
    }

    // NOTE(review): this was referenced but never defined in the original snippet —
    // minimal implementation so the example compiles; tailor the prompts per task.
    private String getSystemPromptForTask(String task) {
        return switch (task) {
            case "sentiment-analysis" ->
                "Classify the sentiment of the user's text. Respond with JSON containing "
                    + "sentiment, confidence, and aspects, matching the examples.";
            case "sql-generation" ->
                "Translate the user's request into a single SQL query. Respond with SQL only.";
            default -> "You are a helpful assistant.";
        };
    }

    /**
     * Cosine similarity of two equal-length vectors; returns 0.0 when either vector
     * has zero magnitude (avoids division by zero).
     */
    private static double cosineSimilarity(float[] a, float[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }

    record Example(String input, String output) {}
    record ScoredExample(Example example, double score) {}
}
Example 3: Chain-of-Thought Prompting
@Service
public class ChainOfThoughtService {

    // Reluctant matches inside DOTALL so multi-line tag bodies are captured.
    private static final java.util.regex.Pattern THINKING_TAG =
        java.util.regex.Pattern.compile("<thinking>(.*?)</thinking>", java.util.regex.Pattern.DOTALL);
    private static final java.util.regex.Pattern ANSWER_TAG =
        java.util.regex.Pattern.compile("<answer>(.*?)</answer>", java.util.regex.Pattern.DOTALL);

    private final LlmClient llmClient;

    // Constructor injection: the final field was never assigned in the original,
    // which does not compile in plain Java.
    public ChainOfThoughtService(LlmClient llmClient) {
        this.llmClient = llmClient;
    }

    /**
     * Solves {@code problem} with a single chain-of-thought completion and parses
     * the tagged reasoning/answer out of the response.
     */
    public ReasoningResult solveWithReasoning(String problem) {
        String systemPrompt = """
            You are a logical problem solver. When given a problem:
            1. Break it down into smaller steps
            2. Show your reasoning for each step
            3. Arrive at a final answer
            Format your response as:
            <thinking>
            Step 1: [First step of reasoning]
            Step 2: [Second step of reasoning]
            ...
            </thinking>
            <answer>
            [Your final answer]
            </answer>
            """;
        String userPrompt = """
            Solve this problem step by step:
            %s
            Think through this carefully before giving your final answer.
            """.formatted(problem);
        CompletionResponse response = llmClient.complete(
            CompletionRequest.builder()
                .model("gpt-4")
                .messages(List.of(
                    Message.system(systemPrompt),
                    Message.user(userPrompt)
                ))
                .temperature(0.1) // Low temperature for reasoning
                .build()
        );
        return parseReasoningResponse(response.getContent());
    }

    /**
     * Self-consistency: Run multiple times and aggregate
     */
    public ReasoningResult solveWithSelfConsistency(String problem, int numPaths) {
        // NOTE(review): supplyAsync with no executor runs on ForkJoinPool.commonPool();
        // LLM calls block, so consider passing a dedicated executor for real workloads.
        List<CompletableFuture<ReasoningResult>> futures = IntStream.range(0, numPaths)
            .mapToObj(i -> CompletableFuture.supplyAsync(() ->
                solveWithReasoning(problem)))
            .toList();
        List<ReasoningResult> results = futures.stream()
            .map(CompletableFuture::join)
            .toList();
        // Find majority answer
        Map<String, Long> answerCounts = results.stream()
            .collect(Collectors.groupingBy(
                ReasoningResult::answer,
                Collectors.counting()
            ));
        String majorityAnswer = answerCounts.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElseThrow();
        // Confidence = fraction of independent paths that agreed with the majority.
        double confidence = (double) answerCounts.get(majorityAnswer) / numPaths;
        // Get best reasoning for majority answer
        String bestReasoning = results.stream()
            .filter(r -> r.answer().equals(majorityAnswer))
            .findFirst()
            .map(ReasoningResult::reasoning)
            .orElse("");
        return new ReasoningResult(majorityAnswer, bestReasoning, confidence);
    }

    /**
     * Extracts the &lt;thinking&gt; and &lt;answer&gt; sections from the model output.
     * Falls back to the raw content as the answer when the model ignored the tag
     * format. Confidence is nominal (1.0) for a single reasoning path.
     */
    private ReasoningResult parseReasoningResponse(String content) {
        String reasoning = firstGroup(THINKING_TAG, content);
        String answer = firstGroup(ANSWER_TAG, content);
        if (answer.isEmpty()) {
            answer = content.strip();
        }
        return new ReasoningResult(answer, reasoning, 1.0);
    }

    /** Returns the first capture group of {@code pattern} in {@code text}, or "". */
    private static String firstGroup(java.util.regex.Pattern pattern, String text) {
        java.util.regex.Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.group(1).strip() : "";
    }

    record ReasoningResult(String answer, String reasoning, double confidence) {}
}
Example 4: Structured Output Parsing
@Service
public class StructuredOutputService {

    private final LlmClient llmClient;
    private final ObjectMapper objectMapper;

    // Constructor injection: both final fields were never assigned in the original,
    // which does not compile in plain Java.
    public StructuredOutputService(LlmClient llmClient, ObjectMapper objectMapper) {
        this.llmClient = llmClient;
        this.objectMapper = objectMapper;
    }

    /**
     * Extract structured data with JSON schema
     *
     * @throws StructuredOutputException if the model's response cannot be bound to
     *         {@code targetClass}
     */
    public <T> T extractStructured(String text, Class<T> targetClass) {
        JsonSchema schema = JsonSchemaGenerator.generate(targetClass);
        String systemPrompt = """
            You are a data extraction assistant. Extract information from the
            provided text and return it as valid JSON matching this schema:
            %s
            Important:
            - Return ONLY valid JSON, no explanations
            - Use null for missing optional fields
            - Follow the exact field names from the schema
            """.formatted(schema.toJsonString());
        String userPrompt = "Extract data from this text:\n\n" + text;
        CompletionResponse response = llmClient.complete(
            CompletionRequest.builder()
                .model("gpt-4")
                .messages(List.of(
                    Message.system(systemPrompt),
                    Message.user(userPrompt)
                ))
                .temperature(0.0) // deterministic output for extraction
                .responseFormat(ResponseFormat.JSON)
                .build()
        );
        try {
            return objectMapper.readValue(response.getContent(), targetClass);
        } catch (JsonProcessingException e) {
            // Preserve the parse failure as the cause for diagnosis.
            throw new StructuredOutputException("Failed to parse LLM response as " +
                targetClass.getSimpleName(), e);
        }
    }

    /**
     * With retry and validation. Each retry feeds the previous failure back into the
     * prompt so the model can self-correct.
     *
     * @throws StructuredOutputException after {@code maxRetries} failed attempts; the
     *         message carries the last failure reason and the last exception (if any)
     *         is attached as the cause
     */
    public <T> T extractWithValidation(String text, Class<T> targetClass,
                                       Validator<T> validator, int maxRetries) {
        String lastError = null;
        Exception lastException = null;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            String prompt = buildExtractionPrompt(text, targetClass, lastError);
            try {
                CompletionResponse response = llmClient.complete(
                    CompletionRequest.builder()
                        .model("gpt-4")
                        .messages(List.of(Message.user(prompt)))
                        .temperature(0.0)
                        .responseFormat(ResponseFormat.JSON)
                        .build()
                );
                T result = objectMapper.readValue(response.getContent(), targetClass);
                // Validate
                ValidationResult validation = validator.validate(result);
                if (validation.isValid()) {
                    return result;
                }
                lastError = "Validation failed: " + validation.getErrors();
            } catch (Exception e) {
                // Boundary catch: both parse failures and client errors feed the retry loop.
                lastError = "Parse error: " + e.getMessage();
                lastException = e;
            }
        }
        // Previously the last failure reason and cause were silently discarded here.
        throw new StructuredOutputException(
            "Failed to extract valid structured output after " + maxRetries + " attempts"
                + (lastError != null ? "; last error: " + lastError : ""),
            lastException
        );
    }

    /**
     * Builds the retry-aware extraction prompt.
     * NOTE(review): this variant only names the target class rather than embedding the
     * full JSON schema as extractStructured does — consider unifying; TODO confirm intent.
     */
    private <T> String buildExtractionPrompt(String text, Class<T> targetClass,
                                             String previousError) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Extract information as JSON matching: ");
        prompt.append(targetClass.getSimpleName());
        prompt.append("\n\nText:\n").append(text);
        if (previousError != null) {
            prompt.append("\n\nPrevious attempt failed: ").append(previousError);
            prompt.append("\nPlease fix the issues and try again.");
        }
        return prompt.toString();
    }
}
// Example extraction target
// Target shape for StructuredOutputService: Jackson binds the LLM's JSON response here.
@Data
class OrderExtraction {
// NOTE(review): @JsonProperty(required = true) is only enforced by Jackson for
// creator (constructor) binding, not setter binding — verify it actually rejects
// missing fields with this Lombok setup.
@JsonProperty(required = true)
private String customerName;
@JsonProperty(required = true)
private List<OrderItemExtraction> items;
// Optional fields: the extraction prompt instructs the model to emit null when absent.
private String deliveryAddress;
private LocalDate requestedDeliveryDate;
}
// One line item within an extracted order (see OrderExtraction.items).
@Data
class OrderItemExtraction {
private String productName;
private int quantity;
// BigDecimal for money — never double.
private BigDecimal unitPrice;
}
Example 5: Prompt Testing
@SpringBootTest
class PromptTests {

    @Autowired
    private PromptTemplateService promptService;

    @Autowired
    private LlmClient llmClient;

    // Was referenced but never declared in the original — the tests did not compile.
    private final ObjectMapper objectMapper = new ObjectMapper();

    /**
     * Test prompt produces expected structure
     */
    @Test
    void codeReviewPrompt_producesValidJson() throws Exception {
        CompiledPrompt prompt = promptService.compile("code-review", Map.of(
            "language", "java",
            "code", """
            public void process(String input) {
            System.out.println(input);
            }
            """
        ));
        CompletionResponse response = llmClient.complete(
            CompletionRequest.builder()
                .model("gpt-4")
                .messages(List.of(
                    Message.system(prompt.systemPrompt()),
                    Message.user(prompt.userPrompt())
                ))
                .temperature(0.0) // deterministic output for a stable assertion
                .build()
        );
        // Verify JSON structure
        CodeReviewResponse review = objectMapper.readValue(
            response.getContent(), CodeReviewResponse.class);
        assertThat(review.getSummary()).isNotBlank();
        assertThat(review.getIssues()).isNotNull();
    }

    /**
     * Regression test for prompt behavior
     */
    @ParameterizedTest
    @CsvSource({
        "This is amazing!, positive",
        "Terrible experience, negative",
        "It's okay I guess, neutral"
    })
    void sentimentAnalysis_classifiesCorrectly(String input, String expectedSentiment)
            throws Exception {
        FewShotPromptBuilder builder = new FewShotPromptBuilder();
        List<Message> messages = builder.buildFewShotPrompt(
            "sentiment-analysis", input, 3);
        CompletionResponse response = llmClient.complete(
            CompletionRequest.builder()
                .model("gpt-4")
                .messages(messages)
                .temperature(0.0)
                .build()
        );
        SentimentResult result = objectMapper.readValue(
            response.getContent(), SentimentResult.class);
        assertThat(result.getSentiment()).isEqualTo(expectedSentiment);
    }

    /**
     * Evaluate prompt quality metrics
     * NOTE(review): loadTestCases/runPrompt/parse are not defined in this class —
     * presumably helpers elsewhere in the test source set; confirm they exist.
     */
    @Test
    void evaluatePromptQuality() {
        List<TestCase> testCases = loadTestCases("sentiment-test-cases.json");
        int correct = 0;
        List<PromptEvalResult> results = new ArrayList<>();
        for (TestCase tc : testCases) {
            CompletionResponse response = runPrompt(tc.input());
            SentimentResult result = parse(response);
            boolean isCorrect = result.getSentiment().equals(tc.expectedOutput());
            if (isCorrect) correct++;
            results.add(new PromptEvalResult(tc, result, isCorrect));
        }
        double accuracy = (double) correct / testCases.size();
        // Report metrics
        System.out.printf("Accuracy: %.2f%% (%d/%d)%n",
            accuracy * 100, correct, testCases.size());
        // Fail if below threshold
        assertThat(accuracy).isGreaterThan(0.90);
    }
}
Anti-Patterns
❌ Vague Instructions
# WRONG
"Summarize this"
# ✅ CORRECT
"Summarize this article in 3 bullet points,
each under 20 words, focusing on key findings"
❌ No Output Format Specification
Always specify the expected format for reliable parsing.