Dec 07, 2025
After spending the last two years integrating AI into my daily testing workflows, I’ve learned one undeniable truth: AI is not a magic wand; it is a high-speed intern.
If you treat a Large Language Model (LLM) like a senior engineer who knows your entire codebase by heart, you will be disappointed. But if you treat it like a brilliant but junior assistant who needs clear guidance and supervision, it will 10x your productivity.
Through trial, error, and thousands of prompts, I’ve distilled my experience into five core best practices. Here is how to get the most out of AI for Quality Assurance.
1. Write Clear Instructions
Why it's important:
AI models operate on the principle of "Garbage In, Garbage Out." If your prompt is vague, the model will fill in the gaps with assumptions—and those assumptions are often wrong. To get usable test cases or code, you must be explicit about the scope, the format, the tools, and the constraints.
Example:
- ❌ Bad Prompt: "Write a test for the login page."
- Result: The AI gives you a generic test using a random framework you don't use, missing your specific business logic.
- ✅ Good Prompt: "Act as a Senior QA Automation Engineer. Write a Playwright test script in TypeScript for the login page. Use the Page Object Model pattern. The test should verify three scenarios: successful login, invalid password, and a locked-out user. Do not mock the API; use the UI inputs."
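For comparison, here is a rough sketch of the kind of script a prompt like that should produce. To keep this post's code in one language, it is shown with Playwright's Python binding and the page fixture from the pytest-playwright plugin rather than TypeScript; the URL, selectors, and error messages are assumptions, not your app's real locators.

```python
# Sketch only: a Page Object Model login test, assuming pytest-playwright.
from playwright.sync_api import Page, expect


class LoginPage:
    """Page Object for the login screen (selectors are placeholders)."""

    def __init__(self, page: Page):
        self.page = page
        self.username = page.locator("#username")
        self.password = page.locator("#password")
        self.submit = page.locator("button[type='submit']")
        self.error = page.locator(".error-message")

    def open(self):
        self.page.goto("https://example.com/login")

    def login(self, user: str, pwd: str):
        self.username.fill(user)
        self.password.fill(pwd)
        self.submit.click()


def test_successful_login(page: Page):
    login = LoginPage(page)
    login.open()
    login.login("standard_user", "Secret123!")
    expect(page).to_have_url("https://example.com/dashboard")


def test_invalid_password_shows_error(page: Page):
    login = LoginPage(page)
    login.open()
    login.login("standard_user", "wrong_password")
    expect(login.error).to_contain_text("Invalid credentials")


def test_locked_out_user_is_blocked(page: Page):
    login = LoginPage(page)
    login.open()
    login.login("locked_out_user", "Secret123!")
    expect(login.error).to_contain_text("locked")
```

Notice how every constraint in the prompt (framework, pattern, three scenarios, real UI inputs) maps directly to something in the output. That is what specificity buys you.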
2. Split Complex Tasks into Simpler Subtasks
Why it's important:
LLMs have a limited "context window" and can lose the thread of logic if asked to do too much at once. If you ask an AI to "Build an entire automation framework from scratch," it will likely hallucinate or provide a shallow, broken skeleton. Breaking tasks down ensures high-quality output for every component.
Example:
Instead of asking for the whole framework at once, try this sequence (a sketch of what step 2 might return follows the list):
- "Generate a folder structure for a Pytest framework."
- "Now, write the conftest.py file to handle browser setup and teardown."
- "Create a BasePage class with common methods like click and enter_text."
- "Finally, write the actual test script for the 'Add to Cart' feature inheriting from BasePage."
3. Provide References
Why it's important:
The AI does not know your private documentation, your specific API endpoints, or your user stories unless you provide them. "Grounding" the model by pasting relevant context reduces hallucinations and ensures the output is actually applicable to your project.
Example:
- Prompt: "I need to write API tests for the User Creation endpoint. Here is the Swagger/OpenAPI definition for that specific endpoint: [PASTE JSON/YAML SNIPPET HERE]. Based on this definition, generate 5 negative test cases checking for required fields and data type violations."
4. Give the Model Time to Think
Why it's important:
This is often called "Chain-of-Thought" prompting. If you ask for the final answer immediately, the model might guess. If you ask it to plan its approach first, it "reasons" through the logic, resulting in significantly higher accuracy.
Example:
- Prompt: "I need to test a calculation engine that determines insurance premiums based on age and driving history. Before writing the test cases, list out the boundary values and equivalence partitions you intend to use, and explain why you chose them. Once I approve the logic, you will write the test steps."
5. Test Changes Systematically (Git & Verify)
Why it's important:
AI-generated code is prone to subtle bugs, deprecated library usage, or logic errors. If you generate five different scripts and paste them all into your project at once, debugging will be a nightmare. You must treat AI code as untrusted until verified.
The Workflow:
- Generate: Ask the AI for a specific function or test.
- Verify: Run the code immediately. Does it compile? Does the test pass?
- Commit: If it works, commit it to Git.
- Repeat: Move to the next task.
Example:
"I am building a utility file. I will ask you to add functions one by one.
- Step 1: Generate a function to read data from a CSV.
- (I run the code. It works. I commit.)
- Step 2: Now, generate a function to parse that CSV data into a JSON object.
- (I run the code. It fails. I revert the changes in my IDE to the previous clean state and refine my prompt.)
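For instance, Step 1 might come back as something like this; the function name, delimiter, and encoding are assumptions:

```python
# Sketch of the Step 1 utility: read a CSV into a list of row dicts.
import csv
from pathlib import Path


def read_csv(path: str) -> list[dict[str, str]]:
    """Read a CSV file and return one dict per row, keyed by the header."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Run it against a real file (or a quick throwaway test), and only commit once it behaves as expected; then move on to Step 2.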
By committing only working code, you ensure you always have a 'safe save point' to return to if the AI leads you down a rabbit hole.
Final Thoughts
AI is a powerful accelerator for QA, but it requires a pilot. By being specific, breaking down work, providing context, allowing for reasoning, and rigorously verifying the output via version control, you turn a chaotic chatbot into your most valuable testing asset.