Automation Agent Engineer (SDET + RAG/Prompt Engineering)
About the Role
We're looking for a forward-thinking Automation Agent Engineer (part seasoned SDET, part prompt engineer) who’s excited to bridge the gap between classic QA automation and AI-assisted workflows. In this hybrid role, you’ll help develop and refine a mini-RAG (retrieval-augmented generation) system that transforms manual test cases into LLM-generated test skeletons.
You'll shape how context is retrieved, how prompts are designed, and how generated test code aligns with standards for our Java-based automation stack (RestAssured, Selenium, Cucumber, Screenplay). This is a hands-on, highly collaborative role where your decisions will directly affect the quality, consistency, and scalability of our AI-enabled testing pipeline.
What You’ll Do
- Design and refine prompt templates and few-shot examples to guide the LLM toward generating high-quality test code.
- Collaborate on RAG logic, including chunking strategies, metadata filters, and reranker tuning for optimal context retrieval.
- Help convert manual test cases into structured test skeletons, normalizing them to project standards.
- Build evaluation loops to validate generated code—ensuring compliance with naming, format, assertions, lint rules, and compile/run requirements.
- Review and merge auto-generated code with proper guardrails, documentation, and handoff to the SDET team.
- Work with the AI and QA teams to optimize token usage and latency while maintaining test fidelity.
What You Bring
- 5+ years of experience in QA automation with strong skills in Java, RestAssured, Selenium, Cucumber, and ideally the Screenplay pattern.
- Hands-on knowledge of prompt engineering, including few-shot prompting and structured output design.
- Working understanding of RAG concepts (embeddings, retrievers, rerankers) and how they apply to software development workflows.
- Familiarity with CI pipelines (Jenkins), Git workflows, and code review best practices.
It is an asset if you have:
- Experience with LangChain, ChromaDB, or other local retrieval systems.
- Understanding of LLMs such as Claude Sonnet 4 and how to manage token/cost trade-offs.
- Ability to build small tools such as static analyzers, linters, or rule-based code evaluators.
Locations: Multiple locations
Remote status: Fully Remote
About Perform
Since 2005, Perform's engineers have been helping companies scale their apps and their teams. We were near-shoring before it was even a term and have worked with hundreds of clients along the way.