RAG Architect

We are seeking a Senior AI Engineer to lead the design and implementation of an end-to-end Retrieval-Augmented Generation (RAG) architecture. This role will drive the ingestion of GitHub repositories, Confluence pages, Qtest artifacts, PRDs, and script libraries to power autoscripting, onboarding search, and long-term knowledge reuse. As a technical leader, you will set the strategic direction, select cutting-edge models, and mentor AI and Automation Agent Engineers to deliver a scalable, secure, and innovative platform.

Key Responsibilities

Architect ingestion and retrieval layers, selecting loaders, chunking strategies (AST-aware for Java), embeddings (e.g., BGE-Code, mxbai), vector stores (e.g., Chroma), cross-encoder rerankers, and LangChain router chains.
Design CI orchestration, including daily Jenkins jobs for delta detection, image captioning (e.g., Qwen2-VL, LLaVA), cost/latency guardrails, and rollback strategies.
Establish model and prompt governance, including prompt templates, few-shot libraries, safety filters, and evaluation rubrics (faithfulness, coverage, compile success).
Lead architecture for a UI onboarding tool, deciding on hosting (FastAPI + React or Streamlit MVP), SSO/auth flows, token streaming, and feedback mechanisms for continuous learning.
Oversee data security and compliance: embed privacy policies, source citations, and audit logs, and ensure Confluence/Qtest credentials are managed in Secrets Manager.
Provide technical leadership by reviewing PRs, setting code quality standards, and conducting architecture workshops for AI and Automation Agent Engineers.

Must-Have Qualifications

6–8 years of experience building data or ML platforms, with at least 2 years deploying LLM/RAG systems in production.
Deep expertise in LangChain, at least one vector store (ChromaDB, Qdrant, or pgvector), and cross-encoder rerankers.
Strong proficiency in Python (FastAPI or Flask) and the ability to analyze Java codebases to determine chunking boundaries.
Proven experience designing CI/CD pipelines (Jenkins, GitHub Actions) with delta builds and artifact promotion.
Hands-on experience managing OpenAI/Anthropic API keys or self-hosting large models.
Demonstrated expertise in security and compliance, including PII protection, role-based access, and secret rotation.
