Heavy focus on multi-agent LLM design and prompt/context engineering. This is explicitly not a vibe-coding role; it expects production-grade engineering and evals infrastructure.
About the Role
Senior AI Engineer (contract, part-time) to design and implement the core multi-agent LLM architecture and agentic logic for a stateful, multi-session generative AI simulation engine. Primary goals include architecting context/state management to reduce hallucination and latency, building evals infrastructure for model/version testing, and integrating AI outputs with front-end and back-end systems.
Job Description
Role
Senior AI Engineer (contract, part-time) responsible for designing and implementing the core intelligence for a stateful, multi-agent generative AI simulation engine. The role focuses on multi-agent architecture, long-term session memory, context engineering to prevent hallucination, evals infrastructure for testing model behavior, and secure moderation/prompt-injection mitigation. Collaboration with backend and frontend teams is required to integrate AI outputs into the product.
Key Responsibilities
- Design and manage multi-agent workflows and orchestrations for onboarding, dynamic scenario generation, and real-time simulation loops.
- Architect state and context management for thousands of persistent user sessions to minimize context rot and latency.
- Write, test, and version-control robust system instructions and prompts for standalone LLMs and multi-agent workflows.
- Design and own an evaluations (evals) framework to programmatically score prompt performance and end-to-end agent lifecycles; establish CI/CD-style testing loops for model behavior.
- Design and implement moderation and security pipelines to handle harmful inputs and mitigate prompt injection and undesired outputs.
- Integrate AI outputs with backend and frontend systems to ensure reliable data flow to the client.
Requirements
- Strong software engineering foundation and the ability to write production-grade code; candidates cannot rely solely on AI coding assistants.
- 1–3 years of hands-on experience building and deploying LLM-backed applications in production.
- Strong proficiency in Python and TypeScript; familiarity with modern reactive frontend frameworks (preferably Angular v21).
- Hands-on experience with agent frameworks/agent harnesses (examples: LangGraph, CrewAI, OpenAI Agents SDK, Claude Agent SDK, Google ADK); preference for Google ADK experience.
- Deep understanding of LLM context/latency optimization, token usage, caching strategies, and context engineering.
- Experience designing and applying guardrails, moderation strategies, and prompt-injection mitigations for LLMs.
Ideally
- Comfortable defining best practices in an emerging space and driving precision in prompt/system instruction language.
- Strong focus on latency optimization and measurable performance of model-driven workflows.
Core Tech Skills (explicitly mentioned)
Python, TypeScript, Angular v21, LangGraph, CrewAI, OpenAI Agents SDK, Claude Agent SDK, Google ADK, LLM APIs, AWS, GCP, Django, JavaScript, Java, Node, React, Swift, Objective-C, Unity, Android, iOS, AR/VR, Wagtail
Compensation & Logistics
- Part-time contract: 20 hours/week from 2024-04-13 to 2024-05-29 with a strong likelihood of extension through October at full-time hours.
- Pay range: $61–$78 per hour (US3 pay band). The company uses regional pay bands (US1/US2/US3) that adjust pay by location.