City: Tbilisi
Salary: from 4000 USD
Employment: Full-time, Remote work
Work experience: More than 6 years
About Us
Silverbee is a fast-moving, seed-stage startup backed by Tier 1 Silicon Valley VCs. After a successful funding round, we are growing our core engineering team to build a production-grade agentic AI platform. We are based in Tel Aviv, and our mission is to become the service delivery backbone for the next generation of marketing departments.
What you will do
This is a senior individual contributor role with full ownership over production AI behavior, not a research or experimentation-only position.
- Own the end-to-end behavior of a production agentic AI system built with LangGraph, from prompt and architecture design through evaluation, deployment, and production debugging
- Design, implement, and evolve agent architectures (planning, execution, supervision, routing, tool usage) with a clear focus on quality, determinism, latency, and cost
- Take full ownership of system prompt design and tuning, treating prompts as first-class production artifacts that are versioned, evaluated, tested, and deployed as part of the SDLC
- Continuously improve agent performance by running controlled experiments with models, prompting strategies, memory, and tooling, always tied to measurable impact
- Design, run, and evolve evaluation loops (offline evals, regression testing, and production signals) that directly gate releases and drive architectural decisions
- Turn insights from evaluations, traces, and real production usage into concrete improvements in prompts, agents, and system design
- Raise the bar on what “production-ready AI” means at Silverbee: reliability, observability, repeatability, and maintainability
- Mentor other engineers and help establish best practices for building, evaluating, and operating production-grade AI systems
What you need to know
- Strong hands-on experience owning LLM-powered systems that run in production, including diagnosing and fixing real-world failures
- Practical experience designing and operating AI evaluation frameworks, including offline evals, regression testing, and qualitative and quantitative metrics
- Deep understanding of prompt engineering as an engineering discipline: structure, constraints, iteration, debugging, and performance tuning
- Experience with agent frameworks such as LangChain / LangGraph, or equivalent agentic architectures
- Solid software engineering skills (Python and/or TypeScript), including system design, testing, and code quality
- Strong product sense, with the ability to reason clearly about trade-offs between correctness, speed, cost, and flexibility
- Proven ability to operate in ambiguous problem spaces and be accountable for outcomes, not just experimentation
- Clear communication skills and the ability to work effectively with product managers, domain experts, and other non-AI stakeholders
This vacancy is written in English because we are looking for a candidate with English proficiency at the B1-B2 level.
If you are interested in joining our project, please apply for this vacancy. We look forward to meeting you!