⚠︎ Archived vacancy
This vacancy has been moved to the archive. It may no longer be current, and the recruiter may no longer be accepting applications.

Senior AI Platform Engineer

Direct employer: Zentist (zentist.io)
USA
Senior
Information technology • Development • Node.js • Python • TypeScript • CRM • ERP • ML/AI
December 17, 2025
Remote work
Work experience: any
Employer: Zentist
Job description

About Zentist

Dental insurance plans are notoriously complicated. Policy agreements are difficult for consumers to understand and for providers to service and manage.

We started Zentist because we believe neither consumers nor providers should struggle with the complexity of understanding and utilizing dental plans. At Zentist, we are developing billing technology between dental providers and insurance companies that abstracts away all the complexity of billing and collecting insurance.

We’re located in San Francisco and we’re backed by leading investment firms, including Costanoa Ventures, Point Nine Capital, and Fika Ventures. We are building our core team at a foundational point in the company to help us create a category-defining product for the long term.

At Zentist, we promote a productive and collaborative culture. We work hard, have fun, and treat people with respect. Our benefits include a competitive salary, a culture of learning and development, and more!

Zentist is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

We work on Pacific Time (PST), approximately 4 pm to 12 am (midnight) Moscow time.

Core mission (what this person must be able to do)

  • Keep JANE (Job Agent Network Engine) reliable, secure, and compliant as a job platform (single contract, lifecycle, event model, HITL, artifacts/diagnostics).
  • Continuously and quickly add new AI agent “capabilities” via capability registration, implemented primarily as Python/Node workers that integrate different LLM providers.

Mixed requirements

AI agent capability development (must-have)

  • Strong Python and/or Node/TypeScript: able to build, test, ship, and maintain production worker services.
  • Multi-LLM provider experience: hands-on integrating at least 2 providers (e.g., OpenAI + Gemini/Anthropic/etc.), including:
    • auth/key management patterns,
    • rate limiting + retries + timeouts,
    • structured outputs / JSON mode / schema validation,
    • model selection by latency/cost/quality tradeoffs,
    • fallbacks and “degraded mode” behavior.
  • Tool-using / workflow agents: experience building agents that call tools/functions, manage multi-step flows, and produce structured outputs—not just “prompt → text”.
  • Agent quality engineering: ability to prevent regressions with:
    • golden test sets / fixtures,
    • automated eval harnesses,
    • prompt/version management,
    • rollout + rollback strategies per capability version.
  • Operationalizing AI in production: comfort with token/cost tracking, tracing LLM calls, and building guardrails around sensitive data (PII/PHI, logs, artifacts).
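
To make the "rate limiting + retries + timeouts" and "fallbacks" items above more concrete, here is a minimal Python sketch of retry-then-fallback across providers. Everything in it is an assumption for illustration: `ProviderError`, `call_with_fallback`, and the two stand-in provider functions are hypothetical names, not part of JANE or of any real provider SDK, and each provider callable is assumed to enforce its own request timeout.

```python
import time
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails (assumption)."""


def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> tuple[str, str]:
    """Try providers in order, retrying each with exponential backoff.

    Returns (provider_name, response_text). Raises ProviderError when
    every provider has failed on every attempt.
    """
    errors: list[str] = []
    for name, call in providers:
        for attempt in range(max_attempts):
            try:
                return name, call(prompt)
            except ProviderError as exc:
                errors.append(f"{name} attempt {attempt + 1}: {exc}")
                # Back off before retrying the same provider; skip the
                # wait after the final attempt and move to the next one.
                if attempt < max_attempts - 1:
                    time.sleep(base_delay * (2 ** attempt))
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Stand-in providers (hypothetical): the first always fails, the second works.
def flaky_primary(prompt: str) -> str:
    raise ProviderError("rate limited")


def stable_secondary(prompt: str) -> str:
    return f"echo: {prompt}"
```

In this sketch, "degraded mode" simply means the answer comes from a later provider in the list. A production worker would likely also record which provider answered, so that cost and quality can be tracked per capability version.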

Platform/runtime ownership (nice-to-have)

  • Contract-first backend engineering: proven ability to design and evolve stable contracts (request/response schemas, versioning, backward compatibility).
  • Job lifecycle + orchestration thinking: comfortable implementing/maintaining:
    • job states + terminal outcomes (including partial success),
    • deadlines/expiry,
    • retries and cancellation,
    • durable state transitions.
  • HITL (Human-in-the-Loop) workflows: experience implementing:
    • “waiting for input” flows with schema-driven forms,
    • timers, due-by dates, escalations,
    • auditability of request/response/actor/timestamps.
  • Event-driven architecture: real experience with Kafka/NATS-style systems, including:
    • at-least-once delivery,
    • idempotent consumers,
    • dedupe,
    • event schemas (CloudEvents-style mapping).
  • Artifacts + diagnostics systems: ability to design outputs so “partial progress” is still valuable:
    • artifact roles (logs/evidence/intermediate/primary output),
    • presigned upload/download patterns,
    • structured diagnostics (error codes, confidence, suggested next actions).
  • Security/compliance engineering (non-negotiable):
    • OAuth2/JWT scopes and permission design,
    • mTLS / service-to-service security concepts,
    • PHI/PII handling patterns (tagging, redaction/tokenization hooks, retention controls),
    • audit logging expectations.
  • Observability ownership:
    • distributed tracing/correlation IDs,
    • metrics (queue latency, run time, success rate, HITL ratio, retries, cost units),
    • structured logs that make incidents debuggable.
  • Container-first development: build and run services/workers in containers; strong local dev workflow habits.
  • CI discipline: tests + conformance checks are treated as product requirements (especially for contract changes and capability versions).
  • Integration empathy: understands producer needs (Airflow, dashboard SDK, RPA adapters) and designs changes that don’t force other teams to rewrite code.
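
As an illustration of the "at-least-once delivery", "idempotent consumers", and "dedupe" items above, here is a minimal in-memory Python sketch. The `Event` shape, the `job.completed` event type, and the `IdempotentConsumer` class are hypothetical and are not taken from JANE. A real consumer would keep the set of seen IDs in durable storage and update it atomically with the side effect.

```python
from dataclasses import dataclass


@dataclass
class Event:
    id: str      # unique per event, similar to the CloudEvents "id" attribute
    type: str    # e.g. "job.completed" (hypothetical event type)
    data: dict


class IdempotentConsumer:
    """Applies each event at most once, even if the broker redelivers it."""

    def __init__(self) -> None:
        self._seen_ids: set[str] = set()
        self.completed_jobs: list[str] = []

    def handle(self, event: Event) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event.id in self._seen_ids:
            return False  # duplicate delivery: skip the side effect
        self.completed_jobs.append(event.data["job_id"])  # the side effect
        self._seen_ids.add(event.id)
        return True


consumer = IdempotentConsumer()
event = Event(id="evt-1", type="job.completed", data={"job_id": "job-1"})
# Simulate at-least-once delivery: the same event arrives twice.
results = [consumer.handle(event), consumer.handle(event)]
```

Deduplicating on the event ID rather than on the payload keeps the check cheap and still lets two distinct events carry identical data.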
