Typed inference pipelines
Define your task. Build your pipeline. Measure and improve.
Marigold hosts open-weight models on private infrastructure. Compose them into typed multi-step workflows and measure output quality against task specifications built from your own data.
Most AI pipeline infrastructure is built bespoke for each engagement and thrown away afterwards. Marigold is the inference layer you bring once: privately hosted open-weight models, a declarative workflow engine, and an eval surface that grows with production use.
Three surfaces
01
A unified async API over self-hosted HuggingFace models covering text, image, audio, and cross-modal operations. One container image, one EFS weight cache, per-model isolation. The marginal cost of adding a model falls with each one added.
02
Declarative YAML pipelines over the model registry. Steps declare typed inputs and outputs; the executor resolves the dependency graph, runs independent steps in parallel, and advances each step as soon as its dependencies resolve. Every step leaves an audit trail.
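A minimal sketch of what such a pipeline could look like. The step names, handler identifiers, and `$.`-path reference syntax below are illustrative assumptions, not the actual Marigold schema:

```yaml
# Hypothetical pipeline definition -- field names are illustrative.
name: ingest-and-index
steps:
  - id: extract_text
    handler: ocr/extract              # assumed handler identifier
    inputs:
      document: $.workflow.input.document
    outputs:
      text: str

  - id: embed
    handler: embeddings/text          # depends on extract_text
    inputs:
      text: $.steps.extract_text.outputs.text
    outputs:
      vector: list[float]

  - id: classify
    handler: classify/category        # independent of embed, so runs in parallel
    inputs:
      text: $.steps.extract_text.outputs.text
    outputs:
      label: str
```

Because `embed` and `classify` both depend only on `extract_text`, an executor that resolves the dependency graph can run them concurrently.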
03
Run any model or pipeline against a labelled dataset. Score outputs using the same handler registry. Build custom eval libraries from production runs and corrected outputs. The task specification sharpens with use.
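The handler-registry idea can be sketched in a few lines of Python. Everything here is a stand-in: the `@scorer` decorator, the `evaluate` helper, and the toy dataset are assumptions for illustration, not the Marigold API.

```python
# Hypothetical sketch: score pipeline outputs against a labelled dataset
# using the same kind of named-handler registry used for model handlers.
from typing import Callable

SCORERS: dict[str, Callable[[str, str], float]] = {}

def scorer(name: str):
    """Register a scoring handler under a name."""
    def register(fn):
        SCORERS[name] = fn
        return fn
    return register

@scorer("exact_match")
def exact_match(predicted: str, expected: str) -> float:
    # Case- and whitespace-insensitive exact match.
    return 1.0 if predicted.strip().lower() == expected.strip().lower() else 0.0

def evaluate(run_fn, dataset, metric: str = "exact_match") -> float:
    """Run the pipeline over each labelled row and average the scores."""
    score_fn = SCORERS[metric]
    scores = [score_fn(run_fn(row["input"]), row["label"]) for row in dataset]
    return sum(scores) / len(scores)

# Toy "pipeline" and labelled set.
dataset = [
    {"input": "2+2", "label": "4"},
    {"input": "3+3", "label": "6"},
]
answers = {"2+2": "4", "3+3": "6"}
print(evaluate(lambda q: answers[q], dataset))  # 1.0
```

Corrected production outputs slot in as new rows of `dataset`, which is how the eval library grows with use.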
What runs on it
Extract text, generate embeddings, classify, and gate restricted content before it reaches an index. Text and image embeddings produced in the same pipeline enable cross-modal search.
Generate images and score each for safety, aesthetic quality, and prompt alignment. Regenerate automatically until all thresholds are met or the attempt limit is reached.
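The generate-score-retry loop reduces to a small control structure. The generator and thresholds below are illustrative stand-ins for model calls, not Marigold's implementation:

```python
# Sketch: regenerate until every score clears its threshold, or give up
# at the attempt limit. Thresholds and the stub generator are assumptions.
import random

THRESHOLDS = {"safety": 0.9, "aesthetic": 0.5, "alignment": 0.5}
MAX_ATTEMPTS = 5

def generate(seed: int) -> dict:
    """Stand-in for a generation step plus its scoring passes."""
    rng = random.Random(seed)
    return {name: rng.uniform(0.3, 1.0) for name in THRESHOLDS}

def generate_until_passing(max_attempts: int = MAX_ATTEMPTS):
    for attempt in range(max_attempts):
        scores = generate(seed=attempt)
        if all(scores[name] >= floor for name, floor in THRESHOLDS.items()):
            return scores, attempt + 1   # passed on this attempt
    return None, max_attempts            # attempt limit reached

result, attempts = generate_until_passing()
```

The loop either returns a set of scores that clears every threshold, or `None` once attempts are exhausted, so downstream steps always see one of two well-typed outcomes.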
Compare an observation image against a reference using structural embeddings. On deviation, segment both images, diff the masks, and produce a natural-language description of the discrepancy.
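The deviation gate itself is cheap: compare the two structural embeddings and only trigger the expensive segment-and-diff path when similarity drops below a floor. The vectors and threshold here are toy values for illustration:

```python
# Sketch: cosine-similarity gate between an observation embedding and a
# reference embedding. Threshold and 2-D vectors are illustrative.
import math

DEVIATION_THRESHOLD = 0.95

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def needs_diff(observation_vec, reference_vec) -> bool:
    """True when the scenes have deviated enough to segment and diff."""
    return cosine(observation_vec, reference_vec) < DEVIATION_THRESHOLD

print(needs_diff([1.0, 0.0], [1.0, 0.0]))  # False: identical scenes
print(needs_diff([1.0, 0.0], [0.6, 0.8]))  # True: structural deviation
```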
Extract entities, classify, summarise, and convert to speech in multiple languages from a single workflow submission. Output is a written summary plus audio files, one per target language.
Classify large volumes of unlabelled rows against a small labelled example set. Low-confidence predictions are passed to an instruct model for natural-language explanation. No retraining required.
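One way to realise this without retraining is nearest-labelled-example classification with a confidence gate. The labelled set, the 2-D "embeddings", and the confidence floor below are all illustrative assumptions:

```python
# Sketch: label each row by its nearest labelled example; route
# low-confidence predictions to an explanation step instead of retraining.
import math

LABELLED = [
    ([0.9, 0.1], "invoice"),
    ([0.1, 0.9], "receipt"),
]
CONFIDENCE_FLOOR = 0.8

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def classify(row_vec):
    label, confidence = max(
        ((lbl, cosine(row_vec, vec)) for vec, lbl in LABELLED),
        key=lambda pair: pair[1],
    )
    if confidence < CONFIDENCE_FLOOR:
        # In the pipeline this row would be handed to an instruct model
        # for a natural-language explanation; here we just flag it.
        return label, confidence, "needs_explanation"
    return label, confidence, "accepted"
```

A row close to a labelled example is accepted outright; an ambiguous one (roughly equidistant from both examples) is flagged for explanation rather than silently labelled.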
Embed and compare images from multiple monitored locations in parallel. On detected change, describe what changed and assemble a report. Deliver as text and audio on a schedule.
What workflows can do
Workflow execution is built on runfox and json-logic-path, open-source libraries available on PyPI.
Available as a managed deployment or a consultancy engagement.
Get in touch