Ternative is an open-source AI project from Colombia, building competitive language models — and the inference engine to run them — on hardware ordinary people already own.
The dominant story of modern AI is one of scale: more parameters, more GPUs, more capital. Orchid 1.0 is an argument against that being the only path.
It's a 2-billion-parameter ternary-weight model, fine-tuned from Microsoft's BitNet b1.58 and aligned with ORPO across two preference-tuning rounds — the first competitive LLM trained and aligned in Colombia. Every stage ran on a single RTX 3050 laptop with 4 GB of VRAM. No cloud, no cluster.
Getting there meant solving problems the big labs never hit: how to fine-tune in 4 GB, and how to serve a ternary model with a LoRA adapter when no existing engine could. So we built ternative — and released all of it under Apache 2.0, with weights, code, a technical paper, and an archived DOI.
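To make the serving problem concrete, here is a minimal pure-Python sketch of a ternary linear layer with a LoRA adapter applied at run time. It is an illustration, not the ternative engine's code: the function names (`ternarize`, `ternary_matvec`, `forward_with_lora`) are hypothetical, and the quantizer assumes the absmean scheme described for BitNet b1.58. The adapter stays separate from the base weights because folding a full-precision low-rank update into them would leave weights that are no longer ternary.

```python
# Illustrative sketch only; names and structure are assumptions, not ternative's API.

def ternarize(weights, eps=1e-8):
    """Absmean quantization (BitNet b1.58 style): divide by mean |w|,
    round to the nearest integer, and clip to {-1, 0, 1}."""
    magnitudes = [abs(w) for row in weights for w in row]
    scale = sum(magnitudes) / len(magnitudes) + eps
    ternary = [[max(-1, min(1, round(w / scale))) for w in row] for row in weights]
    return ternary, scale


def ternary_matvec(ternary, scale, x):
    """Multiply by a ternary matrix using only adds and subtracts,
    then apply the single per-matrix scale."""
    out = []
    for row in ternary:
        acc = 0.0
        for t, xi in zip(row, x):
            if t == 1:
                acc += xi
            elif t == -1:
                acc -= xi
        out.append(acc * scale)
    return out


def matvec(matrix, x):
    """Plain dense matrix-vector product for the small LoRA factors."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def forward_with_lora(ternary, scale, lora_a, lora_b, alpha, x):
    """Base ternary output plus the low-rank correction (alpha / r) * B(Ax).
    lora_a has shape (r, in_features); lora_b has shape (out_features, r)."""
    rank = len(lora_a)
    base = ternary_matvec(ternary, scale, x)
    delta = matvec(lora_b, matvec(lora_a, x))
    return [b + (alpha / rank) * d for b, d in zip(base, delta)]


if __name__ == "__main__":
    t, s = ternarize([[0.9, -0.05, -1.2], [0.1, 0.8, -0.7]])
    y = forward_with_lora(t, s, [[1.0, 0.0, 0.0]], [[0.5], [-0.5]], 2.0, [1.0, 2.0, 3.0])
    print(t, s, y)
```

In this sketch the base path needs no multiplications inside the inner loop, while the adapter adds two small dense products per layer. A real engine would pack the ternary weights and vectorize both paths.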
A documented recipe with published failure modes — not a black box. Reproduce the benchmarks with the scripts in the repo.
Inference runs on your machine — CPU-only if needed. No account, no telemetry, no cloud dependency.
Weights, engine, paper and data recipe — all public, free for research and commercial use alike.
Ternative is built and maintained in the open, outside any large lab. Funding goes directly to continued development — better models, a faster engine, and the friendly Orchid Desktop app. If your organization relies on open, reproducible AI, consider supporting it through FLOSS/fund.
Both the model and the engine have a citable record. The model is archived on Zenodo with a permanent DOI.
@software{romerochisco2026ternative,
  title   = {ternative: Inference Engine for Ternary-Weight LLMs with Runtime LoRA},
  author  = {Romero Chisco, Michelangelo},
  year    = {2026},
  license = {Apache-2.0}
}