Introducing the Embedding Adapter
Zero-downtime migration between embedding models
Every organization that operates a vector database faces the same problem eventually: the embedding model you chose six months ago has been superseded by something better. Migrating means re-encoding every document, image, or profile in your corpus. For large-scale systems, this is a multi-day, multi-thousand-dollar operation that requires significant downtime.
The adapter approach
The Embedding Adapter learns a high-fidelity mapping between embedding spaces. Given paired samples from the old and new model, it trains a lightweight transformation that converts vectors from one space to the other.
This means you can start querying with a new embedding model immediately, without re-encoding your existing index. The adapter handles the translation in real time.
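The core idea can be sketched in a few lines. The product's actual adapter is more sophisticated, but as an illustrative, minimal version, assume a linear map fit by least squares on paired encodings of the same documents (all dimensions and data below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Paired samples: the same 1,000 documents encoded by both models.
# (Synthetic stand-ins: the "new" space is a noisy linear image of the old one.)
old_vecs = rng.normal(size=(1000, 384))                       # existing index space
mix = rng.normal(size=(384, 512))
new_vecs = old_vecs @ mix * 0.1 + rng.normal(size=(1000, 512)) * 0.01

# Fit a linear map from the new model's space back into the old index's space.
W, *_ = np.linalg.lstsq(new_vecs, old_vecs, rcond=None)

def adapt(query_new):
    """Translate a new-model query vector into the legacy index's space."""
    return query_new @ W

# Query the legacy index with a new-model embedding -- no re-encoding needed.
q = new_vecs[0]
scores = old_vecs @ adapt(q)
print(int(np.argmax(scores)))  # → 0 (the query's own document ranks first)
```

At serving time only the `adapt` call sits on the query path, which is why the overhead is a single small matrix multiply per query.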
Performance characteristics
In our evaluations on MS MARCO and internal benchmarks, the adapter recovers 95–99% of the retrieval accuracy achieved by fully re-indexing with the new model. The latency overhead is sub-millisecond per query. For most applications, this is indistinguishable from native performance.
Beyond model upgrades
The same technology enables cross-silo querying. Two organizations using different embedding models can search each other's indices without exchanging raw data. The adapter provides a shared geometric space for federated retrieval.
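One plausible setup for this (our illustration, not necessarily the product's exact protocol): both parties encode a shared public calibration corpus, exchange only those vectors, and fit the adapter on the pairs. Private documents never leave either silo:

```python
import numpy as np

rng = np.random.default_rng(2)

# Both orgs encode the same public calibration corpus with their own models.
# (Synthetic stand-ins: B's space is a linear image of A's.)
calib_a = rng.normal(size=(500, 256))            # org A's model, public corpus
T = rng.normal(size=(256, 320)) * 0.1
calib_b = calib_a @ T                            # org B's model, same corpus

# Fit the translation A-space -> B-space from the exchanged calibration vectors.
W, *_ = np.linalg.lstsq(calib_a, calib_b, rcond=None)

# Org A can now search org B's private index with its own query embeddings.
private_b = rng.normal(size=(100, 256)) @ T      # org B's private docs, B-space
query_a = rng.normal(size=(256,))                # org A query, A-space
scores = private_b @ (query_a @ W)
top_hit = int(np.argmax(scores))
```

The calibration corpus acts as the shared geometric anchor; neither side sees the other's raw documents or private embeddings.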
We published the technical details in our paper on Local Drift Adapters, which introduces per-cluster mixture-of-experts adapters that outperform global linear baselines on heterogeneous vector databases.
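The intuition behind the per-cluster approach is that a single global map underfits when different regions of the corpus drift differently. A simplified sketch, assuming hard routing by k-means cell and one least-squares map per cell (the paper's mixture-of-experts formulation is richer than this):

```python
import numpy as np

rng = np.random.default_rng(1)
d_new, d_old, k = 64, 48, 4

# Synthetic heterogeneous corpus: each cluster has its own local linear drift.
centroids = rng.normal(size=(k, d_new)) * 5
labels = rng.integers(0, k, size=2000)
new_vecs = centroids[labels] + rng.normal(size=(2000, d_new))
local_maps = rng.normal(size=(k, d_new, d_old)) * 0.2
old_vecs = np.einsum('nd,ndo->no', new_vecs, local_maps[labels])

# Plain k-means on the new-model space (seeded with one point per true cluster
# for a stable toy demo).
cents = new_vecs[[int(np.argmax(labels == c)) for c in range(k)]]
for _ in range(10):
    assign = np.argmin(((new_vecs[:, None] - cents[None]) ** 2).sum(-1), axis=1)
    cents = np.stack([new_vecs[assign == c].mean(0) for c in range(k)])

# One "local drift" adapter per cluster, each fit by least squares.
W = np.stack([
    np.linalg.lstsq(new_vecs[assign == c], old_vecs[assign == c], rcond=None)[0]
    for c in range(k)
])

def adapt(q):
    c = int(np.argmin(((cents - q) ** 2).sum(-1)))  # route to the nearest expert
    return q @ W[c]
```

A global least-squares map would have to average the four incompatible local drifts, whereas each expert here only models its own region, which is the effect the paper measures on heterogeneous databases.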
The Embedding Adapter is available today for enterprise customers. Reach out to discuss deployment.
Related
Emotion Vectors and the Future of Matching
Anthropic just published landmark research showing that LLMs maintain abstract, causally operative representations of emotion. This is something we have been thinking about for a long time, and it changes how we should build matching systems.
Why General-Purpose Embeddings Fail at Human Matching
Standard embedding models are trained on text similarity. But when the goal is predicting compatibility between people, text similarity is the wrong objective entirely.