Local Embeddings with OpenAI
Generate text embeddings locally with the OpenAI client
Text embeddings are the foundation of semantic search and retrieval-augmented generation (RAG) pipelines. And unlike text generation models, many open-source, state-of-the-art (SOTA) embedding models are small enough to run on-device, saving a significant amount on AI spend:
Cost of OpenAI text-embedding-3-large vs. Function
Check out other cloud vs. on-device AI cost comparisons on fxn.ai.
Installing Function LLM
We have created a tiny utility library, Function LLM, which patches the OpenAI client to generate embeddings on-device. First, install the library:
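A minimal install command, assuming the library is published as the fxn-llm package (check the Function docs for the exact package name for your platform):

```bash
# Install Function LLM (package name assumed)
pip install --upgrade fxn-llm
```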
Generate Embeddings Locally
We will be using Nomic’s @nomic/nomic-embed-text-v1.5-quant embedding model, which boasts better performance than OpenAI’s text-embedding-3-small:
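A minimal sketch of generating embeddings locally. It assumes fxn-llm exposes a locally() function that patches the OpenAI client, as described above; refer to the Function LLM README for the exact import path and any required access key setup:

```python
from openai import OpenAI
from fxn_llm import locally  # assumed import path for Function LLM

# Create a regular OpenAI client
openai = OpenAI()

# Patch the client so embedding requests run on-device instead of in the cloud
openai = locally(openai)

# Generate an embedding with Nomic's quantized model
embedding = openai.embeddings.create(
    model="@nomic/nomic-embed-text-v1.5-quant",
    input="search_query: What is the capital of France?",
)
```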
Nomic’s embedding model requires prepending the input string with a prefix indicating the embedding task (e.g. search_query). See the predictor card for more info.
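For illustration, a sketch of how the task prefixes might be applied when embedding documents versus queries. The search_document and search_query prefixes come from Nomic's model card; the embed_texts helper is hypothetical, and passing a list of inputs assumes the patched client accepts batched requests like the OpenAI API does:

```python
def embed_texts(client, texts, task="search_document"):
    """Prefix each input with its embedding task before embedding (hypothetical helper)."""
    prefixed = [f"{task}: {text}" for text in texts]
    return client.embeddings.create(
        model="@nomic/nomic-embed-text-v1.5-quant",
        input=prefixed,
    )

# Documents are embedded with `search_document`, queries with `search_query`
docs = embed_texts(openai, ["Paris is the capital of France."])
query = embed_texts(openai, ["What is the capital of France?"], task="search_query")
```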