Building High-Performance Vector Search in Node.js with FAISS — Without Blocking the Event Loop


If you're building a RAG pipeline, semantic search engine, or AI-powered
app in Node.js, you've probably hit the same wall I did — vector search
libraries that freeze your entire server while searching through embeddings.

Today I want to share faiss-node-native, a project I've been building
to fix exactly that.

What is Vector Search and Why Does It Matter?

Modern AI applications — chatbots, semantic search, recommendation engines
— all rely on embeddings: high-dimensional vectors that represent the
meaning of text, images, or audio.
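Embeddings are just arrays of numbers, so "semantic similarity" reduces to a geometric comparison such as cosine similarity. A minimal sketch in plain JavaScript (illustrative only, not part of any library):

```javascript
// Cosine similarity between two embedding vectors:
// ~1 means same direction (similar meaning), 0 means unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 2, 3], [2, 4, 6])); // ≈ 1 (parallel vectors)
console.log(cosineSimilarity([1, 0], [0, 1]));       // 0 (orthogonal vectors)
```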

Vector search lets you find the most semantically similar items to a query
by searching through millions of these vectors in milliseconds. It's the
backbone of every RAG (Retrieval Augmented Generation) application.

// You convert text to embeddings using OpenAI, HuggingFace, etc.
const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "What is the capital of France?"
});
const embedding = new Float32Array(response.data[0].embedding);

// Then search your vector index for the most similar documents
const results = await index.search(embedding, 5);

Facebook's FAISS is the gold standard library for this: battle-tested and
used in production at Meta's scale. But there was no good way to use it
in Node.js. Until now.


The Problem with Existing Solutions

The most popular package, faiss-node, works — but it has a critical flaw
for production use:

It blocks the Node.js event loop.

// faiss-node — SYNCHRONOUS, blocks everything
const results = index.search(query, 10); // 😱 freezes your server

In a production server handling hundreds of concurrent requests, a
synchronous FAISS search can block your entire Node.js process for
hundreds of milliseconds. Every other request waits. Your API becomes
unresponsive.

This is a fundamental Node.js anti-pattern.
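You can see the effect with a self-contained sketch (no FAISS required): a busy loop stands in for the synchronous search, and a zero-delay timer can only fire after the synchronous work releases the event loop.

```javascript
// Simulates a synchronous, CPU-bound search call by busy-waiting.
function busyWork(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // nothing else runs here — timers, I/O callbacks, HTTP requests all wait
  }
}

// Schedule a timer, then block. Despite the 0 ms delay, the callback
// cannot run until the synchronous work finishes.
const scheduled = Date.now();
setTimeout(() => {
  console.log(`timer fired ${Date.now() - scheduled} ms late`);
}, 0);

const t0 = Date.now();
busyWork(200); // stand-in for a 200 ms synchronous index.search()
const blockedFor = Date.now() - t0;
console.log(`event loop blocked for ~${blockedFor} ms`);
```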


Introducing faiss-node-native

@faiss-node/native is a ground-up rewrite with a fully async,
non-blocking API built on N-API worker threads.

npm install @faiss-node/native
// faiss-node-native — ASYNC, never blocks
const results = await index.search(query, 10); // ✅ non-blocking

Your event loop stays free. Other requests keep being served. Your server
stays responsive.


Quick Start

Create an Index and Add Vectors

const { FaissIndex } = require('@faiss-node/native');

// Create a HNSW index for 1536-dimensional OpenAI embeddings
const index = new FaissIndex({ type: 'HNSW', dims: 1536 });

// Add your document embeddings
const embeddings = new Float32Array([
  /* your vectors here */
]);
await index.add(embeddings);

console.log('Vectors indexed:', index.getStats().ntotal);

Search for Similar Documents

// Search for 5 nearest neighbors
const queryVector = new Float32Array([/* your query embedding */]);
const results = await index.search(queryVector, 5);

console.log('Nearest document IDs:', results.labels);
console.log('Distances:', results.distances);

Save and Load from Disk

// Persist your index
await index.save('./my-index.faiss');

// Load it back later
const loaded = await FaissIndex.load('./my-index.faiss');

Store in Redis or MongoDB

// Serialize to buffer — store anywhere
const buffer = await index.toBuffer();
await redis.set('faiss-index', buffer);

// Restore from buffer
const buf = await redis.getBuffer('faiss-index');
const index = await FaissIndex.fromBuffer(buf);

Choosing the Right Index Type

// FLAT_L2 — exact search, best for < 10k vectors
const small = new FaissIndex({ type: 'FLAT_L2', dims: 128 });

// IVF_FLAT — approximate, best for 10k–1M vectors
const medium = new FaissIndex({ 
  type: 'IVF_FLAT', 
  dims: 768,
  nlist: 100,
  nprobe: 10
});
await medium.train(trainingVectors); // train first!

// HNSW — graph-based, near-logarithmic search time, best for large datasets
const large = new FaissIndex({ 
  type: 'HNSW', 
  dims: 1536,
  M: 16,
  efSearch: 50
});
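For intuition, FLAT_L2's exact search is conceptually just an exhaustive scan over squared L2 distances. A plain-JavaScript sketch of that computation (illustrative only; the native library does this in optimized C++):

```javascript
// Squared L2 distance between two vectors (no sqrt needed for ranking).
function l2Squared(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Exhaustive (FLAT_L2-style) scan: distance to every stored vector,
// then keep the k smallest.
function flatSearch(vectors, query, k) {
  return vectors
    .map((v, label) => ({ label, distance: l2Squared(v, query) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);
}

const vectors = [
  Float32Array.from([0, 0]),
  Float32Array.from([1, 1]),
  Float32Array.from([5, 5]),
];
const results = flatSearch(vectors, Float32Array.from([0.9, 1.1]), 2);
console.log(results.map(r => r.label)); // [ 1, 0 ] — nearest labels first
```

This O(n) scan is why exact search only scales to tens of thousands of vectors; IVF and HNSW exist to avoid visiting every vector.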

Thread Safety Out of the Box

Unlike faiss-node, all operations in faiss-node-native are
thread-safe. You can safely run concurrent searches without worrying
about race conditions:

// Completely safe — runs concurrently
const [results1, results2, results3] = await Promise.all([
  index.search(query1, 5),
  index.search(query2, 5),
  index.search(query3, 5)
]);

Full TypeScript Support

import { FaissIndex, FaissIndexConfig, SearchResults } from '@faiss-node/native';

const config: FaissIndexConfig = {
  type: 'HNSW',
  dims: 768
};

const index = new FaissIndex(config);
const results: SearchResults = await index.search(queryVector, 10);

Building a Simple RAG System

Here's a minimal but complete RAG pipeline using faiss-node-native:

const { FaissIndex } = require('@faiss-node/native');
const OpenAI = require('openai');

const openai = new OpenAI();
const index = new FaissIndex({ type: 'HNSW', dims: 1536 });
const documents = [];

async function addDocument(text) {
  const res = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });
  const vector = new Float32Array(res.data[0].embedding);
  await index.add(vector);
  documents.push(text);
}

async function search(query, k = 3) {
  const res = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: query
  });
  const vector = new Float32Array(res.data[0].embedding);
  const results = await index.search(vector, k);
  return results.labels.map(i => documents[i]);
}

// Usage
await addDocument('Paris is the capital of France');
await addDocument('Berlin is the capital of Germany');
await addDocument('Tokyo is the capital of Japan');

const matches = await search('What is the capital of France?');
console.log(matches); // ['Paris is the capital of France', ...]

What's Next

I'm actively working on:

  • ✅ Prebuilt binaries for Linux, macOS, Windows (no compile needed)
  • 🔄 LangChain.js integration as an official vector store
  • 🔄 Benchmarks vs faiss-node and hnswlib-node
  • 🔄 GPU support via CUDA

Get Started

npm install @faiss-node/native

⭐ Star the repo:
github.com/anupammaurya6767/faiss-node-native

📦 npm:
npmjs.com/package/@faiss-node/native

📖 Docs:
anupammaurya6767.github.io/faiss-node-native

If you're building RAG apps, semantic search, or anything LLM-related
in Node.js — give it a try and let me know what you think in the
comments! 🚀


Built with ❤️ for the Node.js community by
@anupammaurya6767
