πŸπŸ“¦ High-performance cosine similarity ranking for Retrieval-Augmented Generation (RAG) pipelines.

logo-symrank

Similarity ranking for Retrieval-Augmented Generation

PyPI version · Supports Python 3.10-3.14 · Apache 2.0 License · uv · Ruff · Powered by Rust · Analytics in Motion

✨ What is SymRank?

SymRank is a blazing-fast Python library for top-k cosine similarity ranking, designed for vector search, retrieval-augmented generation (RAG), and embedding-based matching.

Built with a Rust + SIMD backend, it offers the speed of native code with the ease of Python.


πŸš€ Why SymRank?

⚑ Fast: SIMD-accelerated cosine scoring with adaptive parallelism

🧠 Smart: Automatically selects serial or parallel mode based on workload

πŸ”’ Top-K optimized: Efficient inlined heap selection, no full-sort overhead (see the sketch after this list)

🐍 Pythonic: Easy-to-use Python API

πŸ¦€ Powered by Rust: Safe, high-performance core engine

πŸ“‰ Memory Efficient: Supports batching for speed and a reduced memory footprint

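The "no full-sort overhead" point refers to keeping only the k best scores while scanning the candidates, instead of sorting all N of them. SymRank does this inside its Rust core; purely as an illustration of the idea, the same effect can be sketched in Python with heapq:

import heapq

def top_k(ids, scores, k):
    """Illustrative sketch only; not SymRank's Rust implementation."""
    # A size-k heap keeps only the k best (score, id) pairs as it scans,
    # which is O(N log k) rather than the O(N log N) of a full sort.
    best = heapq.nlargest(k, zip(scores, ids))
    return [{"id": doc_id, "score": float(score)} for score, doc_id in best]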

Below are single-query cosine similarity benchmarks comparing SymRank to NumPy and scikit-learn across realistic re-ranking candidate sizes.

| Candidates (N) | SymRank matrix (ms) | NumPy normalized (ms) | sklearn (ms) | Fastest | SymRank Speedup |
|---|---|---|---|---|---|
| 20 | 0.006 | 0.027 | 0.210 | SymRank | 4.50x |
| 50 | 0.017 | 0.050 | 0.266 | SymRank | 2.92x |
| 100 | 0.020 | 0.086 | 0.390 | SymRank | 4.18x |
| 500 | 0.169 | 0.393 | 1.843 | SymRank | 2.32x |
| 1,000 | 0.170 | 0.669 | 3.588 | SymRank | 3.95x |
| 5,000 | 0.748 | 5.261 | 32.196 | SymRank | 7.03x |
| 10,000 | 1.976 | 13.938 | 42.514 | SymRank | 7.05x |
  • Top-k cosine similarity with k=5, embedding dimension 1536, float32.
  • NumPy baseline uses candidates normalized once and the query normalized per call (sketched after these notes).
  • sklearn uses sklearn.metrics.pairwise.cosine_similarity.
  • Times are mean milliseconds per query on Windows.
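
For reference, here is a minimal sketch of a baseline of the kind described in the notes above (candidates normalized once up front, query normalized on each call); the actual benchmark script may differ in detail:

import numpy as np

def numpy_topk_baseline(query, candidates_normalized, k=5):
    # Query is normalized per call; candidate rows are assumed pre-normalized.
    q = query / np.linalg.norm(query)
    scores = candidates_normalized @ q           # cosine similarity via dot products
    top = np.argpartition(scores, -k)[-k:]       # unordered top-k indices
    return top[np.argsort(scores[top])[::-1]]    # ordered by descending score

# One-time normalization of the candidate rows:
# candidates_normalized = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)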

Real-world benchmark (Hugging Face embeddings)

Performance on 10,000 real OpenAI-style embeddings streamed from the Hugging Face Hub.

| Method | Mean time (ms) | Time relative to SymRank matrix |
|---|---|---|
| SymRank matrix | 1.800 | 1.0x |
| SymRank list | 22.753 | 12.64x |
| NumPy normalized | 44.665 | 24.81x |
| sklearn | 42.709 | 23.73x |
  • Dataset: Qdrant/dbpedia-entities-openai3-text-embedding-3-large-1536-1M
  • k=5, float32, Windows
  • Mean time per query
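
The snippet below is an illustrative setup, not the benchmark script itself: it streams rows of the dataset above with the datasets library and ranks them with SymRank. The embedding column name is an assumption, so check the dataset card before running it.

import itertools
import numpy as np
from datasets import load_dataset
from symrank import cosine_similarity_matrix

# Stream rows instead of downloading the full 1M-row dataset.
stream = load_dataset(
    "Qdrant/dbpedia-entities-openai3-text-embedding-3-large-1536-1M",
    split="train",
    streaming=True,
)
rows = list(itertools.islice(stream, 10_000))

# NOTE: the embedding column name below is assumed, not confirmed.
EMBEDDING_COLUMN = "text-embedding-3-large-1536-embedding"
candidate_matrix = np.asarray([r[EMBEDDING_COLUMN] for r in rows], dtype=np.float32)
ids = [str(i) for i in range(len(rows))]

query = candidate_matrix[0]  # reuse one embedding as the query for the demo
top5 = cosine_similarity_matrix(query, candidate_matrix, ids, k=5)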

πŸ“¦ Installation

You can install SymRank with uv (recommended) or with pip.

Recommended (with uv)

uv pip install symrank

Alternatively (using pip)

pip install symrank

πŸ§ͺ Usage

SymRank provides two APIs optimized for different workflows.


Option 1: cosine_similarity_matrix (recommended for performance)

Best when:

  • Candidate embeddings are already stored as a single 2D NumPy array
  • Performance matters (about 10 to 14x faster for N=1,000 to 10,000 versus the list API)
  • Running many queries against the same candidate set

import numpy as np
from symrank import cosine_similarity_matrix

# Example data (dimension = 4 for readability)
query = np.array([1.0, 0.0, 0.0, 0.0], dtype=np.float32)

candidate_matrix = np.array(
    [
        [1.0, 0.0, 0.0, 0.0],  # identical to query
        [0.0, 1.0, 0.0, 0.0],  # orthogonal
        [0.5, 0.5, 0.0, 0.0],  # partially aligned
        [0.2, 0.1, 0.0, 0.0],  # weakly aligned
    ],
    dtype=np.float32,
)

ids = ["doc_a", "doc_b", "doc_c", "doc_d"]

results = cosine_similarity_matrix(query, candidate_matrix, ids, k=3)
print(results)

Output:

[
  {"id": "doc_a", "score": 1.0},
  {"id": "doc_d", "score": 0.8944272},
  {"id": "doc_c", "score": 0.70710677}
]

Notes:

  • Scores are cosine similarity (range -1 to 1, higher = more similar)
  • Results are sorted by descending similarity

Typical production usage (1536-dimensional embeddings):

import numpy as np
from symrank import cosine_similarity_matrix

D = 1536
N = 10_000

query = np.random.rand(D).astype(np.float32)
candidate_matrix = np.random.rand(N, D).astype(np.float32)
ids = [f"doc_{i}" for i in range(N)]

top5 = cosine_similarity_matrix(query, candidate_matrix, ids, k=5)
for result in top5:
    print(f"{result['id']}: {result['score']:.4f}")

Optional batching for memory control:

# Process 10k candidates in batches of 2000
results = cosine_similarity_matrix(
    query, candidate_matrix, ids, k=5, batch_size=2000
)

Option 2: cosine_similarity (flexible and convenient)

Best when:

  • Candidates come from mixed or streaming sources
  • Vectors are naturally represented as (id, vector) pairs
  • Simplicity is more important than maximum throughput

Basic example using Python lists:

import symrank as sr

query = [0.1, 0.2, 0.3, 0.4]
candidates = [
    ("doc_1", [0.1, 0.2, 0.3, 0.5]),
    ("doc_2", [0.9, 0.1, 0.2, 0.1]),
    ("doc_3", [0.0, 0.0, 0.0, 1.0]),
]

results = sr.cosine_similarity(query, candidates, k=2)
print(results)

Output:

[
  {"id": "doc_1", "score": 0.9939991235733032},
  {"id": "doc_3", "score": 0.7302967309951782}
]

Basic example using NumPy arrays:

import symrank as sr
import numpy as np

query = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
candidates = [
    ("doc_1", np.array([0.1, 0.2, 0.3, 0.5], dtype=np.float32)),
    ("doc_2", np.array([0.9, 0.1, 0.2, 0.1], dtype=np.float32)),
    ("doc_3", np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32)),
]

results = sr.cosine_similarity(query, candidates, k=2)
print(results)

Output:

[
  {"id": "doc_1", "score": 0.9939991235733032},
  {"id": "doc_3", "score": 0.7302967309951782}
]

Optional batching:

results = sr.cosine_similarity(query, candidates, k=5, batch_size=1000)

Performance Comparison

| Dataset size | Option 1 (matrix) | Option 2 (list) | Speedup |
|---|---|---|---|
| N=100 | 0.02 ms | 0.06 ms | 3.3x |
| N=1,000 | 0.18 ms | 2.28 ms | 12.7x |
| N=10,000 | 1.50 ms | 19.66 ms | 13.1x |

Benchmark setup: 1536-dimensional embeddings, k=5, Python 3.14, Windows. Times include Python-side overhead for each API.

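To get a rough feel for this difference on your own hardware, a simple timing sketch along these lines works (this is not the benchmark script, and results will vary by machine):

import time
import numpy as np
from symrank import cosine_similarity, cosine_similarity_matrix

D, N, k, repeats = 1536, 10_000, 5, 50
query = np.random.rand(D).astype(np.float32)
candidate_matrix = np.random.rand(N, D).astype(np.float32)
ids = [f"doc_{i}" for i in range(N)]
candidates = list(zip(ids, candidate_matrix))  # (id, vector) pairs for the list API

def mean_ms(fn):
    # Mean wall-clock time per call, in milliseconds.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000 / repeats

print("matrix API:", mean_ms(lambda: cosine_similarity_matrix(query, candidate_matrix, ids, k=k)), "ms")
print("list API:  ", mean_ms(lambda: cosine_similarity(query, candidates, k=k)), "ms")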

Quick Decision Guide

Use cosine_similarity_matrix if:

  • βœ… You have a pre-built NumPy matrix of candidates
  • βœ… Performance is critical
  • βœ… Processing many queries against the same corpus

Use cosine_similarity if:

  • βœ… Building candidates on-the-fly
  • βœ… Mixed vector input types (lists or NumPy arrays)
  • βœ… Flexibility > raw speed

Both functions return the same format: a list of dicts sorted by descending similarity score.


🧩 API: cosine_similarity(...)

cosine_similarity(
    query_vector,              # List[float] or np.ndarray
    candidate_vectors,         # List[Tuple[str, List[float] or np.ndarray]]
    k=5,                       # Number of top results to return
    batch_size=None            # Optional: set for memory-efficient batching
)

cosine_similarity(...) Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| query_vector | list[float] or np.ndarray | required | The query vector to compare against the candidate vectors. |
| candidate_vectors | list[tuple[str, list[float] or np.ndarray]] | required | List of (id, vector) pairs. Each vector can be a list or a NumPy array. |
| k | int | 5 | Number of top results to return, sorted by descending similarity. |
| batch_size | int or None | None | Optional batch size to reduce memory usage. If None, uses SIMD directly. |

Returns

List of dictionaries with id and score (cosine similarity), sorted by descending similarity:

[{"id": "doc_42", "score": 0.8763}, {"id": "doc_17", "score": 0.8451}, ...]

πŸ“„ License

This project is licensed under the Apache License 2.0.