AI Libraries Overview

This page describes the AI Libraries that are available and which resources to refer to.

In the modern AI stack, large language models (LLMs) are used for summarizing information, assisting users, and solving complex problems across industries. To deliver good results, these models need fast, accurate access to relevant stored information. The AI Libraries make this possible by enabling high-performance semantic search on unstructured data.

The AI Libraries are not limited to unstructured data. They also provide time series similarity search (TSS), which finds patterns in numeric data that match a sequence of interest. You can use it to scan historical data for moments that behave like a current signal, identify recurring events, or analyze past system behavior.

Vector search

  • Brute Force (Flat) - Compares the query against every data point in the dataset to find the most similar ones. It’s simple and exact, but search time grows with the size of the dataset, so it can be slow on large ones. It is best suited to small-scale applications where precision is critical.

  • Hierarchical Navigable Small World (HNSW) - An efficient algorithm for fast approximate nearest-neighbor search. It builds a graph to quickly narrow down potential matches, making it well-suited for large datasets where speed is important but exact results are not always necessary.

  • Inverted File (IVF) - Splits the dataset into clusters and searches only within the most relevant ones, reducing computation. This is useful for large-scale search when balancing speed and memory usage is important.

  • Inverted File Product Quantization (IVFPQ) - Extends IVF by compressing data into smaller representations to save memory and further speed up search. Ideal for massive datasets where some accuracy can be traded for faster results and lower storage costs.
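
To make the trade-off between exact and cluster-based search concrete, here is a minimal sketch in plain Python. It is not the AI Libraries' API: the function names (flat_search, build_ivf, ivf_search) and parameters (n_clusters, n_probe) are hypothetical, and a toy k-means stands in for whatever clustering a production index uses. The flat search scores every vector, while the IVF-style search scores only the vectors in the clusters closest to the query.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def flat_search(vectors, query, k):
    """Exact search: score every vector and return the k nearest indices."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return order[:k]

def assign(vectors, centroids):
    """Put each vector id into the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def build_ivf(vectors, n_clusters, n_iter=10, seed=0):
    """Toy IVF index (illustrative only): k-means centroids plus member lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    lists = assign(vectors, centroids)
    for _ in range(n_iter):
        for c, members in enumerate(lists):
            if members:  # keep the old centroid if a cluster is empty
                cols = zip(*(vectors[i] for i in members))
                centroids[c] = [sum(col) / len(members) for col in cols]
        lists = assign(vectors, centroids)
    return centroids, lists

def ivf_search(vectors, centroids, lists, query, k, n_probe=1):
    """Approximate search: scan only the n_probe clusters nearest the query."""
    by_distance = sorted(range(len(centroids)),
                         key=lambda c: l2(centroids[c], query))
    candidates = [i for c in by_distance[:n_probe] for i in lists[c]]
    candidates.sort(key=lambda i: l2(vectors[i], query))
    return candidates[:k]

if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.random() for _ in range(8)] for _ in range(500)]
    query = [0.5] * 8
    centroids, lists = build_ivf(data, n_clusters=10)
    print("flat:", flat_search(data, query, 3))
    print("ivf :", ivf_search(data, centroids, lists, query, 3, n_probe=2))
```

Raising n_probe scans more clusters. When it equals the number of clusters, every vector is scanned and the result matches the flat search. Smaller values do less work but may miss neighbors that fall in unprobed clusters.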

Time series search

  • Time Series Search (TSS) - Designed for finding similar patterns in sequential data (like financial or sensor signals). It focuses on shape and timing rather than exact values, making it useful for anomaly detection, forecasting, and pattern matching in temporal datasets.
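
One common way to compare shape rather than exact values is to z-normalize each window before measuring distance. The sketch below slides a window over a series and ranks positions by Euclidean distance to a normalized query pattern. This is a generic illustration under that assumption, not necessarily the algorithm TSS uses, and the names z_normalize and tss_search are hypothetical.

```python
import math

def z_normalize(window):
    """Rescale a window to zero mean and unit variance (flat windows map to zeros)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]

def tss_search(series, pattern, k=1):
    """Return the k best (distance, start_index) matches for pattern in series."""
    m = len(pattern)
    target = z_normalize(pattern)
    scored = []
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        scored.append((dist, start))
    scored.sort()
    return scored[:k]

if __name__ == "__main__":
    # Two spikes with the same shape but very different magnitudes.
    series = [0, 0, 1, 3, 1, 0, 0, 0, 10, 30, 10, 0, 0]
    pattern = [0, 2, 6, 2, 0]
    for dist, start in tss_search(series, pattern, k=2):
        print(f"start={start} distance={dist:.6f}")
```

Because every window is normalized, both spikes in the example match the pattern despite their different scales. This brute-force scan does work proportional to the series length times the pattern length.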