How to Query Data in KDB.AI

This page covers how to perform queries against tables in the KDB.AI database.

Selecting the table to query

Each table in KDB.AI has an associated name. To perform a query, specify the name of the table in which the relevant data is stored. Using Python client you can create a table object from the session. The REST client is more direct with the table name supplied as a field in the JSON payload.

Python

Copy
documents = session.database('default').table("documents")

Queries

Obtain the data from the entire table using the following command:

Python

REST

q

Python

Copy
documents.query()

Bash

Copy
curl -s -X POST localhost:8081/api/v2/databases/default/tables/documents/query | jq .

q

Copy
gw(`query;`database`table!(`default;`documents));

Filters

To select a subset of the data, you can apply filters as documented here. These serve as where clauses.

Python

REST

q

Python

Copy
import datetime
start_time=datetime.datetime(2024,1,10)
end_time=datetime.datetime(2024,10,10)
results = table.query(filter=[("within","createdDate",[start_time, end_time]),("<=", "length", 100)])
print(f'Results: {results}')

Bash

Copy
curl -s -H "Content-Type: application/json" localhost:8081/api/v2/databases/default/tables/documents/query \
-d '{"table":"documents","filter":[["within","createdDate",["2020.07.10D15:00:00.0000", "2021.07.10D15:00:00.0000"]],["<=", "length", 100]]}'

q

Copy
gw(`query;`database`table`filter!(`default;`documents;((within;`time;(2000.06.15D00:00:00.000000001; 2000.08.15D00:00:00.000000001));(<=;`length;100))));

Processing results

You can return a subset of the columns in the table, reducing the amount of data sent back to the client.

Python

REST

q

Python

Copy
documents.query(aggs={"author":"author","content":"content"})

Bash

Copy
curl -s -X POST -H 'Content-Type: application/json' localhost:8081/api/v2/databases/default/tables/documents/query \
-d '{"aggs":{"author":"author","content":"content"}}' | jq .

q

Copy
gw(`query;`database`table`aggs!(`default;`documents;`author`content!`author`content))

In addition to returning a subset of the columns, you can return aggregated results, group by categorical variables, and sort based on a column name.

Python

REST

q

Python

Copy
aggs = dict()
aggs['SumLength'] = ('sum','length')

print(f'Table data:\n\n {table.query()}')
results = table.query(aggs=aggs, group_by=['author'], sort_columns=['SumLength'])
print(f'Aggregate data results:\n {results}')

Bash

Copy
curl -s -H "Content-Type: application/json" localhost:8081/api/v2/databases/default/tables/documents/query \
-d '{"aggs":{"sumLength":["sum","length"]}, "groupBy":["author"], "sortColumns":["sumLength"]}'

q

Copy
gw(`query;`database`table`aggs`groupBy`sortColumns!(`default;`documents;(enlist `sumLength)!(enlist `sum`length);`author;`sumLength))

Supported aggregations

The table below lists all supported aggregation functions.

Function

Description

all

Returns the logical 'and' of all values in a set.

any

Returns the logical 'or' of all values in a set.

avg

Calculates the mean value across the set of matching records.

count

Returns the number of records in the current selection.

dev

Calculates the standard deviation of a column.

distinct

Returns the distinct values from a column.

first

Returns the first occurrence of a value. This is useful when performing a group by aggregation.

last

Returns the last occurrence of a value. This is useful when performing a group by aggregation.

max

Takes the maximum value of a set of records.

min

Takes the minimum value of a set of records.

prd

Calculates the product of matching records.

sdev

Calculates the sample deviation of matching records.

scov

Calculates the sample covariance between matching records.

sum

Calculates the sum of matching records.

svar

Calculates the sample variance of matching records.

var

Calculates the variance of matching records.

Next Steps

Now that you're familiar with querying, move on to the following: