PyKX Introduction Notebook

The purpose of this notebook is to introduce you to PyKX capabilities and functionality.

For the best experience, read What is PyKX and the quickstart guide first.

To follow along, we recommend downloading the notebook.

Now let's go through the following sections:

  1. Import PyKX
  2. Basic PyKX data structures
  3. Access and create PyKX objects
  4. Run analytics on PyKX objects

1. Import PyKX

To access PyKX and its functions, import it in your Python code as follows:

Python

Copy
import pykx as kx
kx.q.system.console_size = [10, 80]

Tip

We recommend always using import pykx as kx. The shortened import name kx makes the code more readable and is standard for the PyKX library.

Below we load the additional libraries used throughout this notebook:

Python

Copy
import numpy as np
import pandas as pd

2. Basic PyKX data structures

Central to your interaction with PyKX are the data types supported by the library. PyKX is built atop the q programming language, which provides small-footprint data structures for analytic calculations and the creation of highly performant databases. The types we show below are generated from Python-equivalent types.

This section describes the basic elements of the PyKX library and explains how and why they differ.

2.a Atom

In PyKX, an atom is a single irreducible value of a specific type. For example, you may come across pykx.FloatAtom or pykx.DateAtom objects which may have been generated as follows, from an equivalent Pythonic representation.

Python

Copy
kx.FloatAtom(1.0)  

Python

Copy
pykx.FloatAtom(pykx.q('1f'))   

Python

Copy
from datetime import date
kx.DateAtom(date(2020, 1, 1))

Python

Copy
pykx.DateAtom(pykx.q('2020.01.01'))

2.b Vector

Like atoms, PyKX Vectors are strictly typed, but they hold multiple elements of a single type. These objects, along with the lists described below, form the basis for most of the other important data structures that you will encounter, including dictionaries and tables.

Compared with Python lists, vector objects provide significant benefits when applying analytics. Like Numpy, PyKX gains from the underlying speed of its analytic engine when operating on these strictly typed objects.

Vector objects are always 1-D and can be indexed along a single axis.

In the following example, we create PyKX vectors from their common Python equivalents, numpy arrays and pandas Series:

Python

Copy
kx.IntVector(np.array([1, 2, 3, 4], dtype=np.int32))

Python

Copy
pykx.IntVector(pykx.q('1 2 3 4i'))

Python

Copy
kx.toq(pd.Series([1, 2, 3, 4]))

Python

Copy
pykx.LongVector(pykx.q('1 2 3 4'))

2.c List

A PyKX List is an untyped vector object. Unlike vectors which are optimised for the performance of analytics, lists are mostly used for storing reference information or matrix data.

Unlike vector objects which are 1-D in shape, lists can be ragged N-Dimensional objects. This makes them useful for storing complex data structures, but limits their performance when dealing with data-access/data modification tasks.

Python

Copy
kx.List([[1, 2, 3], [1.0, 1.1, 1.2], ['a', 'b', 'c']])

Python

Copy
pykx.List(pykx.q('
1 2   3  
1 1.1 1.2
a b   c  
'))
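The contrast between a typed vector and an untyped list can be illustrated with a plain-Python analogy. This is not PyKX code: it is a minimal sketch in which the standard-library array type stands in for a strictly typed 1-D vector and a nested Python list stands in for a ragged PyKX List. The variable names are illustrative.

```python
from array import array

# Strictly typed, one-dimensional storage: every element must be an integer.
int_vector = array("i", [1, 2, 3, 4])

# Untyped, possibly ragged storage: rows can differ in type and length.
ragged_list = [[1, 2, 3], [1.0, 1.1], ["a", "b", "c", "d"]]

try:
    int_vector.append(2.5)  # a float does not fit an integer-typed array
    append_rejected = False
except TypeError:
    append_rejected = True

print(int_vector.typecode, append_rejected)
print([len(row) for row in ragged_list])
```

The typed container can reject mismatched values up front, which is what makes fast, uniform processing possible. The ragged list accepts anything at the cost of that guarantee.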

2.d Dictionary

A PyKX Dictionary is a direct mapping from keys to values. The list of keys and the list of values they map to must have the same count. Although you can think of a dictionary as a set of key-value pairs, it's physically stored as a pair of lists.

Python

Copy
kx.Dictionary({'x': [1, 2, 3], 'x1': np.array([1, 2, 3])})

Python

Copy
 x    1 2 3
x1    1 2 3
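To make the "pair of lists" storage model concrete, here is a minimal plain-Python sketch. It is an illustration of the idea only, not how PyKX is implemented, and the lookup helper is a hypothetical name introduced for this example.

```python
# Plain-Python model: a dictionary held as two equal-length lists.
keys = ["x", "x1"]
values = [[1, 2, 3], [1, 2, 3]]

# Keys and values must have the same count.
assert len(keys) == len(values)


def lookup(key):
    """Return the value stored at the same position as `key`."""
    return values[keys.index(key)]


print(lookup("x1"))
```

A lookup finds the position of the key in the first list and returns the item at the same position in the second list.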

2.e Table

PyKX Tables are a first-class typed entity which lives in memory. They're a collection of named columns implemented as a dictionary. This mapping construct means that PyKX tables are column oriented. This makes analytic operations on columns much faster than for a relational database equivalent.
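The effect of column orientation can be sketched in plain Python. This is an analogy rather than PyKX code, assuming a table modelled as a dictionary of equal-length lists. The variable names are illustrative.

```python
# Column-oriented layout: one list per named column.
columns = {
    "col1": [1, 2, 3],
    "col2": [2, 3, 4],
    "col3": ["a", "b", "c"],
}

# Row-oriented layout of the same data, for comparison.
rows = [
    {"col1": 1, "col2": 2, "col3": "a"},
    {"col1": 2, "col2": 3, "col3": "b"},
    {"col1": 3, "col2": 4, "col3": "c"},
]

# A column aggregate reads a single list in the columnar layout...
col2_total = sum(columns["col2"])

# ...but has to visit every row record in the row-oriented layout.
col2_total_rows = sum(row["col2"] for row in rows)

print(col2_total, col2_total_rows)
```

Both layouts give the same answer. The columnar one touches only the data the aggregate needs.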

PyKX Tables come in many forms. In this section we demonstrate the two in-memory table types, pykx.Table and pykx.KeyedTable.

pykx.Table

Python

Copy
print(kx.Table([[1, 2, 'a'], [2, 3, 'b'], [3, 4, 'c']], columns = ['col1', 'col2', 'col3']))

Python

Copy
col1 col2 col3
--------------
1    2    a   
2    3    b   
3    4    c   

Python

Copy
print(kx.Table(data = {'col1': [1, 2, 3], 'col2': [2 , 3, 4], 'col3': ['a', 'b', 'c']}))

Python

Copy
col1 col2 col3
--------------
1    2    a   
2    3    b   
3    4    c   

Python

Copy
kx.Table([[1, 2, 'a'], [2, 3, 'b'], [3, 4, 'c']],
         columns = ['col1', 'col2', 'col3'])

Python

Copy
   col1  col2 col3
------------------
0     1     2    a
1     2     3    b
2     3     4    c

Python

Copy
kx.Table(data = {
         'col1': [1, 2, 3],
         'col2': [2 , 3, 4],
         'col3': ['a', 'b', 'c']})

Python

Copy
   col1  col2 col3
------------------
0     1     2    a
1     2     3    b
2     3     4    c

pykx.KeyedTable

Python

Copy
kx.Table(data = {'x': [1, 2, 3], 'x1': [2, 3, 4], 'x2': ['a', 'b', 'c']}
         ).set_index(['x'])

Python

Copy
    x1  x2
x
-----------
1    2    a
2    3    b
3    4    c

2.f Other data types

Below we outline some of the other important PyKX data types that you will run into through the rest of this notebook.

pykx.Lambda

A pykx.Lambda is the most basic kind of function within PyKX. Lambdas take between 0 and 8 parameters and are the building blocks for most analytics written by users when interacting with data from PyKX.

Python

Copy
pykx_lambda = kx.q('{x+y}')
type(pykx_lambda)

Python

Copy
pykx.wrappers.Lambda

Python

Copy
pykx_lambda(1, 2)

Python

Copy
pykx.LongAtom(pykx.q('3'))

pykx.Projection

Like functools.partial, functions in PyKX can have some of their parameters set in advance, resulting in a new function, which is called a projection. When you call this projection, the set parameters are no longer required and cannot be provided.

If the original function had n total parameters and m provided, the result would be a function (projection) that requires the user to input n-m parameters.

Python

Copy
projection = kx.q('{x+y}')(1)
projection

Python

Copy
pykx.Projection(pykx.q('{x+y}[1]'))

Python

Copy
projection(2)

Python

Copy
pykx.LongAtom(pykx.q('3'))
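For comparison, the same n-m behaviour can be sketched with functools.partial in plain Python. The add function below is a hypothetical Python counterpart of the q lambda {x+y}, introduced only for this illustration.

```python
from functools import partial


def add(x, y):
    """Two-parameter function, a Python counterpart of q's {x+y}."""
    return x + y


# Fix the first parameter (m = 1 of n = 2), leaving n - m = 1 to supply.
add_one = partial(add, 1)

print(add_one(2))  # 3
```

As with the projection, the caller supplies only the remaining parameter.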

3. Access and create PyKX objects

Now that you're familiar with the PyKX object types, let's see how to create and access them in real-world scenarios.

3.a Create PyKX objects from Pythonic data types

One of the most common ways to generate PyKX data is by converting equivalent Pythonic data types. PyKX natively supports conversions to and from the following common Python data formats:

  • Python
  • Numpy
  • Pandas
  • PyArrow

You can generate PyKX objects by using the kx.toq PyKX function:

Python

Copy
pydict = {'a': [1, 2, 3], 'b': ['a', 'b', 'c'], 'c': 2}
kx.toq(pydict)

Python

Copy
---------
a     123
b     abc
c       2

Python

Copy
nparray = np.array([1, 2, 3, 4], dtype = np.int32)
kx.toq(nparray)

Python

Copy
pykx.IntVector(pykx.q('1 2 3 4i'))

Python

Copy
pdframe = pd.DataFrame(data = {'a':[1, 2, 3], 'b': ['a', 'b', 'c']})
kx.toq(pdframe)

Python

Copy
    a    b
----------
0    1   a
1    2   b
2    3   c

3.b Random data generation

PyKX provides a module to create random data of user-specified PyKX types or their equivalent Python types. The creation of random data helps in prototyping analytics.

As a first example, generate a list of 1,000,000 random floating-point values between 0 and 1 as follows:

Python

Copy
kx.random.random(1000000, 1.0)

Python

Copy
pykx.FloatVector(pykx.q('0.3927524 0.5170911 0.5159796 0.4066642 0.1780839 0.3017723 0.785033 0.534709..'))

If you wish to choose values randomly from a list, use the list as the second argument to your function:

Python

Copy
kx.random.random(5, [kx.LongAtom(1), ['a', 'b', 'c'], np.array([1.1, 1.2, 1.3])])

Python

Copy
pykx.List(pykx.q('
1.1 1.2 1.3
1
1.1 1.2 1.3
1
`a`b`c
'))

Random data is not limited to one-dimensional forms. To create multi-dimensional PyKX Lists, pass a list of dimensions as the first argument. The following examples also use a PyKX trick: passing a null or infinity as the second argument generates random data across the full allowable range of that type:

Python

Copy
kx.random.random([2, 5], kx.GUIDAtom.null)

Python

Copy
pykx.List(pykx.q('
9b19ab9c-b26d-d6b3-a8fa-267ba0620848 d8d6c050-964e-6247-e2cd-bf9435389b9a 1c4..
a68f5b00-754e-9863-04aa-8b59cc4e3122 72969cc8-4445-451b-9266-7770a60c3120 0c7..
'))

Python

Copy
kx.random.random([2, 3, 4], kx.IntAtom.inf)

Python

Copy
pykx.List(pykx.q('
1837510540 373968399  35818431  1421474592  424239201  1727064393 250148680 1..
1566069007 1773121422 2104411811 1441846567 103906494  315107819  931560883  ..
'))

Finally, to make the generated data reproducible, set the seed for random data generation explicitly. You can do this globally or for individual function calls:

Python

Copy
kx.random.seed(10)
kx.random.random(10, 2.0)

Python

Copy
pykx.FloatVector(pykx.q('0.1782082 1.669039 0.7243899 1.999868 0.7675971 1.723838 0.1836728 0.5061767 ..'))

Python

Copy
kx.random.random(10, 2.0, seed = 10)

Python

Copy
pykx.FloatVector(pykx.q('0.1782082 1.669039 0.7243899 1.999868 0.7675971 1.723838 0.1836728 0.5061767 ..'))

3.c Run q code to generate data

PyKX is an entry point to the vector programming language q. This means that PyKX users can execute q code directly via PyKX within a Python session, by calling kx.q.

For example, to create q data, run the following command:

Python

Copy
kx.q('0 1 2 3 4')

Python

Copy
pykx.LongVector(pykx.q('0 1 2 3 4'))

Python

Copy
kx.q('([idx:desc til 5]col1:til 5;col2:5?1f;col3:5?`2)')

Python

Copy

    col1       col2  col3
idx            
4    0    0.8619188    ol
3    1    0.09183638   mg
2    2    0.2530883    cm
1    3    0.2504566    cc
0    4    0.7517286    jg

Next, apply arguments to a user-specified function x+y:

Python

Copy
kx.q('{x+y}', kx.LongAtom(1), kx.LongAtom(2))

Python

Copy
pykx.LongAtom(pykx.q('3'))

3.d Read data from a CSV file

A lot of data that you run into for data analysis tasks comes in the form of CSV files. PyKX, like Pandas, provides a CSV reader called via kx.q.read.csv. In the next cell we create a CSV that can be read in PyKX:

Python

Copy
import csv

with open('pykx.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    field = ["name", "age", "height", "country"]
    
    writer.writerow(field)
    writer.writerow(["Oladele Damilola", "40", "180.0", "Nigeria"])
    writer.writerow(["Alina Hricko", "23", "179.2", "Ukraine"])
    writer.writerow(["Isabel Walter", "50", "179.5", "United Kingdom"])

Python

Copy
kx.q.read.csv('pykx.csv', types = {'age': kx.LongAtom, 'country': kx.SymbolAtom})

Python

Copy
                  name    age    height           country
---------------------------------------------------------
0    "Oladele Damilola"    40      180e           Nigeria
1        "Alina Hricko"    23    179.2e           Ukraine
2       "Isabel Walter"    50    179.5e    United Kingdom

Python

Copy
import os
os.remove('pykx.csv')

3.e Query external processes via IPC

One of the most common usage patterns in organizations with access to data in kdb+/q is to query data from external server processes. The example below requires q to be installed.

First, start a q/kdb+ server on port 5000. You will populate it with a table tab once you have connected:

Python

Copy
import subprocess
import time

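try:
    # Spawn the q server inside the PyKXReimport context manager
    with kx.PyKXReimport():
        proc = subprocess.Popen(
            ('q', '-p', '5000')
        )
    time.sleep(2)  # give the server time to start listening
except Exception as err:
    raise kx.QError('Unable to create q process on port 5000') from err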

Once a q process is available, connect to it for synchronous query execution:

Python

Copy
conn = kx.SyncQConnection(port = 5000)

You can now run q commands against the q server:

Python

Copy
conn('tab:([]col1:100?`a`b`c;col2:100?1f;col3:100?0Ng)')
conn('select from tab where col1=`a')

Python

Copy
    col1      col2                                    col3
-----------------------------------------------------------
0    a    0.01974141    ddb87915-b672-2c32-a6cf-296061671e9d
1    a    0.5611439    580d8c87-e557-0db1-3a19-cb3a44d623b1
2    a    0.8685452    2d948578-e9d6-79a2-8207-9df7a71f0b3b
3    a    0.3460797    cddeceef-9ee9-3847-9172-3e3d7ab39b26
4    a    0.5046331    1c22a468-9492-2173-9e4f-9003a23d02b7
5    a    0.765905    5e9cd21b-88c5-bbf5-7215-6409e115a2a4
6    a    0.8794685    3462beab-42ee-ccad-989b-8d69f070dffc
7    a    0.02487862    bc150163-c551-0eba-8871-9767f5c0e3d5
8    a    0.3664924    dd6b4a2b-c046-e464-a0b9-efb96ed5f0eb
... ...        ...                                       ...
36    a    0.9929108    03a9b290-95c8-c3b8-fb9a-9ac9874763b8

37 rows × 3 columns

Alternatively, use the PyKX query API:

Python

Copy
conn.qsql.select('tab', where=['col1=`a', 'col2<0.3'])

Python

Copy
   col1         col2                                  col3
----------------------------------------------------------
0    a    0.01974141  ddb87915-b672-2c32-a6cf-296061671e9d
1    a    0.02487862  bc150163-c551-0eba-8871-9767f5c0e3d5
2    a    0.2073435   ee853957-d502-d30d-5945-bf8c97022332
3    a    0.2188574   d9a3e171-b1cf-0271-507a-0fba0b52e6ff
4    a    0.1451855   ea4d0269-375c-d73b-96f0-6bb6334ca423
5    a    0.1497004   1cce6bdd-e34b-ba4f-8c01-31d098d81221
6    a    0.166486    6417d4b3-3fc6-e35a-1c34-8c5c3327b1e8
7    a    0.2643322   f294c3cb-a6da-e15d-c8e0-3a848d2abf10
8    a    0.07841939  020715aa-8ffa-e1d3-9c68-3ad7919d4f5e
9    a    0.08077328  65b2f5b0-918c-b87b-4fc4-4aa24b192476

Or use PyKX's context interface to run SQL server side if you have access to it:

Python

Copy
conn('\\l s.k_')
conn.sql('SELECT * FROM tab where col2>=0.5')

Python

Copy
   col1         col2                                  col3
----------------------------------------------------------
0    a    0.5611439    580d8c87-e557-0db1-3a19-cb3a44d623b1
1    a    0.8685452    2d948578-e9d6-79a2-8207-9df7a71f0b3b
2    b    0.7716917    52cb20d9-f12c-9963-2829-3c64d8d8cb14
3    a    0.5046331    1c22a468-9492-2173-9e4f-9003a23d02b7
4    c    0.6014692    7ea4d431-4dec-3017-3d13-cc9ef7f1c0ee
5    c    0.5000071    782c5346-f5f7-b90e-c686-8d41fa85233b
6    c    0.8392881    245f5516-0cb8-391a-e1e5-fadddc8e54ba
7    b    0.5938637    e30bab29-2df0-3fb0-535f-58d1e7bd83c0
8    a    0.765905     5e9cd21b-88c5-bbf5-7215-6409e115a2a4
... ...         ...                                     ...
55   b    0.8236115    f2c41bca-67df-aa6c-4730-bca38cbd6825

56 rows × 3 columns

Finally, shut down the q server used for this demonstration:

Python

Copy
proc.kill()

4. Run analytics on PyKX objects

Like many Python libraries (including Numpy and Pandas), PyKX provides several ways to run analytics on its data, using both the methods built into the library and functions that you define yourself. The sections below explore each in turn.

4.a Use in-built methods on PyKX Vectors

When you interact with PyKX Vectors, you may wish to gain insights into these objects through the application of basic analytics such as calculation of the mean/median/mode of the vector:

Python

Copy
q_vector = kx.random.random(1000, 10.0)

Python

Copy
q_vector.mean()

Python

Copy
pykx.FloatAtom(pykx.q('4.984157'))

Python

Copy
q_vector.max()

Python

Copy
pykx.FloatAtom(pykx.q('9.998212'))

The above is useful for basic analysis. For bespoke analytics on these vectors, use the apply method:

Python

Copy
def bespoke_function(x, y):
    return x*y

q_vector.apply(bespoke_function, 5)

Python

Copy
pykx.FloatVector(pykx.q('31.74132 38.3376 46.40922 10.17963 38.73944 48.33864 41.12562 45.44382 32.290..'))

4.b Use in-built methods on PyKX Tables

In addition to the vector processing capabilities of PyKX, it's important to have the ability to manage tables. The Pandas-like API, described in depth in its own documentation, provides table methods that let you apply functions and gain insights into your data in a familiar way.

The example below uses combinations of the most used elements of this Table API operating on the following table:

Python

Copy
N = 1000000
example_table = kx.Table(data = {
    'sym' : kx.random.random(N, ['a', 'b', 'c']),
    'col1' : kx.random.random(N, 10.0),
    'col2' : kx.random.random(N, 20)
    }
)
example_table

Python

Copy
       sym        col1  col2
----------------------------
0        b    7.782944     6
1        c   0.5899977    17
2        c    2.580528     8
3        b    5.651351    10
4        b    2.336329    11
5        b     2.87167    17
6        c    9.705893     9
7        a    5.729889     8
8        c    1.482026    14
...    ...         ...   ...
999999   c    8.862285     6

1,000,000 rows × 3 columns

You can search for and filter data within your tables using loc, much as you would in Pandas:

Python

Copy
example_table.loc[example_table['sym'] == 'a']

Python

Copy
       sym        col1  col2
----------------------------
0        a    5.729889     8
1        a    4.396508    13
2        a    0.7636906   19
3        a    9.904306    17
4        a    1.439738    10
5        a    2.898631    19
6        a    2.360396     2
7        a    1.932728    12
8        a    4.877998     4
...    ...         ...   ...
332823   a    6.653308    18

332,824 rows × 3 columns

The same filtering works when you index the table directly, which calls the __getitem__ method:

Python

Copy
example_table[example_table['sym'] == 'b']

Python

Copy
       sym     col1   col2
--------------------------
   0     b  7.782944     6
   1     b  5.651351    10
   2     b  2.336329    11
   3     b   2.87167    17
   4     b  2.917054     2
   5     b  7.093562    18
   6     b  1.715391    10
   7     b  4.231884     0
   8     b  4.727296     2
 ...   ...       ...   ...
333014   b  9.361253    17

333,015 rows × 3 columns

Next, you can set the index columns of a table. In PyKX, this means converting the table from a pykx.Table object to a pykx.KeyedTable object:

Python

Copy
example_table.set_index('sym')

Python

Copy
          col1  col2
  sym
--------------------    
    b 7.782944     6
    c 0.5899977   17
    c 2.580528     8
    b 5.651351    10
    b 2.336329    11
    b  2.87167    17
    c 9.705893     9
    a 5.729889     8
    c 1.482026    14
  ...      ...   ...
    c 8.862285     6
    
1,000,000 rows × 3 columns

Or you can apply basic data manipulation operations such as mean and median:

Python

Copy
print('mean:')
display(example_table.mean(numeric_only = True))

print('median:')
display(example_table.median(numeric_only = True))

Python

Copy
mean:
        ----------------
        col1    4.998412
        col2    9.497452

median:
        ----------------
        col1    4.996685
        col2    9f

Next, use the groupby method to group PyKX tabular data so you can use it for analytic purposes.

In the first example, let's start by grouping the dataset based on the sym column and calculate the mean for each column based on their sym:

Python

Copy
example_table.groupby('sym').mean()

Python

Copy
         col1        col2
sym        
-------------------------
  a   5.00519     9.49375
  b  5.000742    9.501077
  c  4.989338    9.497527
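Conceptually, a grouped mean is a split step followed by a reduce step. The plain-Python sketch below shows that logic on a small stand-in dataset. The values are illustrative, not taken from example_table, and this is not how PyKX computes the result internally.

```python
from collections import defaultdict

# Small stand-in for the sym and col1 columns (illustrative values).
sym = ["a", "b", "a", "c", "b", "a"]
col1 = [1.0, 2.0, 3.0, 4.0, 6.0, 5.0]

# Step 1: split the values into one bucket per distinct sym.
groups = defaultdict(list)
for key, value in zip(sym, col1):
    groups[key].append(value)

# Step 2: reduce each bucket with the aggregate, here the mean.
group_means = {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}

print(group_means)  # {'a': 3.0, 'b': 4.0, 'c': 4.0}
```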

To extend the above groupby, consider a more complex example which uses numpy to run calculations on the PyKX data. You will notice later that you can simplify this specific use-case further.

Python

Copy
def apply_func(x):
    nparray = x.np()
    return np.sqrt(nparray).mean()

example_table.groupby('sym').apply(apply_func)

Python

Copy
             col1    col2
sym
-------------------------
a    2.109397    2.859095
b    2.108571    2.860037
c    2.105694    2.859527

For time-series-specific joins, use merge_asof. In this example, you have two tables with temporal information, namely trades and quotes:

Python

Copy
trades = kx.Table(data={
    "time": [
        pd.Timestamp("2016-05-25 13:30:00.023"),
        pd.Timestamp("2016-05-25 13:30:00.023"),
        pd.Timestamp("2016-05-25 13:30:00.030"),
        pd.Timestamp("2016-05-25 13:30:00.041"),
        pd.Timestamp("2016-05-25 13:30:00.048"),
        pd.Timestamp("2016-05-25 13:30:00.049"),
        pd.Timestamp("2016-05-25 13:30:00.072"),
        pd.Timestamp("2016-05-25 13:30:00.075")
    ],
    "ticker": [
       "GOOG",
       "MSFT",
       "MSFT",
       "MSFT",
       "GOOG",
       "AAPL",
       "GOOG",
       "MSFT"
   ],
   "bid": [720.50, 51.95, 51.97, 51.99, 720.50, 97.99, 720.50, 52.01],
   "ask": [720.93, 51.96, 51.98, 52.00, 720.93, 98.01, 720.88, 52.03]
})
quotes = kx.Table(data={
   "time": [
       pd.Timestamp("2016-05-25 13:30:00.023"),
       pd.Timestamp("2016-05-25 13:30:00.038"),
       pd.Timestamp("2016-05-25 13:30:00.048"),
       pd.Timestamp("2016-05-25 13:30:00.048"),
       pd.Timestamp("2016-05-25 13:30:00.048")
   ],
   "ticker": ["MSFT", "MSFT", "GOOG", "GOOG", "AAPL"],
   "price": [51.95, 51.95, 720.77, 720.92, 98.0],
   "quantity": [75, 155, 100, 100, 100]
})

print('trades:')
display(trades)
print('quotes:')
display(quotes)

Python

Copy
trades:
                                         time    ticker    bid        ask
            -------------------------------------------------------------
            0    2016.05.25D13:30:00.023000000    GOOG    720.5    720.93
            1    2016.05.25D13:30:00.023000000    MSFT    51.95    51.96
            2    2016.05.25D13:30:00.030000000    MSFT    51.97    51.98
            3    2016.05.25D13:30:00.041000000    MSFT    51.99    52f
            4    2016.05.25D13:30:00.048000000    GOOG    720.5    720.93
            5    2016.05.25D13:30:00.049000000    AAPL    97.99    98.01
            6    2016.05.25D13:30:00.072000000    GOOG    720.5    720.88
            7    2016.05.25D13:30:00.075000000    MSFT    52.01    52.03
quotes:
                                         time    ticker    price  quantity
            -------------------------------------------------------------
            0    2016.05.25D13:30:00.023000000    MSFT    51.95      75
            1    2016.05.25D13:30:00.038000000    MSFT    51.95      155
            2    2016.05.25D13:30:00.048000000    GOOG    720.77    100
            3    2016.05.25D13:30:00.048000000    GOOG    720.92    100
            4    2016.05.25D13:30:00.048000000    AAPL    98f       100

When applying the asof join, you can pass the suffixes argument to distinguish which table each overlapping column came from. In this case, the suffixes are _trades and _quotes:

Python

Copy
trades.merge_asof(quotes, on='time', suffixes=('_trades', '_quotes'))

Python

Copy
    time                             ticker_trades     bid     ask ticker_quotes  price quantity
-------------------------------------------------------------------------------------------------
  0 2016.05.25D13:30:00.023000000             GOOG   720.5  720.93          MSFT  51.95       75
  1 2016.05.25D13:30:00.023000000             MSFT   51.95   51.96          MSFT  51.95       75
  2 2016.05.25D13:30:00.030000000             MSFT   51.97   51.98          MSFT  51.95       75
  3 2016.05.25D13:30:00.041000000             MSFT   51.99     52f          MSFT  51.95      155
  4 2016.05.25D13:30:00.048000000             GOOG   720.5  720.93          AAPL    98f      100
  5 2016.05.25D13:30:00.049000000             AAPL   97.99   98.01          AAPL    98f      100
  6 2016.05.25D13:30:00.072000000             GOOG   720.5  720.88          AAPL    98f      100
  7 2016.05.25D13:30:00.075000000             MSFT   52.01   52.03          AAPL    98f      100
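The matching rule behind the result above can be sketched in plain Python: for each row on the left, take the last row on the right whose time is less than or equal to it. The sketch below is not the PyKX implementation. It uses millisecond offsets taken from the trades and quotes tables, and asof_index is a hypothetical helper name.

```python
from bisect import bisect_right

# Millisecond offsets after 13:30:00, taken from the trades and quotes tables.
trade_times = [23, 23, 30, 41, 48, 49, 72, 75]
quote_times = [23, 38, 48, 48, 48]  # must be sorted ascending
quote_quantity = [75, 155, 100, 100, 100]


def asof_index(times, t):
    """Index of the last entry in `times` that is <= t, or None if there is none."""
    pos = bisect_right(times, t)
    return pos - 1 if pos else None


matched = [asof_index(quote_times, t) for t in trade_times]
joined_quantity = [quote_quantity[i] if i is not None else None for i in matched]

print(matched)          # [0, 0, 0, 1, 4, 4, 4, 4]
print(joined_quantity)  # [75, 75, 75, 155, 100, 100, 100, 100]
```

The quantities match the quantity column of the joined table. Note that the join here is on time alone, which is why a GOOG row on the left can pick up an AAPL row from the right.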
      

4.c Use PyKX/q native functions

While the Pandas-like API and the methods provided by PyKX Vectors offer an effective way to apply analytics to PyKX data, the most efficient and performant way to run analytics on your data is to use the PyKX/q primitives available through the kx.q module.

These include functionality for calculating moving averages, asof/window joins, column reversal and more. Let's look at a few examples of how you can use these functions, grouped by category.

Mathematical functions

mavg

Calculate a series of average values across a list using a rolling window:

Python

Copy
kx.q.mavg(10, kx.random.random(10000, 2.0))

Python

Copy
pykx.FloatVector(pykx.q('1.469756 1.029263 0.7352848 0.5950915 0.7071875 0.8486546 0.910078 0.95322 1...'))
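To show what a rolling-window average computes, here is a minimal plain-Python sketch. It assumes that the first few results average over the shorter prefix that is available, which is the behaviour assumed here for q's mavg. The moving_average function is a hypothetical name and not part of PyKX.

```python
def moving_average(window, values):
    """Simple moving average over the last `window` items.

    Early results average the shorter prefix that is available.
    """
    result = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            running_sum -= values[i - window]  # drop the item leaving the window
        result.append(running_sum / min(i + 1, window))
    return result


print(moving_average(3, [1, 2, 3, 4, 5]))  # [1.0, 1.5, 2.0, 3.0, 4.0]
```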

cor

Calculate the correlation between two lists:

Python

Copy
kx.q.cor([1, 2, 3], [2, 3, 4])

Python

Copy
pykx.FloatAtom(pykx.q('1f'))

Python

Copy
kx.q.cor(kx.random.random(100, 1.0), kx.random.random(100, 1.0))

Python

Copy
pykx.FloatAtom(pykx.q('0.02687833'))
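For reference, the Pearson correlation coefficient can be written out in a few lines of plain Python. This sketch assumes the standard Pearson definition. The pearson function is a hypothetical name used only to reproduce the first result above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


print(pearson([1, 2, 3], [2, 3, 4]))  # 1.0
```

The two lists differ only by a constant shift, so they are perfectly correlated.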

prds

Calculate the cumulative product across a supplied list:

Python

Copy
kx.q.prds([1, 2, 3, 4, 5])

Python

Copy
pykx.LongVector(pykx.q('1 2 6 24 120'))
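The same running product can be reproduced with the Python standard library, which may help if you are more familiar with itertools. This is a plain-Python equivalent for comparison, not PyKX code.

```python
from itertools import accumulate
from operator import mul

# Running product: each element is the product of all inputs up to that point.
running_product = list(accumulate([1, 2, 3, 4, 5], mul))

print(running_product)  # [1, 2, 6, 24, 120]
```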

Iteration functions

each

Supplied both as a standalone primitive and as a method on PyKX Lambdas, each allows you to pass individual elements of a PyKX object to a function:

Python

Copy
kx.q.each(kx.q('{prd x}'), kx.random.random([5, 5], 10.0, seed=10))

Python

Copy
pykx.FloatVector(pykx.q('1033.597 377.1784 7126.713 418.3232 89.97531'))

Python

Copy
kx.q('{prd x}').each(kx.random.random([5, 5], 10.0, seed=10))

Python

Copy
pykx.FloatVector(pykx.q('1033.597 377.1784 7126.713 418.3232 89.97531'))
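In plain Python, the closest analogy to each is a list comprehension (or map) that applies a function to every element of a nested list. The sketch below uses illustrative rows rather than the random data above.

```python
from math import prod

rows = [[1, 2, 3], [4, 5], [6]]

# Apply `prod` to each row separately rather than to the nested list as a whole.
row_products = [prod(row) for row in rows]

print(row_products)  # [6, 20, 6]
```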

Table functions

meta

Retrieve metadata information about a table:

Python

Copy
qtab = kx.Table(data = {
    'x' : kx.random.random(1000, ['a', 'b', 'c']).grouped(),
    'y' : kx.random.random(1000, 1.0),
    'z' : kx.random.random(1000, kx.TimestampAtom.inf)
})

Python

Copy
kx.q.meta(qtab)

Python

Copy
     t  f  a
c
------------
x  "s"     g
y  "f"
z  "p"

xasc

Sort a table in ascending order by the values of a specified column:

Python

Copy
kx.q.xasc('z', qtab)

Python

Copy
            x          y                              z
-------------------------------------------------------
   0        c  0.2660419  2000.09.17D00:27:33.222932480
   1        b  0.2378591  2001.02.01D19:58:48.496586752
   2        c  0.05802967 2001.05.29D15:29:16.181340160
   3        c  0.9474748  2003.03.24D08:12:02.975653888
   4        b  0.02726729 2004.01.31D07:25:21.959215104
   5        b  0.08927731 2004.12.31D23:50:54.425055232
   6        c  0.2256163  2005.07.12D10:45:38.423119872
   7        b  0.1675316  2006.04.19D21:31:40.507750400
   8        b  0.8185412  2006.05.28D15:22:24.331161600
 ...      ...        ...                            ...
 999        a  0.4414727  2292.03.15D06:41:24.638662656

1,000 rows × 3 columns
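Sorting a column-oriented table by one column amounts to computing a row order from that column and applying it to every column. The plain-Python sketch below illustrates this on a small table with illustrative values. It is an analogy and not the PyKX or q implementation.

```python
# Column-oriented table with illustrative values (not taken from the notebook).
table = {
    "x": ["c", "b", "a", "b"],
    "z": [2003, 2000, 2292, 2001],
}

# Step 1: find the row order that sorts the chosen column ascending.
order = sorted(range(len(table["z"])), key=table["z"].__getitem__)

# Step 2: reorder every column with that same permutation so rows stay intact.
sorted_table = {name: [col[i] for i in order] for name, col in table.items()}

print(order)         # [1, 3, 0, 2]
print(sorted_table)  # {'x': ['b', 'b', 'c', 'a'], 'z': [2000, 2001, 2003, 2292]}
```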

You can find the full list of the functions and some examples of their usage here.