kdb+ Initial Import Quickstart

This page explains how to import an existing kdb+ database into kdb Insights Enterprise.

If you have an existing large kdb+ database, you should use initial import to add your data to a kdb Insights database. This is generally faster than using a batch ingest.

It is important to draw a clear distinction between an initial import and a batch ingest.

  • Initial Import: can only be used with a database that has never been initialized before and consequently does not contain any data.

  • Batch Ingest: can be used to add data to an already initialized and running database that may also contain data.

Before importing any data, ensure you have read the prerequisites to determine that your database is suitable for an initial import ingest.

Quickstart steps

Create temporary structure

First, create a temporary structure.

bash

Copy
TEMP=`mktemp -d`
export TEMP # root directory to store data and config for example
export DATA=$TEMP/data/ # directory where test data will be created
export SCHEMA=$TEMP/schema/ # directory where schema will be stored
export KX_PACKAGE_PATH=$TEMP/packages # directory where package will be stored
export KX_ARTIFACT_PATH=$TEMP/artifacts # directory where artifacts will be stored
export KX_DATABASE_NAME=new # name of kdb+ Insights database being created
export INSIGHTS_HOSTNAME=<your url> # kdb+ Insights URL
mkdir -pv $DATA $SCHEMA $KX_PACKAGE_PATH $KX_ARTIFACT_PATH

Note

Remember to set INSIGHTS_HOSTNAME appropriately.

  • If needed, create some sample data. The following sample creates a kdb+ database containing trade and quote tables.

q

Copy
dbroot:hsym`$getenv`DATA;
d:2021.09.01+til 20;
{[d;n]sv[`;.Q.par[dbroot;d;`trade],`]set .Q.en[dbroot;([]sym:`$'n?.Q.A;time:d+.z.N+til n;price:n?100f;size:n?50)];}[;10000]each d;
{[d;n]sv[`;.Q.par[dbroot;d;`quote],`]set .Q.en[dbroot;([]sym:`$'n?.Q.A;time:d+.z.N+til n;ask:n?100f;asize:n?50;bid:n?100f;bsize:n?50)];}[;10000]each d;

This creates the following structure.

bash

Copy
/tmp/test/.
├── db
│ ├── 2021.09.01
│ ├── 2021.09.02
│ ├── 2021.09.03
│ ├── 2021.09.04
│ ├── 2021.09.05
│ ├── 2021.09.06
│ ├── 2021.09.07
│ ├── 2021.09.08
│ ├── 2021.09.09
│ ├── 2021.09.10
│ ├── 2021.09.11
│ ├── 2021.09.12
│ ├── 2021.09.13
│ ├── 2021.09.14
│ ├── 2021.09.15
│ ├── 2021.09.16
│ ├── 2021.09.17
│ ├── 2021.09.18
│ ├── 2021.09.19
│ └── 2021.09.20
└── sym

Copy the data

  1. Next, copy the data using the following steps. Create a database tier(s) storage to store the data.

    Note

    You may need to change the storage class for the PVC depending on what is available on your cluster.

    YAML to create a PVC

    bash

    Copy
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: $KX_DATABASE_NAME-hdb
    spec:
      storageClassName: rook-cephfs
      accessModes:
      - ReadWriteMany
      resources:
        requests:
          storage: 20Gi
    EOF
  2. Create a pod, which mounts the local storage tier.

    bash

    Copy
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: $KX_DATABASE_NAME-copy-pod
    spec:
      securityContext:
        fsGroup: 65534
        fsGroupChangePolicy: OnRootMismatch
        runAsNonRoot: true
        runAsUser: 65534
      containers:
        - name: $KX_DATABASE_NAME-copy-container
          image: amazonlinux:2
          stdin: true
          command: [ "/bin/bash", "-c" ]
          args:
            - |
              yum install -y rsync
              mkdir /data/db/hdb/data/
              while true; do sleep 30; done;
          volumeMounts:
            - mountPath: /data/db/hdb
              name: $KX_DATABASE_NAME-storage
      volumes:
      - name: $KX_DATABASE_NAME-storage
        persistentVolumeClaim:
          claimName: $KX_DATABASE_NAME-hdb
    EOF
  3. Copy kdb+ data to the tiers created.

    bash

    Copy
    bash krsync.sh $DATA $KX_DATABASE_NAME-copy-pod /data/db/hdb/data/

Deploy the database

Now the data is imported, you can deploy your database.

  1. Create a kdb+ Insights schema for the database to be imported to.

    bash

    Copy
    q create-schema.q -db $DATA -fmt package -out $SCHEMA

    Edit the schema of the trade and quote tables to set the prtnCol.

  2. Create a kdb+ Insights database with the new database configured.

    bash

    Copy
    kxi package init $KX_PACKAGE_PATH/$KX_DATABASE_NAME
    kxi package add --to $KX_PACKAGE_PATH/$KX_DATABASE_NAME database --name $KX_DATABASE_NAME
    basename -s .yaml $SCHEMA/*.yaml | xargs -n 1 kxi package add --to $KX_PACKAGE_PATH/$KX_DATABASE_NAME table --name
    cp $SCHEMA/* $KX_PACKAGE_PATH/$KX_DATABASE_NAME/tables/

    Tip

    If a warning is displayed indicating that kxi package is deprecated, you can use kxi pm instead.

  3. Deploy the database

    bash

    Copy
    kxi package packit $KX_PACKAGE_PATH/$KX_DATABASE_NAME --version 1.0.0 --tag
    kxi package push $KX_DATABASE_NAME-1.0.0.kxi --hostname $INSIGHTS_HOSTNAME

    JSON

    Copy
    {
        "new": [
            {
                "version": "1.0.0",
                "_status": "installed",
                "message": "OK"
            }
        ]
    }

    bash

    Copy
    kxi package deploy $KX_DATABASE_NAME --hostname $INSIGHTS_HOSTNAME

    JSON

    Copy
    {
        "id": "8529c6c4-efd7-40bb-a7b1-d099ea0e24c0",
        "name": "new",
        "status": "DEPLOYED",
        "version": "1.0.0"
    }

  4. Inspect status API and logs to verify that the import worked.

    bash

    Copy
    kxi --realm insights --hostname $INSIGHTS_HOSTNAME auth login
    export INSIGHTS_TOKEN=`kxi auth print-token` && curl -L "https://${INSIGHTS_HOSTNAME}/servicegateway/api/v1/database/$KX_DATABASE_NAME/status" -H "Authorization: Bearer $INSIGHTS_TOKEN"

    JSON

    Copy
    {
      "state": "normal",
      "encryption": "encrypted",
      "progress": {
        "cmdCurrent": "",
        "cmdIndex": null,
        "cmdTotal": null,
        "subCurrent": "",
        "subIndex": null,
        "subTotal": null,
        "completeETA": ""
      },
      "memory": {
        "usage": 151252992,
        "limit": 2147483648
      }
    }

    bash

    Copy
    curl -X POST --header "Content-Type: application/json"\
        --header "Accepted: application/json"\
        --header "Authorization: Bearer $INSIGHTS_TOKEN"\
        --data "{\"table\":\"trade\",\"startTS\":\"2021-09-01T00:00:00.000000000\",\"endTS\":\"2024-11-12T13:27:00.000000000\",\"agg\":[[\"cnt\",\"count\",\"time\"]]}"\
        "https://${INSIGHTS_HOSTNAME}/servicegateway/kxi/getData"

    JSON

    Copy
    {
      "header": {
        "auditID": "e13b1e27-a82a-41e3-b6c9-cc08d4678367",
        "rcvTS": "2024-11-12T20:00:15.588000000",
        "corr": "aff5c6a1-ac83-4895-9df5-221b66c6e64d",
        "logCorr": "e13b1e27-a82a-41e3-b6c9-cc08d4678367",
        "http": "json",
        "api": ".kxi.getData",
        "agg": ":10.6.142.183:5070",
        "refVintage": -9223372036854776000,
        "rc": 0,
        "ac": 0,
        "ai": "",
        "limitApplied": false
      },
      "payload": [
        {
          "cnt": 200000
        }
      ]
    }

Next steps

Now you have tried this process with the quickstart, you can attempt following the initial import process. If you experience any issues, refer to our troubleshooting page for help.