Finance - Realtime ML Stock Prediction

This page provides a tutorial on how you can use kdb Insights Enterprise to define and run a machine learning model to predict the future state of stock prices.

This builds on the work done in the first finance tutorial which showed how to develop trading strategies. You must complete that tutorial before proceeding with the following sections.

Note

This tutorial shows how you can train a Python-first ML model in the kdb Insights Enterprise platform rather than focusing on developing a particularly accurate model.

This tutorial illustrates the following beneficial features of using kdb Insights Enterprise:

Benefit

Description

  Python based analysis

Stage data for Python-based analysis to predict future values, such as Neural Network algorithms.

  ANSI SQL support

Execute using ANSI SQL queries which most Data Science and Machine Leaning tools are familiar.

  Multiple visualization options

Use the scratchpad to verify the training/test sets and predictions and then move to advanced visualization using Reports or Common BI tools like PowerBi or Tableau to show live data, more depth to the model and empower larger teams to visualize insights.

Run model

Query dataset

Query the trade data you created in the Build and backtest scalable trading strategies using real time and historical tick data tutorial, to get it in the form required for the model later on.

  1. Click on Packages in the left-hand menu to open the packages index. If the status of the equities package is Running proceed. If not then click on the three dots to the right and click Deploy.
  2. Click + on the ribbon menu and click Query.

  3. Click on the SQL tab of the Query and Load Data panel.
  4. To retrieve the open, high, low, close (OHLC) of events in the trade table, add this to the code editor:

    SQL

    Copy
    SELECT date_trunc('minute', timestamp) timestamp,
        FIRST(price) o, MAX(price) h,
        MIN(price) l, LAST(price) c
    FROM trade GROUP BY date_trunc('minute', timestamp)
  5. Define an output variable ohlc.

  6. Switch window at the bottom of the page from Console to Visual.
  7. Click Run Query to execute the query.

     

    A SQL query showing open, high, low, close values from the trade table.

    Information

    What does this show?

    This OHLC chart shows the open, high, low and closing prices, from the trade table for each period.

    OHLC charts are useful since they show the four major data points over a period, with the closing price being considered the most important by many traders.

Run the ML code

For processing the data, a paper written in 2020 used an Long Short-Term Memory algorithm to predict stock prices. Using an open source Git repo, a LTSM model written in Python can be run against the data, to create predictions, and visualize them in the Scratchpad.

  1. In Scratchpad panel, of the Query window, click on the Python tab. Add the following code to the scratchpad editor:

  2. In the Visual tab, switch the chart type for the Y-Axis Left Settings to Line.

  3. Click Run All.

    Plot of train, test and ohlc predictions dataset

    Information

    What is the model doing?

    Using standard machine learning libraries that are available in kdb Insights Enterprise the dataset is weighted, shaped and plotted.

    The model is first trained using 75% of the data, with the remaining 25% is used to test the model.

    Zooming in on the graph where the color changes, you can see where the test dataset begins. Against this test data the predictions follow the live data extremely closely.

    A SQL query showing open, high, low, close values from the trade table.

What's next