How To Delete Data in KDB.AI
This page explains how to delete data from your KDB.AI tables.
Because KDB.AI is optimized for append-only streaming workloads, delete operations behave differently from traditional databases. They can be slow on large datasets and should be planned carefully. This page covers supported table types, performance tips, safe deletion practices, and rollback mechanisms. Follow the recommendations here to minimize risk and ensure correct behavior during deletions.
Supported table types
KDB.AI supports the delete operation only on specific table index configurations. Before attempting a delete, confirm that your table meets one of the following criteria:
-
It has no index
-
It uses a flat index
-
It uses a qFlat index
-
It uses multi-indexes that include only flat or qFlat indexes
KDB.AI does not currently support deletions on tables with any other index types.
Key considerations before deleting data
1. Performance
Delete operations in KDB.AI are inherently slow - small deletions may complete in seconds, but larger datasets can take several hours to process.
This is because KDB.AI is built on a system-of-record model optimized for continuously appending data rather than modifying or removing it. While this design benefits high-throughput, real-time workloads, it makes deletions a heavier operation that requires more processing time.
2. Timing
For best performance and to minimize impact on other operations, schedule delete operations at the end of the day or during maintenance windows when no other tasks are active.
3. Data safety
The system includes safety mechanisms to handle data consistency during delete operations.
However, we strongly recommend backing up all critical data before performing a delete, as unforeseen corruption or system failures can lead to irreversible data loss.
4. Use of filters
Always apply filters when executing a delete operation to precisely target the data you intend to remove.
If no filter is specified, the operation will delete all data from the table without any confirmation, which can lead to unintended consequences.
5. Disk and memory requirements
Disk usage
The delete operation creates a backup of the data before deletion. As a result, it temporarily consumes additional disk space equal to the size of the table plus size of index being deleted. Ensure sufficient disk space is available to accommodate this backup during the operation.
Memory usage
The peak memory usage during deletion is determined by the largest column in the table. Ensure that the system has enough available memory to handle this maximum load efficiently.
Best practices
-
Preview your delete filter on a subset of data to ensure correctness.
-
Log all delete operations for audit purposes.
-
Test deletes in non-production environments before applying them in live systems.
Backup and rollback behavior
The delete operation creates a backup of the target data in the mounted data directory before attempting deletion. This backup is automatically cleaned up upon a successful delete.
In the case of a delete failure, the operation will:
-
Rollback by restoring data from the backup.
-
Retain the backup directory to ensure data is available for verification and recovery.
This behavior is intentional to safeguard against potential data corruption.
Action Required
After a delete failure, make sure you:
-
Verify the restored data integrity.
-
If data corruption is detected, manually recover it from the backup.
-
Delete the backup directory manually to reclaim disk space.
Delete rows
The examples below show how to delete one or more rows based on a condition, using the KDB.AI q, Ptyhon, and REST APIs.
For instance, let's delete rows where date = 2025.05.14
and sym = "Vod.l"
from the trade
table in the default
database:
Python
# Delete rows from the 'trade' table where 'date' is '2025.05.14' and 'sym' is 'Vod.l'
table.deleteData(
filter=[
['=', 'date', '2025.05.14'],
['=', 'sym', 'Vod.l']
]
)
q
// Delete rows from 'trade' in 'default' database where date is 2025.05.14 and sym is "Vod.l"
gw(
`deleteData;
args: `database`table`filter!(
`default;
`trade;
(("=";"date";"2025.05.14"); ("=";"sym";"Vod.l"))
)
REST
POST /api/v2/databases/default/tables/trade/data/delete
Content-Type: application/json
{
"filter": [
["=", "date", "2025.05.14"],
["=", "sym", "Vod.l"]
]
}