How to Query Data Using qSQL
This page provides a quick guide to querying and transforming tables using qSQL, a SQL-like syntax designed for q. It covers core operators, query structure, and key techniques for filtering, grouping, and sorting data.
Use qSQL to query data in a table using a syntax similar to conventional SQL. While qSQL differs from native q syntax, it provides a more expressive approach for working with ordered data.
The following qSQL operators enable efficient querying and manipulation of tables:
delete
deletes rows or columns from a tableexec
returns columns from a table, possibly with new columnsselect
returns part of a table, possibly with new columnsupdate
adds rows or columns to a table
Overview of topics
1. How to structure qSQL queries
Follow this template syntax to structure qSQL queries. Optional elements appear in square brackets; a slash (/
) introduces a trailing comment:
select
[Limit expression
] [Select phrase
] [byBy phrase
] fromFrom phrase
[whereWhere phrase
]exec
[Limit expression
] [Select phrase
] [byBy phrase
] fromFrom phrase
[whereWhere phrase
]update
Select phrase
[byBy phrase
] fromFrom phrase
[whereWhere phrase
]delete
fromFrom phrase
[whereWhere phrase
] / rowsdelete
Select phrase
fromFrom phrase
/ columns
A template is evaluated in the following order.
from
phrase
The from
phrase is required in all query templates.
Use the following table expression:
- A table or dictionary (call-by-value)
- The name of a table or dictionary, in memory or on disk, as a symbol atom (call-by-name)
Examples:
update c:b*2 from ([]a:1 2;b:3 4) / call by value
select a,b from t / call by value
select a,b from `t / call by name
update c:b*2 from `:path/to/db / call by name
where
phrase
The where
phrase with a boolean list selects records:
q
q)t:([]c1:`a`b`c;c2:10 20 30;c3:1.1 2.2 3.3)
q)select from t where 101b
c1 c2 c3
---------
a 10 1.1
c 30 3.3
Subphrases specify successive filters. The following examples return the same result, but have different performance characteristics.
q
q)select from t where c2>15,c3<3.0
c1 c2 c3
---------
b 20 2.2
q)select from t where (c2>15) and c3<3.0
c1 c2 c3
---------
b 20 2.2
In the first example, the query filters c2
values greater than 15
first, then tests the corresponding c3
values.
In the second example, the query compares all c2
values to 15
and all c3
values to 3.0
, then combines the two result vectors with a logical AND
.
Efficient where
phrases start with their most stringent tests.
Query a partitioned table
When querying a partitioned table, the first where
subphrase should select from the value/s used to partition the table.
Otherwise, kdb+ (attempts to) load into memory all partitions for the column/s in the first subphrase.
Filter on groups
Use fby
to filter grouped results within a single qSQL query.
Instead of nesting two select … by …
queries like this:
q
select … from select … by … from t
Simplify the logic by collapsing the expression into a single query using fby
:
q
select … by … from … where … fby …
Use fby
when you find yourself trying to apply a filter to the aggregated column of a table produced by select … by …
.
Phrases and subphrases
The select
phrase, by
phrase, and where
phrase are
respectively the select
, by
, and where
phrases.
Each phrase is a comma-separated list of subphrases.
A subphrase is a q expression in which names are resolved with respect to the from
phrase and any table/s linked by foreign keys. Subphrases are evaluated in order from the left, but each subphrase expression is evaluated right-to-left in normal q syntax.
To apply the join
operator within a subphrase, wrap the entire subphrase in parentheses.
q
q)tbl:([]id:`a`b`c;val:10 20 30;price:1.1 2.2 3.3)
q)select (id,'4),val from tbl
x val
--------
`a 4 10
`b 4 20
`c 4 30
Names in subphrases
When qSQL evaluates a name inside a subphrase, it resolves it in the following order:
- Column or key name
- Local name in (or argument of) the encapsulating function
- Global name in the current working namespace – not necessarily the space in which the function was defined
Dot notation allows you to refer to foreign keys. The following example uses the suppliers
and parts
database (sp.q
) script.
q
q)\l sp.q
// s
+`p`city!(`p$`p1`p2`p3`p4`p5`p6`p1`p2;`london`london`london`london`london`lon..
// p
(`s#+(,`color)!,`s#`blue`green`red)!+(,`qty)!,900 1000 1200
// sp
+`s`p`qty!(`s$`s1`s1`s1`s2`s3`s4;`p$`p1`p4`p6`p2`p2`p4;300 200 100 400 200 300)
q)select sname:s.name, qty from sp
sname qty
---------
smith 300
smith 200
smith 400
smith 200
clark 100
smith 100
jones 300
jones 400
blake 200
clark 200
clark 300
smith 400
When compiling functions, the implicit args x
, y
, z
are visible to the compiler only when they are not inside the select
, by
, and where
phrases. You can observe that the table expression is not masked by evaluating the value
of the function and inspecting its second item: the arguments (args
).
q
q)args:{(value x)1}
q)args{} / no explicit args, so x is a default implicit arg of identity (::)
,`x
q)/from phrase is not masked, y is detected as an implicit arg here
q)args{select from y where a=x,b=z}
`x`y
q)args{[x;y;z]select from y where a=x,b=z} / x,y,z are now explicit args
`x`y`z
q)/call with wrong number of args results in rank error
q){select from ([]a:0 1;b:2 3) where a=x,b=y}[0;2]
'rank
[0] {select from ([]a:0 1;b:2 3) where a=x,b=y}[0;2]
^
q)/works with explicit args
q){[x;y]select from ([]a:0 1;b:2 3) where a=x,b=y}[0;2]
a b
---
0 2
You can refer explicitly to namespaces to break the subphrase resolution order listed above. For example, if you have a function with the same name as a column inside your table:
q
t:([]f:til 4)
f:2*
select f f from t // uses column f to order itself f[f]
f
-
0
1
2
3
select (`. `f) f from t // calls 2* on column f
f
-
0
2
4
6
Duplicate names for columns or groups
select
auto-aliases colliding duplicate column names for either select az,a from t
, or select a by c,c from t
, but not for select a,a by a from t
.
Such a collision throws a 'dup names for cols/groups a
error during parse, indicating the first column name which collides.
(Since V4.0 2020.03.17.)
q
q)parse"select b by b from t"
'dup names for cols/groups b
[2] select b by b from t
^
The easiest way to resolve this conflict is to explicitly rename columns. For example select a,b by c:a from t
.
Computed columns
In a subphrase, a q expression computes a new column or key, and a colon names it.
q
q)t:([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3)
q)select c1, c3*2 from t
c1 c3
------
a 2.2
b 4.4
c 6.6
q)select c1, dbl:c3*2 from t
c1 dbl
------
a 2.2
b 4.4
c 6.6
In the context of a query, the colon names a result column or key. It does not assign a variable in the workspace.
If a computed column or key is not named, q names it if possible as the leftmost term in the column expression, else as x
. If a computed name is already in use, q suffixes it with 1
, 2
, and so on as needed to make it unique.
q
q)t:([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3)
q)select c1, c1, 2*c2, c2+c3, string c3 from t
c1 c11 x c2 c3
--------------------
a a 20 11.1 "1.1"
b b 40 22.2 "2.2"
c c 60 33.3 "3.3"
Limit expressions
Limit expressions restrict the results returned by select
or exec
.
(For exec
there is only one: distinct
).
They are described in select
and exec
.
Result and side effects
A select
query returns a table or dictionary.
An exec
query returns a list of column values or a dictionary.
An update
or delete
query behaves differently depending on how the table is referenced:
-
Call-by-value: The query returns a modified copy of the table or dictionary.
-
Call-by-name: The query modifies the table or dictionary in place (in memory or on disk) and returns its name as the result.
q
q)t1:t2:([]a:1 2;b:3 4)
q)update a:neg a from t1
a b
----
-1 3
-2 4
q)t1~t2 / t1 unchanged
1b
q)update a:neg a from `t1
`t1
q)t1~t2 / t1 changed
0b
2. Virtual column i
A virtual column i
represents the index of each record, for example, the row number.
Because it is implicit in every table, it never appears as a column or key name in the result.
q
q)t:([]a:1 2 3;b:4 5 6)
q)t
a b
---
1 4
2 5
3 6
q)select i,b from t
x b
---
0 4
1 5
2 6
q)select from t where i in 0 2
a b
---
1 4
3 6
q)delete from t where i>1
a b
---
1 4
2 5
Virtual column in partitioned tables
In a partitioned table i
is the index (row number) relative to the partition, not the whole table.
q
q)`:/tmp/db/2015.01.01/t/ set ([] ti:09:30:00 09:31:00; p:101 102f)
`:/tmp/db/2015.01.01/t/
q)`:/tmp/db/2015.01.02/t/ set ([] ti:09:30:00 09:31:00; p:101.5 102.5)
`:/tmp/db/2015.01.02/t/
q)\l /tmp/db
q)select i,ti,p from t
x ti p
----------------
0 09:30:00 101
1 09:31:00 102
0 09:30:00 101.5
1 09:31:00 102.5
3. Aggregates
In SQL:
SQL
SELECT stock, SUM(amount) AS total FROM trade GROUP BY stock
In q:
q
q)trade:([]stock:`bac`bac`ibm`usb;amt:10 10 20 30;price:1.2 1.1 2.2 3.3)
q)select total:sum amt by stock from trade
stock| total
-----| -----
bac | 20
ibm | 20
usb | 30
The column stock
is a key in the result table.
Refer to Mathematics
for more aggregate functions.
4. Sort
Unlike traditional SQL, qSQL query templates do not include syntax for specifying sort order. If you need to sort query results, use xasc
and xdesc
instead.
q's sort operations are stable, meaning they preserve the order of equal elements. This allows you to combine multiple sorts to achieve mixed-order sorting.
The following example uses the suppliers and parts database (sp.q
) script.
q
q)\l sp.q
q)sp
s p qty
---------
s1 p1 300
s1 p2 200
s1 p3 400
s1 p4 200
s4 p5 100
s1 p6 100
s2 p1 300
s2 p2 400
s3 p2 200
s4 p2 200
s4 p4 300
s1 p5 400
q)`p xasc `qty xdesc select from sp where p in `p2`p4`p5
s p qty
---------
s2 p2 400
s1 p2 200
s3 p2 200
s4 p2 200
s4 p4 300
s1 p4 200
s1 p5 400
s4 p5 100
5. Optimize performance
Use the following practices to improve query performance in qSQL:
- Select only the columns you will use.
- Apply the most restrictive constraint first.
- Use a suitable attribute on the first non-virtual constraint. For example
`p
(parted) or`g
(grouped) onsym
. - Constraints should have the unmodified column name on the left of the constraint operator. For example
where sym in syms,…
. - When aggregating, use the virtual field first in the
by
phrase. For example,select .. by date,sym from …
.
6. Use multithreading
When secondary threads enabled, the following pattern will make use of the threads via peach
for both in-memory and on-disk tables.
q
select … by sym, … from t where sym in …, …
when sym
has a `g
or `p
attribute. (Since V3.2 2014.05.02)
This is also applicable for partitioned databases as:
q
select … by sym, … from t where date …, sym in …, …
Furthermore, multithreaded primitives will also use of secondaries in select clauses where appropriate. (Since V4.0 2020.03.17)
7. Special functions
The following functions receive special treatment within select
:
avg | first | prd | cor |
last | sum | count | max |
var | cov | med | wavg |
dev | min | wsum | distinct |
When you call these functions explicitly, q performs additional steps automatically—such as enlisting results or aggregating across partitions.
However, when you wrap them inside another function, q cannot detect the special context and skips those steps. In such cases, you must insert the necessary behavior manually.
q
q)select sum a from ([]a:1 2 3)
a
-
6
q)select {(),sum x}a from ([]a:1 2 3)
a
-
6
8. Conditional evaluation support
qSQL expressions do not directly support Cond.
q
q)u:([]a:raze ("ref/";"kb/"),\:/:"abc"; b:til 6)
q)select from u where a like $[1b;"ref/*";"kb/*"]
'rank
[0] select from u where a like $[1b;"ref/*";"kb/*"]
^
To enable, enclose in a lambda:
q
q)u:([]a:raze ("ref/";"kb/"),\:/:"abc"; b:til 6)
q)select from u where a like {$[x;"ref/*";"kb/*"]}1b
a b
---------
"ref/a" 0
"ref/b" 2
"ref/c" 4
or use the Vector Conditional instead.
q
q)u:([]a:raze ("ref/";"kb/"),\:/:"abc"; b:til 6)
q)select from u where a like ?[1b;"ref/*";"kb/*"]
a b
---------
"ref/a" 0
"ref/b" 2
"ref/c" 4
9. Integrate with functions
You can use standard q functions (lambdas) inside qSQL queries to transform values or filter results. This includes both inline transformations and parameterized queries.
Use a function inside a qSQL Query
Define any suitable function (lambda) and reference it inside a select statement. The lambda operates on column values row by row.
q
q)trade:([]time:09:04:59.000 09:04:59.001 09:04:59.050 09:04:59.060;stock:`bac`bac`ibm`usb;amount:110 10 20 30;price:1.2 1.1 2.2 3.3)
q)f:{[x] x+42}
q)select stock, f amount from trade
stock amount
------------
bac 152
bac 52
ibm 62
usb 72
Run parameterized queries
Evaluate query template expressions in a function (lambda).
q
q)trade:([]time:09:04:59.000 09:04:59.001 09:04:59.050 09:04:59.060;stock:`bac`bac`ibm`usb;amount:110 10 20 30;price:1.2 1.1 2.2 3.3)
q)myquery:{[tbl; amt] select stock, time from tbl where amount > amt}
q)myquery[trade; 100]
stock time
------------------
bac 09:04:59.000
Note
qSQL does not allow column names as parameters. To parameterize columns, use functional qSQL instead.
10. Functional qSQL
The interpreter translates the query templates into functional qSQL for evaluation. The functional forms are more general, and some complex queries require their use. But the query templates are powerful, readable, and there is no performance penalty for using them.
Wherever possible, prefer the query templates to functional forms.
The functional forms of delete
, exec
, select
and update
are particularly useful for programmatically-generated queries, such as when column names are dynamically produced.
Functional form is an alternative to using a qSQL template to construct a query. For example, the following are equivalent:
q
q)select n from t
q)?[t;();0b;(enlist `n)!enlist `n]
The q interpreter parses delete
, exec
, select
, and update
into their equivalent functional forms, so there is no performance difference.
The functional forms are:
![t;c;b;a] /update and delete
?[t;i;p] /simple exec
?[t;c;b;a] /select or exec
?[t;c;b;a;n] /select up to n records
?[t;c;b;a;n;(g;cn)] /select up to n records sorted by g on cn
where:
t
is a table, or the name of a table as a symbol atom.c
is thewhere
phrase, a list of constraints. Every constraint inc
is a parse tree representing an expression to be evaluated; the result of each being a boolean vector. The parse tree consists of a function followed by a list of its arguments, each an expression containing column names and other variables. Represented by symbols, it distinguishes actual symbol constants by enlisting them. The function is applied to the arguments, producing a boolean vector that selects the rows. The selection is performed in the order of the items inc
, from left to right: only rows selected by one constraint are evaluated by the next.
b
is theby
phrase. The domain of dictionaryb
is a list of symbols that are the key names for the grouping. Its range is a list of column expressions (parse trees) whose results are used to construct the groups. The grouping is ordered by the domain items, from major to minor.b
is one of:- the general empty list
()
- boolean atom:
0b
for no grouping;1b
for distinct - a symbol atom or list naming table column/s
- a dictionary of group-by specifications
- the general empty list
a
is theselect
phrase. The domain of dictionarya
is a list of symbols containing the names of the produced columns. qSQL query templates assign default column names in the result, but here each result column must be named explicitly. Each item of its range is an evaluation list consisting of a function and its argument(s), each of which is a column name or another such result list. For each evaluation list, the function is applied to the specified value(s) for each row and the result is returned. The evaluation lists are resolved recursively when operations are nested.a
is one of :- The general empty list
()
- A symbol atom: the name of a table column
- A parse tree
- A dictionary of select specifications (aggregations)
- The general empty list
i
is a list of indexesp
is a parse treen
is a non-negative integer or infinity, indicating the maximum number of records to be returnedg
is a unary grade function
Call by name
Columns in a
, b
, and c
appear as symbols.
To distinguish symbol atoms and vectors from columns, enlist them.
q
q)t:([] c1:`a`b`a`c`a`b`c; c2:10*1+til 7; c3:1.1*1+til 7)
q)select from t where c2>35,c1 in `b`c
c1 c2 c3
---------
c 40 4.4
b 60 6.6
c 70 7.7
q)?[t; ((>;`c2;35);(in;`c1;enlist[`b`c])); 0b; ()]
c1 c2 c3
---------
c 40 4.4
b 60 6.6
c 70 7.7
Note above that:
- The columns
c1
andc2
appear as symbol atoms - The symbol vector
`b`c
appears asenlist[`b`c]
Different types of a
and b
return different types of result for select
and exec
.
| b
a | bool () sym/s dict
-----------|----------------------------------------
() | table dict - keyed table
sym | - vector dict dict
parse tree | - vector dict dict
dict | table vector/s table table
Use enlist
to create singletons to ensure appropriate entities are lists.
?
Select
?[t;c;b;a]
For select
, the arguments t
, c
, b
, and a
are as above, and returns a table.
q
q)show t:([]n:`x`y`x`z`z`y;p:0 15 12 20 25 14)
n p
----
x 0
y 15
x 12
z 20
z 25
y 14
q)select m:max p,s:sum p by name:n from t where p>0,n in `x`y
name| m s
----| -----
x | 12 12
y | 15 29
Following is the equivalent functional form. Note the use of enlist
to create singletons, ensuring that appropriate entities are lists.
q
q)c: ((>;`p;0);(in;`n;enlist `x`y))
q)b: (enlist `name)!enlist `n
q)a: `m`s!((max;`p);(sum;`p))
q)?[t;c;b;a]
name| m s
----| -----
x | 12 12
y | 15 29
- For no constraints, make
c
the empty list - For no grouping make
b
a boolean0b
- For distinct rows make
b
a boolean1b
- To produce all columns of
t
in the result, makea
the empty list()
select from t
is equivalent to functional form ?[t;();0b;()]
.
Select distinct
For special case select distinct
, specify b
as 1b
.
q
q)t:([] c1:`a`b`a`c`b`c; c2:1 1 1 2 2 2; c3:10 20 30 40 50 60)
q)?[t;(); 1b; `c1`c2!`c1`c2] / select distinct c1,c2 from t
c1 c2
-----
a 1
b 1
c 2
b 2
Rank 5
Limit result rows
?[t;c;b;a;n]
Returns as for rank 4, but where n
is:
- An integer or infinity, only the first
n
rows, or the last ifn
is negative - A pair of non-negative integers, up to
n[1]
rows starting with rown[0]
q
q)show t:([] c1:`a`b`c`a; c2:10 20 30 40)
c1 c2
-----
a 10
b 20
c 30
a 40
q)?[t;();0b;();-2] / select[-2] from t
c1 c2
-----
c 30
a 40
q)?[t;();0b;();1 2] / select[1 2] from t
c1 c2
-----
b 20
c 30
Rank 6
Limit result rows and sort by a column
?[t;c;b;a;n;(g;cn)]
Returns as for rank 5, but where
g
is a unary grading functioncn
is a column name as a symbol atom
sorted by g
on column cn
.
q
q)?[t; (); 0b; `c1`c2!`c1`c2; 0W; (idesc;`c1)]
c1 c2
-----
c 30
b 20
a 10
a 40
?
Exec
exec
is a simplified form of select
that returns a list or dictionary rather than a table.
?[t;c;b;a]
The constraint specification c
(where
) is as for select
.
q
q)show t:([] c1:`a`b`c`c`a`a; c2:10 20 30 30 40 40;
c3: 1.1 2.2 3.3 3.3 4.4 3.14159; c4:`cow`sheep`cat`dog`cow`dog)
c1 c2 c3 c4
-------------------
a 10 1.1 cow
b 20 2.2 sheep
c 30 3.3 cat
c 30 3.3 dog
a 40 4.4 cow
a 40 3.14159 dog
No grouping
b
is the general empty list.
b a result
--------------------------------------------------------------
() () the last row of t as a dictionary
() sym the value of that column
() dict a dictionary with keys and values as specified by a
q
q)?[t; (); (); ()] / exec last c1,last c2,last c3 from t
c1| `a
c2| 40
c3| 3.14159
c4| `dog
q)?[t; (); (); `c1] / exec c1 from t
`a`b`c`c`a`a
q)?[t; (); (); `one`two!`c1`c2] / exec one:c1,two:c2 from t
one| a b c c a a
two| 10 20 30 30 40 40
q)?[t; (); (); `one`two!(`c1;(sum;`c2))] / exec one:c1,two:sum c2 from t
one| `a`b`c`c`a`a
two| 170
Group by column
b
is a column name. The result is a dictionary.
Where a
is a column name, in the result:
- The keys are distinct values of the column named in
b
- The values are lists of corresponding values from the column named in
a
q
q)?[t; (); `c1; `c2] / exec c2 by c1 from t
a| 10 40 40
b| ,20
c| 30 30
Where a
is a dictionary, in the result:
- The key is a table with a single anonymous column containing distinct values of the column named in
b
- The value is a table with columns as defined in
a
q
q)?[t; (); `c1; enlist[`c2]!enlist`c2] / exec c2:c2 by c1 from t
| c2
-| --------
a| 10 40 40
b| ,20
c| 30 30
q)?[t; (); `c1; `two`three!`c2`c3] / exec two:c2,three:c3 by c1 from t
| two three
-| ------------------------
a| 10 40 40 1.1 4.4 3.14159
b| ,20 ,2.2
c| 30 30 3.3 3.3
q)?[t;();`c1;`m2`s3!((max;`c2);(sum;`c3))] / exec m2:max c2,s3:sum c3 by c1 from t
| m2 s3
-| -----------
a| 40 8.64159
b| 20 2.2
c| 30 6.6
Group by columns
b
is a list of column names. Where a
is a column name, returns a dictionary in which:
- The key is the empty symbol
- The value is the value of the column/s specified in
a
q
q)?[t; (); `c1`c2; `c3]
| 1.1 2.2 3.3 3.3 4.4 3.14159
q)?[t; (); `c1`c2; `c3`c4!((max;`c3);(last;`c4))]
| c3 c4
| -------
| 4.4 dog
Group by a dictionary
b
is a dictionary. Result is a dictionary in which the key is a table with columns as specified by b
and
b a result value
-----------------------------------------------------
dict () last records of table that match each key
dict sym corresponding values from the column in a
dict dict values as defined in a
q
q)?[t; (); `one`two!`c1`c2; ()]
one two| c1 c2 c3 c4
-------| -------------------
a 10 | a 10 1.1 cow
a 40 | a 40 3.14159 dog
b 20 | b 20 2.2 sheep
c 30 | c 30 3.3 dog
q)/ exec last c1,last c2,last c3,last c4 by one:c1,two:c2 from t
q)?[t; (); enlist[`one]!enlist(string;`c1); ()]
one | c1 c2 c3 c4
----| -------------------
,"a"| a 40 3.14159 dog
,"b"| b 20 2.2 sheep
,"c"| c 30 3.3 dog
q)/ exec last c1,last c2,last c3,last c4 by one:string c1 from t
q)?[t; (); enlist[`one]!enlist `c1; `c2] / exec c2 by one:c1 from t
one|
---| --------
a | 10 40 40
b | ,20
c | 30 30
q)?[t; (); `one`four!`c1`c4; `m2`s3!((max;`c2);(sum;`c3))]
one four | m2 s3
---------| ----------
a cow | 40 5.5
a dog | 40 3.14159
b sheep| 20 2.2
c cat | 30 3.3
c dog | 30 3.3
?
Simple Exec
?[t;i;p]
Where t
is not partitioned, another form of exec
.
q
q)show t:([]a:1 2 3;b:4 5 6;c:7 9 0)
a b c
-----
1 4 7
2 5 9
3 6 0
q)?[t;0 1 2;`a]
1 2 3
q)?[t;0 1 2;`b]
4 5 6
q)?[t;0 1 2;(last;`a)]
3
q)?[t;0 1;(last;`a)]
2
q)?[t;0 1 2;(*;(min;`a);(avg;`c))]
5.333333
!
Update
![t;c;b;a]
update
arguments t
, c
, b
, and a
are as for select
.
q
q)show t:([]a:1 2 3;b:4 5 6;c:7 9 0)
a b c
-----
1 4 7
2 5 9
3 6 0
q)?[t;0 1 2;`a]
1 2 3
q)?[t;0 1 2;`b]
4 5 6
q)?[t;0 1 2;(last;`a)]
3
q)?[t;0 1;(last;`a)]
2
q)?[t;0 1 2;(*;(min;`a);(avg;`c))]
5.333333
The degenerate cases are the same as in select
.
!
Delete
delete
is a simplified form of Update
![t;c;0b;a]
One of c
or a
must be empty, the other not. c
selects which rows will be removed. a
is a symbol vector with the names of columns to be removed.
q
q)t:([]c1:`a`b`c;c2:`x`y`z)
q)/following is: delete c2 from t
q)![t;();0b;enlist `c2]
c1
--
a
b
c
q)/following is: delete from t where c2 = `y
q)![t;enlist (=;`c2; enlist `y);0b;`symbol$()]
c1 c2
-----
a x
c z
Convert using parse
Applying parse
to a qSQL statement expressed as a string returns its internal functional representation.
With additional processing, you can use the output to construct the equivalent functional form manually.
As query complexity increases, this process becomes more challenging and requires a detailed understanding of how kdb+ interprets and transforms qSQL during parsing.
An example of using parse
to convert qSQL to its corresponding functional form is:
q
q)t:([]c1:`a`b`c; c2:10 20 30)
q)parse "select c2:2*c2 from t where c1=`c"
?
`t
,,(=;`c1;,`c)
0b
(,`c2)!,(*;2;`c2)
q)?[`t; enlist (=;`c1;enlist `c); 0b; (enlist `c2)!enlist (*;2;`c2)]
c2
--
60
Issues when converting to functional form
To convert a select
query to a functional form, you may attempt to
apply the parse
function to the query string:
q
q)parse "select sym,price,size from trade where price>50"
?
`trade
,,(>;`price;50)
0b
`sym`price`size!`sym`price`size
As we know, parse
produces a parse tree and since some of the elements may themselves be parse trees, we can’t immediately take the output of parse
and plug it into the form ?[t;c;b;a]
.
After inspecting and experimenting with the result of parse
, you may determine that the corresponding functional form is:
q
q)funcQry:?[`trade;enlist(>;`price;50);0b;`sym`price`size! `sym`price`size]
q)strQry:select sym,price,size from trade where price>50
q)funcQry~strQry
1b
However, this process becomes increasingly complex as the structure and logic of the query grow more intricate:
q
q)parse "select count i from trade where 140>(count;i) fby sym"
?
`trade
,,(>;140;(k){@[(#y)#x[0]0#x
1;g;:;x[0]'x[1]g:.=y]};(enlist;#:;`i);`sym))
0b
(,`x)!,(#:;`i)
In this case, it is not obvious what the functional form of the above query should be, even after applying parse
.
This approach of manually analyzing parse output to reconstruct qSQL in functional form has three main limitations. The next three subsections explain each one in detail.
parse
trees and eval
The first issue with passing a select
query to parse
is that each returned item is in unevaluated form.
As discussed here, simply applying value
to a parse tree does not work.
However, if we evaluate each one of the arguments fully, then there would be no nested parse
trees.
We could then apply value
to the result:
q
q)eval each parse "select count i from trade where 140>(count;i) fby sym"
?
+`sym`time`price`size!(`VOD`IBM`BP`VOD`IBM`IBM`HSBC`VOD`MS..
,(>;140;(k){@[(#y)#x[0]0#x
1;g;:;x[0]'x[1]g:.=y]};(enlist;#:;`i);`sym))
0b
(,`x)!,(#:;`i)
The equivalence below holds for a general qSQL query provided as a string:
q
q)value[str]~value eval each parse str
1b
In particular:
q
q)str:"select count i from trade where 140>(count;i) fby sym"
q)value[str]~value eval each parse str
1b
In fact, since within the functional form we can refer to the table by name
, we can make this even
clearer.
Also, the first item in the result of parse
applied to a select
query will always be ?
(or !
for a delete
or update
query) which cannot be evaluated any further.
So we don’t need to apply eval
to it.
q
q)pTree:parse str:"select count i from trade where 140>(count;i) fby sym"
q)@[pTree;2 3 4;eval]
?
`trade
,(>;140;(k){@[(#y)#x[0]0#x
1;g;:;x[0]'x[1]g:.=y]};(enlist;#:;`i);`sym))
0b
(,`x)!,(#:;`i)
q)value[str] ~ value @[pTree;2 3 4;eval]
1b
Variable representation in parse
trees
Recall that in a parse
tree, a variable is represented by a symbol containing its name.
So to represent a symbol or a list of symbols, you must use enlist
on that expression. In k, enlist
is the unary form of the comma operator in k:
q
q)parse"3#`a`b`c`d`e`f"
#
3
,`a`b`c`d`e`f
q)(#;3;enlist `a`b`c`d`e`f)~parse"3#`a`b`c`d`e`f"
1b
This causes a difficulty. As discussed above, q has no unary syntax for operators.
Which means the following isn’t a valid q expression and so returns an error.
q
q)(#;3;,`a`b`c`d`e`f)
',
In the parse
tree we receive, we need to somehow distinguish between k’s unary ,
(which we want to replace with enlist
) and the binary join
operator, which we want to leave as it is.
Explicit definitions in .q
are shown in full
The fby
in the select
query above is represented by its full k
definition.
q
q)parse "fby"
k){@[(#y)#x[0]0#x 1;g;:;x[0]'x[1]g:.=y]}
While the k form does't generally affect functionality, it can make the resulting functional expression harder to read.
The solution
Write a function to automate the process of converting a select
query into its equivalent functional form. This function, buildQuery
, returns the functional form as a string.
q
q)buildQuery "select count i from trade where 140>(count;i) fby sym"
"?[trade;enlist(>;140;(fby;(enlist;count;`i);`sym));0b;
(enlist`x)! enlist (count;`i)]"
When executed, it always returns the same result as the select
query from which it's derived:
q
q)str:"select count i from trade where 140>(count;i) fby sym"
q)value[str]~value buildQuery str
1b
And since the same logic applies to exec
, update
, and delete
, the function can also convert these statements to their corresponding functional forms.
To write this function, let's solve the three issues outlined above:
- Parse-tree items may be parse trees
- Parse trees use k’s unary syntax for operators
- q keywords from
.q.
are replaced by their k definitions
The first issue, where some items returned by parse
may themselves be parse trees is easily resolved by applying eval
to the individual items.
The second issue is with k’s unary syntax for ,
.
We want to replace it with the q keyword enlist
.
To do this, we define a function that traverses the parse
tree and detects if any element is an enlisted list of symbols or an enlisted single symbol.
If it finds one, we replace it with a string representation of enlist
instead of ,
.
q
ereptest:{ //returns a boolean
(1=count x) and ((0=type x) and 11=type first x) or 11=type x}
ereplace:{"enlist",.Q.s1 first x}
funcEn:{$[ereptest x;ereplace x;0=type x;.z.s each x;x]}
Before we replace the item we first need to check it has the correct form. We need to test if it is one of:
- An enlisted list of syms. It will have type
0h
, count1
and the type of its first item will be11h
if and only if it is an enlisted list of syms. - An enlisted single sym. It will have type
11h
and count1
if and only if it is an enlisted single symbol.
The ereptest
function above performs this check, with ereplace
performing the replacement.
.Q.s1
is dependent on the size of the console so make it larger if necessary.
Since we are going to be checking a parse tree which may contain parse trees nested to arbitrary depth, we need a way to check all the elements down to the base level.
We observe that a parse tree is a general list, and therefore of type 0h
.
This knowledge combined with the use of .z.s
allows us to scan a parse tree recursively.
The logic goes: if what you have passed into funcEn
is a parse tree then reapply the function to each element.
To illustrate we examine the following select
query.
q
q)show pTree:parse "select from trade where sym like \"F*\",not sym=`FD"
?
`trade
,((like;`sym;"F*");(~:;(=;`sym;,`FD))) 0b
()
q)x:eval pTree 2 //apply eval to Where clause
Consider the where
clause in isolation:
q
q)x //a 2-list of Where clauses
(like;`sym;"F*")
(~:;(=;`sym;,`FD))
q)funcEn x
(like;`sym;"F*")
(~:;(=;`sym;"enlist`FD"))
Similarly we create a function which replaces k functions with their q equivalents in string form, thus addressing the third issue above.
q
q)kreplace:{[x] $[`=qval:.q?x;x;string qval]}
q)funcK:{$[0=t:type x;.z.s each x;t<100h;x;kreplace x]}
Running these functions against our where
clause, we notice the k
representations being converted to q.
q
q)x
(like;`sym;"F*")
(~:;(=;`sym;,`FD))
q)funcK x //replaces ~: with “not”
(like;`sym;"F*")
("not";(=;`sym;,`FD))
Next, we make a slight change to kreplace
and ereplace
and combine them.
q
kreplace:{[x] $[`=qval:.q?x;x;"~~",string[qval],"~~"]}
ereplace:{"~~enlist",(.Q.s1 first x),"~~"}
q)funcEn funcK x
(like;`sym;"F*") ("~~not~~";(=;`sym;"~~enlist`FD~~"))
The double tilde (~~
) acts as a tag, allowing us to differentiate it from actual string elements in the parse
tree. This allows us to drop the embedded quotation marks at a later stage inside the buildQuery
function:
q
q)ssr/[;("\"~~";"~~\"");("";"")] .Q.s1 funcEn funcK x
"((like;`sym;\"F*\");(not;(=;`sym;enlist`FD)))"
thus giving us the correct format for the where
clause in a functional select
.
By applying the same logic to the rest of the parse
tree, we can write the buildQuery
function.
q
q)buildQuery "select from trade where sym like \"F*\",not sym=`FD"
"?[trade;((like;`sym;\"F*\");(not;(=;`sym;enlist`FD)));0b;()]"
One thing to take note of is that since we use reverse lookup on the .q
namespace and only want one result we occasionally get the wrong keyword back.
q
q)buildQuery "update tstamp:ltime tstamp from z"
"![z;();0b;(enlist`tstamp)!enlist (reciprocal;`tstamp)]"
q).q`ltime
%:
q).q`reciprocal
%:
These cases are rare, and a developer should be able to recognize them when they occur. While the functional form will still execute correctly, it may cause confusion for those reading or maintaining the code.
Fifth and sixth arguments
Functional select
also has ranks 5 and 6; for example, fifth and sixth arguments.
We also cover these with the buildQuery
function.
q
q)buildQuery "select[10 20] from trade"
"?[trade;();0b;();10 20]"
q)//5th parameter included
The 6th argument is a column and a direction to order the results by. Use <
for ascending and >
for descending.
q
q)parse"select[10;<price] from trade"
?
`trade
()
0b
()
10
,(<:;`price)
q).q?(<:;>:)
`hopen`hclose
q)qfind each ("<:";">:") //qfind defined above
hopen
hclose
We see that the k function for the 6th argument of the functional form is <:
(ascending) or >:
(descending).
At first glance this appears to be hopen
or hclose
.
In fact in earlier versions of q, iasc
and hopen
were equivalent (as were idesc
and hclose
).
The definitions of iasc
and idesc
were later altered to signal a rank error if not applied to a list.
q
q)iasc
k){$[0h>@x;'`rank;<x]}
q)idesc
k){$[0h>@x;'`rank;>x]}
q)iasc 7
'rank
Since the columns of a table are lists, it is irrelevant whether the functional form uses the old or new version of iasc
or idesc
.
The buildQuery
function handles the 6th argument as a special case so will produce iasc
or idesc
as appropriate.
q
q)buildQuery "select[10 20;>price] from trade"
"?[trade;();0b;();10 20;(idesc;`price)]"
The full buildQuery
function code is as follows:
q
\c 30 200
tidy:{ssr/[;("\"~~";"~~\"");("";"")] $[","=first x;1_x;x]}
strBrk:{y,(";" sv x),z}
//replace k representation with equivalent q keyword
kreplace:{[x] $[`=qval:.q?x;x;"~~",string[qval],"~~"]}
funcK:{$[0=t:type x;.z.s each x;t<100h;x;kreplace x]}
//replace eg ,`FD`ABC`DEF with "enlist`FD`ABC`DEF"
ereplace:{"~~enlist",(.Q.s1 first x),"~~"}
ereptest:{(1=count x) and ((0=type x) and 11=type first x) or 11=type x}
funcEn:{$[ereptest x;ereplace x;0=type x;.z.s each x;x]}
basic:{tidy .Q.s1 funcK funcEn x}
addbraks:{"(",x,")"}
//Where clause needs to be a list of Where clauses,
//so if only one Where clause, need to enlist.
stringify:{$[(0=type x) and 1=count x;"enlist ";""],basic x}
//if a dictionary, apply to both keys and values
ab:{
$[(0=count x) or -1=type x; .Q.s1 x;
99=type x; (addbraks stringify key x ),"!",stringify value x;
stringify x] }
inner:{[x]
idxs:2 3 4 5 6 inter ainds:til count x;
x:@[x;idxs;'[ab;eval]];
if[6 in idxs;x[6]:ssr/[;("hopen";"hclose");("iasc";"idesc")] x[6]];
//for select statements within select statements
x[1]:$[-11=type x 1;x 1;[idxs,:1;.z.s x 1]];
x:@[x;ainds except idxs;string];
x[0],strBrk[1_x;"[";"]"] }
buildQuery:{inner parse x}
Summary
In this guide, you learned how to:
-
Structure and write qSQL queries using SQL-like syntax
-
Use core operators like
select
,exec
,update
, anddelete
-
Apply filters, grouping, sorting, and computed columns
-
Work with special features like the virtual column
i
, aggregates, andfby
-
Optimize query performance and leverage multithreading
-
Integrate functions and conditionals within queries
-
Understand and construct the functional form of qSQL using
parse