Programming/Kdb/Q-sql

From Thalesians Wiki
< Programming‎ | Kdb
Revision as of 04:31, 4 July 2021 by Admin (talk | contribs)

Example table schemas

Consider the following quotes table schema (empty table):

quotes:([]
    code:`symbol$();
    date:`date$();
    time:`time$();
    bidprice:`float$();
    bidsize:`short$();
    askprice:`float$();
    asksize:`short$();
    mktflag:`symbol$())

...and the following trades table schema (empty table):

trades:([]
    code:`symbol$();
    date:`date$();
    time:`time$();
    price:`float$();
    volume:`short$();
    mktflag:`symbol$();
    comments:())

Insert

Let us add the first row to the quotes table using an insert:

insert[`quotes;(`CLM16;2016.04.07;10:15:00.010;38.34;86h;38.35;3h;`E)]

Notice that when we want to modify a table, we usually provide a reference to it as a symbol, `quotes, instead of passing it as a variable, quotes.

The values of the new row are provided in order, as a list.

We can now examine the table quotes:

quotes
code  date       time         bidprice bidsize askprice asksize mktflag
-----------------------------------------------------------------------
CLM16 2016.04.07 10:15:00.010 38.34    86      38.35    3       E

We can add the second row in a similar manner:

insert[`quotes;(`GCM16;2016.04.07;11:02:16.663;1241.4;22h;1241.5;1h;`E)]

Let us again examine the table quotes:

quotes
code  date       time         bidprice bidsize askprice asksize mktflag
-----------------------------------------------------------------------
CLM16 2016.04.07 10:15:00.010 38.34    86      38.35    3       E
GCM16 2016.04.07 11:02:16.663 1241.4   22      1241.5   1       E

Inserting multiple rows at once

One can insert multiple rows to the quotes table using the followning syntax:

.[`quotes;();,;(
    (`CLM16; 2016.04.07; 12:05:00.303; 38.12 ; 7h ; 38.13 ; 13h; `E);
    (`CLM16; 2016.04.07; 12:22:00.486; 38.11 ; 16h; 38.12 ; 8h ; `E);
    (`GCM16; 2016.04.07; 13:00:00.205; 1238.6; 8h ; 1238.7; 7h ; `E);
    (`CLM16; 2016.04.07; 15:00:00.051; 38.52 ; 9h ; 38.53 ; 18h; `E);
    (`GCM16; 2016.04.07; 15:20:02.224; 1240.9; 6h ; 1241f ; 1h ; `E);
    (`CLM16; 2016.04.08; 10:53:00.002; 40.83 ; 6h ; 40.84 ; 66h; `E);
    (`CLM16; 2016.04.08; 13:56:30.070; 40.54 ; 38h; 40.56 ; 58h; `E);
    (`CLM16; 2016.04.08; 15:20:02.000; 40.77 ; 26h; 40.79 ; 44h; `E);
    (`CLM16; 2016.04.08; 15:21:43.786; 40.76 ; 3h ; 40.77 ; 28h; `E))]

Let us again examine the table quotes:

quotes
code  date       time         bidprice bidsize askprice asksize mktflag
-----------------------------------------------------------------------
CLM16 2016.04.07 10:15:00.010 38.34    86      38.35    3       E
GCM16 2016.04.07 11:02:16.663 1241.4   22      1241.5   1       E
CLM16 2016.04.07 12:05:00.303 38.12    7       38.13    13      E
CLM16 2016.04.07 12:22:00.486 38.11    16      38.12    8       E
GCM16 2016.04.07 13:00:00.205 1238.6   8       1238.7   7       E
CLM16 2016.04.07 15:00:00.051 38.52    9       38.53    18      E
GCM16 2016.04.07 15:20:02.224 1240.9   6       1241     1       E
CLM16 2016.04.08 10:53:00.002 40.83    6       40.84    66      E
CLM16 2016.04.08 13:56:30.070 40.54    38      40.56    58      E
CLM16 2016.04.08 15:20:02.000 40.77    26      40.79    44      E
CLM16 2016.04.08 15:21:43.786 40.76    3       40.77    28      E

We have just used dot (.) for dyadic functions. Quoting Q for Mortals, the general form of functional . for dyadic functions is

.[L;I;f;y]

L is a mapping with explicit domain, I is a list in the domain of L, f is a dyadic function and y is an atom or list of the proper shape. For a list, the result is the item-wise application to the items of L indexed at depth by I, of f and the parameter y. Over the subdomain I, the map output becomes

(L . I) f y    / binary operator
f[L . I; y]    / dyadic function

Thus, if

L:(100 200;300 400 500 600;600 700);

then

.[L;1 2;+;42]

amounts to copying L, adding 42 to the [1;2]nd element of the copy, and returning that copy:

(100j, 200j;300j, 400j, 542j, 600j;600j 700j)

Similarly,

.[L;(1;(0;2;3));+;42]

amounts to copying L, adding 42 to the [1;0]th, [1;2]nd, and [1;3]rd elements of the copy, and returning that copy:

(100j, 200j;342j, 400j, 542j, 642j;600j, 700j)

Whereas

.[L;(1;(0;2;3));+;(42;45;44)]

amounts to copying L, adding 42 to the [1; 0]th, 45 to the [1; 2]nd, and 44 to the [1; 3]rd element of the copy, and returning that copy:

(100j, 200j;342j, 400j, 545j, 644j;600j, 700j)

And

.[L;();+;42]

amounts to copying L, adding 42 to all elements of the copy, and returning that copy:

(142j, 242j;342j, 442j, 542j, 642j;642j, 742j)

In none of the cases was the original L modified. If we wanted to modify it, we would have passed a name reference, as a symbol, instead:

.[`L;();+;42]
`L
L
<pre>
<pre>
(142j, 242j;342j, 442j, 542j, 642j;642j, 742j)

Along the same lines, we can use . to extend a list (remember that , concatenates):

scientists:`newton`leibniz;
.[`scientists;();,;`wiener`khinchin`kolmogorov];
scientists
`newton`leibniz`wiener`khinchin`kolmogorov

Just as we used to extend a table.

Deleting rows from a table

To delete all rows from the quotes table we can use the following code:

q)delete from `quotes

Notice that we referred to the quotes table by its symbol name,

`quotes

, instead of providing a variable, quotes. This is generally the case when we want to modify a table in place.

We can check that the quotes table is now indeed empty:

q)count quotes
0
q)quotes
code date time bidprice bidsize askprice asksize mktflag
--------------------------------------------------------

Repopulating quotes

Let us repopulate quotes again:

quotes:([]code:`symbol$();date:`date$();time:`time$();bidprice:`float$();bidsize:`short$();askprice:`float$();asksize:`short$();mktflag:`symbol$());
.[`quotes;();,;(
    (`CLM16; 2016.04.07; 10:15:00.010; 38.34 ; 86h; 38.35 ; 3h ; `E);
    (`GCM16; 2016.04.07; 11:02:16.663; 1241.4; 22h; 1241.5; 1h ; `E);
    (`CLM16; 2016.04.07; 12:05:00.303; 38.12 ; 7h ; 38.13 ; 13h; `E);
    (`CLM16; 2016.04.07; 12:22:00.486; 38.11 ; 16h; 38.12 ; 8h ; `E);
    (`GCM16; 2016.04.07; 13:00:00.205; 1238.6; 8h ; 1238.7; 7h ; `E);
    (`CLM16; 2016.04.07; 15:00:00.051; 38.52 ; 9h ; 38.53 ; 18h; `E);
    (`GCM16; 2016.04.07; 15:20:02.224; 1240.9; 6h ; 1241f ; 1h ; `E);
    (`CLM16; 2016.04.08; 10:53:00.002; 40.83 ; 6h ; 40.84 ; 66h; `E);
    (`CLM16; 2016.04.08; 13:56:30.070; 40.54 ; 38h; 40.56 ; 58h; `E);
    (`CLM16; 2016.04.08; 15:20:02.000; 40.77 ; 26h; 40.79 ; 44h; `E);
    (`CLM16; 2016.04.08; 15:21:43.786; 40.76 ; 3h ; 40.77 ; 28h; `E))]

The metadata of the trades table before we populate it

Let us examine the metadata of the trades table before we populate it:

meta trades
c       | t f a
--------| -----
code    | s
date    | d
time    | t
price   | f
volume  | h
mktflag | s
comments|

Note that

type trades[`comments]

is 0h, so this column is a mixed list (remember that table columns are all lists). It appeared in the schema without a type cast:

trades : ([]
    ...
    volume:`short$();
    mktflag:`symbol$();
    comments:())

Exercise

Use dot (.) to populate the trades table with the following rows:

(`CLM16; 2016.04.07; 10:20:00.329; 38.3  ; 4h; `E; "NYMEX/CME Group trade");
(`GCM16; 2016.04.07; 12:00:00.055; 1239.7; 6h; `E; "");
(`GCM16; 2016.04.07; 12:37:02.106; 1240.5; 1h; `E; "");
(`CLM16; 2016.04.07; 13:00:00.128; 38.04 ; 3h; `E; "NYMEX/CME Group trade");
(`VXK16; 2016.04.07; 13:22:05.617; 18.85 ; 5h; `E; "");
(`GCM16; 2016.04.07; 14:35:01.241; 1241.2; 1h; `E; "");
(`GCM16; 2016.04.08; 10:13:01.048; 1240f ; 3h; `E; "");
(`VXK16; 2016.04.08; 11:34:53.417; 18.53 ; 1h; `E; "Transaction represents a trade in two contract months in the same class");
(`CLM16; 2016.04.08; 12:00:00.227; 40.61 ; 3h; `E; "NYMEX/CME Group trade");
(`VXK16; 2016.04.08; 12:44:00.684; 18.44 ; 2h; `E; "Transaction represents a trade in two contract months in the same class");
(`VXK16; 2016.04.08; 12:45:33.130; 18.49 ; 1h; `E; "Transaction represents a trade in two contract months in the same class");
(`CLM16; 2016.04.08; 15:20:02.000; 40.78 ; 3h; `E; "NYMEX/CME Group trade");
(`CLM16; 2016.04.11; 11:00:00.105; 41.43 ; 2h; `E; "NYMEX/CME Group trade");
(`VXK16; 2016.04.11; 14:00:00.829; 18.35 ; 1h; `E; "");
(`VXK16; 2016.04.11; 15:14:58.775; 19.05 ; 2h; `E; "");
(`GCM16; 2016.04.11; 16:00:00.044; 1257.9; 1h; `E; "");
(`GCM16; 2016.04.11; 16:28:34.311; 1258.7; 1h; `E; "")

Solution

.[`trades;();,;(
    (`CLM16; 2016.04.07; 10:20:00.329; 38.3  ; 4h; `E; "NYMEX/CME Group trade");
    (`GCM16; 2016.04.07; 12:00:00.055; 1239.7; 6h; `E; "");
    (`GCM16; 2016.04.07; 12:37:02.106; 1240.5; 1h; `E; "");
    (`CLM16; 2016.04.07; 13:00:00.128; 38.04 ; 3h; `E; "NYMEX/CME Group trade");
    (`VXK16; 2016.04.07; 13:22:05.617; 18.85 ; 5h; `E; "");
    (`GCM16; 2016.04.07; 14:35:01.241; 1241.2; 1h; `E; "");
    (`GCM16; 2016.04.08; 10:13:01.048; 1240f ; 3h; `E; "");
    (`VXK16; 2016.04.08; 11:34:53.417; 18.53 ; 1h; `E; "Transaction represents a trade in two contract months in the same class");
    (`CLM16; 2016.04.08; 12:00:00.227; 40.61 ; 3h; `E; "NYMEX/CME Group trade");
    (`VXK16; 2016.04.08; 12:44:00.684; 18.44 ; 2h; `E; "Transaction represents a trade in two contract months in the same class");
    (`VXK16; 2016.04.08; 12:45:33.130; 18.49 ; 1h; `E; "Transaction represents a trade in two contract months in the same class");
    (`CLM16; 2016.04.08; 15:20:02.000; 40.78 ; 3h; `E; "NYMEX/CME Group trade");
    (`CLM16; 2016.04.11; 11:00:00.105; 41.43 ; 2h; `E; "NYMEX/CME Group trade");
    (`VXK16; 2016.04.11; 14:00:00.829; 18.35 ; 1h; `E; "");
    (`VXK16; 2016.04.11; 15:14:58.775; 19.05 ; 2h; `E; "");
    (`GCM16; 2016.04.11; 16:00:00.044; 1257.9; 1h; `E; "");
    (`GCM16; 2016.04.11; 16:28:34.311; 1258.7; 1h; `E; ""))]
trades
code  date       time         price  volume mktflag comments                 ..
-----------------------------------------------------------------------------..
CLM16 2016.04.07 10:20:00.329 38.3   4      E       "NYMEX/CME Group trade"  ..
GCM16 2016.04.07 12:00:00.055 1239.7 6      E       ""                       ..
GCM16 2016.04.07 12:37:02.106 1240.5 1      E       ""                       ..
CLM16 2016.04.07 13:00:00.128 38.04  3      E       "NYMEX/CME Group trade"  ..
VXK16 2016.04.07 13:22:05.617 18.85  5      E       ""                       ..
GCM16 2016.04.07 14:35:01.241 1241.2 1      E       ""                       ..
GCM16 2016.04.08 10:13:01.048 1240   3      E       ""                       ..
VXK16 2016.04.08 11:34:53.417 18.53  1      E       "Transaction represents a..
CLM16 2016.04.08 12:00:00.227 40.61  3      E       "NYMEX/CME Group trade"  ..
VXK16 2016.04.08 12:44:00.684 18.44  2      E       "Transaction represents a..
VXK16 2016.04.08 12:45:33.130 18.49  1      E       "Transaction represents a..
CLM16 2016.04.08 15:20:02.000 40.78  3      E       "NYMEX/CME Group trade"  ..
CLM16 2016.04.11 11:00:00.105 41.43  2      E       "NYMEX/CME Group trade"  ..
VXK16 2016.04.11 14:00:00.829 18.35  1      E       ""                       ..
VXK16 2016.04.11 15:14:58.775 19.05  2      E       ""                       ..
GCM16 2016.04.11 16:00:00.044 1257.9 1      E       ""                       ..
GCM16 2016.04.11 16:28:34.311 1258.7 1      E       ""                       ..

The metadata of the trades table after we have populated it

Let us examine the metadata of the trades table before we populate it:

meta trades
c       | t f a
--------| -----
code    | s
date    | d
time    | t
price   | f
volume  | h
mktflag | s
comments| C

Note that

type trades[`comments]

is still 0h, but the metadata above tells us that comments is a column of strings. Strings are themselves lists

trades[`comments][0][0]
"N"

and the only way to declare a list of lists is through a mixed list, there is no other way we could have declared this column in our schema.

Q-sql select

Consider the simplest possible form of q-sql select:

select from quotes
code  date       time         bidprice bidsize askprice asksize mktflag
-----------------------------------------------------------------------
CLM16 2016.04.07 10:15:00.010 38.34    86      38.35    3       E
GCM16 2016.04.07 11:02:16.663 1241.4   22      1241.5   1       E
CLM16 2016.04.07 12:05:00.303 38.12    7       38.13    13      E
CLM16 2016.04.07 12:22:00.486 38.11    16      38.12    8       E
GCM16 2016.04.07 13:00:00.205 1238.6   8       1238.7   7       E
CLM16 2016.04.07 15:00:00.051 38.52    9       38.53    18      E
GCM16 2016.04.07 15:20:02.224 1240.9   6       1241     1       E
CLM16 2016.04.08 10:53:00.002 40.83    6       40.84    66      E
CLM16 2016.04.08 13:56:30.070 40.54    38      40.56    58      E
CLM16 2016.04.08 15:20:02.000 40.77    26      40.79    44      E
CLM16 2016.04.08 15:21:43.786 40.76    3       40.77    28      E

We have selected all rows from the table.

Exercise

Select all rows from the trades table.

Solution

select from trades
code  date       time         price  volume mktflag comments                 ..
-----------------------------------------------------------------------------..
CLM16 2016.04.07 10:20:00.329 38.3   4      E       "NYMEX/CME Group trade"  ..
GCM16 2016.04.07 12:00:00.055 1239.7 6      E       ""                       ..
GCM16 2016.04.07 12:37:02.106 1240.5 1      E       ""                       ..
CLM16 2016.04.07 13:00:00.128 38.04  3      E       "NYMEX/CME Group trade"  ..
VXK16 2016.04.07 13:22:05.617 18.85  5      E       ""                       ..
GCM16 2016.04.07 14:35:01.241 1241.2 1      E       ""                       ..
GCM16 2016.04.08 10:13:01.048 1240   3      E       ""                       ..
VXK16 2016.04.08 11:34:53.417 18.53  1      E       "Transaction represents a..
CLM16 2016.04.08 12:00:00.227 40.61  3      E       "NYMEX/CME Group trade"  ..
VXK16 2016.04.08 12:44:00.684 18.44  2      E       "Transaction represents a..
VXK16 2016.04.08 12:45:33.130 18.49  1      E       "Transaction represents a..
CLM16 2016.04.08 15:20:02.000 40.78  3      E       "NYMEX/CME Group trade"  ..
CLM16 2016.04.11 11:00:00.105 41.43  2      E       "NYMEX/CME Group trade"  ..
VXK16 2016.04.11 14:00:00.829 18.35  1      E       ""                       ..
VXK16 2016.04.11 15:14:58.775 19.05  2      E       ""                       ..
GCM16 2016.04.11 16:00:00.044 1257.9 1      E       ""                       ..
GCM16 2016.04.11 16:28:34.311 1258.7 1      E       ""                       ..

Q-sql select: selecting columns

We may list the columns that we want to select, in the order in which we want them to appear in the resulting table:

select date,time,code,asksize,askprice from quotes
date       time         code  asksize askprice
----------------------------------------------
2016.04.07 10:15:00.010 CLM16 3       38.35
2016.04.07 11:02:16.663 GCM16 1       1241.5
2016.04.07 12:05:00.303 CLM16 13      38.13
2016.04.07 12:22:00.486 CLM16 8       38.12
2016.04.07 13:00:00.205 GCM16 7       1238.7
2016.04.07 15:00:00.051 CLM16 18      38.53
2016.04.07 15:20:02.224 GCM16 1       1241
2016.04.08 10:53:00.002 CLM16 66      40.84
2016.04.08 13:56:30.070 CLM16 58      40.56
2016.04.08 15:20:02.000 CLM16 44      40.79
2016.04.08 15:21:43.786 CLM16 28      40.77

By default, the output is not saved, but we can set it to a variable:

new_quotes_table:select date,time,code,asksize,askprice from quotes

Exercise

Select the date, time, volume, and price (in that order) from the trades table.

Solution

select date,time,volume,price from trades
date       time         volume price
-------------------------------------
2016.04.07 10:20:00.329 4      38.3
2016.04.07 12:00:00.055 6      1239.7
2016.04.07 12:37:02.106 1      1240.5
2016.04.07 13:00:00.128 3      38.04
2016.04.07 13:22:05.617 5      18.85
2016.04.07 14:35:01.241 1      1241.2
2016.04.08 10:13:01.048 3      1240
2016.04.08 11:34:53.417 1      18.53
2016.04.08 12:00:00.227 3      40.61
2016.04.08 12:44:00.684 2      18.44
2016.04.08 12:45:33.130 1      18.49
2016.04.08 15:20:02.000 3      40.78
2016.04.11 11:00:00.105 2      41.43
2016.04.11 14:00:00.829 1      18.35
2016.04.11 15:14:58.775 2      19.05
2016.04.11 16:00:00.044 1      1257.9
2016.04.11 16:28:34.311 1      1258.7

Q-sql select: selecting columns under new names

We may give the selected columns new names in the result:

select d:date,t:time,code,size:asksize,price:askprice from quotes
d          t            code  size price
-----------------------------------------
2016.04.07 10:15:00.010 CLM16 3    38.35
2016.04.07 11:02:16.663 GCM16 1    1241.5
2016.04.07 12:05:00.303 CLM16 13   38.13
2016.04.07 12:22:00.486 CLM16 8    38.12
2016.04.07 13:00:00.205 GCM16 7    1238.7
2016.04.07 15:00:00.051 CLM16 18   38.53
2016.04.07 15:20:02.224 GCM16 1    1241
2016.04.08 10:53:00.002 CLM16 66   40.84
2016.04.08 13:56:30.070 CLM16 58   40.56
2016.04.08 15:20:02.000 CLM16 44   40.79
2016.04.08 15:21:43.786 CLM16 28   40.77

Exercise

Select the date, time, volume, and price (in that order) from the trades table but under the names d, t, v, and p.

Solution

select d:date,t:time,v:volume,p:price from trades
d          t            v p
--------------------------------
2016.04.07 10:20:00.329 4 38.3
2016.04.07 12:00:00.055 6 1239.7
2016.04.07 12:37:02.106 1 1240.5
2016.04.07 13:00:00.128 3 38.04
2016.04.07 13:22:05.617 5 18.85
2016.04.07 14:35:01.241 1 1241.2
2016.04.08 10:13:01.048 3 1240
2016.04.08 11:34:53.417 1 18.53
2016.04.08 12:00:00.227 3 40.61
2016.04.08 12:44:00.684 2 18.44
2016.04.08 12:45:33.130 1 18.49
2016.04.08 15:20:02.000 3 40.78
2016.04.11 11:00:00.105 2 41.43
2016.04.11 14:00:00.829 1 18.35
2016.04.11 15:14:58.775 2 19.05
2016.04.11 16:00:00.044 1 1257.9
2016.04.11 16:28:34.311 1 1258.7

Q-sql select: operations on columns

Perhaps more interestingly, selected columns may be functions of / results of operations on the existing ones:

select dt:date+time,code,size:asksize,price:askprice from quotes
dt                            code  size price
-----------------------------------------------
2016.04.07D10:15:00.010000000 CLM16 3    38.35
2016.04.07D11:02:16.663000000 GCM16 1    1241.5
2016.04.07D12:05:00.303000000 CLM16 13   38.13
2016.04.07D12:22:00.486000000 CLM16 8    38.12
2016.04.07D13:00:00.205000000 GCM16 7    1238.7
2016.04.07D15:00:00.051000000 CLM16 18   38.53
2016.04.07D15:20:02.224000000 GCM16 1    1241
2016.04.08D10:53:00.002000000 CLM16 66   40.84
2016.04.08D13:56:30.070000000 CLM16 58   40.56
2016.04.08D15:20:02.000000000 CLM16 44   40.79
2016.04.08D15:21:43.786000000 CLM16 28   40.77

Remember that columns are lists, so the above + works on two lists:

q)quotes[`date]
2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.04.07 ..
q)quotes[`time]
10:15:00.010 11:02:16.663 12:05:00.303 12:22:00.486 13:00:00.205 15:00:00.051..
q)quotes[`date]+quotes[`time]
2016.04.07D10:15:00.010000000 2016.04.07D11:02:16.663000000 2016.04.07D12:05:..

Exercise

The best bid and best ask (offer) prices are denoted by and , respectively.

The midprice is the average of the best bid and best ask prices:

Write a q-sql select query selecting the timestamp (date + time) and the midprice.

Solution

select dt:date+time,mid:.5*bidprice+askprice from quotes
dt                            mid
-------------------------------------
2016.04.07D10:15:00.010000000 38.345
2016.04.07D11:02:16.663000000 1241.45
2016.04.07D12:05:00.303000000 38.125
2016.04.07D12:22:00.486000000 38.115
2016.04.07D13:00:00.205000000 1238.65
2016.04.07D15:00:00.051000000 38.525
2016.04.07D15:20:02.224000000 1240.95
2016.04.08D10:53:00.002000000 40.835
2016.04.08D13:56:30.070000000 40.55
2016.04.08D15:20:02.000000000 40.78
2016.04.08D15:21:43.786000000 40.765

Exercise

The microprice is often more useful than the midprice, since it weights the best bid and best ask prices with the volumes posted at the best bid and best ask prices:

where and represent the volumes posted at the best bid and best ask prices, respectively.

Modify your query to add the microprice to it.

Solution

select dt:date+time,mid:.5*bidprice+askprice,microprice:((bidsize*askprice)+asksize*bidprice)%bidsize+asksize from quotes
dt                            mid     microprice
------------------------------------------------
2016.04.07D10:15:00.010000000 38.345  38.34966
2016.04.07D11:02:16.663000000 1241.45 1241.496
2016.04.07D12:05:00.303000000 38.125  38.1235
2016.04.07D12:22:00.486000000 38.115  38.11667
2016.04.07D13:00:00.205000000 1238.65 1238.653
2016.04.07D15:00:00.051000000 38.525  38.52333
2016.04.07D15:20:02.224000000 1240.95 1240.986
2016.04.08D10:53:00.002000000 40.835  40.83083
2016.04.08D13:56:30.070000000 40.55   40.54792
2016.04.08D15:20:02.000000000 40.78   40.77743
2016.04.08D15:21:43.786000000 40.765  40.76097

A challenge

So what if we want to return only the first character from each code?

select dt:date+time,shortcode:code[0],size:asksize,price:askprice from quotes

does not give the desired result:

dt                            shortcode size price
---------------------------------------------------
2016.04.07D10:15:00.010000000 CLM16     3    38.35
2016.04.07D11:02:16.663000000 CLM16     1    1241.5
2016.04.07D12:05:00.303000000 CLM16     13   38.13
2016.04.07D12:22:00.486000000 CLM16     8    38.12
2016.04.07D13:00:00.205000000 CLM16     7    1238.7
2016.04.07D15:00:00.051000000 CLM16     18   38.53
2016.04.07D15:20:02.224000000 CLM16     1    1241
2016.04.08D10:53:00.002000000 CLM16     66   40.84
2016.04.08D13:56:30.070000000 CLM16     58   40.56
2016.04.08D15:20:02.000000000 CLM16     44   40.79
2016.04.08D15:21:43.786000000 CLM16     28   40.77

This is because the indexing, ...[0], is applied to the entire list quotes[`code], and we get a symbol scalar, `CLM16. This scalar is returned for each row in the result, as you can see above.

Indeed, this is what happens when we include scalars among selected columns:

select dt:date+time,shortcode:123,size:asksize,price:askprice from quotes
dt                            shortcode size price
---------------------------------------------------
2016.04.07D10:15:00.010000000 123       3    38.35
2016.04.07D11:02:16.663000000 123       1    1241.5
2016.04.07D12:05:00.303000000 123       13   38.13
2016.04.07D12:22:00.486000000 123       8    38.12
2016.04.07D13:00:00.205000000 123       7    1238.7
2016.04.07D15:00:00.051000000 123       18   38.53
2016.04.07D15:20:02.224000000 123       1    1241
2016.04.08D10:53:00.002000000 123       66   40.84
2016.04.08D13:56:30.070000000 123       58   40.56
2016.04.08D15:20:02.000000000 123       44   40.79
2016.04.08D15:21:43.786000000 123       28   40.77

Still, what if we want to return the first character of each code?

One way to do this is by applying a function to each element of the list which represents the resulting column:

select dt:date+time,shortcode:{`$1#string x}each code,size:asksize,price:askprice from quotes
dt                            shortcode size price
---------------------------------------------------
2016.04.07D10:15:00.010000000 C         3    38.35
2016.04.07D11:02:16.663000000 G         1    1241.5
2016.04.07D12:05:00.303000000 C         13   38.13
2016.04.07D12:22:00.486000000 C         8    38.12
2016.04.07D13:00:00.205000000 G         7    1238.7
2016.04.07D15:00:00.051000000 C         18   38.53
2016.04.07D15:20:02.224000000 G         1    1241
2016.04.08D10:53:00.002000000 C         66   40.84
2016.04.08D13:56:30.070000000 C         58   40.56
2016.04.08D15:20:02.000000000 C         44   40.79
2016.04.08D15:21:43.786000000 C         28   40.77

Instead of the keyword each, we could have used the ' adverb (don't forget the parentheses around the verb–adverb pair):

select dt:date+time,shortcode:({`$1#string x}')code,size:asksize,price:askprice from quotes
dt                            shortcode size price
---------------------------------------------------
2016.04.07D10:15:00.010000000 C         3    38.35
2016.04.07D11:02:16.663000000 G         1    1241.5
2016.04.07D12:05:00.303000000 C         13   38.13
2016.04.07D12:22:00.486000000 C         8    38.12
2016.04.07D13:00:00.205000000 G         7    1238.7
2016.04.07D15:00:00.051000000 C         18   38.53
2016.04.07D15:20:02.224000000 G         1    1241
2016.04.08D10:53:00.002000000 C         66   40.84
2016.04.08D13:56:30.070000000 C         58   40.56
2016.04.08D15:20:02.000000000 C         44   40.79
2016.04.08D15:21:43.786000000 C         28   40.77

This works since

{`$1#string x}each quotes[`code]

and

({`$1#string x}')quotes[`code]

give the desired list

`C`G`C`C`G`C`G`C`C`C`C

which is what the above select returns for that column (remember: columns are lists).

Exercise

Select from the quotes table cf, which is a symbol consisting of two characters: the first character of the code and the first character of the mktflag.

Solution

We can form (code,mktflag) pairs with code,'mktflag and then apply a function to each (') such pair:

select cf:({`$(1#string x[0]),1#string x[1]}')(code,'mktflag) from quotes
cf
--
CE
GE
CE
CE
GE
CE
GE
CE
CE
CE
CE

However, we can do without the "pairing" code code,'mktflag. While the keyword each does not work on functions of multiple arguments and the following throws 'type,

select cf:{`$(1#string x),1#string y}each[code;mktflag] from quotes

Each (') does work on functions of multiple arguments:

select cf:{`$(1#string x),1#string y}'[code;mktflag] from quotes

The following, somewhat less terse, solution works too. But here we iterate twice, whereas we could iterate only once, as in the previous solution:

select cf:`$(({1#string x}each code),'{1#string x}each mktflag) from quotes

Q-sql select: where specifications (constraints)

As in classical SQL, in q-sql we can add to our select the where specifications (constraints). Thus, if we want to return only those columns where price is greater than 1000f...

select date,time,price:askprice from quotes where price>1000f

Oh, wait, this returns an error:

'price
[0] select date,time,price:askprice from quotes where price>1000f
                                                      ^

That's because the constraints are applied first, before one of the resulting columns is given the name price. We must apply the constraint to askprice in the original table instead:

select date,time,price:askprice from quotes where askprice>1000f
date       time         price
------------------------------
2016.04.07 11:02:16.663 1241.5
2016.04.07 13:00:00.205 1238.7
2016.04.07 15:20:02.224 1241

We can add multiple constraints separated by commas:

select date,time,price:askprice from quotes where askprice>1000f,time>=12:00:00.000
date       time         price
------------------------------
2016.04.07 13:00:00.205 1238.7
2016.04.07 15:20:02.224 1241

NB! Unlike classical SQL, the order of the constraints matters! The constraints are applied in order. From the point of view of performance, you should apply the most "expensive" constraints, which shrink the result the most, first.

In particular, when dealing with partitioned tables, include the constraint on the partitioning column (usually date) first. Since partitioned tables are stored on disc in directories by date, this constraint allows kdb+/q to zoom in on the right directory. If it is not the first constraint, an expensive scan of the entire database on disc may result.

Q-sql select: indices

Tables are stored in deterministic order. If we want to return the indices of the rows returned by the query, we can add i to the select:

select i,date,time,price:askprice from quotes where askprice>1000f,time>=12:00:00.000
x date       time         price
--------------------------------
4 2016.04.07 13:00:00.205 1238.7
6 2016.04.07 15:20:02.224 1241

Exercise

Select from trades those rows where price is closer to the smallest integer greater than it than the greatest integer that is less than or equal to the price.

Solution

select from trades where ((ceiling price)-price)<price-floor price
code  date       time         price  volume mktflag comments                 ..
-----------------------------------------------------------------------------..
GCM16 2016.04.07 12:00:00.055 1239.7 6      E       ""                       ..
VXK16 2016.04.07 13:22:05.617 18.85  5      E       ""                       ..
VXK16 2016.04.08 11:34:53.417 18.53  1      E       "Transaction represents a..
CLM16 2016.04.08 12:00:00.227 40.61  3      E       "NYMEX/CME Group trade"  ..
CLM16 2016.04.08 15:20:02.000 40.78  3      E       "NYMEX/CME Group trade"  ..
GCM16 2016.04.11 16:00:00.044 1257.9 1      E       ""                       ..
GCM16 2016.04.11 16:28:34.311 1258.7 1      E       ""                       ..

The following solution is terser (and involves only a call to floor, no call to ceiling):

select from trades where (price-floor price)>0.5

Q-sql select: more complex constraints

More complex constraints can be formed using boolean operations:

select i,date,time,price:askprice from quotes where (((askprice>1000f)or(time>=12:00:00.000))and(time <= 15:30:00.000))
x  date       time         price
---------------------------------
1  2016.04.07 11:02:16.663 1241.5
2  2016.04.07 12:05:00.303 38.13
3  2016.04.07 12:22:00.486 38.12
4  2016.04.07 13:00:00.205 1238.7
5  2016.04.07 15:00:00.051 38.53
6  2016.04.07 15:20:02.224 1241
8  2016.04.08 13:56:30.070 40.56
9  2016.04.08 15:20:02.000 40.79
10 2016.04.08 15:21:43.786 40.77

Remember that constraints also operate on lists (columns) and result in boolean lists:

((quotes[`askprice]>1000f)or(quotes[`time]>=12:00:00.000))and(quotes[`time]<=15:30:00.000)
01111110111b

Exercise

Select from the quotes table those rows where the time is either less than 11 a.m. or greater than or equal to 3 p.m.

Solution

select from quotes where (time<11:00:00.000)or time>=15:00:00.000
<pre>
<pre>
code  date       time         bidprice bidsize askprice asksize mktflag
-----------------------------------------------------------------------
CLM16 2016.04.07 10:15:00.010 38.34    86      38.35    3       E
CLM16 2016.04.07 15:00:00.051 38.52    9       38.53    18      E
GCM16 2016.04.07 15:20:02.224 1240.9   6       1241     1       E
CLM16 2016.04.08 10:53:00.002 40.83    6       40.84    66      E
CLM16 2016.04.08 15:20:02.000 40.77    26      40.79    44      E
CLM16 2016.04.08 15:21:43.786 40.76    3       40.77    28      E

Don't forget the parentheses. The query won't work correctly without them:

select from quotes where time<11:00:00.000 or time>=15:00:00.000
code  date       time         bidprice bidsize askprice asksize mktflag
-----------------------------------------------------------------------
CLM16 2016.04.07 10:15:00.010 38.34    86      38.35    3       E
CLM16 2016.04.08 10:53:00.002 40.83    6       40.84    66      E

(Which is not what we want.)

\begin{frame}[fragile] \frametitle{q-sql select: group-by specifications (\textbf{by} phrase)~(i)} \begin{itemize} \item We can group the results by the value of a particular column: \begin{snippet}{q} select i,date,time,price:askprice by date from quotes where askprice>=40f \end{snippet} \begin{snippet}{output} date | x date time ..


| -----------------------------------------------------------------..

2016.04.07| 1 4 6 2016.04.07 2016.04.07 2016.04.07 11:02:16.663.. 2016.04.08| 7 8 9 10 2016.04.08 2016.04.08 2016.04.08 2016.04.08 10:53:00.002.. \end{snippet} \item The result is a keyed table (in this case, keyed by date). \item Apart from the key column, all the columns in the result are lists of lists, collecting values for each partition: \begin{snippet}{q} (select i,date,time,price:askprice by date from quotes where askprice>=40f)[2016.04.07][`time] \end{snippet} \begin{snippet}{output} 11:02:16.663 13:00:00.205 15:20:02.224 \end{snippet} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{q-sql select: group-by specifications (\textbf{by} phrase)~(ii)} Notice that, if you omit the list of selected columns from a \code{select} with \code{by}, somewhat counterintuitively, we will, for every single column, get the last value in each bucket, instead of a list of all values in the bucket: \begin{snippet}{q} q)select by date from quotes where askprice>=40f date | code time bidprice bidsize askprice asksize mktflag


| ------------------------------------------------------------

2016.04.07| GCM16 15:20:02.224 1240.9 6 1241 1 E 2016.04.08| CLM16 15:21:43.786 40.76 3 40.77 28 E \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{q-sql select: group-by specifications (\textbf{by} phrase)~(iii)} Group-by specifications make it easy to produce summary statistics: \begin{snippet}{q} select first_bidprice:first bidprice,avg_askprice:avg askprice,max_bidsize:max bidsize,number_of_quotes:count i by date,code from quotes where askprice<1250f \end{snippet} \begin{snippet}{output} date code | first_bidprice avg_askprice max_bidsize number_of_quotes


| --------------------------------------------------------

2016.04.07 CLM16| 38.34 38.2825 86 4 2016.04.07 GCM16| 1241.4 1240.4 22 3 2016.04.08 CLM16| 40.83 40.74 38 4 \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{q-sql select: group-by specifications (\textbf{by} phrase)~(iv)} \begin{itemize} \item We can also group by the result of a function, e.g.: \begin{snippet}{q} select first_bidprice:first bidprice,avg_askprice:avg askprice,max_bidsize:max bidsize,number_of_quotes:count i by{[x] `hh$x}each time from quotes where askprice<1250f \end{snippet} \begin{snippet}{output} time| first_bidprice avg_askprice max_bidsize number_of_quotes


| --------------------------------------------------------

10 | 38.34 39.595 86 2 11 | 1241.4 1241.5 22 1 12 | 38.12 38.125 16 2 13 | 1238.6 639.63 38 2 15 | 38.52 340.2725 26 4 \end{snippet} \item Although in this case the use of \code{each} is superfluous: \code{`hh$...} works just as well on lists as on scalars: \begin{snippet}{q} select first_bidprice:first bidprice,avg_askprice:avg askprice,max_bidsize:max bidsize,number_of_quotes:count i by`hh$time from quotes where askprice<1250f \end{snippet} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{Exercise} Group the data from \code{trades} by the number of digits in the \code{price}. \end{frame}

\begin{frame}[fragile] \frametitle{Solution~(i)} \begin{itemize} \item We could do \begin{snippet}{q} select code,date,time,price,volume,mktflag,comments by count each string price from trades \end{snippet} \begin{snippet}{output} price| code date ..


| ----------------------------------------------------------------------..

4 | `CLM16`GCM16 2016.04.0.. 5 | `CLM16`VXK16`VXK16`CLM16`VXK16`VXK16`CLM16`CLM16`VXK16`VXK16 2016.04.0.. 6 | `GCM16`GCM16`GCM16`GCM16`GCM16 2016.04.0.. \end{snippet}

   but this counts the decimal point as a digit (which it isn't).

\item To remedy this, we could use \begin{snippet}{q} select code,date,time,price,volume,mktflag,comments by {count sv[""]vs["."]x} each string price from trades \end{snippet} \begin{snippet}{output} price| code dat..


| ----------------------------------------------------------------------..

3 | ,`CLM16 ,20.. 4 | `CLM16`VXK16`GCM16`VXK16`CLM16`VXK16`VXK16`CLM16`CLM16`VXK16`VXK16 201.. 5 | `GCM16`GCM16`GCM16`GCM16`GCM16 201.. \end{snippet} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{Solution~(ii)} If we care only about the digits before the decimal point, we could use \begin{snippet}{q} select code,date,time,price,volume,mktflag,comments by (1+floor log[price]%log 10) from trades \end{snippet} \begin{snippet}{output} x| code date .. -| --------------------------------------------------------------------------.. 2| `CLM16`CLM16`VXK16`VXK16`CLM16`VXK16`VXK16`CLM16`CLM16`VXK16`VXK16 2016.04.. 4| `GCM16`GCM16`GCM16`GCM16`GCM16`GCM16 2016.04.. \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{\textbf{xbar}} \begin{itemize} \item Consider the following example: \begin{snippet}{q} q)5 xbar 23 21 56 39 10 9 81 20 45 101 15 20 20 55 35 10 5 80 20 45 100 15 \end{snippet} \item For each integer in the list on the right \code{xbar} outputs a multiple of 5 that is less than or equal to it. \item \code{xbar} works with floats, too: \begin{snippet}{q} q)5 xbar 23.7 21.9 56.3 39.2 10.1 9.8 81.3 20.4 45.4 101.0 15.3 20 20 55 35 10 5 80 20 45 100 15f \end{snippet} \item And, importantly, it works with temporal types: \begin{snippet}{q} q)(1000*60*5)xbar 00:03:09.022 00:05:58.921 00:21:00.000 00:29:00.000 00:00:00.000 00:05:00.000 00:20:00.000 00:25:00.000 \end{snippet} On the left we have the number of milliseconds\footnote{When working with timestamps we should use nanoseconds.}. Thus we have produced five-minute bars. \item \code{xbar} is defined in k as follows: \begin{snippet}{q} q)xbar k){x*y div x:$[16h=abs[@x];"j"$x;x]} \end{snippet} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{q-sql select: \textbf{xbar} in group-by specifications (\textbf{by} phrase)} \begin{itemize} \item \code{xbar} is particularly useful in group-by specifications. \item Thus, if we want to group-by 15-minute intervals, we can apply it like so: \begin{snippet}{q} select first_bidprice:first bidprice,avg_askprice:avg askprice,max_bidsize:max bidsize,number_of_quotes:count i by(15 * 60 * 1000000000)xbar(quotes[`date]+quotes[`time]) from quotes where askprice<1250f \end{snippet} \begin{snippet}{output} x | first_bidprice avg_askprice max_bidsize number_of_quotes


|--------------------------------------------------------

2016.04.07D10:15:00.000000000| 38.34 38.35 86 1 2016.04.07D11:00:00.000000000| 1241.4 1241.5 22 1 2016.04.07D12:00:00.000000000| 38.12 38.13 7 1 2016.04.07D12:15:00.000000000| 38.11 38.12 16 1 2016.04.07D13:00:00.000000000| 1238.6 1238.7 8 1 2016.04.07D15:00:00.000000000| 38.52 38.53 9 1 2016.04.07D15:15:00.000000000| 1240.9 1241 6 1 2016.04.08D10:45:00.000000000| 40.83 40.84 6 1 2016.04.08D13:45:00.000000000| 40.54 40.56 38 1 2016.04.08D15:15:00.000000000| 40.77 40.78 26 2 \end{snippet} \end{itemize} \end{frame}

\begin{frame} \frametitle{Exercise} Produce a table of last trade prices, last trade volumes, average trade prices, and average trade volumes in each one hour-long bar. \end{frame}

\begin{frame}[fragile] \frametitle{Solution~(i)} The question is whether we want to lump different dates into the same bucket. If we do, then we could use \begin{snippet}{q} select lastp:last price,lastv:last volume,avgp:avg price,avgv:avg volume by (1000*60*60)xbar time from trades \end{snippet} \begin{snippet}{output} time | lastp lastv avgp avgv


| -------------------------

10:00:00.000| 1240 3 639.15 3.5 11:00:00.000| 41.43 2 29.98 1.5 12:00:00.000| 18.49 1 511.548 2.6 13:00:00.000| 18.85 5 28.445 4 14:00:00.000| 18.35 1 629.775 1 15:00:00.000| 19.05 2 29.915 2.5 16:00:00.000| 1258.7 1 1258.3 1 \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{Solution~(ii)} More likely than not, we want to distinguish the different dates, so we apply \code{xbar} as follows: \begin{snippet}{q} q)select lastp:last price,lastv:last volume,avgp:avg price,avgv:avg volume by (1000000000*60*60)xbar date+time from trades \end{snippet} \begin{snippet}{output} date | lastp lastv avgp avgv


| --------------------------

2016.04.07D10:00:00.000000000| 38.3 4 38.3 4 2016.04.07D12:00:00.000000000| 1240.5 1 1240.1 3.5 2016.04.07D13:00:00.000000000| 18.85 5 28.445 4 2016.04.07D14:00:00.000000000| 1241.2 1 1241.2 1 2016.04.08D10:00:00.000000000| 1240 3 1240 3 2016.04.08D11:00:00.000000000| 18.53 1 18.53 1 2016.04.08D12:00:00.000000000| 18.49 1 25.84667 2 2016.04.08D15:00:00.000000000| 40.78 3 40.78 3 2016.04.11D11:00:00.000000000| 41.43 2 41.43 2 2016.04.11D14:00:00.000000000| 18.35 1 18.35 1 2016.04.11D15:00:00.000000000| 19.05 2 19.05 2 2016.04.11D16:00:00.000000000| 1258.7 1 1258.3 1 \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{\textbf{exec}} \begin{itemize} \item \code{exec} behaves similarly to but differently from \code{select}. Unlike \code{select}, it does not return tables. \item The following \begin{snippet}{q} exec from quotes \end{snippet} will return the first row of the table as a dictionary: \begin{snippet}{output} code | `CLM16 date | 2016.04.08 time | 15:21:43.786 bidprice| 40.76 bidsize | 3h askprice| 40.77 asksize | 28h mktflag | `E \end{snippet} \item The following will return a dictionary of two lists: \begin{snippet}{q} exec date, time from quotes \end{snippet} \begin{snippet}{output} date| 2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.0.. time| 10:15:00.010 11:02:16.663 12:05:00.303 12:22:00.486 13:00:00.205 15:00:.. \end{snippet} \item Whereas the following will simply return a list: \begin{snippet}{q} exec date from quotes \end{snippet} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{\textbf{update}} \begin{itemize} \item \code{update}, whose syntax is similar to that of \code{select}, can be used to update tables. \item For example, to replace all \code{code}s with \code{`UNKNOWN} in the table \code{quotes}, we can use the following: \begin{snippet}{q} update code:`UNKNOWN from quotes \end{snippet} \item To replace the code \code{`CLM16} with \code{`crudeoil} we can use \begin{snippet}{q} update code:`crudeoil from trades where code=`CLM16 \end{snippet} \item These q-sql statements will return new tables. To update the table in place, reference it by name (\code{`quotes}) instead of using the variable \code{quotes} in the query: \begin{snippet}{q} q)update code:`crudeoil from `trades where code=`CLM16 `trades \end{snippet} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{Exercise} Use \code{update} to set all instances of \code{bidprice} where \code{bidprice} is greater than \code{1000f} to \code{0n}. \end{frame}

\begin{frame}[fragile] \frametitle{Solution} \begin{snippet}{q} q)update bidprice:0n from `quotes where bidprice>1000f `quotes \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{Exercise} Use \code{update} to set all \code{bidprice}s in the table \code{quotes} to an increasing sequence of floats: \code{0f} \code{1f} \code{2f} \code{3f}... \end{frame}

\begin{frame}[fragile] \frametitle{Solution} \begin{snippet}{q} update bidprice:`float$til count quotes from `quotes \end{snippet} or \begin{snippet}{q} update bidprice:0f+til count quotes from `quotes \end{snippet} or \begin{snippet}{q} update bidprice:`float$i from `quotes \end{snippet} or \begin{snippet}{q} update bidprice:0f+i from `quotes \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{Recall the meaning of the column attributes} \begin{itemize} \item \code{`s} --- \defn{sorted}: \code{`s#2 2 3}. The elements of the list must be sorted. By marking the list as sorted, binary search can be performed instead of fully scanning the column, speeding up many operations. \item \code{`u} --- \defn{unique}: \code{`u#2 4 5}. The elements of the list must be unique. This attribute creates a unique hash table in the background, allowing constant time lookup of elements. \item \code{`p} --- \defn{parted}: \code{`p#2 2 1}. All elements of the same value must occur in one sequential block. By storing the same items continuously on-disc reading using standard hard drives is much faster. \item \code{`g} --- \defn{grouped}: \code{`g#2 1 2}. This attribute is closest to indexing in standard databases. A lookup table from each distinct value in the table is created to map to the positions where that value occurs. This enables much quicker lookup of the entries. \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{Setting and unsetting the column attributes} We can use \code{update} to set and unset the column attributes: \begin{snippet}{q} q)meta quotes c | t f a


| -----

code | s date | d time | t bidprice| f bidsize | h askprice| f asksize | h mktflag | s \end{snippet} \begin{snippet}{q} q)update `s#date from `quotes `quotes \end{snippet} \begin{snippet}{q} q)meta quotes c | t f a


| -----

code | s date | d s time | t bidprice| f bidsize | h askprice| f asksize | h mktflag | s \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{Exercise} Set the grouped attribute on the column \code{code} and the sorted attribute on the column \code{date} of the \code{trades} table. \end{frame}

\begin{frame}[fragile] \frametitle{Solution} \begin{snippet}{q} q)meta trades c | t f a


| -----

code | s date | d time | t price | f volume | h mktflag | s comments| C \end{snippet} \begin{snippet}{q} q)update `g#code,`s#date from `trades `trades \end{snippet} \begin{snippet}{q} q)meta trades c | t f a


| -----

code | s g date | d s time | t price | f volume | h mktflag | s comments| C \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{Exercise} Remove the attributes from the \code{code} and \code{date} columns of the \code{trades} table. \end{frame}

\begin{frame}[fragile] \frametitle{Solution} \begin{snippet}{q} q)meta trades c | t f a


| -----

code | s g date | d s time | t price | f volume | h mktflag | s comments| C \end{snippet} \begin{snippet}{q} q)update `#code,`#date from `trades `trades \end{snippet} \begin{snippet}{q} q)meta trades c | t f a


| -----

code | s date | d time | t price | f volume | h mktflag | s comments| C \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{\textbf{delete}} \begin{itemize} \item \code{delete} also has a similar syntax to that of \code{update}. \item We can use the following to delete all quotes for \code{`GCM16}: \begin{snippet}{q} delete from `quotes where code=`GCM16 \end{snippet} \item Notice that we have again referred to the table by its name, \code{`quotes}, in order to change it in place. \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{Exercise} Delete all rows from \code{quotes} where the \code{askprice} is greater than \code{1000f}. \end{frame}

\begin{frame}[fragile] \frametitle{Solution} \begin{snippet}{q} q)delete from `quotes where askprice>1000f `quotes \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{Deleting columns} \code{delete} can be used to delete columns as well as rows: \begin{snippet}{q} delete bidprice from quotes \end{snippet} This will return a copy. To modify the table in place, use the symbol reference \code{`quotes} instead of \code{quotes}. \end{frame}

\begin{frame} \frametitle{Exercise} Delete the columns \code{mktflag} and \code{comments} from the table \code{trades}. \end{frame}

\begin{frame}[fragile] \frametitle{Solution} \begin{snippet}{q} q)delete mktflag,comments from `trades `trades \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{The order of evaluation~(i)} We could use \code{0N!} to investigate the order of evaluation in q: \begin{snippet}{q} q)0N!1;0N!2;0N!3 1 2 3 3 \end{snippet} \begin{snippet}{q} q)(0N!1;0N!2;0N!3) 3 2 1 1 2 3 \end{snippet} \begin{snippet}{q} q)add:{x+y+z};add[0N!1;0N!2;0N!3] 3 2 1 6 \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{The order of evaluation~(ii)} \begin{snippet}{q} q)select 0N!date,0N!time,0N!askprice from quotes 2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.04.07 2016.04.07 .. 10:15:00.010 11:02:16.663 12:05:00.303 12:22:00.486 13:00:00.205 15:00:00.051.. 38.35 1241.5 38.13 38.12 1238.7 38.53 1241 40.84 40.56 40.79 40.77 date time askprice


2016.04.07 10:15:00.010 38.35 2016.04.07 11:02:16.663 1241.5 2016.04.07 12:05:00.303 38.13 2016.04.07 12:22:00.486 38.12 2016.04.07 13:00:00.205 1238.7 2016.04.07 15:00:00.051 38.53 2016.04.07 15:20:02.224 1241 2016.04.08 10:53:00.002 40.84 2016.04.08 13:56:30.070 40.56 2016.04.08 15:20:02.000 40.79 2016.04.08 15:21:43.786 40.77 \end{snippet} \end{frame}

\begin{frame}[fragile] \frametitle{The order of evaluation~(iii)} \begin{snippet}{q} q)select 0N!date,0N!time,0N!askprice from quotes where 0N!date>2016.04.07,0N!askprice>40.7 00000001111b 1011b 2016.04.08 2016.04.08 2016.04.08 10:53:00.002 15:20:02.000 15:21:43.786 40.84 40.79 40.77 date time askprice


2016.04.08 10:53:00.002 40.84 2016.04.08 15:20:02.000 40.79 2016.04.08 15:21:43.786 40.77 \end{snippet} From this it is clear that the order of evaluation is as follows: \begin{itemize} \item \code{date>2016.04.07} \item \code{askprice>40.7} \item \code{date} \item \code{time} \item \code{askprice} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{The order of evaluation~(iv)} \begin{snippet}{q} q)select 0N!time,0N!askprice by 0N!date from quotes where 0N!date>2016.04.07,0N!askprice>40.7 00000001111b 1011b 2016.04.08 2016.04.08 2016.04.08 10:53:00.002 15:20:02.000 15:21:43.786 40.84 40.79 40.77 date | time askprice


| --------------------------------------------------------

2016.04.08| 10:53:00.002 15:20:02.000 15:21:43.786 40.84 40.79 40.77 \end{snippet} From this it is clear that the order of evaluation is as follows: \begin{itemize} \item \code{date>2016.04.07} \item \code{askprice>40.7} \item \code{by date} \item \code{time} \item \code{askprice} \end{itemize} \end{frame}

\begin{frame}[fragile] \frametitle{The order of evaluation~(v)} \begin{snippet}{q} q)select 0N!time by 0N!date,0N!askprice from quotes where 0N!date>2016.04.07,0N!askprice>40.7 00000001111b 1011b 2016.04.08 2016.04.08 2016.04.08 40.84 40.79 40.77 ,10:53:00.002 ,15:20:02.000 ,15:21:43.786 date askprice| time


| ------------

2016.04.08 40.77 | 15:21:43.786 2016.04.08 40.79 | 15:20:02.000 2016.04.08 40.84 | 10:53:00.002 \end{snippet} From this it is clear that the order of evaluation is as follows: \begin{itemize} \item \code{date>2016.04.07} \item \code{askprice>40.7} \item \code{by date} \item \code{by askprice} \item \code{time} \end{itemize} \end{frame}