Programming/Kdb/Resources

From Thalesians Wiki

Books

Q for Mortals

  • Author: Jeffry A. Borror
  • Publisher: q4m LLC
  • Paperback: 586 pages
  • ISBN-10: 0692573674
  • ISBN-13: 978-0692573679

Q for Mortals Version 3 is a thorough presentation of the q programming language and an introduction to the kdb+ database. It is a complete rewrite of the original Q for Mortals that is current with q 3.3. The presentation is derived from classes taught by the author at international financial institutions over the last decade. It is a series of tutorials based on q snippets intended to be entered interactively into the q console by the reader. The text takes its subject seriously but not itself. Technical explanations are augmented by mathematical observations, references to general programming concepts and other programming languages, and jokes. Coding style recommendations and advice to avoid gotchas appear liberally throughout. Examples are as simple as they can be but no simpler.

Chapter 1, Q Shock and Awe, provides a piquant panorama of the power of q and its dazzling zen-like nature. Chapter 2 describes the base data types of q. Chapter 3 discusses lists, the fundamental data structure of q. Chapter 4 presents the basic operators. Chapter 5 introduces dictionaries, which associate keys and values. Chapter 6 presents an in-depth description of functions and q's constructs for functional programming. Chapter 7 demonstrates transforming data from one type to another. Chapter 8 introduces tables and keyed tables, the fundamental data structures for kdb. Chapter 9 describes q-sql and all the methods to manipulate tables. Chapter 10 presents ways to control execution of q programs. Chapter 11 covers file and interprocess communication I/O. Chapter 12 describes workspace organization and management. Chapter 13 discusses system commands and command line parameters. Chapter 14 serves as an introduction to the kdb+ database. Chapter A has a complete rundown of the built-in functions. Chapter B lists common error messages. A cross-referenced index closes the book.

Version 3.1 of Q for Mortals is available online: https://code.kx.com/q4m3/

Q Tips: Fast, Scalable and Maintainable Kdb+

  • Author: Nick Psaris
  • Publisher: Vector Sigma
  • Paperback: 463 pages
  • ASIN: B00UZ8OMME

Learn q by building a real life application. Q Tips teaches you everything you need to know to build a fully functional Complex Event Processing (CEP) engine. Advanced topics include profiling an active q server, derivatives pricing, and histogram charting. As each new topic is introduced, tips are highlighted to help you write better q.

Machine Learning and Big Data with kdb+/q

  • Authors: Jan Novotny, Paul A. Bilokon, Aris Galiotos, Frederic Deleze
  • Publisher: Wiley
  • Hardcover: 640 pages
  • ISBN-10: 1119404754
  • ISBN-13: 978-1119404750

Upgrade your programming language to more effectively handle high-frequency data.

Machine Learning and Big Data with kdb+/q offers quants, programmers, and algorithmic traders a practical entry into the powerful but non-intuitive kdb+ database and q programming language. Ideally designed to handle the speed and volume of high-frequency financial data at sell- and buy-side institutions, these tools have become the de facto standard; this book provides the foundational knowledge practitioners need to work effectively with this rapidly-evolving approach to analytical trading.

The discussion follows the natural progression of working strategy development to allow hands-on learning in a familiar sphere, illustrating the contrast of efficiency and capability between the q language and other programming approaches. Rather than an all-encompassing "bible"-type reference, this book is designed with a focus on real-world practicality to help you quickly get up to speed and become productive with the language.

  • Understand why kdb+/q is the ideal solution for high-frequency data.
  • Delve into the "meat" of q programming to solve practical economic problems.
  • Perform everyday operations, including basic regressions, cointegration, volatility estimation, modelling and more.
  • Learn advanced techniques from market impact and microstructure analyses to machine learning techniques including neural networks.

The kdb+ database and its underlying programming language q offer unprecedented speed and capability. As trading algorithms and financial models grow ever more complex against the markets they seek to predict, they encompass an ever-larger swath of data — more variables, more metrics, more responsiveness and altogether more "moving parts".

Traditional programming languages are increasingly failing to accommodate the growing speed and volume of data and lack the necessary flexibility that cutting-edge financial modelling demands. Machine Learning and Big Data with kdb+/q opens up the technology and flattens the learning curve to help you quickly adopt a more effective set of tools.

Fun Q

  • Author: Nick Psaris
  • Publisher: Vector Sigma
  • Paperback: 415 pages
  • ISBN-10: 1734467509
  • ISBN-13: 978-1734467505

Bring the power of machine learning to the fastest time-series database. Fun Q uses the powerful q programming language to implement many of the most famous machine-learning algorithms. Using a meticulously factored machine-learning library, each algorithm is broken into its basic building blocks and then rebuilt from scratch. Famous machine-learning data sets are used to motivate each chapter as advanced q idioms are introduced. Whether you are a data scientist who is new to q or a kdb+ administrator who is new to machine learning, you'll have fun learning how machine-learning algorithms can be implemented in the concise vector-functional language q. With nothing but the q binary, you'll be able to download data sets, generate plots in the q terminal and get progress-bar-style feedback as model parameters iteratively improve. In addition to being a functional introduction to machine learning algorithms in q, it is designed to be a fun introduction as well!

Data in the 21st century is like oil in the 18th century. A data scientist can save millions in labor and network infrastructure costs. She can unlock further billions in gained access to real-time insight. This is the benefit of a lab setup at the direct heart of the rig, silo, distillery and refinery complex. Evolving symbolic data notation shortens the communication lines and breaks down the complexity overhead. It is this integrative kind of technology and language that makes the world yet smaller.

C++ has QuantLib, Python has numpy et al. Q/kdb+ now has three brand new machine learning libraries. That is, if we don't count the additional ad-hoc GitHub projects of community members. We are seeing the start of a new era unfolding page by page. Hopefully, the three libraries will harmonize into a powerful toolbox in the hands of a new breed of data scientist.

—Daniel Krizian. Book review—Fun Q: A Functional Introduction to Machine Learning in Q. Vector, Volume 27, No. 1, Nov. 9, 2020.

Tutorials

References

Frequently asked questions

Hello! Perhaps you're just learning q and kdb, or maybe you've been using them for years but something's got you stuck. You want answers. Fast. That's why you're using kdb in the first place, right?

Our names are Jim and Nate, and we've spent — and continue to spend — much of our time reading and writing q code as well as building kdb databases and analytics on top of them. We also spend a lot of time with our users, teaching them how to use q and kdb more effectively as well as helping them implement their ideas. They come to us, because they know that's the fastest way to get the answers they need.

[...]

So, we created this site to host a growing collection of articles that will eventually become our book. Although we will be very fortunate indeed if any one of our writings inspires you, we are confident we can help you navigate the powerful yet cryptic waters of q and kdb. Fast.

Code repositories

APL and other related languages

  • The British APL Association (BAA) was founded in 1984 to promote a family of interactive array-programming languages noted for elegance, conciseness, and fast development speed. Many of them were derived from Kenneth Iverson's mathematical notation.
  • Vector—the journal of the BAA.
  • APL Wiki is an online open-content wiki; that is, a voluntary association of individuals and groups working to develop a common knowledge resource. It was launched at the end of 2006 as a MoinMoin wiki. It was created and maintained by Kai Jäger of APL Team Ltd, and its logo derives from that of APL Team. In 2019, APL Wiki was reborn as a MediaWiki site, this time with content more in the style of Wikipedia. It is now maintained by Richard Park, but is not directly affiliated with any particular individuals, companies, or organizations. Migration of content from the old APL Wiki is ongoing.

A+

  • A+ is an array programming language descended from the programming language A, which in turn was created to replace APL in 1988. Arthur Whitney developed the A portion of A+, while other developers at Morgan Stanley extended it, adding a graphical user interface and other language features. A+ is a high-level, interactive, interpreted language, designed for numerically intensive applications, especially those found in financial applications. A+ runs on many Unix variants, including Linux. It is free and open source software released under a GNU General Public License.
A+ provides an extended set of functions and operators, a graphical user interface with automatic synchronizing of widgets and variables, asynchronous executing of functions associated with variables and events, dynamic loading of user compiled subroutines, and other features. A newer graphical user interface has not yet been ported to all supported platforms.
The A+ language implements the following changes to the APL language:
  • An A+ function may have up to nine formal parameters.
  • A+ code statements are separated by semicolons, so a single statement may be divided into two or more physical lines.
  • The explicit result of a function or operator is the result of the last statement executed.
  • A+ implements an object called a dependency, which is a global variable (the dependent variable) and an associated definition that is like a function with no arguments. Values can be explicitly set and referenced in exactly the same ways as for a global variable, but they can also be set through the associated definition.
Interactive A+ development is primarily done in the XEmacs editor, through extensions to the editor. Because A+ code uses the original APL symbols, displaying A+ requires a font with those special characters; a font named kapl is provided on the web site for that purpose.
Arthur Whitney went on to create a proprietary array language named K. Like J, K omits the APL character set. It lacks some of the perceived complexities of A+, such as the existence of statements and two different modes of syntax.
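
The dependency idea described above (a variable paired with a zero-argument definition, whose value can be set explicitly or obtained through the definition) can be loosely illustrated in Python. This is only an analogy under stated assumptions, not A+ code and not a description of how the A+ interpreter works: the class name `Dependency` and its methods `get`, `set`, and `invalidate` are hypothetical, and the sketch leaves it to the caller to mark a value as stale.

```python
class Dependency:
    """Illustrative stand-in for a variable with an associated definition.

    Hypothetical names throughout; this is not an A+ API.
    """

    def __init__(self, definition):
        self._definition = definition  # callable taking no arguments
        self._value = None
        self._valid = False            # False means "recompute on next read"

    def get(self):
        # Reference the value; evaluate the definition only when it is stale.
        if not self._valid:
            self._value = self._definition()
            self._valid = True
        return self._value

    def set(self, value):
        # Explicit assignment, as for an ordinary global variable.
        self._value = value
        self._valid = True

    def invalidate(self):
        # Assumption of this sketch: the caller marks the value stale by hand.
        self._valid = False


prices = [10.0, 12.0, 11.0]
total = Dependency(lambda: sum(prices))

print(total.get())   # value obtained through the definition
total.set(100.0)     # value set explicitly
print(total.get())
prices.append(7.0)
total.invalidate()   # stale: the next read runs the definition again
print(total.get())
```

The point of the sketch is the two routes to a value: `set` behaves like plain assignment, while `get` falls back to the stored definition whenever the value has been marked stale.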

J

  • J is a high-level, general-purpose programming language that is particularly suited to the mathematical, statistical, and logical analysis of data. It is a powerful tool for developing algorithms and exploring problems that are not already well understood.
J is written in portable C and is available for Windows, Linux, Mac, iOS, Android, and Raspberry Pi. J can be installed and distributed for free. The source is provided under both commercial and GPL 3 licenses.
J is easy to install, has a small footprint, and has direct access to tutorials and documentation.
  • Jd (J database) is a high-performance columnar RDBMS written in J that is geared towards storing and analyzing large amounts of data. Jd is free for non-commercial use.
Jd lives openly and dynamically in the J execution and development environment, so that the full power of J is available to the application developer. For example, Jd columns are mapped to J nouns, so built-in J primitives can apply directly to the data.
It works well with large tables (millions of rows to billions), multiple tables connected by joins, structured data, numerical data, and complex queries and aggregations.

Benchmarks

New technologies continue to advance the speed of these workloads or reduce their cost. These include new and improved database software; new kinds of non-volatile memory, and new ways to incorporate them into the stack; new processors, memory, and server architectures; advanced file systems; and more scalable storage systems.
STAC-M3 subjects any technology stack to the same rigorous standards.
  • Other STAC benchmarks include time sync (STAC-TS), tick-to-trade (STAC-T1), network I/O (STAC-N1, -T0), feed handlers (STAC-M1), data distribution (STAC-M2), risk computation (STAC-A2), backtesting (STAC-A3), and tick analytics (STAC-M3).