How Python Type Hints Make Coding Easier
Wed May 31 2023 by Brian Stanley
You may have heard about Python type hints and wondered whether they're relevant to quants or only to professional software developers. In this article, I'll explain how QuantRocket's JupyterLab environment uses type hints to enable better auto-complete and in-editor documentation, and I'll explain when quants should use type hints in their own code.
What are Type Hints?
Type hints were introduced in Python 3.5 and are a way of annotating Python functions to indicate the expected types of variables, arguments, and return values. The following example function uses type hints to indicate that the name argument should be a string and that the function will return a string:
def greeting(name: str) -> str:
    return 'Hello ' + name
What is the Purpose of Type Hints?
Type hints are not enforced by the Python interpreter at runtime but can be used by third-party tools such as type checkers, IDEs, and linters to provide various benefits.
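To make this concrete, here is a minimal sketch showing that the interpreter does not act on type hints. The double function is a hypothetical example written for this illustration, not part of any library. Calling it with a string contradicts the hint, yet Python runs the call without complaint; the hints are simply stored on the function, where tools can read them.

```python
def double(x: int) -> int:
    # The hints say int, but nothing checks them when the function runs.
    return x * 2

# Passing a string contradicts the hint, yet Python executes the call anyway.
result = double("ab")
print(result)

# The hints are stored on the function object for tools to read.
print(double.__annotations__)
```

A type checker would flag the double("ab") call before the code ever runs, which is exactly the kind of benefit third-party tools add on top of the hints.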
At large organizations, type hints provide a way to enforce consistent usage of functions and classes when large teams of developers work on a common codebase. As most quants work alone or in small teams and write standalone scripts rather than edit large, complex codebases, this isn't a strong reason for most quants to care about type hints.
More to the point, IDEs like JupyterLab can utilize type hints to facilitate better auto-complete, in-editor documentation, and other editor enrichments that make coding easier and more intuitive. JupyterLab has long been a popular IDE among data scientists, but type hints take its capabilities to the next level.
Why Traditional Introspection Isn't Always Enough
JupyterLab — and Python more generally — has always offered a more interactive and intelligent coding experience than many other programming languages because Python is an interpreted language. Jupyter notebooks use a Python feature called introspection to dynamically inspect objects in notebooks, allowing the notebook to provide auto-complete suggestions and access to function documentation. To make this work, the Jupyter notebook runs a Python interpreter in the background, called the kernel, which executes notebook cells one by one and which the notebook front-end can query to obtain information about the current state of objects in the notebook.
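The following sketch shows the kind of introspection Python makes available on live objects, using only the standard library. It is an illustration of the general idea, not a description of how the Jupyter kernel implements auto-complete internally. The greeting function is redefined here so the example is self-contained.

```python
import inspect

def greeting(name: str) -> str:
    """Return a greeting for the given name."""
    return 'Hello ' + name

# dir() lists the attributes of a live object; a completion menu
# could be built from a list like this.
str_methods = [m for m in dir('abc') if not m.startswith('_')]
print(str_methods[:5])

# inspect reads the signature and docstring from the live function object.
print(inspect.signature(greeting))
print(inspect.getdoc(greeting))
```

Note that every call above operates on an object that already exists in memory. That is why this approach depends on code having been executed first.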
There are two key limitations of kernel-based auto-complete. The first is that it only works in notebooks or interactive consoles, but not in text editors, since editors don't connect to a Python kernel. Notebooks are great, but sometimes you need to use a text editor, and it would be nice to have auto-complete there, too. JupyterLab's text editor has traditionally been a weak point (providing syntax highlighting but not much more), so much so that previous versions of QuantRocket shipped with an alternate editor, Eclipse Theia, to supplement JupyterLab.
The second limitation of kernel-based introspection is that it only works for objects defined in notebook cells that have already been executed. Thus, it can assist us up to the point of, but not beyond, the current execution state of the notebook. This is inconvenient for APIs like the Pipeline API where it is common to chain together multiple method calls in a single line of code. Consider the following Pipeline expression, which selects the quintile of stocks with the lowest price-to-earnings ratio:
from zipline.pipeline import sharadar
low_pe = sharadar.Fundamentals.slice(dimension='ART').PE.latest.percentile_between(0, 20)
A typical Pipeline user might know that this line of code is possible but might not remember all the steps to get there or what the different dimension values are for Fundamentals.slice(). (The dimension argument is used to select annual, quarterly, or trailing-twelve-month fundamentals.) Assuming we've already imported the sharadar module in a previous cell of the notebook, Jupyter's traditional, kernel-based introspection will be able to help us auto-complete Fundamentals and slice, but it won't know what to do beyond that, because slice() is a method and the kernel doesn't yet know what it returns, since we haven't yet executed this cell. We would need to execute the first part of the expression to get help with the later parts, but this would mean awkwardly splitting a single logical expression across multiple cells. Alternatively, we could leave JupyterLab to consult outside documentation, but that is inconvenient.
Type Hints Solve the Limitations of Traditional Introspection
Type hints facilitate a completely different approach to intelligent code editing, called static analysis. This approach doesn't rely on a Python kernel interactively executing your code. Rather, static analysis tools read and parse the imported library's source code files statically (that is, without executing them) to facilitate auto-complete, in-editor documentation, and other editor features. This means these features can work not only in notebooks but in text editors, and they can work in notebook cells you haven't yet executed.
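As a rough illustration of reading source code without executing it, the sketch below uses Python's built-in ast module (with ast.unparse, available in Python 3.9+) to pull parameter hints, the return hint, and the docstring out of a piece of source text. The SOURCE string is a hypothetical stand-in for a library file, and this is only a toy: real static analysis tools are far more sophisticated.

```python
import ast

# Source text of a hypothetical library module. It is parsed, never executed.
SOURCE = '''
def greeting(name: str) -> str:
    """Return a greeting for the given name."""
    raise RuntimeError("this body never runs during static analysis")
'''

tree = ast.parse(SOURCE)

signatures = {}
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        # Collect the annotation of each annotated positional parameter as text.
        params = {
            arg.arg: ast.unparse(arg.annotation)
            for arg in node.args.args
            if arg.annotation is not None
        }
        returns = ast.unparse(node.returns) if node.returns is not None else None
        signatures[node.name] = {
            "params": params,
            "returns": returns,
            "doc": ast.get_docstring(node),
        }

print(signatures["greeting"])
```

The function body raises an error if called, yet the script completes normally, because the source is only parsed. Everything an editor needs for a tooltip (parameter types, return type, docstring) is recovered from the text alone.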
Two things are necessary to make static analysis work. One is a static analysis tool. QuantRocket's JupyterLab environment utilizes Pyright, an open-source static analysis library developed and maintained by Microsoft and used in Visual Studio Code, Microsoft's popular desktop code editor. The second requirement is that the libraries you import and use must be appropriately documented and annotated with type hints so that the static analysis tool understands how the libraries work.
Let's return to the Pipeline expression from earlier and see how the combination of Pyright and type hints solves the problem we encountered. Whereas kernel-based introspection gets stuck at slice() (since it doesn't know what kind of object the not-yet-executed slice() method returns), Pyright breezes through to the end of the line.
How is this accomplished? Pyright parses the source code file containing the Fundamentals.slice() method and discovers a type hint that explicitly states what kind of object the slice() method returns (it returns another instance of Fundamentals). The type hint looks something like this:
def slice(cls, dimension: ..., period_offset: int = 0) -> 'Fundamentals':
    ...
The type hint allows Pyright to inspect the source code for the returned object and see what parameters it takes and what kind of object it returns, then repeat the same process for the next step in the expression, and so on until the end of the expression.
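The step-by-step process described above can be sketched in a few lines. This is a toy model of the idea, not Pyright's actual algorithm, and the classes in SOURCE (Dataset, Column, Factor, Filter) are hypothetical, simplified stand-ins rather than the real Pipeline API. The sketch parses the source, records each method's return hint, then follows those hints from one call in a chain to the next without executing anything.

```python
import ast

# Hypothetical, simplified library source. The names are illustrative and
# are not the real Pipeline API. The text is parsed, never executed.
SOURCE = '''
class Dataset:
    def slice(self, dimension: str) -> "Dataset": ...
    def column(self, name: str) -> "Column": ...

class Column:
    def latest(self) -> "Factor": ...

class Factor:
    def percentile_between(self, lo: float, hi: float) -> "Filter": ...

class Filter:
    pass
'''

def build_return_table(source: str) -> dict[str, dict[str, str]]:
    """Map each class name to {method name: name of the hinted return type}."""
    table: dict[str, dict[str, str]] = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            methods: dict[str, str] = {}
            for item in node.body:
                if isinstance(item, ast.FunctionDef) and item.returns is not None:
                    ann = item.returns
                    # A quoted hint such as "Dataset" parses as a string constant.
                    if isinstance(ann, ast.Constant):
                        methods[item.name] = str(ann.value)
                    else:
                        methods[item.name] = ast.unparse(ann)
            table[node.name] = methods
    return table

def resolve_chain(table: dict[str, dict[str, str]], start: str, calls: list[str]) -> list[str]:
    """Follow return hints from one method call to the next."""
    types = [start]
    for method in calls:
        types.append(table[types[-1]][method])
    return types

table = build_return_table(SOURCE)
chain = resolve_chain(table, "Dataset", ["slice", "column", "latest", "percentile_between"])
print(chain)
```

At each step, the hinted return type tells the resolver which class to look in next, which is also what an editor needs to know to offer the right completions after each dot. If any method in the chain lacked a return hint, the lookup would fail at that point, which mirrors why un-annotated libraries limit what static analysis can offer.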
When to Add Type Hints in Your Code
Facilitating a user-friendly and intelligent coding experience is primarily the job of platform developers and library maintainers. End users don't need to do much more than make sure they're using modern platforms and updated libraries that take advantage of static analysis.
There are times, however, when you should add type hints in your own code. For example, you should do so when you write a function (as opposed to simply using an existing library function) that accepts arguments which it would be nice for the editor to understand. Let's look at Zipline as an example. Most user-defined Zipline functions accept two arguments, context and data, which provide access to the trading algorithm's current state (such as the current positions) and to price data, respectively. If you write your functions without type hints, like this:
def rebalance(context, data):
    ...
Pyright won't understand what kind of objects context and data are and won't be able to help you navigate their available properties and methods. But if you include type hints, like this:
import zipline.api as algo
def rebalance(context: algo.Context, data: algo.BarData):
    ...
you'll get auto-complete and in-editor documentation to remind you how to do common tasks like checking your positions or loading current or historical prices.
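The same principle applies to helper functions that have nothing to do with Zipline. Here is a small self-contained sketch; the Position class, the total_exposure function, and the symbols and numbers are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Position:
    """Hypothetical stand-in for a position object."""
    symbol: str
    shares: int
    price: float

def total_exposure(positions: list[Position]) -> float:
    # Because positions is annotated, a static analysis tool knows each p
    # is a Position and can suggest .shares and .price while you type.
    return sum(p.shares * p.price for p in positions)

exposure = total_exposure([Position("AAA", 10, 150.0), Position("BBB", 5, 300.0)])
print(exposure)
```

Without the list[Position] hint, the editor would have no way to know what p is inside the function body, and you would be back to remembering attribute names yourself.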
About QuantRocket
QuantRocket is a Python-based platform for researching, backtesting, and trading quantitative strategies. It offers a suite of data integrations and supports multiple backtesters: Zipline, the open-source backtester that originally powered Quantopian; Moonshot, a vectorized backtester created in-house; and MoonshotML, a walk-forward machine learning backtester. Built on Docker, QuantRocket can be deployed locally or to the cloud and has an open architecture that is flexible and extensible. As of version 2.9, all public interfaces of all QuantRocket-maintained packages have been annotated with type hints.