Modern Python Cookbook

Writing hints for more complex types

The Python language allows us to write functions (and classes) that are entirely generic with respect to data type. Consider this function as an example:

    def temperature(*, f_temp=None, c_temp=None):
        if c_temp is None:
            return {'f_temp': f_temp, 'c_temp': 5*(f_temp-32)/9}
        elif f_temp is None:
            return {'f_temp': 32+9*c_temp/5, 'c_temp': c_temp}
        else:
            raise TypeError("One of f_temp or c_temp must be provided")

This follows three recipes shown earlier: Using super flexible keyword parameters, Forcing keyword-only arguments with the * separator, and Designing complex if...elif chains, from Chapter 2, Statements and Syntax.

This function produces a fairly complex data structure as a result. It's not very clear what the data structure is. Worse, it's difficult to be sure functions are using the output from this function correctly. The parameters don't provide type hints, either.

This is valid, working Python. It lacks a formal description that would help a person understand the intent.

We can also include docstrings. Here's the recommended style:

def temperature(*, f_temp=None, c_temp=None):
    """Convert between Fahrenheit temperature and
    Celsius temperature.

    :key f_temp: Temperature in °F.
    :key c_temp: Temperature in °C.
    :returns: dictionary with two keys:
        :f_temp: Temperature in °F.
        :c_temp: Temperature in °C.
    """

The docstring doesn't support sophisticated, automated testing to confirm that the documentation actually matches the code. The two could disagree with each other.

The mypy tool performs the needed automated type-checking. For this to work, we need to add type hints about the type of data involved. How can we provide meaningful type hints for more complex data structures?

Getting ready

We'll implement a version of the temperature() function. We'll need several names from the typing module to provide hints regarding the data types for parameters and return values:

from typing import Optional, Union, Dict

We've opted to import a few of the type names from the typing module. If we're going to supply type hints, we want them to be terse. It's awkward having to write typing.List[str]. We prefer to omit the module name by using this kind of explicit import.
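As a small illustration of this point, the two functions below carry the same hint written both ways. The function names are hypothetical and exist only for this sketch.

```python
import typing
from typing import List

# Fully qualified: accurate, but noisy when repeated in many signatures.
def shout_qualified(words: typing.List[str]) -> typing.List[str]:
    return [w.upper() for w in words]

# With the explicit import, the hint is shorter and means the same thing.
def shout(words: List[str]) -> List[str]:
    return [w.upper() for w in words]
```

Both spellings name the same type, so the choice is purely about readability.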

How to do it...

Python 3.5 introduced type hints to the language. We can use them in three places: function parameters, function returns, and assignment statements (as variable annotations or type hint comments):

  1. Annotate parameters to functions, like this:
    def temperature(*,
        f_temp: Optional[float]=None,
        c_temp: Optional[float]=None):

    We've added : and a type hint as part of the parameter. The type float tells mypy that a number is expected here; mypy also accepts an int wherever a float is expected. We've wrapped this with the Optional[] type operation to state that the argument value can be either a number or None.

  2. Annotate return values from functions, like this:
    def temperature(*,
        f_temp: Optional[float]=None,
        c_temp: Optional[float]=None) -> Dict[str, float]:

    We've added -> and a type hint for the return value of this function. In this case, we've stated that the result will be a dictionary object with keys that are strings, str, and values that are numbers, float.

    The typing module introduces type hint names, such as Dict, that describe data structures. This is different from the dict class, which actually builds objects. typing.Dict is merely a description of possible objects.

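  3. If necessary, we can add type hints to assignment statements. These are sometimes required to clarify a long, complex series of statements. Python 3.6 and later let us write the hint directly as a variable annotation; older code used a # type: comment for the same purpose. If we wanted to add them, the annotations could look like this: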
    result: Dict[str, float] = {"c_temp": c_temp, "f_temp": f_temp}
    

    We've added a Dict[str, float] type hint to the statement that builds the final dictionary object.
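Putting the three steps together, a complete version of the function might look like the sketch below. The body follows the if...elif structure used by the later examples in this recipe. As an assumption of this sketch, f_temp takes precedence when both arguments are supplied.

```python
from typing import Dict, Optional, Union

def temperature(
    *,
    f_temp: Optional[float] = None,
    c_temp: Optional[float] = None,
) -> Dict[str, float]:
    """Convert between Fahrenheit and Celsius temperature."""
    if f_temp is not None:
        # f_temp wins if both arguments are given (assumption of this sketch).
        c_temp = 5 * (f_temp - 32) / 9
    elif c_temp is not None:
        f_temp = 32 + 9 * c_temp / 5
    else:
        raise TypeError("One of f_temp or c_temp must be provided")
    # Step 3: a variable annotation on the assignment statement.
    result: Dict[str, float] = {"c_temp": c_temp, "f_temp": f_temp}
    return result

# Optional[float] is shorthand for "float or None".
same_hint = Optional[float] == Union[float, None]
```

Calling temperature(f_temp=212) returns a dictionary holding both values, and same_hint is True because the two spellings describe the same type.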

How it works...

The type annotations we've added are called hints. They're not requirements that are somehow checked by the Python compiler. They're not checked at runtime either.
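A quick way to see that hints are not enforced is to pass a value that contradicts them. The double() function here is a hypothetical example, not part of the recipe.

```python
def double(x: float) -> float:
    return x * 2

# The hint says float, but Python happily runs this with a str.
print(double("ab"))  # abab

# The hints are simply stored on the function object for tools to read.
print(double.__annotations__)
```

A checker such as mypy would be expected to flag the double("ab") call, while the interpreter runs it without complaint.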

These type hints are used by a separate program, mypy. See http://mypy-lang.org for more information.

The mypy program examines the Python code, including the type hints. It applies some formal reasoning and inference techniques to determine if the various type hints will always be true. For larger and more complex programs, the output from mypy will include warnings and errors that describe potential problems with either the code itself, or the type hints decorating the code.

For example, here's a mistake that's easy to make. We've assumed that our function returns a single number. Our return statement, however, doesn't match our expectation:

def temperature_bad(
    *, f_temp: Optional[float] = None, c_temp: Optional[float] = None
) -> float:
    if f_temp is not None:
        c_temp = 5 * (f_temp - 32) / 9
    elif c_temp is not None:
        f_temp = 32 + 9 * c_temp / 5
    else:
        raise TypeError("One of f_temp or c_temp must be provided")
    result = {"c_temp": c_temp, "f_temp": f_temp}
    return result

When we run mypy, we'll see this:

Chapter_03/ch03_r07.py:45: error: Incompatible return value type (got "Dict[str, float]", expected "float")

We can see that line 45, the return statement, doesn't match the function definition. The result was a Dict[str, float] object but the definition hint was a float object. Ideally, a unit test would also uncover a problem here.

Given this error, we need to fix either the return statement or the definition so that the expected type and the actual type match. It's not clear which of the two is right. Either of these could be the intent:

  • Return a single value, consistent with the definition that has the -> float hint. This means the return statement needs to be fixed.
  • Return the dictionary object, consistent with the return statement where a Dict[str, float] object was created. This means we need to correct the def statement to have the proper return type. Changing this may spread ripples of change to other functions that expect the temperature() function to return a float object.
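The two possible repairs can be sketched side by side. The names to_celsius() and temperature_fixed() are hypothetical, chosen only to keep the two options apart.

```python
from typing import Dict, Optional

# Option 1: keep the "-> float" hint and return a single number.
def to_celsius(f_temp: float) -> float:
    return 5 * (f_temp - 32) / 9

# Option 2: keep the dictionary and correct the return hint instead.
def temperature_fixed(
    *, f_temp: Optional[float] = None, c_temp: Optional[float] = None
) -> Dict[str, float]:
    if f_temp is not None:
        c_temp = 5 * (f_temp - 32) / 9
    elif c_temp is not None:
        f_temp = 32 + 9 * c_temp / 5
    else:
        raise TypeError("One of f_temp or c_temp must be provided")
    return {"c_temp": c_temp, "f_temp": f_temp}
```

In both versions the return statement and the declared return type agree, which is the property mypy checks.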

The extra syntax for parameter types and return types has no real impact on performance, and only a very small cost when the source code is first compiled into byte code. These are, after all, merely hints.

The docstring is an important part of the code. The code describes data and processing, but can't clarify intent. The docstring comments can provide insight into what the values in the dictionary are and why they have specific key names.

There's more...

A dictionary with specific string keys is a common Python data structure. It's so common there's a type hint in the mypy_extensions library that's perfect for this situation. If you've installed mypy, then mypy_extensions should also be present.

The TypedDict class definition is a way to define a dictionary with specific string keys, and has an associated type hint for each of those keys:

from mypy_extensions import TypedDict
TempDict = TypedDict(
    "TempDict",
    {
        "c_temp": float,
        "f_temp": float,
    }
)

This defines a new type, TempDict, which is a kind of Dict[str, Any], a dictionary mapping a string key to another value. It further narrows the definition by stating that the keys must come from the defined set of string keys. It also provides a distinct type for each individual string key. These constraints aren't checked at runtime; they're used only by mypy.
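The runtime behavior can be seen in a short sketch. As an assumption, it imports TypedDict from the standard typing module, which provides it in Python 3.8 and later, rather than from mypy_extensions, so it runs without extra packages.

```python
from typing import TypedDict  # assumption: Python 3.8+, not mypy_extensions

TempDict = TypedDict("TempDict", {"c_temp": float, "f_temp": float})

# At runtime, a TempDict value is an ordinary dict.
boiling: TempDict = {"c_temp": 100.0, "f_temp": 212.0}
print(type(boiling))

# Nothing is validated when the object is built: this has a missing key
# and a str where a float is declared, yet Python creates it anyway.
# A type checker is what would be expected to reject it.
unchecked = TempDict(c_temp="hot")
print(unchecked)
```

The declared keys and types matter only to tools that read the hints.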

We can make another small change to make use of this type:

def temperature_d(
    *,
    f_temp: Optional[float] = None,
    c_temp: Optional[float] = None
) -> TempDict:
    if f_temp is not None:
        c_temp = 5 * (f_temp - 32) / 9
    elif c_temp is not None:
        f_temp = 32 + 9 * c_temp / 5
    else:
        raise TypeError("One of f_temp or c_temp must be provided")
    result: TempDict = {"c_temp": c_temp, "f_temp": f_temp}
    return result

We've made two small changes to the temperature() function to create this temperature_d() variant. First, we've used the TempDict type to define the resulting type of data. Second, the assignment for the result variable has had the type hint added to assert that we're building an object conforming to the TempDict type.
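The benefit shows up in code that consumes the result. The sketch below repeats the definitions so that it runs on its own, and it adds a hypothetical describe() function. It also assumes TypedDict is imported from typing (Python 3.8+) instead of mypy_extensions.

```python
from typing import Optional, TypedDict  # assumption: Python 3.8+

TempDict = TypedDict("TempDict", {"c_temp": float, "f_temp": float})

def temperature_d(
    *, f_temp: Optional[float] = None, c_temp: Optional[float] = None
) -> TempDict:
    if f_temp is not None:
        c_temp = 5 * (f_temp - 32) / 9
    elif c_temp is not None:
        f_temp = 32 + 9 * c_temp / 5
    else:
        raise TypeError("One of f_temp or c_temp must be provided")
    result: TempDict = {"c_temp": c_temp, "f_temp": f_temp}
    return result

# Hypothetical consumer: the parameter hint documents exactly which keys
# the function relies on.
def describe(reading: TempDict) -> str:
    return f"{reading['c_temp']:.1f} °C = {reading['f_temp']:.1f} °F"

print(describe(temperature_d(f_temp=212.0)))
```

If describe() referred to a key that TempDict does not define, such as reading['k_temp'], mypy would be expected to report the problem before the code ever runs.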

See also