MA operates nearly transparently as a drop-in replacement for Numerical, offering nearly all the same functions. In addition, more methods and functions are available to deal with the complexities of masked arrays. This talk discusses the design choices and features of the MA extension.
Numerical Python has proven to be a powerful and successful array-language extension to Python. Its use permits compact and efficient manipulations of large amounts of data. The Numerical Python package defines a large number of functions on arrays, such as trigonometric functions, logs and exponentials, inner and outer products, logical functions, and specialized array manipulation functions.
Unfortunately, many real-life applications contain data that, while basically consisting of full arrays, contain elements whose value is unknown. Most commonly, this arises from observational data, such as a particular weather station that was out of operation for a time, or an area of the Earth or experimental domain that needs to be excluded from a
calculation, or by times when no observations are taken in an otherwise uniform sequence, such as daily financial data not recorded on weekends or holidays, missing pixels in a picture, etc. Additionally, many numerical algorithms can be more easily implemented using arrays with "missing" data.
The masked array facility, MA, was designed to meet this need. A masked array is conceptually an array with missing values. MA represents such an array as a data array and a mask array. The mask array, if present, is an array of 1's and 0's, of the same shape as the data array, where a 1 represents a location with invalid data. When operations are performed, the resulting quantity has an appropriate mask. For example, if we add two
masked arrays, the result has a mask consisting of the logical or of the operands' masks. All operations avoid using the data from invalid locations.
For technical reasons it is not possible to inherit from the Numeric array object. While it would be possible to write a separate extension in C, it was felt important that MA arrays be convertible to and from Numeric arrays, and that a maintenance burden not be introduced by writing a compiled extension. Therefore, MA is implemented entirely in Python. As such, MA is a great case study of Python's power and flexibility. As a class, the MaskedArray class illustrates usage of nearly every feature you can have in a Python class, and classes such as masked_binary_operator illustrate the design of function-like class instances.