Copyright 2020 The TensorFlow Authors.
NumPy API on TensorFlow
Overview
TensorFlow implements a subset of the NumPy API, available as tf.experimental.numpy. This allows running NumPy code, accelerated by TensorFlow, while also allowing access to all of TensorFlow's APIs.
Setup
Enabling NumPy behavior
In order to use tnp as NumPy, enable NumPy behavior for TensorFlow:
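For example, assuming TensorFlow 2.x is installed (these imports are reused by the sketches throughout this guide):

```python
import timeit

import numpy as np
import tensorflow as tf
import tensorflow.experimental.numpy as tnp

# Switch TensorFlow to NumPy-style type promotion and literal type inference.
tnp.experimental_enable_numpy_behavior()
```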
This call enables type promotion in TensorFlow and also changes type inference, when converting literals to tensors, to more strictly follow the NumPy standard.
Note: This call will change the behavior of all of TensorFlow, not just the tf.experimental.numpy module.
TensorFlow NumPy ND array
An instance of tf.experimental.numpy.ndarray, called ND Array, represents a multidimensional dense array of a given dtype placed on a certain device. It is an alias to tf.Tensor. Check out the ND array class for useful methods like ndarray.T, ndarray.reshape, ndarray.ravel and others.
First create an ND array object, and then invoke different methods.
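For example, a short sketch (ndim and device are standard tf.Tensor attributes):

```python
# Create an ND array and inspect basic attributes.
ones = tnp.ones([5, 3], dtype=tnp.float32)
print("shape = %s, rank = %d, dtype = %s, device = %s" % (
    ones.shape, ones.ndim, ones.dtype.name, ones.device))

# ND array is an alias to tf.Tensor, so this prints True.
print(isinstance(ones, tf.Tensor))

# Invoke NumPy-style methods.
print(ones.T.shape)            # (3, 5)
print(ones.reshape(-1).shape)  # (15,)
print(ones.ravel().shape)      # (15,)
```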
Type promotion
There are 4 options for type promotion in TensorFlow.

- By default, TensorFlow raises errors instead of promoting types for mixed type operations.
- Running tnp.experimental_enable_numpy_behavior() switches TensorFlow to use NumPy type promotion rules (described below).
- After TensorFlow 2.15, there are two new options (refer to TF NumPy Type Promotion for details):
  - tnp.experimental_enable_numpy_behavior(dtype_conversion_mode="all") uses JAX type promotion rules.
  - tnp.experimental_enable_numpy_behavior(dtype_conversion_mode="safe") uses JAX type promotion rules, but disallows certain unsafe promotions.
NumPy Type Promotion
TensorFlow NumPy APIs have well-defined semantics for converting literals to ND array, as well as for performing type promotion on ND array inputs. Please see np.result_type for more details.
TensorFlow APIs leave tf.Tensor inputs unchanged and do not perform type promotion on them, while TensorFlow NumPy APIs promote all inputs according to NumPy type promotion rules. In the next example, you will perform type promotion. First, run addition on ND array inputs of different types and note the output types. None of these type promotions would be allowed by TensorFlow APIs.
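A small demonstration of this, assuming the setup above:

```python
print("Type promotion for operations")
values = [tnp.asarray(1, dtype=d) for d in
          (tnp.int32, tnp.int64, tnp.float32, tnp.float64)]
for i, v1 in enumerate(values):
  for v2 in values[i + 1:]:
    print("%s + %s => %s" % (
        v1.dtype.name, v2.dtype.name, (v1 + v2).dtype.name))
```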
Finally, convert literals to ND array using ndarray.asarray and note the resulting type.
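For example:

```python
print("Type inference during array creation")
print("tnp.asarray(1).dtype == tnp.%s" % tnp.asarray(1).dtype.name)
print("tnp.asarray(1.).dtype == tnp.%s" % tnp.asarray(1.).dtype.name)
```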
When converting literals to ND array, NumPy prefers wide types like tnp.int64 and tnp.float64. In contrast, tf.convert_to_tensor prefers tf.int32 and tf.float32 types for converting constants to tf.Tensor. TensorFlow NumPy APIs adhere to the NumPy behavior for integers. As for floats, the prefer_float32 argument of experimental_enable_numpy_behavior lets you control whether to prefer tf.float32 over tf.float64 (defaults to False).
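For example, here is a sketch that toggles the flag and checks the inferred float types:

```python
tnp.experimental_enable_numpy_behavior(prefer_float32=True)
print("When prefer_float32 is True:")
print("tnp.asarray(1.).dtype == tnp.%s" % tnp.asarray(1.).dtype.name)
print("tnp.add(1., 2.).dtype == tnp.%s" % tnp.add(1., 2.).dtype.name)

# Restore the default so the rest of the guide uses float64 inference.
tnp.experimental_enable_numpy_behavior(prefer_float32=False)
print("When prefer_float32 is False:")
print("tnp.asarray(1.).dtype == tnp.%s" % tnp.asarray(1.).dtype.name)
print("tnp.add(1., 2.).dtype == tnp.%s" % tnp.add(1., 2.).dtype.name)
```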
Broadcasting
Similar to TensorFlow, NumPy defines rich semantics for "broadcasting" values. You can check out the NumPy broadcasting guide for more information and compare this with TensorFlow broadcasting semantics.
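For example, a quick sketch of chained broadcasting:

```python
x = tnp.ones([2, 3])
y = tnp.ones([3])
z = tnp.ones([1, 2, 1])
print("Broadcasting shapes %s, %s and %s gives shape %s" % (
    x.shape, y.shape, z.shape, (x + y + z).shape))
```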
Indexing
NumPy defines very sophisticated indexing rules. See the NumPy Indexing guide. Note the use of ND arrays as indices below.
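A few illustrative cases (a sketch):

```python
x = tnp.arange(24).reshape(2, 3, 4)

print("Basic indexing")
print(x[1, tnp.newaxis, 1:3, ...], "\n")

print("Boolean indexing")
print(x[:, (True, False, True)], "\n")

print("Advanced indexing: note the ND array used as an index.")
print(x[1, (0, 0, 1), tnp.asarray([0, 1, 1])])
```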
Example Model
Next, you can see how to create a model and run inference on it. This simple model applies a relu layer followed by a linear projection. Later sections will show how to compute gradients for this model using TensorFlow's GradientTape.
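Here is a minimal sketch of such a model; the layer sizes and initialization scheme are illustrative choices, not prescribed ones:

```python
class Model(object):
  """Model with a dense layer followed by a linear projection."""

  def __init__(self):
    self.weights = None

  def predict(self, inputs):
    if self.weights is None:
      size = inputs.shape[1]
      # `tnp.float32` is used throughout for performance.
      stddev = tnp.sqrt(size).astype(tnp.float32)
      w1 = tnp.random.randn(size, 64).astype(tnp.float32) / stddev
      bias = tnp.random.randn(64).astype(tnp.float32)
      w2 = (tnp.random.randn(64, 2).astype(tnp.float32) /
            tnp.sqrt(64).astype(tnp.float32))
      self.weights = (w1, bias, w2)
    else:
      w1, bias, w2 = self.weights
    y = tnp.matmul(inputs, w1) + bias
    y = tnp.maximum(y, 0)     # relu
    return tnp.matmul(y, w2)  # linear projection

model = Model()
# Create input data and compute predictions.
print(model.predict(tnp.ones([2, 32], dtype=tnp.float32)))
```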
TensorFlow NumPy and NumPy
TensorFlow NumPy implements a subset of the full NumPy spec. While more symbols will be added over time, there are systematic features that will not be supported in the near future. These include NumPy C API support, Swig integration, Fortran storage order, views and stride_tricks, and some dtypes (like np.recarray and np.object). For more details, please see the TensorFlow NumPy API Documentation.
NumPy interoperability
TensorFlow ND arrays can interoperate with NumPy functions. These objects implement the __array__ interface. NumPy uses this interface to convert function arguments to np.ndarray values before processing them.
Similarly, TensorFlow NumPy functions can accept inputs of different types including np.ndarray. These inputs are converted to an ND array by calling ndarray.asarray on them.
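For example, a sketch of passing values in both directions:

```python
# ND array passed into a NumPy function.
np_sum = np.sum(tnp.ones([2, 3]))
print("sum = %s. Class: %s" % (float(np_sum), np_sum.__class__))

# `np.ndarray` passed into a TensorFlow NumPy function.
tnp_sum = tnp.sum(np.ones([2, 3]))
print("sum = %s. Class: %s" % (float(tnp_sum), tnp_sum.__class__))
```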
Conversion of the ND array to and from np.ndarray may trigger actual data copies. Please see the section on buffer copies for more details.
Buffer copies
Intermixing TensorFlow NumPy with NumPy code may trigger data copies. This is because TensorFlow NumPy has stricter requirements on memory alignment than those of NumPy.
When a np.ndarray is passed to TensorFlow NumPy, it will check for alignment requirements and trigger a copy if needed. When passing an ND array CPU buffer to NumPy, generally the buffer will satisfy alignment requirements and NumPy will not need to create a copy.
ND arrays can refer to buffers placed on devices other than the local CPU memory. In such cases, invoking a NumPy function will trigger copies across the network or device as needed.
Given this, intermixing with NumPy API calls should generally be done with caution and the user should watch out for overheads of copying data. Interleaving TensorFlow NumPy calls with TensorFlow calls is generally safe and avoids copying data. See the section on TensorFlow interoperability for more details.
Operator precedence
TensorFlow NumPy defines an __array_priority__ higher than NumPy's. This means that for operators involving both ND array and np.ndarray, the former will take precedence, i.e., np.ndarray input will get converted to an ND array and the TensorFlow NumPy implementation of the operator will get invoked.
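A quick check (sketch):

```python
x = tnp.ones([2]) + np.ones([2])
print("x = %s\nclass = %s" % (x, x.__class__))  # class is tf.Tensor (ND array)
```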
TF NumPy and TensorFlow
TensorFlow NumPy is built on top of TensorFlow and hence interoperates seamlessly with TensorFlow.
tf.Tensor and ND array
ND array is an alias to tf.Tensor, so they can be intermixed without triggering actual data copies.
TensorFlow interoperability
An ND array can be passed to TensorFlow APIs, since ND array is just an alias to tf.Tensor. As mentioned earlier, such interoperation does not do data copies, even for data placed on accelerators or remote devices.
Conversely, tf.Tensor objects can be passed to tf.experimental.numpy APIs, without performing data copies.
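For example, a sketch where the conversions below are no-ops:

```python
x = tf.constant([1, 2])

# `asarray` and `convert_to_tensor` here do not copy data,
# since ND array is an alias to tf.Tensor.
tnp_x = tnp.asarray(x)
tf_x = tf.convert_to_tensor(tnp_x)
print(tnp_x.__class__, tf_x.__class__)

# `tf.Tensor.numpy()` continues to work.
print(x.numpy(), x.numpy().__class__)
```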
Gradients and Jacobians: tf.GradientTape
TensorFlow's GradientTape can be used for backpropagation through TensorFlow and TensorFlow NumPy code.
Use the model created in the Example Model section above to compute gradients and Jacobians.
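Here is a sketch that reuses the Model class above; the helpers create_batch and compute_gradients are illustrative names introduced for this example:

```python
def create_batch(batch_size=32):
  """Creates a batch of random inputs and labels."""
  np_x = np.random.randn(batch_size, 32).astype(np.float32)
  np_y = np.random.randn(batch_size, 2).astype(np.float32)
  return tnp.asarray(np_x), tnp.asarray(np_y)

def compute_gradients(model, inputs, labels):
  """Computes gradients of squared loss w.r.t. the model weights."""
  with tf.GradientTape() as tape:
    # The weights are plain tensors, not tf.Variables, so they must be
    # watched explicitly.
    tape.watch(model.weights)
    prediction = model.predict(inputs)
    loss = tnp.sum(tnp.square(prediction - labels))
  return tape.gradient(loss, model.weights)

inputs, labels = create_batch()
gradients = compute_gradients(model, inputs, labels)
print("Gradient shapes:", [g.shape for g in gradients])

# Jacobian of the predictions w.r.t. the inputs.
with tf.GradientTape() as tape:
  tape.watch(inputs)
  prediction = model.predict(inputs)
print("Jacobian shape:", tape.jacobian(prediction, inputs).shape)
```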
Trace compilation: tf.function
TensorFlow's tf.function works by "trace compiling" the code and then optimizing these traces for much faster performance. See the Introduction to Graphs and Functions.
tf.function can be used to optimize TensorFlow NumPy code as well. Here is a simple example to demonstrate the speedups. Note that the body of the tf.function code includes calls to TensorFlow NumPy APIs.
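A sketch of the timing comparison, reusing compute_gradients from above (absolute numbers vary by hardware):

```python
inputs, labels = create_batch(512)

print("Eager performance")
compute_gradients(model, inputs, labels)  # Warmup.
# number=10 runs; total seconds * 100 gives ms per run.
print(timeit.timeit(lambda: compute_gradients(model, inputs, labels),
                    number=10) * 100, "ms")

print("\ntf.function compiled performance")
compiled_compute_gradients = tf.function(compute_gradients)
compiled_compute_gradients(model, inputs, labels)  # Warmup and tracing.
print(timeit.timeit(lambda: compiled_compute_gradients(model, inputs, labels),
                    number=10) * 100, "ms")
```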
Vectorization: tf.vectorized_map
TensorFlow has inbuilt support for vectorizing parallel loops, which allows speedups of one to two orders of magnitude. These speedups are accessible via the tf.vectorized_map API and apply to TensorFlow NumPy code as well.
It is sometimes useful to compute the gradient of each output in a batch w.r.t. the corresponding input batch element. Such computation can be done efficiently using tf.vectorized_map as shown below.
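A sketch, again reusing the helpers introduced above:

```python
@tf.function
def vectorized_per_example_gradients(inputs, labels):
  def single_example_gradient(arg):
    inp, label = arg
    return compute_gradients(model,
                             tnp.expand_dims(inp, 0),
                             tnp.expand_dims(label, 0))
  # `tf.vectorized_map` semantically maps `single_example_gradient` over
  # each row of `inputs` and `labels`, similar to `tf.map_fn`, but the
  # underlying machinery vectorizes away the loop for better performance.
  return tf.vectorized_map(single_example_gradient, (inputs, labels))

batch_size = 128
inputs, labels = create_batch(batch_size)

per_example_gradients = vectorized_per_example_gradients(inputs, labels)
for w, p in zip(model.weights, per_example_gradients):
  print("Weight shape: %s, per example gradient shape: %s" % (
      w.shape, p.shape))
```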
Device placement
TensorFlow NumPy can place operations on CPUs, GPUs, TPUs and remote devices. It uses standard TensorFlow mechanisms for device placement. The simple example below shows how to list all devices and then place some computation on a particular device.

TensorFlow also has APIs for replicating computation across devices and performing collective reductions, which will not be covered here.
List devices
tf.config.list_logical_devices and tf.config.list_physical_devices can be used to find what devices to use.
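For example:

```python
print("All logical devices:", tf.config.list_logical_devices())
print("All physical devices:", tf.config.list_physical_devices())

# Try to get the first GPU device, falling back to the CPU.
try:
  device = tf.config.list_logical_devices(device_type="GPU")[0]
except IndexError:
  device = "/device:CPU:0"
```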
Placing operations: tf.device
Operations can be placed on a device by running them in a tf.device scope.
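For example, a sketch using the device found above:

```python
print("Using device: %s" % str(device))
# If a GPU is available, these operations run on the GPU and the
# outputs are placed in GPU memory.
with tf.device(device):
  prediction = model.predict(create_batch(5)[0])
print("prediction is placed on %s" % prediction.device)
```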
Copying ND arrays across devices: tnp.copy
A call to tnp.copy, placed in a certain device scope, will copy the data to that device, unless the data is already on that device.
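For example:

```python
with tf.device("/device:CPU:0"):
  prediction_cpu = tnp.copy(prediction)
print(prediction.device)
print(prediction_cpu.device)
```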
Performance comparisons
TensorFlow NumPy uses highly optimized TensorFlow kernels that can be dispatched on CPUs, GPUs and TPUs. TensorFlow also performs many compiler optimizations, like operation fusion, which translate to performance and memory improvements. See TensorFlow graph optimization with Grappler to learn more.
However, TensorFlow has higher overheads for dispatching operations compared to NumPy. For workloads composed of small operations (less than about 10 microseconds), these overheads can dominate the runtime, and NumPy could provide better performance. For other cases, TensorFlow should generally provide better performance.
Run the benchmark below to compare NumPy and TensorFlow NumPy performance for different input sizes.
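Below is a minimal benchmark sketch; the expression, sizes, and the benchmark helper are illustrative choices, and naive wall-clock timing can be skewed by asynchronous GPU dispatch:

```python
def benchmark(f, sizes, number=30):
  """Returns the average runtime of `f` in ms for each input size."""
  times = []
  for size in sizes:
    x = np.random.randn(size, size).astype(np.float32)
    f(x)  # Warmup run; also triggers any one-time conversion/tracing.
    t = timeit.timeit(lambda: f(x), number=number)
    times.append(t * 1000. / number)
  return times

sizes = (2, 16, 128, 512)
np_times = benchmark(lambda x: np.matmul(x, x) + x, sizes)
tnp_times = benchmark(lambda x: tnp.matmul(x, x) + x, sizes)
for size, np_t, tnp_t in zip(sizes, np_times, tnp_times):
  print("size %4d: NumPy %.3f ms, TF NumPy %.3f ms" % (size, np_t, tnp_t))
```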