Tensor Utils
This module contains utility functions that facilitate working with tensors in numerical computations.
tensor_utils.pdist(tensor, metric='euclidean')
Pairwise distances between observations in n-dimensional space.
Ported from scipy.spatial.distance.pdist @2f5aa264724099c03772ed784e7a947d2bea8398 for cherry-picked distance metrics.
Parameters: tensor : tensorflow.Tensor
metric : DistanceMetric, optional
Pairwise metric to apply. Defaults to DistanceMetric.Euclidean.
Returns: Y : tensorflow.Tensor
Returns a condensed distance matrix Y as tensorflow.Tensor. For each \(i\) and \(j\) (where \(i<j<m\), and m is the number of original observations), the metric dist(u=X[i], v=X[j]) is computed and stored in entry j of subtensor Y[j].
Examples
Gives equivalent results to scipy.spatial.distance.pdist but uses tensorflow.Tensor objects:
>>> import tensorflow as tf
>>> import numpy as np
>>> from scipy.spatial.distance import pdist as pdist_scipy
>>> input_scipy = np.array([[ 0.77228064, 0.09543156], [ 0.3918973 , 0.96806584], [ 0.66008144, 0.22163063]])
>>> result_scipy = pdist_scipy(input_scipy, metric="euclidean")
>>> session = tf.Session()
>>> input_tensorflow = tf.constant(input_scipy)
>>> result_tensorflow = session.run(pdist(input_tensorflow, metric="euclidean"))
>>> np.allclose(result_scipy, result_tensorflow)
True
Will raise a NotImplementedError for unsupported metric choices:
>>> import tensorflow as tf
>>> import numpy as np
>>> input_scipy = np.array([[ 0.77228064, 0.09543156], [ 0.3918973 , 0.96806584], [ 0.66008144, 0.22163063]])
>>> session = tf.Session()
>>> input_tensorflow = tf.constant(input_scipy)
>>> session.run(pdist(input_tensorflow, metric="lengthy_metric"))
Traceback (most recent call last):
 ...
NotImplementedError: tensor_utils.pdist: Metric 'lengthy_metric' currently not supported!
Like scipy.spatial.distance.pdist, we fail for input that is not 2-d:
>>> import tensorflow as tf
>>> import numpy as np
>>> input_scipy = np.random.rand(2, 2, 1)
>>> session = tf.Session()
>>> input_tensorflow = tf.constant(input_scipy)
>>> session.run(pdist(input_tensorflow, metric="lengthy_metric"))
Traceback (most recent call last):
 ...
ValueError: tensor_utils.pdist: A 2-d tensor must be passed.
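The condensed layout that pdist shares with scipy.spatial.distance.pdist can be illustrated without TensorFlow. The following is a minimal plain-Python sketch, not the module's implementation: `pdist_reference` and `condensed_index` are hypothetical helper names, only the Euclidean metric is covered, and the index formula assumes scipy's ordering of pairs (0,1), (0,2), ..., (1,2), ...

```python
import math


def pdist_reference(points):
    """Condensed Euclidean distances for a list of equal-length rows.

    Hypothetical helper for illustration; not part of tensor_utils.
    """
    m = len(points)
    condensed = []
    # Visit each unordered pair (i, j) with i < j exactly once.
    for i in range(m):
        for j in range(i + 1, m):
            squared = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            condensed.append(math.sqrt(squared))
    return condensed


def condensed_index(i, j, m):
    """Position of pair (i, j), with i < j < m, in the condensed vector."""
    return m * i + j - ((i + 2) * (i + 1)) // 2


points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
distances = pdist_reference(points)
print(distances)  # [5.0, 10.0, 5.0]
print(distances[condensed_index(0, 2, len(points))])  # 10.0
```

For m observations the condensed vector holds m * (m - 1) / 2 entries, which is why three points above yield three distances.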
tensor_utils.safe_divide(x, y, small_constant=1e-16, name=None)
Computes tf.divide(x, y) after adding a small constant to y in a sign-aware way, so that division-by-zero artefacts are avoided.
Parameters: x : tensorflow.Tensor
Left-side operand of tensorflow.divide
y : tensorflow.Tensor
Right-side operand of tensorflow.divide
small_constant : tensorflow.Tensor, optional
Small constant tensor to add to/subtract from y before computing x / y to avoid division-by-zero. Defaults to 1e-16.
name : string or NoneType, optional
Name of the resulting node in a tensorflow.Graph. Defaults to None.
Returns: division_result : tensorflow.Tensor
Result of division tf.divide(x, y) after applying clipping to y.
Examples
Will safely avoid divisions-by-zero under normal circumstances:
>>> import tensorflow as tf
>>> import numpy as np
>>> session = tf.Session()
>>> x = tf.constant(1.0)
>>> inf_tensor = tf.divide(x, 0.0)  # will produce "inf" due to division-by-zero
>>> np.isinf(inf_tensor.eval(session=session))
True
>>> z = safe_divide(x, 0., small_constant=1e-16)  # will avoid "inf" due to division-by-zero by clipping
>>> np.isinf(z.eval(session=session))
False
Simply adding a constant can still fail, whereas this implementation handles such corner cases correctly. Consider this example:
>>> import tensorflow as tf
>>> import numpy as np
>>> session = tf.Session()
>>> x, y = tf.constant(1.0), tf.constant(-1e-16)
>>> small_constant = tf.constant(1e-16)
>>> v1 = x / (y + small_constant)  # without sign
>>> v2 = safe_divide(x, y, small_constant=small_constant)  # with sign
>>> val1, val2 = session.run([v1, v2])
>>> np.isinf(val1)  # simply adding without considering the sign can still yield "inf"
True
>>> np.isinf(val2)  # our version behaves appropriately
False
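The sign-aware adjustment can be sketched on plain Python floats. This is only one plausible reading of "add to/subtract from y", assumed here for illustration: the constant is added when y is non-negative and subtracted otherwise. `safe_divide_reference` is a hypothetical name, and the actual tensor_utils.safe_divide may be implemented differently.

```python
import math


def safe_divide_reference(x, y, small_constant=1e-16):
    """Divide x by y after pushing y away from zero (illustrative sketch).

    Assumption: the shift follows the sign of y, so a small negative y
    becomes more negative instead of being cancelled to exactly zero.
    """
    if y >= 0:
        shifted = y + small_constant
    else:
        shifted = y - small_constant
    return x / shifted


# The naive shift cancels a denominator of -1e-16 to exactly zero.
naive_denominator = -1e-16 + 1e-16
print(naive_denominator == 0.0)  # True

# The sign-aware shift keeps both corner cases finite.
print(math.isfinite(safe_divide_reference(1.0, 0.0)))  # True
print(math.isfinite(safe_divide_reference(1.0, -1e-16)))  # True
```

The price of this guard is a small bias in the result, which is negligible whenever y is much larger in magnitude than small_constant.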
tensor_utils.safe_sqrt(x, clip_value_min=0.0, clip_value_max=inf, name=None)
Computes tf.sqrt(x) after clipping tensor x using tf.clip_by_value(x, clip_value_min, clip_value_max) to avoid square-root artefacts (e.g. from negative values).
Parameters: x : tensorflow.Tensor or tensorflow.SparseTensor
Operand of tensorflow.sqrt.
clip_value_min : 0-D (scalar) tensorflow.Tensor, optional
The minimum value to clip by. Defaults to 0.
clip_value_max : 0-D (scalar) tensorflow.Tensor, optional
The maximum value to clip by. Defaults to float("inf").
name : string or NoneType, optional
Name of the resulting node in a tensorflow.Graph. Defaults to None.
Returns: sqrt_result : tensorflow.Tensor
Result of square root tf.sqrt(x) after applying clipping to x.
Examples
Will safely avoid square root of negative values:
>>> import tensorflow as tf
>>> import numpy as np
>>> x = tf.constant(-1e-16)
>>> z = tf.sqrt(x)  # fails, results in 'nan'
>>> z_safe = safe_sqrt(x)  # works, results in '0'
>>> session = tf.Session()
>>> z_val, z_safe_val = session.run([z, z_safe])
>>> np.isnan(z_val)  # ordinary tensorflow computation gives 'nan'
True
>>> np.isnan(z_safe_val)  # `safe_sqrt` produces '0'.
False
>>> z_safe_val
0.0
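The clip-then-sqrt order described for safe_sqrt can be mirrored on scalar Python floats. This is a minimal sketch rather than the TensorFlow implementation: `safe_sqrt_reference` is a hypothetical name, and the min/max pair stands in for tf.clip_by_value.

```python
import math


def safe_sqrt_reference(x, clip_value_min=0.0, clip_value_max=math.inf):
    """Square root of x after clamping it into [clip_value_min, clip_value_max]."""
    # Clamp first, so tiny negative round-off values become clip_value_min.
    clipped = min(max(x, clip_value_min), clip_value_max)
    return math.sqrt(clipped)


print(safe_sqrt_reference(-1e-16))  # 0.0
print(safe_sqrt_reference(4.0))  # 2.0
print(safe_sqrt_reference(16.0, clip_value_max=9.0))  # 3.0
```

Clamping at 0 is typically useful when x is mathematically non-negative (for example a squared distance) but comes out slightly negative through floating-point round-off.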
tensor_utils.squareform(tensor)
Converts a condensed distance vector into a redundant (square) distance matrix.
Ported from scipy.spatial.distance.squareform @2f5aa264724099c03772ed784e7a947d2bea8398, but supports only 1-d (vector) input.
Parameters: tensor : tensorflow.Tensor
Returns: redundant_distance_tensor : tensorflow.Tensor
Examples
May be used in conjunction with tensor_utils.pdist to obtain a redundant distance matrix:
>>> import tensorflow as tf
>>> import numpy as np
>>> from scipy.spatial.distance import pdist as scipy_pdist, squareform as scipy_squareform
>>> original_input = np.random.rand(2, 4)
>>> tf_redundant_distance_tensor = squareform(pdist(tf.constant(original_input)))
>>> scipy_redundant_distance_matrix = scipy_squareform(scipy_pdist(original_input))
>>> session = tf.Session()
>>> tf_redundant_distance_matrix = session.run(tf_redundant_distance_tensor)
>>> np.allclose(tf_redundant_distance_matrix, scipy_redundant_distance_matrix)
True
Contrary to scipy.spatial.distance.squareform, conversion of 2D input to a condensed distance vector is not supported:
>>> import numpy as np
>>> import tensorflow as tf
>>> illegal_input = tf.constant(np.random.rand(4, 4))
>>> squareform(illegal_input)
Traceback (most recent call last):
 ...
NotImplementedError: tensor_utils.squareform: Only 1-d (vector) input is supported!
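The vector-to-matrix direction that squareform supports can be sketched in plain Python. This is an illustration under assumptions, not the ported code: `squareform_reference` is a hypothetical name, and the fill order follows scipy's convention of listing pairs (i, j) with i < j row by row.

```python
import math


def squareform_reference(condensed):
    """Expand a condensed distance vector into a symmetric square matrix."""
    n = len(condensed)
    # Solve n = m * (m - 1) / 2 for the number of observations m.
    m = (1 + math.isqrt(1 + 8 * n)) // 2
    if m * (m - 1) // 2 != n:
        raise ValueError("length is not m * (m - 1) / 2 for any integer m")
    square = [[0.0] * m for _ in range(m)]
    k = 0
    for i in range(m):
        for j in range(i + 1, m):
            # Each distance appears twice, mirrored across the zero diagonal.
            square[i][j] = square[j][i] = condensed[k]
            k += 1
    return square


print(squareform_reference([5.0, 10.0, 5.0]))
# [[0.0, 5.0, 10.0], [5.0, 0.0, 5.0], [10.0, 5.0, 0.0]]
```

The result is called redundant because every off-diagonal distance is stored twice and the diagonal is always zero.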
tensor_utils.unvectorize(tensor, original_shape)
Reshape previously vectorized tensor back to its original_shape.
Essentially the inverse of the transformation performed by tensor_utils.vectorize.
Parameters: tensor : tensorflow.Variable object or tensorflow.Tensor object
Input tensor to unvectorize.
original_shape : tensorflow.TensorShape
Original shape of tensor prior to its vectorization.
Returns: tensor_unvectorized : tensorflow.Tensor object
Tensor with the same values as tensor but reshaped back to shape original_shape.
Examples
Function unvectorize undoes the work done by vectorize:
>>> import tensorflow as tf
>>> import numpy as np
>>> t1 = tf.constant([[12.0, 14.0, -3.0], [4.0, 3.0, 1.0], [9.0, 2.0, 4.0]])
>>> t2 = unvectorize(vectorize(t1), original_shape=t1.shape)
>>> session = tf.Session()
>>> t1_array, t2_array = session.run([t1, t2])
>>> np.allclose(t1_array, t2_array)
True
It will also work for tensorflow.Variable objects, but will return tensorflow.Tensor as unvectorized output.
>>> import tensorflow as tf
>>> import numpy as np
>>> v = tf.Variable([[0.0, 1.0], [2.0, 0.0]])
>>> session = tf.Session()
>>> session.run(tf.global_variables_initializer())
>>> t = unvectorize(vectorize(v.initialized_value()), original_shape=v.shape)
>>> v_array, t_array = session.run([v, t])
>>> np.allclose(t_array, v_array)
True
tensor_utils.vectorize(tensor)
Flatten any matrix into one long vector, e.g. turn [[a, b], [c, d]] into [a, b, c, d].
For vector inputs, this simply returns a copy of the vector.
For reference see also the vec-operator in:
https://hec.unil.ch/docs/files/23/100/handout1.pdf#page=2
Parameters: tensor : tensorflow.Variable object or tensorflow.Tensor object
Input tensor to vectorize.
Returns: tensor_vectorized : tensorflow.Variable object or tensorflow.Tensor object
Vectorized result for input tensor.
Examples
A tensorflow.Variable can be vectorized: (NOTE: the returned vectorized variable must be initialized to use it in tensorflow computations.)
>>> import tensorflow as tf
>>> v1 = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
>>> v1_vectorized = vectorize(v1)
>>> session = tf.Session()
>>> session.run(tf.global_variables_initializer())
>>> session.run(v1_vectorized)
array([[ 1.],
       [ 2.],
       [ 3.],
       [ 4.],
       [ 5.],
       [ 6.]], dtype=float32)
A normal tensorflow.Tensor can be vectorized:
>>> import tensorflow as tf
>>> t1 = tf.constant([[12.0, 14.0, -3.0], [4.0, 3.0, 1.0], [9.0, 2.0, 4.0]])
>>> t1_vectorized = vectorize(t1)
>>> session = tf.Session()
>>> session.run(t1_vectorized)
array([[ 12.],
       [ 14.],
       [ -3.],
       [  4.],
       [  3.],
       [  1.],
       [  9.],
       [  2.],
       [  4.]], dtype=float32)
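The round trip between vectorize and unvectorize can be sketched on nested Python lists. This is a minimal illustration, not the TensorFlow code: `vectorize_reference` and `unvectorize_reference` are hypothetical names, only 2-d input is handled, and the row-by-row order and column-vector output shape are inferred from the example outputs above.

```python
def vectorize_reference(matrix):
    """Flatten a 2-d nested list row by row into a column vector."""
    return [[value] for row in matrix for value in row]


def unvectorize_reference(column, original_shape):
    """Rebuild a rows x cols nested list from a column vector."""
    rows, cols = original_shape
    flat = [entry[0] for entry in column]
    if len(flat) != rows * cols:
        raise ValueError("column length does not match original_shape")
    # Cut the flat list back into consecutive rows of length cols.
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
column = vectorize_reference(matrix)
print(column)  # [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
print(unvectorize_reference(column, (2, 3)) == matrix)  # True
```

The original shape has to be passed back in explicitly because a flattened vector of six entries could equally have come from a 2 x 3, 3 x 2, 1 x 6 or 6 x 1 matrix.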