NumPy ndarray.var()
The numpy.ndarray.var() method calculates the variance of the elements in a NumPy array.
It measures the spread of data by computing the average squared deviation from the mean.
You can compute variance for the entire array or along a specified axis.
Syntax
ndarray.var(axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True)
Parameters
Parameter | Type | Description |
---|---|---|
axis | None, int, or tuple of ints, optional | Axis or axes along which the variance is computed. If None, the variance is computed over the entire array. |
dtype | dtype, optional | Data type used for the computation. If None, integer arrays are computed in float64 and floating-point arrays in their own type. |
out | ndarray, optional | Alternative output array for storing the result. Must have the same shape as the expected output. |
ddof | int, optional | Delta Degrees of Freedom (default: 0). The divisor in the variance computation is N - ddof, where N is the number of elements. |
keepdims | bool, optional | If True, the reduced axes are kept as dimensions of size one, so the result broadcasts correctly against the original array. |
where | array_like of bool, optional | Boolean mask selecting the elements to include in the variance computation (default: True, all elements). |
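The where and dtype parameters do not appear in the numbered examples, so the following is a minimal sketch of both. The sample values and the threshold of 50 are made up for illustration, and the where argument assumes NumPy 1.20 or later.

```python
import numpy as np

# Illustrative data: three small values and one outlier
arr = np.array([1.0, 2.0, 3.0, 100.0])

# Boolean mask selecting the elements to include (the threshold is arbitrary)
mask = arr < 50

# Variance of the selected elements only, i.e. of [1.0, 2.0, 3.0]
masked_variance = arr.var(where=mask)
print("Variance where arr < 50:", masked_variance)

# A float32 array is reduced in float32 unless dtype requests otherwise
arr32 = np.array([1, 2, 3, 4, 5], dtype=np.float32)
default_dtype = arr32.var().dtype
wide_dtype = arr32.var(dtype=np.float64).dtype
print("Default result dtype:", default_dtype)
print("Result dtype with dtype=np.float64:", wide_dtype)
```

Requesting a wider dtype can reduce rounding error when the input is stored in a low-precision float type.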
Return Value
Returns a NumPy floating-point scalar when axis=None (and keepdims=False), or an array of variance values when an axis is specified.
The result represents the average squared deviation of array elements from the mean.
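As a quick check of the return types described here, the sketch below inspects the result for a small 2D array. The array is illustrative only.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# With axis=None the result is a zero-dimensional NumPy scalar
total = arr.var()
print(type(total).__name__, total.ndim)

# With an axis the result is an array with that axis removed
per_column = arr.var(axis=0)
print("Shape along axis 0:", per_column.shape)

# keepdims=True turns even the full reduction into an array of ones-sized axes
total_kept = arr.var(keepdims=True)
print("Shape with keepdims:", total_kept.shape)
```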
Examples
1. Calculating Variance of an Entire Array
In this example, we compute the variance of all elements in a 1D array.
import numpy as np
# Creating a 1D NumPy array
arr = np.array([1, 2, 3, 4, 5])
# Calculating variance of the entire array
variance = arr.var()
# Printing the result
print("Variance of the array:", variance)
Output:
Variance of the array: 2.0
Here the mean is 3, so the squared deviations are 4, 1, 0, 1, and 4. Their average, 10 / 5, gives the variance of 2.0.
2. Using the axis Parameter in ndarray.var()
Here, we compute variance along different axes in a 2D array.
import numpy as np
# Creating a 2D NumPy array
arr = np.array([[1, 2, 3],
                [4, 5, 6]])
# Calculating variance along axis 0 (column-wise)
variance_axis0 = arr.var(axis=0)
print("Variance along axis 0:", variance_axis0)
# Calculating variance along axis 1 (row-wise)
variance_axis1 = arr.var(axis=1)
print("Variance along axis 1:", variance_axis1)
Output:
Variance along axis 0: [2.25 2.25 2.25]
Variance along axis 1: [0.66666667 0.66666667]
The variance is computed separately for each column when axis=0 and for each row when axis=1.
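The axis parameter also accepts a tuple of ints or a negative index. The sketch below shows both on a 3D array, which is an assumption made for illustration and is not part of the example above.

```python
import numpy as np

# Illustrative 3D array with shape (2, 3, 4) holding the values 0..23
arr = np.arange(24).reshape(2, 3, 4)

# Reduce over the first two axes at once; only the last axis (length 4) remains
variance_01 = arr.var(axis=(0, 1))
print("Shape for axis=(0, 1):", variance_01.shape)

# Negative indices count from the end, so axis=-1 is the last axis
variance_last = arr.var(axis=-1)
print("Shape for axis=-1:", variance_last.shape)
```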
3. Adjusting Degrees of Freedom with ddof
Using ddof=1 computes the sample variance instead of the population variance.
import numpy as np
# Creating a 1D NumPy array
arr = np.array([1, 2, 3, 4, 5])
# Calculating variance with ddof=1 (sample variance)
variance_sample = arr.var(ddof=1)
print("Sample variance:", variance_sample)
Output:
Sample variance: 2.5
With ddof=1, the divisor in the variance calculation becomes N - 1 instead of N.
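To make the role of the divisor concrete, the following sketch recomputes both variances by hand and compares them with ndarray.var(). The variable names are chosen for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

n = arr.size
# Sum of squared deviations from the mean
sum_sq = ((arr - arr.mean()) ** 2).sum()

population = sum_sq / n      # divisor N, corresponds to ddof=0
sample = sum_sq / (n - 1)    # divisor N - 1, corresponds to ddof=1

print("Matches arr.var():", np.isclose(population, arr.var()))
print("Matches arr.var(ddof=1):", np.isclose(sample, arr.var(ddof=1)))
```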
4. Keeping Dimensions with keepdims=True
Using keepdims=True keeps each reduced axis in the result as a dimension of size one instead of dropping it.
import numpy as np
# Creating a 2D NumPy array
arr = np.array([[1, 2, 3],
                [4, 5, 6]])
# Calculating variance along axis 1 with keepdims=True
variance_keepdims = arr.var(axis=1, keepdims=True)
print("Variance with keepdims:", variance_keepdims)
Output:
Variance with keepdims: [[0.66666667]
 [0.66666667]]
Here the result has shape (2, 1) rather than (2,), so it has the same number of dimensions as the input and broadcasts correctly against the original (2, 3) array.
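One place where this matters is scaling each row by its own statistics. The following is a minimal sketch of row-wise standardization. It is one possible use of keepdims, not something the method requires.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Both statistics have shape (2, 1), so they broadcast across each row
row_mean = arr.mean(axis=1, keepdims=True)
row_var = arr.var(axis=1, keepdims=True)

# Scale each row to zero mean and unit variance
standardized = (arr - row_mean) / np.sqrt(row_var)
print(standardized)
```

Without keepdims=True, row_mean and row_var would have shape (2,), which cannot be broadcast against a (2, 3) array, so the subtraction would raise a ValueError.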