NumPy: Indexing, Slicing, and Reshaping
Efficiently accessing and manipulating parts of NumPy arrays is central to data processing. This document covers NumPy's indexing, slicing, and reshaping capabilities, along with concatenating and splitting arrays.
1. Basic Indexing and Slicing
NumPy array indexing is similar to Python list indexing, but with extensions for multiple dimensions.
import numpy as np
# 1D Array
arr_1d = np.arange(10) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print("1D Array:", arr_1d)
# Accessing a single element
print("Element at index 5:", arr_1d[5])
# Slicing (start:stop:step) - stop is exclusive
print("Slice from index 2 to 7 (exclusive):", arr_1d[2:7])
print("Slice from beginning to 5 (exclusive):", arr_1d[:5])
print("Slice from 5 to end:", arr_1d[5:])
print("Every other element:", arr_1d[::2])
print("Reversed array:", arr_1d[::-1])
# 2D Array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
print("\n2D Array:\n", arr_2d)
# Accessing a single element (row, column)
print("Element at (1, 2):", arr_2d[1, 2]) # Row 1, Column 2 (value 6)
print("Element at (0, 0):", arr_2d[0][0]) # Chained indexing: same result, but builds an intermediate row first
# Slicing rows
print("\nFirst row:", arr_2d[0, :])
print("First two rows:\n", arr_2d[:2, :])
# Slicing columns
print("\nSecond column:", arr_2d[:, 1])
print("First two columns:\n", arr_2d[:, :2])
# Slicing both rows and columns
print("\nSub-array (first two rows, first two columns):\n", arr_2d[:2, :2])
# Using negative indices (from the end)
print("\nLast row:", arr_2d[-1, :])
print("Last column:", arr_2d[:, -1])
2. Advanced Indexing
Integer Array Indexing
Selects arbitrary items by passing arrays (or lists) of indices. Unlike basic slicing, the result is a copy of the data, not a view.
import numpy as np
arr = np.arange(10, 20) # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
print("Original Array:", arr)
# Select specific elements
indices = np.array([0, 3, 7])
print("Elements at indices [0, 3, 7]:", arr[indices])
# Select elements from a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n2D Array:\n", arr_2d)
# Select specific elements (row, col) pairs
rows = np.array([0, 1, 2])
cols = np.array([0, 1, 0])
print("Elements at (0,0), (1,1), (2,0):", arr_2d[rows, cols]) # [arr_2d[0,0], arr_2d[1,1], arr_2d[2,0]]
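The examples above read individual elements. As a minimal sketch of two related cases, the block below picks whole rows with an index list and assigns through an index array. The names picked_rows and the value -1 are illustrative choices, not part of the original examples.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An index list on the first axis selects whole rows, in the order given.
picked_rows = arr_2d[[2, 0]]
print("Rows 2 and 0:\n", picked_rows)

# The result is a copy, so editing it leaves arr_2d untouched.
picked_rows[0, 0] = 100
print("arr_2d[2, 0] is still:", arr_2d[2, 0])

# Assigning through an index array does modify the original array.
arr = np.arange(10, 20)
arr[[0, 3, 7]] = -1
print("After assignment:", arr)
```

Reading with an index array copies data, while assigning with one writes in place; keeping the two cases apart avoids most surprises.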
Boolean (Mask) Indexing
Selects elements based on a boolean condition.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("Original Array:", arr)
# Select elements greater than 5
mask = arr > 5
print("Boolean mask for arr > 5:", mask)
print("Elements greater than 5:", arr[mask])
# Combined condition
print("Elements > 3 and < 8:", arr[(arr > 3) & (arr < 8)])
# For 2D arrays
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
print("\n2D Array:\n", arr_2d)
print("Elements > 5 in 2D array:", arr_2d[arr_2d > 5])
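The combined condition above uses & (element-wise AND). A short sketch of the companion operators | (OR) and ~ (NOT), plus assignment through a mask, might look like this. The variable names and the cap value 5 are chosen for illustration only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Each comparison needs its own parentheses, because | and & bind
# more tightly than < and >.
outside = arr[(arr < 3) | (arr > 8)]
print("Elements < 3 or > 8:", outside)

# ~ inverts a boolean mask.
odd = arr[~(arr % 2 == 0)]
print("Odd elements:", odd)

# Assigning through a mask changes only the selected positions.
clipped = arr.copy()          # work on a copy to keep arr intact
clipped[clipped > 5] = 5
print("Values capped at 5:", clipped)
```

Python's `and`, `or`, and `not` keywords do not work element-wise on arrays, which is why the symbolic operators are used here.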
3. Reshaping Arrays
Changing the shape (dimensions) of an array without changing its data.
- reshape(shape): Returns an array with the specified shape, as a view of the original data when possible.
- flatten(): Returns a 1D copy of the array.
- ravel(): Returns a 1D array (view or copy, depending on contiguity). Often faster than flatten().
- .T: Transposes the array (swaps rows and columns).
- np.newaxis (or None): Adds a new axis of length 1, increasing the array's dimension by one.
import numpy as np
arr = np.arange(12) # [0, 1, ..., 11]
print("Original 1D Array:\n", arr)
# Reshape to 3 rows, 4 columns
arr_reshaped_3x4 = arr.reshape(3, 4)
print("\nReshaped to 3x4:\n", arr_reshaped_3x4)
# Reshape, letting NumPy infer one dimension (-1)
arr_reshaped_2x_infer = arr.reshape(2, -1) # Inferred to be (2, 6)
print("\nReshaped to 2x (inferred):\n", arr_reshaped_2x_infer)
arr_reshaped_infer_x_3 = arr.reshape(-1, 3) # Inferred to be (4, 3)
print("\nReshaped to x3 (inferred):\n", arr_reshaped_infer_x_3)
# Flatten back to 1D
arr_flattened = arr_reshaped_3x4.flatten()
print("\nFlattened (copy):\n", arr_flattened)
arr_raveled = arr_reshaped_3x4.ravel()
print("Raveled (view or copy):\n", arr_raveled)
# Transpose
print("\nTransposed (3x4 -> 4x3):\n", arr_reshaped_3x4.T)
# Adding a new dimension (e.g., for batch processing in deep learning)
arr_1d = np.arange(10) # Defined here so this block runs on its own
arr_2d_from_1d = arr_1d[:, np.newaxis] # Makes it a column vector
print("\n1D array as column vector (using np.newaxis):\n", arr_2d_from_1d)
print("Shape of column vector:", arr_2d_from_1d.shape)
arr_2d_from_1d_row = arr_1d[np.newaxis, :] # Makes it a row vector
print("\n1D array as row vector (using np.newaxis):\n", arr_2d_from_1d_row)
print("Shape of row vector:", arr_2d_from_1d_row.shape)
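Whether ravel() returns a view or a copy depends on contiguity, and one way to check which you got is np.shares_memory. The following is a small sketch under that assumption; grid, flat_copy, raveled, and raveled_t are illustrative names.

```python
import numpy as np

arr = np.arange(12)
grid = arr.reshape(3, 4)

# Reshaping a contiguous array gives a view onto the same memory.
print("reshape shares memory:", np.shares_memory(arr, grid))

# flatten() always copies.
flat_copy = grid.flatten()
print("flatten shares memory:", np.shares_memory(grid, flat_copy))

# ravel() of a contiguous array is a view...
raveled = grid.ravel()
print("ravel shares memory:", np.shares_memory(grid, raveled))

# ...but the transpose is not C-contiguous, so ravel() has to copy.
raveled_t = grid.T.ravel()
print("ravel of transpose shares memory:", np.shares_memory(grid, raveled_t))

# Writing through a view changes the original array too.
grid[0, 0] = 99
print("arr[0] after writing to grid:", arr[0])
```

When independence from the source array matters, calling .copy() (or flatten()) makes the intent explicit.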
4. Concatenation and Splitting Arrays
Concatenation
- np.concatenate((arr1, arr2), axis): Joins a sequence of arrays along an existing axis.
- np.vstack((arr1, arr2)): Stacks arrays in sequence vertically (row-wise).
- np.hstack((arr1, arr2)): Stacks arrays in sequence horizontally (column-wise).
import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
print("Arr1:\n", arr1)
print("\nArr2:\n", arr2)
# Concatenate along rows (axis=0)
concatenated_rows = np.concatenate((arr1, arr2), axis=0)
print("\nConcatenated along rows:\n", concatenated_rows)
# Same as vstack
vstacked = np.vstack((arr1, arr2))
print("\nVstacked:\n", vstacked)
# Concatenate along columns (axis=1)
concatenated_cols = np.concatenate((arr1, arr2), axis=1)
print("\nConcatenated along columns:\n", concatenated_cols)
# Same as hstack
hstacked = np.hstack((arr1, arr2))
print("\nHstacked:\n", hstacked)
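The examples above join 2D arrays. With 1D inputs the same functions behave a little differently, which the sketch below illustrates. The arrays a and b are made up for this example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# A 1D array has only axis 0, so concatenate and hstack both
# produce one longer 1D array.
joined = np.concatenate((a, b))
h = np.hstack((a, b))
print("concatenate:", joined, joined.shape)
print("hstack:     ", h, h.shape)

# vstack treats each 1D array as a row and returns a 2D array.
v = np.vstack((a, b))
print("vstack:\n", v, v.shape)

# To place 1D arrays side by side as columns, add an axis first.
cols = np.hstack((a[:, np.newaxis], b[:, np.newaxis]))
print("as columns:\n", cols, cols.shape)
```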
Splitting
- np.split(array, indices_or_sections, axis): Split an array into multiple sub-arrays.
- np.vsplit(array, indices_or_sections): Split an array into multiple sub-arrays vertically (row-wise).
- np.hsplit(array, indices_or_sections): Split an array into multiple sub-arrays horizontally (column-wise).
import numpy as np
arr = np.arange(16).reshape(4, 4)
print("Original Array:\n", arr)
# Split horizontally into 2 equal parts
hsplit_arr = np.hsplit(arr, 2)
print("\nHorizontal Split into 2 parts:\n", hsplit_arr[0], "\n---\n", hsplit_arr[1])
# Split vertically into 2 equal parts
vsplit_arr = np.vsplit(arr, 2)
print("\nVertical Split into 2 parts:\n", vsplit_arr[0], "\n---\n", vsplit_arr[1])
# Split into unequal parts at specific indices
arr_1d = np.arange(10)
split_at_indices = np.split(arr_1d, [3, 7]) # Split before index 3, and before index 7
print("\n1D array split at [3, 7]:", split_at_indices)
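When np.split is given a number of sections, the array must divide evenly. As a hedged sketch of what happens otherwise, the block below shows the error and one possible alternative, np.array_split, which is not covered in the examples above.

```python
import numpy as np

arr_1d = np.arange(10)

# 10 elements cannot be divided into 3 equal sections, so np.split raises.
try:
    np.split(arr_1d, 3)
except ValueError as exc:
    print("np.split failed:", exc)

# np.array_split accepts uneven divisions; earlier parts get the extra elements.
parts = np.array_split(arr_1d, 3)
for i, part in enumerate(parts):
    print(f"Part {i}:", part)
```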
Further Topics:
- resize() vs reshape()
- insert(), delete(), append() for modifying arrays
- Views vs. Copies (important for avoiding unintended side effects)
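As a brief starting point for these topics, here is a minimal sketch. It assumes the function forms np.resize, np.insert, np.delete, and np.append; the ndarray.resize method behaves differently (it works in place and pads with zeros). All variable names are illustrative.

```python
import numpy as np

# Views vs. copies: a basic slice is a view onto the same data.
base = np.arange(6)
view = base[1:4]
view[0] = 100                      # also changes base[1]
independent = base[1:4].copy()
independent[0] = -1                # base is unaffected
print("base after edits:", base)

# reshape needs the same number of elements; np.resize repeats
# the data to fill a larger shape.
resized = np.resize(np.arange(6), (2, 4))
print("np.resize to (2, 4):\n", resized)

# insert, delete, and append return new arrays; the input is unchanged.
nums = np.array([1, 2, 3])
appended = np.append(nums, 4)
inserted = np.insert(nums, 1, 99)  # put 99 before index 1
deleted = np.delete(nums, 0)       # drop the element at index 0
print("append:", appended, "insert:", inserted, "delete:", deleted)
print("nums is unchanged:", nums)
```

Because each of these calls allocates a new array, growing an array inside a loop with np.append tends to be slow; collecting values in a list and converting once is a common alternative.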
Mastering indexing, slicing, and reshaping in NumPy provides fine-grained control over your data, enabling complex manipulations and efficient preparation for scientific computing tasks.