To access the i-th column of a 2D NumPy array, use slicing syntax with a colon (:) for rows and the column index i for columns. Here’s a detailed explanation with examples:
Basic Syntax
import numpy as np
# Create a 2D array
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
# Access the i-th column (0-based index)
i = 1  # example value; any valid column index works
column_i = arr[:, i]  # : selects all rows, i selects the column
Key Points
- Indexing starts at 0: The first column is i=0, the second is i=1, etc.
- Result is a 1D array: The returned column is a flattened view (not a 2D sub-array).
- Modifications affect the original array: Since the result is a view (not a copy), changes propagate to the original array.
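The view behaviour described in the key points can be checked directly. The following is a minimal sketch; the variable names `col` and `safe` are illustrative and not part of the original answer.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

col = arr[:, 1]                    # a 1D view of column 1
print(col.shape)                   # (3,)
print(np.shares_memory(col, arr))  # True

col[0] = 99                        # writing through the view...
print(arr[0, 1])                   # 99  (...changes the original array)

safe = arr[:, 1].copy()            # an explicit, independent copy
safe[1] = -1
print(arr[1, 1])                   # 5   (the original is untouched)
```

If the original array must stay unchanged, calling `.copy()` on the column before editing it is the usual safeguard.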
Examples
1. Accessing Columns
import numpy as np
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
# First column (i=0)
col0 = arr[:, 0] # [1, 4, 7]
# Second column (i=1)
col1 = arr[:, 1] # [2, 5, 8]
# Last column (i=-1)
col_last = arr[:, -1] # [3, 6, 9]
2. Modifying a Column
# Double the values in the second column (i=1)
arr[:, 1] *= 2
# Result:
# [[1, 4, 3],
# [4, 10, 6],
# [7, 16, 9]]
3. Keeping Column as a 2D Array (Use a Slice, an Index List, or newaxis)
By default, arr[:, i] returns a 1D array. To retain a 2D shape (e.g., column vector):
col0_2d = arr[:, [0]] # [[1], [4], [7]] (shape: 3x1)
# OR
col0_2d = arr[:, 0:1] # [[1], [4], [7]]
# Using np.newaxis
col0_2d = arr[:, 0][:, np.newaxis] # [[1], [4], [7]]
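These forms all give a (3, 1) result, but they do not all share memory with the original array. The sketch below compares them and also shows `reshape(-1, 1)` as one more option; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

by_slice = arr[:, 0:1]                 # basic slicing
by_newaxis = arr[:, 0][:, np.newaxis]  # 1D view plus a new axis
by_reshape = arr[:, 0].reshape(-1, 1)  # -1 lets NumPy infer the row count
by_list = arr[:, [0]]                  # integer-list (advanced) indexing

for name, col in [("slice", by_slice), ("newaxis", by_newaxis),
                  ("reshape", by_reshape), ("list", by_list)]:
    print(name, col.shape)             # each prints (3, 1)

print(np.shares_memory(by_slice, arr))    # True  (view)
print(np.shares_memory(by_newaxis, arr))  # True  (view)
print(np.shares_memory(by_list, arr))     # False (copy)
```

In practice, the slice form is a reasonable default when a writable view is wanted, and the index-list form when an independent copy is acceptable.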
4. Accessing Multiple Columns
# Select columns 0 and 2
subset = arr[:, [0, 2]]  # [[1, 3],
                         #  [4, 6],
                         #  [7, 9]]
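One caveat worth checking: selecting columns with a list uses advanced indexing, which returns a copy rather than a view. The sketch below, with a fresh array for clarity, shows that editing the subset leaves the original alone, while assigning through the index expression does modify it.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

subset = arr[:, [0, 2]]   # advanced indexing: an independent copy
subset[0, 0] = 100
print(arr[0, 0])          # 1  (original unchanged)

arr[:, [0, 2]] = 0        # assignment through the index writes into arr
print(arr)
# [[0 2 0]
#  [0 5 0]
#  [0 8 0]]
```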
Edge Cases & Pitfalls
- Invalid Column Index: Using i >= number of columns causes IndexError.
# Example with 3 columns (indices: 0, 1, 2)
arr[:, 3] # ❌ IndexError: index 3 is out of bounds
- Empty Arrays: If the array has 0 columns, accessing any column fails.
- 1D Arrays: “Columns” don’t exist in 1D arrays. Use arr[i] to access elements.
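The pitfalls above can be guarded against with a small wrapper. `get_column` below is a hypothetical helper written for illustration, not a NumPy function; it checks the dimensionality and index range before slicing.

```python
import numpy as np

def get_column(arr, i):
    """Hypothetical helper: return column i of a 2D array with explicit checks."""
    arr = np.asarray(arr)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2D array, got {arr.ndim}D")
    n_cols = arr.shape[1]
    # Valid indices run from -n_cols to n_cols - 1; nothing is valid when n_cols == 0
    if not -n_cols <= i < n_cols:
        raise IndexError(f"column {i} is out of range for {n_cols} columns")
    return arr[:, i]

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(get_column(grid, -1))  # [3 6]

# Out-of-range index, zero-column array, and 1D array, in that order
for bad_input, i in [(grid, 3), (np.empty((2, 0)), 0), (np.array([1, 2, 3]), 0)]:
    try:
        get_column(bad_input, i)
    except (IndexError, ValueError) as exc:
        print(type(exc).__name__, "-", exc)
```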
Why This Works
- NumPy uses row-major (C-style) order by default, so the elements of a row are stored adjacently in memory. When accessing a column, NumPy steps from one row to the next using strides.
- The syntax [:, i] leverages basic slicing, which returns a view (not a copy) for memory efficiency.
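The stride mechanics can be inspected through the `strides` and `flags` attributes. This is a small sketch that fixes the dtype to `int64` (an assumption made so the byte counts are the same on every platform).

```python
import numpy as np

# int64 is chosen explicitly so each element is 8 bytes
arr = np.arange(9, dtype=np.int64).reshape(3, 3)
print(arr.strides)                # (24, 8): 24 bytes to the next row, 8 to the next column

col = arr[:, 1]
print(col.strides)                # (24,): consecutive column elements are one row apart
print(col.flags["C_CONTIGUOUS"])  # False

# If a packed buffer is needed, np.ascontiguousarray makes a compact copy
packed = np.ascontiguousarray(col)
print(packed.strides)                 # (8,)
print(np.shares_memory(packed, arr))  # False
```

Because a column view is not contiguous, code that needs a packed buffer may want the copy produced by `np.ascontiguousarray`.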
Practical Use Case
# Extract a column from a small table; mixed-type rows are stored as strings
data = np.array([["Alice", 30, "Engineer"],
                 ["Bob", 25, "Designer"]])
# Get the "Age" column (i=1)
ages = data[:, 1].astype(int) # [30, 25]
Once you are comfortable with this slicing technique, you can efficiently manipulate columns in NumPy arrays for data analysis, machine learning, and scientific computing.