How to select multiple columns in a Pandas DataFrame?

To select multiple columns in a Pandas DataFrame, use one of the following methods:

1. Using Double Square Brackets `[[ ]]`

Pass a list of column names to select specific columns in a desired order:

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
    'Department': ['HR', 'Tech', 'Finance']
})

# Select 'Name' and 'Age' columns
selected_columns = df[['Name', 'Age']]

Output:

	Name	Age
0	Alice	25
1	Bob	30
2	Charlie	35
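
The double brackets matter: the outer pair is the indexing operator and the inner pair is an ordinary Python list. A minimal sketch of the difference, using a shortened version of the sample data (the variable names `as_series`, `as_frame`, and `reordered` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

# Single brackets with one label return a Series
as_series = df['Name']

# Double brackets (a list of labels) return a DataFrame, even for one column
as_frame = df[['Name']]

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame

# The list controls the column order of the result
reordered = df[['Age', 'Name']]
print(list(reordered.columns))   # ['Age', 'Name']
```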

2. Using `loc` (Label-Based Selection)

Select columns by their labels (names):

# Select 'Name' and 'Department' columns for all rows
selected_columns = df.loc[:, ['Name', 'Department']]
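
Because `loc` takes a row selector and a column selector together, it can filter rows and pick columns in one step. The sketch below assumes the sample DataFrame from the first section; the condition `Age > 25` and the name `older_staff` are chosen purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
    'Department': ['HR', 'Tech', 'Finance']
})

# Keep rows where Age is greater than 25, and only two of the columns
older_staff = df.loc[df['Age'] > 25, ['Name', 'Department']]

print(list(older_staff['Name']))   # ['Bob', 'Charlie']
print(list(older_staff.columns))   # ['Name', 'Department']
```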

3. Using `iloc` (Position-Based Selection)

Select columns by their zero-based integer positions:

# Select the first and third columns (positions 0 and 2)
selected_columns = df.iloc[:, [0, 2]]

Output:

	Name	Salary
0	Alice	50000
1	Bob	60000
2	Charlie	70000

4. Using `filter()`

Select columns by name patterns or exact matches:

# Select columns by exact name; names missing from the DataFrame are skipped
selected_columns = df.filter(items=['Name', 'Age'])

# Use regex to match patterns (e.g., columns ending with 't')
selected_columns = df.filter(regex='t$')  # Selects 'Department'
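
`filter()` also accepts a `like` argument for plain substring matching. The sketch below, again assuming the sample DataFrame, shows `like` alongside the way `items` handles a name that does not exist (`'Bonus'` is a hypothetical column used only to illustrate this):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
    'Department': ['HR', 'Tech', 'Finance']
})

# like= keeps columns whose name contains the substring (case-sensitive),
# so 'Age' is excluded because it has no lowercase 'a'
with_a = df.filter(like='a')
print(list(with_a.columns))   # ['Name', 'Salary', 'Department']

# items= ignores names that are not in the DataFrame instead of raising
partial = df.filter(items=['Name', 'Bonus'])
print(list(partial.columns))  # ['Name']
```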

5. Using Column Ranges

Select a sequence of columns by their positions:

# Select positions 0 up to (but not including) 2
selected_columns = df.iloc[:, 0:2]  # Columns 0 (Name) and 1 (Age)
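
Ranges behave differently depending on the indexer: `iloc` slices exclude the end position, while `loc` label slices include the end label. A short sketch of the contrast, assuming the sample DataFrame (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
    'Department': ['HR', 'Tech', 'Finance']
})

# Position slice: the end position (2) is excluded
by_position = df.iloc[:, 0:2]
print(list(by_position.columns))  # ['Name', 'Age']

# Label slice: the end label ('Salary') is included
by_label = df.loc[:, 'Name':'Salary']
print(list(by_label.columns))     # ['Name', 'Age', 'Salary']

# Negative positions count from the end: the last two columns
last_two = df.iloc[:, -2:]
print(list(last_two.columns))     # ['Salary', 'Department']
```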

Key Notes

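Order Preservation: List-based selection with `[[ ]]`, `loc`, or `iloc` returns columns in the order you list them; slices and pattern-based selection keep the DataFrame's original column order.
Return Type: Each method above returns a DataFrame, even when only one column matches. Whether the result is a view or a copy depends on the operation and the pandas version, so call `.copy()` before modifying the result if the original must stay unchanged.
Common Errors:
KeyError: Raised by `[[ ]]` and `loc` if a listed column name doesn’t exist; `filter(items=...)` skips missing names instead.
IndexError: Raised by `iloc` if a listed integer position is out of bounds; out-of-range slice bounds are truncated rather than raising an error.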
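
Both errors can be reproduced with a small sketch. It assumes a shortened sample DataFrame; `'Bonus'` and position `10` are deliberately invalid values chosen for illustration, and the `caught` list exists only to record what happened:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

caught = []

# A label that is not a column raises KeyError
try:
    df[['Name', 'Bonus']]
except KeyError as exc:
    caught.append('KeyError')
    print('KeyError:', exc)

# A list position beyond the last column raises IndexError
try:
    df.iloc[:, [0, 10]]
except IndexError as exc:
    caught.append('IndexError')
    print('IndexError:', exc)

# An out-of-range slice does not raise; it is truncated to the valid range
truncated = df.iloc[:, 0:10]
print(list(truncated.columns))  # ['Name', 'Age']
```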

Advanced Selection

Combine with Conditions

# Select columns where the mean of numeric values > 30
numeric_cols = df.select_dtypes(include='number')
selected_columns = numeric_cols.loc[:, numeric_cols.mean() > 30]
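
It can help to look at the intermediate boolean mask. In the sketch below, which assumes the sample DataFrame, the mean of `Age` is exactly 30, so the strict comparison excludes it and only `Salary` remains (the name `mask` is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
    'Department': ['HR', 'Tech', 'Finance']
})

# Keep only the numeric columns: Age and Salary
numeric_cols = df.select_dtypes(include='number')

# One boolean per column, indexed by column name
mask = numeric_cols.mean() > 30
print(mask.to_dict())  # {'Age': False, 'Salary': True}

# The mask selects the columns marked True
selected_columns = numeric_cols.loc[:, mask]
print(list(selected_columns.columns))  # ['Salary']
```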

Dynamic Column Selection

# Select columns containing 'Sal' in their names
selected_columns = df.loc[:, df.columns.str.contains('Sal')]
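
The same idea extends to other name tests. As a rough sketch on the sample DataFrame, `str.contains` accepts `case=False` for case-insensitive matching, and a plain list comprehension works when the condition is easier to write in ordinary Python (the names `salary_like` and `ends_with_e` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
    'Department': ['HR', 'Tech', 'Finance']
})

# Case-insensitive substring match on the column names
salary_like = df.loc[:, df.columns.str.contains('sal', case=False)]
print(list(salary_like.columns))  # ['Salary']

# Build the list of names with a comprehension, then select with [[ ]]
ends_with_e = df[[col for col in df.columns if col.endswith('e')]]
print(list(ends_with_e.columns))  # ['Name', 'Age']
```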

Summary Table

Method	Use Case	Example
`df[['col1', 'col2']]`	Simple column selection by name	`df[['Name', 'Age']]`
`loc`	Label-based selection with flexibility	`df.loc[:, ['Name', 'Salary']]`
`iloc`	Position-based selection	`df.iloc[:, [0, 2]]`
`filter()`	Select columns by name patterns/regex	`df.filter(items=['Name', 'Age'])`

Use these methods to efficiently extract subsets of columns from your DataFrame!
