How to get a list from a pandas DataFrame column or row?

To extract a list from a pandas DataFrame column or row, use one of the methods below. Each is shown with a short example:

1. Extracting a Column as a List

Method: df['column_name'].tolist()

  • Steps:
  1. Select the column using bracket notation df['col'] (returns a Series)
  2. Convert the Series to a list with .tolist()
  • Example:
  import pandas as pd

  # Sample DataFrame
  data = {'Name': ['Alice', 'Bob', 'Charlie'],
          'Age': [25, 30, 35],
          'City': ['New York', 'London', 'Paris']}
  df = pd.DataFrame(data)

  # Extract 'Name' column as a list
  name_list = df['Name'].tolist()
  print(name_list)  # Output: ['Alice', 'Bob', 'Charlie']

Alternative: list(df['column_name'])

  • Works but less efficient for large datasets:
  age_list = list(df['Age'])
  print(age_list)  # Output: [25, 30, 35]
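As a quick check of the comparison above, here is a minimal sketch that confirms both approaches yield equal lists. The `ages` Series is a hypothetical stand-in for a DataFrame column, and the speed difference (which only matters on large columns) is not measured here.

```python
import pandas as pd

# Hypothetical sample column, for illustration only
ages = pd.Series([25, 30, 35], name='Age')

via_tolist = ages.tolist()  # converts the underlying array in one call
via_list = list(ages)       # iterates the Series element by element

# Both approaches produce lists with equal contents
print(via_tolist == via_list)
print(type(via_tolist[0]))
```

Because the results are equal, switching from list(...) to .tolist() should not change downstream code.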

2. Extracting a Row as a List

Method 1: df.iloc[row_index].tolist()

  • Steps:
  1. Select the row by integer position with .iloc[]
  2. Convert the row (a Series) to a list
  • Example:
  # Extract the first row (index 0) as a list
  row_0_list = df.iloc[0].tolist()
  print(row_0_list)  # Output: ['Alice', 25, 'New York']

Method 2: df.loc[row_label].tolist()

  • Use row labels (if using custom indices):
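  # Set a custom index; store the result in a new variable so that df
  # keeps its 'Name' column for the later examples
  df_indexed = df.set_index('Name')
  # Extract row for 'Bob' ('Name' is now the index, so it is not in the list)
  bob_list = df_indexed.loc['Bob'].tolist()
  print(bob_list)  # Output: [30, 'London']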
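Once a column becomes the index, its value no longer appears in the row's list. If the label is needed as well, one option is to prepend it by hand. This is a sketch that rebuilds the sample data under the assumed name `df_indexed`:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['New York', 'London', 'Paris']})
df_indexed = df.set_index('Name')

row = df_indexed.loc['Bob']             # Series whose .name is the row label
values_only = row.tolist()              # the index label is not included
with_label = [row.name, *row.tolist()]  # prepend the label manually

print(with_label)
```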

3. Extracting All Rows as a List of Lists

Method: df.values.tolist()

  • Converts the entire DataFrame into a list of lists (each inner list is a row)
  • Example:
  all_rows = df.values.tolist()
  print(all_rows)
  # Output: [['Alice', 25, 'New York'],
  #          ['Bob', 30, 'London'],
  #          ['Charlie', 35, 'Paris']]
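The pandas documentation recommends DataFrame.to_numpy() over the .values attribute in newer code. The sketch below, on a small hypothetical DataFrame, suggests the two give the same list of lists. It also shows that column names are not part of the result and have to be added explicitly if wanted.

```python
import pandas as pd

# Hypothetical two-column frame, for illustration only
df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

rows_values = df.values.tolist()
rows_to_numpy = df.to_numpy().tolist()
print(rows_values == rows_to_numpy)

# Column names are not included; prepend them if a header row is needed
header_and_rows = [df.columns.tolist()] + rows_to_numpy
print(header_and_rows)
```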

4. Extracting Unique Values from a Column

Method: df['col'].unique().tolist()

  • Gets distinct values:
  unique_cities = df['City'].unique().tolist()
  print(unique_cities)  # Output: ['New York', 'London', 'Paris']
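The sample data above has no repeated cities, so the effect of .unique() is easier to see on a column with duplicates. A minimal sketch with a hypothetical `cities` Series follows. Values come back in order of first appearance, not sorted.

```python
import pandas as pd

# Hypothetical column containing repeated values
cities = pd.Series(['Paris', 'London', 'Paris', 'New York', 'London'])

unique_cities = cities.unique().tolist()
print(unique_cities)          # order of first appearance

sorted_cities = sorted(unique_cities)
print(sorted_cities)          # sort explicitly if a stable order is needed
```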

5. Handling Edge Cases

A. Missing Values (NaN):

  • .tolist() preserves NaN values. Use dropna() to exclude them:
  df_with_nan = pd.DataFrame({'Scores': [90, None, 85]})
  clean_list = df_with_nan['Scores'].dropna().tolist()
  print(clean_list)  # Output: [90.0, 85.0]
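If missing entries should keep their position rather than be dropped, one possible approach is to replace NaN with None after conversion. This is a sketch under that assumption, and the variable names are illustrative:

```python
import math
import pandas as pd

scores = pd.Series([90, None, 85])  # None becomes NaN, so the dtype is float64

raw = scores.tolist()               # NaN is kept as a float nan
with_none = [None if pd.isna(x) else x for x in raw]

print(math.isnan(raw[1]))
print(with_none)
```

Keeping the placeholder preserves the alignment between list positions and the original rows, which dropna() does not.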

B. Rows with Custom Indices:

  • With a custom index, .loc selects by label and .iloc selects by integer position:
  df_xy = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['x', 'y'])
  row_y = df_xy.loc['y'].tolist()
  print(row_y)  # Output: [2, 4]
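To make the label versus position distinction concrete, the sketch below selects the same row both ways and looks up a label's position with Index.get_loc. The frame name `df_xy` is chosen for illustration.

```python
import pandas as pd

df_xy = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['x', 'y'])

by_label = df_xy.loc['y'].tolist()      # select by index label
by_position = df_xy.iloc[1].tolist()    # select by integer position
print(by_label == by_position)

# Translate a label into its integer position when needed
pos = df_xy.index.get_loc('y')
print(pos)
```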

Key Notes

Method                        | Use Case                    | Output Type
df['col'].tolist()            | Single column               | list of values
df.iloc[0].tolist()           | Row by integer index        | list of values
df.loc['label'].tolist()      | Row by label (custom index) | list of values
df.values.tolist()            | Entire DataFrame (rows)     | list of lists
df['col'].unique().tolist()   | Unique column values        | list of values

Example Workflow

import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Cherry'],
    'Price': [1.20, 0.50, 2.30],
    'In_Stock': [True, True, False]
})

# 1. Extract 'Product' column
products = df['Product'].tolist()
print("Products:", products)  # Output: ['Apple', 'Banana', 'Cherry']

# 2. Extract first row
first_row = df.iloc[0].tolist()
print("First row:", first_row)  # Output: ['Apple', 1.2, True]

# 3. Extract all rows
all_rows = df.values.tolist()
print("All rows:", all_rows)
# Output: [['Apple', 1.2, True], ['Banana', 0.5, True], ['Cherry', 2.3, False]]

# 4. Unique prices
unique_prices = df['Price'].unique().tolist()
print("Unique prices:", unique_prices)  # Output: [1.2, 0.5, 2.3]

Best Practices

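  • Use .tolist(): It converts the Series in a single call and, for numeric and string columns, returns native Python scalars.
  • Avoid list(df['col']): It gives the same values but iterates the Series element by element, which is typically slower on large columns.
  • Prefer .iloc/.loc for rows: .iloc selects by integer position and .loc by index label, so the intent is explicit.
  • Handle NaN: .tolist() keeps NaN values; call dropna() first if they should be excluded.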
