How can I get a value from a cell of a dataframe?

To get a value from a specific cell in a pandas DataFrame, you have several methods at your disposal. Here’s a detailed explanation with examples:

Core Methods for Cell Access

  1. loc[] – Label-based indexing
  2. iloc[] – Integer position-based indexing
  3. at[] – Fast label-based scalar access
  4. iat[] – Fast integer-based scalar access

Example DataFrame Setup

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'London', 'Paris']
}

df = pd.DataFrame(data, index=['a', 'b', 'c'])
print(df)

Output:

      Name  Age      City
a    Alice   25  New York
b      Bob   30    London
c  Charlie   35     Paris

1. Using loc[] (Label-Based)

Access by row label and column name.

# Get Alice's Age (row 'a', column 'Age')
value = df.loc['a', 'Age']
print(value)  # Output: 25

# Get Bob's City (row 'b', column 'City')
value = df.loc['b', 'City']
print(value)  # Output: London

2. Using iloc[] (Integer-Based)

Access by row position (0-indexed) and column position.

# Get Charlie's Name (row 2, column 0)
value = df.iloc[2, 0]
print(value)  # Output: Charlie

# Get Bob's Age (row 1, column 1)
value = df.iloc[1, 1]
print(value)  # Output: 30

3. Using at[] (Fast Label Access)

Optimized for single scalar values (label-based).

# Get Alice's City
value = df.at['a', 'City']
print(value)  # Output: New York

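# at[] skips loc[]'s general-purpose indexing logic, so it is typically
# faster for a single value; the exact speedup depends on pandas version and data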

4. Using iat[] (Fast Integer Access)

Optimized for single scalar values (position-based).

# Get Charlie's Age (row 2, column 1)
value = df.iat[2, 1]
print(value)  # Output: 35

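# iat[] skips iloc[]'s general-purpose indexing logic, so it is typically
# faster for a single value; the exact speedup depends on pandas version and data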
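The four accessors are either fully label-based or fully position-based. When you know the row by position but the column by name (or the reverse), one option is to translate one side first. A minimal sketch, assuming the example DataFrame from the setup and unique column names:

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'],
     'Age': [25, 30, 35],
     'City': ['New York', 'London', 'Paris']},
    index=['a', 'b', 'c'],
)

# Row by position, column by label: turn the column label into a position.
age_col = df.columns.get_loc('Age')   # integer position of 'Age'
charlie_age = df.iat[2, age_col]

# Alternatively, turn the row position into its label and stay label-based.
same_age = df.at[df.index[2], 'Age']

print(charlie_age, same_age)
```

Both lookups read the same cell; which direction to translate is mostly a matter of which form the rest of your code already uses.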

Handling Edge Cases

Case 1: Duplicate Index Labels

df_dup = df.copy()
df_dup.index = ['a', 'a', 'b']  # Duplicate index 'a'

# loc[] returns a Series when duplicates exist
print(df_dup.loc['a', 'Age'])

Output:

a    25
a    30
Name: Age, dtype: int64
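If you need a single value even when labels may repeat, you can check the index for uniqueness up front or pick one match explicitly. The following is a sketch under the assumption that taking the first match is acceptable for your data:

```python
import pandas as pd

df_dup = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=['a', 'a', 'b'],
)

# Detect the situation before relying on a scalar result.
print(df_dup.index.is_unique)

result = df_dup.loc['a', 'Age']
if isinstance(result, pd.Series):
    # Several rows share the label: choose the first match explicitly.
    first_age = result.iloc[0]
else:
    first_age = result

print(first_age)
```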

Case 2: Non-Existent Labels

# Use try-except to handle KeyErrors
try:
    value = df.loc['x', 'Salary']  # Invalid labels
except KeyError:
    print("Label not found!")
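An alternative to try-except is to test for the labels first and fall back to a default. The helper name `get_cell` below is hypothetical (it is not part of pandas), and the sketch assumes a unique index:

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'],
     'Age': [25, 30, 35],
     'City': ['New York', 'London', 'Paris']},
    index=['a', 'b', 'c'],
)

def get_cell(frame, row, col, default=None):
    """Return frame.at[row, col], or default if either label is missing.

    Hypothetical helper for illustration; assumes unique row labels.
    """
    if row in frame.index and col in frame.columns:
        return frame.at[row, col]
    return default

print(get_cell(df, 'a', 'Age'))
print(get_cell(df, 'x', 'Salary'))
print(get_cell(df, 'a', 'Salary', default=0))
```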

Case 3: Conditional Selection

# Get Age where Name is 'Bob'
value = df.loc[df['Name'] == 'Bob', 'Age'].iloc[0]  # single loc[] call, no chained indexing
print(value)  # Output: 30
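The one-liner above assumes at least one row matches; taking the first element of an empty result raises an IndexError. A small sketch of a guarded version, assuming `None` is an acceptable stand-in for "no match":

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'],
     'Age': [25, 30, 35],
     'City': ['New York', 'London', 'Paris']},
    index=['a', 'b', 'c'],
)

# Boolean mask and column label in one loc[] call.
matches = df.loc[df['Name'] == 'Bob', 'Age']
bob_age = matches.iloc[0] if not matches.empty else None

# A name that does not occur yields an empty Series, not an error.
missing = df.loc[df['Name'] == 'Zed', 'Age']
zed_age = missing.iloc[0] if not missing.empty else None

print(bob_age, zed_age)
```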

Performance Comparison

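Method   Use Case                          Speed (single cell)
at[]     Single cell by label              Fast (scalar-only access)
iat[]    Single cell by integer position   Fast (scalar-only access)
loc[]    Label-based selection             Slower (general-purpose indexer)
iloc[]   Position-based selection          Slower (general-purpose indexer)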
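Relative speeds vary with the pandas version, the dtypes involved, and the machine, so it is worth measuring on your own setup. A rough sketch using the standard-library `timeit` module (the repetition count is an arbitrary choice):

```python
import timeit
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'],
     'Age': [25, 30, 35],
     'City': ['New York', 'London', 'Paris']},
    index=['a', 'b', 'c'],
)

n = 2000  # number of repetitions per accessor

# Each lambda reads the same cell (Bob's Age) through a different accessor.
t_loc = timeit.timeit(lambda: df.loc['b', 'Age'], number=n)
t_at = timeit.timeit(lambda: df.at['b', 'Age'], number=n)
t_iloc = timeit.timeit(lambda: df.iloc[1, 1], number=n)
t_iat = timeit.timeit(lambda: df.iat[1, 1], number=n)

print(f"loc:  {t_loc:.4f}s   at:  {t_at:.4f}s")
print(f"iloc: {t_iloc:.4f}s   iat: {t_iat:.4f}s")
```

For a handful of lookups the difference is unlikely to matter; it becomes relevant mainly when a single-cell read sits inside a tight loop.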

Best Practices

  1. Prefer at[]/iat[] for single values – they are typically faster than loc[]/iloc[] for scalar access
  2. Use loc[]/iloc[] for ranges/slices:
   # First two rows, 'Name' and 'City' columns
   df.loc[['a','b'], ['Name','City']]
  3. Convert to a dictionary for repeated lookups:
   # Uses the DataFrame from the setup; to_dict() returns {column: {Name: value}}
   data_dict = df.set_index('Name').to_dict()
   print(data_dict['Age']['Bob'])  # Output: 30

Complete Example Workflow

# Create DataFrame
df = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Cherry'],
    'Price': [1.20, 0.50, 2.75],
    'Stock': [15, 42, 8]
}, index=['Store1', 'Store2', 'Store3'])

# Access Banana's price using different methods
print(df.loc['Store2', 'Price'])   # 0.5 (label)
print(df.iloc[1, 1])               # 0.5 (position)
print(df.at['Store2', 'Price'])    # 0.5 (fast label)
print(df.iat[1, 1])                # 0.5 (fast position)

# Get Cherry's stock (conditional)
cherry_stock = df.loc[df['Product'] == 'Cherry', 'Stock'].iloc[0]
print(cherry_stock)  # 8

Key Notes

  • Indexing starts at 0 for iloc[] and iat[]
  • Labels matter for loc[] and at[] (check df.index and df.columns)
  • Use to_numpy() (or the older values attribute) for NumPy array conversion:
  # Get an entire row of the first example DataFrame as an array
  df.loc['a'].to_numpy()  # array(['Alice', 25, 'New York'], dtype=object)
  • Chained indexing (e.g., df['Age']['a']) performs two separate lookups and is unreliable for assignment – prefer a single loc[]/at[] call
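To illustrate the last note: for reading, chained and direct access return the same value, but the direct form does it in one lookup and is the form to use if you later assign to the cell. A minimal sketch, assuming the first example DataFrame:

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'],
     'Age': [25, 30, 35],
     'City': ['New York', 'London', 'Paris']},
    index=['a', 'b', 'c'],
)

chained = df['Age']['a']      # two steps: select the column, then the row
direct = df.at['a', 'Age']    # one step: row and column together
print(chained, direct)

# When writing, go through a single indexer so the original
# DataFrame is the object being modified.
df.at['a', 'Age'] = 26
print(df.at['a', 'Age'])
```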

By mastering these techniques, you’ll efficiently extract values from DataFrames for data analysis and manipulation tasks.
