How to extract filename and extension in Bash?

To extract the filename and its extension in Bash, use parameter expansion for efficiency and simplicity. Here’s a detailed guide covering edge cases like hidden files, filenames with no extensions, and multiple dots:

Method 1: Parameter Expansion (Recommended)

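Bash’s built-in parameter expansion runs inside the shell itself, with no external process to start, which makes it the fastest approach and the one with the fewest dependencies.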

1. Extract Filename Without Extension

filename="document.tar.gz"
name="${filename%.*}"       # Removes the shortest suffix matching '.*' (the last '.' and what follows)
echo "$name"                # Output: document.tar

2. Extract Extension

extension="${filename##*.}" # Removes the longest prefix up to the last '.'
echo "$extension"           # Output: gz

3. Handle Full Paths

If working with a full path (e.g., /home/user/docs/file.txt):

fullpath="/home/user/docs/file.txt"
filename="${fullpath##*/}"  # Extract filename: file.txt
name="${filename%.*}"       # Remove extension: file
extension="${filename##*.}" # Extract extension: txt
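
The directory part can be pulled out with the same kind of expansion. This is a minimal sketch, assuming the path contains at least one slash and the file is not directly under the root: a bare name such as file.txt comes back unchanged, and /file.txt yields an empty string. The variable name dir is illustrative.

```shell
fullpath="/home/user/docs/file.txt"
dir="${fullpath%/*}"        # Remove the shortest suffix starting at the last '/'
echo "$dir"                 # /home/user/docs
```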

Edge Cases

Case 1: Filename with No Extension

filename="README"
name="${filename%.*}"       # Output: README
extension="${filename##*.}" # Output: README (incorrect)

Fix: Check if the filename contains a .:

if [[ "$filename" == *.* ]]; then
  name="${filename%.*}"
  extension="${filename##*.}"
else
  name="$filename"
  extension=""
fi

Case 2: Hidden Files (e.g., .bashrc)

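Hidden files (names starting with .) typically have no extension. The leading dot belongs to the name, so only a later dot should be treated as an extension separator:

filename=".bashrc"
if [[ "$filename" == .* && "${filename#.}" != *.* ]]; then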
  name="$filename"
  extension=""
else
  name="${filename%.*}"
  extension="${filename##*.}"
fi
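
To see how a leading dot interacts with later dots, the sketch below runs a few sample names through a small helper. The helper name ext_of and the sample names are hypothetical, chosen only for illustration; the idea is to drop one leading dot before looking for an extension separator.

```shell
# ext_of is a hypothetical helper: prints the extension, ignoring a leading dot.
ext_of() {
  local rest="${1#.}"             # Drop one leading dot, if present
  if [[ "$rest" == *.* ]]; then
    printf '%s\n' "${rest##*.}"   # Text after the last remaining dot
  else
    printf '\n'                   # No extension
  fi
}

for f in .bashrc .config.json notes.txt README; do
  printf '%-12s -> "%s"\n' "$f" "$(ext_of "$f")"
done
```

With this approach .bashrc and README report an empty extension, while .config.json and notes.txt report json and txt.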

Case 3: Multiple Dots (e.g., image.version1.2.jpg)

The last dot determines the extension:

filename="image.version1.2.jpg"
name="${filename%.*}"       # Output: image.version1.2
extension="${filename##*.}" # Output: jpg
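
If you would rather treat a compound suffix such as tar.gz as a single extension, one option is to split at the first dot instead of the last. This is only a sketch: it assumes the base name itself contains no dots, so it gives the wrong answer for names like image.version1.2.jpg. The variable names base and full_ext are illustrative.

```shell
filename="archive.tar.gz"
base="${filename%%.*}"        # Remove the longest suffix starting at the first '.'
full_ext="${filename#*.}"     # Remove the shortest prefix ending at the first '.'
echo "$base"                  # archive
echo "$full_ext"              # tar.gz
```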

Method 2: Using basename and dirname

For paths, combine basename and parameter expansion:

fullpath="/var/log/app.log"
filename=$(basename "$fullpath")   # Output: app.log
name="${filename%.*}"              # Output: app
extension="${filename##*.}"        # Output: log
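
For completeness, dirname returns the directory part of a path, and basename accepts an optional suffix to strip. The sketch below assumes the extension is known in advance; if the suffix does not match, basename leaves the name as it is. Both commands start an external process, unlike parameter expansion.

```shell
fullpath="/var/log/app.log"
dir=$(dirname "$fullpath")            # Directory part of the path
name=$(basename "$fullpath" .log)     # File name with the given suffix removed
echo "$dir"                           # /var/log
echo "$name"                          # app
```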

Method 3: awk for Advanced Splitting

awk can split a name on dots as field separators, at the cost of starting an external process:

filename="data.2023.backup.tar.gz"
extension=$(awk -F. '{print $NF}' <<< "$filename")  # Output: gz
name=$(awk '{ sub(/\.[^.]*$/, ""); print }' <<< "$filename")   # Output: data.2023.backup.tar (prints the name even when it has no dot)

Method 4: sed for Regex Parsing

Use regular expressions with sed:

filename="archive.tar.gz"
name=$(sed 's/\.[^.]*$//' <<< "$filename")     # Output: archive.tar
extension=$(sed 's/.*\.//' <<< "$filename")    # Output: gz

Complete Script Example

A reusable function to handle all cases:

split_filename() {
  local fullpath="$1"
  local filename="${fullpath##*/}"  # Extract filename from path
  local name extension

  if [[ "$filename" == .* && "${filename#.}" != *.* ]]; then
    # Hidden file with no extension (e.g., .bashrc)
    name="$filename"
    extension=""
  elif [[ "$filename" == *.* ]]; then
    # Split into name and extension
    name="${filename%.*}"
    extension="${filename##*.}"
  else
    # No extension
    name="$filename"
    extension=""
  fi

  echo "Filename: $name"
  echo "Extension: ${extension:-none}"
}

# Test Cases
split_filename "/home/user/docs/file.txt"      # Filename: file, Extension: txt
split_filename "image.png"                    # Filename: image, Extension: png
split_filename ".bashrc"                      # Filename: .bashrc, Extension: none
split_filename "archive.tar.gz"               # Filename: archive.tar, Extension: gz
split_filename "README"                       # Filename: README, Extension: none
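
Printing the parts is convenient for a demo, but scripts usually need them in variables. The sketch below is a hypothetical variant, split_into, that stores the results in variables named by the caller using printf -v (a Bash feature, not POSIX sh). The function name, the underscore-prefixed locals, and the sample path are assumptions made for illustration.

```shell
# split_into PATH NAME_VAR EXT_VAR
# Hypothetical variant: stores the results in the caller's variables.
split_into() {
  local _file="${1##*/}"                   # Strip any leading directories
  local _stem="$_file" _ext=""
  local _rest="${_file#.}"                 # Ignore one leading dot (hidden files)
  if [[ "$_rest" == *.* ]]; then
    _stem="${_file%.*}"
    _ext="${_file##*.}"
  fi
  printf -v "$2" '%s' "$_stem"             # Assign to the variable named by $2
  printf -v "$3" '%s' "$_ext"              # Assign to the variable named by $3
}

split_into "/tmp/report.final.pdf" stem ext
echo "$stem"   # report.final
echo "$ext"    # pdf
```

The caller's variable names must not collide with the function's locals, which is why the locals carry an underscore prefix here.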

Key Notes

  • Parameter Expansion is the most efficient method (no external commands).
  • Hidden Files: Treat a leading . as part of the name, so a file like .bashrc has no extension.
  • No Extension: Return an empty string if no . exists.
  • Multiple Dots: The last dot defines the extension (e.g., file.tar.gz → gz).

Use these methods to reliably parse filenames in your Bash scripts!
