Troubleshooting Linux Disk Space Issues, File Deletion Semantics, and Handling Argument List Limits
Simulating Filesystem Constraints and Loop Device Mounts
A basic loop device can be created and mounted to build a space-restricted test environment:
# Create a 100K file and associate it with a loop device
dd if=/dev/zero of=/tmp/100k bs=1K count=100
losetup /dev/loop0 /tmp/100k
# mkfs.ext4 may warn that the image is too small for a journal; that is fine for this test
mkfs.ext4 /dev/loop0
# Mount it at /app/log (create the mount point first)
mkdir -p /app/log
mount /dev/loop0 /app/log
# Check available inodes
df -ih /app/log
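The sizing step can be sanity-checked without root before running the root-only losetup/mkfs/mount commands; the image path below is a temp file created by the sketch:

```shell
# Verify the backing file is exactly 100 KiB before handing it to losetup
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1K count=100 2>/dev/null
stat -c %s "$img"   # → 102400 (100 * 1024 bytes)
rm -f "$img"
# As root, losetup can pick a free device instead of hard-coding loop0:
#   losetup -f --show /tmp/100k   # prints the allocated /dev/loopN
```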
When inode exhaustion is deliberately triggered, file creation fails even though block capacity remains:
cd /app/log
# Attempt to create multiple files
for i in $(seq 1 10); do
touch "file_$i.txt"
done
# Later files fail with "No space left on device"
df -h /app/log    # block usage is still low
df -ih /app/log   # but IUse% shows 100%
Linux File Deletion Semantics
A file is completely removed only when both conditions are met:
- All hard links (directory entries) are removed (link count reaches zero).
- No open file descriptors reference the inode (process usage count is zero).
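Both conditions can be observed from an unprivileged shell: after unlink the directory entry is gone, but the data stays readable through a held descriptor until it is closed (the path here is a temp file created by the demo):

```shell
# Condition 1: rm removes the directory entry (link count -> 0).
# Condition 2: the inode survives while fd 3 remains open.
tmp=$(mktemp)
echo "still here" > "$tmp"
exec 3< "$tmp"          # hold an open descriptor on the inode
rm -f "$tmp"            # directory entry gone...
[ ! -e "$tmp" ] && echo "entry removed"
cat <&3                 # ...but the data is still readable → still here
exec 3<&-               # closing the fd finally frees the blocks
```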
Common log files that background daemons (e.g. rsyslog) keep open, so deleting them does not immediately free space:
- /var/log/messages
- /var/log/secure
- /var/log/cron
Use lsof to identify files marked (deleted) that still consume space:
yum install -y lsof
lsof | grep deleted
Reproduction: Space Leak from Held File Handles
Terminal 1:
tail -f sample_data.log
Terminal 2:
lsof | grep sample_data.log
# OUTPUT: tail ... /root/sample_data.log
When editing with vim, note that lsof reports vim's swap file rather than the original:
# Terminal 1
vim sample_data.log
# Terminal 2
lsof | grep sample_data.log
# OUTPUT: vim ... /root/.sample_data.log.swp
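If lsof is not installed, the same deleted-but-open state is visible under /proc/&lt;pid&gt;/fd, whose symlink targets are suffixed with "(deleted)". A self-contained sketch using tail on a temp file:

```shell
tmp=$(mktemp)
tail -f "$tmp" >/dev/null 2>&1 & pid=$!
sleep 1                        # give tail time to open the file
rm -f "$tmp"
# The fd symlink's target now ends in "(deleted)":
ls -l /proc/$pid/fd | grep deleted
kill $pid
```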
Cleaning Up Large Deleted Logs
Simulate a large log file that is quickly removed but still held open by rsyslog:
# Write a huge amount of data
seq 500000000 >> /var/log/messages
df -h /dev/sda3
# Delete the file — space does not immediately free up
rm -f /var/log/messages
df -h /dev/sda3
# Identify processes holding deleted file handles
lsof | grep messages
# Restart the services to release the handles
systemctl restart rsyslog.service
systemctl restart abrtd.service
df -h /dev/sda3
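When restarting the service is undesirable, the deleted file can instead be truncated in place through the holder's /proc fd entry, which reclaims the blocks immediately without interrupting the process. A self-contained sketch (tail stands in for rsyslog; the fd number is discovered at runtime):

```shell
tmp=$(mktemp)
seq 100000 > "$tmp"                  # ~575 KB of data
tail -f "$tmp" >/dev/null 2>&1 & pid=$!
sleep 1
rm -f "$tmp"                         # space is NOT freed yet
# Locate the fd tail holds on the deleted file, then truncate through it:
fd=$(ls -l /proc/$pid/fd | awk '/deleted/ {print $9; exit}')
: > "/proc/$pid/fd/$fd"              # open with O_TRUNC via the magic symlink
stat -c %s "/proc/$pid/fd/$fd"       # → 0
kill $pid
```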
Handling "Argument list too long" When Removing Many Small Files
When you need to create or delete hundreds of thousands of files, shell glob expansion can exceed the kernel's per-exec argument limit (ARG_MAX), producing "Argument list too long".
Bulk File Creation
mkdir bulk_dir
# Direct brace expansion fails:
touch bulk_dir/{1..400000} # Argument list too long
# echo is a shell builtin (no exec limit applies), and xargs batches the names
echo bulk_dir/{1..400000} | xargs touch
ls bulk_dir/ | wc -l
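The limit and the workaround can be checked at a small scale; getconf ARG_MAX prints the per-exec ceiling, and xargs splits stdin into batches that stay under it (a count of 1000 keeps the demo fast; demo_bulk is a throwaway directory):

```shell
getconf ARG_MAX                      # per-exec argument limit in bytes
mkdir -p demo_bulk
# xargs reads names from stdin and invokes touch in safely sized batches
printf 'demo_bulk/%s\n' $(seq 1 1000) | xargs touch
ls demo_bulk | wc -l                 # → 1000
rm -rf demo_bulk
```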
Deletion Strategies
# Direct removal fails
rm bulk_dir/* # Argument list too long
# find with xargs is safer, but the glob below is still expanded by the shell
# before find ever runs, so an overly broad pattern fails the same way
find bulk_dir/1* | xargs rm # Argument list too long
# Target by sub-pattern in iterations
find bulk_dir/11* | xargs rm
find bulk_dir/33* | xargs rm
# ls piped to xargs works for subsets
ls bulk_dir/4* | xargs rm -f
# Most decisive approach: remove parent directory after verifying ownership/permissions
# rm -rf bulk_dir
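A pattern-free alternative sidesteps shell expansion entirely: find enumerates entries itself, so no argv limit applies. A small-scale sketch (demo_rm is a throwaway directory):

```shell
mkdir -p demo_rm
touch demo_rm/{1..500}
# find never builds a giant argv; -delete removes entries as it walks
find demo_rm -type f -delete
find demo_rm -type f | wc -l   # → 0
rmdir demo_rm
# Equivalent NUL-safe piping, robust against spaces/newlines in names:
#   find bulk_dir -type f -print0 | xargs -0 rm -f
```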
Inode Exhaustion: Detection and Mitigation
When df -h reports free space but file creation fails, inspect inode usage:
df -ih
Locate directories consuming excessive inodes:
# Example: directory with 50,000 entries occupies ~1.2MB of block metadata
mkdir project_data
for i in $(seq 1 50000); do
touch project_data/entry_$i
done
ls -lhd project_data/
ls project_data/ | wc -l
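To find which directory is actually consuming the inodes, count entries per subdirectory and sort; the sketch below builds two sample directories, and for real use the glob would be adjusted (e.g. for d in /var/*/):

```shell
mkdir -p scan/big scan/small
touch scan/big/a scan/big/b scan/big/c scan/big/d scan/small/x
# Entry count per directory, largest first (big/ sorts to the top)
for d in scan/*/; do
  printf '%8d %s\n' "$(find "$d" | wc -l)" "$d"
done | sort -rn
rm -rf scan
```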
Adding Swap Space Dynamically
When memory pressure triggers swapping, you can add extra swap via a file.
# Check existing swap
free -h
# Step 1: Allocate a file
dd if=/dev/zero of=/swap_extra bs=1M count=500
# Step 2: Prepare swap signature
mkswap /swap_extra
# Step 3: Secure permissions and activate
chmod 600 /swap_extra
swapon /swap_extra
# Verification
free -h
# Persist across reboots: either add a swapon line to /etc/rc.local,
# or (more conventionally) an /etc/fstab entry:
# /swap_extra  swap  swap  defaults  0 0
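For scripted checks after swapon, total swap can also be read directly from /proc/meminfo rather than parsing free(1) output (value is in kB; 0 means no swap is active):

```shell
# Print total configured swap in kB
awk '/^SwapTotal:/ {print $2}' /proc/meminfo
```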
Filesystem Overview and Selection Guidance
A filesystem governs how data is organized on a partition.
Defaults by Release
- CentOS 7: xfs
- CentOS 6: ext4
- CentOS 5: ext3
- Swap partition: swap
- Memory-backed: tmpfs
Workload-Oriented Choices
| Filesystem | Recommended Use Cases |
|---|---|
| reiserfs | Large numbers of small files (<100KB); requires extra package |
| xfs | MySQL deployments, large-scale data services |
| ext4 | General purpose: streaming, databases, small files |
| ext2 | CDN caching layers where journaling overhead is avoided |
| swap | Temporary memory overflow buffer |
| tmpfs | In-memory caching for performance acceleration |
What is a CDN?
A Content Delivery Network (CDN) distributes static assets across geographically dispersed nodes. Users fetch data from a nearby edge server, which reduces latency and relieves origin server congestion.