Sysadmin Essentials
Core Linux commands for managing servers — processes, services, disk, networking, users, and logs.
Recipe
Quick-reference recipe card — copy-paste ready.
# System info
uname -a # kernel version and architecture
hostnamectl # hostname, OS, kernel
uptime # how long the server has been running
df -h # disk usage (human-readable)
free -h # memory usage
top # live process monitor (q to quit)
# Process management
ps aux # all running processes
kill <PID> # graceful stop
kill -9 <PID> # force kill
# Networking
ss -tulnp # listening ports and their processes
curl -I https://example.com # HTTP headers only
ip addr show # network interfaces and IPs
# Service management (systemd)
sudo systemctl status nginx
sudo systemctl restart nginx
sudo systemctl enable nginx # start on boot
When to reach for this: when deploying, debugging, or maintaining a Linux server (VPS, EC2 instance, or self-hosted infrastructure).
Working Example
# Diagnose why a Node.js app is down
# 1. Check if the process is running
ps aux | grep node
# 2. Check the service status
sudo systemctl status my-app
# 3. Check which port it should be on
ss -tulnp | grep 3000
# 4. Check recent logs
sudo journalctl -u my-app --since "10 minutes ago" --no-pager
# 5. Check disk space (app may have filled the disk)
df -h
# 6. Check memory (OOM killer may have stopped it)
dmesg | tail -20
What this demonstrates:
- Systematic debugging: process → service → port → logs → resources
- `journalctl` is the standard way to read systemd service logs
- `dmesg` shows kernel messages, including out-of-memory kills
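The same sequence can be wrapped in a small script so that one failing step does not hide the rest. This is a minimal sketch, not a definitive tool: `run_step` and `diagnose` are hypothetical helper names, and `my-app` and `3000` are the example service and port used above. Several of the commands may need root to show full output.

```shell
# Print a header, run one command, and report a failure instead of aborting.
run_step() {
  local title="$1"
  shift
  printf '\n== %s ==\n' "$title"
  "$@" || printf '(step exited with status %s)\n' "$?"
}

# Walk through process, service, port, logs, and resources in that order.
# Arguments are optional: service name (default my-app), port (default 3000).
diagnose() {
  local service="${1:-my-app}" port="${2:-3000}"
  run_step "process"         pgrep -af node
  run_step "service"         systemctl status "$service" --no-pager
  run_step "port $port"      sh -c "ss -tulnp | grep ':$port '"
  run_step "recent logs"     journalctl -u "$service" --since "10 minutes ago" --no-pager
  run_step "disk"            df -h
  run_step "kernel messages" sh -c 'dmesg | tail -20'
}

# Usage: diagnose my-app 3000
```

Keeping each step behind `run_step` means a missing tool or a stopped service shows up as a labelled failure rather than ending the run early.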
Deep Dive
Process Management
# List all processes with full detail
ps aux
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# node 1234 2.3 4.1 123456 78901 ? Ssl 08:00 1:23 node server.js
# Find a specific process
ps aux | grep "next"
pgrep -f "next dev" # just the PID
# Kill by name
pkill -f "next dev"
# Real-time process monitor (better than top)
htop # install: sudo apt install htop
# Run a process in the background
node server.js &
# Bring it back
fg
# Run a process that survives SSH disconnect
nohup node server.js > app.log 2>&1 &
# Or use screen/tmux (recommended)
tmux new -s myapp
# Detach: Ctrl+B then D
# Reattach: tmux attach -t myapp
Disk & Storage
# Disk usage by directory (sorted, top 10)
du -sh /* 2>/dev/null | sort -rh | head -10
# Find large files (>100MB)
find / -type f -size +100M 2>/dev/null
# Check inode usage (can run out even with free space)
df -i
# Monitor disk I/O
iostat -x 1 5 # 5 samples, 1 second apart
# Clean package cache (Ubuntu/Debian)
sudo apt autoremove
sudo apt clean
# Check what's using space in the current directory
du -sh */ | sort -rh
Memory & CPU
# Memory overview
free -h
# total used free shared buff/cache available
# Mem: 16Gi 8.2Gi 1.3Gi 512Mi 6.5Gi 7.1Gi
# Top memory consumers
ps aux --sort=-%mem | head -10
# Top CPU consumers
ps aux --sort=-%cpu | head -10
# System load averages (1, 5, 15 minutes)
uptime
# 10:30:00 up 45 days, load average: 0.50, 0.75, 0.60
# CPU info
nproc # number of CPU cores
lscpu # detailed CPU info
Networking
# All listening ports
ss -tulnp
# -t TCP -u UDP -l listening -n numeric -p process
# Test connectivity
ping -c 3 google.com # 3 pings then stop
traceroute example.com # trace network path
# DNS lookup
dig example.com
nslookup example.com
# Download a file
curl -O https://example.com/file.tar.gz
wget https://example.com/file.tar.gz
# HTTP request with headers and timing
curl -w "\nDNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
-o /dev/null -s https://example.com
# Firewall (Ubuntu with ufw)
sudo ufw status
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 22/tcp
sudo ufw enable
Service Management (systemd)
# Common service operations
sudo systemctl start nginx
sudo systemctl stop nginx
sudo systemctl restart nginx
sudo systemctl reload nginx # reload config without downtime
sudo systemctl status nginx
# Enable/disable on boot
sudo systemctl enable nginx
sudo systemctl disable nginx
# View logs for a service
sudo journalctl -u nginx -f # follow (live tail)
sudo journalctl -u nginx --since today
sudo journalctl -u nginx -n 100 # last 100 lines
# Create a custom service for a Node.js app
# /etc/systemd/system/my-app.service
[Unit]
Description=My Node.js App
After=network.target
[Service]
Type=simple
User=deploy
WorkingDirectory=/opt/my-app
ExecStart=/usr/bin/node server.js
Restart=on-failure
RestartSec=5
Environment=NODE_ENV=production
Environment=PORT=3000
[Install]
WantedBy=multi-user.target
# After creating the service file
sudo systemctl daemon-reload
sudo systemctl enable my-app
sudo systemctl start my-app
Users & Permissions
# Current user info
whoami
id
# Add a user
sudo adduser deploy
# Add user to a group
sudo usermod -aG sudo deploy # give sudo access
sudo usermod -aG docker deploy # give Docker access
# File permissions
chmod 755 script.sh # rwxr-xr-x
chmod 600 .env # rw------- (secrets)
chmod +x deploy.sh # add execute permission
# Change ownership
sudo chown -R deploy:deploy /opt/my-app # -R applies the change recursively
# Permission reference
# r=4 w=2 x=1
# 755 = owner(rwx) group(r-x) others(r-x)
# 644 = owner(rw-) group(r--) others(r--)
# 600 = owner(rw-) group(---) others(---)
Logs
# System logs
sudo journalctl -xe # recent with explanations
sudo journalctl --since "1 hour ago"
# Traditional log files
tail -f /var/log/syslog # follow system log
tail -f /var/log/nginx/error.log # follow nginx errors
less /var/log/auth.log # authentication log
# Log rotation status
ls -la /var/log/nginx/
# Search logs
sudo journalctl -u my-app --grep="error" --since today
Gotchas
Things that will bite you. Each gotcha includes what goes wrong, why it happens, and the fix.
- `kill -9` as first resort: Force-killing a process doesn't let it clean up (close files, release locks). Fix: Try `kill <PID>` first (SIGTERM), wait a few seconds, then `kill -9` only if needed.
- Disk full but can't find files: Deleted files still held open by a process consume space until the process is restarted. Fix: `lsof | grep deleted` to find them, then restart the holding process.
- SSH disconnect kills your process: Background processes started in a shell die when the SSH session ends. Fix: Use `tmux`, `screen`, or run the process as a systemd service.
- Permission denied on port 80: By default, non-root processes can't bind to ports below 1024. Fix: Use a reverse proxy (nginx/caddy) on port 80 that forwards to your app on 3000, or `sudo setcap cap_net_bind_service=+ep $(which node)`.
- `ufw enable` locks you out: Enabling the firewall without allowing SSH first cuts off your own session. Fix: Always `sudo ufw allow 22/tcp` before `sudo ufw enable`.
Alternatives
Other ways to solve the same problem — and when each is the better choice.
| Alternative | Use When | Don't Use When |
|---|---|---|
| Docker containers | Consistent environments, easy scaling | Simple single-app servers |
| PM2 | Node.js process management with clustering | Non-Node apps or when systemd suffices |
| Ansible/Terraform | Automating multi-server setup | One-off server tasks |
| Managed platforms (Vercel, Railway) | Zero server management desired | Full control over infrastructure needed |
FAQs
What is the difference between kill and kill -9?
kill <PID>sends SIGTERM, allowing the process to clean up gracefullykill -9 <PID>sends SIGKILL, forcing immediate termination with no cleanup- Always try
killfirst; only use-9as a last resort
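The "SIGTERM first, SIGKILL only if needed" rule can be scripted. The sketch below is one possible shape, assuming a hypothetical helper named `stop_gracefully` and a default grace period of 5 seconds; neither comes from a standard tool.

```shell
# Send SIGTERM, poll for up to TIMEOUT seconds, then fall back to SIGKILL.
# Arguments: PID, optional TIMEOUT in seconds (default 5).
stop_gracefully() {
  local pid="$1" timeout="${2:-5}" waited=0
  # If SIGTERM cannot be delivered, the process is already gone
  # (or belongs to another user), so there is nothing more to do here.
  kill "$pid" 2>/dev/null || return 0
  # kill -0 sends no signal; it only checks that the PID still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "PID $pid still alive after ${timeout}s; sending SIGKILL" >&2
      kill -9 "$pid" 2>/dev/null || true
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage: stop_gracefully 1234 10
```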
How do I find out which process is using a specific port?
ss -tulnp | grep 3000
# or
lsof -i :3000
How do I create a systemd service for a Node.js app?
- Create a file at `/etc/systemd/system/my-app.service` with `[Unit]`, `[Service]`, and `[Install]` sections
- Set `ExecStart=/usr/bin/node server.js`, `Restart=on-failure`, and `Environment=NODE_ENV=production`
- Run `sudo systemctl daemon-reload && sudo systemctl enable --now my-app`
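If several apps need near-identical units, the file can be generated from a few parameters instead of edited by hand. The following is a sketch only: `render_unit` is a hypothetical helper, its four arguments are placeholders, and it prints to stdout rather than writing anything under `/etc`.

```shell
# Print a systemd unit for a Node.js app.
# Arguments: service name, run-as user, working directory, port.
render_unit() {
  local name="$1" user="$2" dir="$3" port="$4"
  cat <<EOF
[Unit]
Description=$name
After=network.target

[Service]
Type=simple
User=$user
WorkingDirectory=$dir
ExecStart=/usr/bin/node server.js
Restart=on-failure
RestartSec=5
Environment=NODE_ENV=production
Environment=PORT=$port

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (review the output before installing it):
#   render_unit my-app deploy /opt/my-app 3000 | sudo tee /etc/systemd/system/my-app.service
#   sudo systemctl daemon-reload && sudo systemctl enable --now my-app
```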
How do I check why a service is failing?
sudo systemctl status my-app
sudo journalctl -u my-app --since "10 minutes ago" --no-pager
- `status` shows the current state and last few log lines
- `journalctl` gives the full log history for that service
What does the free -h output mean?
- `total` = total physical RAM
- `used` = RAM actively in use by processes
- `buff/cache` = RAM used for disk caching (reclaimable)
- `available` = RAM that can be used without swapping (free + reclaimable cache)
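Because `available` is the figure that matters for headroom, it can be useful to turn it into a single percentage for scripts or alerts. This sketch reads the `MemTotal` and `MemAvailable` fields that Linux exposes in `/proc/meminfo`; the function name `mem_available_pct` is made up for illustration.

```shell
# Print MemAvailable as a whole-number percentage of MemTotal.
# Reads /proc/meminfo-style text on stdin so it can be fed sample data.
mem_available_pct() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total > 0) printf "%d\n", avail * 100 / total
      else exit 1
    }'
}

# Usage: mem_available_pct < /proc/meminfo
```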
Gotcha: I enabled ufw and locked myself out of SSH. How do I prevent this?
- Always run `sudo ufw allow 22/tcp` before `sudo ufw enable`
- If locked out, you need console access (e.g., cloud provider's web console) to disable the firewall
How do I find what is consuming all the disk space?
- `du -sh /* 2>/dev/null | sort -rh | head -10`
- Also check for deleted files held open by processes: `lsof | grep deleted`
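Before walking directories with `du`, it can help to confirm which filesystem is actually full. A minimal sketch, assuming POSIX-format `df -P` output (capacity in column 5, mount point in column 6, no spaces in mount paths); `over_threshold` is a hypothetical name.

```shell
# Print mount points whose usage is at or above THRESHOLD percent.
# Reads `df -P` output on stdin; THRESHOLD defaults to 90.
over_threshold() {
  local threshold="${1:-90}"
  awk -v limit="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)                  # strip the percent sign
    if (use + 0 >= limit + 0) print $6, use "%"
  }'
}

# Usage: df -P | over_threshold 80
# For inode exhaustion, the same filter works on: df -Pi
```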
How do I keep a process running after I disconnect from SSH?
- Use `tmux` or `screen` to create a persistent session
- Or run the process as a systemd service
- `nohup node server.js > app.log 2>&1 &` works as a quick workaround
Gotcha: Why does my Node.js app get killed randomly on the server?
- The Linux OOM (Out Of Memory) killer terminates processes when memory is exhausted
- Check with `dmesg | grep -i oom`
- Fix by increasing server RAM, setting `--max-old-space-size`, or fixing memory leaks
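To see which processes were killed rather than just that a kill happened, the kernel log can be filtered down to PID and name. This sketch assumes the message contains the phrase `Killed process <pid> (<name>)`; the exact wording can differ between kernel versions, and `oom_victims` is a hypothetical helper name.

```shell
# Print "<pid> <name>" for each OOM-kill message found on stdin.
oom_victims() {
  sed -n -E 's/.*Killed process ([0-9]+) \(([^)]*)\).*/\1 \2/p'
}

# Usage: sudo dmesg | oom_victims
#    or: sudo journalctl -k --no-pager | oom_victims
```

No output means no matching kill messages were found in the log that was supplied, which is not proof that none ever occurred (the kernel ring buffer is finite).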
How do I give a user sudo access?
- `sudo usermod -aG sudo deploy` (on Debian/Ubuntu; RHEL-family systems use the `wheel` group instead)
- The user must log out and back in for the group change to take effect
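A quick way to confirm the change was recorded is to ask the account database directly. In this sketch `in_group` is a hypothetical helper; it relies on `id -nG <user>`, which lists the groups configured for the named account, whereas plain `id -nG` reports the groups of the current login session and so can lag behind until the next login.

```shell
# Succeed if USER is a member of GROUP according to the account database.
in_group() {
  local user="$1" group="$2" g
  for g in $(id -nG "$user"); do
    if [ "$g" = "$group" ]; then
      return 0
    fi
  done
  return 1
}

# Usage: in_group deploy sudo && echo "deploy can use sudo"
```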
What do Linux file permission numbers like 755 and 600 mean?
- Each digit represents owner, group, others; values are r=4, w=2, x=1
- `755` = owner rwx, group r-x, others r-x (executables, scripts)
- `600` = owner rw-, group ---, others --- (secrets like `.env`)
- `644` = owner rw-, group r--, others r-- (regular files)
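The digit arithmetic is easy to check mechanically. The sketch below converts a three-digit octal mode into its `rwx` string by testing the 4, 2, and 1 bits of each digit; `perm_to_symbolic` is an illustrative name, not a standard command.

```shell
# Convert a three-digit octal mode (e.g. 755) into its rwx string.
perm_to_symbolic() {
  local mode="$1" out="" digit r w x
  case "$mode" in
    [0-7][0-7][0-7]) ;;
    *) echo "expected three octal digits, got: $mode" >&2; return 1 ;;
  esac
  # Split the mode into single digits: owner, group, others.
  for digit in $(printf '%s' "$mode" | sed 's/./& /g'); do
    r=-; w=-; x=-
    if [ $((digit & 4)) -ne 0 ]; then r=r; fi   # read  = 4
    if [ $((digit & 2)) -ne 0 ]; then w=w; fi   # write = 2
    if [ $((digit & 1)) -ne 0 ]; then x=x; fi   # exec  = 1
    out="$out$r$w$x"
  done
  printf '%s\n' "$out"
}

# Usage: perm_to_symbolic 644
```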
In a TypeScript deployment, how do I check if the build output is correct on the server?
- Verify the build directory exists: `ls -la /opt/my-app/.next/`
- Check that `node_modules` is installed: `ls /opt/my-app/node_modules/.package-lock.json`
- Test the start command manually: `cd /opt/my-app && NODE_ENV=production node server.js`
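The first two checks can be folded into a pre-start guard for a deploy script. A minimal sketch, assuming the same layout as above (`.next` and `node_modules` directly under the app directory); `check_build` is a hypothetical helper.

```shell
# Report whether the expected build directories exist under APP_DIR.
# Returns non-zero if any of them is missing.
check_build() {
  local app_dir="$1" status=0 path
  for path in "$app_dir/.next" "$app_dir/node_modules"; do
    if [ -d "$path" ]; then
      echo "ok      $path"
    else
      echo "missing $path"
      status=1
    fi
  done
  return "$status"
}

# Usage: check_build /opt/my-app && sudo systemctl restart my-app
```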
Related
- Search & Regex — Find files, search content, and use regex patterns
- Node.js Developer Commands — npm, environment, and build commands
- Shell Productivity — Pipes, redirection, scripting, and workflow tips