
📦 Service Inventory - Complete Container Catalog

Last Updated: October 31, 2025
Total Containers: 32 (6 running, 26 stopped)
Purpose: Comprehensive catalog of all services


📊 Quick Stats

| Metric | Value | Status |
|---|---|---|
| Total Containers | 32 | - |
| Running | 6 | 19% |
| Stopped | 26 | ⚠️ 81% |
| Total Docker Images | ~50GB | ⚠️ High |
| Cache Usage | 578GB / 932GB | ⚠️ 63% |

Key Insight: 81% of containers are stopped - cleanup opportunity!


🟢 Running Services (6 containers)

1. open-webui

Status: Running (healthy)
Container: open-webui
Image: ghcr.io/open-webui/open-webui:main (4.55GB)
Created: 2025-10-16 (2 weeks ago)
Network: bridge (172.17.0.5)
Ports: 8080→3000 (container→host)

Resources:

  • CPU: 0.15%
  • Memory: 1.026GB / 60.55GB (1.69%)
  • Storage: 42.4MB

Purpose: LLM chat interface (ChatGPT-like UI for local models)

Dependencies:

  • ollama (currently STOPPED)
  • OpenAI API key (configured)

Access:

Issues:

  • ⚠️ Depends on ollama container which is stopped
  • ⚠️ OpenAI API key exposed in environment variables

Recommendations:

  1. KEEP - Active LLM interface
  2. Restart ollama container to enable local models
  3. Move API keys to Docker secrets
  4. Enable authentication

Priority: HIGH - Core AI/ML service


2. NginxProxyManager

Status: Running
Container: NginxProxyManager
Image: jlesage/nginx-proxy-manager (189MB)
Created: 2025-10-11 (3 weeks ago)
Network: bridge (172.17.0.4)
Ports: 4443→18443, 8080→1880, 8181→7818 (container→host)

Resources:

  • CPU: 0.08%
  • Memory: 77.45MB (0.12%)
  • Storage: 13.4KB

Purpose: Reverse proxy with web UI - SSL termination and routing

Dependencies: None

Access:

Configuration:

  • Routes traffic to backend services
  • Manages SSL certificates
  • Provides access control

Recommendations:

  1. KEEP - Critical infrastructure
  2. Document all proxy rules in Gitea
  3. Verify SSL auto-renewal is configured
  4. Enable MFA if available
  5. Review access logs regularly

Priority: CRITICAL - Core infrastructure


3. Gitea

Status: Running
Container: Gitea
Image: gitea/gitea (180MB)
Created: 2025-10-08 (3 weeks ago)
Network: bridge (172.17.0.3)
Ports: 22→22, 3000→3002 (container→host)

Resources:

  • CPU: 0.11%
  • Memory: 114.5MB (0.18%)
  • Storage: 113MB (active repositories!)

Purpose: Self-hosted Git server (GitHub alternative)

Dependencies: None (internal SQLite)

Access:

Configuration:

  • Using latest tag (unpinned version)
  • Storage: /mnt/user/appdata/gitea

Issues:

  • ⚠️ SSH port 22 conflicts with Unraid SSH
  • ⚠️ Using latest tag (version not pinned)
  • ⚠️ Backup strategy unknown

Recommendations:

  1. KEEP - Critical for version control
  2. Change SSH port to 2222 to avoid conflict
  3. Pin to specific version tag
  4. Implement automated backups (CRITICAL!)
  5. This is your version control hub - protect it!

Priority: CRITICAL - Infrastructure documentation depends on this
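
The version-pinning recommendation applies beyond Gitea. A minimal sketch for spotting unpinned images, assuming the `{{.Names}} {{.Image}}` output of `docker ps`; the helper name `unpinned_images` is hypothetical:

```shell
# Reads "name image" lines on stdin and prints the containers whose image
# reference has no tag or uses ":latest".
# Digest-pinned references (image@sha256:...) are treated as pinned.
unpinned_images() {
  awk '
    {
      ref = $2
      if (ref ~ /@sha256:/) next       # pinned by digest
      n = split(ref, parts, "/")       # inspect only the last path segment so
      last = parts[n]                  # a registry port is not mistaken for a tag
      if (last !~ /:/ || last ~ /:latest$/) print $1 " " ref
    }'
}

# Usage (not run here):
#   docker ps -a --format '{{.Names}} {{.Image}}' | unpinned_images
```

This only catches a missing tag or `:latest`. Other moving tags, such as the `:main` tag open-webui uses, would still need a manual check.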


4. ApacheGuacamole

Status: Running (2+ months uptime!)
Container: ApacheGuacamole
Image: jasonbean/guacamole (737MB)
Created: 2025-08-22 (2+ months ago)
Network: bridge (172.17.0.2)
Ports: 8080→4000 (container→host)

Resources:

  • CPU: 0.16%
  • Memory: 785.8MB (1.27%)
  • Storage: 46.2MB

Purpose: Clientless remote desktop gateway (RDP/VNC/SSH via browser)

Dependencies:

  • MariaDB (STOPPED) - BROKEN DEPENDENCY!

Access:

Configuration:

  • MySQL enabled but MariaDB stopped
  • Multiple auth modules: MySQL, LDAP, TOTP, etc.

Issues:

  • 🚨 CRITICAL: Depends on MariaDB which is stopped!
  • Currently using embedded database (not recommended)
  • Data loss risk without proper database backend

Recommendations:

  1. ⚠️ FIX IMMEDIATELY - Restart MariaDB or reconfigure
  2. If keeping: Start MariaDB and verify connection
  3. If not using: Stop Guacamole and remove both
  4. Document your use case for remote desktop access

Priority: MEDIUM - Fix dependency or remove
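
A broken dependency like this one can be detected with a small check. The sketch below assumes the containers are named `ApacheGuacamole` and `MariaDB` as written in this inventory (the actual MariaDB container name may differ); `check_dependency` is a hypothetical helper:

```shell
# Warn when a service is running but one of its dependencies is not.
# Runs in a subshell so its variables do not leak into the caller.
check_dependency() (
  svc=$1
  dep=$2
  svc_state=$(docker inspect --format '{{.State.Status}}' "$svc" 2>/dev/null)
  dep_state=$(docker inspect --format '{{.State.Status}}' "$dep" 2>/dev/null)
  if [ "$svc_state" = "running" ] && [ "$dep_state" != "running" ]; then
    echo "WARNING: $svc is running but $dep is ${dep_state:-missing}"
    exit 1
  fi
  exit 0
)

# Usage (not run here):
#   check_dependency ApacheGuacamole MariaDB
#   check_dependency open-webui ollama
```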


5. Cloudflared

Status: Running (2.5+ months - very stable!)
Container: Unraid-Cloudflared-Tunnel
Image: figro/unraid-cloudflared-tunnel (8.92MB)
Created: 2025-08-10 (2.5+ months ago)
Network: bridge (172.17.0.6)
Ports: 46495→46495 (metrics)

Resources:

  • CPU: 0.33% (highest of running containers)
  • Memory: 68.6MB (0.11%)
  • Network I/O: 41.7MB RX / 310KB TX

Purpose: Cloudflare Tunnel - secure external access without port forwarding

Dependencies: None

Access:

Configuration:

  • Tunnel token configured
  • No auto-update enabled
  • Metrics exposed for monitoring

Security:

  • ⚠️ Tunnel token in plain text environment variable
  • No open ports on router (excellent!)

Recommendations:

  1. KEEP - Excellent security practice
  2. Rotate tunnel token periodically
  3. Document which services are exposed
  4. Integrate metrics with monitoring stack

Priority: HIGH - Critical for secure remote access


6. Vaultwarden

Status: Running (healthy) - 3+ months uptime!
Container: vaultwarden
Image: vaultwarden/server (256MB)
Created: 2025-07-31 (3+ months ago)
Network: bridge (172.17.0.7)
Ports: 80→4743 (container→host)

Resources:

  • CPU: 0.00% (idle)
  • Memory: 24.96MB (0.04%) - Very lightweight!

Purpose: Self-hosted password manager (Bitwarden compatible)

Dependencies: None

Access:

Configuration:

  • Signups allowed: true ⚠️
  • Invitations allowed: false
  • WebSocket disabled ⚠️
  • Admin token exposed ⚠️

Issues:

  • 🚨 CRITICAL: No backup strategy evident!
  • ⚠️ Admin token in plain text
  • ⚠️ Signups open (verify intentional)
  • ⚠️ WebSocket disabled (reduces functionality)

Recommendations:

  1. KEEP - Critical security infrastructure
  2. 🚨 IMPLEMENT BACKUP IMMEDIATELY - This is your password vault!
  3. Close signups after initial setup
  4. Rotate admin token and use secrets management
  5. Enable WebSocket for better sync
  6. Automate daily backups to off-site location

Priority: CRITICAL - Contains all your passwords!
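
One possible shape for the backup, as a sketch rather than a finished script: stop the container, archive its appdata directory, verify the archive is readable, and restart the container whether or not the archive step succeeded. The helper name `backup_appdata` is hypothetical and the destination path is an assumption.

```shell
# Stop a container, archive its data directory, verify the archive,
# then restart the container even if archiving failed.
# Prints the archive path on success. Runs in a subshell.
backup_appdata() (
  name=$1   # container name, e.g. vaultwarden
  src=$2    # data directory, e.g. /mnt/user/appdata/vaultwarden
  dest=$3   # backup directory (assumed to exist)
  archive="$dest/$name-backup-$(date +%Y%m%d).tar.gz"
  status=0

  docker stop "$name" >/dev/null || exit 1
  # -C keeps paths inside the archive relative to the parent directory
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
    tar -tzf "$archive" >/dev/null || status=1
  docker start "$name" >/dev/null

  [ "$status" -eq 0 ] && echo "$archive"
  exit "$status"
)

# Usage (not run here; /mnt/user/backup is an assumed destination):
#   backup_appdata vaultwarden /mnt/user/appdata/vaultwarden /mnt/user/backup
```

Running this daily (for example from a scheduled script) and copying the resulting archive off-site would cover recommendation 6. The same function should also work for Gitea's appdata directory.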


🔴 Recently Stopped Services (Worth Investigating)

7. ollama ⚠️

Status: Exited (128) 4 minutes ago
Image: ollama/ollama (3.33GB)
Purpose: Local LLM inference engine

Why It Matters: open-webui depends on this!

Recommendations:

  1. 🔧 RESTART - Required for open-webui local models
  2. Investigate exit code 128 (configuration issue?)
  3. Configure GPU acceleration (RTX 4090!)
  4. Test with open-webui after restart

Action: docker start ollama && docker logs -f ollama
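
Before restarting, it may help to see what Docker recorded about the exit. The sketch below assumes the container is named `ollama`; both helper names are hypothetical. In my experience an exit code of exactly 128 often means the container failed to start at all, in which case `.State.Error` may hold the reason, but treat that as a hint to verify rather than a rule.

```shell
# Show the recorded exit code, OOM flag, and any start error for a container.
exit_info() {
  docker inspect \
    --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}} error={{.State.Error}}' "$1"
}

# Rough interpretation of an exit code using the usual shell convention:
# codes above 128 mean "terminated by signal (code - 128)".
describe_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "clean exit"
  elif [ "$code" -gt 128 ]; then
    echo "killed by signal $((code - 128))"
  else
    echo "error exit ($code)"
  fi
}

# Usage (not run here):
#   exit_info ollama
#   docker logs --tail 50 ollama
```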


8. Monitoring Stack (Stopped 12 days ago) 🚨

Containers:

  • Grafana (stopped 12 days)
  • InfluxDB (stopped 12 days)
  • Telegraf (stopped 12 days)

Total Size: ~1.7GB

Why Critical: Zero observability into system health!

Recommendations:

  1. 🚨 RESTART IMMEDIATELY - Priority 1!
  2. Configure dashboards for:
    • Docker container stats
    • System resources (CPU, RAM, disk)
    • Network traffic
    • Temperature sensors
  3. Set up alerting for critical issues
  4. Document in runbook

Action:

docker start Influxdb
sleep 15  # Wait for DB initialization
docker start Telegraf
docker start Grafana
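
The fixed `sleep 15` works if the database happens to be ready by then. A hedged alternative is to poll until a readiness command succeeds. `wait_until` is a hypothetical helper, and the InfluxDB URL in the usage comment is an assumption (default port 8086 published on the host, `/ping` endpoint reachable):

```shell
# Retry a command once per second until it succeeds or the timeout
# (in seconds) is reached. Returns 0 on success, 1 on timeout.
wait_until() (
  timeout=$1
  shift
  waited=0
  until "$@" >/dev/null 2>&1; do
    [ "$waited" -ge "$timeout" ] && exit 1
    sleep 1
    waited=$((waited + 1))
  done
  exit 0
)

# Usage (not run here; URL and port are assumptions):
#   docker start Influxdb
#   wait_until 60 curl -fsS http://localhost:8086/ping && docker start Telegraf Grafana
```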

9. MariaDB (Stopped 12 days ago) ⚠️

Status: Exited (0) 12 days ago
Image: lscr.io/linuxserver/mariadb (348MB)
Purpose: MySQL-compatible database for Guacamole

Issue: Guacamole is running but database is stopped!

Recommendations:

  1. If using Guacamole: RESTART
  2. If not using Guacamole: REMOVE BOTH
  3. Document decision

10. Database Admin Tools (Stopped 12 days ago)

CloudBeaver - Stopped 12 days
adminer - Stopped 12 days

Issue: Two database admin tools - redundant!

Recommendations:

  1. CHOOSE ONE:
    • CloudBeaver: Feature-rich (725MB)
    • adminer: Lightweight (118MB)
  2. Remove the other
  3. Only restart if you need database management

🟡 Experimental / Inactive Services (Decision Needed)

11. Nextcloud AIO Stack (8 containers!) 🚨

Status: All stopped 3 weeks ago
Total Size: ~7GB Docker images + data
Containers:

  • nextcloud-aio-mastercontainer
  • nextcloud-aio-apache
  • nextcloud-aio-nextcloud (2.19GB)
  • nextcloud-aio-database (PostgreSQL)
  • nextcloud-aio-redis
  • nextcloud-aio-onlyoffice (3.79GB!)
  • nextcloud-aio-imaginary
  • nextcloud-aio-notify-push

Data: /mnt/user/nextcloud (~1GB+)

Analysis:

  • Massive resource footprint
  • "All-in-One" = heavy coupling
  • Stopped for 3 weeks suggests not critical

Recommendations: DECISION REQUIRED:

Option A: Remove Everything

# Backup data first!
cp -r /mnt/user/nextcloud /mnt/user/backup/nextcloud-$(date +%Y%m%d)

# Remove containers
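# (the shell does not expand * against container names, so look the IDs up by name filter)
docker rm $(docker ps -aq --filter "name=nextcloud-aio-")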

# Remove images to free space
docker rmi $(docker images | grep nextcloud | awk '{print $3}')

# Archive data
tar -czf nextcloud-data-backup.tar.gz /mnt/user/nextcloud

Saves: ~7GB+ space

Option B: Keep and Restart

  • Document why you need it
  • Create restart procedure
  • Implement backup strategy
  • Monitor resource usage

My Recommendation: Remove unless actively needed. Nextcloud is great but this All-in-One stack is heavy.


12. Jellyfin (Stopped 2 weeks ago) ⚠️

Status: Exited (0) 2 weeks ago
Image: jellyfin/jellyfin (1.25GB)
GPU: RTX 4090 allocated but idle!

Media:

  • Movies: /mnt/user/movies
  • TV: /mnt/user/tv shows
  • Music: /mnt/user/music

Issue: $1600 GPU sitting idle!

Recommendations:

If you want a media server:

  1. RESTART with hardware transcoding:
    docker start Jellyfin
    
  2. Configure NVENC/NVDEC for RTX 4090
  3. Test 4K transcoding performance
  4. Switch from host network to bridge (security)

If you don't need a media server:

  1. Remove GPU allocation from container
  2. Free GPU for other projects (AI/ML)

Action Required: Decide on media server strategy


13. Large AI/ML Containers (Rarely Used)

ebook2audiobook - 20.06GB! (stopped 3 weeks)
docling-serve - 14.45GB! (stopped 2 weeks)

Total: 34.5GB for two containers!

Analysis:

  • Massive images
  • Rarely used (stopped weeks ago)
  • Experimental/one-time use?

Recommendations:

  1. REMOVE both to free 34.5GB
  2. If needed again, pull fresh images
  3. Document use cases if keeping

Potential Savings: 34.5GB cache space!
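
The totals in this section can be reproduced from Docker's own size strings. A small sketch, assuming sizes in the decimal `kB`/`MB`/`GB` form that `docker images` prints; `sum_sizes_gb` is a hypothetical helper:

```shell
# Sum human-readable sizes (one per line on stdin) and print the total in GB.
sum_sizes_gb() {
  awk '
    /GB$/    { sub(/GB$/, "");    total += $1;           next }
    /MB$/    { sub(/MB$/, "");    total += $1 / 1000;    next }
    /[kK]B$/ { sub(/[kK]B$/, ""); total += $1 / 1000000; next }
    END      { printf "%.2f\n", total }'
}

# The two images listed above:
printf '20.06GB\n14.45GB\n' | sum_sizes_gb   # prints 34.51

# Usage against real data (not run here):
#   docker images --format '{{.Size}}' | sum_sizes_gb
```

Summing per-image sizes counts shared layers more than once, so the space actually reclaimed can be lower. `docker system df` reports reclaimable space directly.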


14. Productivity Suite (Multiple Stopped)

baserow - Stopped 2 weeks (2.25GB)
NocoDB - Stopped 3 weeks (588MB)
OpenProject - Stopped 7 weeks (2.87GB)

Issue: Three overlapping productivity tools (two no-code databases and one project manager) - redundant!

Recommendations:

  1. CHOOSE ONE (or none if not used)
  2. Remove the others
  3. Migrate data if needed first

Potential Savings: ~3-5GB depending on which tool is kept (~5.7GB if all three are removed)


15. Development Tools

n8n (workflow automation) - Created but never started
steam-headless - Created but not running

Recommendations:

  • Document if you have plans for these
  • Remove if experimental and abandoned

📋 Container Decision Matrix

| Container | Keep? | Action | Priority |
|---|---|---|---|
| open-webui | Yes | Keep running, restart ollama | HIGH |
| NginxProxyManager | Yes | Keep, document configs | CRITICAL |
| Gitea | Yes | Keep, fix SSH port, backup | CRITICAL |
| ApacheGuacamole | ⚠️ Decide | Fix MariaDB OR remove both | MEDIUM |
| Cloudflared | Yes | Keep, rotate token | HIGH |
| Vaultwarden | Yes | Keep, BACKUP NOW! | CRITICAL |
| ollama | Yes | Restart immediately | HIGH |
| Monitoring Stack | Yes | Restart all 3 containers | CRITICAL |
| MariaDB | ⚠️ Conditional | If Guacamole stays | MEDIUM |
| Nextcloud AIO (8) | Remove | Backup data, remove stack | LOW |
| Jellyfin | ⚠️ Decide | Use GPU or remove | MEDIUM |
| ebook2audiobook | Remove | Free 20GB | LOW |
| docling-serve | Remove | Free 14.5GB | LOW |
| baserow/NocoDB/OpenProject | Choose 1 | Remove others | LOW |
| CloudBeaver/adminer | ⚠️ Choose 1 | Keep one DB admin | LOW |

Phase 1: Critical (Do First!) 🚨

  1. Backup Vaultwarden (30 min)

    docker stop vaultwarden
    tar -czf vaultwarden-backup-$(date +%Y%m%d).tar.gz /mnt/user/appdata/vaultwarden
    docker start vaultwarden
    
  2. Backup Gitea (30 min)

    docker stop Gitea
    tar -czf gitea-backup-$(date +%Y%m%d).tar.gz /mnt/user/appdata/gitea
    docker start Gitea
    
  3. Restart Monitoring Stack (15 min)

    docker start Influxdb && sleep 15
    docker start Telegraf Grafana
    # Configure dashboards
    
  4. Restart ollama (5 min)

    docker start ollama
    docker logs -f ollama
    

Phase 2: Cleanup (Free Space!) 💾

  1. Remove Large Unused Containers (1 hour)

    • ebook2audiobook (20GB)
    • docling-serve (14.5GB)
    • Nextcloud AIO stack (7GB)
    • Saves: ~41GB!
  2. Docker System Cleanup

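    docker system prune -a
    # WARNING: removes ALL stopped containers, unused networks, unused images, and build cache.
    # Start or back up any stopped container you intend to keep (for example MariaDB or Jellyfin) first.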
    

Phase 3: Decisions (This Week)

  1. Guacamole + MariaDB - Keep or remove?
  2. Jellyfin - Restart with GPU or remove?
  3. Productivity tools - Choose one, remove others
  4. Database admin - CloudBeaver or adminer?

📊 Storage Cleanup Impact

Current Cache Usage: 578GB / 932GB (63%)

After Recommended Cleanup:

  • Remove ebook2audiobook: -20GB
  • Remove docling-serve: -14.5GB
  • Remove Nextcloud AIO: -7GB
  • Docker system prune: ~10-20GB
  • Total Freed: ~50-60GB

New Cache Usage: ~520GB / 932GB (56%)
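
The projection above is simple arithmetic. As a quick check, here is a sketch using the figures from this section (all values in GB); `projected_usage` is a hypothetical helper:

```shell
# Print projected cache usage after freeing some space.
# Arguments: used_gb total_gb freed_gb
projected_usage() {
  awk -v used="$1" -v total="$2" -v freed="$3" 'BEGIN {
    left = used - freed
    printf "%.1fGB / %dGB (%.0f%%)\n", left, total, 100 * left / total
  }'
}

# Low end: 20 + 14.5 + 7 + 10 = 51.5GB freed
projected_usage 578 932 51.5   # prints 526.5GB / 932GB (56%)
# High end: 20 + 14.5 + 7 + 20 = 61.5GB freed
projected_usage 578 932 61.5   # prints 516.5GB / 932GB (55%)
```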


🔐 Security Recommendations

  1. Secrets Management - Stop using plain text env vars
  2. Close Open Signups - Vaultwarden signups should be closed
  3. SSH Port Conflict - Fix Gitea port 22 conflict
  4. Network Mode - Move Jellyfin from host to bridge
  5. Version Pinning - Stop using latest tags
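
To find where plain text secrets live, one option is to list the names (never the values) of environment variables that look sensitive. This is a sketch under assumptions: the name pattern is a guess to be tuned, and `secret_env_names` is a hypothetical helper.

```shell
# Print the NAMES of environment variables in a container that look like
# secrets. Values are cut off before matching so they never reach the terminal.
secret_env_names() (
  docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1" |
    cut -d= -f1 |
    grep -iE 'key|token|secret|password' || true
)

# Usage (not run here): audit every container, running or stopped
#   for c in $(docker ps -a --format '{{.Names}}'); do
#     echo "== $c"; secret_env_names "$c"
#   done
```

Anything this prints is a candidate for moving out of the container's environment and for rotation, since the value has already been stored in plain text.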

📈 Resource Summary

Docker Images Total: ~50GB
Container Data: Varies by appdata
Cache Impact: High (63% full)

Top Resource Consumers (Images):

  1. ebook2audiobook: 20.06GB
  2. docling-serve: 14.45GB
  3. Nextcloud stack: ~7GB
  4. open-webui: 4.55GB
  5. OpenProject: 2.87GB

🎓 Key Takeaways

  1. 6 services are your core - Keep these running
  2. 26 stopped containers - Cleanup opportunity
  3. ~41GB can be freed by removing unused containers (~50-60GB including a Docker system prune)
  4. No monitoring - Critical gap (restart Grafana stack!)
  5. Backup critical - Vaultwarden and Gitea MUST be backed up

Last Updated: October 31, 2025
Next Review: After cleanup actions completed
Maintained By: Weston