Tactical RMM (TRMM) is a self-hosted RMM that integrates inventory, monitoring, patching, script execution, and remote access (via MeshCentral), with agents for Windows, Linux, and macOS (the latter two via the project’s sponsorship program). It runs on Django + Vue, NATS, PostgreSQL, and Redis, and can be orchestrated with Docker Compose. For small/medium environments currently living on AnyDesk/TeamViewer + scripts + GPO, TRMM reduces tool sprawl and, critically, cuts OPEX: no per-endpoint license fee (hosting and operations time are still yours to pay), plus full control of data and attack surface.

Below is a guide focused on operations (not marketing): architecture, requirements, IaC + hardening, update runbook, backup/DR, gotchas, and realistic sizing.


1) Reference Architecture

Components (containerized):

  • Panel/UI: Vue + Django API
  • NATS: messaging/queue for tasks/telemetry
  • PostgreSQL: state database
  • Redis: broker/cache (short queues, counters)
  • MeshCentral: remote desktop, shell, file transfer
  • TLS Proxy: Traefik/Caddy/nginx with Let’s Encrypt

Recommended FQDNs (two planes):

  • rmm.example.com → UI
  • api.example.com → API/Agents

External ports: 80/443.
Internal ports (not exposed): 4222 (NATS), 5432 (Postgres), 6379 (Redis), 443xx (Mesh sub-services as needed).
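
The "not exposed" rule for internal ports can be checked mechanically. The following is a minimal sketch, assuming the input is one "address:port" pair per line; the function name find_exposed is hypothetical, and the port list is taken from the line above.

```shell
#!/bin/sh
# Flag internal service ports that listen on anything other than loopback.
# Input: one "address:port" pair per line on stdin (format is an assumption).
INTERNAL_PORTS="4222 5432 6379"   # NATS, PostgreSQL, Redis

find_exposed() {
  awk -v ports="$INTERNAL_PORTS" '
    BEGIN { n = split(ports, p, " "); for (i = 1; i <= n; i++) internal[p[i]] = 1 }
    {
      port = $1; sub(/^.*:/, "", port)      # text after the last colon
      addr = $1; sub(/:[^:]*$/, "", addr)   # text before the last colon
      if ((port in internal) && addr != "127.0.0.1" && addr != "[::1]")
        print "EXPOSED " $1
    }'
}
```

On the host, something like `ss -H -tln | awk '{print $4}' | find_exposed` should print nothing if the internal services are only reachable on the Compose network. The check is deliberately conservative: a bind to a private host address is flagged as well.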

HA note: TRMM is designed for single-node. You can “stretch” to light HA (PG in HA, warm backups, active/passive reverse proxy), but it’s not Kubernetes.


2) Requirements and Sizing

  • Host OS: Ubuntu 22.04 LTS (recommended) or 24.04
  • Minimum HW: 4 vCPU, 8 GB RAM, 80–120 GB SSD
  • Recommended HW (≈1,000 endpoints, moderate checks): 8 vCPU, 16–24 GB RAM, NVMe SSD
  • Deployment time: 60–120 minutes (with DNS/TLS propagated)

Rule-of-thumb sizing (depends on checks, intervals, patch windows):

  • 300–500 endpoints: 4 vCPU / 8 GB → OK
  • 500–1,500 endpoints: 8 vCPU / 16–24 GB → tune NATS/PG (work_mem, autovacuum)
  • > 1,500 endpoints: scale vertically and audit checks/intervals; move PG to its own or managed instance; consider splitting by tenant.
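
The rule of thumb above can be encoded as a small lookup for capacity-planning scripts. This is only a sketch: the function name size_tier is hypothetical, and treating fewer than 300 endpoints as the smallest tier and more than 1,500 as the top tier are assumptions.

```shell
#!/bin/sh
# Map an endpoint count to the sizing tiers listed above.
size_tier() {
  n=$1
  if [ "$n" -le 500 ]; then
    echo "4 vCPU / 8 GB"
  elif [ "$n" -le 1500 ]; then
    echo "8 vCPU / 16-24 GB; tune NATS/PG"
  else
    echo "scale vertically; move PG off-host; consider splitting by tenant"
  fi
}
```

For example, `size_tier 1200` lands in the middle tier. Real sizing still depends on check intervals and patch windows, as noted above.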

3) Network, DNS, and PKI

  1. DNS: A/AAAA for rmm. and api. to host IP.
  2. PKI: Let’s Encrypt HTTP-01 (or DNS-01 for wildcard).
  3. Firewall: open 22/80/443; deny everything else on host (UFW/nftables).
  4. Zero Trust (optional): Cloud proxy (Cloudflare, Zscaler) in front of api. with WAF and rate limiting.
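
The firewall step can be sketched as a generated rule list that is reviewed before it is applied. The function name ufw_rules is hypothetical; the ufw subcommands are standard, and the port list comes from step 3.

```shell
#!/bin/sh
# Print the UFW commands for a default-deny host that allows SSH and HTTP(S).
# Printing instead of executing lets the list be reviewed or diffed first.
ufw_rules() {
  echo "ufw default deny incoming"
  echo "ufw default allow outgoing"
  for port in 22 80 443; do
    echo "ufw allow ${port}/tcp"
  done
  echo "ufw --force enable"   # --force skips the interactive confirmation
}
```

Once reviewed, `ufw_rules | sudo sh` applies the rules. Keep in mind that Docker publishes container ports through its own iptables rules, so a port published in Compose is reachable regardless of UFW; only publish 80/443 on the proxy.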

4) Reproducible Deployment (IaC)

4.1 Host bootstrap (Ansible, idempotent)

  • Install Docker/Compose, git, jq, unzip
  • Configure TZ, sysctl (fs.inotify, vm.*), journald rotation
  • Create user without password, minimal sudoers, SSH key-only
  • UFW with a default-deny profile

4.2 Compose stack

Folder layout:

/opt/tacticalrmm/
  docker-compose.yml
  .env
  traefik/ (or caddy/)
  postgres/data
  redis/data
  meshcentral/data
  backup/

Key variables (.env):

RMM_FQDN=rmm.example.com
API_FQDN=api.example.com
TZ=Europe/Madrid
[email protected]

POSTGRES_PASSWORD=XXXXXXXX
REDIS_PASSWORD=XXXXXXXX
NATS_PASSWORD=XXXXXXXX
DJANGO_SECRET_KEY=XXXXXXXX (>= 50 chars)
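
The XXXXXXXX placeholders should be replaced with random values. A minimal sketch, assuming hex strings are acceptable for every secret; gen_secret and print_env_secrets are hypothetical helper names. 32 random bytes produce 64 hex characters, which satisfies the 50-character minimum noted for DJANGO_SECRET_KEY.

```shell
#!/bin/sh
# Generate random hex secrets from /dev/urandom.
# gen_secret N reads N random bytes and prints them as 2*N hex characters.
gen_secret() {
  bytes=${1:-32}
  head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Print .env lines for the secrets listed above (variable names from the text).
print_env_secrets() {
  for name in POSTGRES_PASSWORD REDIS_PASSWORD NATS_PASSWORD DJANGO_SECRET_KEY; do
    printf '%s=%s\n' "$name" "$(gen_secret 32)"
  done
}
```

The output of `print_env_secrets` can be pasted into .env or piped into a secrets manager; it should never be committed to git.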

Compose (high level):

  • traefik/caddy with 80/443 entrypoints and automatic certs
  • Services rmm, api, nats, postgres, redis, meshcentral on an internal network
  • Healthchecks and restart policies

Launch:

docker compose pull
docker compose up -d
docker compose ps

Post-install:

  • Browse https://rmm.example.com → create admin
  • Verify https://api.example.com/api/ → HTTP 200
  • Configure SMTP/alerts, policies, and MeshCentral (2FA and certs)
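
The two URL checks above can be scripted for reuse in monitoring or after updates. A minimal sketch using curl; the function names status_matches and check_url are hypothetical, and the expected status defaults to 200 as stated for the API.

```shell
#!/bin/sh
# Compare an HTTP status code against the expected one.
status_matches() {
  [ "$1" = "$2" ]
}

# check_url URL [EXPECTED]: fetch the URL and report OK/FAIL by status code.
check_url() {
  url=$1
  expected=${2:-200}
  # On connection errors curl exits non-zero; record that as status 000.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || code=000
  if status_matches "$code" "$expected"; then
    echo "OK   $code $url"
  else
    echo "FAIL $code $url (expected $expected)"
    return 1
  fi
}
```

Usage: `check_url https://rmm.example.com && check_url https://api.example.com/api/`. If the panel answers with a redirect rather than 200, pass the expected code as the second argument.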

5) Hardening (Checklist)

  • TLS mandatory, modern cipher suites (TLS 1.2/1.3)
  • UI: mandatory 2FA for admins; RBAC least privilege; audit changes
  • API: rate limiting (traefik/caddy), security headers (CSP, HSTS, X-Frame-Options)
  • MeshCentral: 2FA, short session expiry, auto-update, UAC consent configured
  • Internal services (PG/Redis/NATS): bind to internal networks; no public ports
  • Logs: forward to syslog/ELK/Vector with retention and GDPR compliance
  • Backups: see §7; encrypt-at-rest (rclone/Restic to S3/Backblaze)
  • Patching: host unattended-upgrades + monthly compose pull window
  • Secrets: keep .env and credentials in Vault/sops; never in git
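
Part of the secrets item can be verified automatically. The sketch below checks only file permissions on .env; the function name check_env_file is hypothetical, and `stat -c` assumes GNU coreutils, as on the Ubuntu host recommended above.

```shell
#!/bin/sh
# Fail if the secrets file is missing or readable by group/others.
check_env_file() {
  f=$1
  [ -f "$f" ] || { echo "missing: $f"; return 2; }
  mode=$(stat -c '%a' "$f")   # octal permissions, e.g. 600
  case "$mode" in
    600|400) echo "ok: $f mode $mode" ;;
    *) echo "too open: $f mode $mode (expected 600)"; return 1 ;;
  esac
}
```

Usage: `check_env_file /opt/tacticalrmm/.env`. To confirm the file is not tracked, `git -C /opt/tacticalrmm ls-files --error-unmatch .env` should exit non-zero.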

6) Operating Model: Policies, Checks, Patching, Scripts

Structure

  • Tenants (Organizations) → Sites → Groups → Devices
  • Inheritable policies (Baseline, AV/EDR, Patching, Alert routing)

Baseline checks

  • CPU > 85% (5 min), RAM > 85%, disk < 15%, S.M.A.R.T., services up, Event Log (critical IDs), Windows Update status, AV/EDR active, last reboot, RDP state (per policy)

Patching

  • Patch Tuesday + X days per ring (crit/high → low)
  • Staggered reboots; pre/post scripts (DISM/SFC, WU reset when needed)
  • Exclude drivers unless support dictates otherwise
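
The "Patch Tuesday + X days per ring" rule translates into simple date arithmetic. A minimal sketch, assuming GNU date (for `-d`) and that Patch Tuesday is the second Tuesday of the month; the function names and the example offset are hypothetical.

```shell
#!/bin/sh
# patch_tuesday YEAR MONTH: print the second Tuesday of the month as YYYY-MM-DD.
patch_tuesday() {
  year=$1
  month=$(printf '%02d' "${2#0}")                  # accept "8" or "08"
  dow=$(date -u -d "${year}-${month}-01" +%u)      # 1=Mon .. 7=Sun
  offset=$(( (2 - dow + 7) % 7 ))                  # days from the 1st to the first Tuesday
  printf '%s-%s-%02d\n' "$year" "$month" $(( 8 + offset ))
}

# ring_date YEAR MONTH DAYS: Patch Tuesday plus DAYS days.
ring_date() {
  date -u -d "$(patch_tuesday "$1" "$2") + $3 days" +%F
}
```

For example, `ring_date 2024 1 7` gives the date for a ring that patches one week after January 2024's Patch Tuesday. Which offsets map to which ring is a policy decision, not something the script decides.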

Scripts

  • Windows: PowerShell (clean %TEMP%, DISM /RestoreHealth, sfc /scannow, WU reset, Chocolatey installs)
  • Linux: apt/yum updates, systemd health, journald rotation
  • macOS: brew, profiles (with sponsorship)
  • Scheduled runner (weekly/monthly)

Remote

  • Desktop (consent/UAC), real-time shell, files; session recording optional per compliance

7) Backups and DR

What to back up

  • PostgreSQL: pg_dump (daily) + weekly base backup (pg_basebackup if size grows)
  • Redis: RDB snapshot (hourly); include off-site
  • MeshCentral: meshcentral-data/ and config
  • Configs: docker-compose.yml, .env, secrets (in Vault), cron jobs

Where

  • S3-compatible (AWS/Wasabi/Backblaze) with private bucket, encryption-at-rest, lifecycle (7/30/90 days)
  • VM snapshots weekly (cloud provider)

Test restores (quarterly)

  • Fresh VM → compose → import PG/Redis → validate agent check-in
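  • Targets: Panel RTO < 60 min; PG RPO < 15 min (requires continuous WAL archiving/PITR on top of the base backup; daily pg_dump alone gives an RPO of up to 24 h)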
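
A sketch of the local side of the daily dump and its retention, assuming GNU find/touch. The helper names backup_path and prune_backups are hypothetical, as is the 7-day local retention (it mirrors the shortest lifecycle tier above).

```shell
#!/bin/sh
# backup_path DIR: dated dump file name inside DIR.
backup_path() {
  printf '%s/trmm_%s.dump\n' "$1" "$(date +%F)"
}

# prune_backups DIR DAYS: delete dumps in DIR last modified more than DAYS days ago.
prune_backups() {
  dir=$1
  days=$2
  find "$dir" -maxdepth 1 -type f -name 'trmm_*.dump' -mtime +"$days" -print -delete
}

# Example cron body (service, user, and database names are assumptions):
#   docker compose exec -T postgres pg_dump -Fc -U trmm_user trmm_db \
#     > "$(backup_path /opt/tacticalrmm/backup)"
#   prune_backups /opt/tacticalrmm/backup 7
```

Pruning only covers the local copy; off-site retention is better handled by the bucket lifecycle rules.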

8) Updates (Runbook)

  1. Identical staging (subset of endpoints)
  2. Maintenance window: snapshot + backups OK
  3. docker compose pull && docker compose up -d
  4. Validate Django migrations, Mesh and UI health
  5. Prune old containers, dangling images; document compose version in internal changelog
  6. Open change ticket with version diff and 24–48 h observation period
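
Steps 3 to 5 of the runbook can be wrapped in a script with a dry-run mode, so the exact commands can be attached to the change ticket before the window. A minimal sketch; run and update_stack are hypothetical names, and dry-run is the default here on purpose.

```shell
#!/bin/sh
# run CMD...: print the command in dry-run mode (default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Pull, recreate, show container state, then prune dangling images.
# The chain stops at the first failing step.
update_stack() {
  run docker compose pull &&
  run docker compose up -d &&
  run docker compose ps &&
  run docker image prune -f
}
```

`update_stack` prints the plan; `DRY_RUN=0 update_stack` executes it from the stack directory. Snapshots, backups, and validation of migrations remain manual gates around it.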

9) Observability (SRE-Light)

  • Host metrics: node_exporter → Prometheus → Grafana (CPU, RAM, FS, net)
  • PG metrics: postgres_exporter (TPS, bloat, locks, vacuum)
  • NATS metrics: varz endpoint → exporter
  • App: /api/ healthchecks, response times, task queues, error rate
  • Alerts: high API latency, HTTP 5xx, PG connections, disk < 15%, Let’s Encrypt cert expiry
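
The certificate-expiry alert can be approximated without a full monitoring stack. A minimal sketch, assuming GNU date and the openssl CLI; days_between and cert_days_left are hypothetical names, and any alert threshold is an assumption.

```shell
#!/bin/sh
# days_between START END: whole days from START to END (UTC).
days_between() {
  start=$(date -u -d "$1" +%s)
  end=$(date -u -d "$2" +%s)
  echo $(( (end - start) / 86400 ))
}

# cert_days_left HOST: days until the certificate served on HOST:443 expires.
cert_days_left() {
  host=$1
  # "openssl x509 -enddate" prints "notAfter=<date>"; keep the date part.
  not_after=$(echo | openssl s_client -servername "$host" -connect "${host}:443" 2>/dev/null |
    openssl x509 -noout -enddate | cut -d= -f2)
  days_between "$(date -u +%F)" "$not_after"
}
```

Usage: `[ "$(cert_days_left api.example.com)" -lt 14 ] && echo "renew soon"`. In a Prometheus setup the same signal would normally come from an exporter rather than a cron script.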

10) Endpoint Security

  • Windows: MSI via GPO/Intune (/qn), MST if required; code-signing (with sponsorship)
  • Linux: .deb/systemd; egress whitelist to api.:443
  • macOS: agent (sponsorship); MDM profiles for permissions/daemon
  • EDR/AV coexistence: exclude agent/Mesh paths if EDR requires it

11) AnyDesk/TV vs TRMM (at a glance)

Capability           | AnyDesk/TV (SaaS) | Tactical RMM (+ MeshCentral, self-hosted)
Remote control       | Yes               | Yes
Inventory/Monitoring | No/Limited        | Yes (integrated)
Patching             | No                | Yes (policies)
Scripts/Automation   | Limited           | Yes (PS/BAT/Python/NuShell/Deno)
Alerts               | No                | Yes (email/SMS/Webhook)
Per-endpoint cost    | Yes               | No
Data/Compliance      | Vendor’s SaaS     | Your infrastructure

Migration strategy: 2–4 weeks coexistence, validate remote/UAC, scripted legacy removal.


12) Known Issues and Fast Fixes

  • Let’s Encrypt rate limits: use wildcard DNS-01 or staging CA for bulk tests
  • UAC breaks remote: adjust MeshCentral (elevation), enable consent prompt
  • Agent build flagged by AV: code-sign MSI/EXE; temporary AV exceptions
  • PG growth: tune autovacuum on events/log tables; date partitioning if volume is high
  • High agent latency: check DNS/anycast, NATS keepalive, network path (MTR)

13) Ops FAQ

Can I use managed PG (RDS/Aurora/Citus)?
Yes. Reduces host blast radius but adds latency and cost. Enforce SSL, proper parameter groups, and backups.

How do I limit api. exposure?
Rate limit, WAF, optional mTLS, IP allow-lists for panel (not for agents), origin rules in proxy.

Is multi-tenant a thing?
Yes via Organizations/Sites. For hard isolation (data/controls), split instances and/or databases.

SSO?
Available via sponsorship (SAML/OIDC). Alternative: SSO on MeshCentral and mandatory 2FA in TRMM.


14) TL;DR Deployment (key commands)

# Host prep
apt update && apt -y upgrade
apt -y install git curl jq ufw
curl -fsSL https://get.docker.com | sh
# Add the invoking (non-root) user to the docker group; under sudo, $USER is root
usermod -aG docker "${SUDO_USER:-$USER}"

# Clone stack (per official docs) and fill .env
docker compose pull
docker compose up -d

# TLS and access
# -> https://rmm.example.com (create admin)
# -> https://api.example.com/api/ (200)

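# Backups (PG example; pg_dump runs inside the "postgres" Compose service, name assumed from §4.2)
docker compose exec -T postgres pg_dump -Fc -U trmm_user trmm_db > /opt/tacticalrmm/backup/trmm_$(date +%F).dump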

# Updates
docker compose pull && docker compose up -d

Closing

Tactical RMM won’t replace an orchestrator or a full ITSM platform; it will replace the typical “tool quilt” (AnyDesk/TV + scattered scripts + ad-hoc inventories + manual patching) with a coherent, self-hosted stack. For a systems team managing hundreds to low-thousands of endpoints, the TRMM + MeshCentral + solid ops practices combo delivers control, auditability, and predictable cost. The rest—hardening, runbooks, backups, and a culture of continuous improvement—is on us.

Source: Revista Cloud
