← Back to Case Studies Network Engineering

Enterprise Network Diagnostic Dashboard

Enterprise-grade network diagnostic platform with ping, traceroute, DNS, port scan, SNMP polling, CDP/LLDP discovery, ARP table collection, and integrated SSH terminal — supporting batch testing of 500+ targets with real-time WebSocket updates.

FastAPI Svelte Python asyncio xterm.js SNMP Paramiko

Problem

Network engineers troubleshoot across hundreds of devices using separate CLI tools — ping, traceroute, nslookup, nmap, snmpwalk, SSH — then manually correlate results across multiple terminal windows. A single incident investigation might require running the same tests against 50 devices, pasting output between windows to compare, and rebuilding context every time focus shifts.

Existing network management platforms (SolarWinds, PRTG) solve this at the enterprise level but require dedicated infrastructure, licensing budgets, and dedicated staff. There’s nothing in the middle: a fast, self-hosted diagnostic tool that a single engineer can deploy and actually use.

Solution

Built a full-stack diagnostic dashboard consolidating 8 test types into a single UI: ping, traceroute, DNS resolution, TCP port scan, SNMP polling, CDP/LLDP neighbor discovery, ARP table collection, and an in-browser SSH terminal. The platform supports batch testing up to 500 targets simultaneously with real-time progress updates, VRF-aware routing, macOS Keychain credential management, and PDF export of results for incident documentation.

Architecture

Svelte Frontend

    ├── Test Forms ────────────→ FastAPI REST API
    ├── Results View ◄──────── WebSocket (real-time push)
    └── SSH Terminal (xterm.js) ←→ FastAPI WS /ssh/{session}

                                   Paramiko SSH client
                                   PTY allocation

FastAPI Backend
    ├── Async test runners (asyncio gather)
    │       ├── ping / traceroute (subprocess)
    │       ├── DNS (dnspython)
    │       ├── Port scan (asyncio streams)
    │       ├── SNMP (pysnmp / easysnmp)
    │       └── CDP/LLDP / ARP (SNMP MIB walk)
    ├── SQLite (test history, credential profiles)
    └── Keychain integration (security CLI)

Batch tests use asyncio.gather() to run all targets in parallel. WebSocket connections push individual result updates as they complete rather than waiting for the full batch, so engineers see results streaming in for the first devices while later ones are still running.

Key Decisions

Async test execution throughout. A 500-target ping batch that runs sequentially would take 25+ minutes with a 3-second timeout per host. Running them with asyncio.gather() completes the same batch in under 30 seconds. Asyncio was the correct choice over threading for I/O-bound network tests.

Keychain over plaintext credential storage. SSH and SNMP credentials for production devices cannot live in a config file or a database column. The platform stores credentials in macOS Keychain via the security CLI, accessible by name in the UI. Credentials are retrieved at test execution time and never written to disk by the application.

WebSocket for real-time progress instead of polling. With 500 targets in flight, polling for results would either hammer the API or introduce visible lag. WebSocket push means the UI updates within milliseconds of each result arriving, giving a live progress feel even for large batches.

In-browser SSH via xterm.js + Paramiko. Opening a native SSH terminal requires switching to a separate application. The integrated xterm.js terminal keeps engineers in the same UI throughout an investigation — run a batch of pings, spot an anomaly, open SSH to the outlier device, all without context switching.

Results

  • 8 diagnostic tools unified in one UI: ping, traceroute, DNS, port scan, SNMP, CDP/LLDP, ARP, SSH terminal
  • Batch testing up to 500 targets with parallel async execution
  • Real-time result streaming via WebSocket — results appear as they complete, not after full batch
  • VRF-aware test routing for multi-VRF environments
  • macOS Keychain integration for secure credential storage (no plaintext credentials)
  • Full SSH terminal in-browser via xterm.js with PTY support
  • PDF export of test results for incident documentation and change management
  • SQLite test history for trend comparison across incidents

How This Scales

  • Multi-platform deployment — Keychain integration is macOS-specific. Abstracting the credential store to support Linux secret managers (libsecret) and Windows Credential Manager would allow deployment on jump hosts of any OS.
  • Scheduled test suites — Run saved test configurations on a schedule and alert on regressions (e.g., latency increase or port state change between runs).
  • SNMP trap receiver — Complement active polling with passive trap collection, correlating incoming traps against batch test results for faster root cause identification.
  • Team mode — Shared test history and concurrent SSH sessions across multiple engineers working the same incident, with session recording for post-incident review.

Tech Stack

  • Backend: Python, FastAPI, asyncio
  • SSH: Paramiko (PTY-enabled SSH sessions)
  • SNMP: pysnmp / easysnmp
  • DNS: dnspython
  • Frontend: Svelte, xterm.js
  • Database: SQLite (test history and profiles)
  • Credentials: macOS Keychain via security CLI
  • Real-time: WebSocket (FastAPI native)

Need something similar?

I've built this before. Let's talk about adapting it for your needs.