**DRAFT**

===== Introducing the Packet Node Monitoring Service =====

A real-time, open stream of what’s actually happening on the amateur packet network — built with and for the community.

Amateur packet is resilient by design — but our visibility into //why// a path is slow, //where// a route bounced, or //which// port is flapping has always lagged behind. This service makes the **live state** of the network observable in a consistent, open way, so sysops can troubleshoot faster and developers can build the tools we’ve been missing.

* Front end (live view): **https://node-api.packet.oarc.uk/**
* Source (service): **https://github.com/M0LTE/node-api/**
* **Network Map (live topology):** **https://node-api.packet.oarc.uk/network-map.html**
* There is a **public REST API** and an **MQTT feed over WebSockets**.
* Two node authors (G8PZT and G8BPQ) will provide config sections so node owners can just turn it on.

This **complements** the existing mapping API that many nodes already talk to. The mapping API tells you ''where is everything?''; this new monitoring service tells you ''what is it doing right now?'' — and one day the two may merge.

----

===== What the Service Does =====

In plain terms:

* Nodes send small telemetry updates to the service.
* The service normalises that data and republishes it:
  * via **REST API** (query current state)
  * via **MQTT over WebSockets** (subscribe to live events)
* The front end shows what the service is seeing right now.

You can currently see:

* **Reporting Nodes** – nodes that are actively sending telemetry right now.
* **Discovered / neighbour nodes** – nodes seen from routing/traffic even if they aren’t reporting yet.
* **Link Monitor** – real-time AX.25 link activity (links coming up and going down).
* **Circuit Monitor** – real-time Net/ROM Layer 4 circuit activity (sessions established/closed).
* **Network Map** – a live topology view that separates **RF links** from **Internet/unknown links**, shows how many nodes/links are active, and displays short-lived **traces** for recent traffic. This is effectively “what the network looks like right now” in map form.
* **System Metrics panel** (on the home page) – expandable sections for **Database Metrics**, **System Resources**, **Application Metrics**, and **.NET Runtime**, so operators can confirm the service is healthy.

**Important defaults:**

* **Xrouter (PZT):** telemetry for this service is **default-ON**.
* **BPQ (BPQ32):** telemetry for this service is **default-OFF** — you must enable it.

This means: if you run Xrouter on a recent build, you may already be sending data. If you run BPQ, you need to flip the switch.

----

===== Why This Matters =====

Packet is a shared resource. If we can all see the same, consistent network picture:

* **Troubleshooting gets faster.**
  * A vague “I can’t reach that node” becomes “I can see the link flap right now.”
  * You can watch a circuit attempt in the Circuit Monitor and see where it stops.
  * With the **Network Map**, you can also confirm whether the node you expect to be a hub is actually present and linked.
* **Flaky ports become obvious.**
  * If a port is going up and down, the live feed will show repeated events.
* **Routing oddities are easier to spot.**
  * If a route keeps bouncing between two nodes, you can see the traffic pattern change.
* **Developers can build tools on top.**
  * Because we expose REST **and** MQTT, no one has to scrape screens or reverse-engineer formats.
* **Service operators can prove the service is healthy.**
  * The **System Metrics** views provide operational observability for people running their own instance.
* **The whole network benefits** when more nodes report — your node’s data helps everyone, not just you.
The overall aim is simple: **get as many nodes as possible to send telemetry**, and **get as many developers as possible to build visualisations and tooling** on that open stream.

----

===== Current Capabilities (High-Level) =====

Without going into the internal code, this is what the service understands and exposes today:

* **Nodes**
  * Active/reporting nodes
  * Neighbour nodes seen from other nodes’ data
* **AX.25 Links**
  * Link up / link down
  * Who linked to whom
  * A live view of current links
* **Net/ROM L4 Circuits**
  * Circuit/session life-cycle
  * Which nodes are talking right now
* **Live Network Map**
  * Distinguishes RF vs Internet/unknown links
  * Shows current counts of nodes and active links
  * Displays recent traces (short-lived activity indicators)
* **Service / Instance Observability**
  * System, DB, app, and runtime metrics panels on the home page
* **Documented API**
  * A browsable OpenAPI/Scalar reference is linked from the front end
  * REST endpoints for programmatic access

All of this is published openly, and the code is MIT-licensed in the GitHub repo.

----

===== How to Turn It On (Sysops) =====

There are two main node stacks in play:

* **Xrouter (G8PZT)** – telemetry for this service is **on by default**.
* **BPQ / BPQ32 (G8BPQ)** – telemetry is **off by default**; you need to enable it.

The two authors will each provide a short configuration section. You can drop these straight into your node documentation or local wiki.

After you enable it:

- Go to **https://node-api.packet.oarc.uk/**
- Check that your callsign appears under ''Reporting Nodes''
- Open ''Link Monitor'' and ''Circuit Monitor''
- Open ''Network Map'' and confirm your node appears in the topology (if applicable)
- Establish a simple test link/circuit from your node
- Confirm it appears in the live view

If it doesn’t show up:

* check that your node can reach the service host/port,
* check any firewall/NAT rules,
* check you didn’t hit the rate-limiter (this shows on the front page).
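If you prefer to script the “am I listed?” check rather than eyeball the front end, something like the following sketch works against the REST API. The endpoint path (''/api/nodes'') and JSON field names here are **assumptions for illustration** — consult the OpenAPI/Scalar reference linked from the front end for the real paths and schema.

```python
# Sketch: check whether a callsign appears in the reporting-node list.
# NOTE: the endpoint path and the 'callsign' field name are assumptions;
# check the service's API docs for the real ones.
import json
import urllib.request

SERVICE = "https://node-api.packet.oarc.uk"

def callsign_reporting(nodes: list, callsign: str) -> bool:
    """True if `callsign` appears in a list of node records.

    SSIDs are ignored, so G8XYZ matches a record reporting G8XYZ-2.
    """
    base = callsign.upper().split("-")[0]
    return any(
        str(n.get("callsign", "")).upper().split("-")[0] == base
        for n in nodes
    )

def fetch_reporting_nodes() -> list:
    # Hypothetical endpoint -- replace with the real path from the API docs.
    with urllib.request.urlopen(f"{SERVICE}/api/nodes") as resp:
        return json.load(resp)
```

Call ''callsign_reporting(fetch_reporting_nodes(), "YOURCALL")'' after enabling telemetry; a persistent ''False'' points you back at the firewall/rate-limiter checklist above.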
----

===== Configuration: Xrouter (G8PZT) =====

**Status:** default-on

**(Paula’s section will go here.)**

Suggested content for Paula:

* Which Xrouter versions/builds have this telemetry already baked in
* What, if anything, the sysop needs to set (host, port, interval)
* How to confirm from Xrouter logs that it has sent an update
* How to confirm on **https://node-api.packet.oarc.uk/** that the node is listed
* Recommended reporting interval / not being too chatty
* Notes about also talking to the //mapping API// (this monitoring service is separate but complementary)

Example structure Paula could use:

* //Prerequisites//
* //Configuration block / commands//
* //Restart / reload instructions//
* //How to verify on the web front end (including the Network Map)//
* //How to back off reporting if bandwidth is tight//

(Replace this whole subsection with Paula’s real notes.)

----

===== Configuration: BPQ (G8BPQ) =====

**Status:** default-off — you must enable it.

**(John’s section will go here.)**

Suggested content for John:

* The minimal parameter(s) to add to BPQ to start sending telemetry
* How to verify BPQ has actually sent a packet to the monitoring service
* Notes for sysops behind NAT/firewalls (allow outgoing UDP to the service)
* Recommended reporting interval
* A quick “disable” / “rollback” line in case someone wants to stop sending
* How to check on **https://node-api.packet.oarc.uk/** that it worked
* How to confirm the node is now visible on **Network Map** (if link data is present)

Example structure John could use:

* //Add this to BPQ config://
* //Restart BPQ://
* //Check the front end://
* //Check Network Map://
* //If you don’t see it, check firewall://

(Replace this whole subsection with John’s real notes.)

----

===== Other Packet Systems & Future Clients =====

Right now the service has first-class, “known good” support for **Xrouter (G8PZT)** and **BPQ/BPQ32 (G8BPQ)**, because those are the two stacks whose authors have actively helped push telemetry out.
But the service is **not limited to those two.** If you run any of the following, you are very much invited to send data:

* **JNOS / JNOS2** (Linux, classic Net/ROM + BBS environments)
* **TheNet / TheNet-derived nodes** (older but still on air in some areas)
* **FlexNet-style nodes**
* **Linux-based packet stacks using kissattach/ax25d** with their own routing glue
* **Direwolf-based nodes** that are already doing APRS/AX.25 and have access to link/session info
* **Custom / Pi-based nodes** people have built for local RF + IP tunnelling

At the moment these systems are **not** sending telemetry to the monitoring service, simply because **nobody has written the small client for them yet.** The service itself is happy to accept the data — it just needs it in the expected format.

What we need from the wider packet community:

* someone with a **JNOS** system to add a lightweight exporter;
* someone who still runs **TheNet / FlexNet** to see what info is available from the node and map it to the telemetry fields (node ID, neighbours, links, circuits/sessions if present);
* people maintaining **modern Linux packet gateways** to add a tiny script/daemon that emits telemetry on a timer;
* anyone writing **Node-RED / Python / Go** tooling to publish events via MQTT-over-WS straight to the service.

In other words: **if your packet stack can tell you “who I am”, “who I’m connected to”, and “what sessions I have”, then it can probably send telemetry to this service.** The data is most welcome — the more diverse the reporting nodes, the better the global picture becomes.

----

===== For Developers =====

This isn’t just a pretty front end — it’s an **open data source** for you to build things on. You have **two** good entry points:

==== 1. REST API ====

* Good for: dashboards, back-end jobs, querying current state
* Described via OpenAPI/Scalar (see the link on the front end)
* Lets you ask things like:
  * “What nodes are reporting?”
  * “What links are active?”
  * “What circuits are active?”
  * “What nodes have been seen recently as neighbours?”
  * “What does the current network graph look like?”
* Easy to integrate from: Python, Go, Node.js, C#, Rust, even shell scripts

==== 2. MQTT over WebSockets ====

* Good for: anything live / event-driven
* Runs straight in the browser (no extra back-end needed)
* Also good for: Node-RED flows, small single-board computers, wallboard displays
* Subscribe to “link events”, “circuit events”, “node events” and react in real time

Because the MQTT endpoint is over **WebSockets**, you can build a full client-side app that connects directly to the live stream. That makes it very attractive for people who want to host a packet dashboard on a small web server or even locally.

**Tip:** the site’s own **Network Map** is proof that you can drive a topology view straight from the telemetry. Use it as inspiration, or replace it with a version tailored to your region.

----

===== Ideas to Build Today =====

* **Path watcher / path canary**
  * Periodically test a known-good path and compare with the live link/circuit feed
  * If the live feed does not show the expected circuit, raise an alert
* **Flap detector**
  * Subscribe to link events
  * If a link flaps N times in M minutes, send an alert to the sysop
* **Historical collector**
  * Ingest MQTT events into a TSDB (Prometheus, Influx, VictoriaMetrics, etc.)
  * Build Grafana dashboards for your local or regional packet segment
* **Regional views**
  * Filter the API by callsign prefix or by nodes that talk to a given hub
  * Show only “my corner” of the network to keep it readable

Everything above can be built **without** changing the service itself — that’s the point of making the data public.
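The flap-detector idea above amounts to a sliding-window counter per link. Here is a minimal sketch of that logic; the MQTT topic names, payload shape, and host/port in the commented wiring are **assumptions**, not the service’s documented interface.

```python
# Sketch: alert when a link generates more than `max_flaps` up/down
# events within `window_s` seconds.
import time
from collections import defaultdict, deque

class FlapDetector:
    def __init__(self, max_flaps: int = 4, window_s: float = 300.0):
        self.max_flaps = max_flaps
        self.window_s = window_s
        self._events = defaultdict(deque)  # link id -> event timestamps

    def record(self, link_id: str, when: float = None) -> bool:
        """Record one up/down event; return True if the link is flapping."""
        now = time.time() if when is None else when
        q = self._events[link_id]
        q.append(now)
        # Drop events that have fallen outside the sliding window.
        while q and now - q[0] > self.window_s:
            q.popleft()
        return len(q) > self.max_flaps

# Hypothetical wiring to the live feed with paho-mqtt over WebSockets
# (broker host/port and topic filter are assumptions -- see the API docs):
#
#   import paho.mqtt.client as mqtt
#   det = FlapDetector()
#   client = mqtt.Client(transport="websockets")
#   client.on_message = lambda c, u, msg: (
#       print(f"FLAPPING: {msg.topic}") if det.record(msg.topic) else None
#   )
#   client.connect("node-api.packet.oarc.uk", 443)
#   client.subscribe("links/#")
#   client.loop_forever()
```

The same window structure works for the path-canary and routing-oddity ideas: only the event source and the alert condition change.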
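For anyone tempted by the “small client” called for in the Other Packet Systems section: the core job is just “say who I am and who I’m connected to, on a timer”. The sketch below encodes that as JSON over UDP; the field names, transport, host, and port are all **assumptions for illustration** — the real wire format is defined by the node-api service (see the GitHub repo).

```python
# Sketch: minimal telemetry exporter for an arbitrary packet stack.
# NOTE: the JSON schema, the UDP transport, and SERVICE_HOST/PORT are
# hypothetical -- check the node-api repo for the expected format.
import json
import socket

SERVICE_HOST = "node-api.packet.oarc.uk"  # assumed
SERVICE_PORT = 9999                       # assumed

def build_report(callsign: str, neighbours: list) -> bytes:
    """Encode one telemetry report as a compact JSON datagram."""
    report = {
        "node": callsign.upper(),
        "neighbours": sorted(n.upper() for n in neighbours),
    }
    return json.dumps(report, separators=(",", ":")).encode("ascii")

def send_report(payload: bytes) -> None:
    # Fire-and-forget UDP, matching the "allow outgoing UDP" note in the
    # BPQ configuration section above.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(payload, (SERVICE_HOST, SERVICE_PORT))
```

Run ''send_report(build_report(...))'' from cron or a small daemon loop, feeding it whatever your stack (JNOS, TheNet, ax25d glue, etc.) can tell you about its neighbours.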
----

===== Relationship to the Mapping API =====

Many packet nodes already send to a **mapping API** that shows geography and topology. This new **monitoring service**:

* is **separate** from the mapping API,
* is designed to **complement** it,
* focuses on **live** connectivity (links and circuits),
* now has its **own network-map view**, driven by the monitoring data,
* may eventually **replace / subsume** parts of the mapping API so we have a single, cleaner source.

So, if you already have your node talking to the mapping API: **keep it on**. Adding this monitoring feed — especially now that there is a built-in network-map view — just makes the picture richer.

----

===== Safety, Rate-Limiting, and Fair Use =====

Because this is meant to be a **public community service**, it has some guard rails:

* **UDP rate-limiting** – if a node is too chatty or misconfigured, the service will protect itself
* **Blacklisting / lifecycle rules** – documented in the repo
* **Visible counters** – the front end shows you that these protections are working
* **Service metrics** – now visible on the site, so operators can tell whether problems are from the network or from the service
* **Open source** – the code is under MIT, so you can self-host or inspect behaviour

This is not about excluding people — it is about making sure a single bad sender cannot knock over the service that everyone else relies on.

----

===== What We Need From You =====

* **If you run Xrouter (PZT):** check you are on a version where this is on by default, make sure the target host/port is reachable, and confirm you appear on the front end and on the Network Map.
* **If you run BPQ (BPQ32):** upgrade, enable the telemetry in config (it is OFF by default), restart, and check the front end and Network Map.
* **If you write software / run Node-RED / like dashboards:** grab the OpenAPI spec, or subscribe to MQTT-over-WS, and build //something//.
* **If you spot gaps:** open issues or PRs on GitHub: https://github.com/M0LTE/node-api

The more nodes send data, the more accurate the live picture becomes — and the more useful it is to everyone when the network misbehaves.

----

===== Quick Links =====

* **Front end / live views:** https://node-api.packet.oarc.uk/
* **Network Map:** https://node-api.packet.oarc.uk/network-map.html
* **Source (MIT):** https://github.com/M0LTE/node-api
* **API docs:** linked from the front end

----

===== Thanks =====

* **John, G8BPQ** — for adding support from the BPQ side (default-off, but easy to enable)
* **Paula, G8PZT** — for making the Xrouter side default-on and for documenting it
* **Everyone who turns it on** — this only works if nodes actually report data

When your node is visible, **the whole packet network gets better.**