The Remote Server Far, Far Away
At the heart of any great system administrator or network engineer is a consistent devotion to reducing liability, along with a drive to learn, adapt, and find the path of least troubleshooting resistance. After all, a mental troubleshooting toolbox is just as valuable as a physical one. This article focuses on the mental one by applying a little critical thinking to how administrators and engineers approach remote server and device management. From here on, let’s ask the question, “Do I have at least three unique ways to remotely address an issue?”
Of course, before we begin, I will address the elephant in the room: compliance. Security should be top of mind when evaluating any network and system deployment or strategy. By abiding by regulatory compliance (PCI, HIPAA, etc.) you are ensuring the protection of your own paycheck. That said, acknowledge each regulation that governs your industry. Are there compliant ways of accomplishing similar tasks? If you modify five percent of your strategy, are you then able to incorporate methods that were initially labeled as prohibited?
The Remote Scenario
Let’s start with a physical server running one virtual machine. This server is located in a building several hours from your main location. The server is connected to a switch, which is connected to a firewall, which uses only one WAN connection under a static IP (to make life easy).
We know the goal is to establish at least three separate and reliable methods of being able to remotely administer the virtual machine. We begin.
Software is easily the most flexible method of establishing remote management. Establish a software-based VPN service, preferably two if possible. Begin with any established protocols your organization is using. Some software-based VPN clients require RSA or two-factor authentication, so check with your security team if you fall under their policies.
Consider These Remote Options
VPN Client 1 – OpenVPN is a common and excellent example of a client that can re-establish its connection on reboot and accept a wide variety of routing policies on a per-client basis. Keep this link always-on. This method connects to a server by DNS name or IP address, which means a direct tunnel. It is ideal for common remote tasks on the VM.
VPN Client 2 – Establishing an OpenVPN (or other primary) VPN client is great, but these connections are always vulnerable to disconnection and other volatility. Software is an easy way to add redundancy, so let’s add another. LogMeIn’s Hamachi service, for example, is a great secondary VPN option! For roughly $60/year, this client software can install as a service, run at boot, and establish a secure link using its own IP network. Having a second tunnel that is negotiated through a secure, cloud-based third party is very important, and you will likely use it from time to time.
Second VM – If you have the physical resources to do so, adding another VM on the host is a great way to remotely troubleshoot the mission-critical VM. Add all of the local tools you will need such as an SSH and FTP client, remote desktop manager, VNC client, third party firmware utility managers, etc. Consider this VM your on-site toolbox. Add VPN clients 1 and 2 to this toolbox VM as well!
Backup Physical Server – Add a second VM host, even if it is at reduced physical capacity. Keep this host idle. Spin up a toolbox VM on this host as well. If running Microsoft Hyper-V, perhaps skip this step and install VPN clients 1 and 2 right on the host OS. If space, energy, budget, or scale is an issue, consider simply adding a small thin workstation, Intel NUC, or other simple workstation that you can still use as a physical presence alongside the main VM host.
Managed PoE Switch – If there are devices that may benefit from a port bounce, especially Power over Ethernet (PoE) dependent ones, add a simple managed switch. Doing this allows you to dial into the switch and remotely disable/enable the port of any device that may be frozen, crashed, or performing abnormally. It also gives you another IP address to monitor, which helps determine at what point a fault occurs.
4G LTE Modem / Secondary WAN – The advice of adding a fail-over WAN may itself be either redundant or obvious, but often a second form of physical media is not possible. For that reason, Radio Frequency (RF) based communication may be your best friend. Establishing a Cradlepoint or Digi TransPort based modem at the head-end of the network will offer you more information and remote abilities than you might initially expect. These modems often have cloud-based remote management suites that allow administrators to console in and perform remote tasks. Those tasks could be as simple as pinging internal hosts or devices or opening an SSH terminal to them, offering you that one extra entry method! Just as important as the redundant WAN link they offer, these modems also support yet another VPN client interface!
Internet of Things (IoT) Switch/Relay – Often seen in data centers as metered Power Distribution Units (PDUs), a remotely controlled power outlet will always be necessary at some point. As with the managed network switch, bouncing outlets remotely can save hours and in some cases days of downtime. Some network-controlled power switching units even offer a “watchdog” service, where administrators define an IP address for the unit to ping constantly, and the unit power cycles an outlet (often a modem/router) if that ping times out. The unit itself also doubles as another IP address you can ping and monitor to narrow down the root cause of an outage.
4G LTE Console/Relay/Etc – At the end of the day, having a completely separate 4G LTE based relay or serial command based device may offer you that last connection option.
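The “watchdog” behavior mentioned for network-controlled power units can be illustrated with a small Python sketch. This is not any vendor's implementation. The function name, the `max_failures` threshold, and the `power_cycle` callback are all hypothetical, and real units expose this as a configuration setting rather than code. The sketch only shows the logic: count consecutive failed probes and bounce the outlet once a threshold is reached.

```python
from typing import Callable, Iterable


def watchdog(
    probe_results: Iterable[bool],
    power_cycle: Callable[[], None],
    max_failures: int = 3,
) -> int:
    """Call power_cycle() after max_failures consecutive failed probes.

    probe_results yields True when the monitored IP answered and False
    when the ping timed out. Returns how many power cycles were triggered.
    """
    consecutive_failures = 0
    cycles = 0
    for answered in probe_results:
        if answered:
            # Any successful probe resets the failure streak.
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= max_failures:
            power_cycle()  # hypothetical hook: bounce the outlet
            cycles += 1
            consecutive_failures = 0  # start a fresh streak after the bounce
    return cycles


# Example with recorded probe results instead of live pings.
events = []
triggered = watchdog(
    [True, False, False, False, True, False, False],
    power_cycle=lambda: events.append("cycle outlet"),
)
```

Requiring several consecutive failures, rather than reacting to a single lost ping, avoids power cycling a modem over a momentary blip.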
Each and every device in this topology offers you the ability to know more about the issue before packing up gear and heading out to the remote site. If, for example, you see (in the monitoring solution you are using to track all of this gear) that the mission-critical VM and its host are not responding while the firewall and switch still are, you know that the root issue is likely not tied to the primary WAN connection, firewall, or switch. The more devices accessible in this dashboard-level view, the better your troubleshooting powers are whilst remote and far, far away.
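The reasoning above, narrowing down a fault by seeing which devices still respond, can be sketched in a few lines of Python. This is a minimal illustration and not a replacement for a monitoring product. The device names are hypothetical, the path order (firewall, then switch, then host, then VM) is taken from the scenario described earlier, and the `ping_once` helper assumes a Linux-style `ping` command that accepts `-c` and `-W`.

```python
import subprocess
from typing import Dict, List, Optional


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Return True if one ICMP echo gets a reply.

    Assumption: a Linux-style ping binary is on the PATH.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_unreachable(path: List[str], reachable: Dict[str, bool]) -> Optional[str]:
    """Walk the path from the WAN edge inward and return the first device
    that does not respond, or None if every device answers."""
    for device in path:
        if not reachable.get(device, False):
            return device
    return None


# Hypothetical device names, ordered from the WAN edge inward.
PATH = ["firewall", "switch", "vm-host", "critical-vm"]

# Status as a dashboard might report it: host and VM are down, the rest are up.
status = {"firewall": True, "switch": True, "vm-host": False, "critical-vm": False}
suspect = first_unreachable(PATH, status)
```

With live checks, the `status` dictionary could be built by calling `ping_once` against each device's address. Here `suspect` points at the VM host, which matches the conclusion that the WAN link, firewall, and switch are probably not the cause.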
Here’s an overview of what this solution could look like. This, in combination with a great backup strategy, will help keep the remote site online well into five nines.
It’s important not only to build multiple paths to a remote system, but also to keep every method documented and top of mind when troubleshooting. Treat each troubleshooting step as both a means to resolution and a source of additional data to fuel your next move.
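One way to keep those paths documented is a small inventory that records what each access method depends on, so the “at least three ways” question can be checked against any single failure. The Python sketch below is illustrative only. The method names, component names, and dependency sets are assumptions made up for this example, so substitute the dependencies of your own site.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical inventory: each remote access method and the components
# it needs in order to work.
ACCESS_METHODS: Dict[str, FrozenSet[str]] = {
    "openvpn": frozenset({"primary_wan", "firewall", "switch", "vm_host"}),
    "hamachi": frozenset({"primary_wan", "firewall", "switch", "vm_host"}),
    "toolbox_on_backup_host": frozenset(
        {"primary_wan", "firewall", "switch", "backup_host"}
    ),
    "lte_modem_console": frozenset({"lte_modem", "switch"}),
    "lte_serial_relay": frozenset({"lte_relay"}),
}


def surviving_methods(
    methods: Dict[str, FrozenSet[str]], failed: Set[str]
) -> List[str]:
    """Return the access methods that do not depend on any failed component."""
    return sorted(name for name, deps in methods.items() if not (deps & failed))


def weak_points(methods: Dict[str, FrozenSet[str]], minimum: int = 3) -> List[str]:
    """Return components whose failure alone leaves fewer than `minimum`
    working access methods."""
    components = set().union(*methods.values())
    return sorted(
        c for c in components if len(surviving_methods(methods, {c})) < minimum
    )


after_wan_outage = surviving_methods(ACCESS_METHODS, {"primary_wan"})
needs_attention = weak_points(ACCESS_METHODS)
```

In this made-up inventory, losing the primary WAN leaves only the two LTE-based methods, and the switch turns out to be the weakest point because almost every method passes through it. That is the kind of gap worth finding on paper before an outage does it for you.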