Is 100% Uptime 100% Possible? The Realities of Network and Systems Administration

 

Computers play an integral role in the daily lives and productivity of employees across America. It only takes a slow browser to illustrate how heavily workers depend on the smooth operation of software applications, operating systems, Internet connections, servers, and computer hardware to perform job-related tasks. Because technology is such a crucial business investment, many companies employ expert IT professionals to avoid the costs associated with unexpected technology glitches. Who better to protect a company from cyber thieves, viruses, and hackers, and to prevent dreaded downtime, than a highly trained network or systems administrator?


Systems administrators spend the average workday maintaining software and hardware, providing technical support, backing up critical data, and testing system security. Simply put, they make life easy for the employees who rely on their ability to troubleshoot behind-the-scenes technical issues. When problems arise, however, a systems administrator's day is anything but average.


These technical experts must be prepared to resolve system threats in very complex technical environments at a moment's notice. While some problems cannot be foreseen, the most common technical challenges a network administrator faces fall into one or more of the following categories:

  • New computer viruses
  • Highly sophisticated hackers
  • Power outages 
  • Unplanned vendor downtime (Internet service provider, e-commerce host, etc.)
  • Program incompatibilities and glitches
  • Defective hardware

Thankfully, network and systems administrators are experts in planning for, and responding to, unexpected downtime. Experienced systems administrators and network engineers design, implement, and monitor plans that serve to protect an organization's IT infrastructure from potential disaster. A typical systems protection plan combines a detailed analysis of existing systems with data that supports projected growth, and includes:


  • Analysis of current and projected user requirements (software and hardware)
  • Analysis of current and projected network capacity demands
  • Policies that control user access permissions
  • Risk analysis that identifies the potential repercussions of each risk
  • Data backup system
  • System restore capabilities
  • Documentation of proposed, approved, and implemented changes
  • Specification and communication of scheduled changes to users
  • Methods for supporting users

While systems administrators design networks to maximize a system's performance and integrity, even the most seasoned professionals are unable to maintain uptime 100 percent of the time.
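To put that in perspective, an uptime target can be converted into a yearly downtime budget with simple arithmetic. The short Python sketch below is only an illustration: the function name `allowed_downtime_minutes` and the sample targets are assumptions made for this example, and it assumes a 365-day year with no allowance for scheduled maintenance.

```python
# Minutes in a 365-day year (assumption: leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime target.

    Hypothetical helper for illustration; not taken from any tool or standard.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


# Sample targets chosen for illustration only.
for target in (99.0, 99.9, 99.99, 100.0):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, even a 99.9 percent target leaves a budget of just under nine hours a year, while a 100 percent target leaves none at all, so a single power outage or vendor failure would break it.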


When unexpected downtime strikes, administrators must quickly shift their attention to troubleshooting the problem, giving priority to the critical networks that are affected. Most professionals take a systematic approach to troubleshooting, which may include the following steps:

  1. Review logs to detect all possible causes (hardware failure, user interaction, external event, etc.) and the approximate time the problem began
  2. Run diagnostic tools
  3. Note all facts related to the problem
  4. Diagnose the problem
  5. Isolate the problem
  6. Consult product manuals, online resources, and colleagues as needed for solutions
  7. Devise a list of potential solutions
  8. Attempt to duplicate the problem in a test environment
  9. Test the potential solutions in that same test environment
  10. Solve the problem and thoroughly document the solution reached
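The first step above, reviewing logs to estimate when a problem began, can be partly automated. The Python sketch below is a minimal example under stated assumptions: it assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp followed by a severity level, and the function name `first_error_time` and the sample log lines are invented for illustration. Real log formats vary widely, so the parsing would need to be adapted.

```python
from datetime import datetime
from typing import Iterable, Optional

# Severity levels treated as signs of trouble (an assumption for this sketch).
PROBLEM_LEVELS = {"ERROR", "CRITICAL"}


def first_error_time(lines: Iterable[str]) -> Optional[datetime]:
    """Return the earliest timestamp of an error-level line, or None if there is none."""
    earliest = None
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 3 or parts[2] not in PROBLEM_LEVELS:
            continue
        try:
            stamp = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # Skip lines whose timestamp does not match the assumed format.
        if earliest is None or stamp < earliest:
            earliest = stamp
    return earliest


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    "2024-01-15 09:40:12 INFO backup job finished",
    "2024-01-15 09:42:07 ERROR disk /dev/sdb not responding",
    "2024-01-15 09:42:31 CRITICAL database connection lost",
    "malformed line without a timestamp",
]

print(first_error_time(SAMPLE_LOG))
```

The timestamp this returns is only a starting point. An administrator would still read the entries just before it, since the root cause often appears earlier as a warning or an apparently routine event.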

When systems go down, the network administrator's main objective is to restore system access as soon as possible. The amount of time needed to resolve a technical problem can vary, but administrators stay focused on finding solutions under high-pressure circumstances, no matter how long it takes.


Companies that conduct business exclusively through computer networks and systems can take advantage of today's time- and money-saving technologies, but not without encountering occasional complications of varying magnitude. As long as technology factors into business practices, companies will continue to employ skilled tech professionals to maintain smooth operations and solve unexpected problems. In fact, network and systems administrators can look forward to a projected job growth rate of 28 percent over the next decade, so master troubleshooters in technology should enjoy steady employment for many years to come.


Tech professionals do the best they can to keep a system running smoothly, but maintaining 100 percent uptime is not a realistic expectation. If you're a network or systems administrator, what advice can you offer professionals just entering the field about preparing for unexpected disasters? What unique methods have you discovered for detecting the source of a downed network?


