How to debug and prevent flaky unreliable IT environment? [closed]

I am not an IT admin, I’m a software developer (microsoft stack) and I’m trying to understand what is wrong with the IT environment of one of our customers.

We have deployed our client\server solution to a medium sized business. The problem is, the customers IT environment (mostly various types of Microsoft servers – sql servers, SharePoint, lync, IIS servers, etc, etc) appears to be extremely chaotic and flakey. There constantly seems to be one system or another failing due to an admin having reconfigured something on a server that affects our software running on it. It is taking up lots of support time to keep going in and find an admin has changed some setting on a server that affects our solution, rather than anything directly to do with our software.

It not just our software either, it seems to be going on across all their systems and the admins seem to be constantly firefighting. No sooner are all the dominoes standing than someone changes something that knocks 1 down again…

I going to have a chat with their IT Manager but I’m not hugely knowledgeable about IT Admin practices.

What needs to be looked at or questioned?
In the IT Admin world is there any kind of best practice or process that can address this?
Other suggestions?


Generally reliability in IT is provided by a few different practices, namely:

  • Access control
  • Change management
  • Configuration management
  • Revision control
  • Secret Sauce

Access control is simply limiting who can make changes to critical/production systems. Change management is generally handled through access control and via a ticketing system. Requests must be approved by someone higher up before the change can be made. Configuration management is ensuring the consistency of systems by using an external tool to tightly control all of their configuration parameters. This is generally achieved by Group Policy or other tools like Puppet/Chef/etc. Revision control provides a history of the configuration.

The Secret Sauce is an IT team that knows what the hell it’s doing. All of the process and protocols in the world can’t make up for bad judgment and inexperienced/untalented engineers.

Source : Link , Question Author : abc123 , Answer Author : Joel E Salas

Leave a Comment