Troubleshooting Skills

RS_MCP Member Posts: 352
IT Pros,

All IT desktop or support engineers have their own way of troubleshooting hardware, software or network faults.

Could some of you please give me examples of the steps you would take to troubleshoot technical faults?

You can base a scenario on anything.

Thanks!

Raj

Comments

  • undomiel Member Posts: 2,818
    It depends on the problem. Some problems are self-evident as to the source; others may leave you wondering. In the latter cases I work from general to specific: narrow down the source to a group of components, software, systems, or whatever, and then start testing more specifically to get even closer to the problem. A good example is troubleshooting a mail flow issue. You first check whether the problem appears to be outside your site or inside it. If you find that it is inside, you start narrowing down whether it is your hub transport, edge transport, spam filter, or whatever else you have implemented in the pipeline. Just be constantly critical of your workflow to see where you have unnecessary steps, and then work to correct those.
    Jumping on the IT blogging band wagon -- http://www.jefferyland.com/
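The general-to-specific narrowing described in the comment above can be sketched as a small tree walk. This is only an illustration, not a real mail diagnostic: the `Area` class, the `narrow_down` function, and the stub probes are all hypothetical names, and in practice each probe would be whatever test you actually run against that component.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Area:
    """A broad area of the system and the more specific parts inside it."""
    name: str
    healthy: Callable[[], bool]  # probe: returns True when this area looks fine
    parts: List["Area"] = field(default_factory=list)


def narrow_down(area, path=()):
    """Descend from a broad area into the most specific failing part.

    Returns the tuple of names leading to the fault, or None if the area
    passes its own probe.
    """
    if area.healthy():
        return None
    path = path + (area.name,)
    for part in area.parts:
        found = narrow_down(part, path)
        if found is not None:
            return found
    # The area fails, but none of its known parts could be blamed.
    return path


# Stub probes standing in for real checks (assumptions for illustration).
ok = lambda: True
bad = lambda: False

mail_flow = Area("mail flow", bad, [
    Area("outside site", ok),
    Area("inside site", bad, [
        Area("edge transport", ok),
        Area("spam filter", bad),
        Area("hub transport", ok),
    ]),
])

print(narrow_down(mail_flow))  # ('mail flow', 'inside site', 'spam filter')
```

Returning the whole path rather than just the last name keeps a record of how the fault was narrowed, which is handy when you write it up afterwards.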
  • Forsaken_GA Member Posts: 4,024
    The first thing you need to do to troubleshoot a problem is to be able to reproduce it.

    This may seem like it's obvious on the tech side of things, but the customers seem to think otherwise.

    Once you can reliably reproduce it, you have a starting point. If you understand how the various technologies involved fit together, you'll be able to follow the flow and find the problem more quickly. Specific error messages are a godsend. For example, if someone sends me an error saying that a database connection from this user @ this host failed, I know it's a MySQL authentication issue, and I either just need to modify a configuration file, or update some permissions in the database.

    But in general, I start with the following - check network connectivity (if applicable). Check DNS (again, if applicable). Then I check the log files. 9 times out of 10, the log files will tell me what the problem is and I can go fix it.

    A lot of problems end up being really stupid stuff. Permission errors. Typos in a config file. So and so accidentally clobbered REALLYIMPORTANTSTUFF.xls and could you please restore it from backup?

    The nature of the problem dictates your approach to it. Some things you'll know immediately, and people will think you've got your Jedi going on. Other stuff makes you go 'wtf?' and requires you to do some digging in depth. Just make sure that when you have to get down and dirty, you still check the basic stuff first. Nothing sucks more than spending 5 hours on a problem just to figure out it was something you'd dismissed as unlikely in the first 5 minutes (or worse, having someone come by after you've been working on it for 5 hours and say 'hey could it be such and such?'... and they're right).

    One of the most important and yet often overlooked portions of troubleshooting is documentation. Once you've solved the problem, document it somewhere. It's really inefficient to have to solve the same problem more than once. I personally use a wiki. Once I've figured something out, it goes into the wiki. I do not link to web pages if I used them as a resource; I copy and paste the text. You never know when a web page will go down, and it would suck to come back six months later just to find out that your link is dead.
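The basic routine from the comment above (network connectivity, then DNS, then the log files) could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the function names, the default keyword list, and the 200-line tail are assumptions chosen for illustration, not anything the poster specified.

```python
import socket
from pathlib import Path


def check_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dns(hostname):
    """Return the resolved IPv4 address, or None if the name does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def scan_log(path, keywords=("error", "denied", "failed"), tail=200):
    """Return the lines among the last `tail` lines that contain a keyword."""
    lines = Path(path).read_text(errors="replace").splitlines()[-tail:]
    return [ln for ln in lines if any(k in ln.lower() for k in keywords)]


def triage(hostname, port, log_path):
    """Run the three basic checks in order and collect the results."""
    return {
        "network": check_tcp(hostname, port),
        "dns": check_dns(hostname),
        "log_hits": scan_log(log_path),
    }
```

A call such as `triage("db.example.internal", 3306, "/var/log/app.log")` (hypothetical host and path) returns a dictionary you can read top to bottom. If `network` is False and `dns` is None, name resolution is the likely culprit; if both look fine, `log_hits` is the place to start reading. Note that `scan_log` raises an exception when the log file is missing or unreadable, which is itself a useful clue about permissions.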