Monday, September 27, 2010

CCNP - TSHOOT - TSHOOT in the real world...

When studying for any certification exam it's very easy to end up concentrating on the goal of passing the exam instead of the actual goal which should be gaining knowledge and achieving a certain standard of expertise that can be practically utilised in the real world. The exam simply confirms that you have a certain level of knowledge. The point is to enable you to use it in the real world.

So here I am, studying for my TSHOOT exam, which will complete my CCNP, and in the week before the exam comes 2 real world examples of how studying for the TSHOOT exam has helped me investigate and resolve the issue in hand. The next 2 posts will outline the problem, methodology used and how the TSHOOT exam materials directly related to resolving the problems.

Trouble Ticket #1 - The customer contacted us to state that SiteA no longer had access to the internet via it's default route. 

 Following the TSHOOT troubleshooting methodology I gathered as much relevant information as I could from the client. I then reviewed the information, eliminated the possible causes until I was down to a small   number of possible issues, I then formed my hypothesis, tested it and reviewed the outcome.

For this particular problem the customer stated that following the closure of one office and the migration to a new site access to the internet via it's default route was no longer possible. The customer supplied me their default route - 10.1.1.1 and that it originally routed via the now closed office. I also had the new network diagram outlining the new office and the appropriate routes. The diagram was essentially as follows:

SiteA Firewall ---> R1at SiteA ---> 'The core network' at R2 ---> R3 at SiteB - (trunk link)--> SiteB L3 switch ---> SiteB's ASA firewall --> INTERNET

First job was to see where the routing began to fail. So on to R1 and check the interfaces were up (yes) and then I moved on to the routing table to check for a route for 10.1.1.1. This was a static route pointing to our core network (for this article we'll call all of it R2). I performed a traceoute on R1 to 10.1.1.1 and it failed at R2.

On R2 I did the same, checked that interfaces were up and then checked the routing, This is where the first issue occurred. The static route on R2 was still pointing to the closed office. So the change was made. Static route was now pointing to R3 at SiteB.

I performed a traceroute to the default route from R1 again and it still didn't work. This time the trace stopped at R3.

On to R3 the routing table revealed that there was no route for 10.1.1.1 at all. After referring to the customer they advised me that the traffic had to go via a specific VLAN on the L3 switch so on R3 we needed to create VLAN10 and allowed VLAN10 on the trunk link to the L3 switch at SiteB. Finally the route for 10.1.1.1 was set to go via the SVI for VLAN10.

With this created I tested the route to 10.1.1.1 from R3. It still didn't work? With the information at hand I was able to formulate my hypothesis.

Was VLAN10 a) created on the L3 switch? and b) allowed on the trunk link?

Next I checked the state of the VLANs. VLAN10 was in place on R3, and was allowed on the trunk link to the L3 switch. The trunk link was up and protocol was up. Everything that I had admin access to was now properly configured and had been tested however the traceroute was still not returning. I put this to the customer who inspected their L3 Switch. They reported that VLAN10 had not been created after all on their equipment.

Once the customer applied their configuration on their equipment the route from SiteA to the ASA firewall on the LAN of SiteB worked. It was only after the issue was investigated that it became apparent the customer had neglected to account for the routing required for SiteA when they planned the closure of their old office. As a result none of the required configuration had been prepared before hand and SiteA had been cut off once the old office closed.

Trouble Ticket #1 - Closed.

As you can see on this occasion I was able to use a 'Divide and Conquer' trouble shooting approach. Starting L3 and then checking L1 and L2 when the traceroute failed. On R3 I was able to narrow the issue to L2 and formulate my hypothesis which was then tested.




No comments:

Post a Comment