SQL Server 2012 : Troubleshooting Methodology and Practices - Data Collection

- How To Install Windows Server 2012 On VirtualBox
- How To Bypass Torrent Connection Blocking By Your ISP
- How To Install Actual Facebook App On Kindle Fire
1/1/2015 8:32:44 PM

When the problem is defined and well understood, and the success criteria have been agreed upon, the next step is to gather data. What data you should gather depends on the problem and what (if any) work has already been completed.

It is critical that decisions about remedial action are based on data. Decisions or recommendations without the foundation of empirical data are simply guesses or assumptions. Gathering data is an iterative process that may require several iterations before the complete problem is captured and a conclusive decision can be reached. As mentioned earlier, ensure that stakeholders understand that sufficient data will be required ahead of any recommendations and problem resolution. These stakeholders must also understand the value and long-term benefits of quantitative analysis and evidence-based decisions. Clearly explaining your methodology helps to win their confidence in the process and its outcome.

Since data collection is so important in identifying the root cause, the following section provides an approach around data collection and specific guidance to ensure the data collection objectives are met.

Focused Data Collection

Only rarely is complete fault information provided when an incident is escalated. The nature of databases means serverwide problems are more immediately obvious, and support teams are notified rapidly. High-impact database problems reach support teams by phone or walk-up much faster than automated alerts or formal support-ticket escalation. Typically, escalated support cases contain insufficient data to make any kind of decision, and further analysis is required before any kind of remediation can begin. In the early minutes (sometimes hours) of these incidents, information can be vague while the cause of the problem is unknown. At this point, the scope of the issue is often the complete solution, consisting of all components of the solution.

During this time, it can be useful to adopt a top-down approach to troubleshooting by starting with Performance Monitor (PerfMon). PerfMon has the advantage of being a serverwide diagnostics tool, and it can be useful in identifying or indemnifying the database tier as the root cause of the problem.

The divide-and-conquer method is especially useful when troubleshooting performance problems. This approach takes the complete end-to-end application environment and selects a midpoint between the client and the database server to determine whether the performance problem exists at that point. Based on the outcome, you can then focus on the problem half and iterate through it until the problem component is identified. This approach can work particularly well with the database tier, as calling a number of stored procedures to validate database functionality and responsiveness can be a useful junction in identifying or absolving SQL Server as the cause of the performance problem.

Understanding Data Gathering

Data gathering is a balance between collecting sufficient information to capture the problem and not collecting so much data that the collection process itself affects system performance or there is simply too much data to analyze efficiently.

Consider a scenario with an intermittent problem for which there is no clear pattern to occurrences. Black box tracing creates a server-side trace, writing trace data to a circular log file that contains data for a specified time period (e.g., 1 hour or 4 hours). If the problem event occurs again and the trace is stopped before the occurrence of the problem is overwritten in the log (this could be automated), the trace will contain the problem. You can also look a setting up in Extended Events to help with this.

If the problem can be reproduced it will be much easier to collect data and refine the scope rapidly. If it occurs in a predictable pattern, it is usually possible to restrict data collection to a short period of time and gather all necessary data.

Conversely, if the problem happens infrequently or without pattern, a different strategy is required. Often it isn’t possible to start data capture when the problem occurs because events occurring just before the problem starts may be important. Therefore, consider using a black box circular trace to enable a continuous lightweight trace that can be stopped when the problem occurs.

Tools and Utilities

The following list summarizes some of the most commonly used data collection tools and analysis utilities.

  • PerfMon — Performance Monitor (PerfMon) ships with Windows and can be used to gather information on server resources and services. It can track serverwide information such as CPU and memory usage, I/O statistics, and network activity. Several SQL Server-specific counters can be useful for various troubleshooting and monitoring scenarios.
  • Profiler — SQL Server Profiler can be used to capture statement-level information from within the database engine.
  • XEvents — Extended Events are a lightweight event-driven data-capture feature that can assist troubleshooting while minimizing the monitoring footprint.
  • PSSDiag — This is a wrapper around SQLDiag, PerfMon, and other add-ins. SQLDiag can do anything that PSSDiag can do, but it is not pre-packaged with all the add-ins that PSSDiag may be configured with. PSSDiag is usually configured by a Microsoft support engineer and sent to help troubleshoot a specific problem. It is specific to a certain version of SQL Server, and the add-ins are usually architecture specific (x86 or x64).
  • SQLDiag — SQLDiag ships with SQL Server. It can be used to gather basic environmental information such as the SQL Server Error Logs, Event Logs, and SQL Server configuration settings. It can also be used to capture time-synchronized Profiler and PerfMon information.
  • Event Logs (Application, System, and Security) — These logs are often useful, displaying which errors, warnings, and informational messages have occurred in the recent past.
  • Application logs — If the application instrumentation includes writing Error Log output or diagnostic information, these logs can be useful for identifying the cause of a problem.
  • User dumps — If you see an exception in the SQL Server Error Logs, you should also see a mini-dump file with the extension .mdmp. This can be used by Microsoft CSS to help determine why the exception occurred.
  • NetMon — This is a network sniffer that is used to look at data as it is sent over the network. It is often used to diagnose connectivity or Kerberos problems.
  • CMS — Central Management Server is a feature with SQL Server Management Studio and provides a method of storing your SQL Server registrations in a central database. It can be useful in a troubleshooting scenario because you don’t have to remember specific SQL Server instance names and passwords — they are already stored in CMS. In addition, you can execute commands against groups of CMS servers at once.
  • Management Data Warehouse — This SQL Server Management Studio tool is used for performance trending. You can use it to collect and consolidate various data over time, which you can then analyze to see how performance has changed.
  • Policy-Based Management (PBM) — PBM can be used to validate whether predetermined standards have been followed. Some policies can prevent certain actions from ever occurring.
Top 10
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Finding containers and lists in Visio (part 2) - Wireframes,Legends
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Finding containers and lists in Visio (part 1) - Swimlanes
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Formatting and sizing lists
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Adding shapes to lists
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Sizing containers
- Microsoft Access 2010 : Control Properties and Why to Use Them (part 3) - The Other Properties of a Control
- Microsoft Access 2010 : Control Properties and Why to Use Them (part 2) - The Data Properties of a Control
- Microsoft Access 2010 : Control Properties and Why to Use Them (part 1) - The Format Properties of a Control
- Microsoft Access 2010 : Form Properties and Why Should You Use Them - Working with the Properties Window
- Microsoft Visio 2013 : Using the Organization Chart Wizard with new data
- First look: Apple Watch

- 3 Tips for Maintaining Your Cell Phone Battery (part 1)

- 3 Tips for Maintaining Your Cell Phone Battery (part 2)
programming4us programming4us