Bangalore .Net User Group

Welcome to bdotnet.in
India's Largest and Most Active Microsoft User Group!!
Welcome to Bangalore .Net User Group Sign in | Join | Help
in Search

User Connections

Last post 10-15-2008 6:49 PM by Unsure. 6 replies.
Page 1 of 1 (7 items)
Sort Posts: Previous Next
  • 09-18-2008 7:56 AM

    • Unsure
    • Top 150 Contributor
    • Joined on 09-18-2008
    • Posts 5
    • Points 55

    User Connections

    Hello,

    We have a 2-node (Active/Passive) clustered environment. Running SQL Server 2000-SP4 on Windows 2003 SP2 server. Have 32GB of RAM. 30GB has been allocated to the SQL Server.

    At peaks hours (morning)  users receive timeout errors, cannot access the server. Gets to the point where you can't even refresh the 'Enterprise Manager - Current Process Activity' screen. I can execute SQL commands (eg. 'dbcc memorystatus') from Query Analyzer. 'sp_who' takes close to 5 minutes to display output.

    'dbcc memorystatus' shows around 12GB of free buffers. However, more than 2000 processes have lastwaittype as 'RESOURCE_SEMAPHORE'. Quite a few processes have lastwaittype as 'IO_COMPLETION'

     

    Each time this happens, 'User Connections' are between 300-500.  There is no blocking.

    I tried flushing the buffers but that doesn't help. A failover is required each time.

    Application uses Java and ASP.

    My question, is it that the resources  used by the client connections are not being released, so that is causing the problem? The developers are double checking to see that all the Java and ASP pages have  'close connection'.

    Some of the stats:

    Memory Grants Pending /Memory Grants Outstanding
    22 26
    19 194
    18 239
    18 238
    37 67
    28 233
    33 227

    Buffer Manager\Page life Expectancy
    6389
    6565
    6748
    6764
    6933
    7101
    7268
    7446
    7625
    7804
    7960

    Disk Que Length is around 7 and 10 for the 2 disks assigned the SQL Server.

    '% Processor Time' is at an average of 33%. Have 8 CPUs.

    Thank you. 

     


     

     

     

     

     

     

     

    • Post Points: 20
  • 09-18-2008 7:10 PM In reply to

    • vinodkumar
    • Top 25 Contributor
    • Joined on 09-20-2007
    • Bangalore
    • Posts 28
    • Points 420

    Re: User Connections

    The stats are a clear give-away. You CANNOT have a Disk Queue length of more than 2 and that itself suggests that you are potentially having a storage engine problem. I think you have some problem with the HDD used or in a RAID maybe having a bad controller or bad drivers or somthing on these lines.


    Vinod Kumar
    www.ExtremeExperts.com
    http://blogs.sqlxml.org/vinodkumar/
    • Post Points: 20
  • 09-19-2008 7:46 AM In reply to

    • Unsure
    • Top 150 Contributor
    • Joined on 09-18-2008
    • Posts 5
    • Points 55

    Re: User Connections

    Hi Vinod,

    Thank you for your reply.

    Today we had to failover twice. Once in the morning and later in the afternoon.  As mentioned earlier, I have flushed the buffers when the server sort-of freezes. That doesn't help and actually not neccessary as the SQL Server does have about 18GB of free memory at that point of time. Have tried killing connections, that too doesn't make a difference. We  re-booted the Application Servers. Did not help. Right now only a failover resolves the problem. Seems like the resources are not being released. Thought killing the spids should have helped.

    The following stats are from this morning before we had to do a failover. Do you mind taking a look at them.  The Disk Queue Length wasn't anywhere near the values it otherwise has (7, 10).

    There were 57 processes with lastwaittype = 'IO_COMPLETION' at the time the server was failed over. Also, 1894 processes with lastwaittype = 'RESOURCE_SEMAPHORE'.

                               Memory - Available Bytes
    -------------------------------------------------------
    9/18/08 7:46 AM            752MB
    9/18/08 7:49 AM            30GB    (Server Failover at 7:50AM)
    9/18/08 7:52 AM            38MB

     

                                    Memory/Pages/Sec
    --------------------------------------------------------------------------------
    9/18/08 7:43 AM            0.083327058
    9/18/08 7:46 AM           16.87095995
    9/18/08 7:49 AM           113.6970836

     

                                 \Processor(_Total)\% Processor Time - (Have 8 CPUs)
    ------------------------------------------------------------
    9/18/08 7:35 AM         53.84 
    9/18/08 7:37 AM        24.79 
    9/18/08 7:39 AM         2.41 
    9/18/08 7:41 AM         2.42 
    9/18/08 7:43 AM         2.94 
    9/18/08 7:45 AM         3.73  

     

                                                                 PhysicalDisk(1) PhysicalDisk(2)

                                                            Avg. Disk       Current Disk      Avg. Disk       Current Disk
                                                       Queue Length    Queue Length  Queue Length    Queue Length
                                                                                                                        

    ----------------------------------------------------- --------------------------------              -----------------------------------------
    9/18/08 7:45 AM                           0                         0                         0.002226667           0 
    9/18/08 7:46 AM                           0.000073             0                         0.002153333           0 
    9/18/08 7:47 AM                           0.004286667        0                         0.000086                0 
    9/18/08 7:48 AM                           0.030566667        0                         0.008506667           0 
    9/18/08 7:49 AM                           0.011612772        0                         0.048884304           0

     

     

                                    Memory                          Memory
                                    Grants                            Grants
                                   Outstanding                     Pending
    --------------------------------------------------------------------------------
    9/18/08 7:36 AM               15                            12
    9/18/08 7:39 AM               11                            120
    9/18/08 7:42 AM               10                            219
    9/18/08 7:45 AM               11                            239
    9/18/08 7:48 AM               11                            240

     

     

                                   User Connections
    --------------------------------------------------
    9/18/08 7:39 AM            180
    9/18/08 7:41 AM            232
    9/18/08 7:43 AM            298
    9/18/08 7:45 AM            338
    9/18/08 7:47 AM            405

     

                                        Buffer Manager\Page life expectancy
    ---------------------------------------------------------------------------------------
    99/18/08 7:44 AM                    16585
    9/18/08 7:46 AM                      16846
    9/18/08 7:48 AM                      17109
    9/18/08 7:52 AM                        2327

    Any advice will be appreciated.

    Thank you. 

     

    • Post Points: 20
  • 09-20-2008 5:41 AM In reply to

    • vinodkumar
    • Top 25 Contributor
    • Joined on 09-20-2007
    • Bangalore
    • Posts 28
    • Points 420

    Re: User Connections

    Dude - From the looks of this, as I mentioned before this looks like some sort of hardware failure. Check you HDD vendor / Server vendor to see if there are any hardware related issues (HDD/Memory). There is some bad sector or something wrong in connectivity or bad RAM module hanging around.

    Since you are also mentioning IO_Completion, this surely shows there is some sort of IO bottleneck happening in the system. Occurs while waiting for I/O operations to finish. This wait type generally represents non-data page I/Os. Data page I/O completion waits appear as PAGEIOLATCH_* waits but not the case as per your description. Identify disk bottlenecks by using Performance Counters, Profiler, sys.dm_io_virtual_file_stats and SHOWPLAN.

    Some hints:
    1. Adding additional IO bandwidth,
    2. Balancing IO across other drives
    3. Reducing IO with appropriate indexing
    4. Check for bad query plans

    How to narrow?
    See Disk performance counters:
    1. Disk sec/read
    2. Disk sec/write
    3. Disk queues
    See SQL Buffer Manager performance counters:
    1. Page Life Expectancy
    2. Checkpoint pages/sec
    3. Lazy writes/sec
    See SQL Access Methods for  correct indexing:
    1. Full Scans/sec
    2. Index seeks/sec
    See memory performance counter: Page faults/sec
    SQL Profiler can be used to identify which Transact-SQL statements do scan. Select the scans event class and events scan:started and scan:completed. Include the object Id data column. Save the profiler trace to a trace table, and then search for the scans event. The scan:completed event provides the associated IO so that you can also search for high reads, writes, and duration.
    Check SHOWPLAN for bad query plans

    Now the RESOURCE_SEMAPHORE can indicate memory bottlenecks too and can be interrelated sometimes. This occurs when a query memory request cannot be granted immediately because of other concurrent queries. High waits and wait times can indicate excessive number of concurrent queries or excessive memory request amount. Sometimes queries involving high amount og HASH joins can wait for the memory and stall. Check if you have proper indexing in place.

    To have a deeper look into this. Take a look at: Memory Grants Pending counters. Compare with Memory grants outstanding. If grants pending increases, you can do the following:
    1. add more memory to SQL Server
    2. add more physical memory to the box.

    This is the max that I could decifer ... Hope these help in getting closer to solving your problem. Do keep us posted on this.


    Vinod Kumar
    www.ExtremeExperts.com
    http://blogs.sqlxml.org/vinodkumar/
    • Post Points: 5
  • 09-25-2008 6:49 AM In reply to

    • vinodkumar
    • Top 25 Contributor
    • Joined on 09-20-2007
    • Bangalore
    • Posts 28
    • Points 420

    Re: User Connections

    Since we didnt hear back from you, I presume this problem has been resolved. And do post your actual resolution so that others can also benefit from your experience.


    Vinod Kumar
    www.ExtremeExperts.com
    http://blogs.sqlxml.org/vinodkumar/
    • Post Points: 20
  • 09-25-2008 9:34 PM In reply to

    • Unsure
    • Top 150 Contributor
    • Joined on 09-18-2008
    • Posts 5
    • Points 55

    Re: User Connections

    Far from that....the problem is still not resolved. A people problem right now that is why.

    I will definietly update this post with the resolution when we get this resolved.

    Thanks.

     

     

     

     

    • Post Points: 5
  • 10-15-2008 6:49 PM In reply to

    • Unsure
    • Top 150 Contributor
    • Joined on 09-18-2008
    • Posts 5
    • Points 55

    Re: User Connections

    An update to this issue. 

    After the developers went through each web page and ensured that connection to the SQL Server is being closed,  'User Connections'  have not increased or spiked to 300+.

    Since 'User Connections' have not increased, there have been no 'RESOURCE_SEMAPHORES' waittypes and so the server hasn't frozen.

    Thank you. 

     

     

     

    • Post Points: 5
Page 1 of 1 (7 items)
Powered by Community Server (Commercial Edition), by Telligent Systems