SA Developer .NET

Welcome to SA Developer .NET Sign in | Join | Help
in Search

Optimising Help

Last post 09-03-2008, 23:49 by GilaMonster. 12 replies.
Sort Posts: Previous Next
  •  09-03-2008, 10:26 14484

    Optimising Help

    My query below operates off a main table of +7 million records, after I trimmed several million off (with call dates before 2000).  The CallRecords table stores every phone call made from every extension in a nation wide org, and has no primary key!  It stores what number which extension called etc. 

    It has a clustered index on [Ext_ID+Date], non-clustereds on [Ext_ID+Date+CalledNumber], [Ext_ID], [Date].  I don't know if there is merit on adding one for 'Private'.  The query is supposed to get the number of times each extension has called a number in a non-private capacity.  My filters for the report will include at least ext_id and date, and possibly site, from the Extensions table.

    select exs.Extension, occs.DialedNumber, occs.Occurrences, exs.Name
    from
    (select Ext_ID, DialedNumber, Count(*) as Occurrences
        from CallRecords
        where Private = 0 and Date > '080701' and Date < '090000'
        group by Ext_ID, DialedNumber) occs
    inner join (select Ext_ID, Extension, Name from Extensions) exs on exs.Ext_ID = occs.Ext_ID

     


    Unscrambling Eggs: Decompiling ASP.NET
  •  09-03-2008, 12:41 14486 in reply to 14484

    Re: Optimising Help

    Hi Prof,

    Wouldn't it be easier to seperate the logic so that the first inner query ((select Ext_ID, DialedNumber, Count(*) as Occurrences
        from CallRecords
        where Private = 0 and Date > '080701' and Date < '090000'
        group by Ext_ID, DialedNumber) occs
    )

    is in a table function, taking parameters for the start and end date. And then just joining on the Extensions table in the 2nd join, joining it with the function.

    And maybe, for the new table function, use BETWEEN to compare the dates. I personally find it to be more cleaner.

    Hope this can help....

    The H................


    The Question is the Answer, and the Answer is the Question!
  •  09-03-2008, 13:38 14488 in reply to 14486

    Re: Optimising Help

  •  09-03-2008, 13:40 14489 in reply to 14488

    Re: Optimising Help

    ProfK:
    It's a 2000 database.

    What do you mean by that? You can use table functions in SQL 2000. I think they were included from 2000 onwards.

    Cheers,


    The Question is the Answer, and the Answer is the Question!
  •  09-03-2008, 14:34 14491 in reply to 14486

    Re: Optimising Help

    horatio:
    table function, taking parameters for the start and end date. And then just joining on the Extensions table in the 2nd join, joining it with the function.

    Table-valued functions (the multi-statement ones) have a nasty habit of causing more performance problems than they solve, especially if they're retuning more than a couple hundred rows. Behind the scenes they use a table variable with no indexes and no statistics. Optimiser always thinks there's only 1 row in a table variable. Put a few thousand in, join it to another table and the results usually aren't pretty.

    There's a long test on my blog (Views or functions) that show how well the multi-valued table-valued functions behave.

    The inline table valued functions are converted into subqueries during the parsing stage of a query's execution, and hence perform just about the same as an equivalent subquery.

    Prof: Can you post the table structure and the index defs please. Also, if possible the execution plan. On SQL 2000 run the query prefixed by
    SET SHOWPLAN_ALL ON
    GO

    Run to grid, then copy the whole result set into excel, zip and attach to a post please


    Gail Shaw - SQL In the Wild
    SQL Server MVP
    --
    Chaos, panic and disorder. My job here is done!
  •  09-03-2008, 14:48 14492 in reply to 14484

    Re: Optimising Help

    First observations:

    The index on Ext_ID is redundent (with both the cluster and the 1st noncluster) and can be removed. If Private is a bit, there's no point in indexing just that as it won't be used. Possibly in combination with other columns, yes.

    What's the distribution of values for Private?

    Date looks like it's not a datetime column. Correct?

    Are CalledNumber and DialedNumber  different columns, or is there an alias somewhere I've missed?

    Aprox how many rows would the subquery return if the group by wasn't there?


    Gail Shaw - SQL In the Wild
    SQL Server MVP
    --
    Chaos, panic and disorder. My job here is done!
  •  09-03-2008, 14:51 14493 in reply to 14491

    Re: Optimising Help

    GilaMonster:

    horatio:
    table function, taking parameters for the start and end date. And then just joining on the Extensions table in the 2nd join, joining it with the function.

    Table-valued functions (the multi-statement ones) have a nasty habit of causing more performance problems than they solve, especially if they're retuning more than a couple hundred rows. Behind the scenes they use a table variable with no indexes and no statistics. Optimiser always thinks there's only 1 row in a table variable. Put a few thousand in, join it to another table and the results usually aren't pretty.

    There's a long test on my blog (Views or functions) that show how well the multi-valued table-valued functions behave.

    The inline table valued functions are converted into subqueries during the parsing stage of a query's execution, and hence perform just about the same as an equivalent subquery.

    Hi Gail,

    Thanks for the advice. I'll definately keep that in mind.

    Regards,

    The H........................


    The Question is the Answer, and the Answer is the Question!
  •  09-03-2008, 15:25 14496 in reply to 14491

    Re: Optimising Help

    Attachment: QueryPlan.png
    GilaMonster:

    Prof: Can you post the table structure and the index defs please. Also, if possible the execution plan. On SQL 2000 run the query prefixed by

    Part 1 - Superfluous fields omitted for brevity.

     CREATE TABLE [dbo].[CallRecords](
        [Call_ID] [int] IDENTITY(1,1) NOT NULL,
        [Ext_ID] [int] NULL,
        [Date] [nvarchar](8) NULL,
        [DialedNumber] [nvarchar](20) NULL

    ) ON [PRIMARY]
    CREATE CLUSTERED INDEX [CallRecords1] ON [dbo].[CallRecords]
    (
        [Ext_ID] ASC,
        [Date] ASC
    )WITH FILLFACTOR = 90 ON [PRIMARY]
    CREATE NONCLUSTERED INDEX [CallRecords5] ON [dbo].[CallRecords]
    (
        [Ext_ID] ASC,
        [Date] ASC,
        [DialedNumber] ASC
    )WITH FILLFACTOR = 90 ON [PRIMARY]
    CREATE NONCLUSTERED INDEX [IX_CallRecords] ON [dbo].[CallRecords]
    (
        [Ext_ID] ASC
    )WITH FILLFACTOR = 90 ON [PRIMARY]
    CREATE NONCLUSTERED INDEX [IX_CallRecords_1] ON [dbo].[CallRecords]
    (
        [Date] ASC
    )WITH FILLFACTOR = 90 ON [PRIMARY]

     Part 2:  Forum only allows image files, so I renamed .zip top .png. Cool



    Unscrambling Eggs: Decompiling ASP.NET
  •  09-03-2008, 15:48 14497 in reply to 14492

    Re: Optimising Help

    GilaMonster:
    What's the distribution of values for Private?

    Only 7% private  .

    GilaMonster:
    Date looks like it's not a datetime column. Correct?

    Yes, it's an nvarchar(8) of which six are only ever used.  Getting the maximum date gave me something historic from [19]860403. Stick out tongue

    GilaMonster:
    Are CalledNumber and DialedNumber  different columns, or is there an alias somewhere I've missed?

    CalledNumber was of my deluded construction.

    GilaMonster:
    Aprox how many rows would the subquery return if the group by wasn't there?

    +206k - +104k with the group


    Unscrambling Eggs: Decompiling ASP.NET
  •  09-03-2008, 19:23 14503 in reply to 14484

    Re: Optimising Help

    This isn't an easy one to get a good index for...

    I would suggest, initially, try the following

    CallRecords (Private, Ext_ID, Date, DialedNumber)
    Extensions (Ex_ID, Extension, Name)
                      (Site, Ex_ID, Extension, Name)

    It should eliminate the need for the clustered scan, which I imagine is hurting a lot. Once you've implemented then, could you post a revised execution plan please?

    Other suggestions for this table are to drop the NC index on Ex_ID, and consider revising the cluster to just Date. The narrower cluster will make the NC indexes smaller and also it should reduce the rate of fragmentation of the index (which is probably at this point fairly high)

    Is Private a bit column or a tinyint/smallint/int ?

    You mentioned filters on the report. How are they implemented? Where clause on the outer query?


    Gail Shaw - SQL In the Wild
    SQL Server MVP
    --
    Chaos, panic and disorder. My job here is done!
  •  09-03-2008, 19:59 14504 in reply to 14503

    Re: Optimising Help

    Thanks a lot Gail, you've been a big help.  I normally would experiment a lot, but each experiment takes so long.

      I'll look into everything you suggested later, and then hope I can change the indexes on the prod database.  The SABS might have some tough DBA rulebooks.


    Unscrambling Eggs: Decompiling ASP.NET
  •  09-03-2008, 20:12 14506 in reply to 14504

    Re: Optimising Help

    Got a non-prod server that you can test them out on first?

    Gail Shaw - SQL In the Wild
    SQL Server MVP
    --
    Chaos, panic and disorder. My job here is done!
  •  09-03-2008, 23:49 14516 in reply to 14503

    Re: Optimising Help

    Just one more comment.

    If you're going to be adding a filter on Ex_ID to the query, then that index should be OK. If the filter on EX_ID is done in some other way (filter on temp table or the like) switch the order or Ex_ID and Date.


    Gail Shaw - SQL In the Wild
    SQL Server MVP
    --
    Chaos, panic and disorder. My job here is done!
View as RSS news feed in XML
Powered by Community Server (Commercial Edition), by Telligent Systems