Tuesday 4 September 2018

How to optimize a selection query with indexes

I am not an expert. I have the following query, which involves 10 tables: MainTable has 10 fields, the first being the primary key and the rest foreign keys to 9 lookup tables called TableE1 through TableE9.

The following query does a left outer join against each of those tables, and I want to optimize it with indexes.
I want to know how queries can be optimized with indexes. This query fetches 10 lacs (1 million) records in 36 seconds; how much time can we reduce?
MainTable contains 10 lacs (1 million) records, and TableE1–TableE9 each contain 5,000 records.
select M.RecID,
M.E1, E1.Descr as E1_D,
M.E2, E2.Descr as E2_D,
M.E3, E3.Descr as E3_D,
M.E4, E4.Descr as E4_D,
M.E5, E5.Descr as E5_D,
M.E6, E6.Descr as E6_D,
M.E7, E7.Descr as E7_D,
M.E8, E8.Descr as E8_D,
M.E9, E9.Descr as E9_D
from ((((((((tableMain M
    Left Outer Join TableE1 E1 ON (E1.RecID = M.E1) )
    Left Outer Join TableE2 E2 ON (E2.RecID = M.E2) )
    Left Outer Join TableE3 E3 ON (E3.RecID = M.E3) )
    Left Outer Join TableE4 E4 ON (E4.RecID = M.E4) )
    Left Outer Join TableE5 E5 ON (E5.RecID = M.E5) )
    Left Outer Join TableE6 E6 ON (E6.RecID = M.E6) )
    Left Outer Join TableE7 E7 ON (E7.RecID = M.E7) )
    Left Outer Join TableE8 E8 ON (E8.RecID = M.E8) )
    Left Outer Join TableE9 E9 ON (E9.RecID = M.E9)
Order by M.RecID


Indexes are probably not going to help this query very much, because the query has no filtering. You are retrieving a million records. How much of the time spent on the query is retrieving the values and how much is spent processing the query?
SQL Server has a good optimizer that will choose sophisticated algorithms for performing the joins. It is quite possible that the query will run reasonably well even with no indexes.
That said, an index on each of the "E" tables with both RecID and Descr could help the query: E1(RecID, Descr), E2(RecID, Descr), and so on. These are covering indexes: for this query, SQL Server could use them without having to read from the data pages. An index on RecID alone would not work as well, because the Descr data would still need to be looked up on the data pages.
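As a rough sketch, assuming the table and column names from the question, such a covering index could be created as follows and repeated for TableE2 through TableE9:

-- Covering index: both RecID and Descr are in the index, so the data pages are never touched.
create index IX_TableE1_RecID_Descr on TableE1 (RecID, Descr);

-- An equivalent option is to keep only RecID as the key and carry Descr in the index leaf:
-- create index IX_TableE1_RecID_Descr on TableE1 (RecID) include (Descr);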
Note that these indexes would be unnecessary (redundant?) if RecID is already the primary key and Descr is the only other column in the table.
EDIT:
This is too long for a comment (I think).
Here are some ideas for optimizing this query:
First, are all the rows necessary? For instance, can you just add a top 1000 to get what you need? A lot of time is spent just passing the rows back to the application. Consider putting them into a temporary table (select into). That will probably run much faster.
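A sketch of both ideas, abbreviated to the first lookup table (the remaining joins follow the same pattern as the original query):

select top (1000) M.RecID, M.E1, E1.Descr as E1_D
into #Staged        -- stage the rows on the server instead of streaming them all to the client
from tableMain M
    Left Outer Join TableE1 E1 ON (E1.RecID = M.E1)
Order by M.RecID;

select * from #Staged;   -- read the staged rows afterwards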
Second, how much time is the order by taking? Try running the query without the order by to see if that is dominating the time.
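For example, SET STATISTICS TIME reports CPU and elapsed time, so the query can be run once without the ORDER BY and once with it to compare (again abbreviated to the first join):

set statistics time on;

select M.RecID, M.E1, E1.Descr as E1_D
from tableMain M
    Left Outer Join TableE1 E1 ON (E1.RecID = M.E1);
-- run again with "Order by M.RecID" appended and compare the elapsed times

set statistics time off;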
Third, how long are the Descr fields? If they are very long, even a few thousand bytes per value could dominate the size of the result set. Note that "very long" here means many kilobytes, not a few hundred bytes.
Fourth, are the Descr fields varchar() or char() (or nvarchar() versus nchar())? char() and nchar() are very bad choices, because they are fixed-width and occupy a lot of space in the result set.
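A quick way to check both points, assuming the names from the question (repeat per lookup table):

-- Declared type and maximum length of the Descr column
select c.name, t.name as type_name, c.max_length
from sys.columns c
join sys.types t on t.user_type_id = c.user_type_id
where c.object_id = object_id('TableE1') and c.name = 'Descr';

-- Actual size in bytes of the stored values
select max(datalength(Descr)) as max_bytes, avg(datalength(Descr)) as avg_bytes
from TableE1;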
Fifth (this probably should be first), look at the execution plan. You have presented a pretty simple scenario, so I have assumed that the execution plan is a scan of the first table with index lookups into each of the others. If the plan doesn't look like this, then there may be opportunities for optimization.
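In SQL Server Management Studio you can turn on "Include Actual Execution Plan", or capture the plan and I/O counts from a script, roughly like this:

set statistics io on;
set statistics xml on;    -- returns the actual execution plan as XML alongside the results

-- ... run the original query here ...

set statistics xml off;
set statistics io off;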
EDIT II:
I will repeat: transferring hundreds of megabytes from the server to an application takes time, and 30-ish seconds isn't unreasonable. (The return set has 10 ids = 40 bytes plus the description fields, which are likely to be hundreds of bytes per record; at roughly 500 bytes per row, a million rows comes to around 500 MB.) The problem is the design of the layer between the database and the application, not the database performance.
