SQL Query Optimization Tips
The most important part and parcel of a Database management system is the SQL. It will differ from product to product. Given below are some tips on how to write a well optimized query and avoid performance bottlenecks.
Primary verification of the SQL.
1) The SQL query will need to be verified by two parameters
1. The SQL returns the appropriate results.
2. The SQL adequately addresses the issues of optimization and efficiency.
SQL processing.
When we write a query and process the same, we are carrying out other overhead operations to finally fetch the appropriate result. A typical query undergoes the process of (a) Parsing (b) Actual execution (c) and Fetch. When we tune/ optimize the SQL query, we are in fact finetuning the performance of one of these aspects. The result is faster time to generate the result and hence optimization.
Parsing includes
- Verification of query syntax
- Verification of objects
Execution includes
- The process of actual reads, writes in order to achieve the desired result.
Fetch includes
- Retrieval of the result rows and final display (such as ordered).
A very good method to counter the parsing time is to make extensive and judicious mix of Procedures, packages, views, functions and so on. Most RDBMS optimizers work on the principle of reducing the query complexity by arriving at the optimized way of execution. With increase in query complexity, it is found that the optimizer too takes a longer time to process the query.
Certain tips to work towards an optimized query.
1) Practically it is always feasible to tune the query after initial verification of the validity and working of the query.
2) Ensure that the SQL statements are written in a uniform and identical manner throughout to aid re-use. The added advantage is the avoidance of re-parsing when the optimizer encounters a similar query.
3) When working with large tables, avoid the usage of SELECT * to increase efficiency. Make sure to use the necessary column names and query the table for these columns only.
1. E.g. IF there exists a table called EMPLOYEE, with around 36 fields, it is better to use the relevant columns in the SELECT clause instead of using SELECT *
SELECT emp_id, emp_name, emp_bonus
FROM employee
is faster than using
SELECT *
FROM employee
4) Make judicious usage of the GROUP BY and ORDER BY. Verify if we can have alternate solutions instead of using GROUP BY and ORDER BY.
5) Desist from making use of HAVING clause with the GROUP BY all the time. Use it only if it is unavoidable.
SELECT empname
FROM employee
WHERE empname = ‘Abhilash’
Is better than
SELECT empName
FROM employee
GROUP BY empname
HAVING empname = ‘Abhilash’
6) Let us compare the following two queries.
SELECT emp_ID, empname, salary
WHERE salary > 0
And
SELECT emp_ID, empname, salary
WHERE salary != 0
The query written as
SELECT emp_ID, empname, salary
WHERE salary > 0 is more optimized than the other and will return a low cost.
7) Usage of an appropriate join will be more advantages than using a sub-query due to parsing constraints.
8) Avoid queries with full table scan unless it is warranted for.
9) Judiciously use the Table JOINS, IN and EXISTS.
10) Using the IN in a subquery is faster than using other constructs
11) Whenever possible use a non-column expression. A non-column expression indicates having the FIELD COLUMN on one side of the equation/ operator and the values on the other side.
E.g
A condition to verify for the electric charge and its slab_rate can be written in the following form.
Using the expression
WHERE Elec_Charge < 750/(1 + slab_rate)
Will be always faster than using
WHERE Elec_Charge + (slab_rate * Elec_Charge) < 750
12) Using the IN with a subquery holds good while using EXISTS within the parent query works better.
13) Using IN within the parent query tends to be slow.
14) Using image/ objects within the table tends to slow down the query. Instead use the underlying file system to save these objects and refer to these objects by using a pointer stored in the database table.
15) Avoid having full table scans for very large tables.
16) Verify proper usage of the indexes in order to derive the best out of the database.
17) Using EXISTS instead of IN definitely works faster.
E.g.
SELECT title
FROM titles
WHERE EXISTS
(SELECT *
FROM publishers
WHERE pub_id = titles.pub_id
AND city LIKE ‘B%’)
is faster than using the query with the IN
SELECT title
FROM titles
WHERE pub_id IN
(SELECT pub_id
FROM publishers
WHERE city LIKE 'B%')
18) In cases when we encounter a NOT IN, try writing the same query using an OUTER JOIN. Definitely helps.
E.g.
SELECT Order_id from Order
WHERE Order_id NOT IN (SELECT Order_id from Sales)
Can be rewritten using
SELECT o.Order_id from Order o, sales s
WHERE o.Order_Id = s.Order_ID(+)
19) Avoid the usage of LIKE. Instead use the equality operation, which works faster.
Hope this holds good for you. Get back with your comments and suggestions. Thank you.
Hey there!!! I am really very eccentric as my blogname suggests. Wanna get a dose of eccentricity. C'mon. Welcome to the eccentric place of Abhilash. | Welcome to the Eccentric Place. | Such a lovely place, such a lovely place. | Plenty of room at this eccentric place. | You can check out anytime you like but you can never leave. | A place for Software Testing basics, Health Tips and Weight Training tips.
Subscribe to:
Post Comments (Atom)
Calorie Calculator
Calculate how much you expend in 1 hour of your favorite exercise.
Health Tips.
No comments:
Post a Comment
Drop in your comments/ feedback