When we develop, an important part is to know how to interact properly with our database. However, we often only look at the configuration parameters of the database and the power of the server where it runs. Although this is important, it is more important to know how to use the SQL language correctly to interact with the database. For that reason, in this post, we will see a series of tips when making SQL queries that will increase our database’s performance.
The examples can be applied to different relational databases.
1. Analyze the Execution Plan
When we execute a SQL statement, our database engine first analyzes that there are no syntax errors. Then, it determines the optimal way to execute the statement; that is, it creates its execution plan.
We can use this SQL execution plan to analyze how the SQL statement is executed and search performance issues.
The way to do it is straightforward. For example, in Oracle, we can use the following command:
The first number in the cost tuple(36.04..37.05) is the startup cost, and the second is the total cost. Total cost is the whole execution time of the component, from reading in data to writing its output.
The previous results are estimations because I’ve used “EXPLAIN.” If we want the real measures, we can use instead “EXPLAIN ANALYZE.”
We can check which part of our SQL statement needs more optimization by analyzing the results.
2. Avoid Coding Loops
Using loops to make inserts in a table is a bad idea because we don’t let the database engine optimize the inserts. I have often seen loops in the code to insert hundreds of rows instead of using other more optimized ways.
Using a for loop:
A correlated subquery is a query that uses values from the parent query. Unless the SQL engine optimizer rewrites the correlated subquery with a JOIN, this kind of SQL query runs once for each row returned by the outer query and decreases the overall SQL query performance.
Table Partitioning in SQL is the process of dividing large data tables into small manageable tables called partitions. With this technique, we can improve the query performance and makes it easier to manage and query the data.
We have mainly two types of Partitioning:
Vertical Partitioning is split tables vertically and is done on columns. Thus, one set of columns goes into one data store and another into a different one.
Horizontal Partitioning is split tables horizontally and is done on rows. One set of rows goes into one data store, and another goes into a different data store.
Table partitioning is commonly used with tables containing historical data. For example, we could partition horizontally by year in these cases, which would provide quick access to the table with the current year’s data, usually the most consulted data.
5. Check Indexes
The indexes, if used well, are one of the things that will bring more speed to our iterations with the database.
It is recommended that we create indexes on the fields used in our WHERE and JOINS clauses and, if possible, create indexes on columns with number values rather than character values.
6. Drop unused Indexes
Creating indexes for everything without worrying about it can make our table space run out of space, especially if the table on which we are creating the index is enormous. For this reason, it is advisable to clean up from time to time and see if all the indexes we have are necessary.
7. Choose appropriate Data Type
Not all the types occupy the same, and when we use a concrete data type, we can also limit its size according to what we store. For example, VARCHAR(4000) is not the same as VARCHAR(40). We always have to adjust to what we will store in our fields not to occupy unnecessary space in our database.
Keep transactions small since transactions lock the processing tables data and may result in deadlocks.
9. Use EXISTS-NO EXISTS instead of IN-NOT IN
If we want to confirm the existence of a value in a particular table, the recommended way is to use the EXIST clause.
That is because the process that executes the EXISTS clause stops as soon as it finds the required value in the table. But, on the other hand, a query with the IN clause scans all even after finding the necessary value.
10. Use the LIKE operator only if necessary
To search a string, we can use either the LIKE operator or the “=” operator.
The main difference between the LIKE operator and the “=” operator is that we can search for partial strings using wildcards with the LIKE operator. In contrast, with the “=” operator, we search for exact matches.
If we perform an exact search, it is always better to use the “=” operator because it uses indexed columns and is faster.
11. Only Select Fields You Need
The SELECT * query returns records from all columns of the table. While helpful in some cases, in many other cases, it is not necessary.
Some of the problems of bringing us all the columns are:
- Bandwidth overconsumption.
- Columns can be added or deleted, or column names in a table can be modified. Therefore, we might get unexpected results for the SELECT *
- The order of columns returned by a SELECT * could be different.
12. Use joins instead of sub-queries
A Subquery is a SELECT statement that is embedded in a clause of another SQL statement. It’s easier to write, but the JOIN’s are usually better-optimized for the database engines.
In some cases, using sub-queries can be better than using JOIN’s. For example, when joining with large tables. In those cases, the preferred way is to do a sub-select from that large table (limiting the number of rows) and then join them.
SQL (Structured Query Language) is a powerful domain-specific language that allows us to manage and retrieve information from relational databases easily.
The problem is that if you do not know how to use it correctly, the performance of your database will drop significantly. That is why it is essential to consider the basic recommendations that I have written in this small post and, as a good practice, to use a good style guide when writing the SQLs.
There are many style guides for SQL, but choosing one and using it in all our developments is important.
And finally, let’s not forget that the first thing we have to do is understand our data, its type, and how it relates to each other to make the best decisions when interacting with our database.