Advance SQL Pt. 1
If you don’t know the advance SQL, then you write more code to adjust the result way you want and/or more SQL queries to fetch further additional or nested data. Knowing advance SQL meaning re-arranging your business logic from coding side to the SQL side.
I’m planning to write the series of almost independent articles on advance SQL under the same name — Advance SQL and this is the first article. For the purpose of demonstration, I’m going to use PostgreSQL. So, all the queries are guaranteed to work against this database. On the other side, it might possible and chances are high that the queries wouldn’t work on other databases. Make sure to adjust or take help from someone if you are using different database.
I’m using PostgreSQL v14.12
. You can confirm the version on your system by running following query.
psql --version
psql (PostgreSQL) 14.12 (Ubuntu 14.12-0ubuntu0.22.04.1)
Let’s create a new database with name advance
and connect to this database. This will be the playground for us to learn advance SQL.
postgres=# create database advance;
postgres=# \c advance
Go ahead and create contacts
table.
create table contacts (
id serial primary key,
first varchar(255) not null,
last varchar(255) not null
);
Running \dt
command should return list of all tables.
advance=# \dt
List of relations
Schema | Name | Type | Owner
--------+----------+-------+----------
public | contacts | table | postgres
(1 row)
Let’s add some entries in it.
insert into contacts (first, last) values ('Jane', 'Doe');
insert into contacts (first, last) values ('Merry', 'Doe');
insert into contacts (first, last) values ('Kai', 'Doe');
To fetch all the data we can write following query.
table contacts;
This query equivalent to following.
SELECT c.* FROM contacts AS c;
SQL always use alias internally using AS
keyword.
Prepend explain
keyword before the query and it will give you the query plan.
explain table contacts;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on contacts (cost=0.00..10.70 rows=70 width=1036)
(1 row)
explain SELECT c.* FROM contacts AS c;
QUERY PLAN
---------------------------------------------------------------
Seq Scan on contacts c (cost=0.00..10.70 rows=70 width=1036)
(1 row)
explain
keyword wouldn’t execute the query. It only provide the possible plan that query might take to execute.
If you add analyze
after explain
keyword, it’ll execute the query and provide you the actual plan/time as well.
explain analyze table contacts;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on contacts (cost=0.00..10.70 rows=70 width=1036) (actual time=0.032..0.038 rows=3 loops=1)
Planning Time: 0.169 ms
Execution Time: 0.080 ms
(3 rows)
explain analyze SELECT c.* FROM contacts AS c;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Seq Scan on contacts c (cost=0.00..10.70 rows=70 width=1036) (actual time=0.023..0.027 rows=3 loops=1)
Planning Time: 0.123 ms
Execution Time: 0.057 ms
(3 rows)
Here, Seq Scan
means Sequential Scan meaning PostgreSQL scan all the rows one-by-one from the table to fetch the result you’ve asked for! There are few other types of scan available in PostgreSQL and it use based on the query. For example, let’s fetch the contact details by id = 1
.
explain analyze select * from contacts where id = 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using contacts_pkey on contacts (cost=0.14..8.16 rows=1 width=1036) (actual time=0.033..0.038 rows=1 loops=1)
Index Cond: (id = 1)
Planning Time: 0.180 ms
Execution Time: 0.083 ms
(4 rows)
As id
is primary key and index has been generated for it, PostgreSQL did the Index Scan
in this query. But, if we try to fetch the contact details by first = ‘Jane’
then it wouldn’t be the Index Scan as we do not have the index for the first
column.
explain analyze select * from contacts where first = 'Jane';
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on contacts (cost=0.00..10.88 rows=1 width=1036) (actual time=0.023..0.027 rows=1 loops=1)
Filter: ((first)::text = 'Jane'::text)
Rows Removed by Filter: 2
Planning Time: 0.117 ms
Execution Time: 0.054 ms
(5 rows)
You wouldn’t notice any big difference in these queries but these are enough to get started!