The parent – child relationship is a popular concept which is being used nowadays in many applications (eCommerce, blogs, file manager …). There are many pattern to design our Database for storing the relationship of the nodes, such as
- Using a “parent_id” column in the same table
- The “Materialised path pattern” – Stores a path from the root to the parent for every node / row in DB
Either pattern has the pros & cons in storing and solving the problems, such as:
- Problem #1: Get all parents node of a child nodes
- Problem #2: Get all child nodes of a parent node
To solve these problems, we can do at both application layer (Backend) or Database layer, both will have pros & cons, but for me I think using SQL query for this task is much better in term of cost / performance than using recursive loops at application layer
Application layer
Using recursive loop
Pros:
- The code is easier to implement & understand, as usually developer familiar with their own programming language (PHP, Ruby, JS …)
Cons:
- Need to send multiple queries to Database, thus increase traffics to DB
- Not efficiency about cost / performance due to the payload need to exchange between app & DB
- Slower performance compare with using SQL query
Database layer
Using recursive query
Pros:
- We need to run only 1 query, the DB engine will take care the rest of the logic
- Much efficiency about cost / performance because there is no need for multiple queries
- Faster performance
Cons:
- SQL query is harder to understand, compare with writing code in our daily programming language
In this post, I would like to share an SQL query which help us solving the problem #1: “Get all parents node of a child nodes”, with the Database design using “parent_id” in same table.
Let’s prepare some data for testing
# -------------------------------------------------------
# 1. Create a "tree_node" table with "parent_id" field
# -------------------------------------------------------
create table tree_nodes
(
id serial
constraint tree_nodes_pk
primary key,
parent_id integer,
name varchar
);
# -------------------------------------------------------
# 2. Insert some records into our table
# -------------------------------------------------------
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (1, null, 'Root');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (2, 1, 'Child LV_1 - 1');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (3, 1, 'Child LV_1 - 2');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (4, 2, 'Child LV_2 - 1');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (5, 2, 'Child LV_2 - 2');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (6, 2, 'Child LV_2 - 3');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (7, 3, 'Child LV_2 - 4');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (8, 3, 'Child LV_2 - 5');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (9, 3, 'Child LV_2 - 6');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (10, 4, 'Child LV_3 - 1');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (11, 4, 'Child LV_3 - 2');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (12, 4, 'Child LV_3 - 3');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (13, 4, 'Child LV_3 - 4');
Here is the SQL to “get all parent nodes” of every nodes
WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id,
tn.name,
tn.parent_id AS parent_id,
1::INT AS depth,
tn.id::TEXT AS path
FROM public.tree_nodes AS tn
WHERE tn.parent_id IS NULL
UNION ALL
SELECT tn.id,
tn.name,
tn.parent_id as parent_id,
p.depth + 1 AS depth,
(p.path || '->' || tn.id::TEXT)
FROM nodes_cte AS p,
public.tree_nodes AS tn
WHERE tn.parent_id = p.id
)
SELECT *
FROM nodes_cte AS n
ORDER BY n.id ASC;
You can copy / paste and customize the SQL to fit your needs, I will leave that task to you, readers.
Happy coding & thanks for reading !