PostgreSQL – How to get all parent nodes in tree using recursive SQL query

Bring me SQL …. Argggg !!!!!

The parent – child relationship is a popular concept which is being used nowadays in many applications (eCommerce, blogs, file manager …). There are many pattern to design our Database for storing the relationship of the nodes, such as

  1. Using a “parent_id” column in the same table
  2. The “Materialised path pattern” – Stores a path from the root to the parent for every node / row in DB

Either pattern has the pros & cons in storing and solving the problems, such as:

  • Problem #1: Get all parents node of a child nodes
  • Problem #2: Get all child nodes of a parent node

To solve these problems, we can do at both application layer (Backend) or Database layer, both will have pros & cons, but for me I think using SQL query for this task is much better in term of cost / performance than using recursive loops at application layer

Application layer
Using recursive loop

Pros:

  • The code is easier to implement & understand, as usually developer familiar with their own programming language (PHP, Ruby, JS …)


Cons:

  • Need to send multiple queries to Database, thus increase traffics to DB
  • Not efficiency about cost / performance due to the payload need to exchange between app & DB
  • Slower performance compare with using SQL query

Database layer
Using recursive query

Pros:

  • We need to run only 1 query, the DB engine will take care the rest of the logic
  • Much efficiency about cost / performance because there is no need for multiple queries
  • Faster performance

Cons:

  • SQL query is harder to understand, compare with writing code in our daily programming language

In this post, I would like to share an SQL query which help us solving the problem #1: “Get all parents node of a child nodes”, with the Database design using “parent_id” in same table.

Just an image to describe what we want to achieve

Let’s prepare some data for testing

# -------------------------------------------------------
# 1. Create a "tree_node" table with "parent_id" field
# -------------------------------------------------------
create table tree_nodes
(
    id        serial
        constraint tree_nodes_pk
            primary key,
    parent_id integer,
    name      varchar
);

# -------------------------------------------------------
# 2. Insert some records into our table
# -------------------------------------------------------
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (1, null, 'Root');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (2, 1, 'Child LV_1 - 1');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (3, 1, 'Child LV_1 - 2');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (4, 2, 'Child LV_2 - 1');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (5, 2, 'Child LV_2 - 2');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (6, 2, 'Child LV_2 - 3');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (7, 3, 'Child LV_2 - 4');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (8, 3, 'Child LV_2 - 5');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (9, 3, 'Child LV_2 - 6');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (10, 4, 'Child LV_3 - 1');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (11, 4, 'Child LV_3 - 2');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (12, 4, 'Child LV_3 - 3');
INSERT INTO public.tree_nodes (id, parent_id, name) VALUES (13, 4, 'Child LV_3 - 4');

Here is the SQL to “get all parent nodes” of every nodes

WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
    SELECT tn.id,
           tn.name,
           tn.parent_id AS parent_id,
           1::INT       AS depth,
           tn.id::TEXT  AS path
    FROM public.tree_nodes AS tn
    WHERE tn.parent_id IS NULL
    UNION ALL
    SELECT tn.id,
           tn.name,
           tn.parent_id as parent_id,
           p.depth + 1  AS depth,
           (p.path || '->' || tn.id::TEXT)
    FROM nodes_cte AS p,
         public.tree_nodes AS tn
    WHERE tn.parent_id = p.id
)
SELECT *
FROM nodes_cte AS n
ORDER BY n.id ASC;
The SQL result

You can copy / paste and customize the SQL to fit your needs, I will leave that task to you, readers.

Happy coding & thanks for reading !

Share this on: