Closure table is a simple and elegant way of storing and querying hierarchical data in any RDBMS. By hierarchical data we mean a set of data that has some parent – child relationship among them. We use the word ‘tree’ instead of hierarchies commonly. As an example we may take the relationships between geographic locations like ‘Countries’, ‘States/ Province’, ‘Districts/ Cities’ etc.
Closure Table Storing a Tree of locations
I lived in a district called ‘Malappuram’ which belongs to the state, Kerala. Kerala share its border with two other states – Tamilnadu and Karnataka and all these three states form a major part of south India. Unfortunately I left my beautiful village and moved to Chennai, the Capital city of Tamilnadu because of my job requirements!
Now, we got some sample data which are simple enough to be represented as tree of locations. To make it even clear we represent it graphically as following image. (I have added few more places in image.)
Now we came back to the question, how to store these details using closure table? As the first step we store each of the locations in a table with three columns:
INSERT INTO locations VALUES (1, 0, 'India'); INSERT INTO locations VALUES (2, 1, 'Kerala'); INSERT INTO locations VALUES (3, 1, 'Tamilnadu'); INSERT INTO locations VALUES (4, 1, 'Karnataka'); INSERT INTO locations VALUES (5, 2, 'Malappuram'); INSERT INTO locations VALUES (6, 2, 'Calicut'); INSERT INTO locations VALUES (7, 3, 'Chennai'); INSERT INTO locations VALUES (8, 3, 'Coimbatore'); INSERT INTO locations VALUES (9, 5, 'Kohinoor'); INSERT INTO locations VALUES (10, 7, 'T.nagar');
- The first column ‘id’ which is the primary key of the table, uniquely identifies each location.
- Second column ‘parent_id’ holds the id of the immediate parent of each location.
- Third column stores the actual name of location.
There is no parent defined for India as per our test data. So we consider India as the root and keep it’s parent_id as zero. In contrast the ‘locations’ table store each of the locations and also specify its parent. As we can see, it store only depth one relation. But we need one more table to store relations of any depth. We need to tell the db that ‘Malappuram’ is a ‘grandchild’ of India at depth 2, using exactly one row. For this we create the second table, ‘location_hierarchy’. It is the actual Closure Table we talked about so far. Closure table stores the transitive closure of all possible parent – child relation of the ‘locations’ table.
Let’s fill out the closure table with some data:
You might feel strange since I declared every locations as its own parent. (Still it is logically valid, if you think like: a location is its own parent at a depth of zero.) But it is incredibly useful for filling out rest of the relations.
Next we are going to tell our closure table that Kerala belongs to India. For this use the following query which involves a simple self join:
INSERT INTO location_hierarchy(parent_id, child_id, depth) SELECT p.parent_id, c.child_id, p.depth+c.depth+1 FROM location_hierarchy p, location_hierarchy c WHERE p.child_id = 1 AND c.parent_id = 2
In the WHERE clause we specified the immediate parent and child ids in p.child_id and c.parent_id respectively. Now we got one more row which represent the relation between Kerala and India:
Making Malappuram a child of Kerala and grand child of India:
INSERT INTO location_hierarchy(parent_id, child_id, depth) SELECT p.parent_id, c.child_id, p.depth+c.depth+1 FROM location_hierarchy p, location_hierarchy c WHERE p.child_id = 2 AND c.parent_id = 5
We can see that it will add two rows to represent each of two relations:
Adding entries for Kohinoor:
INSERT INTO location_hierarchy(parent_id, child_id, depth) SELECT p.parent_id, c.child_id, p.depth+c.depth+1 FROM location_hierarchy p, location_hierarchy c WHERE p.child_id = 5 AND c.parent_id = 9
Querying data from closure table
Getting the list of parents of a location: let’s see how to get the list of parents of a location, say, T.nagar,
SELECT parent_id FROM location_hierarchy WHERE child_id = 10 and depth != 0 ORDER BY depth DESC
this simple query will give the list of ids of India, Tamilnadu and Chennai. Joining it with locations table we get the real names of parent locations:
SELECT location_hierarchy.parent_id, location_name FROM location_hierarchy JOIN locations ON(location_hierarchy.parent_id = locations.id) WHERE child_id = 10 AND depth!=0 ORDER BY depth DESC
SELECT location_hierarchy.child_id, location_name FROM location_hierarchy JOIN locations ON(location_hierarchy.child_id = locations.id) WHERE location_hierarchy.parent_id = 1 and depth = 1
Removing relations from closure table
To remove a parent child relationship from closure table simply use this query:
DELETE FROM location_hierarchy WHERE mapping_id IN( SELECT rel.mapping_id FROM location_hierarchy p, location_hierarchy rel, location_hierarchy c WHERE p.parent_id = rel.parent_id and c.child_id = rel.child_id AND p.child_id = CHILD_ID AND c.parent_id = PARENT_ID)
Remember, this will remove the relation between two nodes(places) never the nodes. To remove a node and all of its child nodes we use the following query:
DELETE FROM location_hierarchy WHERE mapping_id IN( SELECT rel.mapping_id FROM location_hierarchy p, location_hierarchy rel, location_hierarchy c, location_hierarchy delete WHERE p.parent_id = rel.parent_id AND c.child_id = rel.child_id AND p.child_id = delete.parent_id AND c.parent_id = delete.child_id AND (delete.parent_id = ID_TO_DELETE OR delete.child_id = ID_TO_DELETE) AND delete.depth < 2 )
Using Triggers to automate Closure Entry Formation
Most of the DBMS systems like PostgreSQL, MySQL etc support Triggers which is fired when an INSERT, UPDATE, DELETE operation is performed. Triggers can be used to automate the creation/ deletion of closure entries whenever a new item is added to the main(location) table. We will see it in detail in another post!
To learn more about postgreSQL triggers, see:
Recursive Queries to Eliminate depth Column
We can see from the location_hierarchy closure table that, as the depth increases, the number of entries to be inserted also increases. If you really want to decrease the size of the closure table, you can make use of Recursive query in PostgreSQL etc. It requires only depth one relations. But not all DBMS systems support recursive queries. We will see it in detail later!