There is a beautiful garden composed of several smaller flower gardens. It's soon going to open to the general public. Before that, the gardener wants to carve out a path and pave it with concrete, allowing visitors to stroll around the garden. He wants to put in the least effort and time, while also minimizing the damage to the ground and flowers. He also wants all the flower gardens' centers to be accessible from one another.
Top view of the garden
So, the gardener maps out all possible ways to connect all the flower gardens' centers and assigns a weight to each edge, which represents the combined effort(time+money+damage) to construct that edge.
Mapping out the garden
Now, the gardener tries to find a tree in the following graph that minimizes the sum of edge weights(i.e. effort).
Think!! try it out, and then proceed
After staring at it for 10 minutes the gardener makes a few observations :

  • The edge from \(Node 9\) to \(Node 10\) will have to be included in the final tree, that is any edge connected to a node that only has one edge coming out of it, must be included.

  • The edge with the minimum weight will always be included in the final tree if we want to minimize the overall sum.

  • Extension of the above thought : The edge with the second minimum weight will also be included in the final tree.

  • Extending the thought even further : The edge with the third minimum weight will be included only if it doesn't form a cycle with the first and the second least edges(see \(Nodes 3 ,4 ,5\))
Think what the algorithm could bebefore proceeding!
Algorithm : Remove all the edges. Then add the edge with the minimum weight and the edge with the second minimum weight. Then go through the rest of the edges, one by one, from the edge with the third minimum weight to the edge with the maximum weight, and add it if adding it doesn't form a cycle, and don't if it does.




Sub-algorithm : figuring out if adding an edge creates a cycle or not in a given graph.

Think!! then proceed..
After staring at it for 5 minutes :
We need to keep track of the sub-trees! that are forming... then if the edge is across two sub-trees we can add it safely, but if it's within a sub-tree then we cannot add it because it will definitely form a cycle.
Keeping track of sub-trees :
Storing sub-trees as a list of lists. [Tree after adding 7 edges(according to the above algo)]
Better way of keeping track of sub-trees :
Storing sub-trees as a dictionary.
It's better now because we can check if the two nodes are in the same sub-tree just by looking up their representative in the dictionary. Whereas, earlier we had to go through all the elements in the list of lists to check that.


Joining two subtrees :


Let's understand it by adding edges with the weights 20 in the above graph :
Since Node 3 and Node 5 have the same subtree representatives in the dictionary, this edge won't be added.
\(rep\_dict = \{1:1, 2:1, 3:3, 4:3, 5:3, 6:6, 7:3, 8:8, 9:8, 10:10, 11:3, 12:6\} \)

Since Node 9 and Node 10 are in different subtrees, this edge will be added!
\(rep\_dict = \{1:1, 2:1, 3:3, 4:3, 5:3, 6:6, 7:3, 8:8, 9:8, 10:8, 11:3, 12:6\} \)

Note : We chose to change the smaller subtree's representative to the bigger one because it's more convenient. We have to make fewer changes and fewer lookups of the dictionary. But finding the bigger subtree will require us to go through the whole \(rep\_dict\).
So, here's a better way :

We can maintain a \(sub\_trees\) dictionary as follows :

\(sub\_trees = \{1:[1, 2], 3:[3, 4, 5, 7, 11], 6:[6, 12], 8:[8, 9, 10]\} \)
Since Node 7 and Node 8 are in different subtrees, this edge will be added!
\(rep\_dict[7]=3\) and \(rep\_dict[8]=8\)
\(length(sub\_trees[3]) > length(sub\_trees[8]) \)
And now we can accordingly update the \(rep\_dict\) and \(sub\_trees\) dictionaries :
\(rep\_dict = \{1:1, 2:1, 3:3, 4:3, 5:3, 6:6, 7:3, 8:3, 9:3, 10:3, 11:3, 12:6\} \)
\(sub\_trees = \{1:[1, 2], 3:[3, 4, 5, 7, 11, 8, 9, 10], 6:[6, 12]\} \)
Here's the final tree :
Cost : 192

Further exploration :
  1. Try coding it yourself!
  2. Try proving that the above algorithm is correct!
  3. Space and time complexity
    Intuition space complexity : \(O(N)\)
    time complexity : \(O(E . logE)\)
    • for sorting edges : \(O(E . logE)\)
    • for checking for cycles : \(O(1)\)
    • for updating dictionaries : almost constant time i.e. \(O(1)\). Learn more!
  4. Here the 'path' is just a path : We wanted to carve out a path but we carved out the minimal cost tree at the end. Technically a tree is not a path in graph theory terms. So, above it was not the technical path but just a path!
  5. It's not the best way to connect the garden by finding the above tree. What if we want to craft an enchanted journey through the garden for the visitor... immersing them into a harmonious sequence of fragrances and colors, designed to be as aesthetically pleasing as possible..? or what if we want to carve out a path that makes it easier to maintain the garden?.. Just some questions to ponder about!
  6. Some more stuff
    1. Handwritten draft
    2. Code
    3. Learn formally about DSU(union by rank and path compression) and Kruskal's algorithm
    4. Comic
  7. Try finding the minimal cost cycle, instead of tree!
Thank you for reading! Please feel free to give your feedback/suggestions or have discussions or ask questions below!

The next blog discusses the "minimal cost cycle" question.