This is a method for the construction of minimun redundancy codes. It builds a binary tree with a minimum weighted path length from a set of given weigths. This algorithm is applicable to many forms of data transimission.
It consists in the following steps:
1. Scan the text that is going to be compressed, and count the occurrence of each character.
2. Sort characters based on number of occurrences in the text.
3. Bulid Huffman tree based on the list of sorted characters.
4. Traverse the tree to determine the code words.
5. Scan text again and create a new text with the code words.
For a simple explanation of how this algorithm works, you can watch this video:
Next, is the code where I implemented this algorithms (based on a wikipedia example).
As you can see, the code is incomplete. I just reached the point where you get the nodes of the three (next I had to code each letter).
The tree of the wikipedia example is this:
And the result of my code is this, where it shows each node with its children.
A simple video explanation