Data Compression Algorithm

Entropy Based Encoding

  1. Create a list of probabilities or frequency counts for the given set of symbols so that the relative frequency of occurrence of each symbol is known.
  2. Sort the list of symbols in decreasing order of probability, the most probable ones to the left and least probable to the right.
  3. Split the list into two parts, with the total probability of both the parts being as close to each other as possible.
  4. Assign the value 0 to the left part and 1 to the right part.
  5. Repeat steps 3 and 4 for each part, until all the symbols are split into individual subgroups.

Dictionary Based Encoding


In this blog, we discussed the situations that require lossless compression methods. There are different compression techniques discussed in detail. Our main focus in this blog was on various data compression methods like dictionary based and entropy based. In entropy based techniques, the RLE (Run length encoding) is not used much but Shannon Fano and Huffman encoding algorithms are the two methods which are better than the RLE algorithm. But Shannon Fano and Huffman compression are almost the same but the Huffman coding is better than the Shanon Fano algorithm. In a dictionary based algorithm there is the LZW method that gives the best and accurate result.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store


If you want peace, prepare for war.