In my previous blog, I gave a brief introduction to language models and explained the N-gram architecture and its limitations. Continuing from there, we now explore the techniques used to overcome the shortcomings of language models, specifically N-gram models.
Smoothing, interpolation, and backoff are techniques used to improve the performance of N-gram models in natural language processing.
Smoothing is a technique used to address the problem of data sparsity in N-gram models. Since it is impossible to have a training corpus that contains all possible N-grams, some N-grams will have zero counts, which can cause problems when calculating probabilities. Smoothing methods assign non-zero probabilities to unseen N-grams by redistributing probability mass from seen N-grams.
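To make this concrete, here is a minimal sketch of the simplest smoothing method, add-one (Laplace) smoothing, applied to bigrams. The toy corpus and the function name `laplace_bigram_prob` are purely illustrative, not something from the post:

```python
from collections import Counter

# Toy corpus; count unigrams and bigrams
corpus = "the cat sat on the mat".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)  # vocabulary size

def laplace_bigram_prob(w_prev, w):
    """Add-one (Laplace) smoothed estimate of P(w | w_prev).

    Every bigram count is incremented by 1 before normalizing, so
    unseen bigrams receive a small non-zero probability instead of zero.
    """
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + V)

print(laplace_bigram_prob("the", "cat"))  # seen bigram
print(laplace_bigram_prob("the", "sat"))  # unseen bigram, still > 0
```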
Interpolation is a smoothing technique that combines the probabilities of N-grams of different orders. For example, if we are trying to find the probability of a trigram and we have trigram, bigram, and unigram models, we can estimate the probability as a weighted sum of the probabilities from each model.
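Below is a minimal sketch of that weighted-sum idea using simple linear interpolation over a toy corpus. The lambda weights (0.6, 0.3, 0.1) are arbitrary placeholders; in practice they sum to 1 and are tuned on held-out data:

```python
from collections import Counter

corpus = "the cat sat on the mat".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))
N = len(corpus)  # total token count

def interpolated_prob(w1, w2, w3, lambdas=(0.6, 0.3, 0.1)):
    """Linear interpolation: a weighted sum of the trigram, bigram, and
    unigram maximum-likelihood estimates of P(w3 | w1, w2).
    """
    l3, l2, l1 = lambdas
    p_tri = trigrams[(w1, w2, w3)] / bigrams[(w1, w2)] if bigrams[(w1, w2)] else 0.0
    p_bi = bigrams[(w2, w3)] / unigrams[w2] if unigrams[w2] else 0.0
    p_uni = unigrams[w3] / N
    return l3 * p_tri + l2 * p_bi + l1 * p_uni

print(interpolated_prob("the", "cat", "sat"))  # all three orders contribute
print(interpolated_prob("the", "cat", "mat"))  # unseen trigram, lower orders still help
```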
Backoff is another smoothing technique that approximates the probability of an unobserved N-gram using more frequently occurring lower-order N-grams. If an N-gram count is zero, its probability is approximated using a lower-order N-gram.
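Here is a minimal sketch of the backoff idea in the "stupid backoff" style, which applies a fixed penalty at each backoff step. Note that classic Katz backoff additionally discounts the higher-order counts so the result remains a proper probability distribution, which this simplified sketch does not do:

```python
from collections import Counter

corpus = "the cat sat on the mat".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))
N = len(corpus)  # total token count

def backoff_score(w1, w2, w3, alpha=0.4):
    """Score w3 given (w1, w2): use the trigram estimate when the
    trigram was observed; otherwise back off to the bigram, then the
    unigram, multiplying by a fixed penalty alpha at each step.
    """
    if trigrams[(w1, w2, w3)] > 0:
        return trigrams[(w1, w2, w3)] / bigrams[(w1, w2)]
    if bigrams[(w2, w3)] > 0:
        return alpha * bigrams[(w2, w3)] / unigrams[w2]
    return alpha * alpha * unigrams[w3] / N

print(backoff_score("the", "cat", "sat"))  # observed trigram: use it directly
print(backoff_score("cat", "the", "mat"))  # backs off to the bigram "the mat"
print(backoff_score("mat", "the", "sat"))  # backs off all the way to the unigram
```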
These techniques help improve the performance of N-gram models by addressing the issue of data sparsity and providing better estimates for rare events.