Understand subword-based tokenization algorithm used by state-of-the-art NLP models — WordPiece

Photo by Glen on Unsplash

Over the past few years, there has been a lot of buzz in the field of AI, especially NLP. 😎 Understanding and analyzing human language is not only challenging but fascinating as well. Human language looks simple but is very complicated, as even a short text…

Understand subword-based tokenization algorithm used by state-of-the-art NLP models — Byte-Pair Encoding (BPE)

Photo by Clark on Unsplash

Natural Language Processing (NLP), a branch of Artificial Intelligence, is all about making machines understand and process human language. Processing human language is not an easy task for machines, as they work with numbers and not text. 💻 NLP is such a vast and widely studied branch of AI that…

Hands-on Tutorials

The differences that anyone working on an NLP project should know

Image by Sincerely Media on Unsplash

Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that gives machines (computers) the ability to understand written and spoken human language in the same way as human beings. NLP is almost everywhere, helping people with their daily tasks. 😍 It is such a common technology now…

… using Hugging Face Transformers and PyTorch on the CoQA dataset by Stanford

Photo by Taylor on Unsplash

Whenever I think about a question answering system, the first thing that comes to my mind is a classroom — a question asked by a teacher and one or several students raising their hands 🙋 to answer that question. That said, question answering can be a trivial task for humans…

They look so alike, yet so different. Let’s find out the differences!

Image by Dil on Unsplash

Bar charts and histograms are two graphs that are commonly used in data analysis. They seem alike, as both use bars to display data and both have an x-axis and a y-axis. In fact, they look so similar that people often get confused about which one to use when. 🤔

A light introduction to different sampling techniques in statistics

Image by Ryoji on Unsplash

Whenever we come across any statistical study, we hear a lot of different statistical terms. 😳 One of the most common terms we hear is sampling. In this article, we will try to understand what sampling is and then get into the details of different sampling techniques.


Sampling, in simple…

Transfer learning using state-of-the-art EfficientNet-B0

Photo by Marina Vitale on Unsplash

A Convolutional Neural Network (CNN) is a class of deep neural network commonly used to analyze images. In this article, we will build a CNN model together that can correctly recognize and classify color images of objects into one of the 100 classes of the CIFAR-100 dataset. In particular, we…

A handy guide about English stop words removal in Python

Image by Kai on Unsplash

We are well aware that computers can easily process numbers if programmed well. 🧑🏻‍💻 However, a large portion of the information we have is in the form of text. 📗 We communicate with each other by talking directly or by using text messages, social media posts…

Chetna Khanna

Engineer — Data & ML | Love to read | Love to write | https://www.linkedin.com/in/chetna-khanna/
