Many MOOCs teach everything from the basics of machine learning to deep learning algorithms using toy datasets that are clean and small. In reality, when working with real data, 50–60% of the time is spent making it fit for analysis.

Data quality, in terms of coverage, completeness, and correctness, plays a crucial role in the success of data science projects by helping businesses get the right insights!

With this intuition in mind, I thought of writing about the data quality checks that can be performed based on…

In this post, we will discuss ‘Broadcasting’ using NumPy. It is widely used when implementing neural networks because broadcast operations are efficient in both memory and computation.

So let’s understand what broadcasting means, followed by a few examples!

Broadcasting describes the way NumPy treats arrays with different shapes during arithmetic operations. The smaller array is broadcast across the larger array so that they have compatible shapes. It provides a way to vectorize array operations, leading to efficient implementations.

The light-bordered boxes represent the broadcast values. This extra memory is not actually allocated during the operation, but it can be useful conceptually…
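
As a minimal sketch of the idea, the snippet below adds a row vector to a matrix and then adds a column to a row. The array values are made up for illustration and are not taken from the original post.

```python
import numpy as np

# A (3, 3) matrix and a length-3 row vector with shape (3,).
matrix = np.array([[0, 0, 0],
                   [10, 10, 10],
                   [20, 20, 20]])
row = np.array([1, 2, 3])

# NumPy conceptually stretches `row` across every row of `matrix`;
# no stretched copy of `row` is actually materialized.
summed = matrix + row

# A (3, 1) column and a (3,) row broadcast to a (3, 3) result.
column = np.array([[0], [10], [20]])
outer_sum = column + row

print(summed)
print(outer_sum.shape)
```

Both expressions produce the same 3x3 result here, because the column holds the same values as the first column of `matrix`.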

In this post, I will discuss convolutions and how they act as image filters by implementing the convolution operation with a few edge detection kernels.

So let’s start!

A convolution is a mathematical operation on two functions that produces a third function expressing how one is modified by the other.

To give an example, the first function can be the image and the second function a matrix (the kernel) that slides over the image and transforms it. …
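
The sliding-kernel idea can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions (no padding, stride 1, a toy 5x5 image, and a Sobel kernel chosen as the edge detector); the helper name `convolve2d` is hypothetical and not necessarily what the original post uses.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    # A true convolution flips the kernel in both directions before sliding.
    kernel = np.flipud(np.fliplr(kernel))
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window and the kernel.
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output

# Toy 5x5 "image": dark on the left, bright on the right.
image = np.array([[0, 0, 0, 10, 10]] * 5, dtype=float)

# Sobel kernel that responds to vertical edges.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

edges = convolve2d(image, sobel_x)
print(edges)
```

The output is zero over the flat region and non-zero only where the window straddles the dark-to-bright boundary. Many deep learning libraries skip the kernel flip (cross-correlation), which only changes the sign of the response for this kernel.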

In this post, we will cover the differences between a Fully connected neural network and a Convolutional neural network. We will focus on understanding the differences in terms of the model architecture and results obtained on the MNIST dataset.

- A fully
- The major advantage of fully connected networks is that they are “structure agnostic”, i.e. no special assumptions need to be made about the input.
- While being structure agnostic makes fully connected networks…
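
One way to see an architectural difference is to count the parameters in a single layer of each kind. The layer sizes below (128 hidden units, 32 filters of size 3x3) are assumptions picked for illustration, not the configuration used in the post.

```python
# Illustrative layer sizes (assumptions, not taken from the post).
input_height, input_width, channels = 28, 28, 1   # one MNIST image
hidden_units = 128                                # hypothetical dense layer width
num_filters, kernel_size = 32, 3                  # hypothetical conv layer

# Dense layer: every pixel connects to every hidden unit, plus one bias per unit.
dense_params = input_height * input_width * channels * hidden_units + hidden_units

# Conv layer: each filter reuses one small kernel across the whole image,
# plus one bias per filter.
conv_params = kernel_size * kernel_size * channels * num_filters + num_filters

print(dense_params)  # 100480
print(conv_params)   # 320
```

The convolutional layer needs far fewer weights because it assumes spatial structure in the input and shares its kernel across positions, whereas the dense layer makes no such assumption.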

In continuation of my previous blogs, part-1 and part-2 where we explored COVID tweet data and performed topic modeling respectively, in this part, we will build a sentiment classifier.

Although basic data exploration was done in the previous parts, here is a quick glimpse of the data again!

- A glimpse of the COVID Tweet dataset

In continuation of part-1, where we explored Twitter data related to COVID, in this post we will use Topic Modelling to learn more about the underlying key ideas that people are tweeting about.

Let’s first understand what Topic Modelling is!

Topic Modelling is an unsupervised technique that helps find the underlying topics, also termed latent topics, present in a large collection of documents.

In the real world, we observe a lot of unlabelled text data in the form of comments, reviews, complaints, etc. …

In this blog, I have taken the COVID tweet dataset from Kaggle and explored it to understand what people are talking about using NLP techniques.

So let’s understand the data about new normal beginnings!!

I have taken a dataset “Corona Virus Tagged Data” from Kaggle. The tweets have been pulled from Twitter and manual tagging has been done. The names and usernames have been given codes to avoid any privacy concerns. There are two datasets available — train.csv and test.csv. I have used train.csv for this exploratory analysis.

**Columns present:**

- UserName
- ScreenName
- Tweet At
- Original Tweet
- Label

In this post, I discuss what we mean by a named entity, the named entity recognition technique, and how to extract named entities using spaCy.

The term “named entity” is traditionally used to refer to the set of person, organization, and location names encountered in a given text. Dates, monetary units, percentages, and similar expressions are often included as well and detected using the same techniques, based on local grammars.

*Example:- “Facebook bought WhatsApp in 2014 for $16bn”*

Very deep networks often suffer from vanishing gradients: as the gradient is back-propagated to earlier layers, repeated multiplication may make it vanishingly small. ResNet uses residual blocks that include shortcut (skip) connections to jump over some layers.

The authors of the ResNet paper provide evidence that residual networks are easier to optimize and can gain accuracy from considerably increased depth by reformulating the layers as learning residual functions with reference to the layer inputs.
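
A residual block can be sketched as `y = relu(F(x) + x)`, where `F` is the stack of layers being skipped. The version below is a simplified NumPy illustration that uses two plain linear layers for `F` (an assumption made for brevity; the actual ResNet blocks use convolutions and batch normalization), and the function names are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two small linear layers."""
    out = relu(x @ w1)      # first layer with activation
    out = out @ w2          # second layer: this is F(x), the residual
    return relu(out + x)    # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # batch of 4 inputs with 8 features each

# With all-zero weights F(x) is 0, so the block reduces to relu(x):
# the layers only have to learn a correction on top of the input.
w_zero = np.zeros((8, 8))
y = residual_block(x, w_zero, w_zero)
print(np.allclose(y, relu(x)))
```

Because the input is added back unchanged, the skip path also gives the gradient a direct route to earlier layers.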

Let’s understand it in more detail!!

Activation functions in neural networks define the output of a neuron given its set of inputs. They are applied to the weighted sum of the inputs and transform it into the output, depending on the type of activation used.

Output of neuron = Activation(weighted sum of inputs + bias)
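
The formula above can be written out directly. This is a small sketch in plain Python; the input values, weights, and bias are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def neuron_output(inputs, weights, bias, activation):
    """Activation(weighted sum of inputs + bias)."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)

inputs = [1.0, 2.0]
weights = [0.5, -1.0]
bias = 0.5
# Weighted sum + bias = 1.0*0.5 + 2.0*(-1.0) + 0.5 = -1.0

relu_out = neuron_output(inputs, weights, bias, relu)
sigmoid_out = neuron_output(inputs, weights, bias, sigmoid)

print(relu_out)     # 0.0
print(sigmoid_out)  # about 0.2689
```

The same weighted sum gives different outputs depending on which activation is applied.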

The main idea behind using activation functions is to **add non-linearity**.

Now the question arises: why do we need non-linearity? We need neural network models to **learn and represent complex functions.** …
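
One way to see why non-linearity matters: without an activation, stacking linear layers is equivalent to a single linear layer. The sketch below checks this numerically with random matrices (the shapes and seed are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=(5, 3))    # 5 samples, 3 features
w1 = rng.normal(size=(3, 4))   # first "layer"
w2 = rng.normal(size=(4, 2))   # second "layer"

# Two linear layers applied one after the other...
stacked = (x @ w1) @ w2
# ...equal one linear layer whose weight matrix is w1 @ w2.
collapsed = x @ (w1 @ w2)
print(np.allclose(stacked, collapsed))

# Inserting a ReLU between the layers breaks this equivalence.
with_relu = np.maximum(0, x @ w1) @ w2
print(np.allclose(with_relu, stacked))
```

However many purely linear layers are stacked, the network can only represent a linear map; the activation is what lets depth add expressive power.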

Data Scientist. LinkedIn — https://www.linkedin.com/in/pooja-mahajan-69b38a98/.