Natural Language Processing - Basics

Natural Language Processing:

Natural Language Processing (NLP) is an area of Computer Science and Artificial Intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.

- Often when performing analysis, much of the data is numerical, such as sales numbers, physical measurements,
  and quantifiable categories.

- Computers are very good at handling direct numerical information.

- But what do we do about text data? As humans, we can tell that there is a plethora of information inside a text document.

- But a computer needs specialized processing techniques in order to understand raw text data.

- Text data is highly unstructured and can also be in multiple languages!

- Natural Language Processing attempts to use a variety of techniques in order to create some sort of
  structure out of raw text data.

- Some example use cases of natural language processing:
 1. Classifying emails as spam vs. legitimate.
 2. Sentiment analysis of text movie reviews.
 3. Analyzing trends from written customer feedback forms.
 4. Understanding text commands, such as "Hey Google, play the song".
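
To make "creating structure out of raw text" a little more concrete, here is a minimal sketch that turns short review texts into word counts (a simple "bag of words"). It uses only the Python standard library and is a deliberate simplification, not how spaCy or NLTK process text internally. The function name bag_of_words and the sample reviews are made up for illustration.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, pull out word tokens, and count each token."""
    # Simplifying assumption: a "word" is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample data, invented for this illustration.
reviews = [
    "Great movie, great cast!",
    "Terrible plot. Not a great movie.",
]

# Each unstructured review becomes a word -> count mapping
# that a program can compare, sort, or feed into a model.
for review in reviews:
    print(bag_of_words(review).most_common(3))
```

Real NLP libraries go much further (handling punctuation, grammar, and multiple languages), but the idea is the same: raw text goes in, structured data comes out.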

Libraries used: spaCy, NLTK

spaCy:
spaCy is an open-source natural language processing library for Python. It is designed to handle Natural Language Processing tasks effectively, using the most efficient implementation of common tasks and algorithms.
-  For many common tasks, spaCy has only one implemented method, choosing the most efficient algorithm currently available.

-  This means you often don't actually have the option to choose between algorithms for a particular task.

NLTK (Natural Language Toolkit):
- It's a very popular open-source library.
- It was initially released in 2001, making it much older than spaCy, which was released in 2015.
- It also provides many functionalities, but its implementations are often less efficient.


NLTK vs spaCy:
- spaCy is much faster and more efficient, at the cost of the user not being able to choose a specific
  algorithmic implementation.
- spaCy does not include pre-created models for some applications, such as sentiment analysis, which
  is typically easier to perform with NLTK.
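
As an illustration of what "choosing between algorithms" can look like in NLTK, here is a rough sketch that runs the same words through two of the stemmers NLTK ships, PorterStemmer and LancasterStemmer. Stemming is used here only as an example task and is an assumption on my part. The helper names load_nltk_stemmers and compare_stemmers are hypothetical, and the sketch assumes the nltk package is installed.

```python
from typing import Dict, Iterable, List, Tuple


def load_nltk_stemmers() -> Dict[str, object]:
    """Build two of NLTK's stemmers; assumes the nltk package is installed."""
    # Imported lazily so that defining these helpers does not require NLTK.
    from nltk.stem import LancasterStemmer, PorterStemmer

    return {"porter": PorterStemmer(), "lancaster": LancasterStemmer()}


def compare_stemmers(
    words: Iterable[str], stemmers: Dict[str, object]
) -> List[Tuple[str, Dict[str, str]]]:
    """Stem every word with every stemmer so the results can be compared."""
    return [
        (word, {name: stemmer.stem(word) for name, stemmer in stemmers.items()})
        for word in words
    ]
```

With NLTK installed, calling compare_stemmers(["running", "easily"], load_nltk_stemmers()) returns each word next to its Porter and Lancaster stems, which makes the differences between the two algorithms easy to inspect.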

To install spaCy from the Anaconda Prompt:
 -> conda install -c conda-forge spacy

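After installing spaCy, download the language library that spaCy needs. This library is a big reason why spaCy can work efficiently:
 -> python -m spacy download en

 This downloads the English language library. In spaCy v3 and later, the "en" shortcut is deprecated in favor of the full package name, for example en_core_web_sm.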

spaCy works with a Pipeline object:
The nlp() function from spaCy automatically takes raw text and performs a series of operations
to tag, parse, and describe the text data.
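
The pipeline idea can be sketched roughly as follows. The helper names load_english_pipeline and describe_tokens are hypothetical. The sketch assumes spaCy is installed along with the small English package en_core_web_sm (the full name that newer spaCy versions expect). The token attributes text, pos_, and dep_ are standard spaCy token attributes.

```python
from typing import Callable, List, Tuple


def load_english_pipeline(name: str = "en_core_web_sm"):
    """Load a spaCy pipeline; assumes spaCy and the named package are installed."""
    # Imported lazily so that describe_tokens() has no hard dependency on spaCy.
    import spacy

    return spacy.load(name)


def describe_tokens(nlp: Callable, text: str) -> List[Tuple[str, str, str]]:
    """Run text through the pipeline and collect (text, part of speech, dependency)."""
    doc = nlp(text)  # tokenizes, tags, and parses the raw text in one call
    return [(token.text, token.pos_, token.dep_) for token in doc]
```

With spaCy set up, calling describe_tokens(load_english_pipeline(), "Hey Google, play the song") gives one tuple per token, showing the word, its part-of-speech tag, and its syntactic dependency label.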
To get your hands dirty with NLP code, click on the GitHub link given below:
