Course Overview

The Natural Language Processing with Python training course teaches the concepts of Natural Language Processing (NLP) and provides hands-on experience working with text data, with a focus on detecting patterns in text using Python. The course begins with an overview of NLP and its key techniques. Students then write their own spam detection and sentiment analysis code in Python. The course goes on to cover Deep Learning, RNNs, attention models, sequence models, and BERT models, and concludes with using BERT for Q&A systems.

Purpose:

Promote an in-depth understanding of how to use Natural Language Processing in your Python applications.

Who should attend

Data Scientists and Machine Learning Engineers looking to incorporate Natural Language Processing into their Python applications.

Prerequisites

Participants should preferably have basic knowledge of Python and be familiar with common ML algorithms such as Logistic Regression, Random Forest, Support Vector Machines, and Bayesian Classification.

Course Objectives

Upon completion of this course, you should be able to:

Explain what Natural Language Processing is
Access Text Corpora and Lexical Resources
Process raw text
Write structured programs
Categorize and tag words
Classify and extract information from text
Analyze sentence structure and meaning
Build your own Spam Detector and Sentiment Analyzer
Write your own Article Spinner
Describe Deep Learning
Understand and use BERT

Outline: Natural Language Processing With Python (NLPP)

Natural Language Processing

What is Natural Language Processing?
The NLTK package
Preparing text for analysis
Text summarization
Text classification
Topic Modelling
Hands-on Exercise(s)

Accessing Text Corpora and Lexical Resources

Accessing Text Corpora
Conditional Frequency Distributions
More Python: Reusing Code
Lexical Resources
WordNet
Hands-on Exercise(s)
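
For a flavor of the conditional frequency distributions covered in this module, here is a minimal sketch in plain Python. It counts how often each word occurs under each condition (here, a genre). The helper name `conditional_freq_dist` and the toy (genre, word) pairs are illustrative assumptions, not course material; the course itself works with the NLTK package.

```python
from collections import Counter, defaultdict


def conditional_freq_dist(pairs):
    """Count outcomes separately for each condition.

    `pairs` is an iterable of (condition, outcome) tuples.
    """
    cfd = defaultdict(Counter)
    for condition, outcome in pairs:
        cfd[condition][outcome] += 1
    return cfd


# Hypothetical toy data: (genre, word) pairs standing in for a real corpus.
pairs = [
    ("news", "could"), ("news", "will"), ("news", "will"),
    ("romance", "could"), ("romance", "could"), ("romance", "might"),
]

cfd = conditional_freq_dist(pairs)
print(cfd["news"].most_common(1))   # most frequent word in the "news" condition
print(cfd["romance"]["could"])      # count of "could" in the "romance" condition
```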

Processing Raw Text

Writing Structured Programs

Back to the Basics
Sequences
Questions of Style
Functions: The Foundation of Structured Programming
Doing More with Functions
Program Development
Algorithm Design
A Sample of Python Libraries
Exercises

Categorizing and Tagging Words

Using a Tagger
Tagged Corpora
Mapping Words to Properties Using Python Dictionaries
Automatic Tagging
N-Gram Tagging
Transformation-Based Tagging
How to Determine the Category of a Word
Exercises
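
The automatic tagging topics above can be illustrated with a small sketch: a unigram tagger that assigns each word its most frequent tag from training data and backs off to a default tag for unseen words. The function names, the tiny training set, and the choice of "NN" as the default are assumptions made for illustration; this is not NLTK's implementation.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(words, model, default_tag="NN"):
    """Tag known words from the model; back off to default_tag otherwise."""
    return [(word, model.get(word.lower(), default_tag)) for word in words]


# Hypothetical hand-tagged training sentences.
training = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

model = train_unigram_tagger(training)
tagged = tag(["The", "cat", "barks", "loudly"], model)
print(tagged)
```

The unseen word "loudly" receives the default tag, which is wrong here (it is an adverb). That kind of error is what motivates combining taggers with backoff and using wider context, as in N-gram tagging.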

Learning to Classify Text

Supervised Classification
Further Examples of Supervised Classification
Evaluation
Decision Trees
Naive Bayes Classifiers
Maximum Entropy Classifiers
Modeling Linguistic Patterns
Exercises

Extracting Information from Text

Information Extraction
Chunking
Developing and Evaluating Chunkers
Recursion in Linguistic Structure
Named Entity Recognition
Relation Extraction
Exercises

Analyzing Sentence Structure

Some Grammatical Dilemmas
What’s the Use of Syntax?
Context-Free Grammar
Parsing with Context-Free Grammar
Dependencies and Dependency Grammar
Grammar Development
Exercises

Building Feature-Based Grammars

Grammatical Features
Processing Feature Structures
Extending a Feature-Based Grammar
Exercises

Analyzing the Meaning of Sentences

Natural Language Understanding
Propositional Logic
First-Order Logic
The Semantics of English Sentences
Discourse Semantics
Exercises

Build your own Spam Detector

Build your own spam detector – description of data
Build your own spam detector using Naive Bayes and AdaBoost – the code
Key Takeaway from Spam Detection Exercise
Naive Bayes Concepts
AdaBoost Concepts
Other types of features
Spam Detection FAQ
What is a Vector?
SMS Spam Example
SMS Spam in Code
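
As a rough preview of the Naive Bayes concepts in this module, the sketch below trains a multinomial Naive Bayes classifier with add-one (Laplace) smoothing on a handful of made-up messages. The function names and the toy data are assumptions for illustration; they are not the course's dataset or code, and AdaBoost is not shown.

```python
import math
from collections import Counter


def train_naive_bayes(messages):
    """messages: list of (label, text). Returns per-label counts and the vocabulary."""
    doc_counts = Counter()
    word_counts = {}
    vocab = set()
    for label, text in messages:
        tokens = text.lower().split()
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab


def predict(text, doc_counts, word_counts, vocab):
    """Return the label with the highest log posterior (up to a constant)."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training messages.
messages = [
    ("spam", "win a free prize now"),
    ("spam", "free cash click now"),
    ("ham", "are we meeting for lunch"),
    ("ham", "see you at home tonight"),
]

model = train_naive_bayes(messages)
print(predict("free prize now", *model))
print(predict("see you at lunch", *model))
```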

Build your own Sentiment Analyzer

Description of Sentiment Analyzer
Logistic Regression Review
Preprocessing: Tokenization
Preprocessing: Tokens to Vectors
Sentiment Analysis in Python using Logistic Regression
Sentiment Analysis Extension
How to Improve Sentiment Analysis & FAQ

Latent Semantic Analysis

Latent Semantic Analysis – What does it do?
SVD – The underlying math behind LSA
Latent Semantic Analysis in Python
What is Latent Semantic Analysis Used For?
Extending LSA

Write your own Article Spinner

Article Spinning Introduction and Markov Models
More about Language Models
Trigram Model
Pre-code Exercises
Writing an article spinner in Python
Article Spinner Extension Exercises
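
One way to read the trigram model topic above: for each word, look at its left and right neighbors and ask which words have been seen between that pair. The sketch below builds such a table and swaps a word for a sampled alternative when more than one candidate exists. The function names, the `replace_prob` parameter, and the toy corpus are illustrative assumptions, not the course's code.

```python
import random
from collections import Counter, defaultdict


def build_trigram_model(tokens):
    """Map (previous word, next word) to counts of the words seen between them."""
    model = defaultdict(Counter)
    for prev, middle, nxt in zip(tokens, tokens[1:], tokens[2:]):
        model[(prev, nxt)][middle] += 1
    return model


def spin(tokens, model, rng, replace_prob=0.5):
    """Replace middle words with alternatives sampled from the trigram table."""
    out = list(tokens)
    for i in range(1, len(tokens) - 1):
        candidates = model.get((tokens[i - 1], tokens[i + 1]))
        # Only spin when the context offers a genuine choice.
        if candidates and len(candidates) > 1 and rng.random() < replace_prob:
            words = list(candidates)
            counts = [candidates[w] for w in words]
            out[i] = rng.choices(words, weights=counts)[0]
    return out


# Hypothetical toy corpus.
corpus = (
    "the quick fox jumps over the lazy dog . "
    "the sly fox jumps over the sleepy dog ."
).split()

model = build_trigram_model(corpus)
spun = spin("the quick fox".split(), model, random.Random(0), replace_prob=1.0)
print(" ".join(spun))
```

Sampling may return the original word, so the output is not guaranteed to differ from the input on every run.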

Introduction to Deep Learning

What is Deep Learning?
Deep Learning Architecture
Deep Learning Frameworks
The relationship between Deep Learning and Machine Learning
Deep Learning Use cases
Concepts and Terms
How to implement Deep Learning?
Pre-Trained ML Models

Recurrent Neural Networks

What are Recurrent Neural Networks?
Different types of RNNs
Language model and sequence generation
Sampling novel sequences
Vanishing gradients with RNNs
Gated Recurrent Unit (GRU)
Long Short-Term Memory (LSTM)
Bidirectional RNN
Deep RNNs
Sequence-to-Sequence (Seq2Seq) Models
Transformers
Attention Models
Hands-on Exercise(s)
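
To make the attention topic in this module concrete, here is a minimal sketch of scaled dot-product attention for a single query, written with the standard library only. The function names and the two-dimensional toy vectors are assumptions for illustration; deep learning frameworks implement the same idea over batched tensors.

```python
import math


def softmax(scores):
    """Convert raw scores to weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights and the weighted sum of the values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, context


# Hypothetical toy vectors: the query points in the same direction as the first key.
query = [3.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

weights, context = attention(query, keys, values)
print(weights)
print(context)
```

Because the query aligns with the first key, most of the weight goes to the first value, so the context vector lies close to it.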

Getting started with BERT

What is BERT?
Embeddings
Architecture

BERT's tokenizer

Understanding CNN for NLP
How to import Files
Cleaning Data & Tokenization
Model Building
Evaluation
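
BERT's tokenizer splits words into subword pieces using a greedy longest-match-first strategy (WordPiece), marking non-initial pieces with a "##" prefix. The sketch below shows that matching step on a tiny made-up vocabulary. The function name and vocabulary are assumptions; a real BERT vocabulary has tens of thousands of entries, and the real tokenizer also handles casing, punctuation, and special tokens.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into the longest vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry the ## prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece fits, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary.
vocab = {"play", "un", "##play", "##ing", "##ed", "##able"}

print(wordpiece_tokenize("playing", vocab))
print(wordpiece_tokenize("unplayable", vocab))
print(wordpiece_tokenize("xyz", vocab))
```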

Tuning BERT for Q&A System

Overview of Q&A System
Data Preprocessing
Understanding Model Layers
Building and Compiling Model
Key Params
Training
Evaluation
Conclusion
