Course Overview
The Natural Language Processing with Python training course teaches the concepts of Natural Language Processing (NLP) and provides hands-on experience with text data, helping participants detect patterns in text using Python. The course begins with an overview of NLP and its key techniques. Next, students write their own spam detection and sentiment analysis code in Python. The course then covers Deep Learning, RNNs, Attention Models, Sequence Models, and working with BERT models. It concludes with a look at using BERT for Q&A systems.
Purpose:
Promote an in-depth understanding of how to use Natural Language Processing in your Python applications.
Who should attend
Data Scientists and Machine Learning Engineers looking to incorporate Natural Language Processing into their Python applications.
Prerequisites
Participants should ideally have a basic knowledge of Python and be familiar with common ML algorithms such as Logistic Regression, Random Forest, Support Vector Machines, and Bayesian Classification.
Course Objectives
Upon completion of this course, you should be able to:
- Explain what Natural Language Processing is
- Access Text Corpora and Lexical Resources
- Process raw text
- Write structured programs
- Categorize and tag words
- Classify and extract information from text
- Analyze sentence structure and meaning
- Build your own Spam Detector and Sentiment Analyzer
- Write your own Article Spinner
- Describe Deep Learning
- Understand and use BERT
Outline: Natural Language Processing With Python (NLPP)
Natural Language Processing
- What is Natural Language Processing?
- The NLTK package
- Preparing text for analysis
- Text summarization
- Text classification
- Topic Modelling
- Hands-on Exercise(s)
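As a rough illustration of what "preparing text for analysis" can involve, the sketch below lowercases a sentence, tokenizes it, removes stop words, and counts what remains. It uses only the Python standard library rather than NLTK, and the stop-word list and the `prepare` helper are assumptions made for this example, not course material.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}

def prepare(text):
    """Lowercase, tokenize on letters/apostrophes, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

tokens = prepare("The cat sat in the hat, and the cat is happy.")
counts = Counter(tokens)

print(tokens)                 # ['cat', 'sat', 'hat', 'cat', 'happy']
print(counts.most_common(1))  # [('cat', 2)]
```

Frequency counts like these are a common starting point for the summarization, classification, and topic modelling topics listed above.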
Accessing Text Corpora and Lexical Resources
- Accessing Text Corpora
- Conditional Frequency Distributions
- More Python: Reusing Code
- Lexical Resources
- WordNet
- Hands-on Exercise(s)
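A conditional frequency distribution counts events separately for each condition, for example word counts per genre. NLTK provides `nltk.ConditionalFreqDist` for this; the sketch below shows the same idea with the standard library only, using made-up (genre, word) pairs in place of a real corpus.

```python
from collections import Counter, defaultdict

# Hypothetical (genre, word) pairs standing in for a categorized corpus.
pairs = [
    ("news", "the"), ("news", "market"), ("news", "the"),
    ("romance", "the"), ("romance", "heart"), ("romance", "heart"),
]

# condition -> Counter of the events observed under that condition
cfd = defaultdict(Counter)
for genre, word in pairs:
    cfd[genre][word] += 1

print(cfd["news"]["the"])             # 2
print(cfd["romance"].most_common(1))  # [('heart', 2)]
```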
Processing Raw Text
Writing Structured Programs
- Back to the Basics
- Sequences
- Questions of Style
- Functions: The Foundation of Structured Programming
- Doing More with Functions
- Program Development
- Algorithm Design
- A Sample of Python Libraries
- Exercises
Categorizing and Tagging Words
- Using a Tagger
- Tagged Corpora
- Mapping Words to Properties Using Python Dictionaries
- Automatic Tagging
- N-Gram Tagging
- Transformation-Based Tagging
- How to Determine the Category of a Word
- Exercises
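The dictionary-based and N-gram tagging topics above can be illustrated with the simplest case, a unigram tagger: each word receives the tag it carried most often in training, and unseen words fall back to a default tag. The training sentences, tag names, and helper functions below are hypothetical and use the standard library only.

```python
from collections import Counter, defaultdict

# Hypothetical tagged training sentences as (word, tag) pairs.
train = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
]

def train_unigram(sentences):
    """Map each word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(words, model, default="NOUN"):
    """Look each word up in the model; unseen words get the default tag."""
    return [(w, model.get(w, default)) for w in words]

model = train_unigram(train)
print(tag(["the", "cat", "run"], model))
# [('the', 'DET'), ('cat', 'NOUN'), ('run', 'VERB')]
```

Here "run" is tagged VERB because it appears twice as a verb and once as a noun, while the unseen word "cat" receives the default tag.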
Learning to Classify Text
- Supervised Classification
- Further Examples of Supervised Classification
- Evaluation
- Decision Trees
- Naive Bayes Classifiers
- Maximum Entropy Classifiers
- Modeling Linguistic Patterns
- Exercises
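For the Evaluation topic, a classifier's output is typically compared with gold labels using precision, recall, and F1. The sketch below computes these for one positive class; the labels and the `precision_recall_f1` helper are invented for illustration.

```python
def precision_recall_f1(gold, predicted, positive):
    """Compute precision, recall, and F1 for a single positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical gold labels and classifier predictions.
gold      = ["spam", "ham", "spam", "spam", "ham"]
predicted = ["spam", "spam", "spam", "ham", "ham"]

precision, recall, f1 = precision_recall_f1(gold, predicted, positive="spam")
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```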
Extracting Information from Text
- Information Extraction
- Chunking
- Developing and Evaluating Chunkers
- Recursion in Linguistic Structure
- Named Entity Recognition
- Relation Extraction
- Exercises
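Chunking groups tagged tokens into phrases. NLTK offers `nltk.RegexpParser` for tag-pattern chunking; the standard-library sketch below imitates the idea for noun phrases by encoding each tag as one character and matching the pattern "optional determiner, any adjectives, one or more nouns". The tag names and the `np_chunks` helper are assumptions made for this example.

```python
import re

# One-letter codes for the tags the noun-phrase pattern cares about.
CODES = {"DET": "D", "ADJ": "A", "NOUN": "N"}

def np_chunks(tagged):
    """Return noun-phrase chunks matching DET? ADJ* NOUN+ over the tags."""
    code = "".join(CODES.get(tag, "O") for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in re.finditer(r"D?A*N+", code)
    ]

tagged = [
    ("the", "DET"), ("little", "ADJ"), ("dog", "NOUN"),
    ("barked", "VERB"), ("at", "ADP"), ("the", "DET"), ("cat", "NOUN"),
]
print(np_chunks(tagged))  # ['the little dog', 'the cat']
```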
Analyzing Sentence Structure
- Some Grammatical Dilemmas
- What’s the Use of Syntax?
- Context-Free Grammar
- Parsing with Context-Free Grammar
- Dependencies and Dependency Grammar
- Grammar Development
- Exercises
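Parsing with a context-free grammar can be sketched with the CYK algorithm, which fills a table of the nonterminals that derive each span of the sentence. The toy grammar below is hypothetical and must be in Chomsky normal form; this is a recognizer only (it answers yes or no) and is not the parser implementation used by NLTK.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY = {("NP", "VP"): {"S"}, ("DET", "N"): {"NP"}, ("V", "NP"): {"VP"}}
LEXICON = {"the": {"DET"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}

def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between left and right parts
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY.get((left, right), set())
    return start in table[0][n - 1]

print(cyk_recognize("the dog saw the cat".split()))  # True
print(cyk_recognize("dog the saw".split()))          # False
```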
Building Feature-Based Grammars
- Grammatical Features
- Processing Feature Structures
- Extending a Feature-Based Grammar
- Exercises
Analyzing the Meaning of Sentences
- Natural Language Understanding
- Propositional Logic
- First-Order Logic
- The Semantics of English Sentences
- Discourse Semantics
- Exercises
Build your own Spam Detector
- Build your own spam detector – description of data
- Build your own spam detector using Naive Bayes and AdaBoost – the code
- Key Takeaway from Spam Detection Exercise
- Naive Bayes Concepts
- AdaBoost Concepts
- Other types of features
- Spam Detection FAQ
- What is a Vector?
- SMS Spam Example
- SMS Spam in Code
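To make the Naive Bayes concepts above concrete, here is a minimal multinomial Naive Bayes sketch with add-one (Laplace) smoothing, written with the standard library only. The four training messages are invented, the `fit` and `predict` helpers are hypothetical, and the AdaBoost part of the exercise is not covered here.

```python
import math
from collections import Counter

# Hypothetical toy SMS data; a real exercise would load a labelled dataset.
train = [
    ("win cash prize now", "spam"),
    ("free prize claim now", "spam"),
    ("are we meeting for lunch", "ham"),
    ("see you at lunch tomorrow", "ham"),
]

def fit(data):
    """Count documents per label and word occurrences per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in data:
        doc_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def predict(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # log prior + sum of log likelihoods with add-one smoothing
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word in vocab:  # ignore words never seen in training
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = fit(train)
print(predict("claim your free prize", model))  # spam
print(predict("lunch tomorrow", model))         # ham
```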
Build your own Sentiment Analyzer
- Description of Sentiment Analyzer
- Logistic Regression Review
- Preprocessing: Tokenization
- Preprocessing: Tokens to Vectors
- Sentiment Analysis in Python using Logistic Regression
- Sentiment Analysis Extension
- How to Improve Sentiment Analysis & FAQ
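The pipeline outlined above (tokenize, turn tokens into vectors, fit logistic regression) can be sketched end to end in plain Python. The four reviews, the learning rate, and the number of epochs are arbitrary assumptions chosen for illustration; a real exercise would more likely use a library implementation and a much larger dataset.

```python
import math

# Hypothetical labelled reviews: 1 = positive, 0 = negative.
train = [
    ("great movie loved it", 1),
    ("wonderful acting great story", 1),
    ("terrible movie hated it", 0),
    ("boring story awful acting", 0),
]

# Tokens to vectors: one bag-of-words dimension per vocabulary word.
vocab = sorted({w for text, _ in train for w in text.split()})
index = {w: i for i, w in enumerate(vocab)}

def vectorize(text):
    vec = [0.0] * len(vocab)
    for word in text.split():
        if word in index:  # words outside the vocabulary are ignored
            vec[index[word]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(vec, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, vec)) + bias)

# Stochastic gradient descent on the log loss.
weights, bias, lr = [0.0] * len(vocab), 0.0, 0.5
for _ in range(200):
    for text, label in train:
        vec = vectorize(text)
        error = predict_proba(vec, weights, bias) - label
        weights = [w - lr * error * x for w, x in zip(weights, vec)]
        bias -= lr * error

print(predict_proba(vectorize("great wonderful"), weights, bias) > 0.5)
print(predict_proba(vectorize("terrible awful"), weights, bias) < 0.5)
```

Words that occur in both classes, such as "movie" or "story", end up with weights near zero, which is one reason richer features and more data tend to improve results.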
Latent Semantic Analysis
- Latent Semantic Analysis – What does it do?
- SVD – The underlying math behind LSA
- Latent Semantic Analysis in Python
- What is Latent Semantic Analysis Used For?
- Extending LSA
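As a brief reminder of the math behind LSA, the standard formulation factors a term-document matrix with the singular value decomposition and keeps only the largest singular values. The symbols below (X, k, m, n) are notation introduced for this sketch, not taken from the course material.

```latex
% X: term-document matrix with m terms (rows) and n documents (columns)
X = U \Sigma V^{\top}, \qquad X \in \mathbb{R}^{m \times n}

% Rank-k approximation: keep the k largest singular values, k \ll \min(m, n)
X_k = U_k \Sigma_k V_k^{\top}

% Document j in the k-dimensional latent space (e_j is the j-th unit vector)
\hat{d}_j = \Sigma_k V_k^{\top} e_j
```

Documents and terms that co-occur in similar contexts end up close together in the reduced space, which is what makes the technique useful for tasks such as similarity search.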
Write your own Article Spinner
- Article Spinning Introduction and Markov Models
- More about Language Models
- Trigram Model
- Pre-code Exercises
- Writing an article spinner in Python
- Article Spinner Extension Exercises
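One way to read the trigram approach above: model the middle word given its left and right neighbours, then replace some words with alternatives seen in the same context. The sketch below follows that reading with the standard library; the corpus, the `rate` parameter, and the helper names are assumptions for illustration rather than the course's own code.

```python
import random
from collections import Counter, defaultdict

# Hypothetical miniature corpus; a real spinner would train on many articles.
corpus = ("the quick dog ran home . the lazy dog ran away . "
          "the quick cat ran home .")

def build_model(tokens):
    """Map (previous word, next word) to a Counter of words seen in between."""
    model = defaultdict(Counter)
    for prev, mid, nxt in zip(tokens, tokens[1:], tokens[2:]):
        model[(prev, nxt)][mid] += 1
    return model

def spin(tokens, model, rng, rate=0.5):
    """Resample some middle words from the words seen in the same context."""
    out = list(tokens)
    for i in range(1, len(tokens) - 1):
        candidates = model.get((tokens[i - 1], tokens[i + 1]))
        if candidates and len(candidates) > 1 and rng.random() < rate:
            words, weights = zip(*candidates.items())
            out[i] = rng.choices(words, weights=weights)[0]
    return out

tokens = corpus.split()
model = build_model(tokens)
rng = random.Random(0)  # fixed seed so runs are repeatable
sentence = "the quick dog ran home .".split()
print(" ".join(spin(sentence, model, rng, rate=1.0)))
```

The first and last tokens are never changed because they lack a neighbour on one side, and a word is only resampled when the model has seen more than one option for its context.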
Introduction to Deep Learning
- What is Deep Learning?
- Deep Learning Architecture
- Deep Learning Frameworks
- The relationship between Deep Learning and Machine Learning
- Deep Learning Use cases
- Concepts and Terms
- How to implement Deep Learning?
- Pre-Trained ML Models
Recurrent Neural Networks
- What are Recurrent Neural Networks?
- Different types of RNNs
- Language model and sequence generation
- Sampling novel sequences
- Vanishing gradients with RNNs
- Gated Recurrent Unit (GRU)
- Long Short-Term Memory (LSTM)
- Bidirectional RNN
- Deep RNNs
- Seq to Seq Models
- Transformers
- Attention Models
- Hands-on Exercise(s)
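The attention mechanism used in Transformers is commonly written as softmax(QK^T / sqrt(d)) V. The plain-Python sketch below computes it for tiny hand-written matrices so the arithmetic is easy to follow; the inputs are made up, and real models use a tensor library and learned projections.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # weighted average of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical inputs: the single query points in the direction of the first key.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
outputs, weights = attention([[5.0, 0.0]], keys, values)
print(weights[0])
print(outputs[0])
```

Because the query aligns with the first key, most of the attention weight and therefore most of the output comes from the first value vector.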
Getting started with BERT
- What is BERT?
- Embeddings
- Architecture
BERT's tokenizer
- Understanding CNN for NLP
- How to import Files
- Cleaning Data & Tokenization
- Model Building
- Evaluation
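BERT's tokenizer splits words into WordPiece sub-word units using a greedy longest-match-first search, marking continuation pieces with a "##" prefix. The sketch below shows that search on a toy vocabulary; the vocabulary is invented, and a real tokenizer also handles lowercasing, punctuation, and special tokens.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into sub-word pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation of a word
            if candidate in vocab:
                piece = candidate
                break
            end -= 1  # try a shorter candidate
        if piece is None:
            return [unk]  # no piece matches, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

# Hypothetical toy vocabulary; real BERT vocabularies hold about 30,000 entries.
vocab = {"play", "##ing", "##ed", "un", "##play", "##able"}

print(wordpiece("playing", vocab))     # ['play', '##ing']
print(wordpiece("unplayable", vocab))  # ['un', '##play', '##able']
print(wordpiece("xyz", vocab))         # ['[UNK]']
```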
Tuning BERT for a Q&A System
- Overview of Q&A System
- Data Preprocessing
- Understanding Model Layers
- Building and Compiling Model
- Key Params
- Training
- Evaluation
- Conclusion