Course Content
- Analytics Framework
- Regression for Prediction
- Cleaning and Preprocessing Data
- Algorithms, Preprocessing and Feature Extraction
- Clustering Data
- Detecting Anomalies
- Forecasting
- Classification
Prerequisites
To be successful, students should have a solid understanding of the material in the following courses:
- Fundamentals 1, 2 & 3 (Retired)
- Advanced Searching & Reporting
Or the following single-subject courses:
- What is Splunk? (Retired)
- Intro to Splunk (ITS)
- Using Fields (SUF)
- Scheduling Reports & Alerts (SRA)
- Visualizations (SVZ)
- Working with Time (WWT)
- Statistical Processing (SSP)
- Comparing Values (SCV)
- Result Modification (SRM)
- Leveraging Lookups and Subsearches (LLS)
- Correlation Analysis (SCLAS)
- Search Under the Hood (SUH)
- Intro to Knowledge Objects (IKO)
- Creating Field Extractions (CFE)
- Search Optimization (SSO)
Course Objectives
This 13.5-hour course is for users who want to attain operational intelligence level 4 (business insights). It covers implementing analytics and data science projects using Splunk's statistics, machine learning, and built-in and custom visualization capabilities.
Please note that this course may run over three days, with a 4.5-hour session each day.
Outline: Splunk for Analytics and Data Science (SADS)
Topic 1 – Analytics Workflow
- Define terms related to analytics and data science
- Describe the analytics workflow
- Describe common usage scenarios
- Navigate the Splunk Machine Learning Toolkit
Topic 2 – Training and Testing Models
- Split data for testing and training using the sample command
- Describe the fit and apply commands
- Use the score command to evaluate models
Topic 3 – Regression: Predict Numerical Values
- Differentiate predictions from estimates
- Identify prediction algorithms and assumptions
- Model numeric predictions in the MLTK and Splunk Enterprise
Topic 4 – Clean and Preprocess the Data
- Define preprocessing and describe its purpose
- Describe algorithms that preprocess data for use in models
- Use FieldSelector to choose relevant fields
- Normalize data with StandardScaler and RobustScaler
- Preprocess text using Imputer, NPR, TF-IDF, and HashingVectorizer
Topic 5 – Clustering
- Define clustering
- Identify clustering methods, algorithms, and use cases
- Use the Smart Clustering Assistant to cluster data
- Evaluate clusters using silhouette score
- Validate cluster coherence
- Describe clustering best practices
Topic 6 – Forecasting Fields
- Differentiate predictions from forecasts
- Use the Smart Forecasting Assistant
- Use the StateSpaceForecast algorithm
- Forecast multivariate data
- Account for periodicity in each time series
Topic 7 – Detect Anomalies
- Define anomaly detection and outliers
- Identify anomaly detection use cases
- Use the Splunk Machine Learning Toolkit Smart Outlier Assistant
- Detect anomalies using the Density Function algorithm
- View results with the Distribution Plot visualization
Topic 8 – Classify: Predict Categorical Values
- Define key classification terms
- Identify when to use different classification algorithms
- Evaluate classifier tradeoffs
- Evaluate results of multiple algorithms