Data Science with R Programming – 5 Days
Pre-Requisites:
- Understanding of data types and scales of measurement
- Summary measures: mean, median, mode, standard deviation, variance, covariance, skewness, and kurtosis, and their significance
- Describing Data: bar, pie, box and whiskers plot, scatter plot
- Probability
- Discrete Probability Distribution: Binomial distribution
- Continuous Probability Distribution: Normal distribution
- Normal approximation to the Binomial
- Normal probability plot
- Various Sampling methods
- Sampling Distribution
- Estimation
- Test of hypothesis
- Correlation
- Regression
(It is mandatory to meet the training prerequisites before nominating for the session.)
Day 1
Overview of Analytics
· What is Analytics?
· Types of Business Analytics
o Descriptive Analytics
o Predictive Analytics
o Prescriptive Analytics
· Applications of Business Analytics
o Use Cases from Banking
· Popular Tools
· Role of Data Scientist
· Analytics Methodology
· Problem Definition
Introduction to R
· Introduction to R
· R Set-up
· Advantages of R
· IDEs for R
· Setting the Workspace
· Benefits of Workspace
· Packages in R
Programming in R
· R Syntax and Objects
· Arithmetic Operators
· Relational Operators
· Logical Operators
· Assignment Operators
· Conditional Statements in R
· ifelse() Function
· Loops in R
· Break Statement
· Next Statement
· Scan Function
· Running an R Script
· Running a Batch Script
· R Functions
Data Structures & Apply Functions in R
· Objectives
· Types of Data Structures in R
· Vectors
· Scalars
· Colon Operator
· Accessing Vector Elements
· Matrices
· Accessing Matrix Elements
· Arrays
· Accessing Array Elements
· Data Frames
· Elements of Data Frames
· Factors
· Lists
· Importing Files in R
· Importing an Excel File
· Importing a Minitab File
· Importing a Table File
· Importing a CSV File
· Exporting Files from R
· Types of Apply Functions
o apply() Function
o lapply() Function
o sapply() Function
o tapply() Function
o vapply() Function
o mapply() Function
· Basic Data Manipulation
o Built-in functions for data manipulation
o Subsetting and slicing data
o Modifying structure of the data
· Advanced Data Manipulation
o dplyr Package: An Overview
o dplyr Package: The Five Verbs
o Installing the dplyr Package
o Functions of the dplyr Package
o Functions of the dplyr Package: select()
o Functions of the dplyr Package: filter()
o Functions of the dplyr Package: arrange()
o Functions of the dplyr Package: mutate()
Day 2
o Reshape package for data structure manipulations
· Functions in R
Data Visualization in R
· Graphics in R
· Types of Graphics
· Bar Charts
· Creating Simple Bar Charts
· Pie Charts
· Histograms
· Kernel Density Plots
· Line Charts
· Box Plots
· Heat Maps
· Saving a Graphic Output as a File
· Exporting Graphs in RStudio
· Exporting Graphs as PDFs in RStudio
· Advanced charting with ggplot2
R Built-in Functions & Loops
Mathematical Functions: sum, mean, table, colSums, etc.
Aggregate Function
Head & Summary Function
Basic functions like
grep
gsub
paste
substr
replace
strsplit
Merging data sets
Date functions and formats
o Format Function
o Date Function
o Extracting Month & Year
Loops
o For
o While
Conditional Statements
R User-Defined Functions
Writing our own functions
User-defined functions with the apply family
Connecting R to MySQL
R package for MySQL connection
dplyr Package
ggplot2 Package
Case Study
Day 3
Introduction to Machine Learning
· Supervised Learning
· Unsupervised Learning
Application Areas
Basic Statistics
Types of Data
Summarization Techniques
Probability
Different types of Probability Distribution
Quiz
Introduction to Regression Analysis
Use of Regression Analysis—Examples
Types of Regression Analysis
Simple Regression Analysis
Multiple Regression Models
Simple Linear Regression Model
Simple Linear Regression Model Explained
Correlation
Correlation Between X and Y
Method of Least Squares Regression Model
Coefficient of Multiple Determination Regression Model
Standard Error of the Estimate Regression Model
Dummy Variable Regression Model
Interaction Regression Model
Non-Linear Regression
Non-Linear Regression Models
Non-Linear Models to Linear Models
Algorithms for Complex Non-Linear Models
Summary and quizzes
Recap
Introduction to Classification
Examples of Classification
Classification vs. Prediction
Classification System
Classification Process
Classification Process—Model Construction
Classification Process—Model Usage in Prediction
Issues Regarding Classification and Prediction
Data Preparation Issues
Evaluating Classification Methods Issues
Decision Tree
Decision Tree—Dataset
Classification Rules of Trees
Overfitting in Classification
Day 4
Tips to Find the Final Tree Size
Basic Algorithm for a Decision Tree
Statistical Measure—Information Gain
Calculating Information Gain—Example
Calculating Information Gain for Continuous-Value Attributes
Enhancing a Basic Tree
Decision Trees in Data Mining
Case Study
Nearest Neighbor Classifiers
Computing Distance and Determining Class
Choosing the Value of K
Scaling Issues in Nearest Neighbor Classification
Support Vector Machines
Advantages of Support Vector Machines
Geometric Margin in SVMs
Linear SVMs
Non-Linear SVMs
Summary and quizzes
Recap
Introduction to Clustering
Clustering vs. Classification
Use Cases of Clustering
Clustering Models
K-means Clustering
K-means Clustering Algorithm
Pseudo Code of K-means
K-means Clustering Using R
K-means Clustering—Case Study
Day 5
Hierarchical Clustering
Hierarchical Clustering Algorithms
Requirements of Hierarchical Clustering Algorithms
Agglomerative Clustering Process
Hierarchical Clustering—Case Study
Summary and quizzes
Association Rule Mining
Application Areas of Association Rule Mining
Parameters of Interesting Relationships
Association Rules
Association Rule Strength Measures
Limitations of Support and Confidence
Apriori Algorithm
Apriori Algorithm—Example
Applying Apriori Algorithm
Step 1—Mine All Frequent Item Sets
Algorithm to Find Frequent Item Set
Finding Frequent Item Set—Example
Ordering Items
Candidate Generation
Candidate Generation—Example
Step 2—Generate Rules from Frequent Item Sets
Generate Rules from Frequent Item Sets—Example
Problems with Association Mining
Summary and quizzes
Factor Analysis
a. Definition and examples
b. Factor Analysis
c. Communality
d. Rotation of Factors
e. Implementation
f. Evaluation