
This course is a blend of Data Science and Big Data Analytics.

Pre-requisites to attend:

  • Basic understanding of Java
  • Experience with any programming or scripting language
  • Overview and fundamentals of Hadoop and MapReduce

Total Course Duration:

  • 6 weeks + 4 weeks for the end-to-end data analytics project

Course Objective:

  • To understand the fundamentals of Data Science
  • To know the most widely used statistical techniques in Data Science
  • To achieve a good understanding of Data Analysis and Data Profiling techniques
  • To know the various techniques of Text Analytics and unstructured data analytics
  • To know Machine Learning and its most widely used algorithms
  • To understand how to work with Mahout and build your own algorithms
  • To learn how you can use Big Data to solve real-world analytics problems
  • To enter the world of Data Analytics and Data Science and see how Big Data can be used to drive business value

Course Contents:

Week 1

Introduction to Big Data Analytics and Data Science

  • Big Data Overview
  • Introduction to Big Data Analytics
  • Role of The Data Scientist
  • Typical Big Data Analytics Use Cases
  • What is Possible and what is not?

Introduction to Big Data Analytics and Data Science

  • R and Hadoop
  • Mahout
  • Visualization Tools

Introduction to R and Data Analytics

  • Basics of R
  • Data Exploration using R
  • Data Profiling using R

Offline Practice of Assignments covered in Week 1

Weeks 2 and 3

Big Data Analytics Lifecycle

  • Data Discovery
  • Data Preparation
  • Model Planning and Model Building
  • Visualizing Results
  • Model Validation and Learning Optimization
  • Operationalizing Big Data Insights

Introduction to Statistical Measures and Tests

  • Data Analysis and Statistical Methods
  • Relationship Between Statistics and Probability
  • Descriptive Statistics
  • Inferential Statistics
  • Measures of Central Tendency
  • Measures of Spread or Dispersion
  • Measures of Data Distribution
  • Sampling and Confidence Intervals
  • Hypothesis Testing

Introduction to Statistical Tests

  • Chi-Square Test
  • ANOVA Test
  • Others

Machine Learning Fundamentals

  • Supervised and Unsupervised Learning
  • Vectors and Matrices
  • Dense and Sparse Vectors
  • Sequence Files
  • Similarity Matrix
  • Confusion Matrix
  • Distance Measures
  • Vectorization
  • Vector Encoding
  • SVD (Singular Value Decomposition)
  • Hands-on Exercise to Calculate These from Raw Data

Hadoop and MapReduce Concepts Refresher

  • End-to-End Driver Program for Sequence Files
  • Introduction to Mahout
  • Mahout Utilities and Their Usage

Offline Practice of Assignments covered in Weeks 2 and 3

Week 4

Time Series and Regression

  • Fundamentals of Regression
  • Introduction to Time Series and Forecasting

Fundamentals for Text Analytics

  • Techniques for Text Analytics
  • N-Grams
  • Natural Language Processing (NLP)
  • Vectorizing Textual Data
  • Hands-on Text Analytics

Advanced Analytics

  • K Means Clustering
  • Canopy Clustering
  • Fuzzy K-Means
  • Latent Dirichlet Allocation (LDA)
  • Comparison of Various Clustering Methods
  • Practical Use Cases for Clustering
  • Hands-on Example of Clustering

Recommendation Techniques

  • Introduction to Collaborative Filtering
  • User and Item Similarity
  • Matrix Factorization
  • User Based Recommendation
  • Item Based Recommendation
  • Real Time Recommendation Engine
  • Hands-on Example of Recommendation

Offline Practice of Assignments covered in Week 4

Week 5

Regression Analysis

  • Linear Regression
  • Hands-on Regression Examples

Classification

  • Naive Bayesian Classifier
  • Decision Trees
  • Hands-on Example and Use Case for Classification

Pattern Mining

  • Frequent Pattern Growth
  • Sequential Clustering
  • Hands-on Example and Use Case for Pattern Mining

Time Series Analysis

  • Techniques for Time Series Analysis
  • Hands-on Example and Use Case for Time Series Analysis
  • Role of NoSQL in Time Series Analysis

Offline Practice of Assignments covered in Week 5

Week 6

Data Visualization Techniques

  • Standard Visualizations
  • Big Data Visualization Techniques

The Endgame, or Putting it All Together

  • End-to-End Data Analytics Use Case

End-to-End Data Analytics Project

  • Example Project with Real Data and End-to-End Data Science Exploration
  • 4 Weeks Offline under Periodic Guidance
  • Weekly 1-Hour Meeting to Clarify Questions and Provide Hints

Cared and Crafted by: Velociter
