Databyte Academy

About the Program

About The Certified Data Scientist Program

The data scientist role is an offshoot of the statistician role that includes the use of advanced analytics technologies, including machine learning and predictive modeling, to provide insights beyond statistical analysis. The demand for data science skills has grown significantly in recent years as companies look to glean useful information from the voluminous amounts of structured, unstructured and semi-structured data that a large enterprise produces and collects — collectively referred to as big data.

A data scientist uses large amounts of data to develop hypotheses, make inferences and home in on customer, business and market trends. The data scientist must be able to communicate how to use analytics data to drive business decisions that may include changing course, improving a process or product, or creating new services or products.

Course Objective

This is a complete course that gives you a detailed understanding of data science. It covers everything from basic statistical concepts to advanced analytics and predictive modeling techniques, along with the project life cycle, data acquisition, analysis, statistical methods and machine learning.

The objective of the course is to learn statistical analysis techniques and tools for solving business problems, helping you emerge as an ‘Industry Ready’ professional in the field of Data Science.

You will learn Data Science skills using R and Python, two of the most popular analytical tools used across industries.

Upon completion of the Data Science course, you will have acquired the skills to:

  1. Master key facets of data investigation, including data wrangling, cleaning, sampling, management, exploratory analysis, regression and classification, prediction, and data communication.
  2. Implement foundational concepts of data computation, such as data structure, algorithms, parallel computing, simulation, and analysis.
  3. Leverage your knowledge of key subject areas, such as statistical quality control, exponential smoothing, seasonally adjusted trend analysis, or data visualization.

Who should do this course?

  1. Graduates / Professionals from quantitative backgrounds such as Engineering, Finance, Mathematics, Statistics or Business Management who aspire to advance their careers in Data Analytics
  2. Anyone with basic knowledge of data analysis & business problems
  3. Analytics consultants
  4. IT/Software Professionals

Who are the trainers?

Our trainers are highly qualified industry experts and certified instructors with more than 10 years of global analytical experience.

Prerequisites

Prior knowledge of basic programming / statistics is recommended for this course.


Databyte Academy Instructors

Learn from practitioners, not from trainers.

SUMEET BANSAL

CEO & Co-Founder of AnalytixLabs

Sumeet is a former Business Consultant who has worked with prestigious companies such as McKinsey & Company, ZS Associates and AbsolutData over the past 7 years. He has worked in more than 10 countries and is an expert in Market Research and Business Analytics. He co-founded AnalytixLabs in 2011 and has held the lead role since 2014. He has worked extensively in domains such as Market Research, Brand Positioning, Building Go-to-Market Strategies, Customer Lifecycle Management and Pricing for clients in sectors such as Automotive, Banking, Hi-tech, Pharma and Telecom.


CHANDRA MOULI

Chief Data Scientist

Chandra has rich experience in Marketing and Risk Analytics, gained during his tenure with companies such as McKinsey & Company, Genpact and the erstwhile Citigroup Services. He has 8+ years of experience and has worked in Analytics since 2006, after completing his education at IIT-Madras. With his analytical rigour, he has helped various Banking & Financial Services, Media and Retail clients with pre- and post-campaign management, credit risk, attrition modelling, and cross-sell and up-sell programs. He was also recently featured among the top 10 data scientists in India by Analytics India Magazine.


ANKITA GUPTA

Principal Consultant

Ankita has more than 8 years of experience with McKinsey, where she focused on Market Research and Behavioural Analytics, working with global clients in domains such as Hi-Tech, Telecom, Banking and Retail. After McKinsey, she joined Fidelity as Senior Specialist to lead the NPS analytics effort for their various businesses, in addition to exploring possible B2C analytics. Ankita is an expert in quantitative marketing research, including conjoint analysis, attitudinal segmentation, pricing and brand equity, along with measuring consumer loyalty. She holds an MBA from ISB, Hyderabad.

Exam & Certification

The certification is provided by Databyte Academy.

Upon successful completion of the program, students will be conferred with dual certification:

  • Certificate of Completion
  • CERTIFIED DATA SCIENTIST*

In order to be “Certified” as part of the course, students need to complete the assignments and examination. Once all your assignments are submitted and evaluated, the certificate shall be awarded.

New Intake – 21st August 2017


Certified Data Scientist

Course ID – CDS
Duration – 80 Hours
Classes – 10 Days
Tools – R & Python

Learning Mode – Instructor-Led Classroom Training
Next Batch – Full Time (21st August 2017) & Weekend (19th August 2017)

Course Outcome

The ability to use advanced analytics techniques and tools to improve business performance across many functions, managing data with tools such as R and Python, and to work with various forms of structured and unstructured real-time data sets to solve business problems using advanced statistical techniques and machine learning algorithms.

Course Content

The field of data analysis, as the name implies, analyses data to discover trends. It has tremendous uses not only in the economics and financial sectors but also in fields such as law, healthcare, public administration, politics, telecom, social media, manufacturing, and banking & financial institutions, all of which rely on quality data analysis to arrive at strategic business decisions. Working professionals can improve their resumes and job prospects by earning a certificate in data analytics.

Introduction to Data Science

  1. What is Data Science?
  2. Analytics vs. Data warehousing, OLAP, MIS Reporting
  3. Relevance in industry and need of the hour
  4. Types of problems and business objectives in various industries (Regression, classification, segmentation, forecasting, optimization, etc.)
  5. How leading companies are harnessing the power of analytics
  6. Critical success drivers
  7. Different phases of a typical Analytics project
  8. Understanding Heuristic vs. statistical models/analysis
  9. Understanding classical techniques vs. machine learning techniques
  10. Latest Trends in data science
  11. Opportunities with data science

R: Introduction to the R Environment

  1. The Workspace
  2. Input / Output
  3. Useful Packages (Base & other packages) in R
  4. Graphical User Interfaces (RStudio)
  5. Customizing Startup
  6. Batch Processing and reusing Results

R: Data Input & Output (Importing & Exporting)

  1. Data Structures & Data Types (Vectors, Matrices, Factors, Data frames, and Lists)
  2. Importing Data from various sources
  3. Database Input (Connecting to database)
  4. Exporting Data to various formats
  5. Viewing Data (Viewing partial data and full data)
  6. Variable & Value Labels – Date Values

R: Data Manipulation

  1. Introduction to MS Access
  2. What is SQL – A Quick Introduction
  3. Getting started
  4. Understanding basic RDBMS concepts
  5. Data manipulation – Reading & Manipulating a Single Table
  6. Data based objects creation (DDL Commands)
  7. Optimizing your work
  8. Data manipulation – Case Study-1
  9. Data manipulation – Case Study-2

Python: Accessing/Importing and Exporting Data

  1. Overview of Python- Starting Python
  2. Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
  3. Database Input (Connecting to database)
  4. Viewing Data objects – subsetting, methods
  5. Exporting Data to various formats

Python: Data Manipulation – cleansing

  1. Cleansing Data with Python
  2. Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, data type conversions, renaming, formatting, etc.)
  3. Data manipulation tools (Operators, Functions, Packages, control structures)
  4. Creating Graphs
  5. Histograms & Density Plot
  6. Dot Plots – Bar Plots – Line Charts – Pie Charts – Boxplots – Scatterplots

Python – Introduction & Essentials

  1. Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
  2. Custom Environment Settings
  3. Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
  4. Installing & loading Packages & Namespaces
  5. Data Types & Data objects/structures (Tuples, Lists, Dictionaries)
  6. List and Dictionary Comprehensions
  7. Variable & Value Labels – Date & Time Values
  8. Basic Operations – Mathematical – string – date
  9. Reading and writing data
  10. Simple plotting/Control flow/Debugging/Code profiling
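
As a small taste of the data structures and comprehensions listed above, here is a minimal sketch in Python. The participant names, scores and pass mark are made up for illustration and are not course material.

```python
# Hypothetical sample data, invented for illustration only.
scores = [("asha", 72), ("ben", 88), ("chen", 95), ("dita", 64)]
PASS_MARK = 70  # assumed threshold, not defined by the course

# List comprehension: names of participants at or above the pass mark.
passed = [name for name, score in scores if score >= PASS_MARK]

# Dictionary comprehension: map each name to a pass/fail flag.
result_by_name = {name: score >= PASS_MARK for name, score in scores}

print(passed)
print(result_by_name)
```

Both comprehensions build a new collection in a single expression, which is usually easier to read than the equivalent loop with repeated append or assignment calls.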

R & Python: Basic Statistics (Exploratory Analysis)

  1. Univariate Analysis
  2. Bi-Variate Analysis (correlation, association, etc.)
  3. Descriptive Statistics (central tendency/variance)
  4. Frequency Tables / Summarization
  5. Exploratory Analysis
  6. Probability distributions
  7. Sampling – Central Limit Theorem
  8. Inferential statistics – Hypothesis testing
  9. Statistical tests (t/z-test, ANOVA, chi-square etc)
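
To give a flavour of the descriptive statistics topic above, the following sketch computes measures of central tendency and spread using only Python's standard `statistics` module. The `describe` helper and the sample values are illustrative assumptions, not part of the course material.

```python
import statistics


def describe(values):
    """Return basic measures of central tendency and spread."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Population variance and standard deviation (divide by n).
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
    }


# Hypothetical observations, invented for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
print(summary)
```

The population formulas are used here for simplicity. When the data are a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` (which divide by n - 1) are generally the better choice.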

R & Python: Data Prep & Reduction techniques

  1. Data Audit report creation and understanding
  2. Need for data preparation
  3. Binning, Dummy and Derived variable creation (Loops, arrays, etc.)
  4. Python Built-in Functions (Text, numeric, date, utility functions)
  5. User Defined Functions in Python
  6. Stripping out extraneous information
  7. Normalizing data and Formatting data
  8. Important Python Packages for data manipulation (Pandas, NumPy, etc.)

Python - Data Analysis – Visualization

  1. Introduction to exploratory data analysis
  2. Descriptive statistics, Frequency Tables and summarization
  3. Univariate Analysis (Distribution of data & Graphical Analysis)
  4. Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  5. Creating Graphs (Bar/pie/line chart/histogram/boxplot/scatter/density, etc.)
  6. Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, Pandas, scipy.stats, etc.)

R & Python: Regression Modelling

  1. Basics of regression analysis
  2. Approach: Model Estimation, OLS, MLE & Error Function for finding parameters, Assumptions verification (Linearity, Normality, multicollinearity, outliers, etc.)
  3. Linear regression Model fitting
  4. Logistic regression Model fitting
  5. Measures of goodness of fit (R^2, Adj R^2, Concordance, Gini, KS, Lift, etc.)
  6. Model Diagnostics – Residual Analysis – Decile Analysis – ROC Curves, etc.
  7. Interpretation of results

R & Python: Time Series Forecasting

  1. Time Series Introduction / Regression on Time, Time Series components
  2. Modelling Seasonality as Deviation
  3. Basic methods (pattern & pattern-less)
  4. Averages (MA, WMA, CMA etc)
  5. Standardization, Normalization
  6. Outlier treatment
  7. Missing values treatment (MI, clustering, regression based)
  8. Dimension reduction – Factor Analysis – PCA

R & Python: Customer Segmentation

  1. Basics of clustering
  2. Heuristic segmentation (RFM, Life stage, value based, etc.)
  3. Cluster analysis (K-means and Hierarchical)
  4. Objective & subjective segmentation
  5. Decision Trees (CHAID/CART/C5.0)
  6. Cluster evaluation and profiling
  7. Interpretation and application
  8. Smoothing Techniques (Exponential)
  9. Advanced Methods (ARIMA, etc.)

R & Python: Introduction to Machine Learning

  1. Major Classes of Learning Algorithms – Supervised vs Unsupervised Learning
  2. Different Phases of Predictive Modelling
  3. Concept of Overfitting and Underfitting (Bias-Variance Trade-off) & Performance Metrics
  4. Types of Cross validation (Train & Test, Bootstrapping, K-Fold validation, etc.)
  5. Cost & optimization functions
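
The K-Fold validation idea listed above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the `k_fold_indices` helper is hypothetical, it does not shuffle the data, and in practice a library routine such as scikit-learn's `KFold` would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Example: 10 observations split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each observation appears in exactly one test fold, so every data point is used for validation once and for training in the remaining folds. A model would be fitted on each `train` subset and scored on the matching `test` subset, with the scores averaged at the end.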

R & Python: Machine Learning in Practice

  1. Ensemble Learning (Random Forest, Bagging & Boosting)
  2. Artificial Neural Networks (ANN)
  3. Support Vector Machines (SVM)
  4. KNN, Naïve Bayes
  5. Text Mining & NLP


Get Ahead with Databyte’s Certificate

Earn your Certificate

Our Certified Data Scientist Program is exhaustive and this certificate is proof that you have taken a big leap in mastering the domain.

Differentiate yourself with a Certified Data Scientist Program

The knowledge and skills you’ve gained working on projects, simulations and case studies will set you ahead of the competition.

Share your achievement

Talk about it on LinkedIn, Twitter or Facebook, boost your resume or frame it – tell your friends and colleagues about it.
