About the Program
The Python for Data Science training helps you master the core concepts of Python, a powerful and widely used programming language. You will gain hands-on experience with Python packages such as SciPy, NumPy and Matplotlib, along with language features such as lambda functions. You will work on real-world Python projects and apply the language across Big Data, Data Science and Machine Learning.
Learning objectives
By the end of the course, you’ll be able to:
 Explain how Python is used for Data Science
 Apply OOP concepts, expressions, and functions
 Work with SQLite in Python, including its operations and classes
 Create Pig and Hive UDFs in Python
 Deploy Python for MapReduce programming
 Complete real-world Python for Data Science projects
Program Highlights
About The Course
Course Modules
 Introduction to Data Science
 What is Data Science, what does a data scientist do, examples of Data Science across industries, how Python is deployed for Data Science applications, the steps in the Data Science process such as data wrangling, data exploration and model selection, understanding data visualization, what is exploratory data analysis, building a hypothesis, plotting and other techniques.
 Introduction to Python
 Introduction to the Python programming language, important Python features, how Python differs from other programming languages, Python installation, the Anaconda Python distribution for Windows, Linux and Mac, how to run a sample Python script, how a Python IDE works, running some basic Python commands, Python variables, data types, and keywords.
 Hands-on Exercise – Installing the Anaconda Python distribution on Windows, Linux and Mac
 Python basic constructs
 Introduction to basic constructs in Python, understanding indentation with tabs and spaces, code comments with the pound (#) character, names and variables, Python built-in data types such as containers (list, set, tuple and dict), numeric (float, complex, int), text sequence (str), constants (True, False, Ellipsis) and others (classes, instances, modules, exceptions and more), basic operators in Python such as logical, bitwise, assignment, comparison and more, slicing and the slice operator, loop and control statements such as break, if, for, continue, else, range() and more.
 Hands-on Exercise – Write your first Python program, write a Python function (with and without parameters), use a lambda expression, write a class, create a member function and a variable, create an object and write a for loop to print all odd numbers
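 The exercise above could look roughly like the following sketch. All names (greet, add, Counter, odd_numbers) are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
# Illustrative sketch; every name here is an assumption, not course code.

def greet():
    """Function without parameters."""
    return "Hello, Python"


def add(a, b):
    """Function with parameters."""
    return a + b


# Lambda expression: an anonymous single-expression function.
square = lambda x: x * x


class Counter:
    """A class with a member variable and a member function."""

    def __init__(self, start=0):
        self.count = start  # member variable

    def increment(self):  # member function
        self.count += 1
        return self.count


def odd_numbers(limit):
    """Collect the odd numbers from 1 up to and including limit."""
    odds = []
    for n in range(1, limit + 1):
        if n % 2 == 0:
            continue  # skip even numbers
        odds.append(n)
    return odds


counter = Counter()  # create an object
counter.increment()

for n in odd_numbers(9):
    print(n)
```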
 Writing OOP in Python and connecting to the database
 Understanding OOP principles such as encapsulation, inheritance, polymorphism and abstraction, what are access modifiers, instances, class members, classes and objects, function parameters and return types, lambda expressions, connecting to a database to pull data.
 Hands-on Exercise – Writing a Python program that incorporates the OOP concepts and connects to a database to fetch data.
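 A minimal sketch of this exercise, assuming SQLite (through Python's built-in sqlite3 module) as the database. The staff table, the Employee and Manager classes and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3


class Employee:
    """Base class; the underscore marks _salary as internal by convention."""

    def __init__(self, name, salary):
        self.name = name
        self._salary = salary

    def describe(self):
        return f"{self.name} earns {self._salary}"


class Manager(Employee):
    """Inheritance plus polymorphism: describe() is overridden."""

    def describe(self):
        return "Manager " + super().describe()


def load_employees(conn):
    """Pull rows from the (hypothetical) staff table and build objects."""
    rows = conn.execute(
        "SELECT name, salary, role FROM staff ORDER BY name"
    ).fetchall()
    employees = []
    for name, salary, role in rows:
        cls = Manager if role == "manager" else Employee
        employees.append(cls(name, salary))
    return employees


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER, role TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Asha", 50000, "engineer"), ("Ravi", 70000, "manager")],
)

descriptions = [e.describe() for e in load_employees(conn)]
conn.close()

for line in descriptions:
    print(line)
```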
 NumPy for mathematical computing
 Introduction to mathematical computing in Python, what are arrays and matrices, array indexing, array math, the ndarray object, data types, standard deviation, conditional probability in NumPy, correlation, covariance.
 Hands-on Exercise – How to import the NumPy module, creating an array as an ndarray, calculating the standard deviation of an array of numbers and calculating the correlation between two variables.
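 As a rough illustration of this exercise, the sketch below builds two small arrays and computes their standard deviation, correlation and covariance. The variable names and sample values are made up for the example.

```python
import numpy as np

# Hypothetical sample data: hours studied and test scores.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Population standard deviation (ddof=0 is NumPy's default).
std_hours = hours.std()

# np.corrcoef returns a 2x2 correlation matrix; [0, 1] is the
# Pearson correlation between the two variables.
corr = np.corrcoef(hours, scores)[0, 1]

# np.cov returns the sample covariance matrix (ddof=1 by default).
cov = np.cov(hours, scores)[0, 1]

print(std_hours, corr, cov)
```

 Note that std() uses the population formula by default, while np.cov uses the sample formula; pass ddof explicitly when the two need to agree.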
 SciPy for scientific computing
 Introduction to SciPy, how it builds on top of NumPy, the characteristics of SciPy, its sub-packages such as signal, integrate, fftpack, cluster, optimize, stats and more, Bayes' theorem with SciPy.
 Hands-on Exercise – Importing SciPy and applying Bayes' theorem to a given dataset.
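 Bayes' theorem itself is plain arithmetic: P(A|B) = P(B|A) P(A) / P(B). The sketch below uses only standard Python rather than a SciPy call, and the probabilities are invented for illustration (a test that is 95% sensitive, has a 5% false positive rate, and a condition with 1% prevalence).

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(condition | positive test) using Bayes' theorem.

    prior               -- P(condition)
    likelihood          -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    # Total probability of a positive result, P(positive).
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers, not taken from any dataset in the course.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))
```

 Even with a fairly accurate test, the posterior stays well below 50% here because the prior is so small.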
 Data Analysis and Machine Learning (Pandas)
 Introduction to Machine Learning with Python, tools in Python used for Machine Learning such as NumPy, scikit-learn, Pandas, Matplotlib and more, use cases of Machine Learning, process flow of Machine Learning, categories of Machine Learning, understanding Linear Regression and Logistic Regression, what is gradient descent in Machine Learning, introduction to Pandas DataFrames, importing data from JSON, CSV, Excel and SQL databases, converting a NumPy array to a DataFrame, data operations such as selecting, filtering, sorting, viewing, joining and combining, how to handle missing values, time series analysis.
 Data manipulation
 What is a data object and its basic functionalities, using the Pandas library for data manipulation, the NumPy dependency of the Pandas library, loading and handling data with Pandas, how to merge data objects, concatenation and the types of joins on data objects, exploring and analyzing datasets.
 Hands-on Exercise – Doing data manipulation with Pandas by handling tabular datasets that include variable types like float, integer, double and others.
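 The merge, join, concatenation and missing-value steps described above can be sketched as follows. The customers and orders tables and their columns are hypothetical sample data.

```python
import pandas as pd

# Hypothetical tables for illustration only.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "name": ["Asha", "Ravi", "Meera"]}
)
orders = pd.DataFrame(
    {"customer_id": [1, 1, 3, 4], "amount": [250.0, 100.0, None, 75.0]}
)

# Inner join: only customers that have at least one order row.
inner = customers.merge(orders, on="customer_id", how="inner")

# Left join: every customer, with NaN where no order matches.
left = customers.merge(orders, on="customer_id", how="left")
missing_before = int(left["amount"].isna().sum())

# Handle missing values, then aggregate per customer.
left["amount"] = left["amount"].fillna(0.0)
totals = left.groupby("name")["amount"].sum()

# Concatenation: stack a second batch of customers under the first.
new_customers = pd.DataFrame({"customer_id": [4], "name": ["Kiran"]})
all_customers = pd.concat([customers, new_customers], ignore_index=True)

print(totals)
```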
 Data visualization with Matplotlib
 Using Matplotlib for plotting graphs and charts such as scatter, bar, pie, line, histogram and more, the Matplotlib API, subplots and Pandas built-in data visualization.
 Hands-on Exercise – Using Matplotlib to create pie, scatter, line and histogram charts.
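 A minimal sketch of the four chart types on one figure using subplots. The data values are invented, and the Agg backend is selected only so the sketch also runs without a display; in a Jupyter notebook that line can be dropped.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless environment, no GUI window
import matplotlib.pyplot as plt

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].pie([40, 35, 25], labels=["A", "B", "C"], autopct="%1.0f%%")
axes[0, 0].set_title("Pie")

axes[0, 1].scatter(x, y)
axes[0, 1].set_title("Scatter")

axes[1, 0].plot(x, y, marker="o")
axes[1, 0].set_title("Line")

axes[1, 1].hist(y, bins=5)
axes[1, 1].set_title("Histogram")

fig.tight_layout()
# fig.savefig("charts.png") would write the figure to a file.
```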
 Supervised learning
 What is supervised learning, classification, Decision Tree, the algorithm for Decision Tree induction, Confusion Matrix, Random Forest, Naïve Bayes, how Naïve Bayes works, how to implement a Naïve Bayes classifier, Support Vector Machine, how a Support Vector Machine works, what is Hyperparameter Optimization, comparing Random Search with Grid Search, how to implement a Support Vector Machine for classification.
 Hands-on Exercise – Using the Python library scikit-learn to build a Random Forest model for supervised learning.
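 One way this exercise might look, assuming the Iris dataset bundled with scikit-learn as a stand-in for the course dataset (which is not specified here). The split size and hyperparameters are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Assumption: Iris is used only because it ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for testing, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)

# Rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_test, predictions)

print(accuracy)
print(matrix)
```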
 Unsupervised Learning
 Introduction to unsupervised learning, use cases of unsupervised learning, what is K-means clustering, understanding the K-means clustering algorithm, optimal clustering, hierarchical clustering compared with K-means clustering and how hierarchical clustering works, what is natural language processing, working with NLP on text data, setting up the environment using Jupyter Notebook, analyzing sentences, the scikit-learn Machine Learning algorithms, the bag-of-words model, extracting features from text, searching a grid, model training, multiple parameters and building a pipeline.
 Hands-on Exercise – Setting up the Jupyter Notebook environment, loading a dataset in Jupyter, using algorithms in the scikit-learn package to perform Machine Learning techniques and training a model with a grid search.
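 The bag-of-words, pipeline and grid-search steps above can be sketched as follows. The tiny labelled corpus is invented, and Multinomial Naïve Bayes is just one possible classifier choice; the course may use a different model or dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus: 1 = positive review, 0 = negative review.
texts = [
    "great course with clear examples",
    "excellent hands-on projects",
    "really helpful and clear trainer",
    "great projects and helpful support",
    "boring lectures and poor examples",
    "poor support and confusing notes",
    "confusing and boring material",
    "terrible pacing and poor notes",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Pipeline: bag-of-words feature extraction followed by a classifier.
pipeline = Pipeline(
    [("vect", CountVectorizer()), ("clf", MultinomialNB())]
)

# Grid of multiple parameters, addressed as <step name>__<parameter>.
param_grid = {
    "vect__ngram_range": [(1, 1), (1, 2)],
    "clf__alpha": [0.1, 1.0],
}

search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(texts, labels)

prediction = search.predict(["clear and helpful examples"])[0]
print(search.best_params_, prediction)
```

 With so few documents the scores are not meaningful; the point is only how the pipeline and the parameter grid fit together.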
 Web Scraping with Python
 Introduction to web scraping in Python, web scraping libraries, the BeautifulSoup and Scrapy Python packages, installing BeautifulSoup, installing the Python parser lxml, creating a soup object from input HTML, searching the tree, full or partial parsing, and printing output.
 Hands-on Exercise – Installing Beautiful Soup and the lxml Python parser, making a soup object from an input HTML file and navigating the soup tree using Python objects.
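 A minimal sketch of creating a soup object and searching the tree. The HTML snippet is made up, and the built-in "html.parser" is used here so the example runs without extra installs; passing "lxml" instead should work once lxml is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical input HTML; in the exercise this would come from a file.
html = """
<html>
  <head><title>Course Modules</title></head>
  <body>
    <ul>
      <li class="module"><a href="/numpy">NumPy</a></li>
      <li class="module"><a href="/scipy">SciPy</a></li>
      <li class="note">Not a module</li>
    </ul>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# Navigating the tree with attribute access.
title = soup.title.string
first_link_text = soup.ul.li.a.string

# Searching the tree by tag name and CSS class.
module_items = soup.find_all("li", class_="module")
names = [li.a.get_text() for li in module_items]
links = [li.a["href"] for li in module_items]

print(title, names, links)
```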
 Python integration with Hadoop
 Why integrate Python with Hadoop and Spark, the basics of the Hadoop ecosystem, Hadoop Common, the architecture of MapReduce and HDFS, writing Python code for MapReduce jobs on the Hadoop framework, understanding Apache Spark, setting up the Cloudera QuickStart VM, Spark tools, RDDs in Spark, PySpark, integrating PySpark with Jupyter Notebook, introduction to Artificial Intelligence and Deep Learning, deploying Spark code with Python, Spark's Machine Learning library MLlib, using Spark MLlib for classification, clustering and regression.
 Hands-on Exercise – How to write a MapReduce job with Python, connecting to the Hadoop framework and performing the tasks, how to implement Python in a sandbox, working with the HDFS file system.
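 The map, shuffle and reduce phases of a word-count job can be simulated locally in plain Python, as in the sketch below. This is only an illustration of the MapReduce idea and does not connect to Hadoop; on a real cluster the mapper and reducer would typically be separate scripts reading from standard input. The function names and input lines are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Reduce phase: sum the counts for each word.

    Expects pairs sorted by key, which is what the shuffle
    phase guarantees before the reducer runs.
    """
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


# Hypothetical input split.
lines = ["big data with python", "python for data science"]

# Sorting stands in for the shuffle-and-sort step between the phases.
shuffled = sorted(mapper(lines), key=itemgetter(0))
counts = dict(reducer(shuffled))

print(counts)
```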
 Python For Data Science Projects
Course Certificates
SmartBrains is associated with the National Skill Development Corporation (NSDC) as a Training and Certification partner for job-oriented training programs across sectors including Oil & Gas, Power, Renewable Energy, Hydrocarbon, IT & ITeS, Electronics, Telecom, Agriculture and Life Science, offering assessment-based training and certification for a wide range of job profiles.
Who should join?
 Working professionals in domains such as software development, database administration, web development, and software testing and analysis.
 Students (Computer Science Engineering and Diploma in Computer Science Engineering) who want to build a career in software development, databases, web development, or software testing and analysis.
For Corporate queries
A25, Sector 59, Noida (UP), India
Pin Code: 201301
Connect with Program Advisor
training@smartbrains.com
+91 8955560560  +91 9891108700
www.smartbrains.com
Nodal Centers
Noida, Dehradun
Guwahati, Pune