About the Program
The Python for Data Science training covers the core concepts of Python, a widely used and powerful programming language. You will gain hands-on experience with Python packages such as SciPy, NumPy and Matplotlib, along with language features such as lambda functions. You will work on real-world Python projects and apply the language to Big Data, Data Science and Machine Learning.
Learning objectives
By the end of the course, you’ll be able to:
- Explain how Python is used for Data Science
- Apply OOP concepts, expressions, and functions
- Use SQLite in Python, including its operations and classes
- Create Pig and Hive UDFs in Python
- Deploy Python for MapReduce programming
- Complete real-world Python for Data Science projects
Course Modules
- Introduction to Data Science
- What is Data Science, what does a data scientist do, various examples of Data Science in the industries and how Python is deployed for Data Science applications, various steps in Data Science process like data wrangling, data exploration and selecting the model, understanding data visualization, what is exploratory data analysis and building of hypothesis, plotting and other techniques.
- Introduction to Python
- Introduction to Python programming language, important Python features, how is Python different from other programming languages, Python installation, Anaconda Python distribution for Windows, Linux and Mac, how to run a sample Python script, Python IDE working mechanism, running some Python basic commands, Python variables, data types, and keywords.
- Hands-on Exercise – Installing the Anaconda Python distribution on Windows, Linux and Mac
- Python basic constructs
- Introduction to the basic constructs in Python, understanding indentation with tabs and spaces, code comments with the pound (#) character, names and variables, Python built-in data types like containers (list, set, tuple and dict), numeric (float, complex, int), text sequence (str), constants (True, False, Ellipsis) and others (classes, instances, modules, exceptions and more), basic operators in Python like logical, bitwise, assignment, comparison and more, slicing and the slice operator, loop and control statements like break, if, for, continue, else, range() and more.
- Hands-on Exercise – Write your first Python program, write a Python function (with and without parameters), use Lambda expression, write a class, create a member function and a variable, create an object and write a for loop to print all odd numbers
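The constructs named in this exercise can be sketched in a few lines. This is a minimal illustration rather than course material; the names `greet`, `add`, `square`, `Counter` and `odd_numbers` are hypothetical.

```python
def greet():
    """Function without parameters."""
    return "Hello, Python"


def add(a, b):
    """Function with parameters."""
    return a + b


# Lambda expression: an anonymous, single-expression function.
square = lambda x: x * x


class Counter:
    """Class with a member variable and a member function."""

    def __init__(self, start=0):
        self.value = start  # member variable

    def increment(self):  # member function
        self.value += 1
        return self.value


def odd_numbers(limit):
    """Collect the odd numbers from 1 up to and including limit."""
    odds = []
    for n in range(1, limit + 1):
        if n % 2 == 1:
            odds.append(n)
    return odds


counter = Counter()        # create an object
counter.increment()
for n in odd_numbers(10):  # for loop that prints the odd numbers
    print(n)
```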
- Writing OOP in Python and connecting to the database
- Understanding OOP principles like encapsulation, inheritance, polymorphism and abstraction, what are access modifiers, instances, class members, classes and objects, function parameters and return types, Lambda expressions, connecting to a database to pull data.
- Hands-on Exercise – Writing a Python program and incorporating the OOP concepts and connecting to a database for getting the data.
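One way to combine a class with a database connection is sketched below, assuming SQLite through the standard-library `sqlite3` module and an in-memory database. The `StudentStore` class, the `students` table and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3


class StudentStore:
    """Hypothetical class that encapsulates a SQLite connection."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS students (name TEXT, score REAL)"
        )

    def add(self, name, score):
        # Parameterized query: values are bound, not formatted into the SQL.
        self.conn.execute("INSERT INTO students VALUES (?, ?)", (name, score))
        self.conn.commit()

    def top(self, min_score):
        """Return (name, score) rows at or above min_score, best first."""
        cur = self.conn.execute(
            "SELECT name, score FROM students "
            "WHERE score >= ? ORDER BY score DESC",
            (min_score,),
        )
        return cur.fetchall()

    def close(self):
        self.conn.close()


store = StudentStore()
store.add("Asha", 91.5)
store.add("Ravi", 78.0)
rows = store.top(80)
print(rows)
store.close()
```

Callers only see `add` and `top`; the connection and the SQL stay inside the class, which is the encapsulation idea from the module description.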
- NumPy for mathematical computing
- Introduction to mathematical computing in Python, what are arrays and matrices, array indexing, array math, ND-array object, datatypes, standard deviation, conditional probability in NumPy, correlation, covariance.
- Hands-on Exercise – How to import the NumPy module, creating an array as an ndarray, calculating the standard deviation of an array of numbers and calculating the correlation between two variables.
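The exercise above might look roughly like this. The arrays `hours` and `scores` are made-up sample data, and the sketch assumes NumPy is installed.

```python
import numpy as np

# Two hypothetical variables stored as ndarrays.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# np.std defaults to the population standard deviation (ddof=0);
# pass ddof=1 for the sample standard deviation.
std_population = np.std(hours)
std_sample = np.std(hours, ddof=1)

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between the two variables.
correlation = np.corrcoef(hours, scores)[0, 1]

print(std_population, std_sample, correlation)
```

Because `scores` is an exact multiple of `hours` here, the correlation comes out as 1 up to floating-point rounding.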
- SciPy for scientific computing
- Introduction to SciPy, building on top of NumPy, what are the characteristics of SciPy, various sub packages for SciPy like Signal, Integrate, Fftpack, Cluster, Optimize, Stats and more, Bayes Theorem with SciPy.
- Hands-on Exercise – Importing SciPy and applying Bayes' theorem to a given dataset.
- Data Analysis and Machine Learning (Pandas)
- Introduction to Machine Learning with Python, various tools in Python used for Machine Learning like NumPy, Scikit-Learn, Pandas, Matplotlib and more, use cases of Machine Learning, process flow of Machine Learning, various categories of Machine Learning, understanding Linear Regression and Logistic Regression, what is gradient descent in Machine Learning, introduction to Python DataFrames, importing data from JSON, CSV, Excel, SQL database, NumPy array to DataFrame, various data operations like selecting, filtering, sorting, viewing, joining and combining, how to handle missing values, time series analysis.
- Data manipulation
- What is a data object and its basic functionalities, using the Pandas library for data manipulation, the NumPy dependency of the Pandas library, loading and handling data with Pandas, how to merge data objects, concatenation and various types of joins on data objects, exploring and analyzing datasets.
- Hands-on Exercise – Doing data manipulation with Pandas by handling tabular datasets that include variable types like float, integer, double and others.
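A small sketch of the merge, join and missing-value handling described in this module, assuming Pandas is installed. The `customers` and `orders` tables are invented sample data.

```python
import pandas as pd

# Hypothetical tabular data with integer, string and float columns.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "name": ["Asha", "Ravi", "Meera"]}
)
orders = pd.DataFrame(
    {"customer_id": [1, 1, 3], "amount": [250.0, 100.5, 75.0]}
)

# Left join: keep every customer, including those without orders.
merged = customers.merge(orders, on="customer_id", how="left")

# Customers with no matching order get NaN in "amount"; fill with 0.0.
merged["amount"] = merged["amount"].fillna(0.0)

# Total order amount per customer.
totals = merged.groupby("name", as_index=False)["amount"].sum()

print(merged.dtypes)
print(totals)
```

Changing `how="left"` to `"inner"`, `"right"` or `"outer"` gives the other join types mentioned above.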
- Data visualization with Matplotlib
- Using Matplotlib for plotting graphs and charts like Scatter, Bar, Pie, Line, Histogram and more, Matplotlib API, Subplots and Pandas built-in data visualization.
- Hands-on Exercise – Using Matplotlib to create pie, scatter, line and histogram charts.
- Supervised learning
- What is supervised learning, classification, Decision Tree, algorithm for Decision Tree induction, Confusion Matrix, Random Forest, Naïve Bayes, working of Naïve Bayes, how to implement a Naïve Bayes Classifier, Support Vector Machine, working process of the Support Vector Machine, what is Hyperparameter Optimization, comparing Random Search with Grid Search, how to implement a Support Vector Machine for classification.
- Hands-on Exercise – Using the Python library Scikit-Learn to build a Random Forest model for supervised learning.
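A Random Forest classifier in Scikit-Learn could be trained along these lines. The sketch assumes Scikit-Learn is installed and uses its bundled Iris dataset as a stand-in, since the brochure does not name a dataset; the split size and `random_state` values are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Stand-in dataset: 150 labeled flower measurements, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# An ensemble of 100 decision trees.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
matrix = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted
accuracy = accuracy_score(y_test, y_pred)

print(matrix)
print(accuracy)
```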
- Unsupervised Learning
- Introduction to unsupervised learning, use cases of unsupervised learning, what is K-means clustering, understanding the K-means clustering algorithm, optimal clustering, hierarchical clustering compared with K-means clustering and how hierarchical clustering works, what is natural language processing, working with NLP on text data, setting up the environment using Jupyter Notebook, analyzing sentences, the Scikit-Learn Machine Learning algorithms, the bag-of-words model, extracting features from text, searching a grid, model training, multiple parameters and building a pipeline.
- Hands-on Exercise – Setting up the Jupyter Notebook environment, loading a dataset in Jupyter, applying algorithms from the Scikit-Learn package for Machine Learning and training a model with a grid search.
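The bag-of-words, pipeline and grid-search topics fit together roughly as follows. This is a sketch under stated assumptions: Scikit-Learn is installed, the eight labeled sentences are invented toy data, and Multinomial Naïve Bayes is an arbitrary choice of classifier, not one the brochure specifies.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy labeled sentences: 1 = positive, 0 = negative.
texts = [
    "great course with clear examples",
    "excellent trainer and great projects",
    "really good hands-on exercises",
    "clear and useful content",
    "poor audio and boring slides",
    "bad examples and slow pace",
    "boring lectures with poor support",
    "slow and confusing content",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Pipeline: bag-of-words feature extraction, then a classifier.
pipeline = Pipeline([
    ("vect", CountVectorizer()),
    ("clf", MultinomialNB()),
])

# Grid over multiple parameters; "<step>__<param>" addresses a pipeline step.
param_grid = {
    "vect__ngram_range": [(1, 1), (1, 2)],
    "clf__alpha": [0.1, 1.0],
}

search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(texts, labels)

print(search.best_params_)
prediction = search.predict(["great examples"])
print(prediction)
```

The grid search refits the best parameter combination on all the data, so `search` can be used directly for prediction afterwards.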
- Web Scraping with Python
- Introduction to web scraping in Python, various web scraping libraries, the BeautifulSoup and Scrapy Python packages, installing BeautifulSoup, installing the Python parser lxml, creating a soup object with input HTML, full or partial parsing, searching the tree and printing the output.
- Hands-on Exercise – Installing BeautifulSoup and the lxml Python parser, making a soup object from an input HTML file and navigating the soup tree using Python objects.
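Creating a soup object and searching the tree might look like the sketch below. It assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser` so it runs without extra dependencies; passing `"lxml"` instead selects the lxml parser when that package is available. The HTML snippet is invented.

```python
from bs4 import BeautifulSoup

# Hypothetical input HTML.
html = """
<html><body>
  <h1>Course Modules</h1>
  <ul>
    <li class="module"><a href="/numpy">NumPy</a></li>
    <li class="module"><a href="/scipy">SciPy</a></li>
  </ul>
</body></html>
"""

# Create the soup object from the input HTML.
soup = BeautifulSoup(html, "html.parser")

# Navigate with attribute access: the first <h1> in the tree.
title = soup.h1.get_text()

# Search the tree: every <a> tag, with its text and href attribute.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

# Search with a CSS selector.
modules = [li.get_text(strip=True) for li in soup.select("li.module")]

print(title, links, modules)
```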
- Python integration with Hadoop
- What is the need for integrating Python with Hadoop and Spark, the basics of the Hadoop ecosystem, Hadoop Common, the architecture of MapReduce and HDFS and deploying Python coding for MapReduce jobs on Hadoop framework, understanding Apache Spark, setting up Cloudera QuickStart VM, Spark tools, RDD in Spark, PySpark, integrating PySpark with Jupyter Notebook, introduction to Artificial Intelligence and Deep Learning, deploying Spark code with Python, the Machine Learning library of Spark MLlib, deploying Spark MLlib for classification, clustering and regression.
- Hands-on Exercise – How to write a MapReduce job with Python, connecting to the Hadoop framework and performing the tasks, how to implement Python in a sandbox, working with the HDFS file system.
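The map and reduce steps of a word-count job can be sketched in plain Python. This is a local simulation, not a Hadoop job: with Hadoop Streaming the mapper and reducer would typically be separate scripts that read lines from standard input and write tab-separated key/value lines to standard output. The function names `mapper`, `reducer` and `run_local` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce step: sum the counts per word. Expects pairs sorted by word."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate the shuffle-and-sort phase that sits between map and reduce."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


counts = run_local(["Python on Hadoop", "python on Spark"])
print(counts)
```

Sorting by key before the reduce step matters: `groupby` only groups adjacent items, which mirrors the guarantee that a reducer sees all values for one key together.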
- Python For Data Science Projects
Course Certificates
SmartBrains is associated with the National Skill Development Corporation (NSDC) as a Training and Certification partner for job-oriented training programs across sectors including Oil & Gas, Power, Renewable Energy, Hydrocarbon, IT & ITs, Electronics, Telecom, Agriculture and Life Science, offering assessment-based training and certification for a wide range of job profiles.
Who should join?
- Working professionals in software development, database, administration, web development, and software testing and analysis.
- Students (Computer Science Engineering and Diploma in Computer Science Engineering) who want to build a career in software development, databases, web development, or software testing and analysis.
For Corporate queries
A-25, Sector-59, Noida (UP), India
Pin Code-201301
Connect with Program Advisor
training@smartbrains.com
+91 8955560560 | +91 9891108700
www.smartbrains.com
Nodal Centers
Noida, Dehradun
Guwahati, Pune
Course Features
- Lecture 0
- Quiz 0
- Duration 50 hours
- Skill level All levels
- Language English
- Students 0
- Assessments Yes