Track
Certificate
Preparation for Data Science
4.5+
★★★★★
11 reviews
Intermediate
The track curriculum brings together core courses that build the foundational knowledge and skills needed for a career in data science. The courses cover the key concepts, tools, and methods of data analysis and modeling.
python
Boost your Tech Skills!
with up to 55% off
What you'll get with our subscription:
- Access to 85+ top-rated courses
- AI-driven Learning
- Workspaces for practicing your skills
- Personalized study tracks
- Certificates of completion
Training 2 or more people?
Get your team access to Codefinity courses anytime, anywhere.
Try Codefinity Teams
Trusted by employees of leading companies
Learning track content
Module 2 / NumPy in a Nutshell
In this section, we will get acquainted with the NumPy library and learn how to create an array.
In this section, we will get acquainted with arrays of different dimensions and understand the differences between them.
In this section, we will recall what slices are and learn how to take them from arrays of different dimensions. We will also learn to access array elements by their indices.
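As a rough illustration of indexing and slicing, here is a minimal sketch. The arrays `vector` and `matrix` and their values are made up for this example and are not taken from the course.

```python
import numpy as np

# Hypothetical 1-D and 2-D arrays
vector = np.array([10, 20, 30, 40, 50])
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Indexing: one element by position (negative indices count from the end)
first = vector[0]
last = vector[-1]
center = matrix[1, 1]        # row 1, column 1

# Slicing: start:stop, where stop is exclusive
middle = vector[1:4]         # elements at positions 1, 2, 3
top_right = matrix[:2, 1:]   # rows 0-1, columns 1-2
```

In the 2-D case, the row slice and the column slice are separated by a comma inside a single pair of brackets.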
In this section, we will learn how to reshape, concatenate, and sort arrays. We will also learn about copy(), a method that is often used with arrays.
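The operations named above could look roughly like this. It is a sketch with invented sample values, not the course's own code.

```python
import numpy as np

values = np.array([6, 2, 5, 1, 4, 3])   # hypothetical data

# reshape: the same six values arranged as 2 rows x 3 columns
grid = values.reshape(2, 3)

# concatenate: join arrays end to end
joined = np.concatenate([values, np.array([9, 8])])

# sort: np.sort returns a sorted copy and leaves the input unchanged
ordered = np.sort(values)

# copy(): an independent array, so edits do not leak back to the original
backup = values.copy()
backup[0] = 100
```

After the last line, `values[0]` is still 6, because `backup` owns its own data.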
Module 3 / Getting into NumPy Basics
In this project, we will delve into the fundamentals of NumPy, exploring its core features and uncovering the reasons behind its significant impact on scientific computing.
Module 4 / Pandas First Steps
In this section, we'll explore the fundamentals of Series and DataFrame structures. You'll also learn about the distinctions between these two types of structures.
- What is pandas?
- Series
- Challenge: Creating a Series
- DataFrame
- Quiz: Creating a Series
- Quiz: Creating a DataFrame
- Adding a New Column
- Inserting a New Column
- Deleting a Row/Column
- Quiz: Matching the Functions
- Working with Columns
- Quiz: Extracting Columns
- iloc Basics
- Challenge: Using iloc
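To make the Series/DataFrame distinction and the column operations listed above concrete, here is a small sketch. The fruit data and column names are invented for illustration.

```python
import pandas as pd

# A Series is a single labeled column of values
prices = pd.Series([1.5, 0.5, 2.0], name="price")

# A DataFrame is a table made of several labeled columns
df = pd.DataFrame({"fruit": ["apple", "banana", "cherry"],
                   "price": [1.5, 0.5, 2.0]})

# Add a column at the end, then insert one at a specific position
df["quantity"] = [10, 20, 30]
df.insert(1, "color", ["red", "yellow", "red"])

# Delete a column and a row (drop returns a new DataFrame)
trimmed = df.drop(columns="color").drop(index=0)

# iloc selects by integer position: row 0, column 0
first_cell = df.iloc[0, 0]
```

Note that `insert` modifies `df` in place, while `drop` leaves `df` untouched and returns the reduced table.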
Data can be sourced in various formats, such as CSV, JSON, SQL, HTML, and more. With Pandas, you're not limited to a single format — you can work with data across a multitude of file types. In this chapter, we'll specifically focus on the CSV and TXT formats.
Here, you'll learn how to process raw data by removing extraneous information and managing null values in a dataset.
- Viewing the Data
- Quiz: Using Head
- Quiz: Head, Tail, and Sample
- Exploring the Dataset
- Column Names and Data Types
- Finding Null Values
- Quiz: Identifying Null Values
- Challenge: Dropping Null Values
- Challenge: Filling Null Values
- Quiz: Null Values
- Describing the Data
- max() and min()
- Quiz: Statistical Operations
- sum() and count()
- Unique Values
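A typical pass over null values and summary statistics might look like the sketch below. The tiny `city`/`temp` table is hypothetical and stands in for a real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column
df = pd.DataFrame({"city": ["Oslo", "Rome", None, "Lima"],
                   "temp": [4.0, np.nan, 18.0, 22.0]})

# Count missing values per column
missing_per_column = df.isna().sum()

# Option 1: drop every row that contains a missing value
dropped = df.dropna()

# Option 2: fill missing values, here with a placeholder and the column mean
filled = df.fillna({"city": "unknown", "temp": df["temp"].mean()})

# Basic statistics (missing values are skipped)
summary = df["temp"].describe()
hottest = df["temp"].max()
n_cities = df["city"].nunique()
```

Whether to drop or fill depends on how much data would be lost and on whether a fill value is meaningful for the column.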
Module 5 / Advanced Techniques in pandas
This section will teach you how to select specific columns by name or index. You will also get acquainted with the ways to select rows by index.
Here, you will learn how to extract data that meets specific conditions. You will also learn how to combine conditions and even define your own.
In this section, you will expand your knowledge on setting different data conditions. You will learn to check if your data is in a defined list of values or between two values. You will also learn how to find the largest and smallest values.
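The checks described here map onto a few pandas methods. The following is a sketch on an invented staff table, not the course dataset.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({"name": ["Ann", "Bob", "Cid", "Dee"],
                   "dept": ["IT", "HR", "IT", "Ops"],
                   "age": [25, 41, 33, 58]})

# Rows whose value is in a defined list
in_list = df[df["dept"].isin(["IT", "Ops"])]

# Rows whose value lies between two bounds (both ends inclusive by default)
in_range = df[df["age"].between(30, 45)]

# Rows with the largest and smallest values in a column
oldest_two = df.nlargest(2, "age")
youngest = df.nsmallest(1, "age")
```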
This section is one of the most fascinating in the course. Here, you will learn how to group data in different ways, which helps a data analyst extract information about specific groups of data.
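As one possible illustration of grouping, the sketch below aggregates an invented sales table by region.

```python
import pandas as pd

# Hypothetical data
sales = pd.DataFrame({"region": ["North", "South", "North", "South", "North"],
                      "amount": [100, 80, 150, 120, 50]})

# One aggregate per group
total_by_region = sales.groupby("region")["amount"].sum()

# Several aggregates per group at once
stats_by_region = sales.groupby("region")["amount"].agg(["mean", "count"])
```

The result is indexed by the grouping column, so `total_by_region["North"]` returns the total for that group.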
This section is one of the most significant for a data analyst: data that contains missing values or values in an incorrect format is impossible to work with. Here, you will learn how to deal with such values.
- Checking for Missing Values
- Calculating the Number of Missing Values
- What Will We Do With the NaN Values?
- How to Delete Only NaN Values?
- Filling In the Missing Values
- Managing Categorical Variables
- Checking the Column Type
- Managing an Incorrect Column
- Renaming the Column
Module 6 / Unveiling the Power of Data Manipulation with Pandas
In this project, we are going to understand what Pandas is and why it is so powerful.
Module 7 / Mathematics for Data Analysis and Modeling
Let's start with some basic definitions and concepts we'll use later. We will look at functions, numerical sequences and their sums, and what the basis of a coordinate system is.
The simplest and most commonly used type of relationship is the linear relationship. Linear algebra is a branch of higher mathematics entirely devoted to linear functions and linear spaces. Let's look at some of the most important topics in linear algebra: vectors, matrices, solving linear equations, and solving the spectral problem for matrices.
- Numerical Operations on Vectors and Matrices
- Challenge: Calculate the Matrix Multiplication Result
- Matrix Determinant
- Scaling Factor of the Linear Transformation
- Challenge: Figures' Linear Transformations
- Inversed and Transposed Matrices
- System of Linear Equations
- Challenge: Solving the Task Using SLE
- Eigenvalues and Eigenvectors
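The linear algebra topics listed above can be tried out with NumPy's `linalg` module. This is a sketch on a hypothetical 2x2 matrix `A` and right-hand side `b`.

```python
import numpy as np

# Hypothetical matrix and right-hand side
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

product = A @ A                  # matrix multiplication
det = np.linalg.det(A)           # determinant: 2*3 - 1*1 = 5
inverse = np.linalg.inv(A)       # inverse matrix
transposed = A.T                 # transpose

# Solve the system of linear equations A x = b
x = np.linalg.solve(A, b)

# Eigenvalues and eigenvectors (eigenvectors are the columns of the result)
eigenvalues, eigenvectors = np.linalg.eig(A)
```

The absolute value of the determinant is the factor by which the linear transformation `A` scales areas.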
Mathematical analysis is a discipline that allows you to analyze functions according to various criteria. Consider how to check numerical sequences for convergence, find the maximum/minimum values of functions, solve nonlinear equations, and use integrals to solve applied problems.
Module 8 / Probability Theory Basics
We will start learning probability theory with some basic definitions and rules: what a stochastic experiment and a random event are, what independence and incompatibility of events mean in the context of probability theory, what probability is, and how to calculate the probabilities of different elementary events.
In real-life tasks, we often have to deal with complex relationships and, as a result, calculate probabilities of several events or events that depend on each other. Let's consider how we can do this using probability theory.
To solve many real problems in probability theory, special models have been created that describe particular situations. Let's consider some of the most widely used models for describing discrete results of stochastic experiments.
What if the result of a stochastic experiment cannot be described by a discrete value? For this, models that work with continuous values are used. Consider the most popular of these models.
Often we need to check whether the results of different stochastic experiments depend on each other. It is necessary not only to detect such dependencies but also to quantify their strength. To solve these problems, we can use covariance and correlation.
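For instance, covariance and correlation between two measurements can be computed with NumPy. The `hours` and `score` values below are invented for illustration.

```python
import numpy as np

# Hypothetical paired observations
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score = np.array([52.0, 55.0, 61.0, 64.0, 68.0])

# np.cov returns the 2x2 sample covariance matrix (divides by n - 1)
covariance = np.cov(hours, score)[0, 1]

# np.corrcoef returns the matrix of Pearson correlation coefficients
correlation = np.corrcoef(hours, score)[0, 1]
```

Covariance depends on the units of the data, while correlation is scaled to the range from -1 to 1, which makes it easier to compare across datasets.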
This section will help us deal with the first real statistical case: finding confidence intervals. It requires knowledge of the NumPy, pandas, Matplotlib, and Seaborn libraries to calculate formulas and build visualizations. To encourage you: this section contains only a small amount of theory but a significant amount of practice!
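One common way to build a confidence interval for a mean is the normal approximation, sketched below. The simulated sample and the choice of a 95% level with z = 1.96 are assumptions of this example, not details from the course.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical sample, simulated instead of loaded from a real dataset
sample = rng.normal(loc=170.0, scale=10.0, size=400)

mean = sample.mean()
# Standard error of the mean, using the sample standard deviation (ddof=1)
sem = sample.std(ddof=1) / np.sqrt(sample.size)

# 1.96 is approximately the 97.5th percentile of the standard normal distribution
z = 1.96
ci_low, ci_high = mean - z * sem, mean + z * sem
```

For small samples, the t-distribution quantile is usually used in place of `z`.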
An inseparable part of a data analyst's life is conducting hypothesis testing. After completing this section, you will understand the idea behind testing in statistics and will be able to conduct a t-test using Python.
Module 10 / Advanced Probability Theory
Now we will cover some fundamental theoretical concepts used in solving real-life tasks: absolutely continuous and discrete random variables, the probability density function, the cumulative distribution function, the characteristics of a random variable, etc.
- Course Overview
- Absolutely Continuous and Discrete Random Variables
- Cumulative Distribution Functions and Probability Density Functions
- Characteristics of Random Variables
- Random Vectors
- Useful Properties of the Gaussian Distribution
- Challenge: Detecting Outliers Using 3-Sigma Rule
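The 3-sigma rule mentioned in the last challenge flags values that lie more than three standard deviations from the mean. Below is a minimal sketch on simulated data with two planted outliers; all values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical roughly Gaussian data plus two planted outliers
data = rng.normal(loc=50.0, scale=5.0, size=1000)
data = np.append(data, [120.0, -30.0])

mean, std = data.mean(), data.std()

# 3-sigma rule: flag points farther than 3 standard deviations from the mean
is_outlier = np.abs(data - mean) > 3 * std
outliers = data[is_outlier]
```

The rule relies on the data being approximately Gaussian, where about 99.7% of values fall within three standard deviations of the mean.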
The limit theorems are fundamental laws of probability theory that are often used in practice in a wide variety of areas, such as building confidence intervals, estimating distribution parameters, running A/B tests, and creating ensembles of ML models. Now we will consider the two most commonly used: the Law of Large Numbers and the Central Limit Theorem.
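Both theorems can be observed in a quick simulation. The sketch below uses die rolls and an exponential distribution as arbitrary example choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Law of Large Numbers: the running mean of fair die rolls approaches 3.5
rolls = rng.integers(1, 7, size=100_000)          # upper bound is exclusive
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)

# Central Limit Theorem: means of samples from a skewed distribution
# are approximately Gaussian with std = sigma / sqrt(n)
n = 50
sample_means = rng.exponential(scale=2.0, size=(10_000, n)).mean(axis=1)
expected_std = 2.0 / np.sqrt(n)   # an exponential with scale 2 has std 2
```

A histogram of `sample_means` would look close to a bell curve even though the underlying exponential distribution is strongly skewed.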
When we work with real data we usually do not know from which distribution this data was obtained. In order to determine this, we must be able to correctly estimate the parameters of this distribution and the type of distribution, which we will learn to do in this section.
- General population. Samples. Population parameters.
- Momentum estimation. Maximum Likelihood Estimation
- Challenge: Estimate Parameters of Chi-square Distribution
- Unbiased Estimation
- Challenge: Checking Bias of An Estimation Using Simulation
- Consistent Estimation
- Efficient Estimation
- Confidence Intervals for Population Parameters
- Challenge: Confidence Interval for Exponential Distribution Parameter
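To illustrate parameter estimation and bias checking by simulation, here is a sketch under assumed settings. The true parameter values, sample sizes, and the variance estimators are example choices, not the course's tasks.

```python
import numpy as np

rng = np.random.default_rng(1)

# Maximum likelihood: for an exponential distribution, the MLE of the
# scale parameter (the mean) is the sample mean
true_scale = 3.0   # hypothetical true parameter
sample = rng.exponential(scale=true_scale, size=5000)
scale_mle = sample.mean()

# Bias check by simulation: average two variance estimators over many
# small samples drawn from a population with true variance 4
small = rng.normal(loc=0.0, scale=2.0, size=(20_000, 5))
biased = small.var(axis=1, ddof=0).mean()     # divides by n, underestimates
unbiased = small.var(axis=1, ddof=1).mean()   # divides by n - 1
```

With samples of size 5, the estimator that divides by n averages about 4 * (5 - 1) / 5 = 3.2, while the estimator that divides by n - 1 averages close to the true value 4.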
We have already learned how to estimate the parameters of the population. But to estimate the parameter, we make an assumption about the population distribution. Can we say that our assumption is correct? How do we prove that the estimated parameters are the real parameters of the population? Can we show that two sets of samples are independent? To answer these questions, it is necessary to consider the concept of hypothesis testing.
- What is Statistic Hypothesis? Type 1 and Type 2 Errors
- What is P-value?
- Comparing Means of Two Different Datasets
- Challenge: Using CLT to Compare Mean Values of Non-Gaussian Datasets
- Challenge: Resampling Approach to Compare Mean Values of the Datasets
- Testing the Hypothesis of Independence of Two Random Variables
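One resampling approach to comparing two means is a permutation test, sketched below on simulated groups. The group sizes, means, and number of permutations are assumptions of this example, and the course may use a different resampling scheme.

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical groups whose true means differ by 2
group_a = rng.normal(loc=10.0, scale=2.0, size=200)
group_b = rng.normal(loc=12.0, scale=2.0, size=200)

observed = group_b.mean() - group_a.mean()

# Under the null hypothesis the group labels are exchangeable, so we
# shuffle the pooled data and recompute the difference many times
pooled = np.concatenate([group_a, group_b])
n_perm = 5000
diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(pooled)
    diffs[i] = shuffled[200:].mean() - shuffled[:200].mean()

# Two-sided p-value: share of shuffles at least as extreme as observed
p_value = np.mean(np.abs(diffs) >= abs(observed))
```

A small `p_value` suggests that a difference this large would rarely arise from random labeling alone.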
Requirements
- A computer with a browser - all browsers are supported.
- Your enthusiasm to enhance your tech skills.
- Everything else needed to start learning and practicing is already included in this course.
Over 200,000 5-star ratings and counting
Ruslan Kravchuk
The main thing is to learn and not give up
The material is good, there is a lot to learn, all in order to become better and the main thing is to learn what you want....
Matteo Comune
Thanks to them I'm learning a lot…
Thanks to them I'm learning a lot faster because they help you to understand everything from scratch. It's the best website that helps people with no background in IT...
Yuliana Cadavid
great course for beginners
great course for beginners, they test your knowledge in every lesson...
Elpunzon
I am enjoying my Codefinity experience…
I am enjoying my Codefinity experience learning Python. The self-paced way of learning is great because I can fit it into my schedule...
Alexandru Alexandru
Is nice to learn from codefinity
Is nice to learn from codefinity. Its easy and have good examples on what I learned here...
jacob Templet
Easy to follow along with and provides…
Easy to follow along with and provides challenge in my every day life. The challenge keeps me wanting to learn day after day...
Elan
Codefinity is a comprehensive learning…
Codefinity is a comprehensive learning tool to help you develop your skills as a software engineer or data scientist. The exercises are fun and a good way to sharpen your skills...
Thibault
First time learning how to code
First time learning how to code and successfully doing so with codefinity - thank you...
Adrien Morel
Well designed for total beginners
Well designed for total beginners, incremental progress and makes me feel confident....
_Gracy
it's simply perfectly well explained
it's simply perfectly well explained! so far I have not experienced any difficulty because everything is so well managed...
Data Engineer
Certificate of Completion
Showcase your newly acquired skills. You've earned it
Discover more
Learning tracks
track
Only for Ultimate
TEST TRACK 12
1 Course
1 Project
0 Task
Beginner
4.0
(5234)
track
Only for Ultimate
Full Stack Web Development
7 Courses
327 Tasks
Beginner
4.6
(56)
track
Only for Ultimate
Become a React Developer
5 Courses
119 Tasks
Intermediate
4.8
(5)
track
Only for Ultimate
Python Data Analysis and Visualization
5 Courses
134 Tasks
Beginner
4.6
(9)
track
Only for Ultimate
SQL from Zero to Hero
4 Courses
115 Tasks
Beginner
4.8
(90)
track
Only for Ultimate
C++ for Beginners
6 Courses
103 Tasks
Beginner
4.4
(17)
track
Only for Ultimate
Python from Zero to Hero
6 Courses
176 Tasks
Beginner
4.7
(293)
track
Only for Ultimate
Supervised Machine Learning
4 Courses
1 Project
99 Tasks
Advanced
4.8
(4)
track
Only for Ultimate
Python: Beyond Intermediate
4 Courses
1 Project
121 Tasks
Beginner
4.7
(262)
track
Only for Ultimate
Java Essentials
6 Courses
307 Tasks
Beginner
4.3
(9)
track
Only for Ultimate
Game Development with Unity
4 Courses
143 Tasks
Beginner
4.6
(7)
track
Only for Ultimate
Become a Django Developer
5 Courses
170 Tasks
Advanced
4.4
(27)
track
Only for Ultimate
Flask for Dummies
5 Courses
156 Tasks
Intermediate
4.5
(31)
track
Only for Ultimate
Frontend Development Foundations
6 Courses
287 Tasks
Intermediate
4.6
(52)
track
Only for Ultimate
Web Developer from Zero to Hero
6 Courses
227 Tasks
Beginner
4.6
(56)
track
Only for Ultimate
Deep Learning Odyssey
2 Courses
80 Tasks
Advanced
5.0
(3)
track
Only for Ultimate
Web Development with C#
7 Courses
293 Tasks
Beginner
4.8
(97)
track
Only for Ultimate
TEST E2E TRACK BEGINNER
1 Project
0 Task
Beginner
track
Only for Ultimate
Test Track
2 Courses
21 Tasks
Beginner
4.7
(3)
track
Only for Ultimate
Skilled Python BackEnd Developer
5 Courses
113 Tasks
Advanced
4.7
(260)
track
Only for Ultimate
Web & Cloud Fundamentals
4 Courses
123 Tasks
Beginner
4.5
(43)
track
Only for Ultimate
Test Recalculate
2 Courses
0 Task
Beginner
track
Only for Ultimate
Excel from Zero to Hero
4 Courses
52 Tasks
Beginner
4.5
(33)
track
Only for Ultimate
Data Analyst Foundation
4 Courses
100 Tasks
Beginner
4.7
(110)
track
Only for Ultimate
Full-Stack .NET Developer Journey
13 Courses
544 Tasks
Intermediate
4.8
(128)
track
Only for Ultimate
Test track with rating
1 Course
0 Task
Advanced
4.0
(4)
track
Only for Ultimate
TEST TEST TRACK
0 Task
Beginner
Become a Development expert
- Interactive exercises
- Learning videos
- AI-assistant on all courses
- Workspaces for designing your own projects
Ready to get started?
| | Pro: Best intro offer | Ultimate: A complete experience to kickstart your career |
|---|---|---|
| 85+ Top-Rated courses | | |
| Completion certificates | | |
| AI-Assistant in all courses | | |
| 20+ hands-on Real-world projects | | |
| Personalized study tracks | | |
| Unlimited workspaces | | |