Preprocessing Data
Explore the Dataset
Before you start, take a look at the data you'll work with. The following methods and attributes are useful for inspecting a pandas DataFrame:
# summary of the dataframe: column names, non-null counts, data types
data.info()
# the size of the dataframe as (rows, columns)
data.shape
# list of the columns
data.columns
# returns all distinct values contained in the column called ColumnName
data['ColumnName'].unique()
# returns summary statistics for numeric columns: count, mean, std, min, quartiles, max
data.describe()
# returns top 5 rows
data.head()
# returns top 10 rows (or any other number you pass)
data.head(10)
# returns bottom 5 rows
data.tail()
# returns bottom 10 rows (or any other number)
data.tail(10)
# returns 10 random rows
data.sample(10)
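To see what these calls return in practice, here is a minimal sketch on a small made-up DataFrame. The column names `city` and `temp_c` and their values are assumptions for illustration only; they are not part of the course dataset.

```python
import pandas as pd

# Hypothetical toy data (not the course dataset).
data = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Pune"],
    "temp_c": [4.0, 19.5, 6.0, 31.0],
})

data.info()                            # prints column names, non-null counts, dtypes
print(data.shape)                      # (number of rows, number of columns)
print(list(data.columns))              # column labels as a plain list
print(data["city"].unique())           # distinct values, in order of first appearance
print(data.describe())                 # statistics for the numeric column only
print(data.head(2))                    # first 2 rows
print(data.tail(2))                    # last 2 rows
print(data.sample(2, random_state=0))  # 2 random rows; random_state makes it repeatable
```

Note that `shape` and `columns` are attributes, so they are used without parentheses, while the others are methods.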
Task
For the given dataset data, extract and print 5 rows using the sample() method.
Get all the column names and store them in the cols variable.
Find the unique values of each column and print them.
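One possible way to approach these three steps is sketched below. Because the course dataset is not shown here, the example builds a hypothetical stand-in for `data`; the column names and values are assumptions, and the `unique_values` dictionary is only a convenience added for this sketch.

```python
import pandas as pd

# Hypothetical stand-in for the course's `data` (assumed columns and values).
data = pd.DataFrame({
    "species": ["Adelie", "Gentoo", "Adelie", "Chinstrap", "Gentoo", "Adelie"],
    "island": ["Dream", "Biscoe", "Biscoe", "Dream", "Biscoe", "Dream"],
    "mass_g": [3700, 5000, 3800, 3650, 5200, 3700],
})

# Step 1: extract and print 5 random rows.
print(data.sample(5))

# Step 2: store the column names in `cols`.
cols = data.columns

# Step 3: collect and print the unique values of every column.
unique_values = {col: data[col].unique() for col in cols}
for col, values in unique_values.items():
    print(col, values)
```

Iterating over `cols` avoids typing each column name by hand, so the same loop should work on a dataset with different columns.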