Analyzing and Visualizing Real-World Data
Removing Characters: Method 1
There are at least two different ways to solve the problem of redundant symbols. The first method is to treat the column values as strings and then apply the necessary string method to remove the redundant characters.
Note
To treat column values as strings, use the .str accessor.
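As a minimal sketch of the note above, the snippet below applies a string method through the .str accessor. The sample values and the names prices and stripped are hypothetical and are not taken from the course dataset.

```python
import pandas as pd

# Hypothetical sample values; the course dataset itself is not shown here.
prices = pd.Series(['$2.57', '$2.55', '$3.10'])

# .str exposes string methods that run element-wise over the whole column.
stripped = prices.str.lstrip('$')

# The symbols are gone, but the values are still strings at this point.
print(stripped.tolist())
```

The result still holds text, which is why a separate conversion to a numerical type is needed afterwards.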
After deleting the symbols, we can convert the columns to a numerical format.
There are at least two ways to do this:
- The first is to call the .astype(type) method on a column, where type is either int for integers or float for real numbers. For instance, df['column'] = df['column'].astype(int);
- The second is to call the pd.to_numeric() function from pandas, passing the column as the argument. For instance, df['column'] = pd.to_numeric(df['column']).
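The two conversion options can be compared side by side. This is only an illustrative sketch: the dataframe contents and the names as_float, as_numeric, messy and coerced are assumptions made for the example.

```python
import pandas as pd

# Hypothetical column of numeric strings (symbols already removed).
df = pd.DataFrame({'column': ['1.5', '2.0', '3.25']})

# Option 1: .astype() with an explicit target type.
as_float = df['column'].astype(float)

# Option 2: pd.to_numeric(), which infers a suitable numeric dtype.
as_numeric = pd.to_numeric(df['column'])

# pd.to_numeric() can also turn unparseable entries into NaN
# instead of raising an error, via errors='coerce'.
messy = pd.Series(['1.5', 'n/a'])
coerced = pd.to_numeric(messy, errors='coerce')

print(as_float.dtype, as_numeric.dtype)
print(coerced.isna().tolist())
```

A practical difference is that .astype() fixes the target type up front, while pd.to_numeric() chooses it from the data and offers the errors parameter for messy input.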
Task
1. Import the pandas library with the pd alias.
2. Read the csv file and save it as a dataframe in the df variable.
3. Remove the redundant symbols from the prices and convert them to float type:
   - Select the 'Fuel_Price' column;
   - Use the .str accessor;
   - Remove the '$' characters from the left using the .lstrip() method;
   - Convert the resulting values to numerical format (float) using the .astype() method;
   - Assign the result to the 'Fuel_Price' column of df.
4. Remove the % symbols from the 'Unemployment' column (they are located on the right side) using the .rstrip() method, and convert the values to float using the same algorithm as in step 3.
5. Remove the °C symbols from the 'Temperature' column (they are located on the right side) using the .rstrip() method, and convert the values to float using the same algorithm as in step 3.
6. Display the first row of the df dataframe and the data types of its columns.
Once you've completed this task, click the button below the code to check your solution.
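For reference, the steps of the task can be sketched on a tiny stand-in dataset. The CSV text below is hypothetical: the real file name and its rows are not given in this lesson, so an in-memory string is read instead of a file, and only the three column names come from the task itself.

```python
import io
import pandas as pd

# Hypothetical stand-in for the course csv file (real file name and rows unknown).
csv_text = """Fuel_Price,Unemployment,Temperature
$2.572,8.106%,5.73°C
$2.548,8.106%,3.62°C
"""
df = pd.read_csv(io.StringIO(csv_text))

# Strip '$' from the left, then convert the remaining text to float.
df['Fuel_Price'] = df['Fuel_Price'].str.lstrip('$').astype(float)

# Strip '%' from the right, then convert to float.
df['Unemployment'] = df['Unemployment'].str.rstrip('%').astype(float)

# rstrip('°C') removes any trailing '°' or 'C' characters (a character set,
# not an exact suffix), which is safe here because the values end in digits.
df['Temperature'] = df['Temperature'].str.rstrip('°C').astype(float)

# First row and the column data types.
print(df.head(1))
print(df.dtypes)
```

Note that .lstrip() and .rstrip() treat their argument as a set of characters to remove, so they suit single trailing or leading symbols like these but not arbitrary substrings.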