Python - Learning the Basics of Pandas


Data Cleaning: Hands-on Pandas for Beginners 🐼

Pandas is the second most popular Python library in data science (55%, Statista).

Here are the most useful data cleaning functions in Pandas:

✅ Handling Missing Values:

↪ fillna(): Fills missing values with a specified value
↪ dropna(): Drops rows or columns with missing values
↪ interpolate(): Fills missing values using interpolation methods
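
A minimal sketch of the three methods above, assuming a small made-up DataFrame (the column names and values are purely illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one gap in each column
df = pd.DataFrame({
    "city": ["Oslo", None, "Rome", "Lima"],
    "temp": [2.0, np.nan, 18.0, 24.0],
})

# fillna(): replace missing city names with a placeholder value
filled = df.assign(city=df["city"].fillna("unknown"))

# dropna(): keep only the rows that have no missing values
dropped = df.dropna()

# interpolate(): fill the gap linearly from the neighbouring values
interpolated = df["temp"].interpolate()

print(filled)
print(dropped)
print(interpolated)
```

Which of the three fits best depends on the data: dropping rows loses information, while filling or interpolating introduces values that were never observed.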

✅ Data Type Conversion and Cleaning:

↪ astype(): Converts data types of columns
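↪ to_numeric(): Converts strings to numeric data types (called as pd.to_numeric())
↪ applymap(): Applies a function element-wise to clean individual values (renamed to DataFrame.map() in pandas 2.1; applymap() is deprecated).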
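
As a rough illustration of these conversions, here is a sketch on invented data. The helper name clean_cell is an assumption made for this example, and the hasattr check is only there so the snippet runs on both older and newer pandas versions:

```python
import pandas as pd

# Hypothetical raw data where every value was read in as a string
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "price": ["10.5", "n/a", "7"],
})

# astype(): cast a column whose values are all valid integers
df["id"] = df["id"].astype(int)

# to_numeric(): errors="coerce" turns unparseable strings into NaN
df["price"] = pd.to_numeric(df["price"], errors="coerce")

# Element-wise cleaning of every cell in a DataFrame
raw = pd.DataFrame({"a": [" x ", "y "], "b": ["z", " w"]})

def clean_cell(value):
    """Strip surrounding whitespace from string cells (illustrative helper)."""
    return value.strip() if isinstance(value, str) else value

# DataFrame.map exists in pandas >= 2.1; older versions use applymap
cleaned = raw.map(clean_cell) if hasattr(raw, "map") else raw.applymap(clean_cell)

print(df.dtypes)
print(cleaned)
```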

✅ String Manipulation and Cleaning:

↪ str.strip(): Removes leading and trailing whitespace from strings in a column.
↪ str.lower(): Converts all characters in a string to lowercase.
↪ str.replace(): Replaces a substring or regex pattern in each string with a desired value.
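
The string methods above can be chained, as in this small sketch (the sample names are made up for illustration):

```python
import pandas as pd

# Hypothetical messy name column
names = pd.Series(["  Alice ", "BOB", " Carol-Ann  "])

# str.strip(): remove leading and trailing whitespace
stripped = names.str.strip()

# str.lower(): normalise the case
lowered = stripped.str.lower()

# str.replace(): regex=False treats "-" as a literal string, not a pattern
replaced = lowered.str.replace("-", " ", regex=False)

print(replaced.tolist())
```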

✅ Data Exploration and Outlier Detection:

↪ boxplot(): Visualizes the distribution of data
↪ IQR: The Interquartile Range, calculated as Q3 - Q1 using quantile() (pandas has no built-in IQR() function).
↪ describe(): Generates summary statistics for numerical columns
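
One common way to combine these tools is the 1.5 × IQR rule for flagging outliers. The sketch below assumes an invented series of readings, and the 1.5 multiplier is a convention rather than anything pandas prescribes:

```python
import pandas as pd

# Hypothetical sensor readings with one suspicious value
values = pd.Series([10, 12, 11, 13, 12, 95], name="reading")

# describe(): count, mean, std, min, quartiles, max
summary = values.describe()

# IQR computed from quantiles
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Flag values outside the conventional 1.5 * IQR fences
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = values[(values < lower) | (values > upper)]

print(summary)
print(outliers)

# values.plot.box() would draw the matching box plot (requires matplotlib)
```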

✅ Data Aggregation and Transformation:

↪ groupby(): Groups data by specific columns
↪ resample(): Resamples time series data at different frequencies
↪ pivot_table(): Summarizes data with various aggregation functions
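
To make the three aggregation methods concrete, here is a sketch on a made-up sales table (all column names and figures are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sales records
sales = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-03"]
    ),
    "region": ["north", "south", "north", "south"],
    "product": ["a", "a", "b", "b"],
    "amount": [100, 150, 200, 50],
})

# groupby(): total amount per region
by_region = sales.groupby("region")["amount"].sum()

# resample(): needs a datetime index; here amounts are summed in 2-day bins
two_day = sales.set_index("date")["amount"].resample("2D").sum()

# pivot_table(): regions as rows, products as columns, summed amounts
pivot = sales.pivot_table(
    index="region",
    columns="product",
    values="amount",
    aggfunc="sum",
    fill_value=0,
)

print(by_region)
print(two_day)
print(pivot)
```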
