Data Warehousing, BI and Data Science

29 March 2021

Pandas

Filed under: Python — Vincent Rainardi @ 6:29 am
import pandas as pd

#Rename columns:
df = df.rename(columns={"col1": "column1", "col2": "column2"})

#Drop columns:
df.drop(columns = ["column1", "column2"], inplace=True)
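
A minimal sketch of renaming and dropping on a small data frame. The sample data and the column name col3 are made up for illustration. Both calls return a new data frame when inplace is not used, so the result is assigned back.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": [5, 6]})

# rename returns a new data frame, so assign the result back.
df = df.rename(columns={"col1": "column1", "col2": "column2"})

# drop without inplace=True also returns a new data frame.
df = df.drop(columns=["column1"])

remaining = list(df.columns)
print(remaining)
```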

#Number of unique values in each column:
df.nunique()

#Number of missing values in each column:
df.isnull().sum()
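
A small sketch of the two counts above on made-up data (the city and sales columns are assumptions for illustration). Note that nunique ignores missing values by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in each column.
df = pd.DataFrame({
    "city": ["London", "Paris", "London", None],
    "sales": [10.0, np.nan, 10.0, 30.0],
})

# Count of distinct values per column (missing values are not counted).
unique_counts = df.nunique()

# Count of missing values per column.
missing_counts = df.isnull().sum()

print(unique_counts)
print(missing_counts)
```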

#Convert a string column to date:
df["date_column"] = pd.to_datetime(df["string_column"], dayfirst = True)

#Get the day element from a date column:
df["day"] = df["date_column"].dt.day

#Get the weekday name (e.g. Monday, Tuesday) from a date column:
df["weekday_name"] = df["date_column"].dt.day_name(locale='English')

#Group by 2 columns and get the row count:
df.groupby(["column1","column2"]).size()

#Combine 2 data frames side by side (axis=1 joins columns; use axis=0 to stack rows):
df = pd.concat([df1, df2], axis=1)
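
The axis argument decides the direction of the combine. A short sketch with made-up frames (df3 is an extra frame added here for illustration):

```python
import pandas as pd

# Hypothetical sample data.
df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"b": [3, 4]})
df3 = pd.DataFrame({"a": [5, 6]})

# axis=1 places the frames side by side, aligned on the index.
side_by_side = pd.concat([df1, df2], axis=1)

# axis=0 stacks the rows; ignore_index=True renumbers the index from 0.
stacked = pd.concat([df1, df3], axis=0, ignore_index=True)

print(side_by_side.shape, stacked.shape)
```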
