Description
Context
Developers may need a new empty column in DataFrame.
Problem
If they use zeros or empty strings to initialize a new empty column in Pandas, the ability to use methods such as .isnull() or .notnull() is retained.
Solution
Use NaN value (e.g. np.nan) if a new empty column in a DataFrame is needed. Do not use “filler values” such as zeros or empty strings.
Type
API-Specific
Existing Stage
Data Cleaning
Effect
Robustness
Example
import pandas as pd
+ import numpy as np
df = pd.DataFrame([])
- df['new_col_int'] = 0
- df['new_col_str'] = ''
+ df['new_col_float'] = np.nan
+ df['new_col_int'] = pd.Series(dtype='int')
+ df['new_col_str'] = pd.Series(dtype='object')