Description
Context
Developers may need to add a new empty column to a DataFrame.
Problem
If they use zeros or empty strings to initialize a new empty column in Pandas, the ability to use methods such as .isnull() or .notnull() is lost, because the filler values are treated as valid data rather than as missing entries.
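As a minimal sketch of this behaviour, the snippet below compares null counts for filler-initialized columns and a NaN-initialized column. The DataFrame contents and the column names (id, filler_int, filler_str, empty) are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame used only for illustration.
df = pd.DataFrame({"id": [1, 2, 3]})

# Filler values: pandas treats 0 and "" as real data.
df["filler_int"] = 0
df["filler_str"] = ""

# NaN: pandas treats every entry as missing.
df["empty"] = np.nan

# Count missing entries per column.
missing_counts = df.isnull().sum()
print(missing_counts)
```

Only the NaN-initialized column is reported as missing, so a later check such as df["filler_int"].isnull() cannot distinguish "not yet filled" from a genuine zero.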
Solution
Use a NaN value (e.g. np.nan) if a new empty column in a DataFrame is needed. Do not use “filler values” such as zeros or empty strings.
Type
API-Specific
Existing Stage
Data Cleaning
Effect
Robustness
Example
import pandas as pd
+ import numpy as np
df = pd.DataFrame([])
- df['new_col_int'] = 0
- df['new_col_str'] = ''
+ df['new_col_float'] = np.nan
+ df['new_col_int'] = pd.Series(dtype='int')
+ df['new_col_str'] = pd.Series(dtype='object')
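The example above starts from an empty DataFrame. As a hedged sketch of how the same assignments might behave on a DataFrame that already has rows, the snippet below checks the resulting dtypes and null status. The id column and the new_col_nullable_int column are assumptions added for illustration. The nullable "Int64" extension dtype is one possible way to keep an integer dtype and is not part of the original recommendation.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with existing rows.
df = pd.DataFrame({"id": [1, 2, 3]})

df["new_col_float"] = np.nan

# An empty int Series is aligned to the existing index; the missing
# entries become NaN, so the column ends up as float64.
df["new_col_int"] = pd.Series(dtype="int")

# Assumption: the nullable Int64 extension dtype keeps an integer dtype
# while still marking every entry as missing (pd.NA).
df["new_col_nullable_int"] = pd.array([pd.NA] * len(df), dtype="Int64")

df["new_col_str"] = pd.Series(dtype="object")

# Every new column should be reported as entirely missing.
all_missing = df.drop(columns="id").isnull().all()
dtypes = df.dtypes.astype(str).to_dict()
print(all_missing)
print(dtypes)
```

In this sketch all four new columns are fully null, so .isnull() and .notnull() keep working regardless of the dtype chosen.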