error-prone

NaN Equivalence Comparison Misused

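Do not test for missing values with == np.nan in NumPy or Pandas. NaN never compares equal to anything, including itself, so the comparison is always False. Use np.isnan() or pd.isna() instead.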
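
A minimal sketch of the pitfall, assuming NumPy and Pandas are installed. The array contents and variable names are illustrative only.

```python
import numpy as np
import pandas as pd

values = np.array([1.0, np.nan, 3.0])

# Equality against NaN is always False, so this mask selects nothing.
wrong_mask = values == np.nan

# np.isnan tests each element for NaN explicitly.
right_mask = np.isnan(values)

# The Pandas equivalent is isna(), available as a function and a method.
series = pd.Series(values)
missing_count = int(series.isna().sum())
```

Here wrong_mask contains no True entries even though the array holds a NaN, while right_mask and isna() both find it.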

Chain Indexing

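Avoid chained indexing such as df[col][row] in Pandas. The first indexing step may return a copy, so a chained assignment can be silently lost. Use a single df.loc[row, col] call instead.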
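
The following sketch shows one way to replace a chained assignment with a single .loc call. The DataFrame and its column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"score": [10, 20, 30], "passed": [False, False, False]})

# Chained form (avoid): df[df["score"] > 15]["passed"] = True
# The first [] may produce a temporary copy, so the assignment can be lost.

# Single .loc call: row selector and column label in one indexing step.
df.loc[df["score"] > 15, "passed"] = True
```

Because .loc performs the selection and the assignment in one operation, the change is applied to df itself.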

Merge API Parameter Not Explicitly Set

Explicitly specify the on, how and validate parameters of the df.merge() API in Pandas for better readability.
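
As an illustration, a merge with all three parameters spelled out might look like the sketch below. The two tables and their columns are hypothetical.

```python
import pandas as pd

orders = pd.DataFrame({"customer_id": [1, 2, 2], "amount": [50, 20, 30]})
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ann", "Bob"]})

merged = orders.merge(
    customers,
    on="customer_id",        # join key stated explicitly
    how="left",              # keep every order, matched or not
    validate="many_to_one",  # raise if customer_id repeats in customers
)
```

With validate set, Pandas raises a MergeError when the key relationship is not the expected one, instead of silently duplicating rows.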

In-Place APIs Misused

Remember to assign the result of an operation to a variable or set the in-place parameter in the API.
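
A small sketch of both fixes, using dropna() in Pandas as an example of an operation that returns a new object by default. The data is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0]})

# No effect on df: dropna() returns a new DataFrame, which is discarded here.
df.dropna()
rows_after_discarded_call = len(df)

# Option 1: assign the result to a variable.
cleaned = df.dropna()

# Option 2: ask the API to modify df in place.
df.dropna(inplace=True)
```

After the discarded call df still has three rows. Only the assigned result and the in-place call actually remove the missing value.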

Dataframe Conversion API Misused

Use df.to_numpy() in Pandas instead of the df.values attribute to convert a DataFrame to a NumPy array.
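
A brief sketch of the conversion, with an illustrative two-column DataFrame. Note that values is an attribute rather than a method, so it is accessed without parentheses.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [3.5, 4.5]})

# Preferred: the method name states the target type.
array = df.to_numpy()

# to_numpy() also accepts an explicit dtype for the resulting array.
array32 = df.to_numpy(dtype="float32")
```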

No Scaling Before Scaling-sensitive Operation

Check that feature scaling is applied before scaling-sensitive operations, such as principal component analysis (PCA) or support vector machines (SVM).
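
To illustrate why this matters, the sketch below uses PCA in Scikit-Learn as one example of a scaling-sensitive operation. The synthetic data, the choice of StandardScaler and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two independent features on very different scales.
X = np.column_stack([rng.normal(0, 1, 100), rng.normal(0, 1000, 100)])

# Without scaling, the large-scale feature dominates the first component.
unscaled = PCA(n_components=1).fit(X)
unscaled_ratio = unscaled.explained_variance_ratio_[0]

# With scaling first, both features contribute on equal terms.
scaled = make_pipeline(StandardScaler(), PCA(n_components=1)).fit(X)
scaled_ratio = scaled.named_steps["pca"].explained_variance_ratio_[0]
```

Comparing unscaled_ratio with scaled_ratio shows how much of the first component was an artifact of the feature units.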

Hyperparameter Not Explicitly Set

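Hyperparameters should be set explicitly rather than left to library defaults, which can change between versions and make results harder to reproduce.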
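
For example, with a Scikit-Learn estimator the main hyperparameters can be written out at construction time. The estimator and the values below are purely illustrative, not recommended settings.

```python
from sklearn.ensemble import RandomForestClassifier

# Each value is visible in the code instead of hidden in library defaults.
model = RandomForestClassifier(
    n_estimators=100,
    max_depth=5,
    random_state=42,
)

params = model.get_params()
```

get_params() returns the full configuration, which can be logged alongside experiment results.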

Missing the Mask of Invalid Value

Add a mask for possible invalid values. For example, developers should mask or clip the input to the tf.log() API (tf.math.log() in TensorFlow 2) so that it never receives zero or a negative number.
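
One possible way to add such a mask is to clip the input before taking the logarithm, as sketched below. The example assumes TensorFlow 2, where the logarithm is exposed as tf.math.log(). The lower bound 1e-10 is an arbitrary illustrative choice.

```python
import tensorflow as tf

probs = tf.constant([0.0, 0.5, 1.0])

# log(0) evaluates to -inf, which then propagates through the loss.
unsafe = tf.math.log(probs)

# Clipping keeps every input strictly positive before the logarithm.
safe = tf.math.log(tf.clip_by_value(probs, 1e-10, 1.0))

all_finite = bool(tf.reduce_all(tf.math.is_finite(safe)))
```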

TensorArray Not Used

Use tf.TensorArray() in TensorFlow 2 if the value of the array will change in the loop.
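
A minimal sketch of the pattern, assuming TensorFlow 2 and a loop compiled with tf.function. The function name running_squares is hypothetical.

```python
import tensorflow as tf

@tf.function
def running_squares(n):
    # A TensorArray can grow and be written to inside a graph-mode loop.
    squares = tf.TensorArray(tf.int32, size=0, dynamic_size=True)
    for i in tf.range(n):
        # write() returns the updated array, so the result is reassigned.
        squares = squares.write(i, i * i)
    # stack() converts the collected elements into a regular tensor.
    return squares.stack()

result = running_squares(tf.constant(4))
```

Reassigning the return value of write() on every iteration is what lets the array be carried through the loop.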

Training / Evaluation Mode Improper Toggling

In PyTorch, call model.train() and model.eval() at the appropriate places, so that the model is toggled back to training mode after an inference step.
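
One common arrangement is to set the mode at the start of each phase, as in the sketch below. The toy model, the input tensor and the bookkeeping list are illustrative assumptions, and the actual training step is omitted.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))
inputs = torch.randn(2, 4)

modes_at_training_time = []
for epoch in range(2):
    # Re-enable training behaviour (e.g. dropout) at the start of each epoch.
    model.train()
    modes_at_training_time.append(model.training)
    # (forward pass, loss and optimizer step would go here)

    # Switch to evaluation mode only for the inference step.
    model.eval()
    with torch.no_grad():
        predictions = model(inputs)
```

Because model.train() is called at the top of every epoch, the eval() call from the previous validation step cannot leak into the next round of training.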

Gradients Not Cleared before Backward Propagation

Use optimizer.zero_grad(), loss.backward() and optimizer.step() together, in that order, in PyTorch, where loss is the tensor returned by the loss function. Do not forget to call optimizer.zero_grad() before loss.backward() to clear the gradients accumulated in the previous step.
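
The three calls can be sketched in a small training loop as follows. The linear model, random data and learning rate are illustrative assumptions. In this sketch backward() is called on the loss tensor returned by the loss function.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(3, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

inputs = torch.randn(8, 3)
targets = torch.randn(8, 1)

losses = []
for step in range(5):
    optimizer.zero_grad()                    # clear gradients from the last step
    loss = loss_fn(model(inputs), targets)   # forward pass
    loss.backward()                          # compute fresh gradients
    optimizer.step()                         # update the parameters
    losses.append(loss.item())
```

Without the zero_grad() call, each backward pass would add to the gradients already stored on the parameters.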

Data Leakage

Use the Pipeline() API in Scikit-Learn, or check data segregation carefully when using other libraries, to prevent data leakage.
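
A minimal sketch of the Pipeline() approach, assuming a preprocessing step followed by a classifier. The synthetic dataset, the step names and the chosen estimators are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# The scaler is part of the pipeline, so in each fold it is fitted on the
# training portion only and never sees the held-out data.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

scores = cross_val_score(pipeline, X, y, cv=5)
```

Fitting the scaler on the full dataset before splitting would instead let statistics from the test folds influence training.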