How does EDA help improve model performance?
EDA helps you:
- Gain familiarity with the dataset
- Identify feature distributions
- Identify features with null or erroneous values
- Identify features that are not important, such as those with the same value for all observations
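As a rough illustration, the checks listed above can be sketched in a few lines of plain Python. The toy `rows` dataset, the feature names, and the `summarize` helper are all made up for this example; it is a minimal sketch, not a full EDA workflow.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy dataset: each dict is one observation, None marks a missing value.
rows = [
    {"age": 25, "city": "Pune", "country": "IN"},
    {"age": 31, "city": "Delhi", "country": "IN"},
    {"age": None, "city": "Pune", "country": "IN"},
    {"age": 47, "city": None, "country": "IN"},
]


def summarize(rows):
    """Return per-feature null counts and value counts, plus constant features."""
    summary = {}
    for name in rows[0].keys():
        values = [row[name] for row in rows]
        present = [v for v in values if v is not None]
        summary[name] = {
            "nulls": len(values) - len(present),          # missing values
            "distinct": len(set(present)),                # number of unique values
            "top_values": Counter(present).most_common(3),  # rough distribution
        }
    # A feature with a single distinct value carries no information for a model.
    constant = [name for name, stats in summary.items() if stats["distinct"] <= 1]
    return summary, constant


summary, constant = summarize(rows)

# Simple distribution statistics for a numeric feature, ignoring missing values.
ages = [row["age"] for row in rows if row["age"] is not None]
age_stats = {"mean": mean(ages), "median": median(ages)}

print(summary)
print("Constant features:", constant)
print("Age stats:", age_stats)
```

If you work with pandas, `df.isnull().sum()`, `df.nunique()`, and `df.describe()` give you similar information on a DataFrame.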
You can also refer to the link below.
If you have ever gone through the kernels of a Kaggle competition, you will have seen that an EDA kernel is usually among the most upvoted. Why is that?
It is generally because EDA gives us a better understanding of the data, and only with that understanding can we work out the trends and relationships among variables. That ultimately leads to generating and selecting useful features, which directly affects model performance.
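To make the point about relationships among variables concrete, here is a minimal sketch that ranks features by their Pearson correlation with the target. The `target` values, the feature names, and the `pearson` helper are assumptions made up for illustration. Correlation only captures linear relationships, so treat it as one EDA signal among several.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant feature has no linear relationship to measure
    return cov / (spread_x * spread_y)


# Hypothetical data: three candidate features and a target to predict.
target = [10, 20, 30, 40, 50]
features = {
    "size": [1, 2, 3, 4, 5],   # moves in step with the target
    "noise": [3, 1, 4, 1, 5],  # weakly related
    "flag": [1, 1, 1, 1, 1],   # same value for every observation
}

# Rank features by the strength of their linear relationship with the target.
ranked = sorted(
    ((name, pearson(values, target)) for name, values in features.items()),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)

for name, corr in ranked:
    print(f"{name}: {corr:.3f}")
```

Features near the top of such a ranking are natural candidates to keep or build on, while a constant feature like `flag` can usually be dropped.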
To learn EDA, you can refer to this article.
Hope this clears up your doubt.