How to decide which curve to fit to our data?




How can we decide which type of curve would fit our dataset best?

For example, which would be a better fit for this plot: a quadratic or a linear curve?




Several considerations go into deciding which curve to fit. Here are a couple that matter:

1. The underlying phenomenon and its nature (if you understand it)
The most important consideration is the underlying relationship you are trying to model. If it is a height vs. weight relationship, there is no reason to believe that weight will start falling as height increases. So the last few points in the graph look more like noise than signal. Irrespective of which model you fit, weight should increase with height.

2. Growing complexity with higher-degree polynomials
As you increase the degree of the polynomial, you might get a better fit to the points you have, but you will end up complicating the model unnecessarily and overfitting. For example, if you fit a third-degree polynomial to this data, it will try to follow the data points between 20 and 25 (i.e. an increase followed by flattening), which does not agree with our understanding of the phenomenon. The polynomial will also behave very differently outside the range of the observed data. If you are not sure, prefer simpler models over complex ones.
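
Both considerations can be checked numerically. The following is only a rough sketch using NumPy on synthetic, made-up data (the data from the plot is not available here, so the values, the noise level, the artificial dip at the end, and the extrapolation point `x_future` are all assumptions). It fits polynomials of degree 1, 2 and 3 and reports the in-sample error, the slope of the fitted curve at the right edge of the data, and the prediction well outside the observed range.

```python
import numpy as np

# Synthetic, made-up data (NOT the data from the plot): a linear trend plus
# noise, with the last few points pulled down to mimic a noisy tail.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 25.0, 26)
y = 10.0 + 2.0 * x + rng.normal(scale=3.0, size=x.size)
y[-5:] -= np.array([2.0, 4.0, 6.0, 8.0, 10.0])

x_future = 40.0  # hypothetical point well outside the observed range

results = {}
for degree in (1, 2, 3):
    # Least-squares polynomial fit of the given degree.
    poly = np.poly1d(np.polyfit(x, y, degree))

    # In-sample root-mean-square error of the fitted curve.
    rmse = float(np.sqrt(np.mean((poly(x) - y) ** 2)))

    # Slope of the fitted curve at the largest observed x; a negative value
    # means the curve has started to fall there.
    end_slope = float(poly.deriv()(x.max()))

    results[degree] = {
        "rmse": rmse,
        "end_slope": end_slope,
        "extrapolated": float(poly(x_future)),
    }
    print(
        f"degree {degree}: in-sample RMSE={rmse:.2f}, "
        f"slope at x={x.max():.0f} is {end_slope:.2f}, "
        f"prediction at x={x_future:.0f} is {poly(x_future):.1f}"
    )
```

For least-squares polynomial fits, the in-sample error can only stay the same or go down as the degree rises, so a lower error alone does not justify the more complex curve. The slope at the edge and the extrapolated value are the more useful signals: if a higher-degree fit turns downward or diverges sharply from the others outside the data, it is probably chasing noise in a relationship you expect to keep increasing.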

In summary, if I were you, I would actually fit a linear model.