Say there are two data sets: one is the training data set and the other is the test data set. We need to build a model using the training data set and validate that same model using the test data set. I hope I am correct so far.
My question is this: don't you analysts think there are two ways to solve this?
Say we want to predict some variable (say Y) in the test data set. The training data set has 7 variables in total (say): A, B, C, D, E, F and Y, a mix of categorical and numerical. The test data set should also have the same 7 variables, A, B, C, D, E, F and Y, but there Y is what we need to predict (that is the objective).
Solution: there are two possible solutions (as I see it).
Solution 1: Make use of both the training and test data sets by comparing them. Since, as I said, the variables are common to both data sets, we can treat A, B, C, D, E, F as independent variables and work out Y from them.
To be specific: say that in the training data set, certain values of A, B, C, D, E, F gave a certain Y value. We then conclude that in the test data set, the same values of A, B, C, D, E, F give the same Y value. This is the first type of solution.
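As a rough sketch of what I mean by this first approach (the rows and values below are made up purely for illustration, and I am assuming an exact match on all six variables), it amounts to a lookup table built from the training rows:

```python
# Hypothetical toy data: each training row is (A, B, C, D, E, F, Y).
train_rows = [
    (1, "x", 0.5, 3, "low", 10, "yes"),
    (2, "y", 1.5, 4, "high", 20, "no"),
]

# Test rows carry the same six variables; Y is what we want to fill in.
test_rows = [
    (1, "x", 0.5, 3, "low", 10),   # same combination as a training row
    (9, "z", 7.0, 1, "mid", 99),   # combination never seen in training
]

# Map each combination of (A..F) seen in training to its observed Y.
lookup = {row[:6]: row[6] for row in train_rows}

# Exact-match "prediction": None when the combination was not in training.
predictions = [lookup.get(features) for features in test_rows]
print(predictions)  # ['yes', None]
```

In this sketch, a test row whose combination of values never appeared in the training data gets no Y at all.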
Solution 2: Do not compare the training and test data sets. That is, take only the training data set, construct a model (say regression, etc.), and then validate the model using the test data set. This is the second type of solution.
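A minimal sketch of this second approach, assuming for brevity a single numeric variable A and a straight-line regression (the numbers and the helper name `fit_line` are my own, just for illustration): the model is fitted on the training rows only, and the test rows are used only to check it.

```python
# Hypothetical toy data with one numeric variable A, to keep the sketch short.
train_a = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]
test_a = [5.0, 6.0]
test_y = [11.0, 13.0]  # known Y for the test rows, used only for validation


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Step 1: construct the model from the training data only.
slope, intercept = fit_line(train_a, train_y)

# Step 2: predict Y for the test rows and compare with the known values.
predicted = [slope * a + intercept for a in test_a]
mean_abs_error = sum(abs(p - t) for p, t in zip(predicted, test_y)) / len(test_y)

print(slope, intercept, predicted, mean_abs_error)
```

Here the test data set never influences the fitted line; it only tells us how far the predictions are from the actual Y values.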
I want to know which of these is called supervised learning: Solution 1, Solution 2, or neither of them?