Validating custom variables to get better estimate of a parameter I am trying to compare

datavisualization

#1

Hi,

I am trying to compare a metric (y) across different times (x), but there is a parameter ‘z’ that affects the calculation of ‘y’. How can I bring ‘z’ to a common reference level across the different ‘x’ values so that ‘y’ can be compared across ‘x’?

I have two ideas:

  1. Create a custom metric out of ‘y’ and ‘z’ and compare that instead. If so, how do I validate this new metric?
  2. Find a threshold for ‘x’ so that only the ‘y’ values above that ‘x’ value are considered and compared, ignoring the outliers.

Which one to go with?

Regards
Akshay


#2

The first option: create one more variable that is a mix of, say, x and z.
When there is a strong correlation between x and z, z cannot be ignored.


#3

Thanks Malathi!

I would like to read some theory on feature creation criteria before creating a new feature out of the two existing parameters. Are there any resources you would suggest?


#4

This article may help you understand feature engineering.


#5

Hi @akshay.kotha
It seems you are facing a conditional-probability problem. You have to model Y on X given Z, which means building multiple models, one for each condition on Z. You could also check the interaction: in a linear model, for example, you would include an X*Z term. This is not the same as conditioning, but it will show whether the interaction is significant and, if it is, what its coefficient is. If Z is categorical, everything is simpler and you can use an ANOVA-type analysis.
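
The interaction check could be sketched roughly as follows. This is only an illustration under assumptions: it uses plain NumPy ordinary least squares, the data are synthetic, and the helper name `fit_interaction` and the coefficient values are made up for the example, not taken from the original question.

```python
import numpy as np

def fit_interaction(x, y, z):
    """OLS fit of y = b0 + b1*x + b2*z + b3*x*z.

    Returns the coefficients and their t-statistics.
    (Hypothetical helper, written for this illustration only.)
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    # Design matrix: intercept, main effects, and the X*Z interaction term.
    X = np.column_stack([np.ones_like(x), x, z, x * z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]            # residual degrees of freedom
    sigma2 = resid @ resid / dof         # estimated noise variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stats = beta / np.sqrt(np.diag(cov))
    return beta, t_stats

# Synthetic data (assumed for the example): y depends on x, z, and x*z.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
z = rng.uniform(0, 5, n)
y = 1.0 + 2.0 * x + 0.5 * z + 0.3 * x * z + rng.normal(0, 0.5, n)

beta, t_stats = fit_interaction(x, y, z)
for name, b, t in zip(["intercept", "x", "z", "x*z"], beta, t_stats):
    print(f"{name:>9}: coef={b:.3f}, t={t:.2f}")
```

As a rule of thumb for reasonably large samples, an absolute t-statistic well above 2 on the `x*z` row suggests the interaction matters, and its coefficient tells you how much the slope of Y against X changes per unit of Z.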

Hope this helps a bit.
Alain