@shashwat.2014
Let us take some random data and analyse it before and after applying a transformation.
data <- data.frame(x=seq(1,1000))
# Normal Random Data
data$y=rnorm(1000,4,1)
plot(data$x,data$y)
# Find the curve assuming that we know nothing of the normal distribution
model <- smooth.spline(data$x,data$y)
plot(data)
lines(model,col="blue")
# This method is entirely on the basis of distance from the smooth line
data$mean <- predict(model,data$x)$y
data$diff <- ((data$y - data$mean)/data$mean)^2
# Sort in decreasing order so that the largest deviations come first
data <- data[order(data$diff, decreasing=TRUE),]
# I will just plot the top 10 deviations (either I can show the top deviations or the deviations that are above a particular threshold)
plot(data$x,data$y)
lines(model,col="blue")
points(data[1:10,]$x,data[1:10,]$y,col="red")
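The comment above mentions flagging deviations above a threshold instead of a fixed top 10. Here is a minimal sketch of that variant. The seed, the names `d`, `fit`, `cutoff` and `outliers`, and the choice of the 99th percentile as the cut-off are my assumptions for illustration, not part of the steps above.

```r
# Sketch: flag points whose squared relative deviation exceeds a threshold.
set.seed(42)                       # assumed seed, only for reproducibility
d <- data.frame(x = seq(1, 1000))
d$y <- rnorm(1000, 4, 1)

# Same deviation measure as above: distance from the smooth line
fit <- smooth.spline(d$x, d$y)
d$mean <- predict(fit, d$x)$y
d$diff <- ((d$y - d$mean) / d$mean)^2

# Assumed cut-off: the 99th percentile of the deviations
cutoff <- quantile(d$diff, 0.99)
outliers <- d[d$diff > cutoff, ]

plot(d$x, d$y)
lines(fit, col = "blue")
points(outliers$x, outliers$y, col = "red")
```

With a quantile-based cut-off the number of flagged points scales with the size of the data. A fixed numeric threshold would instead flag however many points happen to exceed it.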
# Now I will apply some transformation
######### LOG ####################
data$y <- log(data$y)
# Find the curve assuming that we know nothing of the normal distribution
model <- smooth.spline(data$x,data$y)
# data now has extra columns, so plot x against y explicitly
plot(data$x,data$y)
lines(model,col="blue")
# This method is entirely on the basis of distance from the smooth line
data$mean <- predict(model,data$x)$y
data$diff <- ((data$y - data$mean)/data$mean)^2
# Sort in decreasing order so that the largest deviations come first
data <- data[order(data$diff, decreasing=TRUE),]
# I will just plot the top 10 deviations (either I can show the top deviations or the deviations that are above a particular threshold)
plot(data$x,data$y)
lines(model,col="blue")
points(data[1:10,]$x,data[1:10,]$y,col="red")
You will see that the range of the deviations has changed. This will be true for every transformation. Based on the nature of the data, we have to choose the most appropriate transformation.
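To see how the range of the deviations shifts from one transformation to another, something like the following sketch could be used. The helper `deviation_range`, the seed, and the inclusion of a square-root transformation are assumptions added for illustration. Non-positive values are dropped first because `log` is undefined for them.

```r
# Sketch: compare the range of squared relative deviations across transformations.
set.seed(42)                       # assumed seed, only for reproducibility
x <- seq(1, 1000)
y <- rnorm(1000, 4, 1)

# Keep only positive values so that log() and sqrt() are defined
keep <- y > 0
x <- x[keep]
y <- y[keep]

# Hypothetical helper: fit a smooth line and return min/max of the deviations
deviation_range <- function(x, y) {
  fit <- smooth.spline(x, y)
  m <- predict(fit, x)$y
  range(((y - m) / m)^2)
}

ranges <- rbind(
  raw  = deviation_range(x, y),
  log  = deviation_range(x, log(y)),
  sqrt = deviation_range(x, sqrt(y))
)
colnames(ranges) <- c("min", "max")
print(ranges)
```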
The metric to be used can be either of the following:

 Deviation from the best-fit curve (which I have used here)

 Cluster analysis (k-means)
If I apply k-means clustering, then
data$z <- scale(data$y)
kmeans(data$z,6)
K-means clustering with 6 clusters of sizes 234, 244, 139, 216, 113, 54
Cluster means:
[,1]
1 0.4890484
2 0.1350999
3 1.1713196
4 0.7783585
5 1.6766376
6 2.0981320
Within cluster sum of squares by cluster:
[1] 7.264497 7.244164 6.677956 10.098386 23.295321 7.060033
(between_SS / total_SS = 93.8 %)
After the transformation:
data$z <- scale(data$y)
kmeans(data$z,6)
K-means clustering with 6 clusters of sizes 114, 206, 254, 278, 132, 16
Cluster means:
[,1]
1 1.53359075
2 0.60790059
3 0.02595344
4 0.62930994
5 1.35159480
6 3.74337400
Within cluster sum of squares by cluster:
[1] 14.726094 9.513906 7.014095 9.545692 10.416358 27.929292
(between_SS / total_SS = 92.1 %)
The share of the total variation explained by the clusters (between_SS / total_SS) drops from 93.8 % to 92.1 % after the log transformation. We can conclude that the transformation was not appropriate for this data.
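The comparison of the between_SS / total_SS ratio can also be done directly in code rather than by reading the printed output. This is only a sketch: the helper `explained`, the seed, and the `nstart` and `iter.max` values are my assumptions. Since k-means starts from random centers, the exact ratios will differ from the numbers shown above.

```r
# Sketch: compare the share of variation explained by k-means before and after log().
set.seed(42)                       # assumed seed, only for reproducibility
y <- rnorm(1000, 4, 1)
y <- y[y > 0]                      # log() needs positive values

# Hypothetical helper: run k-means on the scaled values and
# return between_SS / total_SS
explained <- function(v, k = 6) {
  fit <- kmeans(scale(v), centers = k, nstart = 25, iter.max = 50)
  fit$betweenss / fit$totss
}

ratio_raw <- explained(y)
ratio_log <- explained(log(y))
print(c(raw = ratio_raw, log = ratio_log))
```

Using several random starts (`nstart`) makes the two ratios less dependent on one lucky or unlucky initialisation, so the comparison is more stable.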
Hope this gives you some idea.
Regards,
Anant