Starting data mining with poor statistics knowledge


Do you think it is a good idea to use data mining techniques with poor knowledge of statistics?
I mean, should I learn statistics and data mining together, or learn statistics first and then use data mining?
Which would be the better strategy for a newbie in both statistics and data mining?

Knowledge of inferential statistics is important for interpreting the output of data mining techniques. If you are starting fresh, my suggestion is to study statistics first and then move on to data mining. If you are already using data mining at work, you can study statistics in parallel.

Completely agree with @karthe1 here. You will not be able to make sense of data mining without basic inferential stats.

You can look at the descriptive and inferential statistics courses on Udacity to get started.



To be honest, I can't enroll in courses (I am in Iran and it's too expensive), but I have access to Khan Academy courses.
Do you think this link would be useful for basic inferential stats?
Khan Academy statistics course
Thanks, guys.



You can access the course material for free. You don’t need to enroll.



Thanks, knual.
It is such a great site, and free! :grinning:


If you’re from Iran, you could also be at the receiving end of nonsensical policies such as this one. Last week, in a stats course I’m enrolled in at Coursera, a student from China informed us that (s)he could not even access the free/open course book! Thankfully e-mail worked, and the PDF was sent as an attachment.

Incidentally: @hossein_mortazavy, you may also find that book helpful. Here’s the link: I hope you have access to Udacity. :slight_smile:


This link will help you get started with some of the basic concepts of statistical learning. An Introduction to Statistical Learning and The Elements of Statistical Learning are among the best resources for learning statistics. I suppose you can get the ebooks for free online. Do check.


Prashanth, please note that the OP is asking about statistics (the rudimentary stuff, like mean, SD, p-values, and t-tests), which is different from statistical learning. The latter consists of advanced methods and assumes that the person already has an understanding of basic statistics, not to mention linear algebra, calculus, etc.
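
To give a rough idea of what "the rudimentary stuff" looks like in practice, here is a minimal sketch in Python using only the standard library. The scores and the hypothesised mean `mu0` are made-up numbers for illustration, not data from any course or book mentioned in this thread. It computes a mean, a sample SD, and a one-sample t statistic; turning the t statistic into a p-value needs the t distribution, which the standard library does not provide, so that step is left out.

```python
import math
import statistics

# Hypothetical sample: exam scores for 8 students (made-up numbers).
scores = [72, 85, 90, 66, 78, 88, 95, 70]
mu0 = 75  # hypothesised population mean (an assumption for this example)

n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)   # sample SD, with n - 1 in the denominator
se = sd / math.sqrt(n)          # standard error of the mean
t_stat = (mean - mu0) / se      # one-sample t statistic, n - 1 degrees of freedom

print(f"n={n}, mean={mean:.2f}, sd={sd:.2f}, t={t_stat:.2f}")
```

If you can follow why each of these lines is there, you have the kind of background that statistical learning texts take for granted.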


Please list the basic statistics topics that everyone should learn before starting data mining.


The really basic ones are covered in the textbook I linked to earlier. To be honest, I too am still on my path of learning statistics.


Dear Anon,
Thankfully, I found the attachment.
At the moment my professor is teaching The Elements of Statistical Learning and ISLR (Hastie & Tibshirani). I am attending the class on my own initiative; it is for PhD students, while I am a master's student. I find it a little hard, so I think I should improve my statistics alongside it.