What is Machine Learning and how is it different from Big Data and Business Analytics?


I am new to Business Analytics, and as I go deeper into the field I keep running into new terminology. This time it is Machine Learning.

Wikipedia says:

Machine learning is a scientific discipline that explores the construction and study of algorithms that can learn from data. Such algorithms operate by building a model based on inputs and using that to make predictions or decisions, rather than following only explicitly programmed instructions.

While going through the definition I had some doubts. Please help me answer the questions below:

  1. Is it independent of tool and language, i.e. can we do it with any tool such as SAS, R or Python?
  2. Does it use the same algorithms, such as linear regression, logistic regression, decision trees and the many other classification algorithms?
  3. As the name suggests, do we write code to understand the algorithms behind a data set?
  4. Is it widely used only to handle Big Data?



First of all, machine learning is a field of data science in which, instead of explicitly coding rules into a system, you let the system learn them by itself from data using algorithms.

Now, coming to your questions.
I will answer the 2nd first and then come to your 1st question.

  1. (Question 2) All the algorithms you mentioned are machine learning algorithms. That is what machine learning is: using such algorithms to predict some variable from data.
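
To make "learning from data instead of coding the rule" concrete, here is a minimal sketch in plain Python. The data points and the function name `fit_line` are made up for illustration; in practice you would more likely call a built-in routine in SAS, R or Python rather than write this yourself.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data that happens to follow y = 2x + 1.
# The rule itself is never written into the program; it is estimated.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6.0 + intercept  # predict y for an unseen x
print(slope, intercept, prediction)
```

The program only sees examples, yet it recovers the slope and intercept and can then predict for an input it has not seen.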

Now coming to the rest of your questions, in sequence:

  1. (Question 1) Do you still need an answer to that? Yes, you can do machine learning using any of these languages :smile:

  2. (Question 3) The algorithms you mentioned, such as linear/logistic regression and decision trees, are generic. They work very well, but they can have trouble scaling to bigger datasets and may not reach the best possible solution. So a typical machine learning practitioner who knows the math behind these algorithms tailors them to the problem and data at hand, either by using an ensemble of these models or by writing custom code to deal with the specific challenges.
    Bottom line: you do not need to write your own code for this, since you can use the built-in functions. But if you want a better-tuned solution, there is a chance you will have to write some code.

  3. (Question 4) Not necessarily; we use machine learning algorithms in most of our work, whatever the data size. But most of the algorithms we use do not scale well to bigger datasets (for example, a random forest with 100 trees on a 100 MB dataset with an equal number of variables can take hours to run), so we need techniques like mini-batch processing, stochastic methods or other less well-known algorithms to make the solution scalable, and some coding may be required for that.
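
As a rough illustration of the "ensemble of these models" idea mentioned above, here is a small sketch in plain Python. It trains three very simple one-threshold classifiers (decision stumps) on different subsets of a toy dataset and lets them vote. The data, the helper names (`fit_stump`, `ensemble_predict`) and the fixed subsets are all assumptions made for the example; real bagging would normally draw random bootstrap samples instead.

```python
def fit_stump(points):
    """Pick the threshold t that best fits the rule: predict 1 if x > t, else 0."""
    xs = sorted({x for x, _ in points})
    candidates = [xs[0] - 1] + xs  # include a threshold below every point

    def accuracy(t):
        return sum((x > t) == bool(label) for x, label in points)

    return max(candidates, key=accuracy)


def ensemble_predict(thresholds, x):
    """Majority vote over all stumps."""
    votes = sum(1 for t in thresholds if x > t)
    return 1 if votes * 2 > len(thresholds) else 0


# Toy data: label is 1 when x >= 5.
points = [(x, 1 if x >= 5 else 0) for x in range(10)]

# Three deterministic subsets, each leaving out every third point.
subsets = [[p for i, p in enumerate(points) if i % 3 != k] for k in range(3)]
thresholds = [fit_stump(subset) for subset in subsets]

print(thresholds)
print([ensemble_predict(thresholds, x) for x in range(10)])
```

Each stump sees slightly different data and so may settle on a slightly different threshold; the vote smooths out those individual differences.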
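
The mini-batch and stochastic methods mentioned above can be sketched as follows. This is only a minimal example, assuming a one-variable linear model and made-up data; the function name `minibatch_sgd` and the settings (batch size, learning rate, number of epochs) are illustrative choices, not recommendations. The point is that each update touches only a few rows, so the whole dataset never has to be processed in one step.

```python
import random


def minibatch_sgd(xs, ys, batch_size=4, lr=0.1, epochs=500, seed=0):
    """Fit y = w * x + b by mini-batch stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the rows in a new random order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2 * error * xs[i]
                grad_b += 2 * error
            # Step against the average gradient of this small batch only.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


# Made-up, noise-free data following y = 2x + 1.
xs = [i / 10 for i in range(20)]
ys = [2 * x + 1 for x in xs]

w, b = minibatch_sgd(xs, ys)
print(round(w, 3), round(b, 3))
```

On a genuinely large dataset the batches would typically be read from disk or a database a chunk at a time rather than held in a list.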

Hope this helps.

Aayush Agrawal

© Copyright 2013-2019 Analytics Vidhya