Why is SQL a pre-requisite to get started in Data Science?




I have done quite a few courses in Data Science / Machine Learning. All of them suggest that I should have a basic knowledge of SQL, but they don’t mention why it is necessary. Could you guys help me out?


Hi @albela_angur,

SQL has its own benefits that can be very useful for a data scientist. Some of the advantages of using SQL are:

  • SQL queries can be used to retrieve large numbers of records from a database quickly and efficiently.

  • Using standard SQL, it is easier to manage database systems without having to write a substantial amount of code.

Given the huge amount of data, we need a database management system to store it, and SQL is the language used to query and analyze what is stored there. So, SQL is a pre-requisite to get started in data science.


For the most part, data is not available in a readily usable form. When working on a particular project, it is highly likely that you will have to extract data from various sources, for example relational databases.
So, SQL is for extracting data from an RDBMS.
For those reasons, it is suggested to learn SQL.
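
To illustrate what extracting data from an RDBMS can look like, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the customers table, and its rows are all made up for illustration; a real project would connect to whatever database the company actually runs.

```python
import sqlite3

# Hypothetical in-memory database standing in for a real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meera", "Pune")],
)

# Extract only the rows needed for the analysis, sorted by name.
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Pune",)
).fetchall()

# Each row comes back as a tuple; unpack it into a plain list of names.
pune_customers = [name for (name,) in rows]
print(pune_customers)

conn.close()
```

The filtering happens inside the database, so only the matching rows ever reach your analysis code.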


Hi Pulkit, the advantages you explain are the advantages of using SQL, not the reasons why a data scientist should use SQL. Could you suggest those as well?


So is it necessary to learn SQL to read from relational databases? Is there any alternative?


Hi @albela_angur,

Let’s take the example of machine learning. Machine learning involves algorithms that learn their behavior from data instead of having the process hard-coded as a set of logical rules. SQL is similar in one respect: you do not hard-code the procedure either. SQL is designed specifically for accessing data. The primary difference between SQL and general-purpose programming languages (R, Python, Java, etc.) is that SQL statements specify WHAT data operations should be performed rather than HOW to perform them.

SQL’s concise set of commands saves time and reduces the amount of programming required to perform complex queries. Learning SQL will also give you a good understanding of relational databases, which are the bread and butter of data science.
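
To make the WHAT-versus-HOW point concrete, here is a small sketch that computes the same totals twice: once with a SQL statement and once with a hand-written Python loop. The sales table and its numbers are invented for the example, and sqlite3 is used only because it ships with Python.

```python
import sqlite3

# Hypothetical sales records: (region, amount).
data = [("north", 100), ("south", 50), ("north", 25), ("south", 75)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", data)

# Declarative: state WHAT result is wanted; the database engine decides how.
sql_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)

# Imperative: spell out HOW to build the same result, step by step.
loop_totals = {}
for region, amount in data:
    loop_totals[region] = loop_totals.get(region, 0) + amount

print(sql_totals == loop_totals)

conn.close()
```

Both approaches give the same totals per region; the SQL version just leaves the grouping and summing strategy to the database.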


Yes, it is necessary to learn SQL to work with relational databases. As for other approaches, there are some query languages that are specific to a programming language; at the base level, they just act as interfaces between the programming language and SQL. As for a true alternative to SQL, there is none.

I believe that if you can take the time to learn SQL, it will be highly beneficial for your work and projects. It is generally assumed that a person who works with databases has some experience with SQL.

It is highly suggested to learn SQL because it is just the initial step towards working with databases, and SQL itself reads much like simple sentences.


I too have recently completed a course on data science, so I can totally understand why you have this doubt. I am not an engineer, so I don’t understand all the technical terms (which people here are using), but I do understand data science.
So why SQL? I will share my experience here. In almost all situations, you are not going to get data in CSV format (life would be so much easier). Companies have data warehouses where they store their data. Now, depending on the kind of business problem you want to solve as a data scientist, you have to pull data from that warehouse. How are you going to do that? That’s where SQL comes to your rescue.
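
As a rough sketch of what pulling data for a business problem might look like, the example below joins two made-up tables and summarizes orders placed since a given date. The customers and orders tables, their columns, and the values are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse tables with a handful of sample rows.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date TEXT,
    amount REAL
);
INSERT INTO customers VALUES (1, 'retail'), (2, 'wholesale');
INSERT INTO orders VALUES
    (1, 1, '2023-01-05', 20.0),
    (2, 2, '2023-01-09', 200.0),
    (3, 1, '2023-02-11', 30.0),
    (4, 2, '2022-12-30', 500.0);
""")

# Business question: order count and revenue per customer segment
# for orders placed on or after a cutoff date.
query = """
SELECT c.segment, COUNT(*) AS n_orders, SUM(o.amount) AS revenue
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.order_date >= ?
GROUP BY c.segment
ORDER BY c.segment
"""
report = conn.execute(query, ("2023-01-01",)).fetchall()

for segment, n_orders, revenue in report:
    print(segment, n_orders, revenue)

conn.close()
```

The result is already joined, filtered, and aggregated, so it can go straight into whatever analysis comes next.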


Thanks @PulkitS @anand_vidvat @djkumar for your valuable suggestions.