Knowledge of Data Structures



Is knowledge of data structures required for data analytics?
Or, more specifically, how does knowledge of data structures help a data scientist?



I believe data scientists are part software engineer and part statistician, and they are expected to know both worlds. So, simply put, I would say data scientists should have a good enough knowledge of data structures.

Let’s look at it another way:

As a data scientist, you constantly work with data, both structured and unstructured. If you want to work well, you have to know how to handle data efficiently, so that you save both the time and the resources a project requires.
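To make the efficiency point concrete, here is a minimal sketch in Python (my choice of language, not something from the question). The customer IDs and function names are made up for illustration. It runs the same membership checks against a list and against a set, which is one small case where picking the right data structure saves time.

```python
import timeit

# Hypothetical data: 50,000 customer IDs and a batch of 200 IDs to look up.
customer_ids = list(range(50_000))
lookups = list(range(49_000, 49_200))


def count_hits_list(ids, queries):
    # `in` on a list scans element by element: O(n) per lookup.
    return sum(1 for q in queries if q in ids)


def count_hits_set(ids, queries):
    # Building the set costs O(n) once; each lookup is then O(1) on average.
    id_set = set(ids)
    return sum(1 for q in queries if q in id_set)


list_hits = count_hits_list(customer_ids, lookups)
set_hits = count_hits_set(customer_ids, lookups)

# Time one run of each approach; absolute numbers depend on the machine.
list_time = timeit.timeit(lambda: count_hits_list(customer_ids, lookups), number=1)
set_time = timeit.timeit(lambda: count_hits_set(customer_ids, lookups), number=1)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s  same answer: {list_hits == set_hits}")
```

Both versions give the same answer. The set version should normally be much faster on data of this size, though the exact timings will vary from machine to machine.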

Hope it helps.



I think it is essential to know data structures to be a successful data scientist. When looking at data, knowing that it’s panel or hierarchical data immediately points you to the modeling techniques that would work. For example, if your dataset has patients who each have a primary care physician, parents with children, countries with different races, companies with departments and employees in each department, or web traffic data where return visitors have session data, you need to ensure that you use models that support these hierarchical data structures. You can use regression, but it will need to be multilevel regression that takes all levels into account. It’s also useful to know what kind of data structure you are dealing with if you need to partition variance at different levels.
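As a rough illustration of partitioning variance at different levels, here is a small Python sketch. The departments and scores are invented, and this is only a descriptive split of the total variance into a between-department part and a within-department part. It is not a multilevel regression, which would estimate these components from a fitted model (for instance with a mixed-effects package).

```python
from statistics import mean, pvariance

# Hypothetical two-level data: employee scores nested within departments.
scores_by_department = {
    "sales": [6.0, 7.0, 8.0],
    "engineering": [3.0, 4.0, 5.0],
    "support": [9.0, 10.0, 11.0],
}

all_scores = [s for scores in scores_by_department.values() for s in scores]
grand_mean = mean(all_scores)
n = len(all_scores)

# Between-department part: spread of department means around the grand mean,
# weighted by department size.
between = sum(
    len(scores) * (mean(scores) - grand_mean) ** 2
    for scores in scores_by_department.values()
) / n

# Within-department part: spread of each employee around their own department mean.
within = sum(
    (s - mean(scores)) ** 2
    for scores in scores_by_department.values()
    for s in scores
) / n

# The two parts add up to the total (population) variance.
total = pvariance(all_scores)
share_between = between / total

print(f"between={between:.3f} within={within:.3f} total={total:.3f}")
print(f"share of variance at department level: {share_between:.2f}")
```

In this made-up data, most of the variance sits at the department level. That is the kind of signal that tells you a model ignoring the grouping would be a poor fit.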

The bottom line is that without a good understanding of your data (and knowing its structure is just one aspect of that understanding), you can’t be sure that you are running appropriate modeling methods against that data set.

So I think you can see that knowing data structures is not only required but also useful for a data scientist.




Do you mean data structures or the structure of the data?

The technical term may or may not be relevant, depending on the tools you will work with. Each tool has its own interpretation and use of it.
As for the second, knowing the structure of the data, I am sure it will only help.

Does everyone running models know the structure of the data in detail? Probably not, and they still manage to build models. But knowing the structure makes your job easier, not only in modelling and descriptive analysis but also in presenting the data.