Choosing a sub-domain in Data Science



Hi Everyone,

I am a final-year master’s student in IT and am looking forward to pursuing an MS in Data Science starting fall 2016. I have working knowledge of R, machine learning, and statistics. How can I best prepare so that I can ace my MS program? After MOOCs and data science competitions, what’s the next step? Should I choose a sub-domain such as sociology or business? Should I work on research problems, or should I gain some industry experience?

Thanks in advance.


Hi Bhavya,

As someone who has an MS degree (not in data science), I’d suggest the following to ace your MS program:

  1. Study ahead before class. Ask lots of questions.

  2. Form a study group with your peers and friends. Collaboration works wonders.

  3. Don’t postpone things until the last moment. Start your assignments the day they are posted.

  4. Plan your studies. The general rule of thumb is to spend at least three hours per week on a course for each credit it is worth. For example, if your machine learning course is worth 4 credits, be prepared to spend about 12 hours on it every week.

  5. I have seen people who want to take easy courses, complete their MS with ease, and get a job. I’d recommend that you take challenging courses instead, since they will give you a much stronger foundation for your career.

  6. Discuss your career goals, your progress, and other academic issues with your advisor every week. Stay in touch with your professors and do well in their courses. Who knows, you might get great letters of recommendation from them.

  7. If you want to review your mathematics for data science, concentrate on the following:
    a. Single-variable calculus - limits and continuity, differentiation, integration, parametric curves, sequences and series.
    b. Linear algebra - matrices, vectors, lines and planes (in 3D and n dimensions), vector spaces, subspaces, linear transformations, determinants, inner product spaces, eigenvalues and eigenvectors, singular value decomposition (extremely important for ML), and other matrix factorization methods.

    [Study multivariable calculus after single-variable calculus and linear algebra. Things in multivariable calculus become a lot clearer when you have working knowledge of both.]

    c. Multivariable calculus - limits and continuity in several variables, partial derivatives, gradients and directional derivatives, total derivative, chain rule, integration in several variables (double and triple integrals), vector analysis and vector calculus (div, curl, grad, etc.).

[Study probability and statistics after multivariable calculus]

    d. Probability and statistics - probability, conditional probability, basic counting methods, random variables (continuous and discrete), expectation and variance, covariance, correlation, functions of random variables, generating functions, distributions, limit theorems, point estimation, hypothesis testing, interval estimation.
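
To make the rule of thumb in point 4 concrete, here is a minimal sketch in Python. The function name `weekly_study_hours`, the `hours_per_credit` parameter, and the 3-credit statistics course are illustrative assumptions of mine; only the 4-credit machine learning example comes from the advice above.

```python
def weekly_study_hours(credits, hours_per_credit=3):
    """Estimate the minimum weekly study hours for one course.

    Rule of thumb from point 4: at least three hours per week per credit.
    """
    return credits * hours_per_credit


# Hypothetical course load: course name -> credits.
# Only the 4-credit machine learning course appears in the original advice.
course_credits = {"Machine Learning": 4, "Statistics": 3}

for name, credits in course_credits.items():
    print(f"{name}: at least {weekly_study_hours(credits)} hours per week")

# Sum over all courses to see the weekly commitment for the whole term.
total = sum(weekly_study_hours(c) for c in course_credits.values())
print(f"Total: at least {total} hours per week")
```

Adding up the per-course figures like this before registering can help you judge whether a set of challenging courses is realistic for one term.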

“Should I be choosing a sub-domain like Sociology, Business, etc.? Should I be going for research problems or should I gain some industrial experience?”

It depends on what you want to do. Talk to your advisor about it. If you want to do financial analytics, my suggestion would be to take a finance 101 course, which will help you understand the terminology used by finance professionals. Combining that with your knowledge of machine learning and data science, you’d probably be able to communicate better with people in that field. If you want to do research, talk to professors who do research in data science (theory, algorithm development, visualization, or anything else that interests you), get some ideas (or generate your own), and work on independent projects. I’d be happy to answer more should you have any questions.