What are the deliverables for a data scientist?

analytics
data_science

#1

I would like to know what the deliverables are for a data scientist.
For example, if they have created a model using a decision tree, how will that model be deployed?


#2

More than the how, the deliverable is what the model solves and whether it is actually relevant to the problem at hand. The how is just the means of getting it done.

In other words, the deliverable is the what more than the how. For the how, just look at the stack and find the best way to implement the model, as long as the what is addressed.

Shivaa


#3

Thanks @shivaacc!

Actually, my question was this: suppose a data scientist has created a model and the model works perfectly on the test data set.
How will that model be useful?
Is the business going to implement that model in its IT solution?
Or, when a new data item comes in, will the model be applied to that item?


#4

When you look again at what the model does, it is basically generating a probability value for, say, an observation. The model is meant to help you guess, statistically, with maximum accuracy, so that you can prepare. You will know what you are trying to guess. And this will also help you

So usually the data you work with is either a subset of the actual data or the actual data available up to that point in time. In general, you use actual historical data to create a model.
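
To make the "new data item" case concrete, here is a minimal sketch of one way it could work, not a description of any particular production setup. It fits a one-split decision tree (a "stump") on historical data, serializes it with `pickle` as a stand-in for handing the model over, then loads it and scores a new observation. The data, the feature (monthly spend), the label (churn) and the helper names `fit_stump` and `predict_proba` are all made up for illustration.

```python
import pickle


def fit_stump(values, labels):
    """Fit a one-split decision tree on a single numeric feature.

    Tries each observed value as a threshold and keeps the split with the
    fewest misclassifications. Each side stores its share of positive labels,
    which is used as the predicted probability.
    """
    best = None
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue  # a split must leave data on both sides
        # Errors made if each side predicts its majority class.
        errors = (min(sum(left), len(left) - sum(left))
                  + min(sum(right), len(right) - sum(right)))
        if best is None or errors < best["errors"]:
            best = {
                "threshold": threshold,
                "p_left": sum(left) / len(left),
                "p_right": sum(right) / len(right),
                "errors": errors,
            }
    return best


def predict_proba(model, value):
    """Return the estimated probability of the positive class for one item."""
    if value <= model["threshold"]:
        return model["p_left"]
    return model["p_right"]


# Hypothetical historical data: monthly spend and whether the customer churned.
history_spend = [10, 15, 20, 25, 60, 70, 80, 90]
history_churn = [1, 0, 1, 1, 0, 0, 0, 0]

model = fit_stump(history_spend, history_churn)

# Serialize the fitted model; in practice these bytes would be written to a
# file or a model store that the business application reads from.
blob = pickle.dumps(model)

# Later, inside the application: load the model and score a new data item.
deployed = pickle.loads(blob)
new_customer_spend = 12
churn_probability = predict_proba(deployed, new_customer_spend)
print(f"Estimated churn probability: {churn_probability:.2f}")
```

With a real library the fitting step would be replaced by that library's decision tree, but the shape stays the same: fit on history, persist the fitted object, load it where the new data arrives, and call predict.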

Let's consider that to be version 1. Usually, when you run your model on current data, you will find that accuracy goes for a toss. You then use this new information to start tuning your model, with the aim of increasing accuracy.
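
As a rough sketch of that check, assuming you can get true labels for recent data, you could compare accuracy on the historical set with accuracy on current data and flag the model for tuning when the drop is too large. The rule `model_v1`, the data and the 0.10 tolerance are all hypothetical.

```python
def accuracy(predict, values, labels):
    """Share of observations for which the prediction matches the label."""
    correct = sum(1 for x, y in zip(values, labels) if predict(x) == y)
    return correct / len(labels)


def needs_retraining(baseline_acc, current_acc, max_drop=0.10):
    """Flag the model when accuracy fell by more than max_drop (an assumed tolerance)."""
    return (baseline_acc - current_acc) > max_drop


def model_v1(spend):
    """Hypothetical version 1: predict churn (1) for low spenders."""
    return 1 if spend <= 25 else 0


# Made-up historical data that version 1 was built on.
historical_x = [10, 15, 20, 25, 60, 70, 80, 90]
historical_y = [1, 1, 1, 1, 0, 0, 0, 0]

# Made-up current data: behaviour has shifted, churn now reaches higher spenders.
current_x = [10, 20, 30, 40, 50, 60, 70, 80]
current_y = [1, 1, 1, 1, 1, 0, 0, 0]

baseline = accuracy(model_v1, historical_x, historical_y)
current = accuracy(model_v1, current_x, current_y)

print(f"baseline={baseline:.3f} current={current:.3f}")
if needs_retraining(baseline, current):
    print("Accuracy dropped; time to tune version 1 on the newer data.")
```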

In practical terms, you need to understand that models are not built in a factory and simply fitted onto a data set. Instead, the structure of the model is usually created in the factory, which helps you speed things up when you work on actual data. But there is always a lot of work required even after the first version is created.

If you do create a dynamic model that learns on its own from the data, you are talking about high-end AI-range models :slight_smile: but even that will require major tweaking if it was not created on an actual data set or over a substantial period of time.

Hope this helps

Shivaa


#5

Hi @shivaacc ,

Thanks for replying. It is very helpful.

Regards,
Animesh


#6

@animeshdevarshi

In a business environment, things like model interpretability and the relevance of the features created are also important, not only the score on the test set.

Your score might be perfect on the test set. But is the evaluation metric you have chosen the best possible one for the given business situation? How much time does your model take to train? Does your model generalize? Things like these matter as well, beyond accuracy.
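
A small illustration of the metric point, with made-up numbers: on an imbalanced problem, a model that never flags anything can beat a useful model on accuracy while being worthless on recall. The labels and both sets of predictions below are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual positives that the model caught."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# 100 hypothetical transactions, of which only 5 are fraudulent (label 1).
y_true = [1] * 5 + [0] * 95

# Model A never flags anything.
always_negative = [0] * 100

# Model B catches 4 of the 5 frauds, at the cost of 10 false alarms.
flags_some = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 85

for name, preds in [("A (never flags)", always_negative),
                    ("B (flags some)", flags_some)]:
    print(f"{name}: accuracy={accuracy(y_true, preds):.2f} "
          f"recall={recall(y_true, preds):.2f}")
```

Model A wins on accuracy yet catches no fraud at all, so which metric is "best" depends on what a missed case costs the business compared with a false alarm.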

The Netflix challenge is a classic example of this.


#7

Thanks @sauravkaushik8 for this useful information.