Hi, I am new to data science and am learning data analytics on my own. I have a good foundation in R programming and in concepts such as descriptive and inferential statistics. I decided to analyze a healthcare-related dataset by myself, and these are the steps I have followed so far:
- I collected a raw dataset called Haberman's Survival Data from a website.
- Then I cleaned the dataset, since many columns were merged into one another. I separated them and did some further cleaning, such as converting each column to an appropriate data type (numeric, character, or factor) based on what the dataset description says each variable represents.
- Then I did some exploratory data analysis: histograms to see the distribution of each variable, boxplots to find outliers, and scatterplots to examine the relationships between variables.
- Then I computed the measures of central tendency (mean, median, mode) for each variable.
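A minimal sketch of the steps above might look like the following. It is written in Python using only the standard library rather than R, so treat it as an illustration of the workflow and not the exact code that was run. The rows are made-up placeholders (not real records), the column names `age`, `op_year`, `nodes`, and `status` are assumed labels for the four fields in the dataset description, and the 1.5 × IQR rule is used as a stand-in for what a boxplot flags as outliers.

```python
import csv
import io
import statistics

# Hypothetical rows in a comma-separated layout (placeholders, NOT real records).
# Reading such a file without a separator is one way columns end up "merged".
RAW = """30,64,1,1
34,60,0,2
38,66,5,1
42,62,0,1
42,59,23,2
50,61,0,1
55,63,2,1
61,65,1,2
"""


def load_rows(text):
    """Split each line into fields and convert each field to a suitable type."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        age, op_year, nodes, status = record
        rows.append({
            "age": int(age),          # numeric
            "op_year": int(op_year),  # numeric
            "nodes": int(nodes),      # numeric
            "status": status,         # kept as a category label, like a factor in R
        })
    return rows


def central_tendency(values):
    """Return the mean, median, and mode of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


def iqr_outliers(values):
    """Return values outside 1.5 * IQR of the quartiles (a boxplot-style rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


rows = load_rows(RAW)
nodes = [row["nodes"] for row in rows]
print(central_tendency(nodes))
print(iqr_outliers(nodes))
```

Note that different tools compute quartiles slightly differently, so the outliers flagged this way may not match R's `boxplot` exactly.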
Now I am stuck. How should I proceed from here?
Is my approach to the dataset correct?
What process would a data scientist typically follow when analyzing data?
Although I know the tool and the statistical concepts, I don't know the correct procedure for analyzing a dataset, so please help me.
Please correct me if I have gone wrong in any of these steps.