Using SAS Studio (University Edition) for the Loan Prediction problem



Is anyone using SAS Studio (University Edition) for this exercise? This is my first time in a hackathon.

Link to hackathon problem:

What steps should I follow to build a predictive model for this exercise? Also, SAS Studio (University Edition) comes with a rich collection of pre-written tasks. If anyone is using it, are there any of these tasks that I can use in my attempt to build a predictive model?





I worked with SAS Studio a couple of years ago, so my knowledge might be outdated. But here is what I think you can try:

  1. Use the data exploration tasks to check the distribution of each variable and to identify outliers.
  2. Run Chi-Square tests to see which categorical variables are significantly associated with the outcome you want to predict.
  3. Finally, build your logistic regression model and interpret the output in SAS Studio.

The open course on should be able to cover these steps in more detail.
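
To make step 2 a bit more concrete, here is a minimal sketch of what a Chi-Square test of independence computes. It is written in plain Python purely for illustration; in SAS Studio a task such as Table Analysis should give you the same statistic without any coding. The function names and the counts are hypothetical and are not taken from the hackathon data.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_one_dof(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical counts (assumption, for illustration only):
# rows = credit history (good, bad), columns = loan approved (yes, no)
example = [[80, 20], [10, 30]]
stat, dof = chi_square_independence(example)
p = p_value_one_dof(stat)  # valid here because a 2x2 table has 1 degree of freedom
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p:.2g}")
```

A small p-value suggests the variable is associated with the outcome and is worth trying in the logistic regression. The p-value helper above only covers 2x2 tables; for larger tables you would rely on the software's output.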



Thanks for your input, Kunal. I will need to take the SAS Statistics course to gain a better understanding.

Even before I get to the steps you have outlined, what data transformations would I need so that I can perform sound analyses on the data? I have compiled a list of tasks in SAS Studio that I think are generally useful for data and statistical analysis. In addition to what you have recommended, do you think any other tasks or analyses from the list below might be of value?

Data Tasks
List Table Attributes, Characterize Data Task, Describe Missing Data, List Data, Transpose Data, Stack/Split Columns, Filter Data, Select Random Sample, Partition Data, Sort Data, Rank Data, Transform Data, Standardise Data
Graph Tasks
Bar Chart, Bar-Line Chart, Box Plot, Bubble Plot, Histogram, Line Chart, Mosaic Plot, Pie Chart, Scatter Plot, Series Plot, Simple HBar
Statistics Tasks
Data Exploration, Summary Statistics, Distribution Analysis, One-Way Frequencies, Correlation Analysis, Table Analysis, t-Tests, One-Way ANOVA, Nonparametric One-Way ANOVA, N-Way ANOVA, Analysis of Covariance, Linear Regression, Binary Logistic Regression, Predictive Regression Models, Generalised Linear Models, Mixed Models, Partial Least Squares Regression
Power and Sample Size
Pearson Correlation, Multiple Regression, Confidence Intervals, Tests of Proportion, t Tests, Cox Regression
Multivariate Analysis
Principal Component Analysis, Factor Analysis, Canonical Correlation, Discriminant Analysis, Correspondence Analysis, Multidimensional Preference Analysis
Forecasting
Time Series Data Preparation, Time Series Exploration, Modeling and Forecasting

Appreciate your time.