Identify the Digits


How to convert an image data set into CSV format for analysis


Hi @deepak9001,

Depending on the language you’re working in, there are libraries available for reading image files and converting them to arrays for analysis. For example, a 28x28 image can be read into an array of shape (28, 28).

Below is a script written in Python for the “Identify the Digits” practice problem:

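import os
import numpy as np
# Note: scipy.misc.imread was removed in newer SciPy releases;
# imageio.imread is a common replacement there.
from scipy.misc import imread

# train (table with a filename column) and data_dir are assumed to be defined earlier
temp = []
for img_name in train.filename:
    image_path = os.path.join(data_dir, 'Train', 'Images', 'train', img_name)
    img = imread(image_path, flatten=True)  # read as a grayscale 2-D array
    img = img.astype('float32')
    temp.append(img)  # collect each image so it can be stacked below
train_x = np.stack(temp)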

This NumPy array can now be used easily for analysis.
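To get from such an array to the CSV format asked about, one option is to flatten every image into a single row and write the rows out. This is only a minimal sketch, not part of the original script: the synthetic `train_x`, the `pixel0`, `pixel1`, ... column names, and the in-memory buffer used instead of a real file are all assumptions made for illustration.

```python
import io
import numpy as np

# Stand-in for train_x: 5 synthetic 28x28 grayscale images (hypothetical data).
rng = np.random.default_rng(0)
train_x = rng.integers(0, 256, size=(5, 28, 28)).astype('float32')

# Flatten each 28x28 image into a single row of 784 pixel values.
flat = train_x.reshape(train_x.shape[0], -1)

# Write the rows as CSV. A StringIO buffer stands in for a real file path here;
# the pixel column names are made up for this example.
header = ','.join(f'pixel{i}' for i in range(flat.shape[1]))
buffer = io.StringIO()
np.savetxt(buffer, flat, delimiter=',', header=header, comments='', fmt='%.0f')

# Read the CSV back to check that the pixel values survived the round trip.
buffer.seek(0)
restored = np.loadtxt(buffer, delimiter=',', skiprows=1)
```

Passing a file name to `np.savetxt` instead of the buffer would write the same content to disk, with one image per row.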

For a more in-depth analysis of the practice problem, along with the solution, refer to this article.


Hi, thanks a lot. I am planning to use R for the analysis. Please explain how to do the same in R.


Hi @deepak9001

Let’s say the image name is pic.jpg.


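library(jpeg)

myimage <- readJPEG("pic.jpg")

# Convert the pixel matrix to a data frame
# (assumes a grayscale image, so readJPEG returns a 2-D matrix)
df <- as.data.frame(myimage)
write.csv(df, "image.csv", row.names = FALSE)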


Thanks a lot Manish


That was so well explained. I’m just curious to know what comes next: do I need a data frame per image to build a model?
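On the "what next" question, one possible approach (an assumption on my part, not something confirmed in this thread) is to avoid keeping a data frame per image. Instead, flatten each image into one row and stack the rows, so the whole data set becomes a single table with a label column. The sketch below uses Python to match the earlier script; the synthetic images and the labels are hypothetical stand-ins.

```python
import numpy as np

# Hypothetical stand-ins: 4 synthetic 28x28 images and their digit labels.
rng = np.random.default_rng(1)
images = [rng.random((28, 28), dtype=np.float32) for _ in range(4)]
labels = np.array([3, 7, 1, 9])

# One row per image: flatten each 28x28 matrix to 784 values and stack the rows.
pixels = np.stack([img.ravel() for img in images])

# Put the label in the first column so the whole data set is a single table.
table = np.column_stack([labels, pixels])

# For modelling, split the table back into features X and target y.
y = table[:, 0].astype(int)
X = table[:, 1:]
```

With this layout, each row is one training example, which is the shape most modelling libraries expect.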


For PNG files, the png library and its readPNG() function can be used.