Identify the Digits


#1

How do I convert an image data set into CSV format for analysis?


#2

Hi @deepak9001,

Depending on the language you're working in, there are libraries available for reading image files and converting them to arrays for analysis. For example, a 28x28 grayscale image can be read into an array of shape (28, 28).

Below is a script written in Python for the “Identify the Digits” practice problem:

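import os
import numpy as np
from PIL import Image  # scipy.misc.imread was removed in SciPy 1.2; Pillow is used here instead

# Assumes `train` (a DataFrame with a `filename` column) and `data_dir` are already defined.
temp = []
for img_name in train.filename:
    image_path = os.path.join(data_dir, 'Train', 'Images', 'train', img_name)
    # Read as a grayscale float image, which is what imread(..., flatten=True) did
    img = np.asarray(Image.open(image_path).convert('F'))
    img = img.astype('float32')
    temp.append(img)

# Stack the images into one array of shape (n_images, height, width)
train_x = np.stack(temp)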

This NumPy array can now easily be used for analysis.
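Since the original question was about CSV, here is a minimal sketch of one way to get from such an array to a CSV file with one row per image. The helper name images_to_csv, the pixel0, pixel1, ... column names and the synthetic demo array are assumptions made for illustration; they are not part of the practice problem's data.

```python
import os
import tempfile

import numpy as np


def images_to_csv(images, csv_path):
    """Write a stack of equally sized grayscale images to CSV, one row per image."""
    images = np.asarray(images, dtype="float32")
    n_images = images.shape[0]
    # Flatten each (height, width) image into a single row of height * width pixels.
    flat = images.reshape(n_images, -1)
    # Hypothetical column names: pixel0, pixel1, ...
    header = ",".join(f"pixel{i}" for i in range(flat.shape[1]))
    np.savetxt(csv_path, flat, delimiter=",", header=header, comments="", fmt="%g")
    return flat


# Synthetic stand-in for train_x: three 28x28 images.
demo = np.arange(3 * 28 * 28, dtype="float32").reshape(3, 28, 28)
csv_path = os.path.join(tempfile.mkdtemp(), "train_pixels.csv")
flat = images_to_csv(demo, csv_path)
print(flat.shape)
```

With real data you would pass train_x instead of demo. Keeping one row per image means the whole data set fits in a single table, which is usually the easiest layout to feed into a model.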

For a more in-depth analysis of the practice problem along with the solution, refer to this article.


#3

Hi, thanks a lot. I am planning to use R for the analysis. Please explain how to do the same in R.


#4

Hi @deepak9001

Let’s say the image name is pic.jpg:

install.packages("jpeg")
library(jpeg)

myimage <- readJPEG("pic.jpg")
dim(myimage) # height x width for a grayscale image; colour images add a third dimension for channels

df <- as.data.frame(myimage) # convert matrix to data frame
write.csv(df, "image.csv", row.names = FALSE)

#5

Thanks a lot Manish


#7

That was so well explained. Just curious to know what comes next: do I need a data frame per image to build a model?


#8

For PNG files, the png library and its readPNG() function can be used.