How to import random rows from a large CSV in Python?

ipython
python

#1

Hello,

I cannot read a large dataset in a reasonable time because of my machine's hardware limits, so I decided to import a random sample of records instead. I would like to read the sampled records directly from the CSV file, without loading the whole file first. How can I do this in Python?

Thanks


#2

@Imran,

You can use something like this:

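import pandas
import random

n = 100000  # number of data rows in the file (not counting the header line)
s = 1000  # desired sample size
filename = "xyz.csv"
# Choose n - s data rows to skip. Line 0 is the header, so sample from 1..n
# to make sure the header is never skipped. (range replaces Python 2's xrange.)
skip = sorted(random.sample(range(1, n + 1), n - s))
df = pandas.read_csv(filename, skiprows=skip)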
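
One caveat: the snippet above needs the row count n up front. If you don't know it, a possible alternative is reservoir sampling, which picks a uniform sample in a single pass over the file. This is only a rough sketch using the standard csv module rather than pandas. The function name sample_csv_rows and its parameters are made up for illustration, and it assumes the first line of the file is a header.

```python
import csv
import random


def sample_csv_rows(path, sample_size, seed=None):
    """Return (header, rows), where rows is a uniform random sample of data rows.

    Hypothetical helper: assumes the first line of the file is a header.
    """
    rng = random.Random(seed)
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)  # first line is treated as the header
        reservoir = []
        for i, row in enumerate(reader):
            if i < sample_size:
                # Fill the reservoir with the first sample_size rows
                reservoir.append(row)
            else:
                # Replace an existing entry with probability sample_size / (i + 1)
                j = rng.randint(0, i)  # randint is inclusive on both ends
                if j < sample_size:
                    reservoir[j] = row
    return header, reservoir
```

Only sample_size rows are held in memory at any time, so this should work even when the file is much larger than RAM. If you need a DataFrame afterwards, pandas.DataFrame(rows, columns=header) should work, but note that csv.reader returns every value as a string, so you would have to convert the column types yourself.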