Read a large json dataset in pandas

pandas
python
json

#1

Hey,

I have a large dataset in a JSON file, 1.2 GB in size. I’m not able to read it using pandas.read_json(). Is there a way to handle large datasets like this?

Thanks in advance!


#2

Hi,
Using read_json() is not an efficient way to handle large JSON files. It loads the complete file into memory before processing it, so with a file of this size you are likely to get a memory error.

You can try the ijson module, which works with JSON as a stream rather than loading the file as a single block.

Also worth a look: "Python & JSON: Working with large datasets using Pandas"