How to parse nested JSON using PySpark

json
pyspark

I have a nested JSON file and I need to parse the data into separate columns. The schema of my data is shown here: https://i.stack.imgur.com/35kIn.png

How can I extract all of the data in the JSON into separate columns, such as companynumb, drugadministrationroute, drugauthorizationnumb, drugbatchnumb, medicinalproduct, application_number, brand_name, generic_name, manufacturer_name, reaction, receiptdate, receivedate, and serious?

I tried using wholeTextFiles, multiLine, and expr, but I could not extract all of the fields.

I tried to extract medicinalproduct alone and got a type mismatch error: data = df.select(psf.expr('results.patient.drug.medicinalproduct'))
