Ok so, I’m an economist in central banking. My job is, quite literally, to extract data (not information) from whatever datasets are made available to us (stock exchanges, financial derivatives operations, trade chambers, and so forth). We filter these huge datasets down to data that is well defined and can be analyzed by academics and industry researchers.
This is the tricky part: I receive many datasets, in many file types, shapes, and sizes. My question is: how should I go about it? What steps should I take?
I want to develop a methodology: a series of steps to follow whenever I need to turn a raw dataset into a clean data frame.
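To make concrete what I have in mind, here is a rough sketch of the kind of pipeline I imagine, in Python/pandas. The function names, the set of file types, and the cleanup steps are just my own illustrative assumptions, not an established workflow:

```python
from pathlib import Path
import pandas as pd

def load_raw(path):
    """Load a raw dataset into a DataFrame, dispatching on file extension."""
    ext = Path(path).suffix.lower()
    readers = {
        ".csv": pd.read_csv,
        ".xlsx": pd.read_excel,
        ".json": pd.read_json,
        ".parquet": pd.read_parquet,
    }
    if ext not in readers:
        raise ValueError(f"Unsupported file type: {ext}")
    return readers[ext](path)

def standardize(df):
    """First-pass cleanup: normalize column names, drop fully empty rows/columns."""
    df = df.copy()
    df.columns = [str(c).strip().lower().replace(" ", "_") for c in df.columns]
    return df.dropna(axis=0, how="all").dropna(axis=1, how="all")

def profile(df):
    """Quick summary of each column, so I know what I'm dealing with."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),
    })
```

So a new file would go through load, standardize, then profile before any real analysis. Is this roughly the right shape, or is there a more principled sequence of steps?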