HDF5 vs. MDF4: a suitable file format for big data analytics

Dear community,

As a newbie in the field of big data, I’m quite interested in feedback from experienced people :wink: One of my interests is the automated evaluation of huge amounts of measurement data in the automotive industry. In my job we gather tons of data worldwide from different test benches, and all of it must be evaluated, resulting in a verdict of “good” or “needs to be observed”.
The data containers include different data types such as floats and integers (long time vectors), but also strings as metadata descriptions. The file formats in use vary, e.g. mdf4, dat, mat, xml.
A time-consuming and error-prone step is the format conversion. Now I’m thinking about a common file format to eliminate this step. To keep a certain compatibility, I’m focusing on mdf4 (ASAM standard) and hdf5 (NCSA).
I know that in principle both meet requirements such as handling different data types, compression, and database integration.
At the moment I can’t judge which one is better suited for working with existing data analytics tools and open standards, for example to collaborate with universities.
What is your opinion? Thanks in advance for your feedback.

© Copyright 2013-2019 Analytics Vidhya