I am a graduate student in Computer Engineering and a newcomer to Data Science, but the field has caught my interest and I want to build my knowledge and skills in it. I currently have a very basic understanding of Big Data terminology: Hadoop, MapReduce, Pig, YARN, Spark, etc.
However, I want to pursue one particular area: optimizing the underlying hardware for efficient Big Data processing. I have read that a combination of FPGAs, GPUs, and so-called many-core/multicore computing is well suited to Big Data and to solving data science problems.
If anyone has worked in this area or is pursuing research in it, please help guide me by answering my questions:
Is it possible for one person to work on both the software (Hadoop, MapR, Spark, etc.) and hardware (FPGA, GPU, many-core/multicore computing) aspects of Big Data?
If I want to pursue this line of research, what skills do you suggest I develop? For example:
Software - Hadoop, MapR, Pig, Spark, YARN, etc.;
Hardware - FPGA, GPU;
Programming - VHDL, Verilog, OpenCL;
Scripting languages;
Courses - Computer Architecture, Data Science, Data Visualization, Parallel Programming, etc.
Are there any research groups or people you know who are working in this area?
Any general advice, suggestions, or links to further resources would also be appreciated.
Thanks in advance!