The interview questions for a Hadoop developer can be broadly classified into three categories:
- Technical questions about Hadoop and Big Data
- Questions regarding application of Hadoop & MapReduce to business problems
- Puzzles and case studies to test logical thinking
Here are a few examples of questions for each of the categories:
Technical questions about Hadoop:
- What is Hadoop?
- What is MapReduce?
- How do Hadoop & MapReduce work? Explain with an example
- What are the NameNode and the JobTracker? Why are they necessary in a Hadoop cluster?
…the list can go on
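For the "explain with an example" question, word count is the usual illustration. The following is a minimal sketch in plain Python that only simulates the map, shuffle and reduce phases in memory; it does not use Hadoop itself, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle/sort: group emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

In a real cluster the mappers run in parallel on different blocks of the input and the framework performs the shuffle across the network; the sketch keeps everything in one process so the data flow is easy to follow.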
You get the idea: there will be many questions to check whether you understand the technical details. These rounds are likely to be filter rounds. Selection would likely be based on the two kinds of rounds described next:
Questions regarding application of Hadoop & MapReduce to business problems:
- If we want to analyze 1GB of data, what is the best architecture for that? What about 100GB? 1TB? 100TB?
- Please explain how you would run operations such as addition, subtraction, and finding the minimum or maximum using the Hadoop MapReduce stack. You can find some of these manipulations here: http://www.analyticsvidhya.com/blog/2014/06/mapreduce-data-manipulation/
- What considerations would you keep in mind while moving from an Oracle database to Hadoop clusters? How would you decide the right size and number of nodes for a cluster?
- What happens when a DataNode fails in Hadoop?
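For the question about addition, minimum and maximum, one way to frame an answer is that the mapper turns each record into a (key, value) pair and the reducer folds all values for a key into the aggregate. The sketch below is plain Python that mimics that flow in memory rather than real Hadoop code; the `store,amount` record layout and the names `map_record`, `reduce_stats` and `run_job` are assumptions chosen only for illustration.

```python
from collections import defaultdict

def map_record(record):
    """Mapper: parse a hypothetical 'store,amount' line into a (key, value) pair."""
    store, amount = record.split(",")
    yield store.strip(), float(amount)

def reduce_stats(key, values):
    """Reducer: fold all values for one key into sum, minimum and maximum."""
    values = list(values)
    return key, {"sum": sum(values), "min": min(values), "max": max(values)}

def run_job(records):
    """Simulate map -> shuffle (group by key) -> reduce in a single process."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            grouped[key].append(value)
    return dict(reduce_stats(k, v) for k, v in grouped.items())

if __name__ == "__main__":
    print(run_job(["north,10", "south,4", "north,2.5"]))
```

Sum, minimum and maximum are associative and commutative, so partial results can be merged in any order. That is worth mentioning in an interview, because it is what allows such aggregates to be pre-combined on the map side before the shuffle.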
In addition to this, you should expect a few puzzles and case studies.
Here are a few links you may find useful:
Hope you get enough leads to prepare for your interview.