Interview questions & experience for Hadoop developer?



I have completed my Big Data certification from an institute with affiliation from Wiley publishers and have to appear for a Hadoop developer interview next week.

Please share any interview questions (& answers) or experiences for a Hadoop developer.



The interview questions for a Hadoop developer can be broadly classified into three categories:

  • Technical questions about Hadoop and Big Data
  • Questions regarding application of Hadoop & MapReduce to business problems
  • Puzzles and case studies to check on logical thinking

Here are a few examples of questions for each of the categories:

Technical questions about Hadoop:

  • What is Hadoop?
  • What is MapReduce?
  • How do Hadoop & MapReduce work? Explain with an example
  • What are the NameNode and the JobTracker? Why are they necessary in a Hadoop cluster?
    …the list can go on

You get the idea: there will be many questions to check whether you understand the technical details. These kinds of rounds are likely to be filter rounds. The selection itself would likely be based on the two rounds described below.
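For the "explain with an example" question above, word count is the usual illustration. The following is only a minimal sketch in plain Python that imitates the three phases (map, shuffle, reduce) in a single process; it does not use the Hadoop API, and the function names are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values by key, which the framework does
    between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

In a real cluster the mappers run in parallel on different input splits and the shuffle moves data across the network, but the flow of keys and values is the same as in this sketch.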

Questions regarding application of Hadoop & MapReduce to business problems:

  • If we want to analyze 1 GB of data, what is the best architecture for that? What about 100 GB? 1 TB? 100 TB?
  • Please explain how you would run various operations like addition, subtraction, finding the minimum, maximum etc. using the Hadoop MapReduce stack.
  • What considerations would you keep in mind while moving from an Oracle database to Hadoop clusters? How would you decide the right size and number of nodes for a cluster?
  • What happens when a datanode fails in Hadoop?
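For the question on addition, minimum and maximum, one way to think about it is that each mapper emits a partial record and the reducer merges partial records pairwise. The sketch below is plain Python rather than Hadoop code, with hypothetical function names, and it assumes a non-empty list of numbers.

```python
from functools import reduce


def map_stats(value):
    """Mapper: turn one number into a partial (min, max, sum, count) record
    under a single shared key."""
    return "stats", (value, value, value, 1)


def combine(a, b):
    """Reducer/combiner: merge two partial records. The merge is associative
    and commutative, so the order in which partials arrive does not matter."""
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2], a[3] + b[3])


def run_stats(values):
    """Map every value, then fold the partial records into one result.
    Assumes `values` is non-empty."""
    partials = [map_stats(v)[1] for v in values]
    return reduce(combine, partials)


if __name__ == "__main__":
    minimum, maximum, total, count = run_stats([7, 3, 9, 1])
    print(minimum, maximum, total, count)
```

Subtraction is not associative, so it cannot be merged pairwise in the same way; a possible workaround is to negate the values to be subtracted in the mapper and sum everything in the reducer.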

In addition to this, you should expect a few puzzles and case studies.

I hope this gives you enough leads to prepare for your interview.



Here are some frequently asked interview questions:
Define Hadoop Streaming.
Explain InputSplit in Hadoop.
Explain the difference between a job and a task.
Define Sqoop in Hadoop.
What are MapFiles in Hadoop?
What is the default HDFS replication factor in Hadoop?
What is the outermost part of the HBase data model?

I hope this helps.