Interview questions & experience for Hadoop developer?



I have completed my Big Data certification from an institute with affiliation from Wiley publishers and have to appear for a Hadoop developer interview next week.

Please share any interview questions (& answers) or experiences for a Hadoop developer.



The interview questions for a Hadoop developer can be broadly classified into three categories:

  • Technical questions about Hadoop and Big Data
  • Questions regarding application of Hadoop & MapReduce to business problems
  • Puzzles and case studies to check on logical thinking

Here are a few examples of questions for each of the categories:

Technical questions about Hadoop:

  • What is Hadoop?
  • What is MapReduce?
  • How do Hadoop & MapReduce work? Explain with an example
  • What are the NameNode and JobTracker? Why are they necessary in a Hadoop cluster?
    …the list can go on

You get the idea: there will be a lot of questions to check whether you understand the technical details. These rounds are likely to be filter rounds. Selection would likely be based on the two rounds described next:
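
For the "explain with an example" question, word count is the usual illustration. The following is only a minimal in-memory sketch of the map, shuffle, and reduce steps in plain Python; it does not use Hadoop itself, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this illustration:

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data is big", "hadoop stores big data"]))
```

In a real cluster the mappers and reducers run on different nodes and the shuffle moves data over the network; here everything runs in one process just to show the data flow.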

Questions regarding application of Hadoop & MapReduce to business problems:

  • If we want to analyze 1GB of data, what is the best architecture for that? What about 100GB? 1TB? 100TB?
  • Please explain how you would run various operations like addition, subtraction, finding the minimum, maximum etc. using the Hadoop MapReduce stack. You can find some of these manipulations here:
  • What considerations would you keep in mind while moving from an Oracle database to Hadoop clusters? How would you decide the right size and number of nodes for a cluster?
  • What happens when a DataNode fails in Hadoop?
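
As a rough sketch of how operations such as addition, minimum, and maximum map onto MapReduce: the mapper emits a key and a number, and the reducer aggregates all numbers that share a key. The example below simulates this in plain Python rather than on Hadoop; the `store,amount` record format and the names `mapper`, `reducer`, and `run_job` are assumptions made for illustration only:

```python
from collections import defaultdict


def mapper(record):
    """Emit (store, amount) from a hypothetical 'store,amount' text record."""
    store, amount = record.split(",")
    yield store.strip(), float(amount)


def reducer(key, values):
    """Aggregate every value that was grouped under one key."""
    values = list(values)
    return key, {"sum": sum(values), "min": min(values), "max": max(values)}


def run_job(records):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return dict(reducer(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(run_job(["a,10", "b,5", "a,2.5"]))
```

Sum, minimum, and maximum are all associative and commutative, so in an interview it is worth mentioning that the same reducer logic could also run as a combiner on the map side to cut down shuffle traffic.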

In addition to this, you should expect a few puzzles and case studies.
Here are a few links you may find useful:

Hope you get enough leads to prepare for your interview.



Here are some frequently asked interview questions:
Define Hadoop streaming.
Explain InputSplit in Hadoop.
Explain the difference between a job and a task.
Define Sqoop in Hadoop.
What are MapFiles in Hadoop?
Why do we use the .btn-group class?
What is the default HDFS replication factor in Hadoop?
What is the outermost part of the HBase data model?

Hope it will help you…