Calculating UCB in MCTS

In this article, in iteration 4, for S1, UCB1 is calculated as follows:


Should it instead be the following?


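UCB1 formula is given as (with N the visit count of the parent, n_i the visit count of the node, and C the exploration constant):

$$\mathrm{UCB1}(S_i) = V_i + C \sqrt{\frac{\ln N}{n_i}}$$
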
where V_i is the average reward/value of all nodes beneath this node. Does V_i at S1 drop from 20 in iteration 3 to 10 in iteration 4 because, in iteration 4, S1 has two more children? If so, I do not understand exactly why. Can someone please explain?
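
To make the question concrete: in the usual MCTS bookkeeping, V_i is the total reward backed up through a node divided by that node's visit count. Under that reading, the average changes whenever another rollout passes through the node, not because the node gained children. Below is a minimal Python sketch of that arithmetic. The numbers are assumptions for illustration and are not confirmed from the article: S1 holds a total value of 20 after one visit, a later rollout through S1 returns 0, the root has 3 visits, and C = 2. The function name `ucb1` and its parameters are hypothetical.

```python
import math


def ucb1(total_value, visits, parent_visits, c=2.0):
    """Return the UCB1 score: average value plus an exploration bonus.

    c=2.0 is an assumed exploration constant, not taken from the article.
    """
    if visits == 0:
        # Unvisited nodes are conventionally tried first.
        return math.inf
    average_value = total_value / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return average_value + exploration


# Assumed statistics for S1 after its first visit.
total_value, visits = 20.0, 1
v_before = total_value / visits

# Assumed: a second rollout passes through S1 and returns 0.
total_value += 0.0
visits += 1
v_after = total_value / visits

# Assumed: the root has been visited 3 times at this point.
score = ucb1(total_value, visits, parent_visits=3)
print(v_before, v_after, round(score, 2))  # 20.0 10.0 11.48
```

With these assumed numbers the total stays at 20 while the visit count goes from 1 to 2, so the average falls from 20 to 10 even though no reward was lost.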
