I am new to machine learning.
In an MLP with one hidden layer, if the input matrix has “n” observations and “d” features,
how many neurons should we expect to see in the input layer, and how many in the hidden layer?
What should the sizes of the “input to hidden layer weight matrix” and the “hidden layer to output weight matrix” be?
Should the output layer always have a single neuron? If not, how do we calculate the final output of the activation function(s)?
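
To make the shapes in question concrete, here is a minimal sketch of one forward pass in plain Python (standard library only). Everything beyond “n” and “d” is an assumption made for illustration: the hidden width h, the number of outputs k, the tanh hidden activation, the softmax output, and all helper names are hypothetical choices, not something fixed by the setup above.

```python
import math
import random

def matmul(A, B):
    """Multiply an (r x m) matrix by an (m x c) matrix, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add_bias(M, b):
    """Add the bias vector b to every row of M."""
    return [[v + bj for v, bj in zip(row, b)] for row in M]

def softmax(row):
    """Turn one row of scores into probabilities that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def forward(X, W1, b1, W2, b2):
    """One hidden layer: X (n x d) -> H (n x h) -> Y (n x k)."""
    H = [[math.tanh(v) for v in row] for row in add_bias(matmul(X, W1), b1)]
    Z = add_bias(matmul(H, W2), b2)
    Y = [softmax(row) for row in Z]  # one activation output per output neuron
    return H, Y

def shape(M):
    return (len(M), len(M[0]))

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

n, d = 5, 3   # n observations, d features (from the question)
h, k = 4, 2   # assumed: hidden width and number of output neurons

rng = random.Random(0)
X = random_matrix(n, d, rng)
W1, b1 = random_matrix(d, h, rng), [0.0] * h   # input-to-hidden weights: d x h
W2, b2 = random_matrix(h, k, rng), [0.0] * k   # hidden-to-output weights: h x k

H, Y = forward(X, W1, b1, W2, b2)
print(shape(X), shape(W1), shape(H), shape(W2), shape(Y))
# prints (5, 3) (3, 4) (5, 4) (4, 2) (5, 2)
```

In this sketch the input layer has d units (one per feature), while n only sets how many rows pass through the network at once. The hidden width h is a free design choice, and k is set by the task: a single output neuron is common for scalar regression or binary classification, and one neuron per class is common for multi-class problems.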