I am going to train a neural network (e.g., a feed-forward network) whose output is a single real value representing a probability (so it must lie in the [0, 1] interval). Which activation function should I use for the last layer (i.e., the output node)?
If I don't use any activation function and just output tf.matmul(last_hidden_layer, weights) + biases, the network can produce negative values, which is not acceptable: the labels are probabilities, so the predictions should be probabilities as well. If I use tf.nn.softmax or tf.nn.softplus, the model always returns 0 on the test set. Any suggestions?
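For reference, one commonly suggested option for a single probability output is a sigmoid on the raw logit, since it maps any real value into (0, 1). A minimal NumPy sketch of that squashing behavior (an illustration of the function itself, not of the model above):

```python
import numpy as np

def sigmoid(x):
    """Map real-valued logits into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Any real logit, negative or positive, lands strictly inside (0, 1).
logits = np.array([-3.0, 0.0, 2.5])
probs = sigmoid(logits)
```

In TensorFlow this would correspond to applying tf.sigmoid to the final linear layer, though whether that fixes the all-zeros behavior here depends on the loss and training setup.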
Copyright notice: content author 「boomz」, reproduced under the CC 4.0 BY-SA license with a link to the original source and this disclaimer.
Link to the original article: https://stackoverflow.com/questions/40901226/training-a-tensorflow-model-for-regression-when-labels-are-probabilities