After using TensorFlow for quite a while, I have read some Keras tutorials and implemented some examples. I have found several tutorials for convolutional autoencoders that use binary_crossentropy as the loss function.

I thought binary_crossentropy should not be a multi-class loss function and would most likely use binary labels, but in fact Keras (with the TF Python backend) calls tf.nn.sigmoid_cross_entropy_with_logits, which is actually intended for classification tasks with multiple, independent classes that are not mutually exclusive.

On the other hand, my expectation for categorical_crossentropy was that it is intended for multi-class classification where the target classes have a dependency on each other, but are not necessarily one-hot encoded. The documentation, however, states:

"(...) when using the categorical_crossentropy loss, your targets should be in categorical format (e.g. if you have 10 classes, the target for each sample should be a 10-dimensional vector that is all-zeros except for a 1 at the index corresponding to the class of the sample)."

Additionally, Keras uses tf.nn.softmax_cross_entropy_with_logits (TF Python backend) for the implementation, which itself states:

"NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect."

If I am not mistaken, the one-hot encoded classification task is just a special case, and the underlying cross-entropy loss also works with probability distributions ("multi-class", dependent labels)?

Please correct me if I am wrong, but it looks to me that the Keras documentation is, at the least, not very "detailed". If binary cross-entropy really relied on binary labels, it should not work for autoencoders, right? Likewise for categorical cross-entropy: it should only work with one-hot encoded labels if the documentation is correct. So, what is the idea behind Keras' naming of these loss functions? Is the documentation correct?

Answer: You are right in defining the areas where each of these losses is applicable:

categorical_crossentropy (and tf.nn.softmax_cross_entropy_with_logits under the hood) is for multi-class classification (classes are mutually exclusive).

binary_crossentropy (and tf.nn.sigmoid_cross_entropy_with_logits under the hood) is for binary multi-label classification (labels are independent).

See also the detailed analysis in this question.
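To make the distinction concrete, here is a minimal sketch of the two canonical setups, assuming TensorFlow 2.x with its bundled Keras; the layer sizes and input shape are invented for illustration:

```python
import tensorflow as tf

# Multi-class classification (classes mutually exclusive):
# softmax output produces one probability distribution per sample.
multi_class = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
multi_class.compile(optimizer="adam", loss="categorical_crossentropy")

# Multi-label classification (labels independent):
# sigmoid output produces one independent probability per label.
multi_label = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="sigmoid"),
])
multi_label.compile(optimizer="adam", loss="binary_crossentropy")
```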
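The quoted note from tf.nn.softmax_cross_entropy_with_logits can be checked directly: any row of labels that is a valid probability distribution is accepted, with one-hot targets being just the special case. A minimal sketch (the logits and label values are invented for illustration):

```python
import tensorflow as tf

logits = tf.constant([[2.0, 0.5, -1.0],
                      [0.1, 0.1, 3.0]])

# One-hot targets: the usual classification special case.
one_hot = tf.constant([[1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0]])

# Soft targets: any valid per-row probability distribution also works.
soft = tf.constant([[0.7, 0.2, 0.1],
                    [0.1, 0.3, 0.6]])

print(tf.nn.softmax_cross_entropy_with_logits(labels=one_hot, logits=logits))
print(tf.nn.softmax_cross_entropy_with_logits(labels=soft, logits=logits))
```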
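The autoencoder case works for the same reason: per element, binary cross-entropy is -(t*log(p) + (1-t)*log(1-p)), which is well defined for any target t in [0, 1] and is minimized at p = t, so fractional targets such as normalized pixel intensities are fine. A quick check (values invented):

```python
import tensorflow as tf

# Targets need not be exactly 0 or 1; e.g. normalized grayscale pixels.
y_true = tf.constant([[0.0, 1.0, 0.25, 0.9]])
y_pred = tf.constant([[0.1, 0.8, 0.25, 0.7]])

# Mean binary cross-entropy over the four "pixels" of the sample.
print(tf.keras.losses.binary_crossentropy(y_true, y_pred))
```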