This talk covers the theoretical background behind two common loss functions, mean squared error and cross entropy, including why they are used for machine learning at all, and what limitations you should keep in mind.
Well, you probably know that mean squared error is a good default choice for training machine learning models on regression tasks, while cross entropy is commonly recommended for classification. (No worries if not, you will still be able to follow the talk.) But do you also know why this is the case? Why these loss functions can be used for machine learning at all, and when you should consider an alternative? No? Great! This talk is for you.
The talk begins with a recap of the definition of loss functions in the context of gradient descent, followed by a short introduction to maximum likelihood estimation using a practical example. As the talk proceeds, it is shown how to derive mean squared error and cross entropy from the Gaussian and multinoulli distributions, which allows us to identify potential limitations of the two loss functions. The remaining time is used to place these findings in the broader context of machine learning, with special regard to neural networks.
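As a small illustration of the derivation mentioned above (a sketch of my own, not taken from the talk materials): under a Gaussian noise model with fixed variance, the negative log-likelihood of the residuals is just an affine transform of the mean squared error, so minimizing one minimizes the other.

```python
import numpy as np

# Hypothetical example data: predictions with small Gaussian noise.
rng = np.random.default_rng(0)
y_true = rng.normal(size=100)
y_pred = y_true + rng.normal(scale=0.1, size=100)

residuals = y_true - y_pred
sigma = 1.0  # assumed fixed noise standard deviation

# Mean squared error of the predictions.
mse = np.mean(residuals ** 2)

# Average negative log-likelihood under N(0, sigma^2) noise.
nll = np.mean(0.5 * np.log(2 * np.pi * sigma ** 2)
              + residuals ** 2 / (2 * sigma ** 2))

# The two differ only by a constant offset and a factor of 1/2,
# so they share the same minimizer.
constant = 0.5 * np.log(2 * np.pi * sigma ** 2)
assert np.isclose(nll, 0.5 * mse + constant)
```

The same pattern, applied to a multinoulli (categorical) likelihood instead of a Gaussian one, yields the cross entropy loss.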
The talk should be useful for those getting started with machine learning, as well as for practitioners who would like a theory recap. After the talk you should have a general intuition of what actually happens behind the scenes while you train your machine learning model with mean squared error or cross entropy.
Artificial Intelligence, Algorithms, Deep Learning, Data Science, Machine Learning, Statistics
Domain Expertise: some
Python Skill Level: none
Abstract as a tweet: This talk covers the theoretical background behind two common loss functions, mean squared error and cross entropy, including why they are used for machine learning at all, and what limitations you should keep in mind.