Logistic regression is a machine learning model used for classification. It is useful when the dependent variable being predicted takes one of two values (e.g. Yes/No, whether a person will win or lose, whether it will rain or not). In other words, it performs binary classification.
Let’s take a simple example: whether a student will be late to school or not. If the predicted probability of this event is greater than 0.5, we conclude that the student will be late. Being late depends on multiple factors, but to keep it simple we will consider only one input: the time the student goes to bed the previous night. A linear model would assume that the outcome changes in direct proportion to how late the student goes to bed. Logistic regression instead follows the sigmoid function, which is ‘S’ shaped and maps any real value to a value between 0 and 1.
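To make the sigmoid mapping concrete, here is a minimal Python sketch for the bedtime example. The midpoint (11 PM, written as 23.0 on a 24-hour clock) and the steepness value are assumptions picked purely for illustration, and the function names are made up for this post; a real model would estimate its parameters from data.

```python
import math

def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters, chosen only so the curve crosses 0.5 at 11 PM.
MIDPOINT_HOUR = 23.0  # 11 PM on a 24-hour clock (24.0 would be midnight)
STEEPNESS = 1.5       # how quickly the probability rises around the midpoint

def probability_late(bedtime_hour):
    """Estimated probability of being late, given the bedtime in hours."""
    return sigmoid(STEEPNESS * (bedtime_hour - MIDPOINT_HOUR))

def will_be_late(bedtime_hour, threshold=0.5):
    """Apply the 0.5 decision rule to the estimated probability."""
    return probability_late(bedtime_hour) > threshold

# Print the probability and the decision for a few bedtimes.
for hour in (21.0, 22.0, 23.0, 23.5, 24.0):
    print(hour, round(probability_late(hour), 3), will_be_late(hour))
```

The probabilities climb slowly at first, rise fastest around the midpoint, and then flatten out as they approach 1, which is what gives the function its ‘S’ shape.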
From the above curve we can see that if the student goes to bed after 11 PM, the probability of being late to school is more than 0.5, and we conclude that the student will be late. The above was an example of simple logistic regression. If more than one input variable is used to predict a binary output, it is called multiple logistic regression. For example, if we add another independent variable, such as traffic on the route to school, alongside the time the student goes to bed, the model becomes a multiple logistic regression. The two input variables are not related to each other: the time the student goes to bed has nothing to do with the traffic the next morning, but in combination they do affect the dependent variable.
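The two-input case can be sketched the same way: each input gets its own weight, the weighted inputs are added to an intercept, and the sum is passed through the sigmoid. The intercept, the weights, and the choice to measure traffic as minutes of delay are all hypothetical values chosen for illustration; in practice they would be fitted to observed data.

```python
import math

def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for illustration only.
INTERCEPT = -35.0
W_BEDTIME = 1.5  # effect of each later hour of bedtime (24-hour clock)
W_TRAFFIC = 0.1  # effect of each minute of traffic delay

def probability_late(bedtime_hour, traffic_delay_min):
    """Combine both inputs linearly, then squash the sum with the sigmoid."""
    z = INTERCEPT + W_BEDTIME * bedtime_hour + W_TRAFFIC * traffic_delay_min
    return sigmoid(z)

def will_be_late(bedtime_hour, traffic_delay_min, threshold=0.5):
    """Apply the 0.5 decision rule to the estimated probability."""
    return probability_late(bedtime_hour, traffic_delay_min) > threshold

# Same 10 PM bedtime, different traffic: the prediction changes.
print(will_be_late(22.0, 0.0))   # early night, clear roads
print(will_be_late(22.0, 30.0))  # early night, 30-minute delay
print(will_be_late(23.5, 0.0))   # late night, clear roads
```

With these made-up numbers, an early bedtime is not enough when traffic is heavy, which illustrates how the two unrelated inputs jointly determine the outcome.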