Chapter 2: Probability
2.3 Conditional Probability and Independence
In this lesson you will learn:
- What a conditional probability is
- What dependent and independent events are
- The correct notation for conditional probability
- How to compute a conditional probability
- How to determine whether two events are independent
- How to compute a joint probability for two independent events
- How to use Bayes’ rule to compute posterior probabilities
Conditional Probability
The conditional probability of some event $A$, given some other event $B$, is the probability of event $A$ occurring if it is given that event $B$ occurs. This is denoted $P(A \mid B)$.
Dependence
Event $A$ is dependent on $B$ if the conditional probability of event $A$ given event $B$ is different from the (unconditional) probability of event $A$, i.e., if:
$$P(A \mid B) \neq P(A)$$
In this situation it is also true that:
$$P(B \mid A) \neq P(B)$$
Interpreting Dependence
When two events are dependent on each other, this means that knowing that one of the events occurred makes the other event either more or less likely to occur.
Independence
Event $A$ is independent of $B$ if the conditional probability of event $A$ given event $B$ is equal to the (unconditional) probability of event $A$, i.e., if:
$$P(A \mid B) = P(A)$$
In this situation it is also true that:
$$P(B \mid A) = P(B)$$
Interpreting Independence
When two events are independent of each other, this means that knowing that one of the events occurred does not make the other event either more or less likely to occur.
If you already know that event $B$ occurs, then the sample space is reduced. For example, if you are rolling a pair of dice (one red and one green), the sample space consists of the $36$ ordered pairs $(r, g)$, where $r$ is the number showing on the red die and $g$ is the number on the green die (each die can have any of the integers $1$ through $6$ showing).
Suppose event $A$ is the event that $r + g = 8$. Of the $36$ ordered pairs, only $5$ pairs comprise event $A$.
So $P(A) = \frac{5}{36}$.
But now suppose we are given that event $B$ occurs, where event $B$ is the event that $r$ is an odd integer.
Now the sample space is cut in half, from $36$ pairs to the $18$ pairs for which $r$ is odd. The outcomes comprising this new sample space are precisely the outcomes comprising event $B$.
Further, only $2$ of the outcomes comprising event $A$ are now possible, that is, only the outcomes in $A \cap B$, which are $(3, 5)$ and $(5, 3)$.
So $P(A \mid B) = \frac{2}{18} = \frac{1}{9}$.
Since $P(A \mid B) = \frac{1}{9} \neq \frac{5}{36} = P(A)$, we can conclude that $A$ and $B$ are dependent events. Knowing that $r$ is odd makes the event $A$ a little less likely to occur.
In the example above, we computed $P(A \mid B)$ by dividing the number of outcomes in $A \cap B$ by the number of outcomes in $B$.
We would get the same result if we divided $P(A \cap B)$ by $P(B)$:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{2/36}{18/36} = \frac{2}{18} = \frac{1}{9}$$
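Because the sample space is so small, these counts can be checked by brute-force enumeration. Here is a minimal Python sketch that lists all $36$ outcomes and computes $P(A \mid B)$ both ways:

```python
from fractions import Fraction

# Brute-force check of the dice example above:
# A = "the two dice sum to 8", B = "the red die shows an odd number".
outcomes = [(r, g) for r in range(1, 7) for g in range(1, 7)]  # all 36 ordered pairs

A = [(r, g) for (r, g) in outcomes if r + g == 8]  # the 5 pairs comprising A
B = [(r, g) for (r, g) in outcomes if r % 2 == 1]  # the 18 pairs comprising B
A_and_B = [pair for pair in A if pair in B]        # (3, 5) and (5, 3)

# P(A|B) by counting within the reduced sample space B ...
print(Fraction(len(A_and_B), len(B)))                     # 1/9
# ... and by dividing P(A ∩ B) by P(B): the same result.
print(Fraction(len(A_and_B), 36) / Fraction(len(B), 36))  # 1/9
# P(A) differs from P(A|B), so A and B are dependent.
print(Fraction(len(A), 36))                               # 5/36
```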
Conditional Probability Formula
This gives us the general formula for computing a conditional probability:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
assuming, of course, that $P(B) > 0$.
Joint Probability of Dependent Events
If we rearrange the formula for computing conditional probabilities, we have a formula for computing the joint probability of two events (i.e., the probability of their intersection):
$$P(A \cap B) = P(B)\,P(A \mid B)$$
Equivalently, we could derive the formula
$$P(A \cap B) = P(A)\,P(B \mid A)$$
in the same way.
Joint Probability of Independent Events
We have shown that
$$P(A \cap B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$$
But if $A$ and $B$ are independent, then
$$P(A \mid B) = P(A) \quad \text{and} \quad P(B \mid A) = P(B)$$
This means that, if we already know that $A$ and $B$ are independent, then the formula for the joint probability of $A$ and $B$ is:
$$P(A \cap B) = P(A)\,P(B)$$
For example, suppose a security system has two components that work independently. The system will fail if and only if both components fail.
Let $A$ denote the event that the first component fails and let $B$ denote the event that the second component fails. Thus $A \cap B$ denotes the event that the system fails.
Suppose we are given $P(A)$ and $P(B)$. What is the probability the system will fail?
Since the components are independent, the joint probability of events $A$ and $B$ is:
$$P(A \cap B) = P(A)\,P(B)$$
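As a minimal sketch, the computation is a single multiplication; the two failure probabilities below are illustrative placeholders, not data from the exercise:

```python
# Joint probability of two independent failures.
# Both inputs are assumed values, chosen only for illustration.
p_fail_first = 0.01   # P(A): first component fails (assumed)
p_fail_second = 0.02  # P(B): second component fails (assumed)

# Independence lets us multiply: P(A ∩ B) = P(A) * P(B)
p_system_fails = p_fail_first * p_fail_second
print(p_system_fails)  # ≈ 0.0002 with these assumed inputs
```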
Joint Probability of Multiple Independent Events
The rule for the joint probability of two independent events can be extended to any finite collection of events $A_1, A_2, \ldots, A_n$.
If these events are mutually independent, then
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$$
Conversely, if this formula is true both for the entire collection and for every possible subset of that collection, then we can conclude that the events are mutually independent.
This formula will come in very handy for statistics. If we observe some variable measured on a random sample of $n$ elements, we can assume that the measurements are independent, which will allow us to replace the probability of an intersection of events across all $n$ elements with the product of the probabilities of the individual events per element.
For example, a random sample of $n$ children selected from a large population is tested for dyslexia. If a proportion $p$ of the children in that population have dyslexia, what is the probability that none of the children in the sample have dyslexia?
First, let $D_i$ denote the event that the $i$th randomly-selected child does not have dyslexia. The probability of event $D_i$ occurring is:
$$P(D_i) = 1 - p$$
We are asked for the joint probability $P(D_1 \cap D_2 \cap \cdots \cap D_n)$.
But since the sample is random, we have independent events, so this joint probability becomes:
$$P(D_1 \cap D_2 \cap \cdots \cap D_n) = P(D_1)\,P(D_2) \cdots P(D_n) = (1 - p)^n$$
So there is a probability of $1 - (1 - p)^n$ that at least $1$ of the children has dyslexia.
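A short Python sketch makes the computation concrete; the proportion $p$ and the sample size $n$ below are assumed values chosen only for illustration:

```python
# P(no child in the sample has dyslexia) = (1 - p)^n under independence.
p = 0.05  # assumed proportion of the population with dyslexia
n = 20    # assumed sample size

p_none = (1 - p) ** n        # joint probability that no sampled child has dyslexia
p_at_least_one = 1 - p_none  # complement: at least one child has dyslexia

print(round(p_none, 4))          # ≈ 0.3585
print(round(p_at_least_one, 4))  # ≈ 0.6415
```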
We previously mentioned the Law of Total Probability when there are two events $A$ and $B$:
$$P(A) = P(A \cap B) + P(A \cap B^c)$$
Using conditional probability, this formula becomes:
$$P(A) = P(B)\,P(A \mid B) + P(B^c)\,P(A \mid B^c)$$
Of course we can exchange the $A$'s and the $B$'s here, since the labeling is arbitrary.
Consider a probability tree in which the first branching is on event $B$ and the second branching is on event $A$.
This tree shows that event $B$ can occur with probability $P(B)$, and thus fail to occur with probability $P(B^c) = 1 - P(B)$. Given that event $B$ has occurred, event $A$ can occur with probability $P(A \mid B)$ (and fail to occur with probability $P(A^c \mid B)$). But given that event $B$ has failed to occur, event $A$ can occur with probability $P(A \mid B^c)$ (and fail to occur with probability $P(A^c \mid B^c)$).
The tree diagram does not show us the probability that event $A$ will occur, only the probability of event $A$ under two different conditions ($B$ occurring or not occurring).
But we can use the Law of Total Probability to compute:
$$P(A) = P(B)\,P(A \mid B) + P(B^c)\,P(A \mid B^c)$$
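As a quick sketch, the two-branch computation can be written out directly; all three input probabilities below are assumed values for illustration:

```python
# Two-branch Law of Total Probability:
#   P(A) = P(B) P(A|B) + P(B^c) P(A|B^c)
p_B = 0.3           # P(B), assumed
p_A_given_B = 0.8   # P(A | B), assumed
p_A_given_Bc = 0.4  # P(A | B^c), assumed

p_A = p_B * p_A_given_B + (1 - p_B) * p_A_given_Bc
print(p_A)  # ≈ 0.52, i.e., 0.3(0.8) + 0.7(0.4)
```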
We can generalize this Law to a situation in which the entire sample space of an experiment can be partitioned into $k$ mutually-exclusive events $B_1, B_2, \ldots, B_k$.
This means that $B_1 \cup B_2 \cup \cdots \cup B_k = \Omega$ and $B_i \cap B_j = \emptyset$ for each pair of distinct indices $i$ and $j$. In this case the tree diagram has $k$ branches on the left side rather than $2$ branches.
General form of the Law of Total Probability
Let $B_1, B_2, \ldots, B_k$ be a partition of the sample space for some experiment.
Then given any event $A$ for the same experiment, the general form of the Law of Total Probability is:
$$P(A) = \sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)$$
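Here is a minimal Python sketch of the general form; the partition probabilities and conditionals below are illustrative assumptions:

```python
# General Law of Total Probability over a partition of k events:
#   P(A) = sum over i of P(B_i) P(A | B_i)
p_partition = [0.2, 0.5, 0.3]   # P(B_1), P(B_2), P(B_3); must sum to 1 (assumed)
p_A_given = [0.10, 0.40, 0.70]  # P(A | B_i) for each part (assumed)

assert abs(sum(p_partition) - 1.0) < 1e-12  # sanity check: a valid partition

p_A = sum(pb * pa for pb, pa in zip(p_partition, p_A_given))
print(p_A)  # ≈ 0.43, i.e., 0.02 + 0.20 + 0.21
```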
We might have an experiment that can be thought of as having two stages. At the first stage either event $B$ occurs or event $B$ does not occur. At the second stage either event $A$ occurs or event $A$ does not occur.
There can be situations in which you know the outcome of the second stage, and based on that knowledge you would like to know the probability of either outcome of the first stage.
Posterior Probability
In other words, based on an observed effect, you might want to know the probability of a proposed cause. We call this a posterior probability, because it is assessed after the effect has been observed.
Suppose we want the posterior probability of event $B$, given that we have observed event $A$, i.e., we want:
$$P(B \mid A)$$
We know the formula:
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$
We can use the Law of Total Probability to replace $P(A)$ with $P(B)\,P(A \mid B) + P(B^c)\,P(A \mid B^c)$.
We can also replace the joint probability $P(A \cap B)$ with either $P(B \mid A)\,P(A)$ or $P(A \mid B)\,P(B)$.
But since we are trying to compute $P(B \mid A)$, we obviously don't already have that information, so we go with the second option.
Bayes' Rule
This gives the formula for the posterior probability of $B$ given $A$:
$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(B)\,P(A \mid B) + P(B^c)\,P(A \mid B^c)}$$
This formula is called Bayes’ Rule, named for Thomas Bayes, an English statistician, philosopher and Presbyterian minister from the 18th century.
Generalized Bayes' Rule
We can generalize Bayes’ Rule to the situation in which the entire sample space of an experiment can be partitioned into $k$ mutually-exclusive events $B_1, B_2, \ldots, B_k$.
Then the posterior probability of event $B_j$, given that event $A$ has occurred, is:
$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)}$$
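The generalized rule translates directly into a short function. In the sketch below, the helper `bayes` is hypothetical, written only to mirror the formula above:

```python
# Generalized Bayes' Rule over a partition of k events.
def bayes(p_partition, p_A_given, j):
    """Posterior P(B_j | A) = P(A|B_j) P(B_j) / sum_i P(B_i) P(A|B_i)."""
    p_A = sum(pb * pa for pb, pa in zip(p_partition, p_A_given))  # Law of Total Probability
    return p_A_given[j] * p_partition[j] / p_A

# Illustrative inputs (assumed): a three-part partition and its conditionals.
print(bayes([0.2, 0.5, 0.3], [0.10, 0.40, 0.70], j=2))  # 0.21 / 0.43 ≈ 0.488
```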
Suppose a small proportion of the passengers who take the ferry from one country to another are attempting to smuggle cocaine in their luggage. Dogs at the border are specially trained to detect cocaine in luggage.
If there is indeed cocaine in a bag, there is a high probability that the dogs will detect it. However, if there is no cocaine in a bag, there is a small probability that the dogs will mistakenly detect cocaine.
A randomly-selected passenger is checked and the dogs indicate that his luggage contains cocaine. What is the probability that the dogs are correct?
Let $C$ denote the event that the passenger's luggage contains cocaine. From the given information we know $P(C)$, the proportion of passengers who are smuggling cocaine. Thus $P(C^c) = 1 - P(C)$.
Let $D$ denote the event that the dogs detect cocaine in the luggage. From the given information we also know $P(D \mid C)$ and $P(D \mid C^c)$.
We want to know $P(C \mid D)$. Using Bayes’ Rule:
$$P(C \mid D) = \frac{P(D \mid C)\,P(C)}{P(C)\,P(D \mid C) + P(C^c)\,P(D \mid C^c)}$$
Surprisingly, because so few passengers are actually smuggling cocaine, there can be a substantial probability that the dogs are mistaken!
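To see how this can happen, here is a hedged numeric sketch in Python. All three inputs are illustrative assumptions (a rare smuggling rate, accurate dogs), not figures from the exercise:

```python
# Numeric sketch of the dog-detection example. All three inputs are
# illustrative assumptions: smuggling is rare and the dogs are accurate.
p_C = 0.005          # P(C): passenger is smuggling cocaine (assumed)
p_D_given_C = 0.95   # P(D | C): dogs alert when cocaine is present (assumed)
p_D_given_Cc = 0.02  # P(D | C^c): dogs alert when no cocaine is present (assumed)

# Law of Total Probability gives the denominator; then apply Bayes' Rule.
p_D = p_C * p_D_given_C + (1 - p_C) * p_D_given_Cc
p_C_given_D = p_D_given_C * p_C / p_D

print(round(p_C_given_D, 3))      # ≈ 0.193: the dogs are correct about 1 time in 5
print(round(1 - p_C_given_D, 3))  # ≈ 0.807: probability the dogs are mistaken
```

With these assumed inputs, roughly four out of five alerts are false alarms: the rarity of smuggling outweighs the dogs' accuracy.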