What is a Binomial Distribution?

D. Poupon

Last Modified Date: February 23, 2024

A binomial distribution with parameters (n, p) gives the discrete probability of having x successes out of n trials, where p is the probability of success on each trial, assuming the trials are independent and each one ends in either a success or a failure. The average number of successes out of n trials is the mean np, and the variance is np(1-p). The binomial belongs to a family of event-related distributions that includes the negative binomial and the Bernoulli distribution. Because binomial probabilities are calculated using the factorial function, which gets very large as the number of trials increases, a normal or Poisson approximation to the binomial distribution is typically used when n is large.
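The definition above can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name binomial_pmf and the values n = 10 and p = 0.3 are chosen here for illustration only. It computes the probability of each possible number of successes and confirms that the mean and variance match np and np(1-p).

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    # comb(n, x) is the binomial coefficient "n choose x".
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative parameters (assumed for this sketch, not taken from the article).
n, p = 10, 0.3

# One probability for each possible count of successes, 0 through n.
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)
mean = sum(x * q for x, q in enumerate(probs))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(probs))

print(f"total probability: {total:.6f}")
print(f"mean: {mean:.6f}, np = {n * p:.6f}")
print(f"variance: {variance:.6f}, np(1-p) = {n * p * (1 - p):.6f}")
```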

For example, a fair coin is flipped twice and a success is defined as getting heads. The number of trials is n = 2 and the probability of tossing a head is p = ½. The results can be summarized in a binomial distribution table: the probability of getting no heads, P(x = 0), is 25%; the probability of one head, P(x = 1), is 50%; and the probability of two heads, P(x = 2), is 25%. The expected number of heads tossed is np = 2 × ½ = 1. The variance is np(1-p) = 2 × ½ × ½ = ½.
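The coin example can be reproduced in a few lines. This sketch uses exact fractions so that the table entries come out as ¼, ½ and ¼ rather than rounded decimals; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

n = 2               # two flips
p = Fraction(1, 2)  # fair coin: probability of heads on each flip

# Binomial distribution table: number of heads -> exact probability.
table = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

for x, prob in table.items():
    print(f"P(x = {x}) = {prob} = {float(prob):.0%}")

# Mean and variance computed directly from the table.
mean = sum(x * prob for x, prob in table.items())
variance = sum((x - mean) ** 2 * prob for x, prob in table.items())
print("mean:", mean, "variance:", variance)
```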

Other distributions describe the probability of events and belong to the same family as the binomial. A Bernoulli distribution gives the probability of success of a single trial and is equivalent to a binomial with n = 1. The negative binomial distribution gives the probability of having x failures before a set number of successes occurs, whereas the regular binomial gives the probability of x successes in a fixed number of trials.
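The relationship between these distributions can be sketched as follows. The code assumes one common convention for the negative binomial, in which it counts the failures seen before the r-th success; the symbol r, the function names and the value p = 0.3 are introduced here for illustration and do not come from the article.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in a fixed number n of trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def negative_binomial_pmf(k: int, r: int, p: float) -> float:
    """Probability of exactly k failures before the r-th success.

    Assumed convention: trials continue until r successes have occurred.
    """
    return comb(k + r - 1, k) * p**r * (1 - p) ** k


p = 0.3  # illustrative success probability

# A Bernoulli distribution is a binomial with a single trial (n = 1):
# failure has probability 1 - p and success has probability p.
bernoulli = {x: binomial_pmf(x, 1, p) for x in (0, 1)}
print("Bernoulli:", bernoulli)

# The negative binomial has no upper limit on the number of failures,
# so its probabilities are summed far into the tail to check they reach 1.
r = 3
partial_sum = sum(negative_binomial_pmf(k, r, p) for k in range(500))
print("negative binomial, sum over k < 500:", partial_sum)
```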

Often the binomial distribution’s cumulative distribution function is used, which gives the probability of having x or fewer successes in n trials. Calculating this probability is simple for a small n, but becomes tedious as n gets large, because of the binomial coefficient. The binomial coefficient is read “n choose x”, and refers to the number of ways that x outcomes can be picked from n possibilities. It is calculated using the factorial function. Once n reaches about 70, n factorial exceeds 10^100 and can no longer be calculated on a standard calculator.
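The cumulative probability and the size of the factorials involved can be illustrated with a short sketch. It assumes a calculator that overflows at 10^100, which is a typical but not universal limit; the function name binomial_cdf and the example values are chosen here for illustration. Python integers have arbitrary precision, so the binomial coefficient itself stays exact.

```python
from math import comb, factorial


def binomial_cdf(x: int, n: int, p: float) -> float:
    """Probability of x or fewer successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


# 69! still fits below 10**100, but 70! does not.
overflows_at_70 = factorial(69) < 10**100 < factorial(70)
print("70! is the first factorial above 10**100:", overflows_at_70)

# "100 choose 50" computed exactly as an integer.
print("100 choose 50 =", comb(100, 50))

# Probability of 50 or fewer heads in 100 flips of a fair coin.
p_at_most_50 = binomial_cdf(50, 100, 0.5)
print("P(x <= 50) =", p_at_most_50)
```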

When n gets large, the binomial distribution may be approximated by either a discrete or a continuous distribution. If n is very large and p is very small, then the binomial distribution is closely approximated by a discrete Poisson distribution with mean np. If n is sufficiently large and p is not too close to 0 or 1 (a common rule of thumb is that np and n(1-p) are both at least 5), then the normal approximation to the binomial distribution may be used. The binomial mean and standard deviation become the normal distribution’s parameters, and a correction for continuity is applied when calculating the cumulative distribution function.
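Both approximations can be compared against the exact binomial value. The following is a rough sketch under stated assumptions: the parameter choices (n = 1000, p = 0.002 for the Poisson case and n = 100, p = 0.5 for the normal case) and all function names are picked here for illustration, and the continuity correction is applied by evaluating the normal curve at x + 0.5.

```python
from math import comb, erf, exp, sqrt


def binomial_cdf(x: int, n: int, p: float) -> float:
    """Exact probability of x or fewer successes in n trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for a Poisson distribution with mean lam."""
    term = exp(-lam)  # probability of zero events
    total = term
    for k in range(1, x + 1):
        term *= lam / k  # next term: lam**k * exp(-lam) / k!
        total += term
    return total


def normal_approx_cdf(x: int, n: int, p: float) -> float:
    """Normal approximation to P(X <= x) with a continuity correction."""
    mu = n * p                      # binomial mean
    sigma = sqrt(n * p * (1 - p))   # binomial standard deviation
    z = (x + 0.5 - mu) / sigma      # the +0.5 is the continuity correction
    return 0.5 * (1 + erf(z / sqrt(2)))


# Large n, small p: compare with a Poisson distribution of mean np = 2.
exact_rare = binomial_cdf(3, 1000, 0.002)
approx_rare = poisson_cdf(3, 1000 * 0.002)
print("Poisson case: exact", exact_rare, "approx", approx_rare)

# Large n, moderate p: compare with the normal approximation.
exact_mid = binomial_cdf(55, 100, 0.5)
approx_mid = normal_approx_cdf(55, 100, 0.5)
print("normal case: exact", exact_mid, "approx", approx_mid)
```

In both cases the approximate and exact values should agree closely, and neither approximation requires a large binomial coefficient.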