In this tutorial we present the key elements that define a probability distribution. We start with a broad and general definition: a probability distribution is a function that describes the probabilistic behavior of a random variable X, in such a way that it allows us to compute the probability of every possible (well-formed) event. In other words, a probability distribution gives us a clear and unequivocal mechanism to compute probabilities associated with a random variable X. That is what I want you to keep in mind for now.

### Notation

Now, let us talk a little bit about notation. Assume that X is a random variable and we are working with its distribution, and say that \(f\) is the distribution of X. Usually you will see it written as \({{f}_{X}}\), where the subscript X indicates __specifically__ that \(f\) is the distribution of X. It does not always happen like that, but when the distribution function has a subscript, the subscript refers to the random variable it corresponds to.

### Distinction Between Discrete and Continuous Random Variables

We need to be precise from now on in terms of the notation we use. The term "probability distribution" is something of an umbrella term that is used carelessly in many contexts, but we will try not to be too loose about it, so we don't get confused. So, let's keep this in mind: When a random variable X is a *continuous random variable*, we will use a __density function__ \({{f}_{X}}\) to calculate probabilities associated with it. On the other hand, when a random variable Y is a *discrete random variable*, we will use a __probability function__ \({{g}_{Y}}\) to calculate probabilities associated with it. Density functions and probability functions work in different ways, but the two are COMPLETELY analogous. I promise.

Remember, discrete random variables use *probability functions*, and continuous random variables use *density functions*. So for instance, Poisson and Binomial random variables are discrete, so each of them uses a probability function, whereas a normally distributed random variable is continuous, so it uses a density function.

### Properties that Need to be Met by ALL Probability and Density Functions

We promised that probability functions and densities work in different yet completely analogous ways. Now we will see why.

· __For densities__

Look at this: A density function \(f\) for a continuous random variable X will satisfy the two following conditions:

(1) \(f\left( x \right)\ge 0\) for all x in \(\mathbb{R}\).

(2) \(\int\limits_{-\infty }^{\infty }{f\left( x \right)dx} = 1\)

Let's not get too hung up about the above. Condition (1) is saying that a density function cannot be negative at any point. It takes either positive values or zero. Condition (2) is saying that the integral of a density function \(f\) over the whole real line must be 1. In layman's terms, the total area under the curve is 1.

· __Now for probability functions__

A probability function \(g\) for a discrete random variable X will satisfy the two following conditions:

(1) \(g\left( x \right)\ge 0\) for all \(x\in \left\{ {{a}_{1}},{{a}_{2}},....,{{a}_{n}},.... \right\}\).

(2) \(\sum\limits_{i=1}^{\infty }{g\left( {{a}_{i}} \right)} = 1\)

Notice that \(\left\{ {{a}_{1}},{{a}_{2}},....,{{a}_{n}},.... \right\}\) corresponds to all the possible values that can be taken by the random variable \(X\) (remember, we are assuming that \(X\) is a discrete variable). As you can see, (1) and (2) for probability functions look quite the same as (1) and (2) for density functions. In fact, in more advanced mathematics (Measure Theory), (1) and (2) can be seen as exactly the same conditions in both cases, but we won't touch that here. What I want you to keep in mind is that ALL probability functions and density functions satisfy those 2 conditions.

### EXAMPLE 1

Let X be a random variable that can take the values 1, 2, 3 and 4. Is

\[f\left( x \right)=\left\{ \begin{align} & \frac{1}{2}\text{ for }x=1, \\ & \frac{1}{4}\text{ for }x=2, \\ & \frac{1}{8}\text{ for }x=3, \\ & \frac{1}{8}\text{ for }x=4 \\ \end{align} \right.\]

a probability function for the random variable X?

### ANSWER:

We need to check whether conditions (1) and (2) are met. First of all, notice that \(f\left( x \right)\ge 0\) for all values in {1, 2, 3, 4}, which is the set of all possible values that X can take, since \(f\left( 1 \right)=\frac{1}{2}>0\), \(f\left( 2 \right)=\frac{1}{4}>0\), \(f\left( 3 \right)=\frac{1}{8}>0\) and \(f\left( 4 \right)=\frac{1}{8}>0\). Therefore, condition (1) is met. Now, let us see if condition (2) is met: We have that

\[\sum\limits_{i=1}^{4}{f\left( {{x}_{i}} \right)}=f\left( 1 \right)+f\left( 2 \right)+f\left( 3 \right)+f\left( 4 \right)=\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\frac{1}{8}=1\]

and therefore, condition (2) is met as well. So the final answer is, yes, \(f\left( x \right)\) is a probability function for the random variable \(X\).
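The two checks in Example 1 can also be run mechanically. The following is only a minimal sketch, assuming the probability function is stored as a Python dictionary that maps each possible value to its probability; the helper name `is_probability_function` is hypothetical and introduced just for this illustration. Exact fractions are used so that the test "sums to 1" is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_probability_function(pmf):
    """Return True if all values are >= 0 and they sum to exactly 1."""
    nonnegative = all(p >= 0 for p in pmf.values())  # condition (1)
    sums_to_one = sum(pmf.values()) == 1             # condition (2)
    return nonnegative and sums_to_one

# Probability function from Example 1 (values 1, 2, 3 and 4)
f = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

print(is_probability_function(f))  # prints True
```

If one of the values were dropped, say keeping only 1 and 2, the sum would be 3/4 and the same check would return `False`.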

### EXAMPLE 2

Consider the function \(f\left( x \right)={{x}^{2}}\) on [0, 2], and 0 elsewhere. Is \(f\left( x \right)\) a density function?

### ANSWER:

We need to check whether conditions (1) and (2) are met. First of all, notice that \(f\left( x \right)\ge 0\) for all \(x\), since \(f\left( x \right)={{x}^{2}}\ge 0\) on [0, 2], and \(f=0\) elsewhere. So the function does not take negative values, and hence condition (1) is met.

For condition (2), we compute:

\[\int\limits_{-\infty }^{\infty }{f\left( x \right)dx}=\int\limits_{0}^{2}{{{x}^{2}}dx}=\left. \frac{{{x}^{3}}}{3} \right|_{0}^{2}=\frac{{{2}^{3}}}{3}-\frac{{{0}^{3}}}{3}=\frac{8}{3}>1\]

Hence, condition (2) is not met, and therefore, \(f\left( x \right)\) is NOT a density function.
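The integral in Example 2 can be double-checked numerically. The sketch below is one possible way to do it, assuming a simple midpoint rule is accurate enough for this smooth function; the helper `integrate` and its default number of subintervals are choices made for this illustration, not part of the tutorial.

```python
def integrate(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width

def f(x):
    # Function from Example 2: x^2 on [0, 2], and 0 elsewhere
    return x ** 2 if 0 <= x <= 2 else 0.0

# f is 0 outside [0, 2], so integrating over [0, 2] gives the whole area
total_area = integrate(f, 0, 2)

print(round(total_area, 4))  # prints 2.6667, i.e. about 8/3
print(total_area > 1)        # prints True, so condition (2) fails
```

The numerical value agrees with the exact result 8/3, which confirms that the total area is larger than 1.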

### Finally, How to Compute Probabilities with Densities and Probability Functions?

This is the final step we were looking for. Why do we deal with probability and density functions, anyway? Well, there is a good reason: they give us a clear, unequivocal procedure to compute probabilities. In other words, once you know the corresponding density (or probability function) of a random variable, you know ALL about that random variable. It gives you the POWER.

Nice, but how do you do it??? Simple. As usual, let us see the two cases, for continuous random variables (using densities) and for discrete random variables (using probability functions).

__Computing Probabilities for Continuous Random Variables__

Let X be a continuous random variable. A typical probability event is written as \(X\in D\), where \(D\subseteq \mathbb{R}\). For example, an event of interest could be that "X is less than or equal to 5 but greater than or equal to 1". That is the same as saying that \(X\in \left[ 1,5 \right]\), so in that case we would have \(D=\left[ 1,5 \right]\). In other words, probability events are represented by sets (typically intervals, but not always).

The probability that the event \(X\in D\) occurs is

\[\Pr \left( X\in D \right)=\int\limits_{D}^{{}}{f\left( x \right)dx}\]

For example, if \(D=\left[ 1,5 \right]\), we have

\[\Pr \left( X\in \left[ 1,5 \right] \right)=\Pr \left( 1\le X\le 5 \right)=\int\limits_{1}^{5}{f\left( x \right)dx}\]

So, it is SUPER SIMPLE. We integrate the density function over a range determined by the event we want to compute the probability for.
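To make the integral concrete, here is a small numerical sketch. The density used, \(3{{x}^{2}}/8\) on [0, 2] and 0 elsewhere, is a hypothetical choice made only for this illustration (it is non-negative and integrates to 1), and the helper `probability` is likewise an assumption that relies on a plain midpoint rule.

```python
def density(x):
    # Hypothetical density (assumed for illustration): 3x^2/8 on [0, 2], 0 elsewhere
    return 3 * x ** 2 / 8 if 0 <= x <= 2 else 0.0

def probability(f, a, b, n=200_000):
    """Approximate Pr(a <= X <= b) by a midpoint-rule integral of the density f."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

# Event D = [1, 5]: integrate the density from 1 to 5
p = probability(density, 1, 5)

print(round(p, 3))  # prints 0.875
```

Since this density is 0 beyond 2, only the piece of \(D\) between 1 and 2 contributes, and the exact value is \(\left( {{2}^{3}}-{{1}^{3}} \right)/8=7/8\).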

__Computing Probabilities for Discrete Random Variables__

Let X be a discrete random variable. In this case, a probability event is also expressed as a set of values, only that now an event is a subset of \(\left\{ {{a}_{1}},{{a}_{2}},....,{{a}_{n}},.... \right\}\), the set of all possible values that can be taken by \(X\). So let \(D\subseteq \left\{ {{a}_{1}},{{a}_{2}},....,{{a}_{n}},.... \right\}\), say \(D=\left\{ {{b}_{1}},{{b}_{2}},...,{{b}_{k}} \right\}\), and let \(f\) be the probability function of \(X\). The probability that the event \(X\in D\) occurs is

\[\Pr \left( X\in D \right)=\Pr \left( X\in \left\{ {{b}_{1}},{{b}_{2}},...,{{b}_{k}} \right\} \right)=\sum\limits_{j=1}^{k}{f\left( {{b}_{j}} \right)}\]

For example, assume that X is binomial with parameters \(N = 10\) and \(p = 0.5\). Then, if I want to compute the probability that X is 1 or 2, I need to compute

\[\Pr \left( X\in \left\{ 1,2 \right\} \right)=f\left( 1 \right)+f\left( 2 \right)\]

where \(f\) is the corresponding probability function for a Binomial distribution with parameters \(N = 10\) and \(p = 0.5\). So, it is SUPER SIMPLE TOO. We sum the values of the probability function evaluated at the points in the event we are computing the probability for.
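The binomial example can be evaluated directly. The sketch below assumes the standard binomial probability function \(\binom{N}{k}{{p}^{k}}{{\left( 1-p \right)}^{N-k}}\), which the tutorial refers to but does not write out; the helper name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability function of a Binomial(n, p) random variable evaluated at k."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Event D = {1, 2} for X binomial with N = 10 and p = 0.5
event = {1, 2}
prob = sum(binomial_pmf(k, 10, 0.5) for k in event)

print(prob)  # prints 0.0537109375, which is 55/1024
```

Here \(f\left( 1 \right)=10/1024\) and \(f\left( 2 \right)=45/1024\), so the sum is \(55/1024\).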

**This tutorial is brought to you courtesy of MyGeekyTutor.com**
