Basic Probability: Expectation and Variance

Introduction to the basic ideas of expectation and variance in probability

DZ
4 min read · Jul 18, 2023

Often, when looking at a dataset, it is very useful to extract a few key parameters that can give us a quick insight into our data. Two of those parameters are the expectation (mean) and the variance.

In this article, we focus on discrete random variables, but the main ideas hold for continuous random variables as well.

The Expectation

The expectation is the mean of a random variable. Let’s start by defining it mathematically:

    E[X] = Σ_{x∈Ω} x · P(x)

where X is a discrete random variable, Ω is the sample space, x is a state the random variable X can take, and P(x) is the probability of the outcome x. We can think about this formula as a weighted mean with the probability values as the weights.

We can show that this formula is actually related to the mean by considering the sample mean:

    x̄ = (1/N) Σ_{i=1}^{N} xᵢ

For simplicity we assume that xᵢ can take only M distinct values, so we can rewrite the formula above as

    x̄ = Σ_{j=1}^{M} (Nⱼ/N) · xⱼ

where the expression in the fraction, Nⱼ/N (the number of samples with value xⱼ divided by the total number of samples N), is the fraction of each of the M values in the dataset, which can be seen as the probability of getting each value. That’s exactly the definition of the expectation as in the first equation.

Even so, there is a fundamental difference between the expectation value and the sample mean. The sample mean is the empirical mean we get from the data we have. The expectation value, however, is the theoretical mean we would get from the ideal sample (or the “real” data distribution).
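To make the distinction concrete, here is a minimal sketch in Python (assuming a fair six-sided die, simulated with the standard `random` module): the expectation is fixed at 3.5 by the distribution itself, while the sample mean fluctuates around it and approaches it as the sample grows.

```python
import random

# Theoretical expectation of a fair six-sided die:
# E[X] = sum of x * (1/6) for x in {1, ..., 6} = 3.5
expectation = sum(x * (1 / 6) for x in range(1, 7))

# Empirical sample mean from simulated rolls; it approaches
# the expectation as the number of rolls grows.
random.seed(42)  # arbitrary seed, for reproducibility
rolls = [random.randint(1, 6) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)

print(expectation)   # 3.5
print(sample_mean)   # close to 3.5, but not exactly equal
```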

Properties of the Expectation

  1. Expectation of a function of a random variable:

    E[g(X)] = Σ_{x∈Ω} g(x) · P(x)

Hand-waving proof:
Applying the function g(·) to x changes the value, but it does not change the probability of getting that value: the probability of getting g(x) is the same as the probability of getting x.

2. Linearity:

    E[X + Y] = E[X] + E[Y]

proof:

    E[X + Y] = Σₓ Σ_y (x + y) P(x, y)
             = Σₓ Σ_y x P(x, y) + Σₓ Σ_y y P(x, y)
             = Σₓ x P(x) + Σ_y y P(y)
             = E[X] + E[Y]

where we used the fact that summing the joint probability P(x, y) over all values of one variable gives the marginal probability of the other.

3. Scale by constant:

    E[aX] = a · E[X]

proof:

    E[aX] = Σ_{x∈Ω} a x P(x) = a Σ_{x∈Ω} x P(x) = a · E[X]

4. DC shift:

    E[X + b] = E[X] + b

proof:

    E[X + b] = Σ_{x∈Ω} (x + b) P(x) = Σ_{x∈Ω} x P(x) + b Σ_{x∈Ω} P(x) = E[X] + b

where in the last step we used the fact that the sum of the probabilities over all the states is 1.
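These properties are easy to check numerically. The sketch below assumes a small made-up discrete distribution (states 0, 1, 2 with probabilities 0.2, 0.5, 0.3) and arbitrary constants a and b:

```python
# A small, made-up discrete distribution: states and their probabilities.
states = [0, 1, 2]
probs = [0.2, 0.5, 0.3]

def expectation(g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(x) over all states (property 1)."""
    return sum(g(x) * p for x, p in zip(states, probs))

a, b = 3.0, 7.0
E_X = expectation()  # 0*0.2 + 1*0.5 + 2*0.3 = 1.1

# Scale by constant: E[aX] = a * E[X]
print(expectation(lambda x: a * x), a * E_X)

# DC shift: E[X + b] = E[X] + b
print(expectation(lambda x: x + b), E_X + b)
```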

The Variance

The variance of a random variable X is given by:

    Var(X) = E[(X − μ)²]

where μ = E[X]. The variance measures the deviation of a random variable relative to its expectation value (the mean). In other words, it measures the average squared distance between the random variable value and its mean.

The variance can also be written as:

    Var(X) = E[X²] − μ²

proof:

    Var(X) = E[(X − μ)²]
           = E[X² − 2μX + μ²]
           = E[X²] − 2μ E[X] + μ²
           = E[X²] − 2μ² + μ²
           = E[X²] − μ²

where we used the linearity of the expectation and the fact that μ = E[X] is a constant.

The square root of the variance is called the standard deviation and is written as σ. Unlike the variance σ², σ has the same units as the expectation value.
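Both formulas for the variance can be checked against each other numerically. This sketch assumes a small made-up discrete distribution (states 0, 1, 2 with probabilities 0.2, 0.5, 0.3):

```python
# A small, made-up discrete distribution.
states = [0, 1, 2]
probs = [0.2, 0.5, 0.3]

mu = sum(x * p for x, p in zip(states, probs))                    # E[X]
var_def = sum((x - mu) ** 2 * p for x, p in zip(states, probs))   # E[(X - mu)^2]
E_X2 = sum(x ** 2 * p for x, p in zip(states, probs))             # E[X^2]
var_alt = E_X2 - mu ** 2                                          # E[X^2] - mu^2

sigma = var_def ** 0.5  # standard deviation, same units as the mean

print(var_def, var_alt)  # both give the same value
```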

Properties of the Variance

  1. Scale by constant:

    Var(aX) = a² · Var(X)

proof:

    Var(aX) = E[(aX − aμ)²] = E[a² (X − μ)²] = a² E[(X − μ)²] = a² · Var(X)

2. DC shift:

    Var(X + b) = Var(X)

proof:

    Var(X + b) = E[(X + b − (μ + b))²] = E[(X − μ)²] = Var(X)

Shifting a random variable by a constant shifts its mean by the same constant, so the deviations from the mean are unchanged.

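Both variance properties can be verified directly from the definition. The sketch below assumes a small made-up discrete distribution (states 0, 1, 2 with probabilities 0.2, 0.5, 0.3) and arbitrary constants a and b:

```python
# A small, made-up discrete distribution.
states = [0, 1, 2]
probs = [0.2, 0.5, 0.3]

def var(g=lambda x: x):
    """Var(g(X)) computed directly from the definition E[(g(X) - mu)^2]."""
    mu = sum(g(x) * p for x, p in zip(states, probs))
    return sum((g(x) - mu) ** 2 * p for x, p in zip(states, probs))

a, b = 3.0, 7.0

# Scale by constant: Var(aX) = a^2 * Var(X)
print(var(lambda x: a * x), a ** 2 * var())

# DC shift: Var(X + b) = Var(X)
print(var(lambda x: x + b), var())
```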