
Mean, median, and mode

The most important concept

mean

The mean (average) is the sum of all the values divided by the number of samples.

Calculating the mean in Python is very simple.

We only need NumPy's built-in np.mean function.

data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]

print('data = ', end='')
print(data)
print()

import numpy as np

print('mean = ', end='')
print(np.mean(data))

The results of this program are as follows:

data = [0, 0, 1, 2, 2, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 8, 8, 9, 9, 9, 10]

mean = 4.821428571428571
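
To make the definition concrete, here is a minimal sketch that computes the same mean without NumPy, using only the sum of the values and the number of samples. The helper name manual_mean is hypothetical and exists only for this illustration.

```python
# Mean computed by hand: sum of the values divided by the number of samples.
data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]


def manual_mean(values):
    """Hypothetical helper: arithmetic mean of a non-empty sequence."""
    if not values:
        raise ValueError('the mean of an empty sequence is undefined')
    return sum(values) / len(values)


total = sum(data)   # sum of all the values
count = len(data)   # number of samples

print('sum = ', end='')
print(total)
print('count = ', end='')
print(count)
print('mean = ', end='')
print(manual_mean(data))
```

For this data the sum is 135 and there are 28 samples, so the result is 135 / 28, which agrees with the value NumPy reports.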

median

The median is the middle value of the sorted data. When the number of samples is even, it is the average of the two middle values.

We will reuse the previous data and this time find the median.

We only need NumPy's built-in np.median function.

data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]

print('data = ', end='')
print(data)
print()

import numpy as np

print('mean = ', end='')
print(np.mean(data))

print('median = ', end='')
print(np.median(data))

The results of this program are as follows:

data = [0, 0, 1, 2, 2, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 8, 8, 9, 9, 9, 10]

mean = 4.821428571428571
median = 5.0
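
The rule behind the median can also be written out by hand. The following is a sketch only, assuming a non-empty list of numbers. The helper name manual_median is hypothetical and is not part of NumPy.

```python
# Median computed by hand: sort the data, then take the middle value,
# or the average of the two middle values when the count is even.
data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]


def manual_median(values):
    """Hypothetical helper: median of a non-empty sequence of numbers."""
    if not values:
        raise ValueError('the median of an empty sequence is undefined')
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two values around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


print('median = ', end='')
print(manual_median(data))
```

The data has 28 samples, so the median is the average of the 14th and 15th sorted values. Both are 5, which gives 5.0.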

mode

The mode is the value that occurs most often in a set of data.

In this data set, the most frequent value is 5, which occurs 10 times.

Finding the mode is also very simple. We only need SciPy's built-in stats.mode function.

data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]

print('data = ', end='')
print(data)
print()

import numpy as np

print('mean = ', end='')
print(np.mean(data))

print('median = ', end='')
print(np.median(data))

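from scipy import stats

print('mode = ', end='')
# Older SciPy releases return arrays from stats.mode, while newer ones
# return scalars for 1-D input, so stats.mode(data)[0][0] can fail.
# np.atleast_1d(...)[0] works with both behaviors.
print(np.atleast_1d(stats.mode(data).mode)[0])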

The result of the operation is:

data = [0, 0, 1, 2, 2, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 8, 8, 9, 9, 9, 10]

mean = 4.821428571428571
median = 5.0
mode = 5
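
Counting occurrences directly shows where the mode comes from. The sketch below uses collections.Counter from the Python standard library instead of SciPy. The helper name manual_modes is hypothetical. Unlike a single-value result, it returns every value tied for the highest count.

```python
from collections import Counter

# Mode computed by counting how often each value occurs.
data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]


def manual_modes(values):
    """Hypothetical helper: sorted list of all values with the highest count."""
    if not values:
        raise ValueError('the mode of an empty sequence is undefined')
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


counts = Counter(data)
# most_common(1) returns a list with one (value, count) pair.
value, frequency = counts.most_common(1)[0]

print('mode = ', end='')
print(value)
print('count = ', end='')
print(frequency)
print('all modes = ', end='')
print(manual_modes(data))
```

Here the value 5 occurs 10 times, more than any other value, so it is the only mode. For data with ties, such as [1, 1, 2, 2, 3], manual_modes returns both 1 and 2.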

Summary

The mean, median, and mode are among the most commonly used measures in data analysis. They are very simple but also very effective. All data mining engineers should know and master these three measures.

Statistics

Start time of this page: December 20, 2021

Completion time of this page: December 20, 2021
