Python complete tutorial

Mean median mode

The three most important concepts in basic statistics

mean

The mean (average) is the sum of all the values divided by the number of samples.

Calculating the mean in Python is very simple.

We only need NumPy's built-in function np.mean.

data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]

print('data = ', end='')
print(data)
print()

import numpy as np

print('mean = ', end='')
print(np.mean(data))

The results of this program are as follows:

data = [0, 0, 1, 2, 2, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 8, 8, 9, 9, 9, 10]

mean = 4.821428571428571
>>> 
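To see what np.mean does under the hood, here is a minimal sketch that follows the definition directly: add all the values together and divide by the number of samples. The helper name manual_mean is our own and is not part of NumPy.

```python
# Same data as in the example above.
data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
]


def manual_mean(values):
    """Add all the values together and divide by the number of samples.

    manual_mean is a hypothetical helper written for this illustration.
    """
    if not values:
        raise ValueError('the mean of an empty list is undefined')
    return sum(values) / len(values)


print('sum =', sum(data))      # sum = 135
print('count =', len(data))    # count = 28
print('mean =', manual_mean(data))
```

The printed mean should agree with the value that np.mean reported above.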

median

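The median is the value in the middle of the sorted data. When the number of values is even, as it is here (28 values), the median is the average of the two middle values.

We use the same data as before, and this time we find the median.

Again, we only need NumPy's built-in function np.median.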

data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]

print('data = ', end='')
print(data)
print()

import numpy as np

print('mean = ', end='')
print(np.mean(data))

print('median = ', end='')
print(np.median(data))

The results of this program are as follows:

data = [0, 0, 1, 2, 2, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 8, 8, 9, 9, 9, 10]

mean = 4.821428571428571
median = 5.0
>>> 
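The definition of the median can also be written out by hand. The following is a minimal sketch, assuming a plain Python list as input; the helper name manual_median is our own and is not part of NumPy.

```python
# Same data as in the example above.
data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
]


def manual_median(values):
    """Return the middle value of the sorted data.

    For an even number of values, return the average of the two
    middle values. manual_median is a hypothetical helper.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError('the median of an empty list is undefined')
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print('median =', manual_median(data))          # median = 5.0
print('median =', manual_median([1, 3, 2]))     # median = 2
print('median =', manual_median([1, 2, 3, 4]))  # median = 2.5
```

Our data has 28 values, and the 14th and 15th sorted values are both 5, so the result is 5.0, the same as np.median.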

mode

The mode is the value that occurs most often in a set of data.

In this data set the most frequent value is 5, which appears 10 times.

Finding the mode is also very simple. We only need SciPy's built-in function stats.mode.

data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
    ]

print('data = ', end='')
print(data)
print()

import numpy as np

print('mean = ', end='')
print(np.mean(data))

print('median = ', end='')
print(np.median(data))

from scipy import stats

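# stats.mode returns a result with .mode and .count fields.
# Recent SciPy releases return a scalar .mode for 1-D input, so the
# older [0][0] indexing fails there; np.atleast_1d works either way.
print('mode = ', end='')
print(np.atleast_1d(stats.mode(data).mode)[0])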

The result of the operation is:

data = [0, 0, 1, 2, 2, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 7, 8, 8, 9, 9, 9, 10]

mean = 4.821428571428571
median = 5.0
mode = 5
>>> 
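If SciPy is not installed, the mode can be counted with the standard library alone. This is a minimal sketch using collections.Counter; the helper name all_modes is our own, and it returns every value tied for the highest count, which matters when a data set has more than one mode.

```python
from collections import Counter

# Same data as in the example above.
data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
]

# Count how many times each value occurs.
counts = Counter(data)

# most_common(1) returns a list holding one (value, count) pair.
value, frequency = counts.most_common(1)[0]
print('mode =', value)       # mode = 5
print('count =', frequency)  # count = 10


def all_modes(values):
    """Return every value that shares the highest count, in sorted order.

    all_modes is a hypothetical helper written for this illustration.
    """
    tally = Counter(values)
    if not tally:
        return []
    highest = max(tally.values())
    return sorted(v for v, c in tally.items() if c == highest)


print(all_modes(data))             # [5]
print(all_modes([1, 1, 2, 2, 3]))  # [1, 2]
```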

Summary

The mean, median, and mode are among the most commonly used measures in data analysis. They are very simple but also very effective. All data mining engineers should master these three methods.
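As a recap, all three measures can also be computed with Python's built-in statistics module, without NumPy or SciPy. The sketch below assumes Python 3.8 or later (for statistics.multimode); the function name summarize is our own.

```python
import statistics

# Same data as in the examples above.
data = [
    0, 0, 1, 2, 2, 2, 2, 3, 3, 3,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    7, 7, 8, 8, 9, 9, 9, 10
]


def summarize(values):
    """Return the mean, median, and list of modes of the values.

    summarize is a hypothetical helper written for this illustration.
    """
    return {
        'mean': statistics.mean(values),
        'median': statistics.median(values),
        # multimode returns a list because several values can tie.
        'modes': statistics.multimode(values),
    }


for name, result in summarize(data).items():
    print(name, '=', result)
```

For our data this should report the same mean, median, and mode as the NumPy and SciPy versions above.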

Statistics

Start time of this page: December 20, 2021

Completion time of this page: December 20, 2021

