Page cover image

Data structure with hash table

Dictionaries and sets

I will explain the hash table in detail in the database section later. At present, we only need to know that the hash table can be quickly searched.

Let me give an example. The time it takes to search 10,000 times in the list (data structure that does not contain the hash table) and set (the data structure that contains the hash table):

a = list(range(100000))
b = set(range(100000))

import time

time_start = time.time()

for _ in range(10000):
    99999 in a

time_end = time.time()
print('list time cost = ', time_end - time_start, 's')



time_start = time.time()

for _ in range(10000):
    99999 in b

time_end = time.time()
print('set time cost = ', time_end - time_start, 's')

The result of running this program is:

list time cost =  8.223406553268433 s
set time cost =  0.0 s
>>> 

You will find that it takes a long time to search when the data structure does not contain a hash table, and when the data structure contains a hash table, the search can be completed in almost no time.

This difference becomes more obvious when the amount of data becomes larger.

So when we perform large-scale data search, we should use data structure contains hash tables.

set

The meaning of set is derived from mathematics and refers to an unordered list without repetition.

We can create a new collection in the following way.

>>> a = set()
>>> a
set()
>>> 

We can also convert a list into a collection.

>>> a = [1, 2, 2, 2, 3, 4, 4, 5, 5, 5]
>>> a
[1, 2, 2, 2, 3, 4, 4, 5, 5, 5]

>>> a = set(a)
>>> a
{1, 2, 3, 4, 5}
>>> 

You will find that the repeated elements have been deleted from the set.

The set uses a hash table index, so the set is unordered, and we cannot index the set:

>>> test_set = {1,2,3,4,5}

>>> test_set[3]
Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
    test_set[3]
TypeError: 'set' object is not subscriptable
>>> 

Set is not used much in python, it is mainly used to remove redundant elements in the list:

>>> a = [1, 2, 2, 2, 3, 4, 4, 5, 5, 5]
>>> a = list(set(a))
>>> a
[1, 2, 3, 4, 5]
>>> 

dictionary

Dictionaries are used a lot in python. Because the hash table index is used, the dictionary can be looked up quickly.

>>> d = {'name':'tom', 'age':24, 'gender':'male'}
>>> d
{'name': 'tom', 'age': 24, 'gender': 'male'}
>>> 

As shown in the code block above, we have created a dictionary.

>>> d = {'name':'tom', 'name':'alice', 'age':24, 'gender':'male'}
>>> d
{'name': 'alice', 'age': 24, 'gender': 'male'}
>>> 

The keys of the dictionary must be hashable. Any non-hashable data structure mentioned before cannot be used as the key of the dictionary:

>>> d = {[1,2,3]:'test'}
Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
    d = {[1,2,3]:'test'}
TypeError: unhashable type: 'list'
>>> 
>>> d = {set([1,2,3]):'test'}
Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
    d = {set([1,2,3]):'test'}
TypeError: unhashable type: 'set'
>>> 
>>> d = { {'name':'tom'}:'test' }
Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
    d = { {'name':'tom'}:'test' }
TypeError: unhashable type: 'dict'
>>> 

The hashable tuple can be used as a dictionary key:

>>> d = { tuple([1,2,3]):'test' }
>>> d
{(1, 2, 3): 'test'}
>>> 

Modify dictionary

Modifying the dictionary is very simple, we only need to overwrite the contents of the corresponding key of the dictionary:

>>> d = {'name':'tom', 'age':24, 'gender':'male'}
>>> d
{'name': 'tom', 'age': 24, 'gender': 'male'}

>>> d['name'] = 'ali'
>>> d
{'name': 'ali', 'age': 24, 'gender': 'male'}
>>> 

Create empty dictionary

>>> a = dict()
>>> a
{}
>>> type(a)
<class 'dict'>
>>> 

Creating an empty dictionary is very simple, just use dict() to create it.

Query dictionary

>>> d
{'name': 'ali', 'age': 24, 'gender': 'male'}

>>> d['name']
'ali'
>>> 

You only need to enter the key to query the dictionary. The above is the query method.

Delete a key of the dictionary

>>> d
{'name': 'ali', 'age': 24, 'gender': 'male'}

>>> del(d['name'])
>>> d
{'age': 24, 'gender': 'male'}
>>> 

The above is how to delete a key in the dictionary.

Delete the entire dictionary

>>> d
{'age': 24, 'gender': 'male'}

>>> del(d)

>>> d
Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
    d
NameError: name 'd' is not defined
>>> 

The above code will delete a dictionary. If you have deleted this dictionary, you will get an error when you call this dictionary again.

Statistics

Start time of this page: December 19, 2021

Completion time of this page: January 6, 2022

Last updated