'Be wary of time complexity pitfalls\'
Be wary of time complexity pitfalls
Write Here
A bilibili vedio also shows this : [Bilibili Video][https://www.bilibili.com/video/BV16u4m1c7cU/?spm_id_from=333.999.0.0] and I think it's a good vedio, but its language is Chinese.
time complexity
- What does time complexity mean?
Time complexity is a measure of how much time an algorithm takes to run as a function of the size of its input. It is a way to describe the efficiency of an algorithm, and it is used to compare different algorithms and determine which one is the most efficient.
How to calculate time complexity?
To calculate time complexity, we need to consider the number of operations performed by the algorithm as a function of the size of the input. The most common way to measure the number of operations is by counting the number of times a particular operation is performed.
What are some common pitfalls when calculating time complexity?
- Not considering the input size: If we only consider the number of operations performed by the algorithm, we may underestimate the time complexity. For example, if we count the number of times a loop runs, but we don't take into account the size of the input, then the time complexity may be too high.
- Not considering the algorithm's efficiency: Some algorithms may perform many operations even when the input size is small, which can lead to high time complexity. For example, sorting algorithms like bubble sort and selection sort have quadratic time complexity, even though they may only need to swap two elements in a small array.
- Not considering the algorithm's worst-case scenario: Some algorithms have a best-case scenario where they perform fewer operations than the worst-case scenario. For example, searching algorithms like binary search have a best-case scenario where they find the target element in logarithmic time, but they have a worst-case scenario where they need to examine all elements in the array.
Let's see some examples of time complexity
Here is a question:
Find the max 10 integers in a list.
import random ls = [random.randint(1, 100) for _ in range(n)]
Here is the test function:
import time def benchmark(func, *args) -> float: start = time.perf_counter() func(*args) end = time.perf_counter() return end - start
Solution1: Use heap
Here is the solution which uses the heapq module:
def find_max_n(ls, n): import heapq return heapq.nlargest(n, ls)
Or we write it in the python way:
def find_largest_n(nums, n): if n <= 0: return [] max_heap = [] for num in nums: if len(max_heap) < n: max_heap.append(num) # call sift_down for i in range((len(max_heap) - 2) // 2, -1, -1): _sift_down(max_heap, i) elif num > max_heap[0]: max_heap[0] = num _sift_down(max_heap, 0) return max_heap def _sift_down(heap, index): left = 2 * index + 1 right = 2 * index + 2 largest = index if left < len(heap) and heap[left] > heap[largest]: largest = left if right < len(heap) and heap[right] > heap[largest]: largest = right if largest != index: heap[index], heap[largest] = heap[largest], heap[index] _sift_down(heap, largest)
Solution2: Use sort
Here is the solution which uses the sort function:
def find_max_n(ls, n): return sorted(ls, reverse=True)[:n]
Alomost everyone know that The time complexity of the heap, is O(n log k), and the time complexity of the sort function is O(n log n).
It seems that the First solution is better than the second one. But it is not true in python.
Look at the final code:
import time def benchmark(func, *args) -> float: start = time.perf_counter() func(*args) end = time.perf_counter() return end - start def find_max_n_heapq(ls, n): import heapq return heapq.nlargest(n, ls) def find_max_n_python_heap(nums, n): if n <= 0: return [] max_heap = [] for num in nums: if len(max_heap) < n: max_heap.append(num) # call sift_down for i in range((len(max_heap) - 2) // 2, -1, -1): _sift_down(max_heap, i) elif num > max_heap[0]: max_heap[0] = num _sift_down(max_heap, 0) return max_heap def _sift_down(heap, index): left = 2 * index + 1 right = 2 * index + 2 largest = index if left < len(heap) and heap[left] > heap[largest]: largest = left if right < len(heap) and heap[right] > heap[largest]: largest = right if largest != index: heap[index], heap[largest] = heap[largest], heap[index] _sift_down(heap, largest) def find_max_n_sorted(ls, n): return sorted(ls, reverse=True)[:n] # test import random for n in range(10, 10000, 100): ls = [random.randint(1, 100) for _ in range(n)] print(f"n = {n}") print(f"Use Heapq: {benchmark(find_max_n_heapq, ls, n)}") print(f"Python Heapq: {benchmark(find_max_n_python_heap, ls, n)}") print(f"Sorted : {benchmark(find_max_n_sorted, ls, n)}")
I run it in Python3.11 VScode, Here is the result:
n is not big
Use Heapq: 0.002430099993944168
Python Heapq: 0.06343129999004304
Sorted : 9.280000813305378e-05
n = 910
Use Heapq: 9.220000356435776e-05
Python Heapq: 0.07715500006452203
Sorted : 9.360001422464848e-05
n = 920
Use Heapq: 9.440002031624317e-05
Python Heapq: 0.06573969998862594
Sorted : 0.00012450001668184996
n = 930
Use Heapq: 9.689992293715477e-05
Python Heapq: 0.06760239996947348
Sorted : 9.66999214142561e-05
n = 940
Use Heapq: 9.600003249943256e-05
Python Heapq: 0.07372559991199523
Sorted : 9.680003859102726e-05
n = 950
Use Heapq: 9.770004544407129e-05
Python Heapq: 0.07306570000946522
Sorted : 0.00011979998089373112
n = 960
Use Heapq: 9.980006143450737e-05
Python Heapq: 0.0771204000338912
Sorted : 0.00022829999215900898
n = 970
Use Heapq: 0.0001601999392732978
Python Heapq: 0.07493270002305508
Sorted : 0.00010840001050382853
n = 980
Use Heapq: 9.949994273483753e-05
Python Heapq: 0.07698320003692061
Sorted : 0.00010300008580088615
n = 990
Use Heapq: 9.979994501918554e-05
Python Heapq: 0.0848745999392122
Sorted : 0.00012620002962648869
n이 크면?
n = 10000
Heapq 사용: 0.003642000025138259
파이썬 힙q: 9.698883199947886
정렬 : 0.00107999995816499
n = 11000
Heapq 사용: 0.0014836000045761466
파이썬 힙q: 10.537632800056599
정렬 : 0.0012236000038683414
n = 12000
Heapq 사용: 0.001384599949233234
파이썬 힙q: 12.328411899972707
정렬 : 0.0013226999435573816
n = 13000
Heapq 사용: 0.0020017001079395413
파이썬 힙q: 15.637207800056785
정렬 : 0.0015075999544933438
n = 14000
Heapq 사용: 0.0017026999266818166
파이썬 힙q: 17.298848500009626
정렬 : 0.0016967999981716275
n = 15000
Heapq 사용: 0.0017773000290617347
파이썬 힙q: 20.780625900020823
정렬 : 0.0017105999868363142
내가 찾은 내용 및 개선 방법
n이 큰 경우 Sorted는 약간의 시간이 소요되지만(때로는 heapq를 사용하는 것보다 더 나음) Python Heapq은 많은 시간이 소요됩니다.
- Sorted는 시간이 조금 걸리지만 Python Heapq은 시간이 많이 걸리는 이유는 무엇입니까?
- sorted()는 Python에 내장된 함수이기 때문에 이에 대한 Python 공식 문서를 찾을 수 있습니다.
Bulit-in 기능은 컴파일 언어인 C로 작성되었기 때문에 heapq보다 빠릅니다.
- 어떻게 개선할 수 있나요?
- heapq.sort() 대신 내장 함수 sorted()를 사용하여 코드 성능을 향상시킬 수 있습니다. sorted() 함수는 Python에 내장된 함수로 C로 구현되므로 heapq.sort()보다 훨씬 빠릅니다
경련
대량 데이터를 처리할 때는 코드 성능을 향상시키기 위해 heapq.sort() 대신 내장 함수를 사용해야 합니다. 대용량 데이터를 다룰 때 시간 복잡도 문제에 주의해야 합니다. 때로는 시간 복잡성의 함정을 피할 수 없지만 가능한 한 이를 피하도록 노력해야 합니다.
나에 대해서
안녕하세요 멍친위안 입니다. 나는 학생입니다. 나는 새로운 것을 배우는 것을 좋아합니다.
내 github을 볼 수 있습니다: [MengQinYuan의 Github][https://github.com/mengqinyuan]
The above is the detailed content of 'Be wary of time complexity pitfalls\'. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

Solution to permission issues when viewing Python version in Linux terminal When you try to view Python version in Linux terminal, enter python...

How to avoid being detected when using FiddlerEverywhere for man-in-the-middle readings When you use FiddlerEverywhere...

When using Python's pandas library, how to copy whole columns between two DataFrames with different structures is a common problem. Suppose we have two Dats...

How to teach computer novice programming basics within 10 hours? If you only have 10 hours to teach computer novice some programming knowledge, what would you choose to teach...

How does Uvicorn continuously listen for HTTP requests? Uvicorn is a lightweight web server based on ASGI. One of its core functions is to listen for HTTP requests and proceed...

Fastapi ...

Using python in Linux terminal...

Understanding the anti-crawling strategy of Investing.com Many people often try to crawl news data from Investing.com (https://cn.investing.com/news/latest-news)...
