Algorithm Sharing 01: Sieve of Eratosthenes (Sieve Method) [Simple]
Algorithm Application Scenarios
The Sieve of Eratosthenes efficiently solves problems related to recording prime numbers (including their quantities).
Core Concept
Eliminate composite numbers, keep prime numbers:
If x is a prime number, then multiples of x must not be prime numbers. These numbers can be directly excluded without needing subsequent one-by-one traversal, thereby reducing efficiency losses.
Since we need to record that multiples of x are not prime numbers, obviously before performing sieving, we must create a container to save the primality status of all numbers (referring to all numbers within the required range).
Therefore, we implement this through an array isPrime[]. If number i is prime, then isPrime[i] equals 1; otherwise, it equals 0.
Then we traverse from small to large. If a number is prime, mark all its multiples (excluding the prime number itself) as composite numbers (that is, set the corresponding array index value to 0).
In this way, after completing the traversal, we simply check which array values equal 1—those are the prime numbers. Then operations like recording prime quantity or outputting all primes become effortless.
Correctness Explanation
From the basic concept described above, we can understand this algorithm selects composite numbers from within and marks them as 0. Actually, when creating the array, we initially mark everything as 1, meaning we first assume all are prime numbers, then through the algorithm screen out determined primes to identify composite numbers, marking them as 0. Therefore, throughout the process, array numbers continuously change from 1 to 0—as mentioned at the beginning, this method picks out composite numbers.
From the feeling of this method, we can certainly determine: prime numbers won't be easily marked as composite, but composite numbers might not be discovered and thus be mistaken as primes.
When first encountering this method, we inevitably feel a sense of uncertainty: Why can we certainly determine that when traversing to a certain prime number (that is, array value equals 1), it must be prime? Could it be that it was originally composite, but previous traversal operations failed to discover it, thus having escaped detection?
Let's explore this thoroughly: First, this method definitely won't mark prime numbers as composite. When we traverse from small to large, if number x is composite, then according to composite number properties, x has at least one prime factor (note that 1 is not prime). Let's denote it as y. Then during small-to-large traversal, we will encounter y before x. When marking multiples of y as composite numbers, we will definitely mark x as composite. Therefore, there won't be escaped composite numbers left unmarked, causing them to be mistakenly considered prime.
Optimization
When understanding the reasoning behind its correctness establishment, subsequent optimization operations become easy to comprehend:
For prime number x, if following the previous method starting marking from 2x actually involves duplicate operations. Could it be that this 2x was already marked as composite previously? If so, wouldn't this operation be wasted effort?
Now we must research from when to start marking ensures neither omission nor overlap:
Actually, from the previous section stating composite number x has at least one prime factor y, we can guess: numbers before x×x don't need marking. Because for previous numbers k×x where k takes values from 2 to x-1, k is less than x. Two situations exist: First, k is prime—then when traversing to k, k×x will definitely be marked as composite. Second, k is not prime—then same as previously stated, k must have a prime factor y, so k×x also must have a prime factor y. Obviously, when traversing to y, k×x will definitely be marked as composite.
Meanwhile, x×x only has factors x and 1, so it can only be marked as composite by x. Therefore, starting marking from x×x both guarantees previous numbers have been marked and ensures the starting x×x can only be marked by x. Thus, starting from x×x represents the optimal "starting point".
Algorithm Steps
Step 1: Initialize the Sieve
Create a bool-type array (int is also acceptable) isPrime[n + 1] with length n + 1. The array index represents the corresponding number, while the array value indicates whether the corresponding number is prime (for example, isPrime[11] = 1 represents number 11 is prime). Initially, set all array values to true, but note to set positions 0 and 1 to false (these two numbers are defined as not prime).
Step 2: Sieve Out Composite Numbers
Traverse from i = 2 to √n. If isPrime[i] = true, then mark all multiples of i as false. Of course, after optimization, start marking from i×i as false.
Step 3: Extract Results
After traversal completion, indices with value true represent all prime numbers between 1 and n.
Example Code (C++)
class Solution {
public:
int Primes(int n) {
if (n <= 2) return 0;
if (n == 3) return 1;
vector<bool> IsPrime(n, true);
for(int i = 3; i * i < n; i += 2){
if(IsPrime[i]){
for(int j = i * i; j < n; j += i){
IsPrime[j] = false;
}
}
}
// At this point, all numbers in the container have been filtered
// Next, perform desired operations according to problem requirements
// For further efficiency, some operations can even be performed
// synchronously during the filtering process, but must ensure
// operated numbers have been filtered
}
};Complexity Analysis
Time Complexity: O(n log log n)
The time complexity derives from the harmonic series property. Each prime p marks n/p numbers, and the sum of 1/p for all primes approximates log log n.
Space Complexity: O(n)
The space requirement comes from the boolean array storing primality status for each number up to n.
Reference Problems
This classic problem asks to count all prime numbers less than a given non-negative integer n. The Sieve of Eratosthenes provides an efficient solution, significantly outperforming naive primality testing for each individual number.
Practical Considerations
When implementing the Sieve of Eratosthenes in practice, several considerations enhance performance:
- Odd Number Optimization: Since all even numbers except 2 are composite, we can skip even numbers entirely, reducing both time and space by half.
- Bit Array Implementation: For very large n, using a bit array instead of boolean array reduces memory consumption by a factor of 8.
- Segmented Sieve: For extremely large ranges, the segmented sieve processes the range in chunks, maintaining O(n log log n) time complexity while reducing space to O(√n).
- Wheel Factorization: Advanced implementations skip multiples of small primes (2, 3, 5) simultaneously, further reducing operations.
The Sieve of Eratosthenes remains one of the most elegant algorithms in number theory, demonstrating how simple mathematical insights can lead to highly efficient computational solutions. Its beauty lies in transforming a seemingly complex primality testing problem into a systematic elimination process.