Sum over Subsets DP
Authors: Rameez Parwez, Siyong Huang, Aakash Gokhale
Taking bitmask DP to the next level.
Prerequisites
Resources | ||||
---|---|---|---|---|
CF | Good explanation + problem list | |||
GFG | Goes over brute force solutions | |||
CF | Characterizing SOS DP as multidimensional prefix sums |
Sum Over Subsets (SOS) is technique used to efficiently calculate the sum of values for all subsets of a given set or bitmask. Instead of manually generating every subset, which can be slow, SOS uses bitwise operations to quickly go through only the necessary subsets.
The main trick is that for any bitmask , we can directly loop through all its subsets using:
This ensures we efficiently handle only the subsets of , skipping unnecessary combinations.
Implementation
Consider an array of elements. For a bitmask , we define as the sum of over all subset mask . That is:
The goal is to compute for all as efficiently as possible.
Solution
Brute Force
The naive solution is to iterate over all pair of masks, summing only when one of them is a subset of the other (i.e.,).
C++
vector<int> sos(1 << n);for (int x = 0; x < (1 << n); x++) {// iterate over all other sets and checks whether they're a subset of xfor (int i = 0; i < (1 << n); i++) {if ((i & x) == i) { sos[x] += a[i]; }}}
Python
sos = [0] * (1 << n)for x in range(1 << n):# iterate over all other sets and checks whether they're a subset of xfor i in range(1 << n):if (x & i) == i:sos[x] += a[i]
This solution has a time complexity of , which is too slow for large values of .
Optimized Solution : Iterating Over Submasks
Instead of iterating over all bitmasks for , we can optimize by iterating only over the subset mask of .
C++
vector<int> sos(1 << n);for (int x = 0; x < (1 << n); x++) {sos[x] = a[0];// iterate over all subsets of x directlyfor (int i = x; i > 0; i = (i - 1) & x) { sos[x] += a[i]; }}
Python
sos = [0] * (1 << n)for x in range(1 << n):sos[x] = a[0]i = x# iterate over all subsets of x directlywhile i > 0:sos[x] += a[i]i = (i - 1) & x
How does this works?
Time Complexity:
Proof
Faster Solution using Dynamic Programming
While the previous method is better, it still has some redundancy. For example, if a bitmask has unset bits, then is summed times. If we precompute and reuse these sums, we can eliminate repeated additions.
Subset Mask Partitioning with
We define the subset as follows:
In simpler terms, contains all subset masks of whose bits to the left of the -th bit match those of .
For example:
We can break as follows:
- If the -th bit of is , then .
- If the -th bit of is :
- Subsets with the -th bit .
- Subsets with the -th bit .
Thus:
Using the above partitioning, we define a DP table where:
Time Complexity:
C++
vector<int> sos(1 << n);vector<vector<int>> dp(1 << n, vector<int>(n));for (int x = 0; x < (1 << n); x++) {dp[x][0] = a[x];if (x & 1) { dp[x][0] += a[x ^ 1]; }for (int i = 1; i < n; i++) {dp[x][i] = dp[x][i - 1];if (x & (1 << i)) { dp[x][i] += dp[x ^ (1 << i)][i - 1]; }}sos[x] = dp[x][n - 1];}
Python
sos = [0] * (1 << n)dp = [[0] * n for _ in range(1 << n)]for x in range(1 << n):dp[x][0] = a[x]if x & 1:dp[x][0] += a[x ^ 1]for i in range(1, n):dp[x][i] = dp[x][i - 1]if x & (1 << i):dp[x][i] += dp[x ^ (1 << i)][i - 1]sos[x] = dp[x][n - 1]
Optimized Memory Usage
Since only depends on , we can reuse the DP array.
C++
for (int i = 0; i < (1 << n); i++) { sos[i] = a[i]; }for (int i = 0; i < n; i++) {for (int x = 0; x < (1 << n); x++) {if (x & (1 << i)) { sos[x] += sos[x ^ (1 << i)]; }}}
Python
sos = [0] * (1 << n)for i in range(1 << n):sos[i] = a[i]for i in range(n):for x in range(1 << n):if x & (1 << i):sos[x] += sos[x ^ (1 << i)]
SOS DP as N-Dimensional Prefix Sum
To understand this, let's revisit prefix sums. Given array , the prefix sum array is defined as:
The standard approach uses inclusion-exlcusion:
However, this approach has a drawback: as the number of dimensions grows, the number of terms we need to add or subtract grows exponentially. A simple and more scalable approach is to sweep through each dimensions one at a time and compute the prefix sum step by step:
C++
// Initializefor (int i = 0; i < n; i++) {for (int j = 0; j < m; j++) { S[i][j] = A[i][j]; }}// Sweep along x-axisfor (int i = 1; i < n; i++) {for (int j = 0; j < m; j++) { S[i][j] += S[i - 1][j]; }}// Sweep along y-axisfor (int i = 0; i < n; i++) {for (int j = 1; j < m; j++) { S[i][j] += S[i][j - 1]; }}
Python
# Initializefor i in range(n):for j in range(m):S[i][j] = A[i][j]# Sweep along the x-axisfor i in range(1, n):for j in range(m):S[i][j] += S[i - 1][j]# Sweep along the y-axisfor i in range(n):for j in range(1, m):S[i][j] += S[i][j - 1]
This approach generalizes to higher dimensions. For example, in :
- After sweeping along the x-axis, contains the sum of where and .
- After sweeping along the y-axis, contains the sum of where and .
- After sweeping along the z-axis, contains the sum of where and .
- After sweeping along the w-axis, contains the sum of where and .
If we extend this idea to dimensions, here's what happens: after sweeping along the -th axis, contains the sum of values where the first coordinates are less than or equal to , and the remaining coordinates match . Sound familiar? That's exactly what represents in the SOS problem!
If we think of each bit in a bitmask as its own axis, then a bitmask with bits can be viewed as an -dimensional coordinate where each component is either or . A submask of corresponds to an -dimensional coordinate where each component is less than or equal to . Therefore, when we interpret the bitmask as an -dimensional coordinate, alings with the definition of an -dimensional prefix sum!
By applying the sweeping algorithm along each axis, we get the memory-optimized SOS DP solution mentioned earlier, demonstrating that SOS DP is indeed an n-dimensional prefix sum.
C++
for (int i = 0; i < (1 << n); i++) { F[i] = A[i]; }for (int i = 0; i < n; i++) { // Sweep along the i-th axisfor (int x = 0; x < (1 << n); x++) {if (x & (1 << i)) // If the i-th bit is set, accumulateF[x] += F[x ^ (1 << i)];}}
Python
for i in range(1 << n):F[i] = A[i]for i in range(n): # Sweep along the i-th axisfor x in range(1 << n):if x & (1 << i): # If the i-th bit is set, accumulateF[x] += F[x ^ (1 << i)]
Focus Problem – try your best to solve this problem before continuing!
Explanation
First, think of each word as a combination of letters represented by a bitmask. For example, bcd
= 0b1110
, and ada
= 0b1001
, where each bit stands for a letter. We'll also keep track of how often each bitmask appears in the dictionary.
Next, we use SOS DP to compute the number of words disjoint from the mask(i.e., words containing none of the vowels in the mask).
Once this is calculated, the number of valid words for a subset is simply because a word is valid if it contains at least one vowel from the subset, meaning it is not disjoint from mask. Finally, we square the count of valid words for every subset, XOR all those squared values together, and that gives us the answer.
Implementation
Time Complexity:
C++
#include <iostream>#include <string>const int M = 24;int sos[1 << M];int main() {int n;std::cin >> n;for (int i = 0; i < n; i++) {
Python
M = 24sos = [0] * (1 << M)n = int(input())for i in range(n):st = input()mask = ((1 << (ord(st[0]) - 97)) | (1 << (ord(st[1]) - 97)) | (1 << (ord(st[2]) - 97)))sos[mask] += 1
General Problems
Status | Source | Problem Name | Difficulty | Tags | |
---|---|---|---|---|---|
CSES | Easy | Show TagsSOS DP | |||
Platinum | Easy | Show TagsCombo, DP | |||
CF | Normal | Show TagsBitmasks, SOS DP | |||
Platinum | Normal | Show TagsNT, Prefix Sums | |||
CF | Normal | Show TagsBitmasks, SOS DP | |||
CF | Normal | Show TagsBitmasks, SOS DP | |||
kilonova | Hard | Show TagsBitmasks, NT, SOS DP | |||
CF | Hard | Show TagsBitmasks, DP | |||
JOI | Hard | Show TagsSOS DP | |||
CF | Insane | Show TagsBitmasks, DP, SOS DP |
Module Progress:
Join the USACO Forum!
Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!