Matrix Chain Multiplication
Problem
Given a sequence of matrices, find the most efficient way to multiply these matrices together. The problem is not to perform the multiplications, but to decide the order in which to multiply the matrices to minimize the total number of scalar multiplications.
Matrix multiplication is associative, so the result does not depend on how the matrices are parenthesized, but the number of operations does. Given an array p[] of length n+1, where the ith matrix has dimensions p[i-1] x p[i], write a function MatrixChainOrder() that returns the minimum number of multiplications needed to multiply the chain.
Examples
Example 1
Input: p = [40, 20, 30, 10, 30]
Output: 26000
Explanation: There are 4 matrices of dimensions 40x20, 20x30, 30x10, and 10x30.
The minimum number of multiplications is obtained by parenthesizing as (A(BC))D.
Calculation: 20*30*10 + 40*20*10 + 40*10*30 = 6000 + 8000 + 12000 = 26000
Example 2
Input: p = [10, 20, 30, 40, 30]
Output: 30000
Explanation: There are 4 matrices of dimensions 10x20, 20x30, 30x40, and 40x30.
The minimum number of multiplications is obtained by parenthesizing as ((AB)C)D.
Calculation: 10*20*30 + 10*30*40 + 10*40*30 = 6000 + 12000 + 12000 = 30000
Example 3
Input: p = [10, 20, 30]
Output: 6000
Explanation: Only two matrices: 10x20 and 20x30. Only one way to multiply: 10*20*30 = 6000
Similar Problems
[Printing brackets in Matrix Chain Multiplication](matrix-chain-multiplication-2-print-brackets)
Method 1 - Naive
Intuition
The naive approach tries every possible way to parenthesize the product and recursively calculates the cost for each way. It chooses the minimum among all possible parenthesizations.
Approach
- For a chain of matrices from index
itoj, try placing a parenthesis at every possible positionkbetweeniandj. - Recursively compute the minimum cost for multiplying the left and right subchains.
- Add the cost of multiplying the two resulting matrices.
- Base case: if there is only one matrix (
i == j), no multiplication is needed.
Code
C++
int matrixChain(const std::vector<int>& p, int i, int j) {
if (i == j) return 0;
int minCost = INT_MAX;
for (int k = i; k < j; ++k) {
int cost = matrixChain(p, i, k)
+ matrixChain(p, k + 1, j)
+ p[i - 1] * p[k] * p[j];
minCost = std::min(minCost, cost);
}
return minCost;
}
// Usage: matrixChain(p, 1, n-1)
Java
int matrixChain(int[] p, int i, int j) {
if (i == j) return 0;
int min = Integer.MAX_VALUE;
for (int k = i; k < j; k++) {
int cost = matrixChain(p, i, k)
+ matrixChain(p, k + 1, j)
+ p[i - 1] * p[k] * p[j];
if (cost < min) min = cost;
}
return min;
}
// Usage: matrixChain(p, 1, n-1)
Python
def matrixChain(p, i, j):
if i == j:
return 0
min_cost = float('inf')
for k in range(i, j):
cost = (matrixChain(p, i, k)
+ matrixChain(p, k + 1, j)
+ p[i - 1] * p[k] * p[j])
min_cost = min(min_cost, cost)
return min_cost
# Usage: matrixChain(p, 1, len(p)-1)
Complexity
- ⏰ Time complexity:
O(2^n)(exponential, as all possible parenthesizations are tried) - 🧺 Space complexity:
O(n)(due to recursion stack)
Method 2 - Top Down DP
Intuition
The naive approach recalculates the same subproblems multiple times. Top-down DP (memoization) stores the results of subproblems in a table, so each subproblem is solved only once.
Optimal Substructure
The Matrix Chain Multiplication problem can be broken down into smaller subproblems. For a sequence of matrices, the goal is to determine the most efficient way to multiply them by placing parentheses in different positions. For a chain of n matrices (A, B, C, D), you can split the multiplication at any position, resulting in subchains like:
- (A) × (BCD)
- (AB) × (CD)
- (ABC) × (D)
Each split divides the problem into two smaller matrix chain multiplication problems. The minimum number of scalar multiplications needed for the whole chain is the minimum among all possible splits, where each split’s cost is the sum of the costs of multiplying the left and right subchains plus the cost of multiplying the resulting two matrices. This recursive structure demonstrates the optimal substructure property.
Example
A (10×20), B (20×30), C (30×40), D (40×30). So, p = [10, 20, 30, 40, 30]
We want to compute the minimum number of scalar multiplications needed to multiply A × B × C × D.
graph TD A1["matrixChain(1,4)"] B1["matrixChain(1,1)"] B2["matrixChain(2,4)"] C1["matrixChain(2,2)"] C2["matrixChain(3,4)"] D1["matrixChain(3,3)"] D2["matrixChain(4,4)"] C3["matrixChain(2,3)"] D3["matrixChain(2,2)"] D4["matrixChain(3,3)"] B3["matrixChain(1,2)"] C4["matrixChain(1,1)"] C5["matrixChain(2,2)"] B4["matrixChain(1,3)"] C6["matrixChain(1,1)"] C7["matrixChain(2,3)"] D5["matrixChain(2,2)"] D6["matrixChain(3,3)"] A1 --> B1 A1 --> B2 B2 --> C1 B2 --> C2 C2 --> D1 C2 --> D2 B2 --> C3 C3 --> D3 C3 --> D4 A1 --> B3 B3 --> C4 B3 --> C5 A1 --> B4 B4 --> C6 B4 --> C7 C7 --> D5 C7 --> D6
Suppose you have matrices A, B, C, D. To find the minimum multiplication cost, you can split the chain at different positions:
- First split: (A) × (BCD)
- Second split: (AB) × (CD)
- Third split: (ABC) × (D)
Each split leads to recursive calls for the left and right segments. For instance, the subproblem of multiplying (B × C × D) may be encountered in both the first and second splits, leading to redundant calculations if not stored and reused.
Approach
- Use a 2D memoization table
dp[i][j]to store the minimum cost for multiplying matrices fromitoj. - If the value is already computed, return it.
- Otherwise, compute recursively as in the naive approach, but store and reuse results.
Code
C++
int matrixChain(const std::vector<int>& p, int i, int j, std::vector<std::vector<int>>& dp) {
if (i == j) return 0;
if (dp[i][j] != -1) return dp[i][j];
int minCost = INT_MAX;
for (int k = i; k < j; ++k) {
int cost = matrixChain(p, i, k, dp)
+ matrixChain(p, k + 1, j, dp)
+ p[i - 1] * p[k] * p[j];
minCost = std::min(minCost, cost);
}
dp[i][j] = minCost;
return minCost;
}
// Usage:
// int n = p.size();
// std::vector<std::vector<int>> dp(n, std::vector<int>(n, -1));
// matrixChain(p, 1, n-1, dp);
Java
int matrixChain(int[] p, int i, int j, int[][] dp) {
if (i == j) return 0;
if (dp[i][j] != -1) return dp[i][j];
int min = Integer.MAX_VALUE;
for (int k = i; k < j; k++) {
int cost = matrixChain(p, i, k, dp)
+ matrixChain(p, k + 1, j, dp)
+ p[i - 1] * p[k] * p[j];
if (cost < min) min = cost;
}
dp[i][j] = min;
return min;
}
// Usage:
// int n = p.length;
// int[][] dp = new int[n][n];
// for (int[] row : dp) Arrays.fill(row, -1);
// matrixChain(p, 1, n-1, dp);
Python
def matrixChain(p, i, j, dp):
if i == j:
return 0
if dp[i][j] != -1:
return dp[i][j]
min_cost = float('inf')
for k in range(i, j):
cost = (matrixChain(p, i, k, dp)
+ matrixChain(p, k + 1, j, dp)
+ p[i - 1] * p[k] * p[j])
min_cost = min(min_cost, cost)
dp[i][j] = min_cost
return min_cost
# Usage:
# n = len(p)
# dp = [[-1]*n for _ in range(n)]
# matrixChain(p, 1, n-1, dp)
Complexity
- ⏰ Time complexity:
O(n^3)(since there are O(n^2) subproblems and each takes O(n) time) - 🧺 Space complexity:
O(n^2)(for the memoization table)
Method 3 - Bottom up DP
Intuition
Bottom-up DP builds the solution iteratively by solving smaller subproblems first and using their results to solve larger subproblems. We fill a DP table where dp[i][j] stores the minimum cost to multiply matrices from i to j.
Approach
- Initialize a 2D table
dpof sizen x nwith zeros on the diagonal (cost of multiplying one matrix is zero). - For chain lengths from 2 to
n-1, compute the minimum cost for all subchains. - For each subchain
(i, j), try all possible splitskand updatedp[i][j]with the minimum cost. - The answer is in
dp[1][n-1].
Code
C++
int matrixChainOrder(const std::vector<int>& p) {
int n = p.size();
std::vector<std::vector<int>> dp(n, std::vector<int>(n, 0));
for (int len = 2; len < n; ++len) {
for (int i = 1; i < n - len + 1; ++i) {
int j = i + len - 1;
dp[i][j] = INT_MAX;
for (int k = i; k < j; ++k) {
int cost = dp[i][k] + dp[k+1][j] + p[i-1]*p[k]*p[j];
dp[i][j] = std::min(dp[i][j], cost);
}
}
}
return dp[1][n-1];
}
Java
int matrixChainOrder(int[] p) {
int n = p.length;
int[][] dp = new int[n][n];
for (int len = 2; len < n; len++) {
for (int i = 1; i < n - len + 1; i++) {
int j = i + len - 1;
dp[i][j] = Integer.MAX_VALUE;
for (int k = i; k < j; k++) {
int cost = dp[i][k] + dp[k+1][j] + p[i-1]*p[k]*p[j];
dp[i][j] = Math.min(dp[i][j], cost);
}
}
}
return dp[1][n-1];
}
Python
def matrixChainOrder(p):
n = len(p)
dp = [[0]*n for _ in range(n)]
for length in range(2, n):
for i in range(1, n - length + 1):
j = i + length - 1
dp[i][j] = float('inf')
for k in range(i, j):
cost = dp[i][k] + dp[k+1][j] + p[i-1]*p[k]*p[j]
dp[i][j] = min(dp[i][j], cost)
return dp[1][n-1]
Complexity
- ⏰ Time complexity:
O(n^3) - 🧺 Space complexity:
O(n^2)
Dry Run
Let's fill the DP table for this example (p = [10, 20, 30, 40, 30], n = 5):
| 0 | 1 | 2 | 3 | 4 | |
|---|---|---|---|---|---|
| 0 | 0 | ||||
| 1 | 0 | ||||
| 2 | 0 | ||||
| 3 | 0 | ||||
| 4 | 0 |
Now, fill for chain length 2:
dp[1][2]= 10×20×30 = 6000dp[2][3]= 20×30×40 = 24000dp[3][4]= 30×40×30 = 36000
| 0 | 1 | 2 | 3 | 4 | |
|---|---|---|---|---|---|
| 0 | 0 | ||||
| 1 | 0 | 6000 | |||
| 2 | 0 | 24000 | |||
| 3 | 0 | 36000 | |||
| 4 | 0 |
Chain length 3:
-
dp[1][3]= min( (10×20×40 + 0 + 24000) = 32000, (6000 + 10×30×40 + 0) = 18000 ) = 18000 -
dp[2][4]= min( (20×30×30 + 0 + 36000) = 54000, (24000 + 20×40×30 + 0) = 48000 ) = 48000
| 0 | 1 | 2 | 3 | 4 | |
|---|---|---|---|---|---|
| 0 | 0 | ||||
| 1 | 0 | 6000 | 18000 | ||
| 2 | 0 | 24000 | 48000 | ||
| 3 | 0 | 36000 | |||
| 4 | 0 |
Chain length 4:
dp[1][4]= min( (10×20×30 + 0 + 48000) = 54000, (6000 + 10×30×30 + 36000) = 51000, (18000 + 10×40×30 + 0) = 30000 ) = 30000
| 0 | 1 | 2 | 3 | 4 | |
|---|---|---|---|---|---|
| 0 | 0 | ||||
| 1 | 0 | 6000 | 18000 | 30000 | |
| 2 | 0 | 24000 | 48000 | ||
| 3 | 0 | 36000 | |||
| 4 | 0 |
Final answer:
Minimum number of multiplications = 30000 (dp[1][4])