Welcome back. If you're here for the first time I suggest reading from the beginning of the series (over here).

I also suggest reading all my blog posts though...

Recap

I feel like starting off each post in this series with a recap is a good idea.

Last time we outlined and built the beginning stages of our Matrix class. A matrix is a just a 2d grid filled with numbers. It's always weird recapping your work and remembering that it took like 3 hours to produce but 1 sentence to summarize. I guess what I'm saying is, that's the recap. Onto the main event.

1.2 Gaussian Elimination

I believe at the end of the last post (yup confirmed) I mentioned this post would deal with gaussian elimination. I've though about it a bit since then and am excited to give this a try.

What is Gaussian Elimination?

Let's say you're presented with the following problem on a test:

$$ \begin{align} 3x + 8y &= 13\\ 6x + 32y &= 84 \end{align} $$

and asked to find what values you should set x and y to in order to make the equation evaluate to true.

You may start by inspecting the two equations and testing some values. More likely you will solve one equation (let's say the first one) so that one variable (x for example) is in terms of y. Something like:

$$ \begin{align} 3x + 8y &= 13 \\ 3x &= 8y + 13 \\ x &= \frac{8y + 13}{3} \end{align} $$

Now you can replace x with

$$ \frac{8y + 13}{3} $$

and solve regularly. This approach works well when you have 2 equations and 2 unknowns, but what happens with the following:

$$ \begin{align} 3x + 8y - 54z &= 13\\ 6x + 32y + z &= 84\\ 91x + 82y + 9z &= 344\\ 13x + 5y + 8z &= 91\\ \end{align} $$

Whether or not this has a solution (I have no clue) solving it via substitution is much harder.

Introducing Gaussian Elimination. Gaussian Elimination is one method for making progress on a problem like this.

How Gaussian Elimination Works

First off, Gaussian Elimination works on a Matrix. So given our system of equations

$$ \begin{align} 3x + 8y - 54z &= 13\\ 6x + 32y + z &= 84\\ 91x + 82y + 9z &= 344\\ 13x + 5y + 8z &= 91\\ \end{align} $$

we must produce the matrix:

$$ \begin{bmatrix} 3 & 8 & -54 & 13\\ 6 & 32 & 1 & 84\\ 91 & 82 & 9 & 344\\ 13 & 5 & 8 & 91\\ \end{bmatrix} $$

Now that we have our matrix, we can discuss the rules.

There are 3 valid moves you can make while performing Gaussian Elimination

Swap 2 rows
Subtract one row by k times another row
Multiple through a row by k

For example I can swap row 0 and row 1 and get:

$$ \begin{bmatrix} 6 & 32 & 1 & 84\\ 3 & 8 & -54 & 13\\ 91 & 82 & 9 & 344\\ 13 & 5 & 8 & 91\\ \end{bmatrix} $$

I can subtract row 1 by 1/2 times row 0 and get:

$$ \begin{bmatrix} 6 & 32 & 1 & 84\\ 0 & 8 & -27 & -29\\ 91 & 82 & 9 & 344\\ 13 & 5 & 8 & 91\\ \end{bmatrix} $$

And I can multiple row 0 by 1/6 and get

$$ \begin{bmatrix} 1 & \frac{32}{6} & \frac{1}{6} & 14\\ 0 & 8 & -27 & -29\\ 91 & 82 & 9 & 344\\ 13 & 5 & 8 & 91\\ \end{bmatrix} $$

With these powers we have the following goal:

Reduce our matrix into row echelon form. This basically (not exact definition) amounts to getting our matrix into the following state: 1. Each row (starting from the left) has one or more 0's a 1 and then anything afterwards. These are called "leading 1's" 2. Each leading 1 has only 0's below it

Here is an example:

$$ \begin{bmatrix} 1 & 8 & -27\\ 0 & 1 & 9\\ 0 & 0 & 1\\ \end{bmatrix} $$

There is a more persice and correct definition to row-echelon form, I suggest looking it up in the textbook (pg. 11)

Programming Gaussian Elimination

The algorithm

Alright enough math (even though we love it, writing it in latex is still time consuming :p ), let's get into the code.

The general algorithm for running Gaussian Elimination is to do the following:

Find the leftmost column, c, which is not only filled with 0's
If the value in the top row is 0 swap the top row with a row whose value in that column is not 0.
Now let's say the value in the row 0, column c is k. Multiply row 0 by 1/k
Since k * (1/k) = 1 we now have our leading 1. If the value in row 1 below our leading 1 is v, subtract row 1 by v * row 0. This will cause the value below our leading 1 to become 0, which is exactly what we want.
Repeat this process by covering row 0 and pretending row 1 is not the top most row.

I won't show an example here since once we get to programming we'll be able to run through many examples.

Outline

I think breaking these steps into there own methods (functions) is a good approach here. We can have:

Locate leftmost ( not all 0's ) column
Find row with none 0's value
Swap rows
Multiply row by value
Subtract row by row

Locate Left Most Column

At first I felt the title line for this method should be:

std::optional<uint32_t> _LocateLeftMostNoneZeroColumn()

Now I realize that after the first iteration of our Gaussian Elimination algorithm we will end up with a column with a single 1 and 0's below it but that if we called this method again, it would return the same column back to us. We need a way of telling this method not to consider certain columns. For now I think we can use a single integer to specify which row/column to start with. Now our title line is:

std::optional<uint32_t> _LocateLeftMostNoneZeroColumn(const std::pair<uint32_t, uint32_t>  &start)

I am using std::optional here in case there does not exist a left most none zero column. In such a case we will return a nil optional.

The body of this method should be fairly straight forward. We loop over the matrix column by column checking if we can find one with a none 0 row value. Once we find one, return that column.

std::optional<uint32_t> _LocateLeftMostNoneZeroColumn(const std::pair<uint32_t, uint32_t>  &start) {
    const auto [start_row, start_col] = start;
    for(uint32_t c = start_col; c < NumCols; c++) {
        for(uint32_t r = start_row; r < NumRows; r++) {
            if (matrix_[r][c] != 0) {
                return c; 
            }
        } 
    }
    return {};
}

Locate Row With None 0 Value

Turns out that we already did this work in the previous step, though we did't make use of it. We only returned the column previously but now I am thinking we can return a pair (column, row). Let's modify our title line and function body.

std::optional<std::pair<uint32_t, uint32_t> > _LocateNextLeadingOnePosition(const std::pair<uint32_t, uint32_t>  &start) {
    const auto [start_row, start_col] = start;
    for(uint32_t c = start_col; c < NumCols; c++) {
        for(uint32_t r = start_row; r < NumRows; r++) {
            if (matrix_[r][c] != 0) {
                std::make_pair(r, c);
            }
        } 
    }
    return {};
}

That's better. Killed two steps with 1 function change (as they say).

Swap Rows

This one is going to require its own function. On the bright side, cpp std::vector comes with a fancy vectorA.swap(vectorB) method (found here).

I think this seems cool, let's use that one.

void _SwapRows(const uint32_t &row1, const uint32_t &row2) {
    matrix_[row1].swap(matrix_[row2]);
}

The cpp reference website says time complexity for this swap is constant time. Love to see that.

Multiply Row by Value

This function also seems pretty straight forward. We'll take some value k as an argument and multiply each value in our row by k.

void _ScaleRowBy(const uint32_t &row, const Type &k) {
    for(auto &elm : matrix_[row]) {
        elm *= k;
    }
}

Here elm is a reference to the value in our row. That is why modifying elm modifies the value in our row.

This function, as it stands, will help for scaling a row by a value. On the other hand, this function fails to solve the Guassian Elimination step which subtracts one row by the scaled version of another. This is because when we want to subtract row A but k * row B, we don't want to actually modify row B.

We deal with this in the next section.

Modifying Row A by Another Row

I think the idea of modifying a row can occur in different scenarios and therefore making this type of function more generic would be a good idea. I am going to paste my thought below and then go over it:

void _ModifyRowBy(const uint32_t &row, const std::function<const Type &(const uint32_t &col, const Type &row_value)> &modifier) {
    uint32_t col = 0;
    for(auto &elm : matrix_[row]) {
        elm = modifier(col, elm);
        col += 1;
    }
}

Alright let's break down the title line. We request that the caller pass in the row they want to modify (first argument) and a function (second argument) which given a column number and a value returns a new value.

We can then take that new value, and set our rows value to that new value.

Let's take a look at using our _ModifyRowBy function to perform our "Multiply Row by Value" function.

Previously our "Multiply Row by Value" function was:

void _ScaleRowBy(const uint32_t &row, const Type &k) {
    for(auto &elm : matrix_[row]) {
        elm *= k;
    }
}

Now it becomes:

void _ScaleRowBy(const uint32_t &row, const Type &k) {
    _ModifyRowBy(row, [&k](const uint32_t &col, const Type &old_value) {
        return old_value * k;
    };
}

No real difference in size, but our new _ModifyRowBy function now allows us to write our "Subtract row by a multiple of another row" function as follows:

void _SubtractRowByRowMultiple(const uint32_t &row_to_modify, const uint32_t &other_row, const Type &k) {
    _ModifyRowBy(row_to_modify, [this, k, &other_row](const uint32_t &col, const Type &old_value) {
        return old_value - (matrix_[other_row][col] * k);
    });
}

Here we tell _ModifyRowBy we want to modify row_to_modify. We then have to capture, this, k, other_row because we want to use them inside our lambda function (between the curly braces).

Note that even though this isnt used explicitly in our lambda, we need it for matrix_ since that variable is really this->matrix_.

Then given some row value old_value we want to subtract k * matrix_[other_row][col] (k times our other row's value).

With this complete we should be able to chain these functions together and reduce our matrix. Let's hook these functions together and update our main.cpp.

Row Echelon Form

Our Row echelon form function will be public and look as follows:

void ReduceToRowEchelonForm() {
    uint32_t r = 0,
             c = 0;
    while(r < NumRows && c < NumCols) {
        const auto &leading_1_pos = _LocateNextLeadingOnePosition(std::make_pair(r, c));

        // Check if our optional has a value
        if (!leading_1_pos.has_value()) {
            break;
        }

        // Get values our of our optional
        const auto[leading_1_row, leading_1_col] = leading_1_pos.value();

        // Check if we should swap rows
        if (leading_1_row != r) { 
            _SwapRows(leading_1_row, r);
        }

        // Produce leading 1
        _ScaleRowBy(leading_1_row, 1.0 / matrix_[leading_1_row][leading_1_col]);
        
        // Zero out all values below leading 1
        for (uint32_t below_row = leading_1_row + 1; below_row < NumRows; below_row++) {
            _SubtractRowByRowMultiple(below_row, leading_1_row, matrix_[below_row][leading_1_col]);
        }

        r = (leading_1_row + 1);
        c = (leading_1_col + 1);
    }
}

and our main.cpp will be updated to the following:

#include "linalg/matrix.h"

int main() {
    mathlib::linalg::Matrix<float, 5, 5> matrix;
    matrix << 10,   83,  33,  931,  11,
              65,  -12,   3,   99,   9,
              81,    5, 653, -125, 271,
              144,  32,  77,  -30, -15,
              10,   49,  95,   61,  -9;
              
    std::cout << "---------------Matrix---------------" << std::endl;
    matrix.Print();
    matrix.ReduceToRowEchelonForm();
    std::cout << "-------- Row Echelon Form ----------" << std::endl;
    matrix.Print();
}

Let's give this a whirl!

Output ( in latex form ):

$$ \begin{bmatrix} 10 & 83 & 33 & 931 & 11\\ 65 & -12 & 3 & 99 & 9\\ 81 & 5 & 653 & -125 & 271\\ 144 & 32 & 77 & -30 & -15\\ 10 & 49 & 95 & 61 & -9\\ \end{bmatrix} $$ $$ \begin{bmatrix} 1 & 0 & 33 & 931 & 11\\ 0 & 0 & 1 & 33 & 3\\ -0 & -0 & -0 & 1 & 0.0778813\\ -0 & -0 & -0 & -0 & 1\\ 0 & 0 & 0 & 0 & 0\\ \end{bmatrix} $$

Heyyy! Exactly what we hoped for (minus the minus signs before the 0's)!

Alright, I think this is a good place to stop for today. Today we implemented Guassian Elimination in our Matrix class. Next up is Gauss-Jordan Elimination. That is only a minor modifcation to Guassian Elimination so hopefully we breeze through that!

Until then,

-- Ehud

Tagged with: