[1/3] A Complete Guide to Gaussian Elimination

Part 1 of 3: What it means, how we [humans] compute it, pivots, singular cases, row exchanges, and a geometric understanding using span and linear combinations.

adam dhalla
18 min read · Jan 3, 2021

--

This series can act as a kind of summary of lectures 1–4 of Prof. Gilbert Strang’s class 18.06 on MIT OCW, along with some extra tidbits of intuition I was able to gather.

Continuing on my own personal journey learning linear algebra, I’d like to share, mostly for my own selfish reasons (writing about concepts does make them stick), intuitions about a fundamental concept in linear algebra: using matrix notation to solve a system of equations.

Gaussian elimination is a starter concept in many first-year linear algebra courses. This quick explanation is meant to help anyone in one of those courses who is learning about Gaussian elimination and, more importantly (we will spend more time on this), factorizing the system into the form A = LU.

I’d recommend a solid understanding of basic Gaussian elimination (and all the concepts below it), although I’ll review it anyway.

Why Our Classic Way of Solving Linear Systems Doesn’t Work

We’ve been familiar with linear systems of equations since middle school, or at the latest since the sophomore year of high school.

There were a few ways to solve these: you could use one equation to solve for some unknown, and then replace that unknown with its new value in the other equation. This is called substitution.

You could also eliminate entire equations by adding and subtracting the equations themselves.

And while Gaussian elimination is a little closer to the latter, there’s still a problem with either of these methods. Both work for small systems of 2 or 3 equations and unknowns, but when you get to large systems, you need a methodical and reliable procedure that works on all systems.

When you get to solving large systems, this is done by a computer. It is quite hard to tell a computer to do either elimination or substitution the way we were taught classically, as those methods differ from system to system.

This need for a methodical formula is why Gaussian Elimination exists.

Motivation for Gaussian Elimination

Gaussian Elimination is a way of solving a system of equations in a methodical, predictable fashion using matrices.

Let’s look at an example of a system, and solve it using elimination:

1x + 1y = 2
2x + 1y = 3

We don’t need linear algebra to solve this, obviously. Heck, we can solve it at a glance: the answer is quite obviously x = y = 1. But things get much harder as the number of unknowns and equations grows.

We can store all this information in one matrix and two vectors: a matrix A which stores all our coefficients, a vector x of the unknowns, and a vector b of our answers, so that the system reads Ax = b:

[ 1 1 ] [ x ]   [ 2 ]
[ 2 1 ] [ y ] = [ 3 ]

If we multiply this out we get the same system of equations. More importantly, we can actually get rid of the x and y vector during elimination, since it doesn’t contribute any information. We know that the system is 2 x 2, and thus has two unknowns.

Thus, we can create an augmented matrix of A and b:

[ 1 1 | 2 ]
[ 2 1 | 3 ]

On the left of the dashed line is our coefficient matrix; on the right are our answers.

We can read the above as two equations: 1x + 1y = 2, and 2x + 1y = 3 (if you imagine the x and y vector still being there, or, easier, just imagine the corresponding unknown sitting next to each coefficient).
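If you’d like to see this bookkeeping in code, here’s a minimal NumPy sketch (NumPy is my choice for these examples, not something the course prescribes) that builds A, b, and the augmented matrix for our system:

```python
import numpy as np

# Coefficient matrix A and right-hand side b for the system
#   1x + 1y = 2
#   2x + 1y = 3
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([2.0, 3.0])

# The augmented matrix [A | b]: one array that carries both sides,
# so every row operation updates the right-hand side automatically.
aug = np.hstack([A, b.reshape(-1, 1)])
print(aug)
# [[1. 1. 2.]
#  [2. 1. 3.]]
```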

We can do elimination in this matrix form in a similar way. What we want to do is turn the coefficient matrix A into U — what we call the upper triangular.

The Upper Triangular

The upper triangular (U) is a special kind of matrix with only zeros below the diagonal; upper triangular matrices are square. For example, the (3 x 3) pattern looks like this, where each * can be any number:

[ * * * ]
[ 0 * * ]
[ 0 0 * ]

The (2 x 2) and (4 x 4) cases follow the same pattern.

The reason upper triangular matrices are favored when solving systems of equations is that they make the system easy to solve. Let’s momentarily imagine a system of equations with three equations and three unknowns.

on the left is the Ax = b format, and on the right the augmented style. Both are equivalent.

If the coefficient matrix is already in upper triangular form, as it is above, the system becomes very easy to solve. You can solve it from the bottom up: the bottom row involves only the last unknown, so it states that unknown’s value directly.

In an upper triangular, this is what your system becomes.

We can now solve easily from the bottom up. We can easily calculate what z must be, and we can substitute this value of z into the equation above it to solve for the next unknown easily. This is called back-substitution.

This might be easier to understand with a concrete example. Let’s view another upper triangular in the format Ax = b and augmented.

on the left is the Ax = b format, and on the right the augmented style. Both are equivalent.

The associated system of equations (you should have it in your head as you see this) looks like this.

We can easily figure this out bottom-up. We can see at a glance that z is equal to 3, and we can then substitute this in the equations above (back-substitution) and find each variable one at a time, making it quite easy. We find that y must equal 1 and x must equal 2.
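Here’s a short sketch of back-substitution in NumPy. Since the figure with the actual U from this example isn’t reproduced above, the matrix below is a stand-in I chose so the answer comes out to the same x = 2, y = 1, z = 3; the routine itself works for any upper triangular matrix with nonzero pivots:

```python
import numpy as np

def back_substitute(U, c):
    """Solve Ux = c for upper triangular U, bottom row first."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract the already-known unknowns, then divide by the pivot.
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# A stand-in upper triangular system (not the one from the figure),
# chosen so the answer is x = 2, y = 1, z = 3.
U = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 2.0],
              [0.0, 0.0, 1.0]])
c = np.array([6.0, 7.0, 3.0])
print(back_substitute(U, c))   # [2. 1. 3.]
```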

Gaussian Elimination in a (2 x 2) System

Let us now return to our example with a better idea of where we want to head. We want to turn the coefficient matrix on the left, somehow, into an upper triangular matrix that is easy to solve.

Our (2 x 2) example:

[ 1 1 | 2 ]
[ 2 1 | 3 ]

The way we do this is reminiscent of our classic elimination strategy. To create a (2 x 2) upper triangular matrix, we want the place occupied by our 2 to be occupied by a 0. We don’t care what the other numbers are; remember, an upper triangular has only 0’s below the diagonal. That’s the only criterion.

We do our first elimination to get this 2 to a 0. We multiply the first row of our coefficient matrix [ 1 1 ] and subtract it from the bottom row [ 2 1 ], choosing the multiplier that cancels out the 2.

If it helps, think of this just like old-school elimination: we’re doing the exact same thing. The correct multiplier for the first row is 2. After multiplying our first row by 2, we get [ 2 2 ], which we then subtract from the second row to get our updated matrix: [ 2 1 ] - [ 2 2 ] = [ 0 -1 ].

But, you may point out, we didn’t change our right hand side! And you’d be completely right. Just like in classic elimination, we do the same multiply-subtract action on the right hand side too. So, we multiply the first number, 2, by our multiplier 2, and subtract it from 3 to get -1.

Notice our top row doesn’t change when we multiply it and subtract it from the next row, just the row below changes. This is also reminiscent of how we do classic elimination — we use the equation as a tool to simplify the other equations — we don’t alter the original equation itself.

And now look! We have a simplified system with an upper triangular as our coefficient matrix. Writing this out again in full form, we get Ux = c:

[ 1  1 ] [ x ]   [  2 ]
[ 0 -1 ] [ y ] = [ -1 ]

where U is the upper triangular, x the vector of unknowns, and c our modified ‘right hand side’ (formerly b).

Just to recap: we went from Ax = b to Ux = c by transforming our matrix A (elimination) into the upper triangular U while applying the same transformation to b to keep things consistent on both sides of the equality. So, these are two completely equivalent systems.
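To make the whole procedure concrete, here’s the same elimination and back-substitution carried out on our (2 x 2) augmented matrix in a few lines of NumPy:

```python
import numpy as np

# Augmented matrix for x + y = 2, 2x + y = 3.
aug = np.array([[1.0, 1.0, 2.0],
                [2.0, 1.0, 3.0]])

# Multiplier = (entry to eliminate) / (pivot) = 2 / 1 = 2.
m = aug[1, 0] / aug[0, 0]
aug[1] -= m * aug[0]           # row 2 <- row 2 - 2*(row 1); b updates too
print(aug)
# [[ 1.  1.  2.]
#  [ 0. -1. -1.]]

# Back-substitution: y = (-1)/(-1) = 1, then x = 2 - y = 1.
y = aug[1, 2] / aug[1, 1]
x = (aug[0, 2] - aug[0, 1] * y) / aug[0, 0]
print(x, y)                    # 1.0 1.0
```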

Gaussian Elimination in a (3 x 3) System

Let’s get a little more practice with Gaussian elimination in a less trivial form, a (3 x 3) matrix. Let’s again use a concrete example. Say we wanted to solve this system:

2x + 3y      = 8
4x + 5y - 5z = 9
2x + 2y + 3z = 9

That looks rather intimidating. Note that the first equation does not include z, so its z-coefficient is 0. First, let’s convert it into the equivalent matrix form Ax = b and the augmented matrix A | b:

[ 2 3  0 ] [ x ]   [ 8 ]
[ 4 5 -5 ] [ y ] = [ 9 ]
[ 2 2  3 ] [ z ]   [ 9 ]

[ 2 3  0 | 8 ]
[ 4 5 -5 | 9 ]
[ 2 2  3 | 9 ]

Now, we can convert our coefficient matrix into the upper triangular in a series of steps similar to the ones carried out on the (2 x 2) matrix.

To make our matrix into the upper triangular, we want the three spots in the bottom left corner (currently occupied by 4, 2, and 2) to be 0. Again, we do not care about the numbers on and above the diagonal — they can be anything.

Our first step is to make the 4 in the second row, first column a zero, in a similar elimination manner. The multiplier we multiply the first row by before subtracting is 2, so that the 2 in the first row cancels out the 4 in the second row. We then subtract this 2 x [ 2 3 0 ] = [ 4 6 0 ] from the second row: [ 4 5 -5 ] - [ 4 6 0 ] = [ 0 -1 -5 ].

We then do the same calculation on the right-hand side, multiplying 8 by 2 and subtracting it from 9, getting -7.

One step closer to getting from our A to U!

Our next step is to eliminate the 2 in the bottom left hand corner. We will do this by multiplying the first row again, but this time, subtracting it from the third row, choosing a multiplier which will multiply with the 2 from the first row and cancel out the 2 in the third row.

This multiplier is simply 1. We multiply our first row by 1: 1 x [ 2 3 0 ] = [ 2 3 0 ], which we then subtract from [ 2 2 3 ]: [ 2 2 3 ] - [ 2 3 0 ] = [ 0 -1 3 ].

And don’t forget to do the same update to the right hand side — taking 8, multiplying it by 1, and subtracting it from 9 to get 1.

Notice now that we’ve cleared out all the numbers below the diagonal in the first column. We’re done with that column.

Before we move on to clearing that -1 in the second column, I’d like to point out something about the eliminations we’ve been doing.

Pivots

So far, we’ve based our choice of multiplier on the 2 in the first row, first column: it is the number that multiplies with the multiplier to eliminate the numbers below it. This makes it significant, in this column anyway. We call this number the pivot. More specifically, the first pivot, since it is the first of the three pivots in our matrix.

Right now, the three pivots in our matrix are the diagonal entries 2, -1, and 3. I stress the importance of that right now part: we don’t really consider pivots until we get to them. You will see how our third pivot, 3, will no longer be 3 once we do another step of elimination. Steps of elimination often alter the rows below, and with them, the pivots.

Pivots will come into more play later. For now let’s return to our example.

We now have to get rid of the -1 in the third row, second column (the (3, 2) position). To do this, we use our second pivot. Since the first column below the first pivot is all zeros and won’t impact our calculations, we can almost think of this as finding the upper triangular of a (2 x 2) system.

Just like before, we look at our second row and ask what number we should multiply it by for the -1 in the second row to cancel out the -1 in the third row. This multiplier is 1, since when multiplied it’ll make -1 (-1 x 1 = -1), and when subtracted from the next row, it will leave 0 (-1 - (-1) = -1 + 1 = 0).

Again, we do the same calculation with the right hand side, multiplying our -7 by 1 and subtracting it from 1 to get 8 (1 - (-7) = 1 + 7 = 8).

Now, our original matrix A has become the upper triangular U, and our right hand side has gone from b, the original, to c, the modified one:

[ 2  3  0 |  8 ]
[ 0 -1 -5 | -7 ]
[ 0  0  8 |  8 ]

It can now be easily solved, bottom up, using back-substitution.

Starting from the bottom, 8z = 8 gives z = 1; then -y - 5(1) = -7 gives y = 2; and finally 2x + 3(2) = 8 gives x = 1. So the answer is z = 1, y = 2, and x = 1.
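The whole (3 x 3) computation fits in a short NumPy sketch. (The entries below are reconstructed from the arithmetic in the text, since the original figures aren’t reproduced; in particular, the -5 in the second row is inferred from the -7 and the final pivot 8.)

```python
import numpy as np

# The system worked above:
#   2x + 3y      = 8
#   4x + 5y - 5z = 9
#   2x + 2y + 3z = 9
aug = np.array([[2.0, 3.0,  0.0, 8.0],
                [4.0, 5.0, -5.0, 9.0],
                [2.0, 2.0,  3.0, 9.0]])

n = 3
# Forward elimination: for each pivot column, zero out the rows below.
for j in range(n - 1):
    for i in range(j + 1, n):
        m = aug[i, j] / aug[j, j]   # multipliers 2, 1, then 1, as above
        aug[i] -= m * aug[j]

# Back-substitution, bottom row first.
x = np.zeros(n)
for i in range(n - 1, -1, -1):
    x[i] = (aug[i, n] - aug[i, i + 1:n] @ x[i + 1:]) / aug[i, i]
print(x)   # [1. 2. 1.]  ->  x = 1, y = 2, z = 1
```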

To better understand Gaussian elimination we must ask a critical question: when does Gaussian elimination break down? There are two cases: temporary failure and permanent failure.

A zero can never sit in the position of a pivot. When this happens, elimination breaks down either temporarily or permanently.

Let’s see how.

Temporary breakdown and row exchanges (non-singular)

When a zero appears anywhere but the last pivot position, we have a temporary breakdown: such systems are still non-singular and have a single solution.

Say the first pivot position holds a 0, with a 2 and a 5 below it in the first column. We cannot eliminate the 2 or the 5, because there is nothing we can multiply 0 by to cancel them out: 0 x any number = 0.

What do we do to solve this? A row exchange. Since the number in the first pivot place is 0, we can switch the first and second rows, so that the zero does not cause us trouble any more.

Now we have a workable pivot. As an added bonus, we don’t need to eliminate the number in the (2, 1) position, since it’s already 0.
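In code, the fix is a literal row swap. Here’s a small sketch with a helper of my own naming (not a standard routine): it looks below a zero pivot for a row with a nonzero entry in that column and exchanges the two rows, right-hand side included. Only the first column [0, 2, 5] comes from the example above; the other entries are made up for illustration:

```python
import numpy as np

def exchange_rows_if_needed(aug, j):
    """If pivot aug[j, j] is zero, swap in a lower row with a nonzero
    entry in column j (the right-hand side swaps along with the row)."""
    if aug[j, j] != 0:
        return
    for i in range(j + 1, len(aug)):
        if aug[i, j] != 0:
            aug[[j, i]] = aug[[i, j]]   # exchange rows j and i
            return
    raise ValueError(f"no usable pivot in column {j}: matrix is singular")

# Hypothetical augmented matrix: first column [0, 2, 5] as in the example;
# the remaining entries are invented for illustration.
aug = np.array([[0.0, 1.0, 4.0, 2.0],
                [2.0, 3.0, 1.0, 5.0],
                [5.0, 2.0, 2.0, 7.0]])
exchange_rows_if_needed(aug, 0)
print(aug[0])   # [2. 3. 1. 5.]  -- rows 1 and 2 exchanged
```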

This same problem can happen in the second column as well. Say we start off with a rather unassuming (3 x 3) matrix like this:

After we use elimination to get rid of the first column (the 4), we’re left with a problem.

The second pivot has turned into a 0! We can do the exact same thing to fix this, but this time with a row exchange between rows 2 and 3. This fixes the system, and elimination can then finish solving it.

Remember, when doing a row exchange in an augmented matrix, we must also exchange the right hand side b.

This makes a lot of sense if you think of it in terms of equations. Switching the order of the equations around obviously requires you to switch their answers around with them; taking a look at the equation picture of this makes that clear.

Okay, so when is breakdown permanent?

Permanent breakdown, i.e., the singular case

When we encounter permanent breakdown, it is because we have either not enough information in the system or conflicting information in it. The former creates infinite solutions, and the latter creates no solutions.

Cutting to the chase: breakdown is permanent when the last pivot is zero. Logically, this makes sense. When you’re at the last row and encounter a zero, there are no more rows below it to switch with, and you can’t switch with a row above (it would ruin the upper triangular form you’re building toward).

Singular system case #1: Infinite Solutions

Let’s look at these in (2 x 2) matrices, since it’s quite easy to understand that way. Here is a system that is singular.

Notice that when we do elimination, we get:

Oh no — the last pivot is a 0, meaning it is singular. Let’s look at what this means as a system of equations.

We can clearly see that this system of equations has infinite solutions. Since the second equation evaluates to 0 = 0, which is true no matter what the x and y values are, we’re left with just one equation to describe two unknowns, giving an infinite number of solutions.

Looking back at the original system, it’s easy to see why we have an infinite number of solutions.

The second equation adds no information to our system. It is simply the first equation multiplied by two. We are left with the same amount of information as if we only had one equation: not enough restrictions to come up with a single value of x and y.

This occurs in a multitude of situations. In a nutshell, this infinite-solutions problem occurs when one (or more) of our equations is made up of some combination of the other equations in the system.

Here are two less obvious examples of this same type of infinite-solutions case.

In both of these cases, we end up with two equations to describe 3 unknowns. This is not enough restrictions, and we thus get an infinite number of solutions.

Singular system case #2: No solutions

The second type of permanent breakdown, where we end up with no solutions, is quite similar. In essence, it occurs when the left hand sides of our equations are some combination of each other, but the right hand side contains conflicting information. Let’s see what this means in a simple (2 x 2) example.

This is identical to the system we explored in the infinite solutions case, except that I have changed the 12 in b to a 13.

You might already see where this is going. Let’s use Gaussian elimination to simplify the system of equations. We end up with this:

If we write this out in equation form, we can see that it contains conflicting information.

0 does not equal 1. Thus this equation is untrue for any unknowns x or y. This system has no solutions.

This type of no-solution singular case occurs when the left hand sides are some combination of each other, as they are here, but the right hand sides do not match up to the same combination. Before, we had infinite solutions because the right hand side matched up and we ended up with 0 = 0. But since we changed the right hand side so it no longer matches the combination on the left, we get no solutions instead of an infinite number.

Our two other examples in the infinite case become examples for no solution cases if we just change the right hand sides to make them inconsistent.

Really, changing the 9 and the 10 to any other numbers would result in no solution.
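Here’s a small sketch that tells the two permanent breakdowns apart for a (2 x 2) system: eliminate, then look at the last row. The coefficients below are hypothetical stand-ins (the original figures aren’t reproduced), but they follow the pattern above: the second equation is twice the first, with b = [6, 12] for the infinite case and b = [6, 13] for the no-solution case:

```python
import numpy as np

def classify_2x2(aug):
    """Eliminate, then read the last row: a zero pivot with a zero
    right-hand side (0 = 0) means infinite solutions; a zero pivot
    with a nonzero right-hand side (0 = 1) means no solution."""
    aug = aug.astype(float)
    aug[1] -= (aug[1, 0] / aug[0, 0]) * aug[0]
    pivot, rhs = aug[1, 1], aug[1, 2]
    if pivot != 0:
        return "unique solution"
    return "infinite solutions" if rhs == 0 else "no solution"

print(classify_2x2(np.array([[1, 2, 6], [2, 4, 12]])))  # infinite solutions
print(classify_2x2(np.array([[1, 2, 6], [2, 4, 13]])))  # no solution
```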

But how do we tell if a system is singular without going all the way through elimination? This, like many concepts in linear algebra, requires geometric understanding.

Geometric view of the singular case

This part requires a good understanding of span and linear combinations. If you’re iffy on those, this video explains it all.

As well as that, we must look at both the row picture and the column picture of a matrix. This involves looking at a matrix no longer as a system of equations, but as vectors in space. If you’re unfamiliar with those views, this lecture covers it all superbly.

Here’s a super quick one-picture refresher on row picture and column picture.

When using the row view to analyze a (2 x 2) system, a singular case can mean two different things: the two lines are parallel and never meet (no solutions), or they are the same line and meet everywhere (infinite solutions). Let’s use the same examples as last time.

In higher dimensions, there are more and more ways for the planes to intersect that produce either infinite or no solutions. For example, in 3D, two planes can intersect in a line, but the third plane can miss that intersection, so there are no solutions for the system as a whole; or all three planes could intersect along one common line, giving an entire line of solutions.

But now let’s look at this in terms of linear combinations and span in the column form of our system to get some more intuition.

Notice that the source of the problem in both of these singular cases is that the vectors are colinear, and thus can only describe a single line as a combination: their span is one-dimensional and exists only on the line they share.

The 3D geometric counterpart to this would be all three 3D column vectors lying on the same plane (the third being some combination of the first two, say (a, b, c) and (e, f, g)), causing the span of the three vectors in total to be only that shared plane. If the answer, also a 3-dimensional vector, does not lie on the plane (which, probability-wise, is likely), there is no solution. And if the answer does lie on the plane, there are an infinite number of combinations of the three vectors that make that answer.

Back to 2D systems — we’ve identified that the problem is that they have colinear vectors. How do we identify this in advance? It’s a simple test.

Take the columns of your matrix A and put them in the form of a linear combination. Is there any way for the vectors to combine to form the zero vector with weights x and y that are not both 0?

Why does this work as a litmus test for whether a system contains dependent vectors? Well, if we can find a way to cancel out the vectors using a combination of the other vectors in the system, it means that one of the vectors must be made up of some combination of the others.

And obviously, we don’t count x = y = 0, since that would work for any set of vectors. Here, there is an answer that is nonzero.
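In code, one standard check for this (the determinant; it’s not the test described above, but it answers the same question) is zero exactly when the columns are dependent. The columns below are hypothetical colinear vectors, with the second twice the first:

```python
import numpy as np

# Hypothetical colinear columns: col2 = 2 * col1.
col1 = np.array([1.0, 2.0])
col2 = np.array([2.0, 4.0])
A = np.column_stack([col1, col2])

# det(A) == 0 exactly when the columns are dependent (colinear in 2D).
print(np.linalg.det(A))     # 0.0 (up to floating-point sign/rounding)

# Exhibit a nonzero combination that reaches the zero vector:
# 2*col1 + (-1)*col2 = 0, so x = 2, y = -1 works.
print(2 * col1 - col2)      # [0. 0.]
```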

In summary, we can solve a system of equations by using matrix notation to create a coefficient matrix A. We put this in either Ax = b or augmented form to solve. We convert this matrix A into an easily-solvable upper triangular U using multiply-and-subtract elimination on the rows, applying the same operations to both sides of the equality sign (turning b into c on the other side).

We then use back-substitution to solve the system bottom-up.
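Putting the pieces of this article together, here is a compact sketch of the full pipeline: forward elimination with a row exchange whenever a pivot is zero, then back-substitution. It’s an illustration of the hand method above, not production code (real libraries also exchange rows to avoid small pivots, for numerical accuracy):

```python
import numpy as np

def gaussian_solve(A, b):
    """Solve Ax = b by elimination + row exchanges + back-substitution."""
    n = len(b)
    aug = np.hstack([A.astype(float), b.astype(float).reshape(-1, 1)])
    for j in range(n):
        if aug[j, j] == 0:                     # temporary breakdown?
            below = np.nonzero(aug[j + 1:, j])[0]
            if len(below) == 0:                # permanent: singular
                raise ValueError("zero pivot with no row to exchange")
            k = j + 1 + below[0]
            aug[[j, k]] = aug[[k, j]]          # row exchange
        for i in range(j + 1, n):              # clear below the pivot
            aug[i] -= (aug[i, j] / aug[j, j]) * aug[j]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):             # back-substitution
        x[i] = (aug[i, n] - aug[i, i + 1:n] @ x[i + 1:]) / aug[i, i]
    return x

A = np.array([[2, 3, 0], [4, 5, -5], [2, 2, 3]])
b = np.array([8, 9, 9])
print(gaussian_solve(A, b))   # [1. 2. 1.]
```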

That’s all the information you need to carry out and understand elimination. There are many more efficient and interesting ways to perform this elimination, which I will cover in the following two articles.

The next article will deal with viewing these transformations, eliminations and row exchanges as matrix multiplications — essentially, finding matrices that we can multiply our A by to carry out elimination in the same way we just did manually.

Adam Dhalla is a high school student out of Vancouver, British Columbia, currently in the STEM and business fellowship TKS. He is fascinated with the outdoor world, and is currently learning about emerging technologies for an environmental purpose. To keep up, follow his Instagram and his LinkedIn. For more similar content, subscribe to his newsletter.
