Echelon form by P. Dewilde

Linear Algebra 3: Solving Ax = 0, Free Variables, and Pivot Variables, and Echelon Form

Last time we explored the null space, but how do we find the null space vectors? Additionally, how do we do these on nonsquare matrices?

--

This article assumes a base knowledge of all basic matrix and vector manipulation concepts, like dot products, linear combinations, as well as ideas involving span, vector spaces and subspaces, and column spaces.

Review of Null Space

Continuing off the last article, we introduced ideas of the null space.

To review, the null space is the vector space of some group of x that satisfy Ax = 0. x = 0 will always be a part of the null space, but if the matrix is not fully independent, it will also include a combination of vectors — the amount of dependent vectors is the amount of vectors in the linear combination that makes up the null space.

We left off with a good conceptual understanding of the null space, but how do we reliably find the combinations of x that properly make Ax = 0?

For example, the only way we know so far to find the null space is to intuitively, using some abstract mental math, think of a combination of x, y, and z that combine to form 0. Here it’s quite easy, after we realize the second column is just two times the first. Our (only) null space vector is c [-2, 1, 0] or something along that line.

But this gets hard when our dimensionality gets bigger, and we get more and more vectors, since there can be tens, hundreds, or even thousands of null space vectors, and we can’t use mere intuition anymore. We need an algorithm to follow.

In other words, we need to find the algorithm that can get us the exact null space vectors. How can we get c [-2, 1, 0] by using a reliable algorithm?

Pivot Columns and Variables, Free Columns and Variables

Let’s take a system with two dependent columns, which we can therefore expect to have two different vectors in our null space.

The second and third columns are just multiples of the first.

How do we find the two x that satisfy this? We use a system reminiscent of Gaussian Elimination, to turn our A into something like U, the upper triangular. If this sounds like witchcraft, check out my series on Gaussian Elimination.

First, let’s eliminate through our A to get our U matrix. I’m assuming you’re pretty comfortable with elimination at this point, so I’ll just skip to our fully eliminated A. We don’t have to worry about doing the same steps on the right hand side since it is just 0 and won’t change with our multiplication and subtraction steps.

What we did here was multiply the first row by two and subtract from the second row, and then multiplied the first row by three and subtracted from the second row.

Well, we have a bit of a problem. This surely doesn’t look like a proper upper triangular matrix. It doesn’t even fit the most important criteria — we have two pivot places that are occupied by zeros!

We have two zeros in our pivot places!

If we were to write out this as a system of equations, it becomes very clear the we won’t be able to find a single x, y and z to solve our system.

The second equations always evaluate to true, no matter the x, y or z. We have a ‘double infinity’ of possible values, and can only figure out one of the variables at most, if we set the other two as constants.

For example, we could set y to 1, z to 0, and we could solve for x, using the one equation we have. If you understand this concept, you’re well on your way to understanding Ax = 0, because that’s basically what we’re going to do to solve for x.

Returning back to our matrix.

We can see that only one of our pivots are nonzero, and that’s our first column’s pivot. This is what we call our pivot column. The other two we call free columns. I’ll explain why shortly. First, let’s go back to the linear combination view.

The variables associated with the pivot columns will be called pivot variables. The variables associated with the free columns are free variables. The pivot variable here is x, and the free variables are y and z. But why does this matter?

Remember how we saw that we can only solve for one variable if we set the other two as constant (since we have one equations and three variables)? This is exactly what we will do — we will solve for our pivot variable x and set our free variables x and y to a constant.

But there’s just one caveat — remember how since we have two dependent, which I will now call free, columns, we will have two vectors in the nullspace? How do we find these two distinct columns, and make sure that they aren’t actually just combinations or multiples of one another?

There’s an easy way to do this. Solve the system twice, once with y = 1 and z = 0, and the y = 0 and z = 1. Let’s first see how this works, and then I’ll explore why.

Reading this should make sense — I omitted the two zero equations 0 = 0 to make things less crowded. The z = 0 makes the 4 disappear so we are just left with x + 2, which becomes x = -2. We have now solved for our x, and we have already set our y and z as 1 and 0 respectively.

Now, we solve for the other null space vector, in which now z gets the 1 and y gets the 0.

Now we get our alternate solution, which is our second null space vector. Of course, for both of these, we can take any multiple of either, so our null space can be represented by the linear combination:

Since there are two independent columns, this will create a two dimensional plane in the three dimensional null space.

More generally, if we have more free variables, for each free variable, each will get it’s “turn” being equal to 1, while the rest of the free variables are equal to 0. For example if we have the free variables j, k, and l and pivot variable i, we will have three vectors in the null space, and we can find each one by solving three different versions of the same system of equations.

  • solving for i when setting j = 1, k = 0, l = 0
  • solving for i when setting j = 0, k = 1, l = 0
  • solving for i when setting j = 0, k = 0, l = 1

Okay, so that was great, we have a practical way to solve for the null space vectors. But why do we set each free variable to 1 “in turn” like this?

Intuition

It can be thought of that setting each free variable to 1 and the rest to zero for each free variable guarantees that we aren’t getting null space vectors that are just combinations or multiples of one another.

By setting different components of the vector to 0, we make sure that they can’t be multiples of another. Why? let’s look at a more general case.

There is no way we can have a c that results in 0 being in the 2nd spot other than c = 0, which doesn’t mean anything valuable. Thus, by switching the places we have zero on our different null space vectors, we guarantee that they cannot be multiples of one another.

Graphical Intuition for Ax =0

More importantly, we can get a graphical understanding of why this works.

Our null space must not contain any dependent vectors — that is why we are going through all this trouble to come up with null space vectors that are not multiples or combinations of each other.

By setting 0 and 1 to different free variables, we guarantee that each new null space vector will add a new dimension to the null space, and thus be independent. Let’s graphically look at what the null space looks like with our earlier example.

Notice that since we alternate the 1’s and 0’s for the free variables, each of our null space vectors will be flat along some plane. For our first vector, (-2, 1, 0), it lies flat in the z dimension, which can be seen in the picture. Our second vector (-4, 0, 1) lies flat against the y dimension, which is a little tougher to see but take my word on it.

We can’t graph this, but now let’s look at a hypothetical 5 dimensional null space (let’s say, with only one pivot column) with the following vectors.

If we can imagine a 5 dimensional plane of sorts — we can almost ‘see’ that each one of these vectors would be flat in most of the dimensions, but each not flat in one different dimension. This forces them to go in different directions, and thus each define an new dimension when added to the null space.

Now that we have a conceptual and practical understanding, let’s tackle a slightly more tricky system that’s rectangular.

Solving Rectangular Systems for 0

Since we can fully describe a 3 dimensional space with 3 vectors, and the column space of our vector A contains 4 vectors, at least one of them needs to be redundant.

We use the same elimination on A, but this time, we don’t worry about running into zero pivots — we just keep going. Since we have at least one dependent column, we’re expecting to run into them anyways .

After only one step of elimination (multiplying the first row by 2 and subtracting from the second) we run into this predicament. The pivot place (green) has a zero — and even worse, no row switching can help us, since the only row under it also has a zero in the same place. Instead of worrying, we go onto the next column, effectively skipping column 2.

Now let’s treat this two as our new pivot, and multiply / subtract to eliminate the row under it. We multiply it by one and subtract to make the number under the 2 equal to 0, which also eliminates the entire bottom row.

So, we’re left with a wonky upper triangular matrix U, which we will instead call the echelon form matrix. Echelon means staircase, and we can view our divide between 0 and nonzero entries as a staircase.

The pivots are highlighted in green.

We have only two pivots in our echelon form matrix. A pivot in an echelon form matrix is the first nonzero number in it’s row. In general, a matrix in echelon form looks like this:

by P. Dewilde

And remember what we’re doing — we’re just cancelling out different equations by multiplying other equations, just like in grade school.

So now, quite similarly, we can split our matrix up into pivot columns and free columns.

And, if we look at this in linear combination style, we can identify the free and pivot variables that multiply each of the pivot and free columns.

Thus, y and t are our free variables, and x and z our the pivot variables. Now that we also have it in upper triangular form, we can turn this back into a system of two equations and solve for our two different cases (y = 0, t = 1 and y = 1, t = 0).

Let’s first solve when y = 0 and t = 1.

Let’s now solve when y = 1 and t = 0.

So now, we have a set of two vectors that can be multiplied by any constant, which will make up our null space.

And lastly, as a finishing touch, it is common to display the null space vectors in a single matrix N, which has the null space vectors as it’s columns. So, we could show N as:

We can’t really draw this, since it exists in a four-dimensional nullspace, which is impossible to draw, but, we can imagine that this creates a two dimensional plane in some four dimensional space.

And that’s it! That’s all you need to know to solve for any vector in the null space in any dimension, any shape of matrix at all.

--

--

adam dhalla

17 y old learning about machine learning, as well as a lifelong naturalist. Climate activist in Vancouver. Writer. Visit me @ adamdhalla.com