Breathing in the crisp autumn air and staring out over what had been his home, neighborhood, close friend, and therapist for the past two years, Henry David Thoreau lightly closed the fragile cedar-wood door of his cabin for the last time.

A few years ago, I remember being faced with an introductory physics problem. It was the first time I had encountered the idea of acceleration, and I was puzzled by the notation for it.

*This article explains the probability theory that underlies Naive Bayes, so if you’re looking for a theoretical understanding, see that.*

*I have a GitHub repository of my homemade Naive Bayes classifier **here.** It includes a submission to the Titanic Dataset. In my experiment, it actually scored 5% higher than the built-in Scikit-Learn Naive Bayes.*

The nice thing about Naive Bayes is that the computations that underlie it are quite simple, as opposed to something like a neural network or even a support vector machine. We can create our own `MultinomialNaiveBayes()` class, which takes in a matrix of…

*If you are looking for the practical implementation of a Naive-Bayes model from scratch, part II of this article explains that. I highly encourage you to read this first to understand the theory:*

The rise of deep learning has led many of us to forget the importance (and often, superiority) of *shallow learning algorithms* like Naive Bayes.

Different learning algorithms use different branches of mathematics to arrive at a sensible conclusion on some set of data. Modern neural networks use a combination of linear algebra and multivariable calculus, archaic perceptrons used linear algebra and a simple addition/subtraction update rule…

*The mathematics here is best tackled by readers who have completed an introductory linear algebra course. For those just looking for the code implementation, visit the GitHub repository **here**.*

The Normal Equations for Least Squares are ubiquitous and for good reason. Apart from very large datasets, the Normal Equations are easy to use, easily generalizable to datasets with multiple variables, and easy to remember.
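To make the idea concrete, here is a minimal sketch (not the code from the linked repository) of the Normal Equations for the simplest case: fitting a line y = b0 + b1·x. With a design matrix X whose columns are all ones and the x values, the coefficients solve (XᵀX)b = Xᵀy, which here is only a 2×2 system. The function name `fit_line` and the sample data are assumptions made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line y = b0 + b1*x via the Normal Equations (X^T X) b = X^T y."""
    n = len(xs)

    # Entries of X^T X for the design matrix X = [1, x]:
    # [[n,      sum(x)  ],
    #  [sum(x), sum(x^2)]]
    sx = sum(xs)
    sxx = sum(x * x for x in xs)

    # Entries of X^T y: [sum(y), sum(x*y)]
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Solve the 2x2 system with Cramer's rule.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical, so the fit is not unique")
    b0 = (sy * sxx - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(intercept, slope)
```

With more than one input variable the same equation holds, but XᵀX grows with the number of columns, so in practice one would hand the system to a linear algebra library rather than solve it by hand.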

*I’ll outline my theoretical approach to the problem here, using only a few snippets of Python so you can get an idea of what is going on. The full code is linked on GitHub.*

Balancing chemical equations is a common activity in high-school classrooms and beyond. The question is (as it is for nearly any activity): can we automate this process?

The answer is a bold yes, and there are a few ways that we can approach the problem. Mine might not be the fastest, but it is accurate for *all* chemical equations that do not include…

*This article assumes the reader is comfortable with the content covered in any introductory linear algebra course: orthogonality, eigendecompositions, the spectral theorem, Singular Value Decomposition (SVD)…*

Confusion about the proper way to do Principal Component Analysis (PCA) is almost inevitable. Different sources espouse different methods, and any learner quickly deduces that PCA isn’t really a *specific algorithm*, but a series of steps that may vary, with the final result being the same: data that is simplified into a more concise set of features.

After talking about the basic goal of PCA, I’ll explain the mathematics behind *two* commonly shown ways…

This post is a little unusual, since I don’t have any stories, algorithms, math or philosophy to write about, but I spent the past week putting together a complete lecture on the mathematics that underlie neural networks.

It’s about 5 hours long, but the description includes timestamps for every subject covered in the video, which number in the twenties or thirties.

Here’s the syllabus of covered topics if you’d like.

If you were to gather a group of scientists from 1962 and ask them about their outlooks on the future and potential of artificial intelligence in solving computationally hard problems, the consensus would be generally positive.

If you were to ask the same group of scientists a decade later in 1972, the consensus would appear quite different, and contrary to the nature of scientific progress, it would be a lot more pessimistic.

We can attribute this change in attitude to the rise and fall of a single algorithm: the *perceptron.*

The perceptron algorithm, first proposed in 1958 by Frank Rosenblatt…

*This is a continuation of my Linear Algebra series, which should be viewed as an extra resource while going along with Gilbert Strang’s class 18.06 on OCW. This can be closely matched to **Lecture 16** of his series.*

*This article requires understanding of the four fundamental subspaces of a matrix, projection, projection of vectors onto planes, projection matrices, orthogonality, orthogonality of subspaces, elimination, transposes, and inverses. I would highly recommend understanding everything in **Lecture 15**.*

In a previous article, I wrote about fitting a line to data points on a two-dimensional plane in the context of linear regression with gradient descent…

15-year-old learning about machine learning, as well as a lifelong naturalist. Climate activist in Vancouver. Writer. Visit me @ adamdhalla.com