# Real-time learning in data streams using stochastic gradient descent

In a typical supervised learning problem we have a fixed volume of training data with which to fit a model. Imagine, instead, we have a constant stream of data. Now, we could dip into this stream and take a sample from which to build a static model. But let’s also imagine that, from time to time, there is a change in the observations arriving. Ideally we want a model that automatically adapts to changes in the data stream and updates its own parameters. This is sometimes called “online learning.”

We’ve looked at a Bayesian approach to this before in Adaptively Modelling Bread Prices in London Throughout the Napoleonic Wars. Bayesian methods come with significant computational overhead because one must fit and continuously update a joint probability distribution over all variables. Today, let’s look at a more lightweight solution called stochastic gradient descent.
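To make the idea concrete, here is a minimal sketch of online learning with stochastic gradient descent, assuming a linear model with squared-error loss. The function names, the learning rate, and the simulated stream (whose underlying relationship changes halfway through) are all illustrative assumptions, not taken from a particular dataset or library.

```python
import random


def sgd_update(weights, bias, x, y, learning_rate=0.01):
    """Take one SGD step on a single observation (squared-error loss)."""
    prediction = sum(w * xi for w, xi in zip(weights, x)) + bias
    error = prediction - y
    # Gradient of 0.5 * error**2 with respect to each weight is error * x_i,
    # and with respect to the bias is error.
    new_weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


def simulated_stream(n, seed=0):
    """Hypothetical stream: y = 2x + 1 at first, then y = -x + 4 after a change."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.uniform(-1, 1)
        slope, intercept = (2.0, 1.0) if t < n // 2 else (-1.0, 4.0)
        yield [x], slope * x + intercept + rng.gauss(0, 0.05)


weights, bias = [0.0], 0.0
for x, y in simulated_stream(4000):
    # Each observation nudges the parameters; no refit over past data is needed.
    weights, bias = sgd_update(weights, bias, x, y, learning_rate=0.05)

print(weights, bias)
```

Because every observation adjusts the parameters a little, the model drifts toward the new relationship after the change without storing or refitting on earlier data. The learning rate controls how quickly it forgets the old regime.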

In this post I’ll demonstrate one way to use Bayesian methods to build a ‘dynamic’ or ‘learning’ regression model. When I say ‘learning,’ I mean that it can elegantly adapt to new observations without requiring a refit.

Today we lay our scene in London between the years 1760 and 1850. The good people of London have bread and they have wheat, which is, of course, what bread is made from. One would expect that the price of bread and the price of wheat are closely correlated and generally they are, though in periods of turmoil things can come unstuck. And Europe, between 1760 and 1850, was a very eventful place.

# Archaeological Illustration of Bayesian Learning

This is the first of a series of posts on Bayesian learning. The learning mechanism which is built into Bayes’ theorem can be used in all sorts of clever ways and has obvious uses in machine learning. Today we’ll build our intuition of how the mechanism works. In the next post, we’ll see how we can use that mechanism to build adaptive regression models.

Bayesian statistics work in the same way as deductive reasoning. Bayes’ theorem is a mathematical form of deductive reasoning. It took me a long time to understand this simple but profound idea. Let me say it again: Bayes’ theorem gives us a mathematical way to do deductive reasoning. It allows us to take a piece of evidence, weigh it against our prior suspicions, and arrive at a new set of beliefs. And if we have more evidence, we can do it over and over again.
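For reference, the standard statement of Bayes’ theorem is shown below. The symbols are my own notation for this sketch: `H` stands for a hypothesis (our suspicion) and `E` for a piece of evidence.

```latex
p(H \mid E) = \frac{p(E \mid H)\, p(H)}{p(E)}
```

Here `p(H)` is the prior belief, `p(E | H)` is how likely the evidence would be if the hypothesis were true, and `p(H | E)` is the updated belief. When more evidence arrives, the updated belief simply becomes the prior for the next round.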

Let’s do an example. It occurred to me that archaeology is an excellent illustration of how this works.

## We are archaeologists

We’re excavating the site of a Roman village. Our job is to try to pinpoint what year(s) the village was likely inhabited. In probability terms, we want to work out `p(inhabited)` for a range of years. As we uncover relics like pottery and coins, we use that evidence to refine our beliefs about when the village was inhabited.
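As a rough sketch of that refinement loop, the snippet below applies Bayes’ theorem repeatedly over a handful of candidate years. Everything in it is an assumption made for illustration: the candidate years, the flat prior, and the likelihood numbers for a coin and a piece of pottery are invented, and treating the years as mutually exclusive hypotheses is a simplification.

```python
def bayes_update(prior, likelihood):
    """Combine a prior with p(evidence | hypothesis) and renormalise."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("evidence is impossible under every hypothesis")
    return {h: value / total for h, value in unnormalised.items()}


# Hypothetical candidate years (AD) with a flat prior: no initial preference.
years = [50, 100, 150, 200]
beliefs = {year: 1 / len(years) for year in years}

# Invented likelihoods: how probable each find would be if the village
# had been inhabited in that year.
coin_likelihood = {50: 0.1, 100: 0.4, 150: 0.4, 200: 0.1}
pottery_likelihood = {50: 0.05, 100: 0.2, 150: 0.5, 200: 0.25}

# Each find updates our beliefs; the posterior becomes the next prior.
for evidence in (coin_likelihood, pottery_likelihood):
    beliefs = bayes_update(beliefs, evidence)

most_likely_year = max(beliefs, key=beliefs.get)
print(beliefs, most_likely_year)
```

The order in which the finds are processed does not matter here: the final beliefs depend only on the product of the likelihoods, which is why evidence can be folded in one piece at a time.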

# Bayesian Vitalstatistix: What Breed of Dog was Dogmatix?

In the Asterix comics we are never told just what breed Obelix’s little dog Dogmatix is. In fact, he is specifically referred to as being an “undefined breed” in his first appearance in Asterix and the Banquet.

I’ve compiled four hypotheses:

• West Highland White Terrier: In the first two Asterix movies, the part of Dogmatix was played by a West Highland White Terrier (a “Westie”). Of all the present-day dog breeds, Westies look most like Dogmatix. There is one major hole in the Westie hypothesis: it’s extremely unlikely that a Westie would be found in Gaul circa 50BC. That’s because the dogs didn’t exist back then.

• Melitan: The Melitan was a breed of lapdog popular with Romans and Greeks in ancient times. They were small companion dogs, pampered things that existed solely to amuse their owners, typically noblewomen. “The Melitan was a small, fluffy, spitz-type dog, commonly white in colour… with a very appealing, pointed, fox-like muzzle.” In appearance, they sound nothing like Dogmatix. But the Melitan hypothesis is very strong for another reason: these small dogs actually existed as pets in 50BC. There’s no record of them amongst Gauls, however. We’d have to imagine that rowdy, warring Obelix came into possession of a small Roman lapdog during a raid.

• A Gallic war dog, a.k.a. Irish Wolfhound: What we call an Irish wolfhound is a very ancient breed of dog which Celts were known to use as guard dogs and in combat. Caesar is supposed to have mentioned them in his Commentaries on the Gallic War, though I couldn’t find the reference. This is the only breed of dog which we might realistically find in a Gaulish village in 50BC, which makes it a good hypothesis for Dogmatix. The obvious flaw: wolfhounds are huge. Maybe Dogmatix could be a wolfhound, but he’d have to be an anemic, albino, malformed runt of a wolfhound. He is known to accompany Obelix on raids and bite Romans.

• Schnauzer: The last hypothesis: Dogmatix could be a schnauzer. This is unlikely for so many reasons: (1) schnauzers didn’t exist in 50BC; (2) they aren’t native to Gaul/France; (3) they look nothing like Dogmatix except, except, except that schnauzers have whiskers that perhaps vaguely resemble Dogmatix’s unusual moustache.

Which breed does Dogmatix belong to…?

Our goal is to determine which of these four breeds Dogmatix belongs to. First, we need some data, then we need a method for decision making.