Karin Knudson

February 22, 2022

This is another post in the spirit of: "get to know your friendly neighborhood distributions." Today, we consider the t-distribution, also known as Student's t-distribution.

Can I interest you in a blog post on that?

(Read post)

August 27, 2021

A little bit of ARIMA (AutoRegressive Integrated Moving Average models)

The goal of this post is to give a bit of intuition and basic background about how ARIMA models for time series work. Why? 1) For fun, and 2) because they are useful, both on their own and as building blocks that show up in larger or fancier models.

We'll start with the picture this time...

(Read post)

February 22, 2021

My Python project setup basics (with tests and linting)

I have this problem that when I go to start a new Python project, I often realize that I've... forgotten some basic things about how I like to set up a Python project. And if, as I get more practice, I'm getting better at remembering those things, I'm also having more occasion to explain to someone else how I like to set up a project.

So, I put together this post for the dual purpose of 1) replacing (or at least complementing!) the digging around in an old project for templates for myself and trying to remember things I totally used to know how to do and 2) serving as a resource or explainer that I can share with others.

(Read post)

July 31, 2020

Expectation Maximization (EM)

Trying to maximize a likelihood or a posterior? Got some latent variables on hand? The EM algorithm might be useful for you.

(Read post)

June 14, 2020

A reading list about race and racism in statistics, math, machine learning and data science

May 3, 2020

Marathon mapping with D3, GeoPy, and a little Node.js

This post outlines the steps I took to make a set of visualizations of US marathons by month of race, date of inception, and temperature.

(Read post)

March 22, 2020

Kullback-Leibler divergence

Quick, what's better than one probability distribution?

That's right, two probability distributions.

Once you have two probability distributions, though, it is likely only a matter of time before you find yourself comparing them to each other. Are they similar, or different, you might ask, and to what extent? If one of them represents ground truth, how far is the other one from it? Since these are common questions, there are several widely used tools to help answer them.

In this post and an upcoming one, we will explore two widely used ways of quantifying how one distribution differs from another: Kullback-Leibler (KL) divergence (also called relative entropy), and the Wasserstein metric. This post discusses KL divergence. We will define KL divergence, explore it graphically, then look deeper into the definition with a few different interpretations.
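For a concrete feel before diving in, here is a minimal sketch of KL divergence for two discrete distributions, using the standard definition KL(p || q) = sum over i of p_i * log(p_i / q_i). The function name `kl_divergence` and the two example distributions are illustrative assumptions, not taken from the post.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same outcomes.

    p and q are equal-length sequences of probabilities that each sum to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # terms with p_i = 0 contribute nothing (0 * log 0 := 0)
        if q_i == 0:
            return math.inf  # p puts mass where q puts none
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical example distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]

print(round(kl_divergence(p, q), 4))  # 0.5108
print(round(kl_divergence(q, p), 4))  # 0.3681
```

Swapping the arguments gives a different number, which is one reason KL divergence is called a divergence rather than a distance.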

(Read post)

January 5, 2020

Poisson distributions, gamma distributions, and Poisson processes

This post is meant to introduce some friendly and useful distributions — the Poisson distribution and the gamma distribution — and to get some intuition for the related Poisson process. As a bonus, we'll see the exponential and negative binomial distributions appear along the way, with the negative binomial appearing in the role of prior predictive distribution.

Poisson distributions are useful when you have whole number counts of events. Gamma distributions are a flexible family of distributions that generalize some other distributions that might be familiar to you: the exponential and chi-squared distributions. Poisson processes can be used to model events occurring with certain independence properties in space and/or time. Exponential distributions show up in modeling wait times between events, among other applications. If these sound like a useful bunch of distributions for modeling in many settings, they are.

As we work with these distributions, we will take advantage of the opportunities that arise along the way to become more familiar with ideas about working with priors, likelihoods, and posterior distributions, recognizing and using conjugacy properties when it is reasonable to do so, and looking at the role of hyperparameters. When we encounter the negative binomial distribution it will be as a prior predictive distribution in our model, which is a useful-to-understand role for a distribution. We'll also take an optional detour to look at different parametrizations of the gamma distribution in more detail, which might sound boring until you — say — spend a bunch of time that you will never get back on failing to replicate someone's methods, only to realize that you and they were parametrizing the gamma distribution differently.
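To make the conjugacy idea concrete, here is a minimal sketch of the standard gamma-Poisson update: with a Gamma(shape, rate) prior on a Poisson rate and observed counts x_1, ..., x_n, the posterior is Gamma(shape + sum of counts, rate + n). The function name, the prior values, and the counts are all hypothetical, chosen only for illustration. The sketch also shows the parametrization trap: Python's `random.gammavariate` takes a shape and a scale, so a rate has to be inverted before sampling.

```python
import random


def gamma_poisson_update(shape, rate, counts):
    """Posterior (shape, rate) of a gamma prior after observing Poisson counts."""
    return shape + sum(counts), rate + len(counts)


# Hypothetical prior and data, for illustration only.
prior_shape, prior_rate = 2.0, 1.0
counts = [3, 5, 4]

post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, counts)
print(post_shape, post_rate)    # 14.0 4.0
print(post_shape / post_rate)   # posterior mean: 3.5

# random.gammavariate(alpha, beta) uses shape alpha and SCALE beta,
# so a rate parameter must be passed as 1 / rate.
draw = random.gammavariate(post_shape, 1.0 / post_rate)
```

The prior parameters can be read as pseudo-observations: a shape of 2 and a rate of 1 act like having already seen 2 events in 1 unit of exposure.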

Ready? Let's get started, beginning with the Poisson distribution.

(Read post)

December 18, 2019

A Git workflow with branches

This post is a reference and explainer for a basic collaborative git workflow. I felt like it took me forever to learn the basics of git, so I put this together for myself, and posted it in case it could be helpful for anyone else!

(Read post)

December 4, 2019

Beta distributions, Dirichlet distributions and Dirichlet processes

Today I am writing about some of my favorite distributions: the beta distribution, the Dirichlet distribution, and the Dirichlet process. All three are handy options to have at your disposal in specifying priors in certain Bayesian models, and they're also rather lovely in their own right.

One thing that makes these distributions really neat is that they can be thought of as distributions of distributions. That is, if we take a draw from one of these distributions, we get a result that itself fully describes a probability distribution. If we take another draw, we get another probability distribution. If we take a lot of draws, we get a whole bunch of distributions, with our beta distribution/Dirichlet distribution/Dirichlet process governing what kinds of distributions we might expect to see more or less commonly represented in the bunch.
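As a small illustration of "a draw is itself a distribution," here is a sketch that samples from a Dirichlet distribution using only Python's standard library, via the well-known construction of normalizing independent gamma draws. The helper name `dirichlet_draw` and the concentration parameters are assumptions made for this example; the post itself may use different tools.

```python
import random


def dirichlet_draw(alphas, rng=random):
    """One draw from Dirichlet(alphas), built by normalizing gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


# Hypothetical concentration parameters over three categories.
alphas = [10.0, 10.0, 10.0]

# Each draw is a list of three non-negative numbers summing to 1,
# that is, a probability distribution over the three categories.
draws = [dirichlet_draw(alphas) for _ in range(5)]
for d in draws:
    print([round(x, 3) for x in d])
```

With two categories this reduces to a beta distribution: the first component of `dirichlet_draw([a, b])` is distributed as Beta(a, b). Larger concentration parameters pull the draws closer together, and smaller ones spread them out.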

My goal today is to give some visual intuition for how these distributions of distributions work. Along the way, we'll also see examples of simple conjugate Bayesian analysis and hyperparameter interpretation for these distributions. We will begin with beta and Dirichlet distributions and build our way up to Dirichlet processes and beyond.

(Read post)

November 18, 2018

Learning resources for app development with iOS

Last year I created and taught a class on mobile app development for iOS to high schoolers. The prerequisite was one term of computer science, so all students had done some coding before, although in some cases they were still quite new to coding. It was a joy to teach coding via app development in part because it was such a motivating experience for students as they learned programming to be able to see the results of their creative and problem-solving efforts so quickly and concretely on their own phone.

Here are some resources for getting started creating iOS apps with Xcode and Swift.

Developing iOS 11 Apps with Swift by Stanford (iTunes U) - So much for saving the best for last. Stanford very generously has made its lectures and notes available for free through iTunes U. This is a GREAT course, and going through it in detail helped me learn what I needed to know in order to teach my own course. It assumes a background in object oriented programming, so might feel a bit fast paced if you are still pretty new to programming. This was not a resource I pointed my students to directly, but one I used for my own learning. Enjoy.
Coding iPhone Apps for Kids: A Playful Introduction to Swift by Gloria Winquist and Matt McCarthy (No Starch Press, 2017) - Don't let the title throw you off! This is a beautifully produced book with wonderfully clear and helpful exposition. Who - kid or adult - wouldn't appreciate that? The book gives a tour of core ideas of coding in Swift, and then walks through the creation of several apps that would give a nice jumping off point for building your own. It assumes no programming background, but I think it's a great resource even if you've done some coding but are just new to Swift.
App Development with Swift by Apple Education (iBooks) - This eBook is freely available and assumes no background in coding. It was a useful resource for my students to get a solid grounding in the basics of coding, Swift, and Xcode, and its section on HTTP and URL Sessions was particularly helpful in breaking down a complex topic to make it accessible to my students as they built their own apps. Its interactive quiz questions at the end of each chapter were a good way for students to give themselves a quick check for understanding. The book is meant to lend itself well to an academic course structure, so if you are a teacher new to the subject (as I was!) it could give helpful ideas for structuring a syllabus. I found myself wishing that it got to more exciting example applications more quickly with its labs and guided projects, especially as I was using it for a ten-week course, but with a little adaptation it worked out well for us.
raywenderlich.com - I haven't spent much time on this site yet, and some of the content seems to have been restructured since I last visited, but it is on my list of resources to explore more deeply before the next time I teach the course, as it seems to offer a number of helpful and practically oriented tutorials on different aspects of iOS development and Swift.
The Swift Programming Language (Apple) - I found Apple's language guide very clear and helpful for getting to know Swift. It's concise enough that you can reasonably read the whole guide. If you're on the newer side, you'll learn more about programming along the way, and if you've been programming for a long time, I think it will get you pretty quickly into what you need to know about Swift in particular. On its own, it won't teach you how to make apps, but it's a great resource to have on hand.
Human Interface Guidelines (Apple) - Since I'm coming from a math background, design principles for apps that are enjoyable to use feel pretty far outside my field. So, the idea that there are concrete guidelines out there seems magical and reassuring! I can't say I have delved into all of what is here, but each piece I have looked at has given me useful ideas to think about as I look to make, and help my students make, an app that someone would actually want to use!