What is a 'cost function' in Machine Learning / ML

Christopher Simpson

Nov 14, 2024 • 3 min read


I don't really know, deeply, what a cost function is in Machine Learning. Let's learn it.
All quotes are from the excellent "Hands-On Machine Learning with Scikit-Learn and TensorFlow, 1st Edition" unless otherwise stated.

In "Hands-On Machine Learning with Scikit-Learn and TensorFlow, 1st Edition", a cost function is defined as:

"You can either define a utility function (or fitness function) that measures how good your model is, or you can define a cost function that measures how bad it is. For linear regression problems, people typically use a cost function that measures the distance between the linear model's predictions and the training examples; the objective is to minimize this distance."
- Hands-On Machine Learning with Scikit-Learn and TensorFlow, 1st Edition, 2017, Aurélien Géron

As I'm reading this I ask myself:

How can a 'cost function' determine what is a good model vs a bad model? What is it comparing or looking at? How does the function 'know' or 'calculate' that, and how can we 'make' or 'create' a function which measures the 'goodness/badness' of a model?
- I like that I'm now aware that people sometimes create functions which measure how good a model is (aka "a utility function (or fitness function) that measures how good your model is").
- I'm starting to imagine that the output of these 'cost functions' might be something like 'hey, that model is 10% correct' for a utility function, or '10% bad' for a cost function. Or maybe it's more often expressed as a number, say 1 meaning 100% correct or 0.5 meaning 50% bad. Let's find out. Do cost functions evaluate the model's 'goodness' as a whole, or individual measurements?
- Almost. Here are some realistic examples from the experts.
Let's break the definition down further

"For linear regression problems, people typically use a cost function that measures the distance between the linear model's predictions and the training examples; the objective is to minimize this distance."

OK, that's great if you come from a mathematics background. I'm left with the following:

How does one 'measure the distance between the linear model's predictions and the training examples?'
I don't like that these explanations sometimes read (to me) as if the model were a sentient, living thing that can be 'fed' information and 'do things'. That's too vague: "You feed it your training examples and it finds the parameters that make the linear model fit best your data"
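To take some of the mystery out of it, here is a minimal sketch of one common way to 'measure the distance': mean squared error (MSE). The training data, the function names `predict` and `mse_cost`, and the two candidate lines are all made up by me for illustration; they are not from the book. The point is that the cost function takes the gap between each prediction and its training example, then collapses all those gaps into a single number for the whole model. 'Finding the parameters' then just means searching for the slope and intercept with the lowest cost, which NumPy's `np.polyfit` does here with a least-squares fit.

```python
import numpy as np

# Hypothetical training examples: x is the input, y is the known answer.
# They were generated from y = 2x + 1, so a perfect line exists.
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([3.0, 5.0, 7.0, 9.0])


def predict(x, slope, intercept):
    """A linear model: a straight line."""
    return slope * x + intercept


def mse_cost(x, y, slope, intercept):
    """Mean squared error between the model's predictions and the examples."""
    errors = predict(x, slope, intercept) - y  # one error per training example
    return float(np.mean(errors ** 2))         # one number for the whole model


# Two candidate models, each scored as a single number (lower is better).
cost_bad = mse_cost(x_train, y_train, slope=1.0, intercept=0.0)
cost_good = mse_cost(x_train, y_train, slope=2.0, intercept=1.0)
print("cost of y = 1x + 0:", cost_bad)
print("cost of y = 2x + 1:", cost_good)

# "It finds the parameters": a least-squares fit of a degree-1 polynomial
# picks the slope and intercept that minimize the squared error.
best_slope, best_intercept = np.polyfit(x_train, y_train, deg=1)
print("fitted slope and intercept:", best_slope, best_intercept)
```

So the output is not a percentage. It is a raw number in the (squared) units of whatever you are predicting, where 0 would mean the line passes through every training example.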

We need to know what vectors are. This is the best explanation of 'What is a vector' I've read: it uses a rocket 🚀 as an example and goes all the way to why you're not killing it on YouTube with your subscribers (how Google evaluates spammy video content vs engaging content). What could be more fun?

Above is a visualization using Python and Matplotlib to demonstrate vectors as points in different dimensions:

Left Plot (2D): Shows a 2D coordinate plane with a point at (3, 4) and an arrow from the origin to this point, representing the vector.
Right Plot (3D): Displays a 3D coordinate system with a single point at (1, 2, 3), representing the vector as a point in 3D space.
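Since a cost function is about 'distance', it helps to see that vectors give us distances almost for free. This is a small sketch using the same points as the plots; the second 3D point `q` is a hypothetical one I added just to have something to measure against.

```python
import numpy as np

# The 2D vector from the left plot, as an array of coordinates.
v = np.array([3.0, 4.0])
length = np.linalg.norm(v)  # Euclidean length: sqrt(3**2 + 4**2)
print("length of (3, 4):", length)

# The 3D point from the right plot, plus a hypothetical second point.
p = np.array([1.0, 2.0, 3.0])
q = np.array([1.0, 2.0, 5.0])
distance = np.linalg.norm(p - q)  # length of the difference vector
print("distance between p and q:", distance)
```

The same idea scales to any number of dimensions, which is why predictions and training examples can be compared as vectors.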

Code for the plots above

```
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D

# Create a figure for plotting
fig = plt.figure(figsize=(12, 6))

# 2D Plot on the left
ax1 = fig.add_subplot(121)
ax1.set_title('2D Vector as Point and Arrow')
ax1.set_xlabel('X')
ax1.set_ylabel('Y')

# Plotting a point (3, 4) and an arrow from origin to that point
ax1.plot(3, 4, 'ro')  # Point at (3,4)
ax1.arrow(0, 0, 3, 4, head_width=0.3, head_length=0.3, fc='blue', ec='blue')  # Arrow from origin to (3,4)

# Annotating the point
ax1.text(3.2, 4, 'P(3, 4)', fontsize=12)

# 3D Plot on the right
ax2 = fig.add_subplot(122, projection='3d')
ax2.set_title('3D Point Representation')
ax2.set_xlabel('X')
ax2.set_ylabel('Y')
ax2.set_zlabel('Z')

# Plotting a point (1, 2, 3)
ax2.scatter(1, 2, 3, color='red')
ax2.text(1.1, 2, 3, 'Point (1, 2, 3)', fontsize=12)

# Display the plot
plt.tight_layout()
plt.show()
```

