For my training "Create a neural network from scratch with Python", I needed to show a diagram that visualises the cost function.
The training data consists of a single input and a single output:
training_data_input = 2
training_data_output = 14  # input * 2 + 10
As you can see, the weight should be 2 and the bias should be 10.
For this single training point, the cost needs to be calculated for every combination of weight and bias within a certain range:
weights, biases = np.meshgrid(range(-4, 8), range(-4, 12))
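In case np.meshgrid is new to you, here is a minimal sketch of what it returns for these ranges: two 16×12 arrays that together enumerate every (weight, bias) combination in the grid.

import numpy as np

weights, biases = np.meshgrid(range(-4, 8), range(-4, 12))

print(weights.shape)  # (16, 12): 16 bias values x 12 weight values
print(weights[0])     # [-4 -3 -2 -1  0  1  2  3  4  5  6  7]
print(biases[:, 0])   # [-4 -3 -2 -1  0  1  2  3  4  5  6  7  8  9 10 11]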
Here is the cost function. It is the Mean Squared Error (MSE) cost function, which for a single training point reduces to the squared error:
(training_data_output - (w * training_data_input + b)) ** 2
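As a quick sanity check (this is just a sketch, not part of the diagram code), we can evaluate the cost at the true parameters and at a deliberately wrong pair:

training_data_input = 2
training_data_output = 14  # input * 2 + 10

def cost(w, b):
    return (training_data_output - (w * training_data_input + b)) ** 2

print(cost(2, 10))  # 0: weight 2 and bias 10 predict the output perfectly
print(cost(2, 0))   # 100: the prediction 4 is off by 10, and 10 ** 2 = 100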
The result:
Here is the code to create the diagram:
import matplotlib.pyplot as plt
import numpy as np

training_data_input = 2
training_data_output = 14  # 2 * 2 + 10

def cost(w, b):
    # Squared error for the single training point.
    return (training_data_output - (w * training_data_input + b)) ** 2

fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# Evaluate the cost for every combination of weight and bias in the grid.
weights, biases = np.meshgrid(range(-4, 8), range(-4, 12))
costs = cost(weights, biases)

ax.plot_wireframe(weights, biases, costs, rstride=1, cstride=1,
                  linewidth=0.5, edgecolor='black')
ax.set_xlabel('Weight')
ax.set_ylabel('Bias')
ax.set_zlabel('Cost')

plt.show()
One of the things I noticed is that the lowest cost is not a single point but a whole line of points in the diagram.
This is because, for this single training point, every combination of weight and bias that satisfies w * 2 + b = 14 predicts the output perfectly and therefore has zero cost. The fact that many parameter combinations can fit the data equally well is one of the reasons why it is important to have a lot of training data when training a neural network!
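To make this concrete, here is a small sketch that walks along the line w * 2 + b = 14 and shows that every point on it has zero cost:

training_data_input = 2
training_data_output = 14

def cost(w, b):
    return (training_data_output - (w * training_data_input + b)) ** 2

# Every (w, b) pair on the line w * 2 + b = 14 predicts the training
# point perfectly, so the cost is zero along the whole line.
for w in range(-4, 8):
    b = training_data_output - w * training_data_input
    print(f"w = {w:2d}, b = {b:3d}, cost = {cost(w, b)}")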