Neural Networks

Neural networks are the foundation of deep learning. They consist of layers of interconnected neurons that learn complex patterns traditional machine learning models can't pick up. To implement one, we'll use PyTorch; for deep learning, most practitioners use either PyTorch or TensorFlow.

To start, we need to define our own r2_score, since we are no longer using sklearn.

import torch

def r2_score(y_true, y_pred):
    """
    Calculate the R^2 (coefficient of determination) score in PyTorch.

    Parameters:
        y_true (torch.Tensor): Actual target values
        y_pred (torch.Tensor): Predicted values

    Returns:
        torch.Tensor: R^2 score
    """
    ss_total = torch.sum((y_true - torch.mean(y_true)) ** 2)  # Total sum of squares
    ss_residual = torch.sum((y_true - y_pred) ** 2)           # Residual sum of squares
    r2 = 1 - (ss_residual / ss_total)
    return r2

Next, we must convert our data to torch.Tensor so PyTorch can work with it.

X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32).unsqueeze(dim=1)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32).unsqueeze(dim=1)

Now we can get to the actual neural network. We must first design its structure. Let's go with a very simple network that takes in 16 inputs, passes them through two hidden layers of 32 neurons each, and produces one output (price).

from torch import nn

model_7 = nn.Sequential(
    nn.Linear(in_features=16, out_features=32),
    nn.ReLU(),
    nn.Linear(in_features=32, out_features=32),
    nn.ReLU(),
    nn.Linear(in_features=32, out_features=1)
)

We must also define our loss function (MSE) and an optimizer (SGD).

loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(params=model_7.parameters(), lr=0.1)

Finally, we can train the model.

# Note: `epochs` (the number of training iterations) should be set beforehand.
epoch_count = []
loss_values = []
test_loss_values = []

for epoch in range(epochs):
    # Training step
    model_7.train()
    y_pred = model_7(X_train_tensor)
    loss = loss_fn(y_pred, y_train_tensor)
    train_r2 = r2_score(y_true=y_train_tensor, y_pred=y_pred)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Evaluation step
    model_7.eval()
    with torch.inference_mode():
        y_pred = model_7(X_test_tensor)
        test_loss = loss_fn(y_pred, y_test_tensor)
        test_r2 = r2_score(y_true=y_test_tensor, y_pred=y_pred)

    # Log progress every 10 epochs
    if epoch % 10 == 0:
        print(f"Epoch: {epoch} | Train Loss: {loss.item()} | Train R2: {train_r2} | Test Loss: {test_loss.item()} | Test R2: {test_r2}")
        epoch_count.append(epoch)
        loss_values.append(loss.item())
        test_loss_values.append(test_loss.item())

It's often helpful to plot the model's loss curves to see how it learns and when we should stop training.

import matplotlib.pyplot as plt

# Loss curves
plt.plot(epoch_count, loss_values, label="Train loss")
plt.plot(epoch_count, test_loss_values, label="Test loss")
plt.title('Training and test loss curves')
plt.ylabel('Loss')
plt.xlabel('Epochs')
plt.legend()

This shows that the model stops learning at around 300 epochs. With this in mind, let's retrain the model, but only for 300 epochs. Note that we could (and often it's best to) stop training automatically inside the training loop, but for this project, let's keep it simple. A rough sketch of that idea follows at the end of this section.

Results (R²):
Train: 0.8200255036354065
Test: 0.722881555557251

We could certainly optimize the network further by adjusting the number of neurons and applying other techniques. However, I deem this unnecessary, because we've already achieved a good result with the Gradient Boosting algorithm.
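As a rough illustration of the automatic-stopping idea mentioned above, here is a minimal early-stopping sketch. It reuses model_7, loss_fn, optimizer, and the tensors defined earlier; the max_epochs and patience values are arbitrary choices for illustration, not tuned settings.

import copy
import torch

max_epochs = 1000  # upper bound on training iterations (illustrative value)
patience = 20      # stop after this many epochs without test-loss improvement (illustrative value)

best_test_loss = float("inf")
best_state = None
epochs_without_improvement = 0

for epoch in range(max_epochs):
    # Training step (same as the loop above)
    model_7.train()
    y_pred = model_7(X_train_tensor)
    loss = loss_fn(y_pred, y_train_tensor)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Evaluation step
    model_7.eval()
    with torch.inference_mode():
        test_loss = loss_fn(model_7(X_test_tensor), y_test_tensor).item()

    # Early-stopping bookkeeping: remember the best weights seen so far
    if test_loss < best_test_loss:
        best_test_loss = test_loss
        best_state = copy.deepcopy(model_7.state_dict())
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"Stopping early at epoch {epoch} (best test loss: {best_test_loss})")
            break

# Restore the best-performing weights before evaluating or saving the model
if best_state is not None:
    model_7.load_state_dict(best_state)

This keeps a copy of the weights at the best test loss and restores them at the end, which avoids having to eyeball the loss curve for a cutoff point.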