MEDIUM.COM
All about Neural Style Transfer
Neural Style Transfer

Neural Style Transfer (NST) is a technique that uses deep learning to apply one image's artistic style to another image's content. The method relies on Convolutional Neural Networks (CNNs): it extracts content features from one image and style features from the other, then constructs a new image that combines the two.

How does NST work?

NST works with two things: style and content. Here content refers to the objects and structures in the image, and style refers to its textures and colors. Three loss functions are relevant:

Content loss: Measures the difference between the features of the content image and those of the generated image.
Style loss: Measures the difference between the Gram matrices of the style image and the generated image. A Gram matrix captures the correlations between feature channels.
Total variation loss: Encourages smoothness in the generated image. The lower the value, the smoother the image texture.

Categories of NST

Below are the categories of NST. They are not discussed in detail here, as I will be covering them later.

Figure: Categories

Python code

Below is the Python code that uses VGG19 to transfer the style onto the content image. It optimizes only the content and style losses; total variation loss is not part of this code. First, the images are loaded and preprocessed. Then features are extracted with the pretrained VGG19 model, from a set of convolutional layers selected beforehand. After computing the Gram matrices of the style features, we initialize the target image as a copy of the content image and make it trainable.
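Before moving on to the optimization, it may help to see the three losses described earlier as plain arithmetic. The snippet below is a toy, dependency-free sketch and not the article's actual implementation: the helper names (`flatten`, `content_loss`, `style_loss`, `total_variation_loss`) and the tiny 2-channel, 2x2 feature maps are made up for illustration. The Gram matrix uses the same d * h * w normalization as the PyTorch code in this article, and the total variation term uses one common form (sum of absolute differences between neighbouring pixels), which is an assumption since the article does not define it.

```python
# Toy illustration of the three NST losses on nested lists.
# Feature maps are shaped [channels][height][width]; all values are made up.

def flatten(channel):
    """Flatten one [height][width] channel into a flat list."""
    return [v for row in channel for v in row]

def content_loss(gen, content):
    """Mean squared difference between two feature maps."""
    g = [v for ch in gen for v in flatten(ch)]
    c = [v for ch in content for v in flatten(ch)]
    return sum((a - b) ** 2 for a, b in zip(g, c)) / len(g)

def gram_matrix(feat):
    """Channel-by-channel dot products, normalized by d * h * w."""
    rows = [flatten(ch) for ch in feat]
    d, n = len(rows), len(rows[0])
    return [[sum(a * b for a, b in zip(ri, rj)) / (d * n) for rj in rows]
            for ri in rows]

def style_loss(gen, style):
    """Mean squared difference between the two Gram matrices."""
    gg, gs = gram_matrix(gen), gram_matrix(style)
    d = len(gg)
    return sum((gg[i][j] - gs[i][j]) ** 2
               for i in range(d) for j in range(d)) / (d * d)

def total_variation_loss(image):
    """Sum of absolute differences between neighbouring pixels (assumed form)."""
    total = 0.0
    for ch in image:
        h, w = len(ch), len(ch[0])
        for y in range(h):
            for x in range(w):
                if x + 1 < w:  # right-hand neighbour
                    total += abs(ch[y][x + 1] - ch[y][x])
                if y + 1 < h:  # neighbour below
                    total += abs(ch[y + 1][x] - ch[y][x])
    return total

# Hypothetical 2-channel, 2x2 "feature maps"
generated = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
content = [[[1.0, 2.0], [3.0, 5.0]], [[0.0, 1.0], [1.0, 0.0]]]
style = [[[2.0, 2.0], [2.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]]

print("content loss:", content_loss(generated, content))
print("style loss:", style_loss(generated, style))
print("total variation:", total_variation_loss(generated))
```

In a real pipeline these quantities would be computed on tensors from the network's feature maps, and a total variation term, if used, would be added to the total loss with its own weight.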
We use the Adam optimizer to iteratively update this image, refining it to blend the content and style.

During each iteration, we:

1. Extract features from the target image using VGG19.
2. Compute the content loss, which measures the difference between the target and content image features.
3. Compute the style loss by comparing the Gram matrices of the target and style images across the selected layers, weighted per layer.
4. Combine both losses with predefined weights and backpropagate the error.
5. Update the target image with an optimizer step to reduce the total loss.

Over 500 iterations, the target image gradually transforms, capturing the style patterns while retaining the content structure. Finally, the transformed image is displayed as the stylized output.

import torch
import torch.optim as optim
import torchvision.transforms as transforms
from torchvision import models
from PIL import Image
import matplotlib.pyplot as plt

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load and preprocess images
def load_image(image_path, max_size=400):
    # Ensure it's 3-channel RGB
    image = Image.open(image_path).convert("RGB")
    transform = transforms.Compose([
        # Note: resizing to a square does not preserve the aspect ratio
        transforms.Resize((max_size, max_size)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
    # Add a batch dimension and move the tensor to the device
    image = transform(image).unsqueeze(0).to(device)
    return image

def imshow(tensor, title=None):
    image = tensor.clone().detach().cpu().squeeze(0)
    # Unnormalize using the original mean and std
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    image = image * std + mean  # Reverse normalization
    image = torch.clamp(image, 0, 1)  # Ensure valid range
    image = transforms.ToPILImage()(image)
    plt.imshow(image)
    if title:
        plt.title(title)
    plt.show()

# Load VGG19 model (on torchvision < 0.13, use models.vgg19(pretrained=True) instead)
vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.to(device).eval()
# Freeze the network: only the image is optimized, not the model weights
for param in vgg.parameters():
    param.requires_grad_(False)

def get_features(image, model):
    layers = {
        '0': 'conv1_1',
        '5': 'conv2_1',
        '10': 'conv3_1',
        '19': 'conv4_1',  # Content layer
        '28': 'conv5_1'
    }
    features = {}
    x = image
    for name, layer in model._modules.items():
        x = layer(x)
        if name in layers:
            features[layers[name]] = x
    return features

# Compute Gram Matrix (assumes a batch size of 1)
def gram_matrix(tensor):
    _, d, h, w = tensor.size()
    tensor = tensor.view(d, h * w)
    gram = torch.mm(tensor, tensor.t())
    return gram / (d * h * w)

# Load images
content_img = load_image("content_1.jpg")
style_img = load_image("style.jpg")

# Extract the fixed targets once, without tracking gradients
with torch.no_grad():
    content_features = get_features(content_img, vgg)
    style_features = get_features(style_img, vgg)
    # Compute Gram Matrices for style features
    style_grams = {layer: gram_matrix(style_features[layer]) for layer in style_features}

# Generate target image
input_img = content_img.clone().requires_grad_(True)

# Use Adam Optimizer
optimizer = optim.Adam([input_img], lr=0.003)

# Define weights
style_weights = {'conv1_1': 1.0, 'conv2_1': 0.75, 'conv3_1': 0.5, 'conv4_1': 0.45, 'conv5_1': 0.3}
content_weight = 1
style_weight = 150000

# Optimization loop
steps = 500
for i in range(steps):
    optimizer.zero_grad()
    target_features = get_features(input_img, vgg)

    # Compute content loss
    content_loss = torch.mean((target_features['conv4_1'] - content_features['conv4_1']) ** 2)

    # Compute style loss
    style_loss = 0
    for layer in style_weights:
        target_gram = gram_matrix(target_features[layer])
        style_gram = style_grams[layer]
        layer_style_loss = style_weights[layer] * torch.mean((target_gram - style_gram) ** 2)
        style_loss += layer_style_loss

    # Total loss
    total_loss = content_weight * content_loss + style_weight * style_loss
    # The graph is rebuilt every step, so retain_graph is not needed
    total_loss.backward()
    optimizer.step()

    if i % 50 == 0:
        print(f"Step {i}, Loss: {total_loss.item()}")

# Show final image
imshow(input_img, title="Stylized Image")

Note that hyperparameter tuning in NST can be tedious, as you need to adjust the style weight, the content weight, and style_weights (the weights applied at each convolutional layer).

Below are the sample outputs.

Figure: Content image (sample taken from Google)

Figure: The Starry Night by Van Gogh

Figure: The result

The swirling effect is clearly visible in the final image, which shows that the style has been transferred to the content image.

Applications of Neural Style Transfer

Many companies, such as Google, Meta, and Adobe, use this technique. Some of the applications are as follows:

Turning Photos into Art: Converts images into paintings, for example in Van Gogh's style.
Restyling Images & Videos: Applies artistic styles to photos and videos.
Enhancing Historical Photos: Restores and recolors old images.
Game Design & Virtual Worlds: Changes game graphics dynamically.
Advertising & Branding: Creates stylish content for marketing.
Medical Imaging: Enhances scans for better diagnosis.
Fashion & Design: Generates unique clothing and textile patterns.