
Randomized Optimization on Discrete Landscapes and Neural Networks

Exploring randomized hill climbing, simulated annealing, and genetic algorithms across discrete fitness landscapes and as alternatives to backprop for neural network weight search.
machine-learning · python · optimization
The Port of Saint-Tropez by Paul Signac
Early Draft

This is an early version of this project write-up. For now it’s largely a placeholder. I’m actively working on it.

Introduction

There is a whole world of optimization that begins where gradients fail. Rugged objectives, deceptive local maxima, noisy evaluations—these are the domains where simple calculus runs out of road and heuristic search takes over. This post is a concept-first look at how three randomized methods navigate those landscapes and when each shines in practice.

I focus on the underlying ideas from the second assignment of Georgia Tech’s Machine Learning course: which shapes of search space favor simulated annealing over hill climbing, why genetic algorithms thrive on recombinable structure, and how any of these methods can stand in for backprop when tuning a small neural network.

Note

This post is part of my series from Georgia Tech’s Machine Learning course. For a broader survey of the course material, see the companion post: Machine Learning: A Retrospective.

Overview

Randomized optimization provides practical strategies for navigating rugged objective functions when gradients are unavailable or unhelpful. This piece compares three local/heuristic search methods:

  • Randomized Hill Climbing (RHC)
  • Simulated Annealing (SA)
  • Genetic Algorithm (GA)

I look at how these methods behave on common landscape archetypes and consider how they can stand in for gradient methods when searching continuous neural network weight spaces.
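
To make the comparison concrete, here is a minimal, hypothetical sketch of the three search loops on a toy bitstring objective. This is not the assignment code: the `one_max` fitness, the parameter defaults, and the function names are illustrative assumptions chosen to keep the example self-contained.

```python
import math
import random

def one_max(bits):
    """Toy fitness: count of ones. Stands in for any black-box objective."""
    return sum(bits)

def random_neighbor(bits):
    """Flip one randomly chosen bit."""
    i = random.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

def rhc(fitness, n_bits=20, iters=1000):
    """Randomized hill climbing: accept a neighbor only if it is no worse."""
    x = [random.randint(0, 1) for _ in range(n_bits)]
    for _ in range(iters):
        y = random_neighbor(x)
        if fitness(y) >= fitness(x):
            x = y
    return x

def sa(fitness, n_bits=20, iters=1000, t0=1.0, decay=0.99):
    """Simulated annealing: accept worse moves with probability exp(delta / T)."""
    x = [random.randint(0, 1) for _ in range(n_bits)]
    t = t0
    for _ in range(iters):
        y = random_neighbor(x)
        delta = fitness(y) - fitness(x)
        if delta >= 0 or random.random() < math.exp(delta / t):
            x = y
        t = max(t * decay, 1e-6)
    return x

def ga(fitness, n_bits=20, pop_size=30, gens=100, mut_rate=0.05):
    """Genetic algorithm: tournament selection, one-point crossover, bit-flip mutation."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(gens):
        def tournament():
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = random.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if random.random() < mut_rate else b for b in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

if __name__ == "__main__":
    for name, solver in [("RHC", rhc), ("SA", sa), ("GA", ga)]:
        best = solver(one_max)
        print(name, one_max(best))
```

On a smooth objective like one-max all three converge quickly; the differences only show up once the landscape has deceptive or compositional structure, which the archetypes below are designed to expose.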

Canonical Problem Archetypes

  1. Discrete landscape with deceptive local structure: A rugged objective with many local maxima and a single narrow global basin. This stresses exploitation vs. controlled exploration.

  2. Discrete landscape with compositional substructures: A building-block objective where good partial solutions can be recombined. This emphasizes recombination and population diversity.

  3. Neural network weight optimization: Replace backprop with RHC/SA/GA to minimize classification loss on a small task. Evaluate training loss, validation accuracy, wall-clock time, and sample efficiency.
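
For the third archetype, the key move is to treat the entire weight vector as a point in a continuous search space and the training loss as a black-box fitness. The sketch below is a hypothetical illustration rather than the assignment code: the toy dataset, the 2-4-1 architecture, and the `hill_climb` routine are all assumptions chosen to keep the example self-contained and runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two Gaussian blobs with binary labels.
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# A tiny 2-4-1 network whose weights live in one flat vector,
# so any black-box optimizer can search the space directly.
SHAPES = [(2, 4), (4,), (4, 1), (1,)]
SIZES = [int(np.prod(s)) for s in SHAPES]

def unflatten(w):
    """Split a flat weight vector back into layer matrices and biases."""
    parts, i = [], 0
    for size, shape in zip(SIZES, SHAPES):
        parts.append(w[i:i + size].reshape(shape))
        i += size
    return parts

def loss(w):
    """Log loss of the tiny network; this is the fitness the optimizer minimizes."""
    W1, b1, W2, b2 = unflatten(w)
    h = np.tanh(X @ W1 + b1)
    p = (1.0 / (1.0 + np.exp(-(h @ W2 + b2)))).ravel()
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def hill_climb(loss_fn, dim, iters=2000, step=0.1):
    """Gradient-free weight search: keep a Gaussian perturbation only if loss drops."""
    w = rng.normal(0, 0.5, dim)
    best = loss_fn(w)
    for _ in range(iters):
        cand = w + rng.normal(0, step, dim)
        cand_loss = loss_fn(cand)
        if cand_loss <= best:
            w, best = cand, cand_loss
    return w, best

if __name__ == "__main__":
    w, final_loss = hill_climb(loss, sum(SIZES))
    print(f"final training log loss: {final_loss:.3f}")
```

Simulated annealing or a genetic algorithm drops into the same frame by swapping the acceptance rule or maintaining a population of weight vectors; the fitness function stays the same.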

A Note on Code Availability

In accordance with Georgia Tech’s academic integrity policy and the license for course materials, the source code for this project is kept in a private repository. I believe passionately in sharing knowledge, but I also firmly respect the university’s policies. This document follows Dean Joyner’s advice on sharing projects: the focus is not on any particular solution but on an abstract overview of the problem and the underlying concepts I learned.

I would be delighted to discuss the implementation details, architecture, or specific code sections in an interview. Please feel free to reach out to request private access to the repository.

Table of Contents