Random Restart Hill Climbing
Random Restart Hill Climbing alternates between two phases: a local Hill Climbing
phase that runs for n_iter_restart iterations, and a restart phase that jumps
to a uniformly random position in the search space. During each Hill Climbing
phase, the algorithm evaluates neighboring positions and moves greedily toward
improvements. After the fixed number of local iterations, it discards the current
position and begins a new Hill Climbing run from a fresh random starting point.
The global best across all runs is retained.
Convex function: Hill Climbing phases converge quickly, restarts explore different regions.
Multi-modal function: Multiple restarts explore different basins of attraction.
This algorithm extends Hill Climbing to multi-modal problems by sampling from
multiple basins of attraction rather than converging to a single local optimum.
Compared to Simulated Annealing, which uses a probabilistic acceptance schedule
to escape local optima, Random Restart uses hard resets that provide a cleaner
separation between exploration and exploitation. The n_iter_restart parameter
directly controls this trade-off: short intervals sample more basins while long
intervals allow deeper convergence within each basin. Choose Random Restart Hill
Climbing when the search space contains well-separated local optima and when the
cost per evaluation is low enough to tolerate the wasted iterations at the end of
each restart cycle.
Algorithm
The algorithm alternates between:
Hill Climbing phase: Local search from current position for
n_iter_restartiterationsRestart phase: Jump to a random position
for each iteration:
if iter % n_iter_restart == 0:
pos = random_position(search_space) # Restart
else:
pos = hill_climbing_step(pos) # Local search
Note
Random Restart HC provides a hard boundary between
exploration and exploitation. Each restart is a completely independent
local search, which means the algorithm naturally samples from multiple
basins. The n_iter_restart parameter controls the balance: shorter
restarts give more global coverage, longer restarts give better local
convergence within each basin.
When to Use
Good for:
Multi-modal functions with distinct basins
When Hill Climbing gets stuck in local optima
Problems where local optima are well-separated
Compared to other approaches:
vs. Pure Random Search: More efficient within each basin
vs. Simulated Annealing: Sharper transitions between explore/exploit
vs. Repulsing HC: Completely new starts instead of larger steps
Example
import numpy as np
from gradient_free_optimizers import RandomRestartHillClimbingOptimizer
def rastrigin(para):
x, y = para["x"], para["y"]
A = 10
return -(A * 2 + (x**2 - A * np.cos(2 * np.pi * x))
+ (y**2 - A * np.cos(2 * np.pi * y)))
search_space = {
"x": np.linspace(-5.12, 5.12, 100),
"y": np.linspace(-5.12, 5.12, 100),
}
opt = RandomRestartHillClimbingOptimizer(
search_space,
epsilon=0.03,
n_neighbours=3,
n_iter_restart=10,
)
opt.search(rastrigin, n_iter=1000)
print(f"Best: {opt.best_para}, Score: {opt.best_score}")
Parameters
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
int |
10 |
Number of iterations between random restarts |
|
float |
0.03 |
Step size for Hill Climbing phase |
|
str |
“normal” |
Step distribution |
|
int |
3 |
Neighbors per Hill Climbing iteration |
Higher-Dimensional Example
import numpy as np
from gradient_free_optimizers import RandomRestartHillClimbingOptimizer
def schwefel_3d(para):
vals = [para["x"], para["y"], para["z"]]
return sum(
v * np.sin(np.sqrt(abs(v))) for v in vals
)
search_space = {
"x": np.linspace(-500, 500, 500),
"y": np.linspace(-500, 500, 500),
"z": np.linspace(-500, 500, 500),
}
opt = RandomRestartHillClimbingOptimizer(
search_space,
epsilon=0.05,
n_neighbours=5,
n_iter_restart=20,
)
opt.search(schwefel_3d, n_iter=3000)
print(f"Best: {opt.best_para}")
print(f"Score: {opt.best_score}")
Trade-offs
Exploration vs. exploitation: Controlled by
n_iter_restart. Short restart intervals give more exploration (more basins sampled); long intervals give deeper exploitation per basin.Computational overhead: Same as Hill Climbing (minimal).
Parameter sensitivity:
n_iter_restartis the critical parameter. Too short means Hill Climbing never converges locally; too long means fewer basins are explored.