HillClimbingOptimizer
- class HillClimbingOptimizer(search_space: dict[str, list], initialize: dict[Literal['grid', 'vertices', 'random', 'warm_start'], int | list[dict]] = None, constraints: list[callable] = None, random_state: int = None, rand_rest_p: float = 0, nth_process: int = None, epsilon: float = 0.03, distribution: Literal['normal', 'laplace', 'gumbel', 'logistic'] = 'normal', n_neighbours: int = 3)[source]
Local search optimizer that iteratively moves towards better neighboring solutions.
Hill Climbing is a simple yet effective local search algorithm that starts from an initial position and iteratively moves to neighboring positions that improve the objective function. At each step, the algorithm samples a set of neighboring positions and moves to the best one found. This greedy approach makes it fast and memory-efficient, but susceptible to getting stuck in local optima.
The algorithm is well-suited for:
Unimodal optimization problems with a single global optimum
Fine-tuning solutions found by other optimizers
Problems where function evaluations are expensive (due to low overhead)
Situations requiring a simple, interpretable optimization process
The epsilon parameter controls the step size: smaller values lead to finer local search but slower convergence, while larger values enable broader exploration but may overshoot optima. The n_neighbours parameter determines how many candidate positions are evaluated before making a move.
- Parameters:
- search_spacedict[str, list]
The search space to explore, defined as a dictionary mapping parameter names to arrays of possible values.
Each key is a parameter name (string), and each value is a numpy array or list of discrete values that the parameter can take. The optimizer will only evaluate positions that are on this discrete grid.
Example: A 2D search space with 100 points per dimension:
search_space = { "x": np.linspace(-10, 10, 100), "y": np.linspace(-10, 10, 100), }
The resolution of each dimension (number of points in the array) directly affects optimization quality and speed. More points give finer resolution but increase the search space size exponentially.
- initializedict[str, int], default={“vertices”: 4, “random”: 2}
Strategy for generating initial positions before the main optimization loop begins. Initialization samples are evaluated first, and the best one becomes the starting point for the optimizer.
Supported keys:
"grid":int– Number of positions on a regular grid."vertices":int– Number of corner/edge positions of the search space."random":int– Number of uniformly random positions."warm_start":list[dict]– Specific positions to evaluate, each as a dict mapping parameter names to values.
Multiple strategies can be combined:
initialize = {"vertices": 4, "random": 10} initialize = {"warm_start": [{"x": 0.5, "y": 1.0}], "random": 5}
More initialization samples improve the starting point but consume iterations from
n_iter. For expensive objectives, a few targeted warm-start points are often more efficient than many random samples.- constraintslist[callable], default=[]
A list of constraint functions that restrict the search space. Each constraint is a callable that receives a parameter dictionary and returns
Trueif the position is valid,Falseif it should be rejected.Rejected positions are discarded and regenerated: the optimizer resamples a new candidate position (up to 100 retries per step). During initialization, positions that violate constraints are filtered out entirely.
Example: Constrain the search to a circular region:
def circular_constraint(para): return para["x"]**2 + para["y"]**2 <= 25 constraints = [circular_constraint]
Multiple constraints are combined with AND logic (all must return
True).- random_stateint or None, default=None
Seed for the random number generator to ensure reproducible results.
None: Use a new random state each run (non-deterministic).int: Seed the random number generator for reproducibility.
Setting a fixed seed is recommended for debugging and benchmarking. Different seeds may lead to different optimization trajectories, especially for stochastic optimizers.
- rand_rest_pfloat, default=0
Probability of performing a random restart instead of the normal algorithm step. At each iteration, a uniform random number is drawn; if it falls below
rand_rest_p, the optimizer jumps to a random position instead of following its strategy.0.0: No random restarts (pure algorithm behavior).0.01-0.05: Light diversification, helps escape shallow local optima.0.1-0.3: Aggressive restarts, useful for highly multi-modal landscapes.1.0: Equivalent to random search.
This is especially useful for local search optimizers (Hill Climbing, Simulated Annealing) that can get trapped. For population-based optimizers, the effect is less pronounced since they already maintain diversity through multiple agents.
- epsilonfloat, default=0.03
Step size for generating neighbor positions, expressed as a fraction of each dimension’s range. Controls how far the optimizer looks from the current position when sampling neighbors.
0.01-0.02: Fine-grained local search, slow convergence.0.03-0.05: Moderate step size, good default range.0.1-0.3: Large steps, broader exploration.0.5-1.0: Very large steps, nearly global jumps.
Example: For a dimension with
np.linspace(0, 100, 1000)(range = 100):epsilon=0.03leads to neighbors within ~3 unitsepsilon=0.1leads to neighbors within ~10 units
Smaller values are better for fine-tuning near a known good solution. Larger values help escape local optima but may overshoot narrow peaks.
- distribution{“normal”, “laplace”, “gumbel”, “logistic”}, default=”normal”
Probability distribution used to sample neighbor offsets. Each distribution produces different exploration patterns:
"normal": Gaussian distribution. Most neighbors are close to the current position with rare far jumps. Best general-purpose choice."laplace": Sharper peak than normal with heavier tails. Good for landscapes where occasional large jumps help."gumbel": Asymmetric, skewed distribution. Can bias exploration in one direction."logistic": Similar to normal but with slightly heavier tails. A middle ground between normal and Laplace.
The distribution interacts with
epsilon: heavy-tailed distributions (Laplace) effectively increase the chance of large steps beyond whatepsilonalone suggests.- n_neighboursint, default=3
Number of neighbor positions to sample and evaluate per iteration. The optimizer moves to the best among these neighbors (if it improves on the current position).
1: Minimal sampling, fast iterations but may miss good directions. Good for very cheap objective functions.3-5: Moderate sampling, good balance of quality and speed.10-20: Thorough neighborhood evaluation, better directional choices but more function evaluations per step.
Higher values act like a local beam search, giving the optimizer more information about the local landscape at each iteration.
- Attributes:
best_paraReturn the best parameters found as a dictionary.
best_valueReturn the best values found (raw parameter values).
search_dataLazily construct and return the search results DataFrame.
Methods
eval_time
init_stats
iter_time
search
See also
StochasticHillClimbingOptimizerProbabilistic acceptance to escape local optima.
RepulsingHillClimbingOptimizerIncreases step size when stuck.
SimulatedAnnealingOptimizerTemperature-based acceptance with cooling.
RandomRestartHillClimbingOptimizerPeriodically restarts from random positions.
Notes
The hill climbing update rule at each step:
\[\begin{split}x_{\\text{neighbor}} = x_{\\text{current}} + \\epsilon \\cdot \\text{range}(d) \\cdot \\text{sample}(\\text{distribution})\end{split}\]The move is accepted only if it improves the objective:
\[\begin{split}f(x_{\\text{neighbor}}) > f(x_{\\text{current}})\end{split}\]Time complexity per iteration is O(
n_neighbours* d), where d is the number of dimensions.For visual explanations, algorithm flowcharts, and tuning guides, see the Hill Climbing user guide.
Examples
>>> import numpy as np >>> from gradient_free_optimizers import HillClimbingOptimizer
>>> def parabola(para): ... return -(para["x"] ** 2 + para["y"] ** 2)
>>> search_space = { ... "x": np.linspace(-10, 10, 100), ... "y": np.linspace(-10, 10, 100), ... }
>>> opt = HillClimbingOptimizer(search_space) >>> opt.search(parabola, n_iter=100)
- property best_para[source]
Return the best parameters found as a dictionary.
Uses the Converter to transform the best position into user-friendly parameter names and values.
- Returns:
- dict or None
Dictionary mapping parameter names to their best values, or None if no evaluation has been performed yet.
- property best_value[source]
Return the best values found (raw parameter values).
- Returns:
- list or None
List of best values in parameter order, or None if no evaluation has been performed yet.
- property search_data: pd.DataFrame[source]
Lazily construct and return the search results DataFrame.
The DataFrame is only built when this property is accessed, avoiding a large memory spike at the end of high-dimensional optimizations. The result is cached so subsequent accesses don’t rebuild it.