Downhill Simplex
Downhill Simplex (Nelder-Mead) operates on a geometric structure rather than sampling from a neighborhood distribution. It maintains a simplex of n+1 vertices in n-dimensional space (a triangle in 2D, a tetrahedron in 3D) and transforms it through four operations: reflection of the worst vertex through the centroid, expansion along the reflection direction, contraction toward the centroid, and shrinking the entire simplex toward the best vertex. The simplex reshapes itself at each iteration based on the objective values at its vertices, adapting its orientation and size to the local curvature of the function.
Convex function: The simplex contracts toward the minimum.
Multi-modal function: May converge to local optimum depending on initial simplex placement.
Downhill Simplex is structurally different from the other local search algorithms in this library, which all generate candidates by sampling from a distribution around the current position. The geometric approach makes it effective on smooth, low-dimensional functions with curved valleys or ridges, where the simplex can align itself along the function’s contours. Its performance degrades as dimensionality increases because the simplex requires n+1 vertices and the geometric operations become less informative in high-dimensional spaces. Choose it for problems with fewer than roughly 10 continuous dimensions where the objective function is smooth and deterministic.
Algorithm
The algorithm maintains n+1 vertices for an n-dimensional problem. At each iteration:
Order vertices by score (best to worst)
Reflect the worst vertex through the centroid of the remaining vertices
Depending on the reflection result:
If best so far: Expand further in that direction
If improvement: Accept the reflection
If still worst: Contract toward the centroid
If contraction fails: Shrink entire simplex toward best vertex
This geometric approach doesn’t require gradient information but adapts to the local curvature of the objective function.
Note
The Downhill Simplex is the only algorithm in GFO that uses geometric transformations rather than random sampling. The simplex shape adapts to the local landscape: it elongates along valleys, contracts near optima, and can flip over ridges through reflection. This makes it particularly effective for smooth, low-dimensional functions with curved valleys (like Rosenbrock’s function).
Simplex Operations
Given vertices: [best, ..., worst] and centroid C of [best, ..., second_worst]
Reflection: R = C + alpha * (C - worst) # Try opposite side
Expansion: E = C + gamma * (R - C) # Go further if R is good
Contraction: K = C + beta * (worst - C) # Pull back toward center
Shrink: x_i = best + 0.5 * (x_i - best) # Contract entire simplex
Parameters
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
float |
1.0 |
Reflection coefficient (typically 1.0) |
|
float |
2.0 |
Expansion coefficient (typically 2.0) |
|
float |
0.5 |
Contraction coefficient (typically 0.5) |
Standard Coefficients
The default values (alpha=1.0, gamma=2.0, beta=0.5) are the original Nelder-Mead coefficients and work well for most problems. Adjust only if you have specific needs:
Larger alpha/gamma: More aggressive exploration
Smaller beta: More conservative contraction
0.5 shrink factor: Standard (not configurable in GFO)
Example
import numpy as np
from gradient_free_optimizers import DownhillSimplexOptimizer
def rosenbrock(para):
x, y = para["x"], para["y"]
return -((1 - x)**2 + 100 * (y - x**2)**2)
search_space = {
"x": np.linspace(-5, 5, 100),
"y": np.linspace(-5, 5, 100),
}
opt = DownhillSimplexOptimizer(
search_space,
alpha=1.0, # Standard reflection
gamma=2.0, # Standard expansion
beta=0.5, # Standard contraction
)
opt.search(rosenbrock, n_iter=500)
print(f"Best: {opt.best_para}, Score: {opt.best_score}")
When to Use
Good for:
Low-dimensional problems (< 10 dimensions)
Smooth objective functions
When you want a deterministic, geometry-based approach
Problems where function evaluations are cheap
Not ideal for:
High-dimensional spaces (simplex has n+1 vertices)
Noisy objective functions
Functions with many local optima
Discrete or categorical parameters
Comparison with Other Local Methods
vs. Hill Climbing |
Simplex uses n+1 points vs. neighborhood sampling. Better for smooth functions, worse for discrete spaces. |
|---|---|
vs. Powell’s Method |
Simplex optimizes all dimensions jointly, Powell optimizes sequentially. Simplex is better for coupled parameters. |
Initialization Note
The Downhill Simplex requires n+1 initial points. In GFO, these are generated from the initialization strategy (grid, random, vertices, or warm_start). Ensure you have at least n+1 initial positions:
n_dims = len(search_space)
opt = DownhillSimplexOptimizer(
search_space,
initialize={"random": n_dims + 1} # Ensure enough initial points
)
3D Example with Warm Start
import numpy as np
from gradient_free_optimizers import DownhillSimplexOptimizer
def rosenbrock_3d(para):
x, y, z = para["x"], para["y"], para["z"]
return -(
(1 - x)**2 + 100 * (y - x**2)**2
+ (1 - y)**2 + 100 * (z - y**2)**2
)
search_space = {
"x": np.linspace(-5, 5, 200),
"y": np.linspace(-5, 5, 200),
"z": np.linspace(-5, 5, 200),
}
opt = DownhillSimplexOptimizer(
search_space,
alpha=1.0,
gamma=2.0,
beta=0.5,
initialize={"warm_start": [{"x": 0.5, "y": 0.5, "z": 0.5}],
"random": 3},
)
opt.search(rosenbrock_3d, n_iter=1000)
print(f"Best: {opt.best_para}")
print(f"Score: {opt.best_score}")
Trade-offs
Exploration vs. exploitation: Downhill Simplex is primarily exploitative. The simplex explores locally through its geometric operations but has no mechanism for large jumps to unexplored regions.
Computational overhead: Requires n+1 points per dimension, which means overhead grows linearly with dimensionality.
Parameter sensitivity: The standard coefficients (alpha=1, gamma=2, beta=0.5) work well for most problems. Rarely needs tuning.