Downhill Simplex

Downhill Simplex (Nelder-Mead) operates on a geometric structure rather than sampling from a neighborhood distribution. It maintains a simplex of n+1 vertices in n-dimensional space (a triangle in 2D, a tetrahedron in 3D) and transforms it through four operations: reflection of the worst vertex through the centroid, expansion along the reflection direction, contraction toward the centroid, and shrinking the entire simplex toward the best vertex. The simplex reshapes itself at each iteration based on the objective values at its vertices, adapting its orientation and size to the local curvature of the function.

Downhill Simplex on Ackley function — **Multi-modal function**: May converge to local optimum depending on initial simplex placement.

Downhill Simplex is structurally different from the other local search algorithms in this library, which all generate candidates by sampling from a distribution around the current position. The geometric approach makes it effective on smooth, low-dimensional functions with curved valleys or ridges, where the simplex can align itself along the function’s contours. Its performance degrades as dimensionality increases because the simplex requires n+1 vertices and the geometric operations become less informative in high-dimensional spaces. Choose it for problems with fewer than roughly 10 continuous dimensions where the objective function is smooth and deterministic.

Algorithm

The algorithm maintains n+1 vertices for an n-dimensional problem. At each iteration:

Order vertices by score (best to worst)
Reflect the worst vertex through the centroid of the remaining vertices
Depending on the reflection result:
- If best so far: Expand further in that direction
- If improvement: Accept the reflection
- If still worst: Contract toward the centroid
- If contraction fails: Shrink entire simplex toward best vertex

This geometric approach doesn’t require gradient information but adapts to the local curvature of the objective function.

Note

The Downhill Simplex is the only algorithm in GFO that uses geometric transformations rather than random sampling. The simplex shape adapts to the local landscape: it elongates along valleys, contracts near optima, and can flip over ridges through reflection. This makes it particularly effective for smooth, low-dimensional functions with curved valleys (like Rosenbrock’s function).

Simplex Operations

Given vertices: [best, ..., worst] and centroid C of [best, ..., second_worst]

Reflection:   R = C + alpha * (C - worst)      # Try opposite side
Expansion:    E = C + gamma * (R - C)          # Go further if R is good
Contraction:  K = C + beta * (worst - C)       # Pull back toward center
Shrink:       x_i = best + 0.5 * (x_i - best)  # Contract entire simplex

Parameters

Parameter	Type	Default	Description
`alpha`	float	1.0	Reflection coefficient (typically 1.0)
`gamma`	float	2.0	Expansion coefficient (typically 2.0)
`beta`	float	0.5	Contraction coefficient (typically 0.5)

Standard Coefficients

The default values (alpha=1.0, gamma=2.0, beta=0.5) are the original Nelder-Mead coefficients and work well for most problems. Adjust only if you have specific needs:

Larger alpha/gamma: More aggressive exploration
Smaller beta: More conservative contraction
0.5 shrink factor: Standard (not configurable in GFO)

Example

import numpy as np
from gradient_free_optimizers import DownhillSimplexOptimizer

def rosenbrock(para):
    x, y = para["x"], para["y"]
    return -((1 - x)**2 + 100 * (y - x**2)**2)

search_space = {
    "x": np.linspace(-5, 5, 100),
    "y": np.linspace(-5, 5, 100),
}

opt = DownhillSimplexOptimizer(
    search_space,
    alpha=1.0,   # Standard reflection
    gamma=2.0,   # Standard expansion
    beta=0.5,    # Standard contraction
)

opt.search(rosenbrock, n_iter=500)
print(f"Best: {opt.best_para}, Score: {opt.best_score}")

When to Use

Good for:

Low-dimensional problems (< 10 dimensions)
Smooth objective functions
When you want a deterministic, geometry-based approach
Problems where function evaluations are cheap

Not ideal for:

High-dimensional spaces (simplex has n+1 vertices)
Noisy objective functions
Functions with many local optima
Discrete or categorical parameters

Comparison with Other Local Methods

vs. Hill Climbing	Simplex uses n+1 points vs. neighborhood sampling. Better for smooth functions, worse for discrete spaces.
vs. Powell’s Method	Simplex optimizes all dimensions jointly, Powell optimizes sequentially. Simplex is better for coupled parameters.

Initialization Note

The Downhill Simplex requires n+1 initial points. In GFO, these are generated from the initialization strategy (grid, random, vertices, or warm_start). Ensure you have at least n+1 initial positions:

n_dims = len(search_space)

opt = DownhillSimplexOptimizer(
    search_space,
    initialize={"random": n_dims + 1}  # Ensure enough initial points
)

3D Example with Warm Start

import numpy as np
from gradient_free_optimizers import DownhillSimplexOptimizer

def rosenbrock_3d(para):
    x, y, z = para["x"], para["y"], para["z"]
    return -(
        (1 - x)**2 + 100 * (y - x**2)**2
        + (1 - y)**2 + 100 * (z - y**2)**2
    )

search_space = {
    "x": np.linspace(-5, 5, 200),
    "y": np.linspace(-5, 5, 200),
    "z": np.linspace(-5, 5, 200),
}

opt = DownhillSimplexOptimizer(
    search_space,
    alpha=1.0,
    gamma=2.0,
    beta=0.5,
    initialize={"warm_start": [{"x": 0.5, "y": 0.5, "z": 0.5}],
                 "random": 3},
)

opt.search(rosenbrock_3d, n_iter=1000)
print(f"Best: {opt.best_para}")
print(f"Score: {opt.best_score}")

Trade-offs

Exploration vs. exploitation: Downhill Simplex is primarily exploitative. The simplex explores locally through its geometric operations but has no mechanism for large jumps to unexplored regions.
Computational overhead: Requires n+1 points per dimension, which means overhead grows linearly with dimensionality.
Parameter sensitivity: The standard coefficients (alpha=1, gamma=2, beta=0.5) work well for most problems. Rarely needs tuning.