optuna.samplers.GPSampler

class optuna.samplers.GPSampler(*, seed=None, independent_sampler=None, n_startup_trials=10, deterministic_objective=False, constraints_func=None, warn_independent_sampling=True)[source]

Sampler using Gaussian process-based Bayesian optimization.

This sampler fits a Gaussian process (GP) to the objective function and optimizes the acquisition function to suggest the next parameters.

The current implementation uses a Matern kernel with nu=2.5 (twice differentiable) with automatic relevance determination (ARD) for the length scale of each parameter. The kernel hyperparameters are obtained by maximizing the marginal log-likelihood given the past trials. To prevent overfitting, a Gamma prior is placed on the kernel scale and the noise variance, and a hand-crafted prior is placed on the inverse squared length scales.
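As a rough illustration of the kernel described above, the following sketch evaluates a Matern nu=2.5 kernel with one inverse squared length scale per parameter. The function name `matern52_ard` and its argument names are hypothetical and chosen for this example only; the sampler's internal parameterization may differ.

```python
import math


def matern52_ard(x, y, inv_sq_lengthscales, kernel_scale=1.0):
    """Matern nu=2.5 kernel with one inverse squared length scale per dimension.

    Illustrative sketch only; names and parameterization are assumptions.
    """
    # Scaled squared distance: each dimension is weighted by its own
    # inverse squared length scale (this is the ARD part).
    d2 = sum(w * (a - b) ** 2 for a, b, w in zip(x, y, inv_sq_lengthscales))
    s = math.sqrt(5.0 * d2)
    # Standard Matern 5/2 form: (1 + s + s^2 / 3) * exp(-s), with s = sqrt(5) * r.
    return kernel_scale * (1.0 + s + s * s / 3.0) * math.exp(-s)


same_point = matern52_ard([0.1, 0.2], [0.1, 0.2], [1.0, 2.0], kernel_scale=3.0)
ignored_dim = matern52_ard([0.0, 0.0], [0.0, 5.0], [1.0, 0.0])
```

A dimension whose inverse squared length scale is close to zero barely affects the kernel value, which is how ARD can down-weight parameters that appear irrelevant to the objective.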

As an acquisition function, we use:

log expected improvement (logEI) for single-objective optimization,
log expected hypervolume improvement (logEHVI) for multi-objective optimization, and
the summation of logEI and the logarithm of the feasible probability with the independent assumption of each constraint for (black-box inequality) constrained optimization.
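The constrained acquisition function in the last item can be sketched as follows, assuming a Gaussian posterior for the objective and for each constraint, and assuming maximization of the objective. All function names here are hypothetical. This is a naive evaluation that takes the logarithm directly; a practical logEI needs a numerically stable formulation, which this sketch does not attempt.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_ei(mean, std, best):
    """Naive log expected improvement over `best` (maximization assumed)."""
    z = (mean - best) / std
    return math.log(std * (z * norm_cdf(z) + norm_pdf(z)))


def log_feasible_prob(c_mean, c_std):
    """log P(c <= 0) for a Gaussian posterior over one constraint value."""
    return math.log(norm_cdf(-c_mean / c_std))


def constrained_acqf(mean, std, best, constraint_posteriors):
    """logEI plus the sum of log feasibility probabilities.

    Summing the logs corresponds to multiplying the probabilities, i.e. the
    constraints are treated as independent of each other.
    """
    return log_ei(mean, std, best) + sum(
        log_feasible_prob(m, s) for m, s in constraint_posteriors
    )


unconstrained = constrained_acqf(0.0, 1.0, 0.0, [])
one_constraint = constrained_acqf(0.0, 1.0, 0.0, [(0.0, 1.0)])
```

In this sketch, a constraint whose posterior mean sits exactly on the feasibility boundary contributes log(0.5) to the acquisition value.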

For further information about these acquisition functions, please refer to the following papers:

Please also check our articles:

The optimization of the acquisition function is performed via:

1. Collect the best parameter set from the past trials,
2. Collect n_preliminary_samples points using Quasi-Monte Carlo (QMC) sampling,
3. Choose the best point from the collected points,
4. Choose n_local_search - 2 points from the collected points using the roulette selection,
5. Perform a local search for each chosen point as an initial point, and
6. Return the point with the best acquisition function value as the next parameter.

Decoupled optimizer updates with batched acquisition function evaluations are used to run a batch of local searches simultaneously, which specifically speeds up Step 5 above.

The following paper details the methodology:

Batch Acquisition Function Evaluations and Decouple Optimizer Updates for Faster Bayesian Optimization

Note that the procedures for setups other than single-objective optimization differ slightly from the single-objective version described above; we omit their descriptions for brevity.
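The single-objective procedure above can be sketched with a toy example on continuous parameters in [0, 1]. This is one reading of the steps, not the sampler's actual code, and it makes several substitutions that should be kept in mind: uniform random sampling stands in for QMC, a simple coordinate-wise hill climb stands in for the local search, the roulette weights are assumed to be proportional to exp(acquisition value), and the starting points are assumed to be the best past point, the best sampled point, and the roulette picks. The names `propose_next` and `hill_climb` are hypothetical.

```python
import math
import random


def hill_climb(acqf, x, step=0.05, n_iter=50):
    """Stand-in local search: coordinate-wise hill climbing on [0, 1]^d."""
    x, fx = list(x), acqf(x)
    for _ in range(n_iter):
        improved = False
        for i in range(len(x)):
            for delta in (-step, step):
                y = list(x)
                y[i] = min(1.0, max(0.0, y[i] + delta))
                fy = acqf(y)
                if fy > fx:
                    x, fx, improved = y, fy, True
        if not improved:
            break  # stop once no single move improves the value
    return x, fx


def propose_next(acqf, best_past, dim, n_preliminary_samples=64, n_local_search=4, seed=0):
    rng = random.Random(seed)
    # Steps 1-2: the best past point plus preliminary samples (uniform, not QMC).
    samples = [[rng.random() for _ in range(dim)] for _ in range(n_preliminary_samples)]
    values = [acqf(x) for x in samples]
    # Step 3: the best of the sampled points.
    best_idx = max(range(len(samples)), key=lambda i: values[i])
    starts = [list(best_past), samples[best_idx]]
    # Step 4: roulette selection without replacement for the remaining starts.
    rest = [i for i in range(len(samples)) if i != best_idx]
    weights = [math.exp(values[i] - values[best_idx]) + 1e-12 for i in rest]
    for _ in range(n_local_search - 2):
        pick = rng.choices(range(len(rest)), weights=weights)[0]
        starts.append(samples[rest.pop(pick)])
        weights.pop(pick)
    # Step 5: local search from every starting point.
    results = [hill_climb(acqf, x) for x in starts]
    # Step 6: return the point with the best acquisition function value.
    return max(results, key=lambda r: r[1])


def toy_acqf(x):
    """Toy acquisition function with its maximum at (0.3, 0.3)."""
    return -sum((xi - 0.3) ** 2 for xi in x)


next_x, next_value = propose_next(toy_acqf, best_past=[0.9, 0.9], dim=2)
```

Here the searches run one after another; the batched variant mentioned above would instead evaluate the acquisition function for all starting points together.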

The local search iteratively optimizes the acquisition function by repeating:

Gradient ascent using L-BFGS-B for continuous parameters, and
Line search or exhaustive search for each discrete parameter independently.

The local search is terminated if the routine stops updating the best parameter set or the maximum number of iterations is reached.

We use line search instead of rounding the results from the continuous optimization since EI typically yields a high value between one grid point and its adjacent grid point.
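A toy example may help show why rounding can go wrong. The acquisition function below is hypothetical and constructed so that its continuous maximum lies between two grid points; an exhaustive search over the grid is used as the simplest form of a discrete search.

```python
def acqf(x):
    """Toy acquisition function that peaks at x = 0.45, between grid points 0 and 1."""
    # The value drops steeply to the left of the peak and gently to the right.
    slope = 10.0 if x < 0.45 else 1.0
    return -slope * (x - 0.45) ** 2


grid = [0, 1, 2, 3]
continuous_optimum = 0.45

rounded = round(continuous_optimum)  # nearest grid point: 0
searched = max(grid, key=acqf)       # best grid point by acquisition value: 1
```

Rounding picks the nearest grid point, while the search over the grid picks the point with the higher acquisition value, and the two do not have to coincide.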

Note

This sampler requires scipy and torch. You can install these dependencies with pip install scipy torch.

Parameters:

seed (int | None) – Random seed to initialize internal random number generator. Defaults to None (a seed is picked randomly).
independent_sampler (BaseSampler | None) – Sampler used for initial sampling (for the first n_startup_trials trials) and for conditional parameters. Defaults to None (a random sampler with the same seed is used).
n_startup_trials (int) – Number of initial trials. Defaults to 10.
deterministic_objective (bool) – Whether the objective function is deterministic or not. If True, the sampler will fix the noise variance of the surrogate model to the minimum value (slightly above 0 to ensure numerical stability). Defaults to False. Currently, all the objectives are assumed to be deterministic if True.
constraints_func (Callable[[FrozenTrial], Sequence[float]] | None) –
An optional function that computes the objective constraints. It must take a FrozenTrial and return the constraints. The return value must be a sequence of floats. A value strictly larger than 0 means that a constraint is violated. A value equal to or smaller than 0 is considered feasible. If constraints_func returns more than one value for a trial, that trial is considered feasible if and only if all values are equal to 0 or smaller.

The constraints_func will be evaluated after each successful trial. The function won’t be called when trials fail or are pruned, but this behavior is subject to change in future releases.
warn_independent_sampling (bool) – If this is True, a warning message is emitted when the value of a parameter is sampled by using an independent sampler, meaning that no GP model is used in the sampling. Note that the parameters of the first trial in a study are always sampled via an independent sampler, so no warning messages are emitted in this case.
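The contract for constraints_func can be sketched as follows. The example assumes the objective function stored its constraint values on the trial with trial.set_user_attr("constraint", ...); the attribute key, the helper `is_feasible`, and the SimpleNamespace stand-in for a FrozenTrial are all choices made for this illustration so that it runs without Optuna installed.

```python
from types import SimpleNamespace


def constraints(trial):
    """Return the constraint values of a finished trial.

    Each value <= 0 means the corresponding constraint is satisfied.
    Assumes the objective stored them under the user attribute "constraint".
    """
    return trial.user_attrs["constraint"]


def is_feasible(constraint_values):
    """A trial is feasible if and only if every value is 0 or smaller."""
    return all(v <= 0 for v in constraint_values)


# Stand-in for a finished FrozenTrial, exposing only user_attrs.
trial = SimpleNamespace(user_attrs={"constraint": (-0.5, 0.0)})
feasible = is_feasible(constraints(trial))

# With Optuna installed, the function would be passed to the sampler like this:
#   sampler = optuna.samplers.GPSampler(constraints_func=constraints)
#   study = optuna.create_study(sampler=sampler)
```

Note that a value of exactly 0 counts as feasible, so the trial above satisfies both constraints.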

Note

Added in v3.6.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v3.6.0.

Methods

`after_trial`(study, trial, state, values)	Trial post-processing.
`before_trial`(study, trial)	Trial pre-processing.
`infer_relative_search_space`(study, trial)	Infer the search space that will be used by relative sampling in the target trial.
`reseed_rng`()	Reseed sampler's random number generator.
`sample_independent`(study, trial, param_name, ...)	Sample a parameter for a given distribution.
`sample_relative`(study, trial, search_space)	Sample parameters in a given search space.

after_trial(study, trial, state, values)[source]

Trial post-processing.

This method is called after the objective function returns and right before the trial is finished and its state is stored.

Note

Added in v2.4.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v2.4.0.

Parameters:

study (Study) – Target study object.
trial (FrozenTrial) – Target trial object. Take a copy before modifying this object.
state (TrialState) – Resulting trial state.
values (Sequence[float] | None) – Resulting trial values. Guaranteed to not be None if trial succeeded.

Return type:

None

before_trial(study, trial)[source]

Trial pre-processing.

This method is called before the objective function is called and right after the trial is instantiated. More precisely, this method is called during trial initialization, just before the infer_relative_search_space() call. In other words, it is responsible for pre-processing that should be done before inferring the search space.

Note

Added in v3.3.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v3.3.0.

Parameters:

study (Study) – Target study object.
trial (FrozenTrial) – Target trial object.

Return type:

None

infer_relative_search_space(study, trial)[source]

Infer the search space that will be used by relative sampling in the target trial.

This method is called right before sample_relative() method, and the search space returned by this method is passed to it. The parameters not contained in the search space will be sampled by using sample_independent() method.

Parameters:

study (Study) – Target study object.
trial (FrozenTrial) – Target trial object. Take a copy before modifying this object.

Returns:

A dictionary containing the parameter names and parameter’s distributions.

Return type:

dict[str, BaseDistribution]
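The order in which the methods above are invoked for one trial can be sketched with a stand-in class. `LoggingSampler` and `run_one_trial` are hypothetical and do not use Optuna; the driver is a simplification that samples all independent parameters up front, and it uses plain placeholders where Optuna would pass Study, FrozenTrial, TrialState, and BaseDistribution objects.

```python
class LoggingSampler:
    """Hypothetical stand-in that records hook calls; not an Optuna class."""

    def __init__(self):
        self.calls = []

    def before_trial(self, study, trial):
        self.calls.append("before_trial")

    def infer_relative_search_space(self, study, trial):
        self.calls.append("infer_relative_search_space")
        return {"x": "placeholder distribution"}

    def sample_relative(self, study, trial, search_space):
        self.calls.append("sample_relative")
        return {name: 0.5 for name in search_space}

    def sample_independent(self, study, trial, param_name, param_distribution):
        self.calls.append("sample_independent:" + param_name)
        return 0.0

    def after_trial(self, study, trial, state, values):
        self.calls.append("after_trial")


def run_one_trial(sampler, param_names):
    """Simplified driver showing the call order for a single trial."""
    study, trial = None, None  # placeholders for Study and FrozenTrial
    sampler.before_trial(study, trial)
    space = sampler.infer_relative_search_space(study, trial)
    params = sampler.sample_relative(study, trial, space)
    # Parameters outside the relative search space fall back to independent sampling.
    for name in param_names:
        if name not in params:
            params[name] = sampler.sample_independent(study, trial, name, None)
    value = sum(params.values())  # stand-in for the objective function
    sampler.after_trial(study, trial, "COMPLETE", [value])
    return params


sampler = LoggingSampler()
params = run_one_trial(sampler, ["x", "y"])
```

In this sketch the parameter "y" is not part of the inferred search space, so it is the only one that goes through sample_independent().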