optuna.samplers.GPSampler

class optuna.samplers.GPSampler(*, seed=None, independent_sampler=None, n_startup_trials=10, deterministic_objective=False, constraints_func=None)[source]

Sampler using Gaussian process-based Bayesian optimization.

This sampler fits a Gaussian process (GP) to the objective function and optimizes the acquisition function to suggest the next parameters.

The current implementation uses a Matern kernel with nu=2.5 (twice differentiable) and automatic relevance determination (ARD) for the length scale of each parameter. The kernel hyperparameters are obtained by maximizing the marginal log-likelihood given the past trials. To prevent overfitting, a Gamma prior is placed on the kernel scale and the noise variance, and a hand-crafted prior is placed on the inverse squared length scales.
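As an illustration of the kernel described above, the following is a minimal sketch of a Matern nu=2.5 kernel with ARD, written in plain Python. The function name matern52_ard and its arguments are hypothetical and are not part of the Optuna API; the sketch only shows how one inverse squared length scale per parameter enters the distance.

```python
import math


def matern52_ard(x1, x2, inv_sq_lengthscales, kernel_scale=1.0):
    """Matern kernel with nu=2.5 and one length scale per dimension (ARD)."""
    # Squared distance, each dimension weighted by its inverse squared length scale.
    sq_dist = sum(w * (a - b) ** 2 for a, b, w in zip(x1, x2, inv_sq_lengthscales))
    r = math.sqrt(5.0 * sq_dist)
    # (1 + sqrt(5) d + 5/3 d^2) * exp(-sqrt(5) d), written in terms of r = sqrt(5) d.
    return kernel_scale * (1.0 + r + r * r / 3.0) * math.exp(-r)


x = [0.0, 0.0]
near = matern52_ard(x, [0.1, 0.0], [1.0, 1.0])
far = matern52_ard(x, [2.0, 0.0], [1.0, 1.0])
# A tiny inverse squared length scale makes the second dimension almost irrelevant.
irrelevant = matern52_ard(x, [0.0, 2.0], [1.0, 1e-6])
```

Here the correlation decays with distance (far is smaller than near), while a move of the same size along a dimension with a near-zero inverse squared length scale barely changes the kernel value.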

As an acquisition function, we use:

  • log expected improvement (logEI) for single-objective optimization,

  • log expected hypervolume improvement (logEHVI) for multi-objective optimization, and

  • the sum of logEI and the logarithm of the feasibility probability, under the assumption that the constraints are independent, for (black-box inequality) constrained optimization.
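The single-objective and constrained cases above can be sketched as follows. This is a naive illustration, assuming maximization and Gaussian posteriors for the objective and for each constraint; all function names are hypothetical. It takes the logarithm of EI directly, which loses precision far below the incumbent, so it should not be read as the numerically stable form a real implementation would need.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_ei(mean, std, best):
    """Naive log expected improvement over the incumbent `best` (maximization)."""
    z = (mean - best) / std
    ei = std * (z * normal_cdf(z) + normal_pdf(z))
    return math.log(ei)


def log_feasibility(constraint_means, constraint_stds):
    """Sum of log P(c_i <= 0), treating the constraints as independent."""
    return sum(
        math.log(normal_cdf(-m / s)) for m, s in zip(constraint_means, constraint_stds)
    )


def constrained_acquisition(mean, std, best, constraint_means, constraint_stds):
    return log_ei(mean, std, best) + log_feasibility(constraint_means, constraint_stds)


# Same predicted objective, but the second candidate is likely to violate a constraint.
likely_feasible = constrained_acquisition(1.2, 0.5, 1.0, [-1.0], [0.5])
likely_infeasible = constrained_acquisition(1.2, 0.5, 1.0, [1.0], [0.5])
```

Because the feasibility term is a sum of log-probabilities, each additional constraint can only lower the acquisition value, and a candidate that is probably infeasible is penalized heavily.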

For further information about these acquisition functions, please refer to the following papers:

The optimization of the acquisition function is performed via:

  1. Collect the best parameter set from the past trials,

  2. Collect n_preliminary_samples points using Quasi-Monte Carlo (QMC) sampling,

  3. Choose the best point from the collected points,

  4. Choose n_local_search - 2 points from the collected points using roulette selection,

  5. Perform a local search for each chosen point as an initial point, and

  6. Return the point with the best acquisition function value as the next parameter.

Note that the procedures for setups other than single-objective optimization differ slightly from the single-objective version described above; their descriptions are omitted for brevity.
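The six steps above can be sketched on a one-dimensional toy problem. Everything in this block is an assumption made for illustration: the van der Corput sequence stands in for QMC sampling, a simple hill climb stands in for the local search, the roulette weights (exponentials of the acquisition values) are a guess, and none of the names belong to the Optuna API.

```python
import math
import random


def van_der_corput(i, base=2):
    """i-th point of the van der Corput low-discrepancy sequence in [0, 1)."""
    value, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        value += digit / denom
    return value


def hill_climb(f, x, step=0.05, n_iter=50):
    """Toy local search on [0, 1]; never returns a point worse than its start."""
    for _ in range(n_iter):
        candidates = [x, min(1.0, x + step), max(0.0, x - step)]
        best = max(candidates, key=f)
        if best == x:
            step *= 0.5  # no neighbour improves, so refine the step
        x = best
    return x


def optimize_acqf(f, best_past_x, n_preliminary_samples=64, n_local_search=4, seed=0):
    rng = random.Random(seed)
    # Step 2: preliminary samples from a low-discrepancy sequence.
    samples = [van_der_corput(i + 1) for i in range(n_preliminary_samples)]
    values = [f(x) for x in samples]
    # Steps 1 and 3: the best past point and the best preliminary sample.
    top = max(range(len(samples)), key=values.__getitem__)
    starts = [best_past_x, samples[top]]
    # Step 4: roulette selection without replacement (weights are an assumption).
    rest = [i for i in range(len(samples)) if i != top]
    weights = [math.exp(values[i] - values[top]) for i in rest]
    for _ in range(n_local_search - 2):
        pick = rng.choices(range(len(rest)), weights=weights, k=1)[0]
        starts.append(samples[rest.pop(pick)])
        weights.pop(pick)
    # Steps 5 and 6: local search from each start, then keep the best result.
    refined = [hill_climb(f, x) for x in starts]
    return max(refined, key=f)


def toy_acqf(x):
    return -((x - 0.3) ** 2) + 0.05 * math.sin(25.0 * x)


next_x = optimize_acqf(toy_acqf, best_past_x=0.9)
```

Starting local searches from several points, rather than only from the best one, reduces the chance of returning a poor local optimum of a multimodal acquisition function.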

The local search iteratively optimizes the acquisition function by repeating:

  1. Gradient ascent using L-BFGS-B for continuous parameters, and

  2. Line search or exhaustive search for each discrete parameter independently.

The local search is terminated if the routine stops updating the best parameter set or the maximum number of iterations is reached.

We use line search instead of rounding the results from the continuous optimization since EI typically yields a high value between one grid point and its adjacent grid point.
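The following toy example illustrates why rounding can be a poor choice. It assumes an integer parameter on the grid 0..4 where k=2 has already been evaluated, and uses a hypothetical EI-like function that is zero at k=2 and peaks between k=2 and k=3. The dense scan is only a stand-in for continuous optimization.

```python
import math


def toy_acqf(k):
    """EI-like toy: zero at the evaluated point k=2, peaking near k=2.4."""
    d = k - 2.0
    return d * math.exp(-2.5 * d) if d > 0.0 else 0.0


grid = [0, 1, 2, 3, 4]
# Stand-in for continuous optimization: dense scan of the relaxed range [0, 4].
relaxed_opt = max((i / 100 for i in range(401)), key=toy_acqf)
# Rounding snaps the relaxed optimum back to the already evaluated grid point.
rounded = round(relaxed_opt)
# Searching the grid directly finds the best value that can actually be suggested.
grid_best = max(grid, key=toy_acqf)
```

In this toy setting the relaxed optimum rounds to k=2, where the acquisition value is zero, while the search over grid points selects k=3.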

Note

This sampler requires scipy and torch. You can install these dependencies with pip install scipy torch.

Parameters:
  • seed (int | None) – Random seed to initialize internal random number generator. Defaults to None (a seed is picked randomly).

  • independent_sampler (BaseSampler | None) – Sampler used for initial sampling (for the first n_startup_trials trials) and for conditional parameters. Defaults to None (a random sampler with the same seed is used).

  • n_startup_trials (int) – Number of initial trials. Defaults to 10.

  • deterministic_objective (bool) – Whether the objective function is deterministic or not. If True, the sampler will fix the noise variance of the surrogate model to the minimum value (slightly above 0 to ensure numerical stability). Defaults to False. Currently, all the objectives will be assumed to be deterministic if True.

  • constraints_func (Callable[[FrozenTrial], Sequence[float]] | None) –

    An optional function that computes the objective constraints. It must take a FrozenTrial and return the constraints. The return value must be a sequence of floats. A value strictly larger than 0 means that a constraint is violated. A value equal to or smaller than 0 is considered feasible. If constraints_func returns more than one value for a trial, that trial is considered feasible if and only if all values are equal to 0 or smaller.

    The constraints_func will be evaluated after each successful trial. The function won’t be called when trials fail or are pruned, but this behavior is subject to change in future releases. Currently, the constraints_func option is not supported for multi-objective optimization.
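A common pattern is to have the objective store the constraint values with trial.set_user_attr() and to read them back in the constraints function. The sketch below shows that function together with the feasibility rule described above. To keep it self-contained, SimpleNamespace objects stand in for FrozenTrial instances, and the attribute key "constraints" and the helper is_feasible are hypothetical names.

```python
from types import SimpleNamespace


def constraints_func(trial):
    """Return the constraint values that the objective stored on the trial."""
    return trial.user_attrs["constraints"]


def is_feasible(constraint_values):
    """A trial is feasible if and only if every value is 0 or smaller."""
    return all(v <= 0.0 for v in constraint_values)


# Stand-ins for FrozenTrial objects; only the user_attrs attribute is used here.
feasible_trial = SimpleNamespace(user_attrs={"constraints": (-0.5, 0.0)})
violating_trial = SimpleNamespace(user_attrs={"constraints": (-0.5, 0.1)})

print(is_feasible(constraints_func(feasible_trial)))
print(is_feasible(constraints_func(violating_trial)))
```

With a real study, a function like this would be passed as constraints_func when constructing the sampler.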

Note

Added in v3.6.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v3.6.0.

Methods

after_trial(study, trial, state, values)

Trial post-processing.

before_trial(study, trial)

Trial pre-processing.

infer_relative_search_space(study, trial)

Infer the search space that will be used by relative sampling in the target trial.

reseed_rng()

Reseed sampler's random number generator.

sample_independent(study, trial, param_name, ...)

Sample a parameter for a given distribution.

sample_relative(study, trial, search_space)

Sample parameters in a given search space.

after_trial(study, trial, state, values)[source]

Trial post-processing.

This method is called after the objective function returns and right before the trial is finished and its state is stored.

Note

Added in v2.4.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v2.4.0.

Parameters:
  • study (Study) – Target study object.

  • trial (FrozenTrial) – Target trial object. Take a copy before modifying this object.

  • state (TrialState) – Resulting trial state.

  • values (Sequence[float] | None) – Resulting trial values. Guaranteed to not be None if trial succeeded.

Return type:

None

before_trial(study, trial)[source]

Trial pre-processing.

This method is called before the objective function is called and right after the trial is instantiated. More precisely, this method is called during trial initialization, just before the infer_relative_search_space() call. In other words, it is responsible for pre-processing that should be done before inferring the search space.

Note

Added in v3.3.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v3.3.0.

Parameters:
  • study (Study) – Target study object.

  • trial (FrozenTrial) – Target trial object.

Return type:

None

infer_relative_search_space(study, trial)[source]

Infer the search space that will be used by relative sampling in the target trial.

This method is called right before the sample_relative() method, and the search space it returns is passed to that method. Parameters not contained in the search space are sampled by the sample_independent() method.

Parameters:
  • study (Study) – Target study object.

  • trial (FrozenTrial) – Target trial object. Take a copy before modifying this object.

Returns:

A dictionary containing the parameter names and their distributions.

Return type:

dict[str, BaseDistribution]

See also

Please refer to intersection_search_space() as an implementation of infer_relative_search_space().
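The idea of an intersection search space can be sketched in plain Python: keep only the parameters that appear, with the same distribution, in every past trial. This is a simplified illustration, not the library implementation; tuples stand in for BaseDistribution objects and the function name is hypothetical.

```python
def intersection_of_params(trial_distributions):
    """Keep parameters that appear with the same distribution in every trial."""
    common = None
    for dists in trial_distributions:
        if common is None:
            common = dict(dists)
        else:
            common = {name: d for name, d in common.items() if dists.get(name) == d}
    return common or {}


# Tuples stand in for distribution objects: (kind, low, high).
past_trials = [
    {"x": ("float", 0.0, 1.0), "depth": ("int", 1, 5)},
    {"x": ("float", 0.0, 1.0)},  # "depth" is conditional and was not suggested here
    {"x": ("float", 0.0, 1.0), "depth": ("int", 1, 5)},
]
search_space = intersection_of_params(past_trials)

# Parameters outside the search space fall back to independent sampling.
requested = ["x", "depth"]
relative_params = [name for name in requested if name in search_space]
independent_params = [name for name in requested if name not in search_space]
```

In this example only x is sampled jointly, while the conditional parameter depth is left to sample_independent().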

reseed_rng()[source]

Reseed sampler’s random number generator.

This method is called by the Study instance if trials are executed in parallel with the option n_jobs>1. In that case, the sampler instance will be replicated including the state of the random number generator, and they may suggest the same values. To prevent this issue, this method assigns a different seed to each random number generator.
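The problem that reseeding solves can be reproduced with the standard library alone. ToySampler below is a hypothetical stand-in, not an Optuna class: copying it also copies the random number generator state, so both copies suggest the same value until each is reseeded.

```python
import copy
import random


class ToySampler:
    """Hypothetical sampler that owns a random number generator."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def suggest(self):
        return self._rng.random()

    def reseed_rng(self):
        # Seeding without an argument draws fresh entropy from the system.
        self._rng.seed()


original = ToySampler(seed=42)
replica = copy.deepcopy(original)  # the RNG state is replicated as well
same_before = original.suggest() == replica.suggest()

original.reseed_rng()
replica.reseed_rng()
same_after = original.suggest() == replica.suggest()
```

Before reseeding, the two instances draw identical values; after reseeding, their streams are independent.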

Return type:

None

sample_independent(study, trial, param_name, param_distribution)[source]

Sample a parameter for a given distribution.

This method is called only for the parameters not contained in the search space returned by sample_relative() method. This method is suitable for sampling algorithms that do not use relationships between parameters, such as random sampling and TPE.

Note

Failed trials are ignored by all built-in samplers when they sample new parameters. Thus, failed trials are regarded as deleted from the samplers’ perspective.

Parameters:
  • study (Study) – Target study object.

  • trial (FrozenTrial) – Target trial object. Take a copy before modifying this object.

  • param_name (str) – Name of the sampled parameter.

  • param_distribution (BaseDistribution) – Distribution object that specifies a prior and/or scale of the sampling algorithm.

Returns:

A parameter value.

Return type:

Any

sample_relative(study, trial, search_space)[source]

Sample parameters in a given search space.

This method is called once at the beginning of each trial, i.e., right before the evaluation of the objective function. This method is suitable for sampling algorithms that use relationships between parameters, such as Gaussian process-based methods and CMA-ES.

Note

Failed trials are ignored by all built-in samplers when they sample new parameters. Thus, failed trials are regarded as deleted from the samplers’ perspective.

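Parameters:
  • study (Study) – Target study object.

  • trial (FrozenTrial) – Target trial object. Take a copy before modifying this object.

  • search_space (dict[str, BaseDistribution]) – The search space returned by infer_relative_search_space().
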
Returns:

A dictionary containing the parameter names and the values.

Return type:

dict[str, Any]