Surrogate Models and Function Structures
Surrogate Models
The surrogate models in MÆSTRO are available via the functions in maestro.ModelConstruction
and they are stored in the file at path maestro/model.py.
Available surrogate model functions
At this time, the following functions are available in maestro.ModelConstruction:
def appr_pa_m_construct:
def appr_ra_m_n_construct:
def appr_ra_m_1_construct:
All of these functions use apprentice to construct polynomial or rational approximation models.
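To get a feel for what the orders m and n mean in practice, the sketch below counts the coefficients of a polynomial and of a rational approximation. It assumes the models are full multivariate polynomials of total degree at most m (and n for the denominator) in a given number of parameters. That convention and the helper names are assumptions made for illustration; they are not taken from the apprentice API.

```python
from math import comb

def n_poly_coeffs(dim, order):
    """Number of monomials of total degree <= order in dim variables."""
    return comb(dim + order, order)

def n_rational_coeffs(dim, m, n):
    """Coefficient count of a rational model: numerator (order m) plus denominator (order n)."""
    return n_poly_coeffs(dim, m) + n_poly_coeffs(dim, n)

# Two parameters, matching the two-dimensional sample points used on this page
print(n_poly_coeffs(2, 2))         # 6
print(n_rational_coeffs(2, 2, 2))  # 12
print(n_rational_coeffs(2, 2, 1))  # 9
```

Because every coefficient has to be determined from the MC data, these counts give a rough guide to how many sample points each term needs.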
appr_pa_m_construct
This function constructs a polynomial approximation model of order m.
The following model object should be used in the configuration input,
where the data generated by the Monte Carlo simulator is named MC:
"model":{
    "function_str":{
        "MC":"appr_pa_m_construct"
    },
    "parameters":{
        "MC":{"m":2}
    }
}
appr_ra_m_n_construct
This function constructs a rational approximation model of numerator order m and
denominator order n. The following model object should be used in the configuration input,
where the data generated by the Monte Carlo simulator is named MC:
"model":{
    "function_str":{
        "MC":"appr_ra_m_n_construct"
    },
    "parameters":{
        "MC":{"m":2,"n":2}
    }
}
appr_ra_m_1_construct
This function constructs a rational approximation model of numerator order m and
denominator order 1. The following model object should be used in the configuration input,
where the data generated by the Monte Carlo simulator is named MC:
"model":{
    "function_str":{
        "MC":"appr_ra_m_1_construct"
    },
    "parameters":{
        "MC":{"m":2,"n":1}
    }
}
Creating your own surrogate model function
To create your own surrogate model function, you can use the template below with inline comments explaining different lines of the code:
def my_appx_construct(self, data_name):
    """
    In maestro/model.py, create a function with two arguments: self and data_name.
    data_name is the name of the data generated by the Monte Carlo simulator
    that will be passed by self.consturct_models (maestro.ModelConstruction.consturct_models).
    The simulator data is contained in self.mc_data_df, which is a pandas data
    frame that has the following structure:
                MC                                 ...
    term1.P     [[1., 2.],[4., 8.],[12.,9],...]
    term1.V     [19., 18., 17.,...]                ...
    term2.P     [[1., 2.],[4., 8.],[12.,9],...]
    term2.V     [29., 28., 27.,...]
    ...         ...                                ...
    """
    app = {}
    appscaled = {}
    columnnames = list(self.mc_data_df.index)
    import apprentice
    # Build the scaler from the parameter points of the first term
    Sclocal = apprentice.Scaler(self.mc_data_df[data_name]['{}'.format(columnnames[0])],
                                pnames=self.state.param_names)
    # Store the scaled center and the scaled parameter bounds in the state
    self.state.set_tr_center_scaled(Sclocal.scale(self.state.tr_center).tolist())
    self.state.set_scaled_min_max_parameter_bounds(Sclocal.box_scaled[:,0].tolist(),
                                                   Sclocal.box_scaled[:,1].tolist())
    # For each term, e.g., term1, term2, ... (rows alternate between <term>.P and <term>.V)
    for cnum in range(0, len(columnnames), 2):
        X = self.mc_data_df[data_name]['{}'.format(columnnames[cnum])]
        Xscaled = [Sclocal.scale(x) for x in X]
        Y = self.mc_data_df[data_name]['{}'.format(columnnames[cnum+1])]
        model_parameters = self.state.model_parameters[data_name]
        """
        CONSTRUCT MODELS
        This is where your surrogate model construction code should be called, i.e.,
        Use X, Y and model_parameters to construct surrogate models for
        unscaled data and store in unscaled_model_out <any>
        Use Xscaled, Y and model_parameters to construct surrogate models
        for scaled data and store in scaled_model_out <any>
        """
    # Save the surrogate models
    scaled_val_out_file = self.state.working_directory.get_log_path(
        "{}_model_scaled_k{}.<ext>".format(data_name, self.state.k))
    """
    STORE scaled_model_out into scaled_val_out_file
    """
    # Register the file that holds the scaled model
    self.state.update_f_structure_model_parameters('model_scaled', {data_name: scaled_val_out_file})
    unscaled_val_out_file = self.state.working_directory.get_log_path(
        "{}_model_unscaled_k{}.<ext>".format(data_name, self.state.k))
    """
    STORE unscaled_model_out into unscaled_val_out_file
    """
    # Register the file that holds the unscaled model
    self.state.update_f_structure_model_parameters('model', {data_name: unscaled_val_out_file})
Note that you need to replace the CONSTRUCT MODELS and STORE sections
in the code above to complete the model construction function.
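As an illustration of what the CONSTRUCT MODELS and STORE sections might contain, here is a minimal sketch that uses a deliberately trivial nearest-neighbour lookup as the surrogate and JSON as the storage format. The helper names (build_nearest_neighbour_model, evaluate_model, store_model) and the choice of JSON are hypothetical and not part of MÆSTRO or apprentice. A real model function would call a proper fitting routine instead.

```python
import json
import math

def build_nearest_neighbour_model(X, Y):
    """Hypothetical stand-in surrogate: keep the sample points and values as plain lists."""
    return {"X": [[float(v) for v in x] for x in X], "Y": [float(y) for y in Y]}

def evaluate_model(model, p):
    """Return the stored value whose sample point is closest to p."""
    distances = [math.dist(p, x) for x in model["X"]]
    return model["Y"][distances.index(min(distances))]

def store_model(models_by_term, path):
    """Write the per-term models to a JSON file (the STORE step)."""
    with open(path, "w") as f:
        json.dump(models_by_term, f, indent=2)

# One term, using the sample points (P) and values (V) from the docstring above
X = [[1.0, 2.0], [4.0, 8.0], [12.0, 9.0]]
Y = [19.0, 18.0, 17.0]
unscaled_model_out = {"term1": build_nearest_neighbour_model(X, Y)}
print(evaluate_model(unscaled_model_out["term1"], [4.5, 7.0]))  # 18.0
```

Inside the template, the per-term models would be collected in the loop, and store_model would then be called once with scaled_val_out_file and once with unscaled_val_out_file.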
Install the code by typing the following commands:
cd maestro
pip install .
Then the following model object should be used in the configuration input,
where the data generated by the Monte Carlo simulator is named MC:
"model":{
    "function_str":{
        "MC":"my_appx_construct"
    },
    "parameters":{
        "MC":{"key-value pairs required as model_parameter in this model function"}
    }
}
If you want to make your model function publicly available with MÆSTRO, consider submitting a pull request.
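The model objects shown in this section can also be generated programmatically, which helps when several data names are involved. The sketch below is only an illustration: build_model_config is a hypothetical helper, and the check that every data name has a parameters entry is an assumption based on the template's lookup self.state.model_parameters[data_name].

```python
import json

def build_model_config(function_by_data, parameters_by_data):
    """Assemble the "model" object from per-data-name functions and parameters."""
    missing = sorted(set(function_by_data) - set(parameters_by_data))
    if missing:
        raise ValueError("no parameters given for data name(s): {}".format(missing))
    return {
        "model": {
            "function_str": dict(function_by_data),
            "parameters": {name: dict(parameters_by_data[name])
                           for name in function_by_data},
        }
    }

config = build_model_config(
    {"MC": "appr_ra_m_n_construct"},
    {"MC": {"m": 2, "n": 2}},
)
print(json.dumps(config, indent=4))
```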
Function Structure
The f_structure functions in MÆSTRO are available via the functions in maestro.Fstructure
and they are stored in the file at path maestro/fstructure.py.
Available f_structure functions
At this time, the following functions are available in maestro.Fstructure:
def appr_tuning_objective:
def appr_tuning_objective_without_error_vals:
All of these functions use apprentice to construct f_structure function objects.
appr_tuning_objective
The objective function in this object calculates the least squares objective with the error values generated by the simulator. It is defined in terms of the following quantities:
\(N_t\): number of terms e.g., term1, term2, …
\(w_t\): weight for term t
\(M_t(p)\): surrogate model of mean value or the MC mean value for term t evaluated at parameter value p
\(D_t\): data (mean) value for term t
\(\widetilde{M_t}(p)\): surrogate model of error value or the MC error value for term t evaluated at parameter value p
\(\widetilde{D_t}\): data error for term t
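The quantities above fit the usual error-weighted least squares form. The sketch below assumes the objective is \(\sum_{t=1}^{N_t} w_t^2 (M_t(p)-D_t)^2 / (\widetilde{M_t}(p)^2 + \widetilde{D_t}^2)\). The squared weights and the way the two errors are combined are assumptions, so check them against maestro/fstructure.py before relying on the numbers. The function name and the dictionary layout are hypothetical.

```python
def least_squares_objective(model_vals, model_errs, data, weights=None):
    """Error-weighted least squares over all terms (assumed form, see the text above).

    model_vals[t] is M_t(p), model_errs[t] is the error model value at p,
    data[t] is (D_t, data error) and weights[t] is w_t.
    """
    weights = weights or {}
    total = 0.0
    for term, m_val in model_vals.items():
        # Fall back to D_t = 0 and data error 1 (applied per term here, an assumption)
        d_val, d_err = data.get(term, (0.0, 1.0))
        # Fall back to w_t = 1
        w = weights.get(term, 1.0)
        total += w ** 2 * (m_val - d_val) ** 2 / (model_errs[term] ** 2 + d_err ** 2)
    return total

model_vals = {"Term1": 2.0, "Term2": 1.0}
model_errs = {"Term1": 0.0, "Term2": 1.0}
data = {"Term1": (1.0, 1.0), "Term2": (0.0, 1.0)}
print(least_squares_objective(model_vals, model_errs, data))                  # 1.5
print(least_squares_objective(model_vals, model_errs, data, {"Term1": 2.0}))  # 4.5
```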
The following f_structure object should be used in the configuration input:
"f_structure":{
    "parameters":{
        "data":"<Path of the data file, see below>",
        "weights":"<Path of the weight file, see below>",
        "optimization":{
            "nstart":5,
            "nrestart":10,
            "saddle_point_check":false,
            "minimize":true,
            "use_mpi":true
        }
    },
    "function_str":"appr_tuning_objective"
}
Data File
The data file is a JSON file whose keys are the term names and whose values are
arrays [\(D_t,\widetilde{D_t}\)] corresponding to the term \(t\).
If the key data is not specified in the f_structure object, then
\(D_t=0\) and \(\widetilde{D_t}=1\) are assumed for each term \(t\).
An example data file is given below:
{
    "Term1": [ 0.0, 1.0 ],
    "Term2": [ 0.0, 1.0 ],
    "Term3": [ 0.0, 1.0 ]
}
Weight File
The weight file is a tab-delimited file in which the first column contains the
term names and the second column contains \(w_t\) corresponding to the term \(t\).
If the key weights is not specified in the f_structure object, then
\(w_t=1\) is assumed for each term \(t\).
An example weight file is given below:
Term1 1.0
Term2 1.0
Term3 1.0
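A minimal sketch of reading both files with the Python standard library, applying the defaults described above when a path is not given. load_data and load_weights are hypothetical helpers written for this page, not MÆSTRO functions.

```python
import csv
import json

def load_data(path, terms):
    """Return {term: (D_t, data error)}; use (0.0, 1.0) for every term if no file is given."""
    if path is None:
        return {t: (0.0, 1.0) for t in terms}
    with open(path) as f:
        raw = json.load(f)
    return {t: (float(raw[t][0]), float(raw[t][1])) for t in terms}

def load_weights(path, terms):
    """Return {term: w_t}; use 1.0 for every term if no file is given."""
    if path is None:
        return {t: 1.0 for t in terms}
    with open(path, newline="") as f:
        # Each non-empty row is "<term name><TAB><weight>"
        rows = {row[0]: float(row[1]) for row in csv.reader(f, delimiter="\t") if row}
    return {t: rows[t] for t in terms}

terms = ["Term1", "Term2", "Term3"]
print(load_data(None, terms))
print(load_weights(None, terms))
```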
appr_tuning_objective_without_error_vals
The objective function in this object calculates the least squares objective without the error values generated by the simulator. It is defined in terms of the following quantities:
\(N_t\): number of terms e.g., term1, term2, …
\(w_t\): weight for term t
\(M_t(p)\): surrogate model of mean value or the MC mean value for term t evaluated at parameter value p
\(D_t\): data (mean) value for term t
\(\widetilde{D_t}\): data error for term t
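Without the simulator error values, only the data error remains available to normalise each term. The sketch below assumes the form \(\sum_{t=1}^{N_t} w_t^2 (M_t(p)-D_t)^2 / \widetilde{D_t}^2\). The squared weights are an assumption, and the function name is hypothetical.

```python
def least_squares_objective_without_error_vals(model_vals, data_vals, data_errs, weights):
    """Sum of w_t^2 * (M_t - D_t)^2 / (data error)^2 over terms given as parallel lists."""
    return sum(
        w ** 2 * (m - d) ** 2 / e ** 2
        for m, d, e, w in zip(model_vals, data_vals, data_errs, weights)
    )

# Three terms: model values, data values, data errors, weights
print(least_squares_objective_without_error_vals(
    [2.0, 1.0, 0.5], [1.0, 0.0, 0.5], [1.0, 2.0, 1.0], [1.0, 1.0, 3.0]))  # 1.25
```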
The following f_structure object should be used in the configuration input:
"f_structure":{
    "parameters":{
        "data":"<Path of the data file, see below>",
        "weights":"<Path of the weight file, see below>",
        "optimization":{
            "nstart":5,
            "nrestart":10,
            "saddle_point_check":false,
            "minimize":true,
            "use_mpi":true
        }
    },
    "function_str":"appr_tuning_objective_without_error_vals"
}
Data File
The data file is a JSON file whose keys are the term names and whose values are
arrays [\(D_t,\widetilde{D_t}\)] corresponding to the term \(t\).
If the key data is not specified in the f_structure object, then
\(D_t=0\) and \(\widetilde{D_t}=1\) are assumed for each term \(t\).
An example data file is given below:
{
    "Term1": [ 0.0, 1.0 ],
    "Term2": [ 0.0, 1.0 ],
    "Term3": [ 0.0, 1.0 ]
}
Weight File
The weight file is a tab-delimited file in which the first column contains the
term names and the second column contains \(w_t\) corresponding to the term \(t\).
If the key weights is not specified in the f_structure object, then
\(w_t=1\) is assumed for each term \(t\).
An example weight file is given below:
Term1 1.0
Term2 1.0
Term3 1.0
Creating your own f_structure function
To create your own f_structure function, you can use the template below with inline comments explaining different lines of the code:
def my_f_structure_function(self, parameter=None, use_scaled=False):
    """
    In maestro/fstructure.py, create a function with three arguments: self,
    parameter, and use_scaled.
    parameter is an optional parameter value, for use when the recurrence of the
    function needs to be set for faster computation.
    use_scaled specifies whether to use the scaled or the unscaled surrogate
    models in the f_structure function.
    """
    m_type = 'model_scaled' if use_scaled else 'model'
    # get the f_structure parameters
    f_structure_parameters = self.state.f_structure_parameters
    # get the models, one entry per data name
    models = [self.state.f_structure_parameters[m_type][self.state.data_names[i]]
              for i in range(len(self.state.data_names))]
    # CONSTRUCT FUNCTION STRUCTURE OBJECT
    SP = f(models, f_structure_parameters)
    return SP
Note that you need to replace the CONSTRUCT FUNCTION STRUCTURE OBJECT section
in the code above to complete the f_structure object construction function.
Also, the following methods should be callable on SP:
# returns the objective function value using surrogates evaluated at parameter p
SP.objective(p)
# returns the objective function using MC simulator values obtained at parameter p,
# the MC simulator values are passed as a pandas_dataframe with the following
# structure:
#
# MC ...
# term1.P [[1., 2.],[4., 8.],[12.,9],...]
# term1.V [19., 18., 17.,...] ...
# term2.P [[1., 2.],[4., 8.],[12.,9],...]
# term2.V [29., 28., 27.,...]
# ... ... ...
SP.objective_without_surrograte_values(pandas_dataframe)
# returns the gradient of the f_structure function at parameter p
SP.gradient(p)
# runs optimization and returns result where
# result['x'] is the optimal parameter (argmin) and
# result['fun'] is the minimum objective function value (min)
SP.minimize(**self.state.f_structure_parameters['optimization'])
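To make this interface concrete, the sketch below implements objective, gradient and minimize on a toy class. ToyStructure is a hypothetical stand-in and not a MÆSTRO or apprentice class. It assumes that nstart is the number of random candidate points per restart and nrestart is the number of restarts. It uses a finite-difference gradient with plain projected gradient descent, and it simply ignores the remaining optimization options. The method that takes the MC data frame is left out because it depends on the data frame layout.

```python
import random

class ToyStructure:
    """Hypothetical stand-in for an f_structure object."""

    def __init__(self, models, data, weights, bounds):
        self.models = models    # one callable per term, p -> model value
        self.data = data        # data value per term
        self.weights = weights  # weight per term
        self.bounds = bounds    # (low, high) per parameter

    def objective(self, p):
        """Weighted least squares over all terms (squared weights are an assumption)."""
        return sum(w ** 2 * (m(p) - d) ** 2
                   for m, d, w in zip(self.models, self.data, self.weights))

    def gradient(self, p, h=1e-6):
        """Central finite-difference gradient of the objective at p."""
        grad = []
        for i in range(len(p)):
            up, down = list(p), list(p)
            up[i] += h
            down[i] -= h
            grad.append((self.objective(up) - self.objective(down)) / (2 * h))
        return grad

    def minimize(self, nstart=5, nrestart=10, step=0.1, iters=200, seed=0, **_unused):
        """Multi-start projected gradient descent; extra options are accepted but ignored."""
        rng = random.Random(seed)
        best = None
        for _ in range(nrestart):
            # Sample nstart random points and start from the best one
            candidates = [[rng.uniform(lo, hi) for lo, hi in self.bounds]
                          for _ in range(nstart)]
            p = min(candidates, key=self.objective)
            for _ in range(iters):
                g = self.gradient(p)
                # Gradient step, clipped back into the parameter bounds
                p = [min(max(x - step * gx, lo), hi)
                     for x, gx, (lo, hi) in zip(p, g, self.bounds)]
            value = self.objective(p)
            if best is None or value < best["fun"]:
                best = {"x": p, "fun": value}
        return best

SP = ToyStructure(models=[lambda p: p[0], lambda p: 2.0 * p[1]],
                  data=[1.0, 4.0], weights=[1.0, 1.0],
                  bounds=[(-5.0, 5.0), (-5.0, 5.0)])
optimization = {"nstart": 5, "nrestart": 3, "saddle_point_check": False,
                "minimize": True, "use_mpi": False}
result = SP.minimize(**optimization)
print(result["x"], result["fun"])
```

For this toy problem the objective is \((p_0-1)^2 + (2p_1-4)^2\), so the minimizer should end up close to \(p=(1, 2)\) with an objective value close to zero.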
Install the code by typing the following commands:
cd maestro
pip install .
Then the following f_structure object should be used in the configuration input:
"f_structure":{
    "parameters":{
        "key-value pairs required as f_structure_parameters in this f_structure function",
        "optimization":{
            "key-value pairs required by the minimize function"
        }
    },
    "function_str":"my_f_structure_function"
}
If you want to make your f_structure function publicly available with MÆSTRO, consider submitting a pull request.