Optimization of model parameters and experimental designs with the Optimal Experimental Design Toolbox (v1.0) exempliﬁed by sedimentation in salt marshes

. Geosciences are a highly suitable ﬁeld of application for optimizing model parameters and experimental designs especially because many data are collected. In this paper, the weighted least squares estimator for optimizing model parameters is presented together with its 5 asymptotic properties. A popular approach to optimize experimental designs called local optimal experimental designs is described together with a lesser known approach which takes into account a potential nonlinearity of the model parameters. These two approaches have been combined with 10 two methods to solve their underlying discrete optimization problem. Allpresented methods were implemented in an open source MATLAB toolbox called the Optimal Experimental Design Toolbox whose structure and handling is described. 15 In numerical experiments, the model parameters and experimental design were optimized using this toolbox. Two existing models for sediment concentration in seawater and sediment accretion on salt marshes of different complexity served as application example. The advantages and disad-20 vantages of these approaches were compared based on these models. Thanks to optimized experimental designs, the parameters of these models could be determined very accurately with signiﬁcantly fewer measurements compared to unop-25 timized experimental designs. The chosen optimization approach played a minor role for the accuracy so that the approach with the least computational effort is recommended.


Introduction
Mathematical models often contain roughly known model 30 parameters which are optimized based on measurements.The resulting accuracy of the model parameters depends on the conditions, also called experimental setups or experimental designs, under which these measurements are carried out.These experimental designs can be optimized so that the re-35 sulting accuracy is maximized.Thus, effort and cost of measurements can be significantly reduced.
The optimization of experimental designs is therefore particularly interesting for geosciences, where much money is spent on data collection.However, few application exam-40 ples exist in this field.See Guest and Curtis (2009) for an overview.This article aims to promote this approach in geosciences and exemplarily apply it to an existing salt marsh accretion model (Schuerch et al. (2013)).
In optimizing experimental design, the main problem is to 45 quantify the information content of the measurements to be planned.In general, this can only be done approximatively.
There are several approaches available.In Section 2, four different approaches to optimize experimental designs together with the weighted least squares estimator for model param-50 eters are presented.Each of these four approaches makes a different trade-off between accuracy and computational effort.
One approach is based on the linearization of the model with respect to the parameters and is the most common used 55 approach called local optimal experimental design.The second more robust approach takes into account a potential nonlinearity of the model parameters.Both approaches are combined with two different approaches of solving the underlying discrete optimization problem.
As far as the authors know, there is no open source software available that can apply all these four approaches.The only software using this robust approach is a closed source software called VPLAN which was introduced in Körkel (2002).For the local optimal approach, several implementations are available but no open source software written in MATLAB.All four approaches, together with approaches to optimize model parameters, were implemented in a MAT-LAB toolbox called the Optimal Experimental Design Toolbox.Its structure and handling is described in Section 3.
We have chosen two models, simulating the suspended sediment concentration on salt marshes during tidal inundation and resulting accretion rates on these marshes (Krone (1987), Temmerman et al. (2003) and Schuerch et al. (2013)), as application examples.Both models are zero-dimensional point models and differ in their complexity and number of parameters.These models can be used as a basis to predict the survival capability of salt marshes under the influence of expected global sea level rise.
To use these models for local salt marshes, their parameters have to be adapted to the local environmental conditions.The required measurements are very time-consuming and costly.Using the presented approaches, these measurements could be carried out much more efficiently.These two models are described together with the attendant numerical experiments and the associated results in Section 4.

Optimization of model parameters and experimental designs
The first step to the optimization of model parameters is the choice of the estimator.This maps the measurement results onto estimated model parameters.These estimated parameters are often defined so that they minimize a so-called misfit function.The misfit function quantifies the distance between the measurement results and the model output.
The estimator should be derived from the statistical properties of the measurement errors, e.g. a maximum likelihood estimator.Often, the measurement errors are assumed to be normally distributed.This leads to the least squares estimators.They are the most widely used class of estimators since their introduction by Gauss and Legendre (Stigler (1981)).
Their simplest form is the ordinary least squares estimator.Its misfit function is the sum of the squares of the differences between each measurement result and the corresponding model output.A generalization is the weighted least squares estimator which has advantages in case of heteroscedastic measurement errors.This estimator and its asymptotic properties are presented in the following subsection.The generalized least squares estimator is a further generalization which takes into account a stochastic dependence of the measurement errors.

The weighted least squares estimator
In the following, the weighted least squares estimator is presented.For this purpose, some notations and assumptions are introduced.
The model function is denoted by Here, Ω x ⊆ R nx is the set of feasible experimental designs and Ω p ⊂ R np the set of feasible model parameters from which the unknown exact parameter vector p ∈ Ω p is to be determined.Often, these sets are defined by lower and upper 120 bounds.
The measurement result for every design x ∈ Ω x is considered as a realization of a random variable η x .Each random variable η x is assumed to be normally distributed with expectation f (x, p) and standard deviation σ x > 0.
Furthermore, these random variables are assumed to be pairwise stochastically independent.
A1b) η x and η x are stochastically independent for every x, x ∈ Ω x .

130
If we consider n ≥ n p measurement results y = (y 1 , . . ., y n ) T ∈ R n with corresponding experimental designs x 1 , . . ., x n ∈ Ω x , the weighted least squares estimation p n and the corresponding estimator P n is defined as where the misfit function ψ n is defined as With the following assumptions, the existence of a minimum is ensured.These asymptotic properties were first proved by Jennrich (1969) for the ordinary least squares estimator and also discussed in Malinvaud (1970) and Wu (1981).In White (1980), these properties were proved for the weighted least squares estimator and for the generalized least squares estimator in White and Domowitz (1984).A good summary for all three can be found in Amemiya (1983).
Consistency means that the estimated parameters converge in probability to the unknown exact parameters as the number of measurements goes to infinity.That is for the weighted least squares estimator P n with the unknown exact model parameters p.
For consistency, the following assumptions are sufficient in addition to the previous assumptions A1 to A3. (Seber and Wild, 2003, page 565) An estimator is asymptotically efficient if its variance converges to the Cramér-Rao bound as the number of measurements goes to infinity.The Cramér-Rao bound (Cramér (1946) and Rao (1945)) is a lower bound for the variance of any unbiased estimator.
For asymptotic efficiency, the following assumptions are sufficient in addition to the previous assumptions A1 to A4. (Seber and Wild, 2003, page 571) In this case, the Cramér-Rao bound of the weighted least squares estimator P n is the inverse of the Fisher information matrix M n (p).Under these assumptions, the asymptotic behavior of the weighted least squares estimator can be summarized by its convergence in distribution as follows See, e.g., (Seber and Wild, 2003, chapter 12) and (Walter and Pronzato, 1997, chapter 3).

Optimal experimental designs
The accuracy of the weighted least square estimator P n can 200 be described by its covariance matrix.Due to the asymptotic distribution (2), this can be approximated by the inverse of the information matrix M n (p n ), provided the matrix 205 Therefore, the unknown model parameters can be determined more accurately the smaller the (approximated) covariance matrix of the estimator is.
Criteria φ : R np×np → R + ∪ {∞}, such as the trace or determinant, are used in order to compare these matrices.(See, 210 e.g., El-Monsef et al. (2009) for an overview of various criteria.)If the approximation (3) is used and M n (p n ) is singular, the value of φ is set to infinity.
In the context of optimizing experimental designs, we assume n ≥ 0 measurements have been carried out and designs 215 for additional measurements should be selected from m designs x 1 , . . ., x m ∈ Ω x .The choice for each design x i is expressed by a weight w i ∈ {0, 1} where 1 indicates the selection and 0 the contrary.
Hence, the resulting information matrix, depending on the 220 choice w ∈ {0, 1} m and the parameter vector p n ∈ Ω p , is defined as If the covariance matrix is approximated by the inverse of the information matrix, optimal (additional) designs, with (4) These optimal designs are called local optimal designs because these designs are only optimal regarding the previously model parameter estimation p n and not the unknown exact 230 model parameters p. Potential constraints on the choice of the designs can be realized by constraints on the weight w.For example, the number or the cost of the measurements can be limited by linear constraints on w.These constraints have to be consid-235 ered in the above optimization problem (4).
The formulation (4) is useful if additional experimental designs should be chosen from a finite number of experimental designs.Otherwise, the optimization problem can be reformulated so that the additional optimal design variables have 240 to be optimized directly.

Calculation of optimal experimental designs
A straight-forward way to solve the optimization problem (4) is to test all possible values of w.This direct approach is only practical for small m.
For bigger m, The optimization problem (4) is solved approximately.For this purpose, it is solved in the continuous rather than the discrete setting, i.e., the constraint w ∈ {0, 1} m is relaxed to w ∈ [0, 1] m .Accordingly, the problem arg min (5) is solved.
A possible algorithm to solve this continuous optimization problem is the SQP algorithm which is, e.g., described in (Nocedal and Wright, 1999, chapter 18).
After the continuous problem ( 5) is solved, its solution is projected onto integers with heuristics.An easy way is to round the continuous solution.Another is to sum up all continuous weights and then to choose as many designs with the highest continuous weights.Potential constraints on w still have to be considered by solving the continuous problem and the following projection onto an integer solution.The second heuristic ,e.g., preserves constraints on the number of designs to choose.
Our numerical experiments with the application examples in Section 4 have shown that the solutions of the continuous problem (5) are already close to integer values.This behavior was also observed, for example, in Körkel (2002) and Körkel et al. (2004).

Robust optimal experimental designs
The information matrix M n depends on the estimated parameters p n if the parameters are nonlinear.This may lead to suboptimal designs if ∇ p f ( • , p n ) differs strongly from ∇ p f ( • , p).
For this reason, we now consider a method which takes into account a possible nonlinearity of the parameters.This robust method was presented in Körkel (2002) and Körkel et al. (2004).
The main idea of the method is not to optimize the quality of the covariance matrix for a single parameter vector p n as in (4), but to optimize the worst case quality within a whole domain which contains the unknown exact parameter vector p with high probability.
For this purpose, a confidence region which contains p with probability α ∈ (0, 1) is approximated by Here, γ(α) is the α-quantile of the χ 2 -distribution and v A := √ v T Av denotes the energy norm of the vector v ∈ R np with respect to the positive definite matrix A ∈ R np×np .The approximation of the confidence region arises from linearization of the model function f in point p n and the as- If the worst case quality in the entire region G n (α) shall be optimized, the optimization problem (4) becomes This min-max optimization problem can be solved only 295 with considerable more computational effort compared to the optimization problem (4).In order to reduce this effort, the function φ(M n (w, • ) −1 ) is linearized in point p n in the following way. 300 The resulting inner maximization problem can be solved analytically.It is as can be seen, e.g., in Körkel (2002).With this approach the 310 optimization problem ( 7) is replaced by This optimization problem again can be solved approximatively by solving the corresponding continuous problem and projecting this solution onto an integer solution as de-315 scribed in the previous subsection.
It should be noted that in this approach (8), the first and second derivatives of the model is used.In contrast, only the first derivative is used for local optimal designs (4).

320
A common way to describe the benefit of an experimental design is its efficiency.The efficiency of an experimental design w ∈ {0, 1} m regarding a criterion φ and with n previous measurements is defined as follows.
It should be noted that the searched parameter vector p is used here.If this is not known, thus the efficiency can not be calculated.
The efficiency is always between 0 and 1 and is larger the better the experimental design is.

3 The Optimal Experimental Design Toolbox
We implemented the methods presented in the previous section for optimization of model parameters and experimental designs as a MATLAB toolbox named the Optimal Experimental Design Toolbox.numerical algorithms, especially for optimization.Moreover, MATLAB supports object oriented programming and therefore permits a simple structuring, modification and extension of the implementation.Another advantage of MATLAB is that it can easily interact with C and Fortran.
The toolbox is available at a Git repository (Reimer (2015)) under the GNU General Public License (Foundation (2007)).It includes extensive commented source code, a detailed help integrated in MATLAB and a user manual.

Provision of the model function
For the methods described in Section 2, the model function and its first and second derivative with respect to the model parameters are required.
Actually, the model function is required for the parameter optimization and, depending on the optimization method, also its first derivative.Its first derivative is also required for the experimental design optimization.If the robust method is used also its second derivative is required.
The model interface prescribes how to provide these functions.They need not be written in MATLAB itself, since MATLAB can call functions in C, C++ or Fortran.
The toolbox has several possibilities to provide the derivatives automatically.The model_fd class, e.g., provides the derivatives by approximation with finite differences.If the model function is given as an explicit symbolic function, the model_explicit class can provide the derivatives by symbolic differentiation with the Symbolic Math Toolbox.Listing 1 shows, for example, how a model_explicit object is created.
i n p u t : t h e r i g h t hand s i d e o f t h e d i f f e r e n t i a l e q u a t i o n % 2 .i n p u t : t h e model p a r a m e t e r v a r i a b l e ( s ) % 3 .i n p u t : t h e model f u n c t i o n v a r i a b l e % 4 .i n p u t : t h e i n i t i a l v a l u e o f t h e model f u n c t i o n % 5 .i n p u t : t h e d e p e n d e n t v a r i a b l e i n t h e model f u n c t i o n % 6 .i n p u t : t h e i n t e r v a l o f i n t e g r a t i o n % r e t u r n : a model o b j e c t which i m p l e m e n t s t h e model i n t e r f a c e The class takes advantage of the fact that the integration and differentiation of the differential equation can be interchanged if the model function is sufficiently often continuously differentiable.Required derivatives of the differential equation and initial value are calculated again by symbolic differentiation with the Symbolic Math Toolbox.The resulting initial value problems are solved with MATLABs ode23s function which can also solve stiff problems.Because the arising initial value problems for the derivatives are mutu-380 ally independent, the solutions of the initial value problems can be calculated in parallel using the Parallel Computing Toolbox.

Setup of the solver
Methods for the optimization of model parameters and ex-385 perimental designs are provided by the solver class.First, a solver object has to be created and the necessary informations have to be passed.
On the one hand, this is the model which has to be set by the set_model method (see Listing 3).

] ) % i n p u t : t h e i n i t i a l e s t i m a t i o n o f t h e model p a r a m e t e r s
Potential accomplished measurements can be set via the set_accomplished_measurements method.These measure-395 ments consist of the corresponding experimental designs together with their variances of the measurement errors.Also the measurement results themselves have to be passed for a parameter estimation (see Listing 5).Finally, if an optimization of experimental designs shall be 400 performed, the selectable measurements have to be set by the set_selectable_measurements method (see Listing 6).These measurements consist of the experimental designs and the variances of the measurement errors again.The get_optimal_measurements method can solve the optimization problem directly by trying all possible combinations or approximatively.
For the approximative solving, the continuous problem is solved with the SQP algorithm (see (Nocedal and Wright, 1999, Chapter 18)) provided by the fmincon function of the Optimization Toolbox.Its solution is projected onto an integer solution by the second heuristic described in 2.4.
The first derivative of the objective function is provided in analytical form.This saves much of the computing time compared to derivatives calculated by finite differences.The Hessian matrix is approximated by the BFGS-update (Broyden (1970), Fletcher (1970), Goldfarb (1970) and Shanno (1970)).
Matlab's SQP algorithm can recover from infinity.If an infinite function value is reached during the optimization, the algorithm attempts to take a smaller step.Thus, if the optimization is started with a regular design, singular designs do not make any trouble.The get_optimal_parameters method uses the Trust-Region-Reflective (Coleman and Li (1994) and Coleman and Li (1996)) or the Levenberg-Marquard algorithm (Levenberg (1944), Marquardt (1963) and Moré (1977)) provided by the lsqnonlin function of the Optimization Toolbox to solve the least squares problem resulting from the parameter estimation.The first derivative of the objective function is also provided analytically.Furthermore, the expected quality of the resulting parameter estimation for any selection of experimental designs can be calculated using the get_quality method of the solver object.Thus, for example, the increase in quality by adding or removing experimental designs can be determined.

Execution time and memory consumption
The total time required for the optimization of the model parameters or an experimental design depends crucially on the time required for evaluating the model function and its first and second derivative with respect to the model parameters.
When optimizing model parameters, the model function and its first derivative has to be evaluated several times with different model parameter vectors at the accomplished measuring points.When optimizing experimental designs, the model function and its first and second derivative has to be 455 evaluated for one model parameter vector but at the accomplished and selectable measuring points.
Generally, the execution time increases with the number of parameters, the number of selectable measurements and the number of accomplished measurements.

460
The implementation of this toolbox favors a low execution time of a low memory consumption.For this reason, (intermediate) results within a method call and between successive method calls are saved and reused.An example are multiple occurring matrix multiplications within a method call.An-465 other example is a re-optimization of designs with other constraints, such as another maximum number of allowed measurements.Here, the derivatives of the model function calculated in the previous optimization is reused.
Due to the described caching strategy, the total memory 470 consumption depends linearly on the number of (accomplished and selectable) measurements and quadratically on the number of parameters.Nevertheless, it should be possible to solve problems with hundreds of parameters and thousands of measurements on a standard computer.475

Changeable options
Many settings for the optimization of experimental designs or model parameters are changeable.These can be altered by the set_option method of the solver object (see Listing 9).The desired options can be set using property-value pairs, as 480 already known from MATLAB.Estimation method: The estimation method for the quality of experimental designs can be selected by the estimation_method option.The standard point estimation method and the robust region estimation method, 485 both presented in Section 2, are supported.The region estimation method is the default setting.
Confidence level: The level of confidence for the confidence region at the region estimation method, represented by α in Section 2.5, can be set by the alpha option.The default value is 0.95.
Prior parameter estimation: It can be chosen whether a parameter optimization should be performed before optimizing experimental designs.This can be set by the parameter_estimation option and the values yes or no.

495
To save computational time no previous parameter optimization is performed by default.
Quality criterion: The quality criterion, which is applied to the covariance matrix and represented in Section 2.1 as φ, can also be chosen with the criterion option.The criterion interface prescribes the syntax of the criterion function and its necessary derivatives.The trace of the covariance is the default criterion and implemented by the criterion_A class.
Parameter scaling: It can be chosen whether model parameter should be scaled before optimizing experimental designs or the model parameters themselves.Scaling means a uniform impact of all model parameters and is enabled by default.The options are edo_scale_parameters and po_scale_parameters with the values yes and no.
Optimization algorithm for experimental design: The exact and the approximative approach for the optimization of experimental design problem can be chosen with the ed_algorithm option and the values direct and local_sqp.For time reasons by default the experimental design problem is solved by the approximative approach.Furthermore, the number of function evaluations and iterations by the SQP algorithm can be constrained by the options ed_max_fun_evals and ed_max_iter.
Optimization algorithm for parameter estimation: The optimization algorithm for the parameter estimation problem can be chosen with the po_algorithm option.
The Trust-Region-Reflective algorithm is the default algorithm.Furthermore, the number of function evaluations and iterations can be limited through the options po_max_fun_evals and po_max_iter.

Help and documentation
The Optimal Experimental Design Toolbox also provides an extensive integrated help.Besides system requirements and version informations, a user's guide with a step by step instruction how to optimize experimental designs and model parameters is included.Demos show how to work with the toolbox in practice.In addition, a detailed description for every class and method is available.The layout of the help of the Optimal Experimental Design Toolbox is based on the design of the help also used by MATLAB and other toolboxes.Thus the user does not have to get used to a new layout.

Application examples
In this section, numerical experiments together with their results regarding the optimization of model parameters and experimental designs are presented for two models of different complexity.Both models describe the sediment con-550 centration in seawater during tidal inundation of coastal salt marshes.
Coastal salt marshes have an important ecological function with their diverse flora and as a nursery for migratory birds.Furthermore they have the ability of dissipating current and 555 wave energy and therefore reducing erosional forces at dikes and coastal areas.
With these models, the vertical accretion of coastal salt marshes can be predicted.When considering expected global sea level rise (IPCC ( 2013)), the future ability of coastal salt 560 marshes to adapt to rising sea levels and thus to survive can be estimated.Depending on this, measures to protect these salt marshes can be taken.
Calibration of the model parameters requires measurements of suspended sediment concentration during tidal in-565 undation, which are time-consuming and laborious.For this reason, it is advantageous to know under which conditions and how many measurements should be carried out.

The models
Both models are zero-dimensional point models, which de-570 scribe the sediment concentration in seawater during tidal inundation of coastal salt marshes.The first model (C 2 -model) has two model parameters, was described in Temmerman et al. (2003) and adapted for a salt marsh in the German Wadden Sea (South-Eastern North Sea), located near Hoer-575 num in the southern part of the island of Sylt (Germany), by Schuerch et al. (2013).The second model (C 3 -model) has three model parameters, is an extension of the first model and subject of current research.

The C 2 -model 580
The first model is called the C 2 -model.Here, the sediment concentration in kg m 3 is modeled by the function C : [t S , t E ) → R + .Furthermore, t S is the start time of the inundation of the salt marsh and t E the end time.The concentration C is given implicit as solution of the initial value Here, C 0 ≥ 0 is the initial sediment concentration of the flooding seawater and w S ≥ 0 the settling velocity of the sus-pended sediment in m s .Moreover, the function describes the time-dependent water surface elevation and E the elevation of the marsh both in meters and relative to a fixed datum.Here, a, b and x 0 are constants describing the change in the water level, h M HW the mean high water level and h HW the high water level of a certain tidal inundation in meters.The start and end time t S and t E of the inundation are the points where the height h equals the elevation of the marsh E.
The sediment concentration C thus decreases continuously within a tidal cycle depending on the settling velocity w S which is described by the term 10).During the flood phase, the reduced sediment concentration is partially compensated by new inflowing sea water.This is described by the term in the first case of ( 10).The values used in the water surface elevation function h, for the local salt marsh, are shown in Table 1.These have been estimated by non-linear regression analysis using local historic tide gauge data from 1999-2009 (at Hoernum Hafen, Germany).The continuous high-resolution (6 minutes) time series has, therefore, been split into the individual tidal cycles beforehand (Schuerch et al. (2013)).The high water level h HW of the current tidal inundation is measured or taken from predictions.
The initial sediment concentration C 0 and the settling velocity w S are only roughly known and therefore model parameters.Reference values derived from literature values and typical ranges can be found in Table 2. See Bartholdy and Aagaard (2001) for C 0 and Temmerman et al. (2003) for C 0 and w S .
Where k ≥ 0, r ≥ 0 and s ≥ 0 are unknown model parameters.Reference values derived from literature values and typical ranges (where available) can be found in Table 3. See van Leussen (1999) and Pejrup and Mikkelsen (2010) for the settling index s and Temmerman et al. (2004) for k.

635
In this model, a linear relationship between the initial sediment concentration and the high water level is assumed, where during heavy flooding a higher sediment concentration is assumed (Temmerman et al. (2003) and Schuerch et al. (2013)).Additionally, a relationship between the initial 640 sediment concentration and the settling velocity is assumed (Krone (1987)).This is an empirical approximation of the so-called flocculation process (Burt (1986)).

Numerical experiments
We performed several numerical experiments to compare the 645 benefit of optimized with unoptimized measurement conditions.Also, the benefit of different approaches to optimization measurement conditions was compared.Using these results, an appropriate approach for the optimization of conditions for real measurements was selected.

650
The approaches introduced in Section 2 and implemented by the Optimal Experimental Design Toolbox described in Section 3 were used for the numerical experiments.For that, we used the model_ivp class which allows to calculate the solution of an initial value problem and its first and sec-655 ond derivatives with respect to the model parameters.The C 2 -model was implemented by the model_C2 class and the C 3 -model by the model_C3 class which is a subclass of the model_C2 class.
For our numerical experiments, we used the model output 660 with the reference parameters in Tables 2 and 3 plus an additive normally distributed measurement error with zero expectation as artificial measurement results.As standard deviation of the measurement error, we once chose 10 −2 and once 10 −1 .

665
In our numerical experiments, we alternately selected a fixed number of experimental designs and estimated the model parameters with corresponding measurement results.We carried out each experiment ten times and averaged the results to minimize the influence of randomness.
For the parameter estimation, the start values and bounds in Tables 2 and 3 were used.The bounds were chosen so that the typical range of values is covered, but also more extreme values are possible.The starting values were chosen slightly outside the typical ranges to represent a poor initial guess.
The experimental designs for these models consist of the time point of the measurement and the high water level of the tidal inundation.A set of thirty selectable experimental designs was specified.They were obtained by combining three different high water levels of the tidal inundation (1.5m, 2.0m and 2,5m) with ten time points equidistantly spread over the inundation period.
For choosing the experimental designs, we compared the standard and the robust approach presented in Section 3 with the trace as quality criterion together with uniformly distributed experimental designs.In the robust approach, a confidence level of 95% was used.The optimization problems for the experimental designs were once solved exactly and once approximatively.(See Section 2.4.)To evaluate all these methods, we compared the resulting parameter estimations with the reference model parameters.
We further investigated whether the number of measurements after which new experimental designs are optimized had an impact on the accuracy of the parameter estimation.For this purpose, different numerical experiments were performed where the parameters and experimental designs have been optimized after each one, three resp.five measurements.Altogether fifty measurements were simulated at each experiment with the C 2 -model.For the C 3 -model, hundred and fifty measurements were simulated at each experiment since the model is more complex and therefore a sufficiently accurate estimation of its parameters might be more difficult.

Accuracy of the parameter estimations
In this subsection, we compare the accuracy of the parameter estimations resulting from the previously described numerical experiments.Some results are illustrated in Figures 10 and 11  The accuracy of the parameter estimations for the C 2 -model only improved marginally after four to twelve measurements 710 independently of the choice of the experimental designs.The accuracy improved faster the more frequently the experimental designs and parameters were optimized.However, the best achieved accuracy was independent of the frequency.
With uniformly distributed experimental designs the best 715 achieved accuracy was slightly worse than with optimized experimental designs.Additional four to six more measurements were needed compared to optimized experimental designs in order to achieve their accuracy.
Although the parameters occur nonlinearly in this model, it made close to no difference whether the standard or the robust approach for the optimization of the experimental designs was used.
The approximate solving of the discrete optimization problem has resulted in a slightly worse accuracy at the first 725 iterations compared to the exact solving.Thereafter, the difference was very small.The solutions of the relaxed continuous optimization problems were almost always nearly integer.
The different standard deviations of the measurement er-730 rors only influenced the best achieved accuracy which was of course worse at a higher standard deviation.This can be explained by the fact that different constant standard deviations only mean a different scaling of the objective of the experimental design optimization problem.Thus, different constant 735 standard deviations do not affect its solution.After ten to twenty-five measurements, the accuracy of the parameter estimations for the C 3 -model with optimized experimental designs only improved slightly.Again, the accuracy improved faster, the fewer measurements were performed per iteration and the best achieved accuracy was independent of the number of measurements per iteration.
With uniformly distributed experimental designs, the best accuracy was achieved after twenty-four to sixty measurements.Furthermore, the best achieved accuracy was worse by about a factor of ten compared to the best accuracy achieved by (standard) optimized experimental designs.
The standard approach for optimizing experimental designs resulted in a slightly better accuracy compared to the robust approach.
For both approaches, the difference between the accuracy achieved with the exact solutions of the discrete optimization problem and the accuracy achieved with the approximate solutions was small but recognizable and almost constant over the iterations.Also in these experiments, the solutions of the relaxed continuous optimization problems were almost all nearly integer.
Again, the different standard deviations of the measurement errors only influenced the best achieved accuracy.

Conclusions regarding the approach for optimizing experimental designs
Optimized experimental designs provided a much more accurate parameter estimation than uniformly distributed experimental designs independent of the chosen optimization approach.Furthermore, only about half as many measurements were needed to archive the same accuracy with optimized experimental designs as with uniformly distributed experimental designs.In the more complex model, the difference was even bigger.
The robust approach achieved no higher accuracy compared to the standard approach.In the complex model, the robust approach was even slightly less accurate.This may indicate that the gain in accuracy by taking into account the nonlinearity is offset by the additional approximations in the 775 robust approach.Since a considerably higher computational effort is associated with the robust approach, the standard approach should be preferred, at least for these models.
The exact solutions of the discrete optimization problems yielded only slightly better accuracy gains compared to its 780 approximate solutions.The fact that the approximate solutions were almost all nearly integer also argues for the approximate solving.This circumstance was also observed in Körkel (2002) and Körkel et al. (2004).For these reasons and because the exact solving requires much more computa-785 tional effort, the approximate solving should be preferred, at least for these models.

Efficiency for the experimental designs
We also calculated the efficiencies of the used experimental designs.Some results are illustrated in Figures 12 and 13    The results emphasized the already seen importance of the optimization of the experimental designs.In particular, the advantage in the case of few measurements carried out so far was highlighted.Again, the slight advantage of the standard approach over the robust approach was visible.With 795 increasing number of accomplished measurements, the selection strategy of new measurements became less important as the amount and thus the influence of the new measurements compared to those of the accomplished measurements decreased.

Distribution of optimal measuring points
In this subsection, we compare the distribution of the measuring points optimized in the previously described numerical experiments.Graphical representation of the distribution of the measuring points from some numerical experiments are shown in Figure 14 and 15.The optimized measuring points were almost exclusively located at the start and end of the inundation periods.At the start of the inundation period, both approaches in the exact variant favored lower high water levels unlike both approaches in the approximate variant which favored higher high water levels.At the end of the inundation period, the standard approach in both variants favored lower high water levels unlike the robust approach in both variants which favored higher high water levels.For the C 3 -model the optimized measuring points accumulated at the end of the inundation periods.All approaches favored lower high water levels.With an increasing number of measurements per iteration the robust approach in both variants also prefered measurements in the middle of the inundation periods with the highest high water level.The experiments with the C 3 -model showed that here measurements at end of the inundation periods should be preferred.In this model, the concentration at the start is no parameter but is affected by a parameter that also influences the settling velocity.For this reason, measurements are not 845 suggested at the start.
For both models the high water level seemed to play a minor role for the choice of measuring points.
As a rule of thumb one can say that measurements should be carried out at the end of an inundation period and also 850 some at the start if the C 2 -model is used.

Conclusions
In this paper we presented two different approaches for optimizing experimental design for parameter estimations.One method was based on the linearization of the model with re-855 spect to its parameters, the other takes into account a possible nonlinearity of the model parameters.Both methods were implemented in our presented Optimal Experimental Design Toolbox for MATLAB.
By employing the presented approach for two existing salt 860 marsh models, we showed that model parameters can be determined much more accurately if the corresponding measurement conditions were optimized.Especially for timeconsuming or costly measurements, it is therefore useful to optimize the measurement conditions with the Optimal Ex-865 perimental Design Toolbox.This gain in accuracy is not limited to the application examples.In general, using the implemented methods, the accuracy of the parameters of any model can be maximized while minimizing the measurement cost, provided that the 870 related assumptions are fulfilled.However, the required execution time for the optimization increases with the number of model parameters and (accomplished and selectable) measurements.Parallelization techniques in the optimization as well as in the model evaluation itself can counteract this effect.
In addition to the parallelization, the optimization in the toolbox could also be extended to techniques of globalization, so that the chance of getting into a local minimum is reduced.
The results concerning the application examples have not significantly differed in spite of various approaches for optimizing experimental design.For this reason, the approach with the least computational effort is recommended.However, this recommendation can not be applied readily to other (more complex) models.Here, the performance of the approaches should be compared again if possible.
Furthermore, it has been found that measurements at the beginning and end of the inundation period are particularly important for the application examples.The high water level of the inundation seemed to play a minor role.These results, however, can not be applied easily to other models.Usually, a separate optimization of experimental design makes sense here.

Code availability
The Optimal Experimental Design Toolbox is available under the GNU General Public License (Foundation (2007)) at a Git repository (Reimer (2015)).Besides the toolbox, including commented source code and a user manual, also an implementation of the application examples is available.

225
respect to a criterion φ, are expressed by a solution of arg min w∈{0,1} m φ(M n (w, p n ) −1 ).

Figure 1 .
Figure 1.Create a model with a symbolic model function m o d e l _ o b j e c t = m o d e l _ e x p l i c i t ( ' p* t ˆ2 ' , ' p ' , ' t ' ) % 1 .i n p u t : t h e model f u n c t i o n a s s y m b o l i c f o r m u l a % 2 .i n p u t : t h e p a r a m e t e r v a r i a b l e ( s ) % 3 .i n p u t : t h e e x p e r i m e n t a l d e s i g n v a r i a b l e ( s ) % r e t u r n : a model o b j e c t which i m p l e m e n t s t h e model i n t e r f a c e

Figure 2 .
Figure 2. Create a model with a model function given as solution of an initial value problem m o d e l _ o b j e c t = m o d e l_ i v p ( '−y+( t +1)*b ' , ' [ a , b ] ' , ' y ' , ' a ' , ' t ' , [ 1 , 1 0 ] ) % 1 .in p u t : t h e r i g h t hand s i d e o f t h e d i f f e r e n t i a l e q u a t i o n % 2 .i n p u t : t h e model p a r a m e t e r v a r i a b l e ( s ) % 3 .i n p u t : t h e model f u n c t i o n v a r i a b l e % 4 .i n p u t : t h e i n i t i a l v a l u e o f t h e model f u n c t i o n % 5 .i n p u t : t h e d e p e n d e n t v a r i a b l e i n t h e model f u n c t i o n % 6 .i n p u t : t h e i n t e r v a l o f i n t e g r a t i o n % r e t u r n : a model o b j e c t which i m p l e m e n t s t h e model i n t e r f a c e 390

Figure 3 .
Figure 3. Set the model s o l v e r _ o b j e c t .s e t _ m o d e l ( m o d e l _ o b j e c t ) % i n p u t : an o b j e c t t h a t i m p l e m e n t s t h e model i n t e r f a c e

Figure 4 .
Figure 4. Set the initial parameter estimation s o l v e r _ o b j e c t .s e t _ i n i t i a l _ p a r a m e t e r _ e s t i m a t i o n ( [ 1 , 2 ] ) % i n p u t : t h e i n i t i a l e s t i m a t i o n o f t h e model p a r a m e t e r s

Figure 5 .
Figure 5. Set accomplished measurements s o l v e r _ o b j e c t .s e t _ a c c o m p l i s h e d _ m e a s u r e m e n t s ( ( 1 : 5 ) ' , 0 .0 1 * o n e s ( 5 , 1 ) , −← e x p ( ( 1 : 5 ) ' ) ) % 1 .i n p u t : t h e e x p e r i m e n t a l d e s i g n s o f a c c o m p l i s h e d measurements % 2 .i n p u t : t h e v a r i a n c e s o f t h e a s s o c i a t e d measurement e r r o r s % 3 .i n p u t : t h e a s s o c i a t e d measurement r e s u l t s

Figure 6 .
Figure 6.Set selectable measurements s o l v e r _ o b j e c t .s e t _ s e l e c t a b l e _ m e a s u r e m e n t s ( ( 6 : 1 0 ) ' , 0 .0 1 * o n e s ( 5 , 1 ) ) % 1 .i n p u t : t h e s e l e c t a b l e e x p e r i m e n t a l d e s i g n s % 2 .i n p u t : t h e v a r i a n c e s o f t h e a s s o c i a t e d measurement e r r o r s

Figure 7 .
Figure 7. Optimize experimental designs o p t i m a l _ m e a s u r e m e n t s = s o l v e r _ o b j e c t .g e t _ o p t i m a l _ m e a s u r e m e n t s ( 3 ) % i n p u t : t h e maximum number o f measurements a l l o w e d % r e t u r n : t h e o p t i m a l s u b s e t o f t h e s e l e c t a b l e measurements w i t h a ← number o f measurements l e s s o r e q u a l t o t h e r e s t r i c t i o n

Figure 8 .
Figure 8. Optimize model parameters o p t i m a l _ p a r a m e t e r s = s o l v e r _ o b j e c t .g e t _ o p t i m a l _ p a r a m e t e r s ( [ 0 , 0 ] , [ 9 , 9 ] ) % 1 .i n p u t : t h e l o w e r bound o f t h e model p a r a m e t e r s % 2 .i n p u t : t h e upper bound o f t h e model p a r a m e t e r s % r e t u r n : a p a r a m e t e r e s t i m a t i o n r e s u l t i n g from t h e a c c o m p l i s h e d ← measurements which t a k e s i n t o a c c o u n t t h e p a s s e d c o n s t r a i n t s

Figure 9 .
Figure 9. Change an option s o l v e r _ o b j e c t .s e t _ o p t i o n ( ' o p t i o n n a m e ' , o p t i o n _ v a l u e ) % 1 .i n p u t : t h e name o f t h e o p t i o n which s h o u l d be changed % 2 .i n p u t : t h e new v a l u e o f t h e o p t i o n -0.2 4 × 10 −6 -4 × 10 −4 start value 5 2 × 10 −7 optimization bound 10 −4 -10 4 10 −8 -1 4.1.2The C 3 -model The second model is an extension of the C 2 -model and is 625 called the C 3 -model.Here the model parameters C 0 and w S are substituted by optimization bound 10 −4 -10 4 10 −8 -1 10 −1 -5 .
Figure 10.Averaged error in the parameter estimation from ten optimization runs with the C2-model and three measurement per iteration with standard deviation 10 −2 of the measurement error.

Figure 11 .
Figure 11.Averaged error in the parameter estimation from ten optimization runs with the C3-model and three measurement per iteration with standard deviation 10 −2 of the measurement error. . 790

Figure 12 .
Figure 12.Averaged efficiency for the experimental designs from ten optimization runs with the C2-model and three measurement per iteration with standard deviation 10 −2 of the measurement error.

Figure 13 .
Figure 13.Averaged efficiency for the experimental designs from ten optimization runs with the C3-model and three measurement per iteration with standard deviation 10 −2 of the measurement error.
Figure 14.Averaged frequency of measurements from ten optimization runs with the C2-model and three measurement per iteration with standard deviation 10 −2 of the measurement error.
Figure 15.Averaged frequency of measurements from ten optimization runs with the C3-model and three measurement per iteration with standard deviation 10 −2 of the measurement error.
showed that measurements at the start and end of the inundation periods should be preferred for the C 2 -model.Measurements at the start of the inundations can be justified by the fact that one parameter of the model is the concen-830 tration at the start of the inundation.The fact that the settling velocity as second model parameter most affects the concentration at the end of the inundations justifies measurements here.This can be confirmed by an examination of the ordinary differential equation of the model derived with respect 835 to the settling velocity.The derivative of the model with respect to the settling velocity is zero at the start of the inundation and is getting smaller the further the inundation progresses.Its absolute greatest value it thus reached at the end of the inundation. 840

Table 1 .
Values used for the water surface elevation function h

Table 2 .
Values for the C2-model

Table 3 .
Values for the C3-model