Mixture design: A review of recent applications in the food industry

Abstract
Design of experiments (DOE) is a systematic approach to applying statistical methods to the experimental process. The main purpose of this study is to provide useful insights into mixture design as a special type of DOE and to present a review of current mixture design applications in the food industry. The theoretical principles of mixture design and its application in the food industry, based on an extensive review of the literature, are described. Mixture design types, such as simplex-lattice, simplex-centroid, D-optimal and crossed mixture, are compared in terms of their characteristics and advantages. Multi-response optimization and the application of some heuristics and software packages are discussed. This review focuses on the more specialized and novel food applications in the recent literature.


Introduction
Statistical techniques, from the simplest to the most complex, have been widely used in businesses. One of the leading approaches among statistical techniques is experimental design.
Experimental design is a systematic approach to applying statistical methods to experimental processes in order to improve input-output factors and process parameters. That is, experimental design is usually used as a methodology for selecting the levels of the independent factors that provide the least variation in the required quality characteristic. Experimental design is also a powerful tool for fitting experimental data to an empirical function to provide information about a system. An experiment can be designed with a certain number of factors at a predetermined number of levels based on the observed product or process. Experimental design may be undertaken for a variety of purposes, such as optimizing the accuracy with which the model parameters are estimated (e.g., D-optimal designs) or optimizing the predictive accuracy of the model (e.g., V-optimal designs).
The experimental design process consists of several distinct steps. In the DOE approach, the experimenter must first determine the purpose of the experiment. For example, is it an exploratory experiment conducted to determine which factors or factor levels affect the outcome? Is the experiment intended to determine how to optimize the response(s) of a process? Different experimental designs have different purposes and are developed with different levels of prior knowledge about the product or process being studied. The experimenter must also identify both the independent factors and the response factors of interest. The permissible ranges of the independent factors must be determined, and this determination defines the experimental region. The response surface methodology (RSM) also requires that the experimenter determine the nature of the model that will be fit to the experimental data. If a polynomial model is used, its degree must be specified, and if a more fundamental model is to be used, its form must be specified in more detail. Once these tasks have been accomplished, the experimenter can undertake the process of experimental design to determine the levels of the independent factors at which the experiment should be conducted.
An integrated mathematical and statistical technique for experimental design, model building and determining the effects of independent factors is the response surface methodology [1]. RSM reduces the number of experimental trials required in multi-factor experiments. Additionally, depending on the preferences of the experimenter about the increments of the input parameters, the relevant responses can be optimized by considering criteria such as the most desired value (the target value), maximization or minimization. RSM is an effective method for analyzing and determining effects in multi-factor experiments. The use of RSM in the determination of a polynomial equation was described by Yin, Chen and Gu [2]. Before applying RSM, an experimental design is developed as the initial step that defines which experiments should be performed.
In many cases, the determination of proportions is important for obtaining the desired output. Mixture design, a special type of RSM, is a very effective method of determining the proportions of the variables (ingredients) of a blend. The output varies depending on the proportions, but the total of the proportions remains constant at 1. Although no multipurpose technique is known to be applicable to all situations, mixture designs have been successfully applied in scientific research and development and in real-world problems.
Cafaggi, Leardi, and Parodi [3] published a tutorial to show the application of mixture design to a pharmaceutical formulation. The statistical studies on mixture experiments published between 1955 and 2004 were summarized by Piepel [4]. Bezerra et al. [5] and Leardi [6] explained the uses of RSM in chemistry and discussed its advantages. In addition, Leardi [6] discussed three real-world examples and emphasized the importance of mental attitude in experimental design.
Mixture design is important in industries such as paint, glass, ceramic frits and polymers and is extremely important in the food industry. To the best of our knowledge, a review of mixture design applications in the food industry does not exist in the literature. The main purposes of this study are to present a review of recent mixture design applications in the food industry and to highlight the applicability and efficiency of mixture design in the food industry. This study also seeks to provide guidance to researchers and useful insights into the capabilities of mixture design as a special type of DOE. Three experimental design types, namely simplex-lattice, simplex-centroid, and D-optimal, are compared in terms of their characteristics and efficiency. Single-response optimization and multi-response optimization achieved by applying desirability functions and other techniques are also discussed. Section 2 of this study presents a discussion of the various mixture design models that can be used for food industry applications. Software is indispensable in many analyses. The software available for finding, fitting and analyzing the models is discussed in the third section. Various optimization tools and techniques are also described in this section. In Section 4, methods for the verification of models are described. The last section covers possible future applications and conclusions.

Types of mixture design
Mixture design is defined as a special type of RSM in which the factors are the components of a mixture and the response varies as the proportions vary, i.e., the response is affected by the variation of the proportions [7], [8]. The proportions sum to one. In such cases, a standard design approach is not appropriate and cannot be applied. A q-component mixture is shown in Eq. 1:
$$\sum_{i=1}^{q} x_i = 1, \qquad x_i \ge 0, \quad i = 1, \dots, q \tag{1}$$
where $x_i$ represents the proportion of the i-th component in the mixture. The q components form a regular (q-1)-dimensional simplex.
The choice of an appropriate mixture design requires taking several points into account, such as the number of factors and interactions to be studied, the complexity of each design, the statistical validity and effectiveness of each design, and the ease of implementation and the cost and time constraints associated with each design. In the following subsections, the most frequently used mixture design types are described.

Simplex-Lattice design
One of the most widely used mixture design types is the simplex-lattice design, which is defined as follows: a {q, m} simplex-lattice design for q components consists of points defined by the following coordinate settings: the proportions assumed by each component take the m + 1 equally spaced values from 0 to 1 given in Eq. 2,
$$x_i = 0, \frac{1}{m}, \frac{2}{m}, \dots, 1, \qquad i = 1, \dots, q \tag{2}$$
and all possible combinations (mixtures) of these proportions that sum to one are used. In general, the number of points in a {q, m} simplex-lattice design is
$$\frac{(q + m - 1)!}{m!\,(q - 1)!}$$
Detailed explanations of simplex-lattice design can be found in Cornell [7]. In Figure 1, various {q, m} simplex-lattice designs with q components are shown [9].
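The construction of a simplex-lattice can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `simplex_lattice` is hypothetical, and it assumes the standard {q, m} definition in which each proportion takes the values 0, 1/m, 2/m, ..., 1 and only the combinations that sum to one are kept.

```python
from fractions import Fraction
from itertools import product
from math import comb


def simplex_lattice(q, m):
    """Return the points of a {q, m} simplex-lattice as tuples of Fractions."""
    points = []
    # Each numerator k stands for the proportion k/m, with k = 0, 1, ..., m.
    for numerators in product(range(m + 1), repeat=q):
        if sum(numerators) == m:  # keep only blends whose proportions sum to 1
            points.append(tuple(Fraction(k, m) for k in numerators))
    return points


design = simplex_lattice(3, 2)
for point in design:
    print([str(x) for x in point])

# The number of points agrees with (q + m - 1)! / (m! (q - 1)!).
assert len(design) == comb(3 + 2 - 1, 2)
```

For three components and m = 2, this yields the three pure blends and the three 50:50 binary blends.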

Simplex-Centroid design
An alternative to the simplex-lattice design is the simplex-centroid design [10]. A simplex-centroid mixture design is applicable when all of the components have the same range (between 0 and 1) and no constraints on the design space exist. A center-point run with equal amounts of all the ingredients is always included.
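As a rough illustration, the points of a simplex-centroid design can be enumerated as follows. The sketch assumes the usual construction, in which every non-empty subset of the q components is blended in equal proportions, giving 2^q - 1 points; the function name `simplex_centroid` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def simplex_centroid(q):
    """Return the 2**q - 1 points of a simplex-centroid design."""
    points = []
    for size in range(1, q + 1):
        # Every subset of `size` components is blended in equal shares.
        for subset in combinations(range(q), size):
            share = Fraction(1, size)
            points.append(
                tuple(share if i in subset else Fraction(0) for i in range(q))
            )
    return points


design = simplex_centroid(3)
for point in design:
    print([str(x) for x in point])
```

The last point generated is the overall centroid, i.e., the center-point run with equal amounts of all ingredients.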

Other types of mixture design
Mixture experiments typically involve additional complications. In many mixture designs, the restrictions on the component proportions take the form of lower ($L_i$) and upper ($U_i$) limit constraints.
The general form of the constrained mixture problem is shown in Eq. 3:
$$\sum_{i=1}^{q} x_i = 1, \qquad 0 \le L_i \le x_i \le U_i \le 1, \quad i = 1, \dots, q \tag{3}$$
I-, D- and G-optimality are widely used for constrained mixture problems. I-optimality focuses on minimizing the average scaled prediction variance over the design region. G-optimality focuses on minimizing the maximum prediction variance over the design region. D-optimal design focuses on estimating the best possible model coefficients, especially for constrained mixture regions. This can be accomplished by minimizing the determinant of $(X'X)^{-1}$, where $X$ is the matrix of the appropriate component proportions and possibly cross-products between the proportions, depending on the model, and $X'$ is the transpose of $X$ [11].
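The D-criterion can be illustrated with a small, self-contained sketch. Minimizing the determinant of the inverse information matrix is equivalent to maximizing the determinant of X'X, which is what the code below compares for two candidate designs. Everything here is an assumption made for illustration: the helper names, the constraint 0.2 ≤ x_i ≤ 0.6 on each of three components, and the two candidate designs are hypothetical and are not taken from any of the reviewed studies.

```python
from fractions import Fraction as F
from itertools import combinations


def model_matrix(design, quadratic=False):
    """Scheffe model terms per run: x_i, plus x_i * x_j if quadratic."""
    rows = []
    for point in design:
        row = list(point)
        if quadratic:
            row += [a * b for a, b in combinations(point, 2)]
        rows.append(row)
    return rows


def information_matrix(X):
    """Compute X'X."""
    p = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]


def determinant(M):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    A = [[F(v) for v in row] for row in M]
    n, det = len(A), F(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return F(0)  # singular matrix
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det  # a row swap flips the sign
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return det


def d_criterion(design, quadratic=False):
    """det(X'X); larger is better under D-optimality."""
    return determinant(information_matrix(model_matrix(design, quadratic)))


# Hypothetical constrained region: 0.2 <= x_i <= 0.6 for all three components.
vertices = [(F(3, 5), F(1, 5), F(1, 5)),
            (F(1, 5), F(3, 5), F(1, 5)),
            (F(1, 5), F(1, 5), F(3, 5))]
interior = [(F(2, 5), F(3, 10), F(3, 10)),
            (F(3, 10), F(2, 5), F(3, 10)),
            (F(3, 10), F(3, 10), F(2, 5))]

print(d_criterion(vertices), d_criterion(interior))
```

For the linear Scheffé model, the runs at the extreme vertices of the constrained region give a much larger det(X'X) than runs clustered near the centroid, which is why exchange algorithms tend to select boundary points.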

Mixture design applications
In this part, a review of current mixture design applications in the food industry is presented, organized by the basic steps of a mixture design application: choosing the design type, optimization and verification.

Design type
Many applications of the simplex-lattice design in the optimization of industrial procedures can be found in the literature. Appendix A lists a limited number of recent applications of simplex-lattice in the food industry, as examples of the use of this type of design. These recent applications are classified according to food type, factors handled in the mixture models and the objective of the study. Additionally, some examples of simplex-centroid designs are summarized in Appendix B.
Kappele [12] and Verbeken, Thas, and Dewettinck [13] used I-optimal mixture design to demonstrate that I-optimal designs are superior to conventional designs for industrial experiments. I-optimal designs are superior to conventional designs because they provide narrower confidence limits on predictions, on average, producing higher-quality predictions of product performance. By minimizing the maximum prediction error within the experimental region, using graphs to identify the design with the highest reliability, I-optimal design allowed the best interpretation of the experimental data.
Bezerra et al. [14] optimized the proportions of the components of the liquid phases of slurries by constrained mixture design. Additionally, some variables such as sample mass, sonication time and acid concentrations were considered.
Zhou et al. [15] used D-optimal mixture design to optimize the formulation of the predominant strains isolated from Tibetan kefir grains. Tibetan kefir grain is a massive, milky white, plastic-like, and grain-like object on which a variety of microorganisms can grow in milk and which can break up to form new grains with the same characteristics as the old ones. To obtain the optimal formulation of pure cultures in Tibetan kefir, the influences of different mixtures of five strains of the cultures on the flavor components of fermented milk were studied using mixture design. Previous research conducted in this area using conventional approaches rather than mixture design involved a number of experiments that failed to identify the best combination of the cultures. D-optimal mixture design was also used to optimize the composition of a culture medium for selenium-enriched yeast production [2]. The aim of the optimization was to find the most significant factors affecting the biomass yield and total selenium yield. The results showed that mixture design is an effective and reliable technique for determining the optimal ratio of components in the fermentative medium.
Sometimes it is necessary to combine a mixture design with process variables or with another mixture design. Process variables are factors in an experiment that are not mixture components but affect the blending properties of the mixture components. A so-called mixture-process design can be constructed to optimize both the components of a mixture and the process variables [16]. One approach involves setting up a mixture design for each combination of the process variables. A second approach involves creating a factorial arrangement of the process variables for each combination of the mixture components. Examples of applications of these types of mixture design in the food industry are described below.
Dingstad, Egelandsdal, and Naes [17] described a crossed mixture design in which two mixture designs were combined (mixtures of mixtures), in a manner similar to that used in a mixture-process situation. A case study in sausage production was described. Both the quantity and source (different types of meat) of the biochemical components were considered. The quantity reflects minor components such as protein and fat that are the same for the different sources. The original focus of this study was on the interpretation issues and modeling methodology for crossed mixtures. The intra-mixture and inter-mixture behavior of the two mixtures was interpreted using contour plots.
A mixture-process experiment design was used in a study by Ketelaere, Goos, and Brijs [18] because there were six mixture components that summed to one and one additional uncontrollable process variable. The relationship between the flour characteristics and bread quality was investigated. This study investigated which of 30 flour samples or which combinations of them should be chosen to create an optimal mixture. The experimental region was highly constrained, and the classical mixture design process was not appropriate. The optimal design approach involved using two different algorithms, one based on a coordinate exchange algorithm and the other based on a point exchange algorithm. The results showed that the first algorithm outperformed the second in finding the best approximation to a D-optimal design.
There are situations in the food sector in which various modeling methods can be integrated with different types of mixture designs. A good example is the study by Didier et al. [19], in which artificial neural networks (ANN) coupled with a crossed mixture design were used to test different blends.

Optimization and software
Optimization is the process of discovering where the best values lie. The use of common-sense and trial-and-error methods to obtain the best mixture can be time-consuming and expensive if the results are not achieved after more than two repetitions [20].
On the other hand, finding optimal solutions using RSM is relatively simple for the researcher for a single response. This approach involves the use of regression models that contain cross-product and/or high-order terms. Occasionally in real-life applications, it is necessary to optimize several responses simultaneously. Focusing on a single objective may result in poor performance in many areas. However, if tradeoffs among several responses are enabled, a design that is close to the optimum can be achieved.
Inspecting contour plots of the response surface is a visual and straightforward way of interpreting mixture models [7]. Contour plots were used to interpret the model and draw conclusions about the intra- and inter-mixture behavior of the two mixtures investigated in the sausage production experiment described earlier [17]. It is also possible to use contour plots for simple optimization. The regions between the upper and lower limits for acceptable responses can be shaded and used as maps to identify acceptable solutions. It is also possible to combine these maps with other properties such as price, density and/or color. In many optimization cases, several responses must be considered simultaneously. Multi-criteria methods [21] come into play in problems involving the optimization of several responses, when the responses are considered at the same time and it is important to find optimal compromises among all of the responses taken into account [5].
One of the most popular and most frequently used approaches to simultaneous optimization is the desirability function approach [22]. Cornell [7] also provides detailed information on the application of desirability functions for mixture experiments. Individual goals are combined into a single objective measure to be maximized using a geometric mean function. It is possible to obtain an overall desirability from the individual desirabilities. Optimization is performed through the generation of a global desirability function, which is assigned a value ranging from 0 to 1.
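The combination of individual desirabilities into an overall desirability can be sketched as follows. This is a minimal illustration, not the procedure of any particular software package: the one-sided linear ramp used for the individual desirabilities is one common choice and is an assumption here, as are the function names, the limits and the response values; only the geometric-mean combination and the 0 to 1 range come from the description above.

```python
import math


def desirability_maximize(y, low, high, weight=1.0):
    """Larger-is-better: 0 at or below `low`, 1 at or above `high`."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** weight


def desirability_minimize(y, low, high, weight=1.0):
    """Smaller-is-better: 1 at or below `low`, 0 at or above `high`."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - low)) ** weight


def overall_desirability(values):
    """Geometric mean of the individual desirabilities."""
    if any(d == 0 for d in values):
        return 0.0  # one unacceptable response makes the blend unacceptable
    return math.exp(sum(math.log(d) for d in values) / len(values))


# Hypothetical blend: a sensory score to maximize and a cost to minimize.
d_score = desirability_maximize(7.0, low=5.0, high=9.0)
d_cost = desirability_minimize(2.0, low=1.0, high=5.0)
print(d_score, d_cost, overall_desirability([d_score, d_cost]))
```

Because the geometric mean is used, a blend with any individual desirability of zero receives an overall desirability of zero, regardless of how well it performs on the other responses.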
The use of simultaneous optimization in the paint [23] and ceramics industries [21] has been discussed recently. In the optimization step, a weight coefficient that reflects the degree of importance of each characteristic in the product is allocated to each response. Simultaneous optimization is used to find the optimum proportions of components.
Some applications of desirability functions in the optimization of multi-response procedures are described in the food literature, although the optimization of multiple responses remains a concern. Dingstad, Egelandsdal, and Naes [17] used contour plots to interpret their model; these plots made it possible to draw conclusions about the intra- and inter-mixture behavior of the two sausage ingredient mixtures investigated. Bezerra et al. [14] fitted a quadratic model and its contour graph to the overall desirability. Di Monaco et al. [24] used contour plots for each of several sensory attributes and lipid types.
Most researchers use specialized software for mixture design, especially for creating designs that include constraints. Even in a single study, various software packages may be used for different purposes. In terms of design and analysis, while most statistical packages can be used in classical DOE methodologies, mixture design requires specialized software. Appendix C lists the types of software mentioned in the food literature as having been used for mixture design.
Azevedo et al. [25] used cubic models to find the optimum proportions of ingredients. The results were analyzed using the Statistica software, version 7.0, to visualize the fitted model via contour plots. Bautista-Gallego et al. [26], [27] used both Statistica and Design Expert for data processing. The nonlinear module of Statistica was used to fit microbial growth, decay and third-order acid-formation kinetic models. Additionally, various diagnostic tests such as those for the presence of outliers, Cook's distance, and leverage, as well as graphs of the residual vs. predicted values, were used to check the adequacy of the model. As in prior studies, ternary graphs were used to represent any mixture of three components in a triangular coordinate system. García-García and Totosaus [28] and Liu et al. [29] used the PROC ANOVA procedure to detect significant differences via Duncan's multiple range test and the PROC GLM procedure to analyze regression equations using the Statistical Analysis System (SAS), version 8.0.
Di Monaco et al. [24] used superimposed contour plots for each sensory attribute and lipid type to simultaneously optimize responses using Design Expert, by means of a desirability function. Significant differences between the observed and predicted values were determined using paired t-tests with the help of SPSS version 13.0.
Zorba and Kurt [30] determined the optimum proportions of beef, chicken and turkey meats in a mixture using superimposed contour plots produced with the JMP software. In a study by Karaman, Yilmaz, and Kayacier [31], the JMP statistical software was used to develop prediction equations and to estimate the ridges of the maximum and minimum responses. The correlations among the parameters were determined using MINITAB Release 13. Fustier et al. [32] used the ADX menu in the SAS/QC module to develop an experimental design. The effects of the isolated flour fractions in the blends were compared using contour plots. Yin et al. [2] used Design Expert to develop a six-level, three-factor design to evaluate the combined effects of germinated brown rice juice, beerwort and soybean sprout juice. The interrelations and interactions were identified using contour maps for the effects of independent variables on biomass yield and total selenium yield.
Dooley, Threlfall, and Meullenet [33] analyzed samples with JMP for ANOVA, with the treatment and the panelist or consumer as the main effects, using the mean separation from Fisher's least significant difference test. The advantage of using this software was that the ternary plots enabled three mixture factors to be viewed simultaneously; for more than three factors, the software allowed the user to choose which factors to plot. Mixture design analysis and the use of the desirability function were also performed with JMP. In further research, other hedonic responses can be considered to obtain more accurate results.
Zhou et al. [15] used the Design Expert software to predict the expected responses. The experiments began with a steepest-climb design from a random mix and continued until the target response values were reached. After establishing the regression model and analyzing the cross-product terms, the researchers arrived at an effective mixed fermented starter involving several species. The results of this study show that the mixture design methodology is an effective method of optimizing test ingredient combinations using fewer resources than those required by classical approaches.
Mali et al. [34] evaluated the effects of cassava starch, sugarcane bagasse fibers, and polyvinyl alcohol on selected properties of extruded foams. The experiments were designed and analyzed using Statistica version 6.0. The regression coefficients of the equations were calculated and an analysis of variance was conducted. All of the models were found to be statistically significant and to have satisfactory coefficients of determination ($R^2$). Marafon et al. [35] used Statistica version 8.0 to analyze a model of the kinetics of acidification of milk.
Nikzade, Tehrani, and Saadatmand-Tarzjan [36] used overlay plots to determine the optimum values of responses for a set of combinations of xanthan gum, guar gum, and mono- and diglycerides. The optimum solution was shown by a clear flag on the overlay plot. In a study by Liu et al. [29], quadratic polynomial equations were fitted to data, and contour plots were generated. Contour plots for each response were superimposed to obtain the optimum region by considering several responses.
Despite the common usage of statistical packages, most of the studies on experimental design in the food industry have paid no attention to the regression assumptions that are so important and can be easily analyzed using statistical software. The usual assumptions in regression are also made in experimental design of mixtures. In general, because the points in mixture designs can differ substantially in their leverage values, the analysis of studentized residuals is recommended rather than the ordinary least-squares residuals in mixture experiments [9]. This type of analysis can be performed using statistical software such as Design Expert. Additionally, the assumed independence of the residuals should be checked by the Durbin-Watson statistical test, and the autocorrelation function of residuals should be checked using Minitab, MATLAB, JMP, Excel, etc. Lastly, the constant variance assumption should be checked by examining the plot of the residuals versus the fitted values for each model. If these assumptions are not valid, the accuracy of the results is questionable.
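Two of the checks mentioned above can be illustrated without any statistical package. The sketch below computes internally studentized residuals and the Durbin-Watson statistic from a vector of residuals; it assumes that the residuals and the leverage values (the diagonal of the hat matrix) have already been obtained from the fitted mixture model, and the function names and numbers are purely hypothetical.

```python
import math


def studentized_residuals(residuals, leverages, n_params):
    """Internally studentized residuals: e_i / sqrt(MSE * (1 - h_ii))."""
    n = len(residuals)
    mse = sum(e ** 2 for e in residuals) / (n - n_params)
    return [e / math.sqrt(mse * (1.0 - h)) for e, h in zip(residuals, leverages)]


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals (in run order) and leverages from a 3-term model.
residuals = [0.4, -0.3, 0.1, -0.2, 0.3, -0.3]
leverages = [0.7, 0.7, 0.7, 0.3, 0.3, 0.3]  # leverages sum to the number of terms

print(studentized_residuals(residuals, leverages, n_params=3))
print(durbin_watson(residuals))
```

Dividing each residual by its own standard error puts high-leverage points, such as the vertices of the simplex, on the same scale as low-leverage interior points, which is the reason studentized residuals are preferred for mixture experiments.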

Model verification
The verification of the model requires a confirmation experiment. The verification is completed in a few steps. First, one point is selected from the sample space of the data, and the response is predicted using the model. Second, the prediction interval is calculated, and a confirmation experiment is conducted at the selected point. Third, if a measured new observation falls into the prediction interval, it is concluded that the model predicts the response value well [37].
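The three verification steps can be sketched as a short check. The interval below uses the usual least-squares form for a single new observation, ŷ ± t · s · sqrt(1 + h0), which is an assumption here since the reviewed studies do not state their formulas; s is the residual standard deviation, h0 is the leverage of the confirmation point, and the critical t value is supplied by the user (for example from a t table). All names and numbers are hypothetical.

```python
import math


def prediction_interval(y_hat, s, h0, t_crit):
    """Prediction interval for one new observation at the chosen blend."""
    half_width = t_crit * s * math.sqrt(1.0 + h0)
    return y_hat - half_width, y_hat + half_width


def confirms(y_new, interval):
    """True if the confirmation run falls inside the prediction interval."""
    low, high = interval
    return low <= y_new <= high


# Hypothetical values: predicted response 10.0, residual standard deviation 0.5,
# leverage 0.2 at the confirmation point, and t = 2.571 (95 %, 5 degrees of freedom).
interval = prediction_interval(y_hat=10.0, s=0.5, h0=0.2, t_crit=2.571)
print(interval)
print(confirms(10.9, interval))  # observation from the confirmation experiment
```

If the confirmation run falls outside the interval, the fitted model should not be trusted at that blend without further investigation.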
Yin et al. [2] performed experiments to verify the model that they had developed. Confirmatory trial results were found to be reasonably close to the predicted results, yielding a good fit between the observed and predicted values. Di Monaco et al. [24] investigated the optimal solution for their problem by setting the best sensory performance as the goal for each response. The solutions with the highest desirability values were selected for verification. The results of paired t-tests indicated that there were no significant differences between the predicted and observed values.
As the above discussion indicates, a limited number of studies have included verification experiments to increase the applicability of optimum results to real-life problems. It is suggested that the importance of confirmation should be considered in future research.

Conclusions and potential future applications
This review discusses applications of the available mixture design approaches in the food industry. From our analysis of the studies conducted in the food industry, the following conclusions can be drawn:
1. The application of the mixture design methodology to optimization problems in the food industry is common because of the methodology's ability to provide effective information from a small number of experiments and to evaluate the interactions among variables.
2. Mixture design, which is a special type of experimental design, is becoming more popular with the increasing popularity of specialized statistical software. The superiority of this methodology, compared to traditional techniques, hinges on the choice of an appropriate design, the ability to fit an adequate mathematical function, and the ability to evaluate the quality of the fitted model. Furthermore, the choice of factors and levels is as important as the choice of design.
3. An optimal design methodology requires specification of a model form that contains sufficient terms to allow adequate consideration of all of the response variables. According to Piepel, Szychowski and Loeppky [38], the complete Scheffé quadratic (CSQ) model, which contains q linear and q(q − 1)/2 quadratic blending terms, is widely used. Some authors in the food area have used CSQ or Scheffé linear models with cross-product terms, but Piepel, Szychowski and Loeppky [38] proposed the use of partial quadratic mixture (PQM) models consisting of linear terms augmented with appropriate subsets of squared and cross-product terms. Researchers can compare these approaches to obtain the most useful possible results from their studies.
4. Compared to other DOE approaches, heuristics and algorithms have seen little application in the food industry. Lejeune [39] proposed an algorithmic process involving the use of a one-exchange algorithm and generalized simulated annealing for the construction of complex designs.
An examination of results reported in the literature indicates that simultaneous integration of these algorithms is very effective. According to Didier et al. [19], the ANN methodology permits the modeling of complex relationships, especially nonlinear ones, without complicated equations. Neural networks combined with experimental design, which is an alternative to classical modeling, can be used to model relationships among variables. Mannarswamy, Munson-McGee, and Andersen [40] used a pick-and-exchange algorithm to determine the D-optimal design. In future research, more attention should be paid to heuristics.
Most of the studies conducted in the food industry have paid no attention to the usual assumptions associated with regression; future studies should check these assumptions using the diagnostic capabilities of statistical software. The desirability function has seen only limited use in multi-response optimization in the food field, and few studies have verified the optimal combinations of mixtures. The use of linear/nonlinear programming for multi-response optimization remains a challenge in this area. In addition, optimum solutions should be confirmed by additional experiments.

References
Appendix A
Some applications of simplex-lattice design in the food industry.

Food/Product | Factors | Objective of the Study | Reference
Salep drink | Pine honey, flower honey, highland honey | Investigating the effect of interaction between pine, flower and highland honeys on the rheological properties of salep drink and determining the optimum levels of the honeys to obtain the most acceptable product with respect to the sensory properties studied. | 2011, [31]
 |  |  | 2010, [27]
Chips | Lipids, glucose syrup, albumin | Developing, producing and evaluating new chestnut-based chips to identify optimal formulations in terms of the best sensory performance. | 2010, [24]
Sausages | Locust bean gum, potato starch, κ-carrageenan | Evaluating the interaction effects of potato starch, locust bean gum and κ-carrageenan on cooking yield, expressible moisture, texture and color in low-fat sodium-reduced sausages formulated with potassium and calcium chloride. | 
Meat | Beef, chicken, turkey | Optimizing some emulsion characteristics of beef, chicken and turkey meat. | 2006, [30]

Appendix B
Some applications of simplex-centroid design in the food industry.

Food/Product | Factors | Objective of the Study | Reference
Milk | Peanut milk, soy milk, cow milk | Investigation of the chemical composition and physicochemical properties of soy-peanut-cow milk blends, and the usability of the blends in dairy products. | 
Wine | Cabernet Sauvignon, Merlot, Zinfandel | Optimizing blended Vitis vinifera wines, namely, Cabernet Sauvignon, Merlot and Zinfandel varietals, for consumer acceptability and validating consumer acceptance of the optimized blends compared to the original wines. | 
Mayonnaise | Xanthan gum, guar gum, mono- & diglycerides | Determining the optimized mixture proportions of low-cholesterol, low-fat mayonnaise containing soy milk as an egg yolk substitute with different combinations of xanthan gum, guar gum and mono- and diglyceride emulsifiers to achieve the desired stability, textural and rheological properties and sensory characteristics. | 
Probiotic yoghurts | Skimmed milk powder, whey protein concentrate, sodium caseinate | Optimizing the rheological properties of probiotic yoghurts supplemented with skimmed milk powder, whey protein concentrate and sodium caseinate. | 2011, [35]
Beef patties | α-Tocopherol, tea catechins, carnosine | Evaluating the interactions of three antioxidants, namely, α-tocopherol, tea catechins and carnosine, and their effect on color, lipid stability, metmyoglobin percentage and metmyoglobin-reducing activity in raw beef patties. | 
Biscuits | Gluten, starch, water-solubles | Investigating the effect of varying the ratios of gluten, water-solubles and starch fractions isolated from three different flour grades. | 