Bayesian optimization of traveling wave-like wall deformation for friction drag reduction in turbulent channel ﬂow

We attempt to optimize the control parameters of traveling wave-like wall deformation for turbulent friction drag reduction using the Bayesian optimization. The Bayesian optimization is an optimization method based on stochastic processes, and it is good at ﬁnding the parameter values to minimize (or maximize) an expensive cost function or a blackbox function. The parameter value to be tested in the next iteration step is chosen based on the acquisition function that accounts for both the exploration term searching in high uncertainty regions and the exploitation term searching in the regions of high possibility over the current best observations. First, we investigate the ef-fectiveness of the Bayesian optimization using a two-parameter test function with known optimum value. As a result, the Bayesian optimization is shown to successfully work. Next, we apply the Bayesian optimization to the control parameters of traveling wave-like wall deformation for friction drag reduction in a turbulent channel ﬂow at the friction Reynolds number of Re (cid:28) = 180. While the wavenumber ( k + x ) is ﬁxed, the velocity amplitude ( v + w ) and the phasespeed ( c + ) are chosen as the variable to optimize. As a result of the Bayesian optimization, although the bulk-mean velocity in the optimized case varies periodically, we achieved the maximum drag reduction rate of 60 : 5% when ( v + w ; c + ) = (10 : 0 ; 42), which is higher than that in the previous study (Nabae et al., 2020), i.e., 36 : 1%. In the optimized case, by repeating laminarization of ﬂow ﬁeld and re-transition to turbulent ﬂow due to the inﬂection instability, the bulk-mean velocity increases and decreases periodically.


Introduction
Reduction of turbulent friction drag is of great importance for solving various environmental problems, e.g, the lack of energy resources. Therefore, different flow control methods have been proposed to date. Flow control methods are classified into passive and active control methods. In the passive control such as the riblets (Walsh, 1983) and polymer additives (White et al., 2008), only the initial setup is required so that it is relatively easier to implement in practice. However, due to the maintenance cost that usually exceeds the drag reduction effect, these passive control methods are still under extensive investigation toward their practical use (e.g., Sasamori et al., 2014;Sasamori et al., 2017). On the other hand, active control methods, which requires the energy input from the external system, are also considered promising candidates because much larger drag reduction effects such as relaminarization of turbulent flows have been reported.
Active control methods can be roughly classified into feedback and predetermined control methods. Feedback control methods have been extensively studied by direct numerical simulation (DNS) in turbulent channel flows at relatively low Reynolds numbers (e.g., Hammond et al., 1998;Lee et al., 1998;Iwamoto et al., 2002) since the pioneering work by Choi et al. (1994) on the opposition control, and about 20% drag reduction effect has been obtained. The effect of feedback control has also been confirmed in an wind-tunnel experiment (Yoshino et al., 2008), in which 6% (with its uncertainty of ±3%) drag reduction effect was attained. Moreover, it has been predicted from a theoretical analysis that the drag reduction effect of such feedback control methods is not much deteriorated even at practically high Reynolds numbers (Iwamoto et al., 2005). However, the big hurdle for such feedback control methods to be used in practice is that the sensors and actuators should have the spatial and temporal scales of the actual quasi-streamwise vortices, e.g., ∼ 0.1 mm and ∼ 1 kHz for bullet trains (Kasagi et al., 2009), and they should be installed over the entire surface as the drag reduction effect is nearly proportional to the controlled area (Fukagata & Kasagi, 2003).
Predetermined control methods which do not use sensors, e.g., traveling wave created by blowing/suction (Min et al., 2006;Mamori et al., 2014) and wall deformation (Nakanishi et al., 2012;Nabae et al., 2020), are expected as the alternative to overcome the aforementioned problems of feedback control. Min et al. (2006) reported in their DNS of turbulent channel flow that sub-laminar drag (i.e., the drag below that of the corresponding laminar flow) could be attained by upstream traveling wave-like blowing and suction, which produce negative Reynolds shear stress in the region near the wall (Mamori et al., 2010). The sub-laminar drag is due to a pumping effect by the traveling wave-like blowing and suction (Hoepffner & Fukagata, 2009), which produces the mean pressure gradient to keep the flow rate constant. On the other hand, Mamori et al. (2014) achieved the relaminarization in a turbulent channel flow by using downstream traveling wave-like blowing and suction, which can be attributed to the less receptively due to the downstream wave of blowing and suction (Lee et al., 2008;Moarref & Jovanović, 2010). Thus, streamwise traveling waves of blowing and suction reduce the turbulent friction drag either by upstream wave or downstream wave by different mechanisms. However, fabricating a practical device for streamwise traveling waves of blowing and suction is still difficult, as Min et al. (2006) also mentioned in their paper.
As an alternative, Nakanishi et al. (2012) considered a streamwise traveling wave of wall deformation ( Fig. 1) and investigated its effect by DNS in turbulent channel flow under a constant flow rate (CFR) condition at the bulk Reynolds number of Re b = 5600, which corresponds to the friction Reynolds number of the uncontrolled flow of Re τ ≃ 180. They reported relaminarization of flow under some parameter sets and achieved the maximum drag reduction rate of 69%. Nabae et al. (2020) revisited the same problem under more scalable condition, which is the constant pressure gradient (CPG) condition. They could not confirm the relaminarization due to the difference between CFR and CPG conditions but achieved the drag reduction of about 40% at the maximum. The reason for this difference in the drag reduction effect between CFR and CPG conditions is discussed in detail in Nabae et al. (2020). In these studies, however, different control parameters are examined based on the parametric study but they are not always optimized because of the high computational cost of DNS. Hence, it is preferable if the control parameters can be optimized using an efficient optimization method so that the drag reduction effect of traveling wave-like wall deformation can be maximized.
Hundreds of optimization methods proposed so far are roughly classified into the global optimization methods and the local optimization methods. Although the possibility of finding the optimum is high, the global optimization methods, e.g., the grid search and the genetic algorithms, usually take a long time for optimization. On the other hand, in the local optimization, e.g. the gradient descent method, the computational time required for optimization is shorter than that of the global optimization methods, but the risk to fall into the local solution is higher. In order to overcome these disadvantages, we attempt to use the Bayesian optimization (Brochu et al., 2010) to optimize the control parameter of traveling wave-like wall deformation. The Bayesian optimization is a method based on the stochastic processes. Using the acquisition function that accounts for both the exploration term searching in high uncertainty regions and the exploitation term searching in the wav e prop agat ion  regions of high possibility over the current best observation, the parameter value to be tested in the next iteration step is determined. Therefore, it can find the optimum in an expensive cost function or a blackbox function, and it is also widely used for parameter tuning in machine learning (Snoek et al., 2012). The remainder of this paper is organized as follows. The Bayesian optimization method is explained in Sec. 2. We investigate the effectiveness of the Bayesian optimization method via optimization of a test function in Sec. 3. Optimization of the control parameter of traveling wave-like wall deformation is presented in Sec. 4. Finally, conclusions are drawn in Sec. 5.

Overview
The Bayesian optimization (Brochu et al., 2010) is an optimization method based on stochastic processes, and it can find the minimum (or maximum) of an expensive cost function or a blackbox function. Figure 2(a) shows a schematic chart of the Bayesian optimization. As shown in Fig. 2(a), using n observations of an objective function f (x), the mean µ and the variance σ 2 are derived, which are associated with the posterior distribution and the confidence bound of the posterior distribution, respectively. The existence probability at each point follows the Gaussian distribution N, whose mean and variance are µ and σ 2 , respectively. Figure 2(b) shows the flowchart of the Bayesian optimization, which consists of three different steps: (1) calculating the posterior distribution, (2) determining the next observation, and (3) updating the observations.

Posterior distribution
First, using n observations (i.e., evaluation of f (x) at n discrete points on x), we derive the posterior distribution. The relationship between the value predicted at the next iteration step f n+1 at x n+1 and the known observations where k(·, ·) denotes the kernel covariance function, and K and k are defined as From Eqs.

Kernel covariance function
In order to derive the posterior distribution, we use the kernel covariance function, which is the function representing how two observation points influence each other. The choice of the kernel covariance function is important because we cannot obtain the accurate posterior distribution if we do not choose an appropriate function. The widely used kernel covariance functions are the squared exponential kernel k SE (Rasmussen and Williams, 2006) and the Matern kernel k Matern (Matern, 1960;Stein, 1999): where Γ(·) and K d (·) are the Gamma function and the modified Bessel function of order d, respectively, and θ is hyperparameters, which are the parameters left for the user to tune. Note that when d → ∞, k Matern is the same as k SE (Ramussen and Williams, 2006). In the present study, the Matern52 function, i.e., Eq. (7) with d = 5/2, is used. The hyperparameters are optimized by maximizing the marginal likelihood.

Acquisition function for the next observation
Next, based on the mean µ and the variance σ 2 computed by Eqs. (4) and (5), the acquisition function a for selection of the next observation point is calculated. In the minimization (maximization) problem, the next observation point is the point that the value of the acquisition function is minimum (maximum). There are three famous acquisition functions: PI (Probability of Improvement) (Kushner, 1964), EI (Expected Improvement) (Mockus et al., 1978), and GP-LCB (Gaussian Process-Lower Confidence Bound) (Srinivas et al., 2010).
PI is designed to maximize the probability P of f (x) to be below the best current value y best : where Φ(·) is the cumulative distribution function (CDF). PI accounts for the exploitation term only: because the search is done locally, the possibility to fall into the local solution is high (Jones, 2001). EI calculates the expected value of the difference between the predicted value and the current best value: where ϕ(·) is the probability distribution function and Z = (y best − µ(x)) /σ(x). GP-LCB is a function calculating the lower confidence bound at each point: where κ performance is usually said to be better than PI.

Verification of Bayesian optimization
In the present study, the Bayesian optimization is performed using the Python libraries GPy (2012) and GPyOpt (2016). As the acquisition function, GP-LCB of Eq. (10) is applied. Fig. 3 (a) The true function of Eq. (11); (b) mean, µ; (c) standard deviation, σ; (d) acquisition function. Square, the true optimum point; circle, the initial points; cross, the observation points.
Before the control parameters of the traveling wave-like wall deformation are optimized, we verify the Bayesian optimization by using two-parameter test function with known minimum value, i.e., the five-well potential function (Pavlyukevich, 2007): where −20 ≦ x, y ≦ 20. The optimum value in this test function is −1.462 at (x, y) = (4.9, − 9.9). The maximum number of the iteration is set to 15. Figure 3 shows the true function of Eq. (11) and the posterior distribution (mean, standard deviation, and acquisition function). The optimum value obtained is −1.458 at (x, y) = (5.0, − 9.7). As shown in Fig. 3(a) and (b), the predicted mean around the true optimum point is reasonable agreement with the true function. In addition, the standard deviation and the value of the acquisition function at around the true optimum point, i.e., (x, y) = (4.9, − 9.9), is small. Therefore, although strictly speaking, the true optimum value cannot be obtained, the Bayesian optimization is an effective method for the optimization of the two-parameter function.
4. Optimization of traveling wave-like wall deformation 4.1. Direct numerical simulation DNS of a fully developed turbulent channel flow controlled by traveling wave-like wall deformation is performed under a CPG condition at the friction Reynolds number of Re τ = 180. The governing equations are the incompressible continuity and Navier-Stokes equations. In this study, in order to compute deforming wall accurately, the coordinate transformation proposed by Kang & Choi (2000), is applied, where η = (η u − η d )/2 with η u and η d being the wall-normal displacements of the upper and lower walls, respectively. The continuity and the Navier-Stokes equations in the boundary-fitted coordinate of Eq. (12) is expressed as (Kang & Choi, 2000) ∂u where S and S i represent the source terms caused by the coordinate transformation. All variables are made dimensionless by the fluid density, ρ * , the friction velocity, u * τ , and the channel half-width, δ * , where superscript * represents dimensional quantities. The friction Reynolds number is defined as Re τ = u * τ δ * /ν * , where ν * is the kinematic viscosity. The periodic boundary condition is applied in the streamwise (ξ 1 ) and spanwise (ξ 3 ) directions. The velocity boundary conditions on Table 1 The computational conditions of DNS. the wall are imposed as where v w , k x and c are the velocity amplitude, the wavenumber and the phasespeed, respectively. Table 1 shows the computational condition, where the superscript + indicates wall units. The DNS code in the present study is same as that used in Nabae et al. (2020). For further details on the DNS part, readers are referred to Nabae et al. (2020).

Drag reduction rate
The optimization is performed to maximize the drag reduction rate R D , defined as where C f and C f 0 indicate the drag coefficient in the controlled and uncontrolled cases, respectively. These are calculated as Since the bulk Reynolds number, , under a CPG condition is changed by the control, we calculate C f 0 at the same bulk Reynolds number as the controlled case by using the approximation expression (Nabae et al., 2020), i.e., U + b,NC = 5.1904 log Re b − 3.6741.

Three-component decomposition
In the present study, the periodic component due to the wavy walls is included in the turbulence statistics. In order to distinguish the periodic component, we apply three-component decomposition (Hussain and Reynolds, 1970), i.e.,

Optimization condition
In the present optimization, we optimize two-parameters, i.e., the velocity amplitude v w and the phasespeed c in Eq. (15). The range of these parameters is set to be 5 ≦ v + w ≦ 10 and 20 ≦ c + ≦ 90. The other parameter, i.e., the wavenumber k x , is fixed at k + x = 0.022. The reasons the wavenumber is fixed in the present optimization is that the wavenumber can only take discrete values in the periodic computational domain (which makes it difficult to apply optimization) and that the value used in the present study was found in our previous study (Nabae et al., 2020) to be the best among the possible wavelengths in the present computational domain. As the initial values for optimization, we use eight different values examined in Nabae et al. (2020) as shown in Table 2.
In the CPG condition, the drag reduction rate, R D , increases as the bulk-mean velocity, U b , increases. In addition, the GPyOpt deals with the minimization problem. Therefore, in order to maximize the bulk-mean velocity, U b , the objective function is defined as f . Similarly to Sec. 3, we use GP-LCB as the acquisition function. It should be worth mentioning that we also tested EI as the acquisition function in our preliminary study optimizing one parameter of the traveling wave, i.e., the phasespeed (Nabae, 2017), but EI did not work well for this problem -the search points concentrated near the initial points. The choice of acquisition function should be substantially dependent on the problem, and what we can state here is that GP-LCB worked well for the present problem.  Figure 4 shows the mean µ, the standard deviation σ, the acquisition function a, and the update history of the maximum value of R D after 12 updates, and Fig. 5 shows the time history of the bulk-mean velocity, U + b , and the drag reduction rate, R D , in all cases. The maximum drag reduction rate R D is 60.5% at (v + w , c + ) = (10.0, 42) and is kept unchanged in the subsequent 6 updates. In addition, similarly to Sec. 3, the standard deviation, σ, at around (v + w , c + ) = (10.0, 42) is small. This means that the uncertainty at around (v + w , c + ) = (10.0, 42) is small. Note that the obtained R D map ( Fig. 4(a)), which does not exhibit strong multimodality, implies that the optimum parameter set should be insensitive to the initial values. Also, one may speculate the possibility that the real optimum may exist outside the admissible region, especially at higher v + w . However, according to our previous study (Nabae et al., 2020), the drag reduction rates at v + w = 12 and 15 are substantially smaller than that at v + w = 10. Therefore, the existence of the optimum outside the present admissible region is considered unlikely. Thus, the optimal parameter sets in the present study is judged to be (v + w , c + ) = (10.0, 42). Hereafter, the case with (v + w , c + ) = (10.0, 42) is referred to as the optimized case. Figure 6 shows the time history of the bulk-mean velocity in the optimized case. As shown in Fig. 6(a), the bulk-mean velocity varies periodically. It exhibits an unstable behavior before t + ≈ 40000, while relatively stable after t + ≈ 40000. To investigate this behavior in more detail, in what follows, we compare the turbulence statistics in instantaneous field at three different times indicated in Fig. 6(b), i.e., t + ≈ 41725, 42539, and 42755. Figure 7 shows the instantaneous flow field in the optimized case at three different time instants. At t + ≈ 41725 (i.e., drag-reducing phase), the quasi-streamwise vortices (QSVs) are almost vanished, and transverse vortices appear due to the motion of wavy wall, which suggests that the flow field is being laminarized. However, at the time just before the bulk-mean velocity decreases rapidly, i.e., t + ≈ 42539, turbulent puff-like structures, which are similar to those observed in a turbulent channel flow at very low Reynolds number (see, e.g., , Tsukahara et al., 2005) are observed. After the bulkmean velocity begins to decrease, the QSVs are drastically enhanced. At t + ≈ 42755, the QSVs are generated in the entire region of the channel. This means that the flow field is re-transitioned from laminar flow to turbulent flow. This drastic enhancement of the QSVs is due to the instability caused in the region near the wall at t + ≈ 42539. Figure 8 shows the phase-averaged streamwise velocity in the region near the wall at t + ≈ 42539 and 42755. As shown in Fig. 8, strong reverse flow is induced at around ϕ x /π ≈ 1.5, and after the bulk-mean velocity begins to decrease, the reverse flow becomes weaker as time goes. The inflection point due to this flow reversal triggers the instability to significantly enhance QSVs. Note that this instability due to the inflection point was observed also in Nakanishi et al. (2012) and Nabae et al. (2020) in the cases with relaltively large wall deformation amplitude. Figure 9 shows the turbulent root-mean-square (RMS) of the velocity fluctuations and the turbulent Reynolds shear stress (RSS) at three different time instants. According to Fig. 9, the turbulent component of the RMS velocity fluctuations and the turbulent RSS are significantly suppressed at t + ≈ 41725. In particular, the turbulent RSS is almost perfectly vanished. On the other hand, after the inflection point appears, i.e., t + ≈ 42539 and t + ≈ 42755, the turbulent component of RMS velocity fluctuations and the turbulent RSS are drastically enhanced. These trends are similar to those of QSVs as shown in Fig. 7.

Results and discussion
In summary, in the optimized case, the following cycle are repeated: • laminarization of the flow field by the control • the inflection instability due to the flow reversal in the near-wall region • re-transition to turbulent flow.

Conclusions
Using the Bayesian optimization based on stochastic processes, we optimized the control parameters of traveling wave-like wall deformation for friction drag reduction in a turbulent channel flow at the friction Reynolds number of Re τ = 180. In prior to the study of turbulent flow, we first investigated the usefulness of the Bayesian optimization by using two-parameter test function with known optimum value, and it was confirmed that the present Bayesian optimization is useful for finding the approximate minimum for a two-parameter function.
For the turbulent drag reduction by traveling wave-like wall deformation, we attempted to optimize the velocity amplitude, v w , and the phasespeed, c, in the range of 5 ≦ v + w ≦ 10 and 20 ≦ c + ≦ 90, while the wavenumber, k x , is fixed, i.e., k + x = 0.022. In the optimized case, i.e., (v + w , c + ) = (10.0, 42), when the bulk-mean velocity increases, quasistreamwise vortices (QSVs) are almost vanished due to laminarization of the flow field. After the inflection instability due to the strong reversal flow in the near-wall region, the flow field is re-transitioned to turbulent flow, so that the bulk-mean velocity decreases. By repeating laminarization of flow field and re-transition to turbulent flow, the bulk-mean velocity increases and decreases periodically. In the optimized case, the maximum drag reduction rate attained is R D = 60.5%, with (v + w , c + ) = (10.0, 42) which is significantly larger than that in the previous work (Nabae et al., 2020) which reported R D = 36.1% with (v + w , c + ) = (10.0, 30). Although the assessment of the performance of new turbulence control methods should usually rely on a rough parameter study using computationally costly DNS, as usually no turbulence models have been validated a priori for such new control methods, the result of the present study suggests that an additional fine tuning of parameters using an advanced optimization method is extremely important to clarify the potential performance of the control methods. Apart from the optimization itself, the present results also suggest that it is worth investigating the drag reduction effect in the case where the time variation of the drag reduction rate is large.