Applicability of Probabilistic Nucleation Modelling for the Analysis of Microfluidics Data †

Microfluidics tools have been developing rapidly over the past decade, as they offer unparalleled ability for controlling nucleation and tracking crystallisation events inside very large numbers of individual nanolitre-size droplets. They have demonstrated a significant potential for screening protein crystallization conditions and for the direct determination of inorganic products solubility curves. The accepted basis for analysing microfluidics data is the probabilistic nucleation model originally proposed by Pound and La Mer (1952). Given the significance of this model for the purpose of analysing microfluidics data, the paper conducts a review of its hypotheses, usage and applicability. A step-by-step derivation of the model equations confirms that the time variation of the proportion of empty droplets which microfluidics experiments can provide with high accuracy is indeed the recommended method for estimation of nucleation kinetic parameters from microfluidics experiments. The paper shows that, depending on its implementation, the model predicts different rates of appearance of crystals inside individual droplets. The paper focuses on two distinct implementation modes, referred to as constant supersaturation and single nucleation event modes. By confronting model prediction with microfluidics measurements for eflucimibe in octanol, the paper finds that both modes yield different model predictions, shedding light on the potential and limits of the probabilistic nucleation model for the analysis of microfluidics data.


Introduction
Microfluidics devices have become key research tools in recent years for the screening of the crystallization conditions of proteins (Selimović et al., 2009), the determination of crystallization kinetics (Laval et al., 2009;Teychené and Biscans, 2012) and precipitation kinetics (Vitry et al., 2015).The principle of this method is to divide the volume of the studied solution into a large number of small, independent droplets in an inert oil.These droplets can be produced in specific microfluidics geometries and their volume and chemical composition can be fixed in a controlled way.In addition, microfluidics devices allow a rapid mixing of the different compounds, help prevent hydrodynamic dispersion and cross-contamination, and the droplets can be stored in microchannels.
For nucleation studies, this experimental technique generates a great deal of data that permit reliable access to the kinetics of nucleation.Typically, the frequency of nucleation events is obtained by counting over time the number of drops without crystals (Laval et al., 2009).When the number of droplets is larger than the number of impurities initially present in the solution, one can observe homogeneous nucleation in some droplets.Moreover, when all the droplets have the same volume V and for a small enough V, it is in principle possible to relate the probability that a droplet contains a crystal to the nucleation rate (Kashchiev et al., 1994).Measurements of the temporal evolution of the fraction of droplets that contain crystals thus permit an estimation of the nucleation kinetics.
The processing of the data generated by such techniques remains limited, however.Indeed, because of the presence of a surfactant, an interface as well as some dust or impurities, the majority of published experimental studies show that the appearance of crystal nucleation in these systems is due to two concomitant mechanisms, namely heterogeneous and homogeneous nucleation.In general, nucleation data that originate from such experiments are treated either by neglecting the first instance of nucleation (Dombrowski et al., 2010), by treating the nucleation process in a comprehensive manner without differentiating between possible nucleation mechanisms (Goh et al., 2010) or by averaging the influence of impurities on the crystallization process (Laval et al., 2009;Teychené and Biscans, 2011).In addition to the models developed by Goh et al. (2010) in the case of protein crystallization by evaporation, we propose a probabilistic model of the nucleation process that allows us to have access to the different nucleation kinetics that are observed experimentally, to deduce the temporal evolution of the number of crystals within the droplets, and also to trace the influence of "active sites" on nucleation.

Nucleation model derivation
The nucleation process inside individual droplets can be analysed using a probabilistic nucleation model.Such an approach appears in a number of published studies in the context of microfluidics systems, albeit through different forms and uses.It is revisited here in some detail with the objective of establishing a sound basis for its applicability and usage in the context of microfluidics nucleation analysis.
The model first considers two distinct types of droplets, namely droplets with active nucleation sites and droplets without any nucleation site.Only homogeneous nucleation events can occur inside droplets devoid of nucleation sites, whereas both homogeneous and heterogeneous nucleation events can possibly take place inside droplets that bear active nucleation sites.
The nucleation model that results, whereby the first account of the probabilistic nucleation model is credited to Pound and La Mer (1952), uses only 3 macroscopic parameters, all of which have solid theoretical bases.
The first model parameter m is the mean number of active nucleation sites inside the system's droplets.Considering that active nucleation sites are identical in all respects and that they nucleate independently of each other inside a droplet, the probabilistic model considers that active nucleation sites have the same nucleation kinetic rate k (time -1 ), the model's second parameter.The third model parameter is the homogeneous nucleation kinetic rate k 0 (time -1 ), which applies to all droplets devoid of nucleation sites, and also to droplets with active nucleation sites if they undergo homogeneous nucleation.
To avoid any ambiguity, the term "homogeneous nucleation" that is used throughout this paper was chosen to keep consistent with the original definition by Pound and La Mer (1952).It refers to nucleation that occurs in the volume of the droplet, as opposed to "heterogeneous nucleation" which is understood to take place on the surface of nucleation sites.The nucleation mechanism associated with the term k 0 is therefore not to be mistaken for the homogeneous nucleation rate defined in Classic Nucleation Theory (CNT).This point, which has already been pointed out by several authors (Laval et al., 2009;Akella et al., 2014) in relation to probabilistic nucleation modelling, will be further discussed in section 3 of the paper.
Since we have no reason to expect interdependence of the droplets inside our microfluidics system -the system is designed so that two droplets have no interaction with each other -then a Poisson process is a suitable candidate for describing the statistics of nucleation events that occur within individual droplets.This simple statement justifies application of the Poisson statistics, as initially proposed by Pound and La Mer (1952) to model the nucleation process in microfluidics systems.
Considering a system with N T droplets in total, the number N p of droplets with p active nucleation sites is predicted by the Poisson distribution with mean m such that: where dpois() denotes the probability density function of , the expected fraction of droplets without active nucleation sites is therefore e -m and the number of droplets with active sites is 1 -e -m .Sear (2014) indicates that m should lie in the range 0.1 to 1.This means that the maximum number of active nucleation sites per droplet that can be expected, with a confidence level of 99 %, is in the range 1 to 4. These values correspond to the 99 % quantiles of the Poisson distribution with mean m = 0.1 and m = 1, respectively.Given that k 0 t is the mean number of nucleation events to occur inside a droplet without active nucleation sites during time period t, the probability that r nucleation events take place inside such a droplet during time period t is: Such a droplet may produce any number r of nuclei.Thus, the number of droplets without nucleation sites that contain r crystals at time t is: (3) The nucleation event that takes place inside a droplet that carries p active nucleation sites is a Bernoulli random variable with probability for homogeneous nucleation and The probability of homogeneous nucleation is expected to be significantly lower than that of heterogeneous nucleation, typically one or two orders of magnitude lower.This explains why Pound and La Mer (1952) neglected the possibility that a droplet with active nucleation sites may nucleate homogeneously.Akella et al. (2014) corrected this assumption and justly accounted for this possibility in their work.Finally, the probability that r nucleation events take place inside such a droplet during time period t is: ) where 0, The second Poisson density distribution dpois(r, pkt) could be upper truncated at p, if one assumes that one active nucleation site can produce one nucleus only.The authors chose not to truncate this distribution, assuming that a nucleation site can produce any number of nuclei.The number of droplets with p nucleation sites that contain r crystals at time t is therefore: where 0, Finally, the total number of droplets with r crystals at time t is given by: ) where 0, and the number fraction f r (t) of droplets that bear r crystals at time t is: ) where 0, This equation is the main equation for predicting the state of the microfluidics system at any time t.For all practical uses, there is no need to use the analytical expression of the probability density function (PDF) of the Poisson distribution since it is easily computed by numer-ical software programs.
However, in order to relate this expression to the analytical expressions that can be found in the literature, some of which are referred to in the following section, equation (8) gives the analytical equivalent for the number fraction f r (t) of droplets that bear r crystals at time t.

The case of empty droplets
The time variation of the fraction of empty droplets is well accepted as a means of estimating nucleation parameters m, k 0 and k from microfluidics experimental data.This section reviews this important practical issue.
From equation ( 7), the number fraction f 0 (t) of empty droplets (r = 0) at time t is: This expression appears different from those published by Pound and La Mer (1952), and later by Akella et al. (2014).The differences are clarified in the following text.
First of all, let us suppose that the droplets are either without or with active nucleation sites, as was assumed by Pound and La Mer (1952).Equation ( 10 Besides the case where k >> k 0 , which hails back to Pound and La Mer's assumption, there is no simplification to equations ( 9) and ( 10), recalling that they apply to the situation where droplets with nucleation sites can nucleate homogeneously or heterogeneously.Akella and Fraden (2014), however, published the following equation under these assumptions: It can be shown that this expression is the analytical expression for f 0 (t) that corresponds to the following nucleation model: ) where 0, This equation assigns a nucleation rate (k 0 + pk) to the probability of nucleation of droplets with active sites, when in actual fact, the probabilities of nucleation are different depending on whether these droplets nucleate homogeneously or heterogeneously.This oversight is of no consequence when predicting the time variation of f r (t) in the case k >> k 0 .However, it becomes increasingly significant when the difference between homogeneous and heterogeneous nucleation rates decreases, leading to incorrect estimates of nucleation parameters from the measured variation f 0 (t).
While providing the reader with a step-by-step derivation of the probabilistic nucleation model, this section has highlighted that one must be careful and use the model that precisely matches the assumptions.
This simplifies to the expression derived by Pound and La Mer (1952): whose equivalent in terms of Poisson distribution is: This is particularly critical when using the model to estimate nucleation model parameters from which phenomenological analysis of the nucleation process can be inferred.

Analysis of probabilistic nucleation model usage
In addition to differences that may arise from model assumptions on the final form of the probabilistic nucleation model, differences will also occur depending on how it is implemented.
Assuming that droplets with active sites can nucleate homogeneously or heterogeneously, and that nucleation events can yield any number of nuclei, equations ( 7) and ( 8) give the time variation of droplets with r crystals given nucleation parameters m, k 0 and k.
For the sake of clarity, the behaviour and possible uses of this model are discussed below through numerical examples.The mean number of active nucleation sites per droplet is set to m = -ln(0.5),which corresponds to an equal proportion of droplets with and droplets without active nucleation sites.The values of the nucleation rates for homogeneous and heterogeneous nucleation, noted k 0 and k, respectively, will be varied in the examples that follow.
For the sake of readability, only predictions of f 0 (t) through f 4 (t) are plotted, that is the fractions of droplets that bear up to 4 crystals.This is deemed sufficient to reveal the pattern of the models.Fig. 1 shows the nucleation model behaviour for a decreasing ratio between homogeneous and nucleation kinetic constants k 0 /k.
What these curves all have in common is that the proportion of droplets with a given number of crystals increases, goes through a maximum (or two), and then drops back to zero.When the ratio between homogeneous and heterogeneous kinetic constants k 0 /k is low, the curves exhibit 2 modes.Theses modes relate to the occurrence of homogeneous and heterogeneous nucleation events, and the more different their nucleation rates the more separate and distinct the 2 modes.These curves evolve gradually towards single mode curves as the ratio k 0 /k increases; this is the behaviour that is most often portrayed in the literature (Dombrowski et al., 2010;Goh et al., 2010).The limiting case for this ratio corresponds to k = 0, which is that of purely homogeneous nucleation.Fig. 2 shows the result of the model for k = 0.
The underlying assumption behind all the model predictions presented above is that droplets which have undergone a nucleation event at time t can sustain other nucleation events at a later time, with the same nucleation kinetics.This is equivalent to saying that the state of supersaturation inside individual droplets remains constant as time advances (i.e. the crystal growth rate is zero).We shall refer to this use of the nucleation model as constant supersaturation mode.
Another way of using the nucleation model consists of allowing droplets to undergo only one nucleation event, with the outcome in terms of number of crystals determined by the Poisson statistics of the model.This mode may offer a better representation of reality in microfluidics experiments where the constant supersaturation mode does not apply.We shall refer to this mode as the single nucleation event mode.In this case, model predictions can no longer be made analytically as they require using a Monte Carlo (MC) simulation scheme that predicts the nucleation of every single droplet in the system.Fig. 3 explains implementation of the nucleation model by Monte Carlo.Fig. 4 shows the output of the probabilistic nucleation model, used in single nucleation event mode, for Fig. 1, Case a (mixed homogeneous/heterogeneous nucleation), and Fig. 2 (homogeneous nucleation only), recalling that they were obtained using the model in constant supersaturation mode.
Predictions using the single nucleation event mode or the constant supersaturation mode are totally different, as one would expect.Where the model with constant supersaturation predicts any number of crystals inside the droplets, the single nucleation event mode rarely predicts more than 1 crystal per droplet with the nucleation parameters used.
It is important to recognise that f 0 (t) is the same regardless of the model used, i.e. whether it is used in single nucleation event mode or in constant supersaturation mode.This can be demonstrated analytically, and comes from the fact that the nucleation rate of droplets is independent of the path followed by the nucleation events that follow.This is confirmed in the previous figures, which show that the prediction for f 0 (t) using analytical equation ( 9) is indistinguishable from that calculated by the MC simulation scheme.This result fully endorses using the variation of empty droplets to estimate the nucleation parameters from microfluidics experiments.

Application of the probabilistic nucleation model to the analysis of microfluidics nucleation data
As discussed in section 2, the probabilistic nucleation model can be used to estimate nucleation parameters m, k 0 and k, directly from the measured variation of f 0 (t).Once the nucleation parameters have been estimated, the model can then be applied to predict the variation of f r (t) for any value of the number r of crystals.This section reviews application of the model to both endeavours, and discusses the behaviour and validity of the probabilistic nucleation model through the prism of actual microfluidics experimental data sets.

Experimental data
The applicability of the probabilistic nucleation model, which includes estimation of nucleation parameters and prediction of nucleation events inside individual droplets, was tested using 4 data sets from the authors (see Teychené and Biscans, 2011), and two additional data sets from Akella et al., Figs. 25 and 29 (Akella et al., 2014).The latter are used in section 5. Table 1 summarises the experimental conditions used to generate the microfluidics data.The experimental protocol is summarised below, however, the reader is invited to refer to Teychené et al. (2011) for details.The data were generated using a glass microfluidics platform in the context of a study about the nucleation of eflucimibe crystals in octanol.Undersaturated droplets of eflucimibe dissolved in octanol were generated with water as the carrier fluid.These droplets were generated using a flow-focusing geometry that generates highly monodispersed droplets.In these experiments, the variation of volume between droplets is less than 0.1 nL.Glass syringes, equipped with patterned filters (Millipores, pore diameter < 0.2 μm), are used to inject the fluids into the devices.The temperature of the syringe and the tubing that contain the organic solution are controlled to avoid crystallization by means of a small flexible heater.Flow rates are controlled by syringe pumps (PhD 2000, Harvard apparatus).The temperature of the microfluidic platform is precisely controlled using Peltier elements.Platinum temperature sensors, which are embedded in the microfluidics chip, provide precise temperature monitoring.Owing to this temperature control, the droplets are generated at a temperature 5 °C above the saturation temperature (denoted as T* in Table 1) for all the experiments.As soon as the desired amount of droplets is obtained on the microfluidics chip, typically 600 to 800, the flow is stopped and the system is rapidly cooled (i.e. in less than 1 minute) down to the desired temperature in order to induce supersaturation (denoted as T nucleation in Table 1).
Digital images of the whole microfluidics chip were acquired using an inverted light microscope (Zeiss AXIO observer) equipped with a CCD colour camera (PCO Sensicam QE) at a frame rate ranging from 5 to 30 minutes.The optical resolution of the system is 1 μm/pixel with the 10´ objective, and the particle detection threshold is around 5 μm.
As unfrozen droplets can be identified without ambiguity, their number was determined automatically by image analysis.The number of crystals in frozen droplets, however, was counted by visual inspection.With each experiment, the time at which the first visible crystallization event occurred was recorded.It varied between 25 and 70 minutes.This value was ultimately used to rescale the experimental time.

Estimation of nucleation kinetic parameters
Under the assumption of normality of experimental errors, the model parameters m, k 0 and k are estimated by non-linear least squares minimization.Given the large amount of data points obtained at the end of a microfluidics experiment, the more so for empty droplets, the experimental data can confidently be bootstrapped in order to provide an estimate of the confidence intervals of nucle-ation model parameters, as well as the model confidence and prediction intervals.Fig. 5 illustrates one bootstrap iteration of the parameter estimation scheme applied to data set D. Typically, 1000 bootstrap iterations are carried out for a data set that comprises over one hundred data points.
Each bootstrap iteration gives a set of model parameters (m, k 0 , k), and the combined bootstrap estimations yield the model confidence and prediction intervals.Fig. 6 gives the estimated 95 % prediction intervals for data sets A through D, using 1000 bootstrap estimates in each case.
The bootstrap estimations yield the distribution of individual model parameters from which their confidence intervals can also be obtained.Table 2 gives the results that were obtained from the 4 selected data sets.
For all data sets, the relative standard deviation (RSD) of the estimated parameter k 0 , defined by Pound and La Mer (1954) as the rate of volume nucleation (i.e.k 0 = J v V, cf.Eqn.15), is small.This means that the experimental sampling frequency was high enough to capture the essential features of the homogeneous nucleation kinetic component of the nucleation model.
Parameters m and k relate to the heterogeneous nucleation kinetic component of the nucleation model.Table 2 shows that parameter k is not estimated with the same The dotted lines represent the prediction of f 0 (t) from model equations ( 9) or ( 10).
Case a: Simulation run output for the mixed homogeneous/heterogeneous nucleation case.m = -ln(0.5),k 0 = 10 -6 s -1 , k = 10 -4 s -1 , N T = 10000 droplets, ∆t = 100 s.At t = 10 6 seconds, the model predicts 18.29 % empty droplets, 81.67 % droplets with 1 crystal, 0.15 % droplets with 2 crystals, 0 % with 3 crystals.Case b: Simulation run output for the pure homogeneous case.m = -ln(0.5),k 0 = 10 -6 s -1 , k = 0 s -1 , N T = 10000 droplets, ∆t = 100 s.At t = 10 6 seconds, the model predicts 36.1 % empty droplets, 63.9 % droplets with 1 crystal, 0 % droplets with 2 crystals.precision as parameter k 0 .Heterogeneous nucleation is a process that proceeds very fast at the beginning of the experiment because the energy barrier to form a critical nucleus is lower with surface nucleation than it is with volume nucleation.Moreover, the lower the number of impurities per drop, the rarer the occurrence of nucleation events.Sampling frequency at the start of an experiment is therefore critical for capturing the salient features of heterogeneous nucleation.The frame rate used to conduct the microfluidics experiments was not always sufficient and sometimes yielded only a handful of experimental points inside the heterogeneous nucleation period.This observation, which can be readily observed in Fig. 6, explains that the lower the mean number of impurities per droplet, the broader the confidence interval associated with estimation of parameter k.The worst case occurred with data sets A and B. This result is exacerbated by sampling statistics.Indeed, the fact that data set B contains N act.= 0.8 % of droplets with active nucleation sites means that in a microfluidics experiment that uses 600 to 800 droplets, the number of droplets with active nucleation sites is on average no more than 7.Such a number is clearly too low to expect precise estimates of heterogeneous nucleation rate parameters k and m.When the amount of impurities is high enough, as with data sets C and D, the sampling frame rate is sufficient and nucleation rate parameter k and m are estimated with good precision.Notwithstanding the difficulty with estimation of k at low levels of impurities, Fig. 7 shows that the heterogeneous nucleation parameters m and k are negatively correlated.The interpretation of the correlation between the estimated values of k and m, where the latter increases with supersaturation as one would expect, is unclear.It is emphasised that the variation of k between experiments does not contradict the model assumption that k is the same for all active nucleation sites; it just indicates that the nature of active nucleation sites differs between experiments.
Although the purpose of this paper was not to venture into nucleation theory, it seemed justified to give some interpretation of the values of the estimated nucleation model parameters that were obtained.
In the framework of CNT, the homogeneous nucleation rate is related to the parameter of the nucleation model by the following equation (Kashchiev, 2000): According to equation ( 16), k 0 is the product of the volume of the solution V where nucleation occurs and the rate of nucleation J (number of particles per unit time and unit volume).So, CNT for homogeneous nucleation implies that nucleation kinetics parameters are related to the volume of the droplets, and therefore the size, or size distribution of the droplets.In the context of this study,   A 0 is the nucleation rate prefactor which can be written as (Kashchiev, 2000): Where is the rate at which a monomer is added to the critical cluster, ρ is the number density of a monomer in the solution, D the diffusion coefficient, r* the critical radius of the nuclei,  is the probability of effective collision between a cluster and a monomer, and ρ N is the number density of nucleation sites.As with homogeneous nucleation, every monomer can be a nucleation site ρ = ρ N .
Similarly, the heterogeneous surface nucleation rate is written as (Kashchiev, 2000): As the main assumption of CNT is that the properties (shape, surface tension, etc.) of the pre-critical nuclei are the same as those of the post-critical nuclei and of the crystal, the pre-exponential factor has the same expression.However, in that case, the number density of nucleation site is directly related to the number density of impurities and is much smaller than the number density of monomers.
The evolution of the nucleation rate (k 0 ) as a function of temperature and supersaturation is given in Fig. 8.As predicted by CNT, it is found that ln(k 0 ) varies linearly with the term T -3 ln -2 S. The crystal solution interfacial tension is calculated from the slope of the line, giving a value of 3 mJ•m -2 that is consistent with the data already published (Teychené and Biscans, 2008).From the intercept, the pre-exponential factor of the nucleation rate was found to be in the range of 10 7 , which is very low for a purely homogeneous nucleation rate.Estimation of this factor from CNT in the case of a purely homogeneous nucleation yields values in the range 10 10 to 10 13 s -1 .Because the number of nucleation sites is smaller than the number of monomers in solution, the discrepancy can be attributed either to heterogeneous nucleation (with static heterogeneities present in the solution from the start of the experiment, but with a lower activity than the impurities responsible for the fast nucleation rate), or to the inability of CNT to capture the complexity of crystal nucleation from solution (i.e.nucleation is catalysed by dynamic heterogeneities, which are formed during the nucleation process, for instance).The limits of CNT have already been pointed out by a number of authors, including Vekilov (2010) for protein nucleation, Jawor-Baczynska et al. (2015) for organic molecules and Gebauer et al. (2014) for salts.
Fig. 8 also displays the variation of the heterogeneous surface nucleation rate k as a function of T -3 ln -2 S. Because m and k are strongly correlated, and because of the large uncertainty associated with data set B in particular, it is difficult to interpret the measured trend of the heterogeneous nucleation rate k.Nevertheless, the upward trend of ln(k) vs T -3 ln -2 S (which is directly related to the energy barrier of the nucleation process) is lower than the one obtained in the case of volume nucleation.As the nucleation process is fast, it can be inferred that the energy barrier for heterogeneous nucleation is very low.
For the same reasons, it is difficult to interpret the evolution of the mean number of impurities m.Based on Pound and La Mer's original paper (1954), this number depends both on temperature and concentration.For data sets A through D, it appears that m is proportional to the concentration of solute, which implies that the impurities come with the solute molecules.They are probably associated with molecules with a molecular structure that resemble the solute molecules and cannot be purified.
This section has confirmed that provided the sampling frame rate is commensurate with the heterogeneous nucleation rate and that the number of observed droplets is high enough to eliminate sampling statistics issues, the measured variation of unfrozen droplets can yield good estimates of nucleation parameters.Discussion about the parameter estimates led to the assertion that nucleation microfluidics experiments should be performed at the same initial concentration but with a different final temperature in order to decorrelate the heterogeneous nucleation parameters m and k, and to study conclusively the influence of supersaturation and temperature on nucleation kinetics.

Prediction of the appearance of crystals
Model predictions of f r (t) were obtained using the probabilistic model employed in single event nucleation mode and using the MC simulation scheme presented earlier for the selected experimental data sets.The output of the MC scheme is dependent on the time step ∆t used as well as the total number N T of droplets.The MC simulation requires using as small a time step ∆t as possible, which impacts computation time.The choice of time step ∆t used in the simulations is such that it is no greater than 1/k, the heterogeneous nucleation giving the fastest kinetics.
Fig. 9 shows the prediction of the fraction of empty droplets output by the MC simulation against the experimental microfluidics data.The prediction is good, as one would expect since the nucleation model parameters were estimated from the measured fraction of empty droplets.However, the point here is to recognise that the predictions are obtained by the MC simulation scheme, not from the analytical expressions for f 0 (t), which confirms the correctness of the Monte-Carlo simulation code developed by the authors to simulate the probabilistic nucleation model.
Fig. 9 also shows the predicted prediction intervals for data set D for two different total numbers N T of droplets.The prediction intervals obtained for the simulation performed using 700 droplets (which correspond to the number of the droplets analysed during the experiment) captures nicely the spread of the experimental data.As expected, the model predicts that had a lower number of droplets been observed, the spread of the experimental data would have been higher.This expected trend shows that the model can also be used to guide the experimenta-tion in terms of setting the number of droplets to be observed.This is an important issue as the spread of the data controls the precision of the estimates of the nucleation model parameters.
The point was made in section 2.3 that the prediction of the number of crystals inside a droplet depends on the mode chosen to implement the nucleation model.The following text analyses the capacity of the nucleation model to predict the appearance of crystals.It is worthy of note that, to the best of the authors' knowledge, application of the probabilistic nucleation model to predict the appearance of crystals in microfluidics experiments has not yet been reported.
Experimentally, it was observed that for experiments performed at relatively low supersaturation, most of the droplets contained 1 crystal and very few droplets contained two crystals.This is the case for data set A, which produced 33 % of droplets with 1 crystal, 2 % with 2 crystals and 65 % without crystals.With experiment B, droplets that had nucleated contained only 1 crystal.Simulation results of the probabilistic nucleation model used in single nucleation event mode are given in Fig. 10 for data sets A and B. It is emphasised here that the predictions use the parameters (k 0 , k, m) that were estimated in section 3 from the time variation of empty droplets.The agreement between simulations and measurements is good, confirming that the single nucleation event mode gives a good account of the nucleation events that took place in experiments A and B. One explanation is that as supersaturation is low, when one crystal appears, it starts growing and consumes all the supersaturation.The depletion of solute induces a decrease in nucleation probability such that no new nucleation event can occur inside the droplet.Such a situation will prevail when crystal growth kinetics are very fast, or when the ratio between the crystal nucleation rate and the crystal growth rate is small.It can be noticed in Fig. 10 that the measured fraction of droplets which bear one crystal is mainly inside the upper portion of the model's prediction interval, and that it is the opposite for the droplets with 2 crystals.One may recall that crystal number counting was done by visual inspection, and it is believed that the visual counting biased the data by overestimating the number of droplets with 1 crystal at the expense of those with 2 crystals.Visual counting of the number of crystals inside a droplet is difficult, and this nucleation-model-based analysis stressed the importance of this issue for analysis of microfluidics experiments.
When the crystal nucleation rate to crystal growth rate ratio is high (with the limiting case of no crystal growth), not all the supersaturation is consumed by a single crystal, and successive nucleation events may occur over time.Let us consider data set D for which the initial supersaturation was twice that of data sets A and B. Fig. 11 shows 9 Prediction of the fraction of empty droplets as a function of time using MC simulation for two representative experiments and two total numbers of droplets.
the predictions of the probabilistic nucleation model used in both single nucleation event mode and constant supersaturation mode.The variation of the number fraction of droplets that contain one crystal bears no resemblance with data sets A and B. Indeed, it shows a rapid increase at the beginning of the experiment to a maximum value, then a slow decrease.The measured number fraction of droplets with 2 crystals or more shows a continuous increase over time.The upper plot in Fig. 11 shows the nucleation model prediction in single nucleation event mode.It is clear that this implementation mode of the nucleation model does not capture at all the nucleation events of data set D. The lower plot in Fig. 11 gives the prediction of the nucleation model in constant supersaturation mode.Notwithstanding the counting bias between droplets with one or more crys-tals, it is clearly apparent that the constant supersaturation mode gives a much closer match to the measured data.With data set D, given its higher state of supersaturation and its higher k 0 /k ratio, which is more than 1 order of magnitude higher than that of data sets A and B as seen in Table 3, it is not surprising to find that the hypotheses of the constant supersaturation mode give a better account of the measured microfluidics data.
Even though the single nucleation event mode can be a   close match to microfluidics data, as found with data sets A and B, both the single nucleation event and the constant supersaturation modes are limited in their ability to predict nucleation in that they do not account for the crystal growth that follows nucleation.To further improve our ability to model microfluidics systems and gain even more value out of the valuable data produced by such systems, it is necessary to couple the probabilistic nucleation model with a crystal growth model that accounts for the consumption of supersaturation inside individual droplets.The Monte-Carlo simulation scheme, in which droplets are individualised, provides just the right environment for such a development.This requires both accurate measurements of crystal growth inside droplets, and a valid thermodynamic model of the liquid phase.

Application to microfluidics data from literature
For the sake of rounding off the paper, it was deemed relevant to apply the probabilistic nucleation model to high-quality data published by researchers other than the present authors.The data selected was that of Figs. 25 and 29 from Akella et al. (2014).The estimated nucleation parameters, using the non-linear parameter estimation scheme presented earlier, are given in Table 4.The values are close to those published by Akella et al., however, they differ due to the difference between equation (10), proposed by the authors and equation ( 14) used by Akella et al.
Figs. 12 and 13 show the corresponding 95 % prediction intervals for the number fraction f 0 of empty droplets.The prediction intervals are narrower compared to those of data sets A through D. This is due to Akella et al. using a high number of droplets, typically 1200 to 4000 in each of their experiments, so that their raw measurements of f 0 are extremely smooth.
Figs. 14 and 15 show the predictions made using both single nucleation event mode (N T = 1000 drops) and constant supersaturation mode for the production of crystals.
The results show that the single nucleation event mode predicts that after 15 hours, most droplets contain 1 crystal, with no more than 0.24 % and 0.62 % of droplets containing 2 crystals for the data from Figs. 25 and 29, respectively.The constant supersaturation mode predicts a significant proportion of droplets that bear 2 or more crystals.
It is anyone's guess as to the actual number of crystals observed by Akella and co-workers (2014), as they did not report this information.The predictions are so distinct that it should be rather simple for them to decide whether their system behaves more like one mode or the other.By examining the images and movie frames provided in their paper, and considering that their experiment is carried out at a low level of supersaturation, it is probable that their system follows the single nucleation event mode and produced no more than 1 crystal per droplet.

Conclusions
The probabilistic nucleation model originally proposed by Pound and La Mer (1952) was revisited with the objective of clarifying its underlying hypotheses and associated behaviour.The nucleation model considers two types of droplets, i.e. droplets without nucleation sites which can only undergo homogeneous nucleation events, and drop-lets with active nucleation sites which can undergo homogeneous and heterogeneous nucleation events.Analytical expressions were derived for predicting the variation of the fraction of droplets that contain a given number of crystals, both in terms of Poisson distribution functions and exponential functions.
The paper emphasises that the same model can be implemented in different ways which yield different outcomes.The model was applied in constant supersaturation and single nucleation event modes.The first mode, which is often portrayed in the literature, corresponds to the droplets with no crystal growth, whereas the second one corresponds to droplets having a supersaturation that is just sufficient for one nucleation event and associated crystal growth.
Regardless of the implementation mode, the model prediction for the time variation of the fraction of empty or unfrozen droplets is the same, confirming that nucleation kinetic parameters m, k 0 and k should indeed be estimated using the experimental measurements of the fraction of empty droplets.This is a particularly favourable situation given that this is the fraction that is most accurately measured in practice.Indeed, identifying droplets that contain 1 or more crystals is obviously more prone to measurement errors than identifying empty droplets.The estimation of nucleation model parameters showed the importance of frame rate and number of droplets as they impact the precision of the parameter estimates.
In the context of microfluidics studies, it was confirmed that both modes of implementation of the nucleation model are applicable.The probabilistic nucleation model was used, possibly for the first time, to predict the rate of appearance of crystals for microfluidics systems.Experimental results with low supersaturation levels were well predicted by the single nucleation event mode.The single nucleation event mode was shown to yield droplets with no more than one or two crystals, a desirable situation from an industrial perspective.The constant supersaturation mode accounted better for the nucleation events observed for data sets with excess supersaturation.Comparison between predicted and measured number fractions of droplets with one or more crystals indicated the possibility of bias in the number counting of crystals inside individual droplets, emphasising the need to develop reliable crystal counting methods for harvesting the full benefit of microfluidics systems.
The analysis confirmed the value and applicability of the probabilistic nucleation model for analysis of microfluidics data, particularly for estimation of nucleation parameters.The model showed limitations in its applicability for predicting the appearance of crystals inside droplets, although it gave reasonable predictions at low supersaturation levels.As the probabilistic nucleation model alone cannot account for the consumption of the supersaturation inside droplets, the nucleation model cannot predict the number of crystals inside droplets.As the real situation is somewhere between the single nucleation event and the constant supersaturation modes, it is necessary for prediction of the rate of appearance of crystals inside individual droplets to merge the probabilistic nucleation model with a crystal growth model.

Fig. 3
Fig. 3 Principle of the MC simulation scheme implementing equation (7), for one set of nucleation parameters (m, k 0 , k).

Fig. 5
Fig. 5 Example of a bootstrap estimation of the nucleation model parameters.The blue symbols are the experimental data points, the red symbols the bootstrap sample, and the solid line is the model regression for the bootstrap sample.

Fig. 6
Fig. 6 Nucleation model 95 % prediction interval estimated for data sets A through D.

Fig. 7
Fig. 7 Illustration of the strong correlation between heterogeneous nucleation rate parameters m and k.The correlation plotted is: log 10 k = -3.82-0.48log 10 m.

Fig. 8
Fig. 8 Evolution of the nucleation rates k and k 0 as a function of supersaturation S and temperature T.

Fig. 10
Fig. 10 Predicted and measured temporal evolution of the fraction droplets containing one or two crystals in the droplets for data set A (upper) and data set B (lower).

Fig. 11
Fig. 11 Predicted and measured temporal evolution of the number fractions of droplets containing one or more crystals for data set D. Upper figure: single nucleation event mode.Lower figure: constant supersaturation mode.

m
Nomenclature dpois(k,λ) Probability density function of the Poisson distribution with mean λ D Diffusion coefficient (m 2 •s -1 ) f r (t) Number fraction of droplets containing r crystals at time t k Nucleation rate constant of an active nucleation site (s -1 )k B Boltzmann constant k B = 1.38064852 × 10 -23 m 2 kg•s -2 •K -1 k 0 Homogeneous nucleation rate constant (s -1 ) j Droplet indexused in Monte-Carlo simulation (s) Mean number of active nucleation sites per droplet N act.Number fraction of droplets that contain active nucleation sites at the start of a microfluidics experiment N p Number of droplets with p active nucleation sites N T Total number of droplets counted in a microfluidics experiment p Number of active nucleation sites in a droplet r Number of nucleation events inside a droplet r* Radius of the critical nuclei (m) S x Supersaturation defined as S x = x°/xenergy barrier of nucleation (J) ρ Number density of monomers per unit of volume (m -3 ) ρ N Number density of nucleation sites (m -3 ) σ Crystal solution interfacial tension (mJ•m -2 )

Table 1
Experimental conditions used to produce the microfluidics experimental data.The average droplet volume is 11.5 ± 0.1 nL.

Table 2
Estimated nucleation model parameters (m, k 0 , k) for data sets A through D, and number fraction N act.=1e -m of droplets with active nucleation sites.The table indicates the estimated mean and relative standard deviation (RSD).dropletsare monodisperse and their volume is invariant during any given microfluidics experiment.

Table 3
Values of the k 0 /k ratio for data sets A through D.

Table 4
Estimated nucleation model parameters (m, k 0 , k) for Akella et al's data sets(Akella et al., 2014, Figs. 25  and 29), and number fraction of droplets with active nucleation sites.