A novel tracking and analysis system for time-lapse cell imaging of Saccharomyces cerevisiae

Recent studies have revealed that tracking single cells using time-lapse fluorescence microscopy is an optimal tool for spatiotemporal evaluation of proteins of interest. Using this approach with Saccharomyces cerevisiae as a model organism, we previously found that heterochromatin regions involved in epigenetic regulation differ between individual cells. Determining the regularity of this phenomenon requires measurement of spatiotemporal epigenetic-dependent changes in protein levels across more than one generation. In past studies, we conducted these analyses manually to obtain a dendrogram, but this required more than 15 h, even for a single set of microscopic cell images. Thus, in this study, we developed a software-based analysis system to analyze time-lapse cellular images of S. cerevisiae, which allowed automatic generation of a dendrogram from a given set of time-lapse cell images. This approach is divided into two phases: a cell extraction and tracking phase, and an analysis phase. The cell extraction and tracking phase generates a set of necessary information for each cell, such as geometrical properties and the daughter-mother relationships, using image processing-based analysis techniques. Then, based on this information, the analysis phase allows generation of the final dendrogram by analyzing the fluorescent characteristics of each cell. The system is equipped with manual error correction to correct for the inevitable errors that occur in these analyses. The time required to obtain the final dendrograms was drastically reduced from 15 h in manual analysis to 0.8 h using this novel system.


INTRODUCTION
Changes in the expression status of epigenetically regulated genes, which are independent of DNA sequence, are important mechanisms by which individual cells generate diversity. Intercellular relationships are altered in pathological states, including cancer, suggesting that epigenetic regulation of cellular heterogeneity is a putative therapeutic target (Buzzetti et al., 2016; Meng et al., 2016; Tsuchida and Friedman, 2017). Although epigenetic regulation of gene expression varies between individual cells, most of the current work in the epigenetics field has been based on analysis of cell populations, in which the expression of epigenetically regulated genes is likely to vary between cells. However, because the characteristics of individual cells cannot be clarified by analysis of heterogeneous cell populations, the molecular mechanisms of epigenetic regulation remain incompletely understood. Therefore, we previously developed a unique method using Saccharomyces cerevisiae as a model organism, to track epigenetic changes in gene expression in single cells, which revealed among multiple cell populations the presence of a minority group with highly distinct phenotypic behavior (Mano et al., 2013). Single-cell analysis enables evaluation of the phenotypes of minority cell populations that would otherwise be undetectable, and is likely to facilitate the identification of mechanisms that have thus far been overlooked by conventional analyses of total cell populations.
Single-cell analysis techniques have been developed in a variety of other organisms. In many cases, a small number of cells in a distinct but small cell population are differentially regulated, and these minority cell populations play important roles in the total cell population (Osborne et al., 2011; Kitada et al., 2012; Winter et al., 2015). Thus, it is necessary to analyze single cells to elucidate the important functions of minority populations embedded in the total cell population, which are not detectable by conventional whole-population analyses. We previously developed the single-cell tracking system by introducing the EGFP gene encoding a fluorescent protein into a specific region of heterochromatin that is important in epigenetic regulation of gene expression. Spreading of the labeled heterochromatin region, such as by epigenetic regulation, weakened expression of the EGFP gene. The fluorescence was quantified continuously using a fluorescence microscope, enabling analyses of changes in epigenetic status at the single-cell level (Mano et al., 2013).
Time-lapse experiments using fluorescence microscopy require a long follow-up period, and the large amount of imaging data requires a considerable amount of time and effort to analyze manually. Because manual analysis consists of repeating simple tasks, such as the association of cells for each frame, daughter-mother association, and association of fluorescent images with bright-field images, the time required for analysis can be greatly reduced by automation. Indeed, a number of methods have been developed for automated analysis of cell tracking (Meijering et al., 2012; Shen et al., 2017). Li et al. (2008) enabled cell tracking using IMM (interacting multiple models), which employs a Kalman filter-based tracking filter with various motion mechanisms. Winter et al. (2011, 2015) also used lineage information to automatically improve tracking accuracy. Recently, software capable of analyzing multiple cell types has been developed using machine learning (Van Valen et al., 2016; Tsai et al., 2019). A substantial body of work has evaluated cell tracking not only in animal cells, but also in yeast and Escherichia coli, which form denser colonies that interfere with segmentation and tracking (Kvarnstrom et al., 2008; Wang et al., 2010; Hashimoto et al., 2016; Versari et al., 2017). However, many of these studies focused on improving segmentation algorithms, which resulted in shortcomings such as the inability to construct a tree diagram and to track the expression of fluorescent proteins.
To improve our manual single-cell tracking method, which requires a huge amount of processing time, we developed a software-based tracking system for the analysis of time-lapse cellular images of S. cerevisiae. The results show that the newly developed software has potential for use in yeast epigenetics research.

MATERIALS AND METHODS
Yeast culture and time-lapse experiments Yeast was pre-cultured from OD = 0.05 to the logarithmic growth phase in liquid YPD and subsequently diluted to OD = 0.0005. The diluted yeast cells were seeded into a dedicated lane of a microfluidic plate (ONIX), which limited the growth of each colony to a monolayer and trapped the cells in the viewing area in the center of the plate by air pressure flushing. After trapping the yeast at a maximum of 8 points, bright-field images were captured every 4 min, and fluorescence images were captured every 40 min using an Axio Observer Z1 (Carl Zeiss) microscope equipped with a 40 × Plan-Neofluar objective lens (NA = 1.3).
Files required for analysis The bright-field images were named in increments of +1 from "001" in the order they were taken (e.g., "001", "002", "003"). Each fluorescence image carried the number of the bright-field frame captured at the same time, so the increment in the fluorescence file names depended on the ratio of the two capture intervals (e.g., "001", "011", "021" when fluorescence images were captured every 40 min and bright-field images every 4 min). The images had a resolution of 694 pixels (width) by 520 pixels (height) and were in BMP file format.
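The naming scheme above can be illustrated with a short Python sketch. The function names and the default intervals (4 min and 40 min, taken from the imaging protocol) are illustrative assumptions, not part of the software described here.

```python
def brightfield_names(n_frames):
    """Zero-padded names "001", "002", ... for bright-field frames."""
    return [f"{i:03d}" for i in range(1, n_frames + 1)]


def fluorescence_names(n_frames, bf_interval_min=4, fl_interval_min=40):
    """Names of the fluorescence images, assuming each one shares the
    number of the bright-field frame captured at the same time point."""
    step = fl_interval_min // bf_interval_min  # 10 for the intervals above
    return [f"{i:03d}" for i in range(1, n_frames + 1, step)]


if __name__ == "__main__":
    print(brightfield_names(3))          # ['001', '002', '003']
    print(fluorescence_names(101)[:3])   # ['001', '011', '021']
    print(len(fluorescence_names(101)))  # 11
```

With 101 bright-field frames, this pairing yields 11 fluorescence frames, which agrees with the data set size reported in the experimental results.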

RESULTS AND DISCUSSION
Previous analysis and cell region extraction and tracking The cell line (FUY1679: MATa ADE2 lys2 his3 leu2 trp1 ura3 trp1::HTB1-2×mCherry::TRP1 HMR-left-HTB1-ECFP HMR-left-HTB1-ECFP HML-right-HTB1-EYFP) used in the present study has ECFP inserted on the left side of the HMR region of chromosome III, and EYFP inserted on the right side of the HML region (Fig. 1A). The ECFP and EYFP proteins, when fused to the histone protein Htb1, remain in the nucleus (Mano et al., 2013). In the HMR and HML regions, gene expression was negatively regulated by Sir proteins, and boundaries were formed by neighboring regions of E-silencers or I-silencers. Previous studies identified that gene expression activation and deactivation are distinctly regulated in new generations by changes in epigenetic silencing. In the present study, we used this model as a means to visualize changes in protein expression on the tree diagram. mCherry was inserted to correct for nuclear position. The bright-field images corresponding to the fluorescence images taken every 40 min were extracted, and the behavior of each individual cell within the population was tracked (Fig. 1B). Cells were numbered in the order of their birth, beginning at "1" (Fig. 1C). Particular cells were assigned the same number between frames, but their positions shifted slightly. Evaluating the migration and gene expression patterns of cells between frames is necessary for our analysis, and the migration destination of the cells after 40 min had to be found by comparing the bright-field images taken every 4 min. The amount of work required increases rapidly as the number of frames and cells to be tracked increases, making manual analysis extremely difficult. We spent more than 7 h on this step.
Table 1 note. The daughter cell, its mother cell, and the fluorescence value of each frame were entered into the file. BG means background and BF means bright field.
Cells were numbered frame by frame, and daughter-mother associations were then determined for each cell. When the cells were close together, distinguishing mother cells from daughter cells was difficult. In fact, some mother and daughter cells could not be discriminated in the bright-field images at 40-min intervals, and so we evaluated them using bright-field images taken at 4-min intervals, although this approach was more time-consuming, compounding the difficulty of these analyses. After daughter-mother relationships were determined, the fluorescence of the mother and daughter cells present in each frame was measured. When all the cells were measured, fluorescence values were recorded in Microsoft Excel (Table 1). Also, when a cell division occurred, the corresponding parent cell and daughter cell information was recorded in Excel. The tree diagram was completed by numbering the Excel file for every frame and inputting the data to expression analysis software.
Fig. 2 (caption fragment; the opening is missing from the source text). Left panel, edge intensity of an image regarded as topographic terrain; middle panel, simulation of a water-filling process for each catchment; right panel, the position where waters in two different catchments meet is detected as a watershed (shown in red), which gives the boundary between the catchments. (C) Association of an identical cell between two successive frames. C_0 and C_1-C_k represent the target cell in the current frame and its candidates in the next frame, respectively. F_k denotes the evaluation value for the candidate cell C_k. The cell C_k whose F_k is the minimum is fundamentally associated with the target cell C_0 in the next frame.
Since this analysis requires nearly a whole day to complete, we developed a software-based computer-aided analysis system to reduce the time required for analysis. In this system, cell region extraction was performed on each bright-field image frame captured every 4 min using the peripheral boundaries of the cells. However, cell peripheral boundaries were often weakly defined owing to depth-of-field and/or lighting-related issues with the microscope employed. Edge enhancement and interpolation (Hirano et al., 2003) were therefore applied to enhance weak boundaries in each bright-field frame, and the region regarded as a cell was then extracted (Fig. 2A). We used the watershed image segmentation algorithm for cell segmentation (Fig. 2B) (Najman and Schmitt, 1994). In this method, the edge intensity of an image is regarded as a topographic terrain, and a water-filling process is simulated in each catchment of the terrain to detect its watersheds. In the cell-extraction problem, each catchment and watershed detected by this algorithm corresponds to a cell region and its boundary. The novel software system took 2 min to identify a total of 1,708 cells across all frames, and 15 min to identify and correct three split errors. Errors in this step were corrected manually using the error-correcting capability of the software (see below).
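The water-filling idea can be sketched in a few lines of pure Python. This is a minimal marker-based, priority-flood variant written for illustration, not the implementation used in the software; the seed markers, the 4-neighbourhood and the function names are assumptions.

```python
import heapq
from itertools import count

WATERSHED = -1  # label given to pixels where two catchments meet


def neighbours(r, c, rows, cols):
    """Yield the 4-connected neighbours of (r, c) inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def watershed(height, markers):
    """Flood `height` (edge intensity) from seeded `markers` (0 = unlabelled)."""
    rows, cols = len(height), len(height[0])
    labels = [row[:] for row in markers]
    heap, queued, order = [], set(), count()

    def enqueue_unlabelled_neighbours(r, c):
        for nr, nc in neighbours(r, c, rows, cols):
            if labels[nr][nc] == 0 and (nr, nc) not in queued:
                queued.add((nr, nc))
                heapq.heappush(heap, (height[nr][nc], next(order), nr, nc))

    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                enqueue_unlabelled_neighbours(r, c)

    # Pixels are flooded from the lowest edge intensity upwards.
    while heap:
        _, _, r, c = heapq.heappop(heap)
        found = {labels[nr][nc]
                 for nr, nc in neighbours(r, c, rows, cols)
                 if labels[nr][nc] > 0}
        if len(found) == 1:
            labels[r][c] = found.pop()      # joins a single catchment
            enqueue_unlabelled_neighbours(r, c)
        else:
            labels[r][c] = WATERSHED        # waters of two catchments meet
    return labels


if __name__ == "__main__":
    edge = [[0, 1, 5, 1, 0]]    # one ridge between two basins
    seeds = [[1, 0, 0, 0, 2]]   # one seed per basin
    print(watershed(edge, seeds))  # [[1, 1, -1, 2, 2]]
```

In the toy example, the ridge pixel with the highest edge intensity is the only one reached by both floods, so it becomes the boundary between the two regions.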
Fig. 3. Daughter-mother association. (A) Raw image including a daughter-mother pair. Note that the boundaries between the daughter and mother cells in the white boxes are very weak. (B) Boundary region R_k (shaded region) between the daughter cell D_0 and its mother candidate M_k defined using the 3 × 3 mask shown in (C). (C) Center pixel X in the 3 × 3 mask included in or excluded from the boundary region R_k. Lower left panel, center pixel X included in R_k, whose 3 × 3 vicinity includes pixels belonging to D_0, M_k or boundaries; lower right panel, center pixel X excluded from R_k, whose 3 × 3 vicinity includes at least a pixel other than the one in D_0, M_k and boundaries.
Because cells in each frame were segmented and extracted independently, without regard to frame-to-frame consistency, identical cells appearing in two successive frames had to be associated with each other. Such an association was detected using a cell tracking technique. Let C_0 and C_k denote the target cell in the current frame and its k-th candidate cell in the next frame, respectively. The tracking process utilized an evaluation value, F_k, to estimate the degree of correspondence between the cell C_0 and its candidate C_k. Let p_x, l_x and a_x be the center of gravity, circumference length and area of the cell C_x, respectively (x = 0, k). Using this geometrical information, F_k was defined as [equation missing from the source text], where W_p, W_l and W_a are the weighting coefficients determined empirically as 20, 1 and 2, respectively. F_k was evaluated for each candidate cell C_k located in a certain vicinity in the next frame centered at p_0 in the current frame (Fig. 2C). Although the simplest strategy to identify the corresponding cell for C_0 was to find the C_k whose F_k was the lowest among all candidates, this strategy results in conflicts in the identification of target cells, leading to an increase in correspondence errors. An annealing technique was thus introduced to overcome this difficulty, where the corresponding cell in the next frame was determined for each target cell whose minimum F_k was less than a prescribed threshold, and the threshold was then gradually increased by a prescribed step size to identify the remainder of the cells.
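The threshold-annealing association can be sketched as follows. The expression for F_k is not available in this text, so the sketch assumes a weighted sum of the centroid distance and the absolute differences in perimeter and area; only the weights 20, 1 and 2 come from the text. The search radius, threshold schedule and all names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    cx: float         # centre of gravity, x
    cy: float         # centre of gravity, y
    perimeter: float  # circumference length
    area: float


W_P, W_L, W_A = 20.0, 1.0, 2.0  # weights reported in the text


def distance(c0, ck):
    return math.hypot(c0.cx - ck.cx, c0.cy - ck.cy)


def cost(c0, ck):
    """ASSUMED form of F_k: weighted sum of geometric differences."""
    return (W_P * distance(c0, ck)
            + W_L * abs(c0.perimeter - ck.perimeter)
            + W_A * abs(c0.area - ck.area))


def associate(current, nxt, radius=30.0, start=50.0, step=50.0, stop=1000.0):
    """Return {index in current: index in nxt}, matching confident pairs first."""
    assigned, taken = {}, set()
    threshold = start
    while threshold <= stop and len(assigned) < len(current):
        for i, c0 in enumerate(current):
            if i in assigned:
                continue
            candidates = [(cost(c0, ck), k) for k, ck in enumerate(nxt)
                          if k not in taken and distance(c0, ck) <= radius]
            if candidates:
                best_cost, best_k = min(candidates)
                if best_cost < threshold:  # accept only below the current threshold
                    assigned[i] = best_k
                    taken.add(best_k)
        threshold += step  # relax the threshold for the remaining cells
    return assigned


if __name__ == "__main__":
    frame_t = [Cell(10, 10, 30, 70), Cell(40, 10, 30, 70)]
    frame_t1 = [Cell(41, 10, 30, 70), Cell(11, 11, 31, 72), Cell(25, 10, 12, 10)]
    print(associate(frame_t, frame_t1))  # {0: 1, 1: 0}
```

Accepting low-cost pairs first removes them from the candidate pool, so ambiguous cells are resolved later against fewer competitors. The third cell in the second frame stays unmatched, as a newly appeared daughter would.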
Daughter-mother associations Some cells undergoing division in one frame resulted in daughter cells appearing in successive frames. These cells could not be assigned to a mother cell using our tracking system, leaving some cells unidentified. In these cases, the daughters of mother cells were identified using morphological analysis.
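As a rough illustration of the morphological analysis that the following paragraphs describe, the sketch below computes a boundary strength as the median raw intensity over the boundary region between a daughter and each mother candidate, then picks the candidate with the lowest value. It assumes a label image in which 0 is background and -1 marks boundary pixels, and it assumes that a region pixel is a boundary pixel whose 3 × 3 vicinity touches both cells. All names are hypothetical.

```python
from statistics import median

BOUNDARY = -1  # assumed label for binarized boundary pixels (0 = background)


def boundary_region(labels, d0, mk):
    """Boundary pixels whose 3x3 vicinity holds only d0, mk or boundary
    pixels, and touches both cells. Image border pixels are skipped."""
    allowed = {d0, mk, BOUNDARY}
    region = []
    for r in range(1, len(labels) - 1):
        for c in range(1, len(labels[0]) - 1):
            if labels[r][c] != BOUNDARY:
                continue
            window = {labels[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
            if window <= allowed and d0 in window and mk in window:
                region.append((r, c))
    return region


def boundary_strength(raw, labels, d0, mk):
    """Median raw bright-field intensity over the region, or None if empty."""
    region = boundary_region(labels, d0, mk)
    if not region:
        return None
    return median(raw[r][c] for r, c in region)


def pick_mother(raw, labels, d0, candidates):
    """Candidate with the minimum boundary strength (the simplest strategy)."""
    scored = []
    for mk in candidates:
        strength = boundary_strength(raw, labels, d0, mk)
        if strength is not None:
            scored.append((strength, mk))
    return min(scored)[1] if scored else None


if __name__ == "__main__":
    row_labels = [1, 1, -1, 2, 2, 2, -1, 3, 3]       # daughter 2 between cells 1 and 3
    row_raw = [100, 100, 40, 100, 100, 100, 200, 100, 100]
    labels = [row_labels[:] for _ in range(5)]
    raw = [row_raw[:] for _ in range(5)]
    print(boundary_strength(raw, labels, 2, 1))  # 40
    print(boundary_strength(raw, labels, 2, 3))  # 200
    print(pick_mother(raw, labels, 2, [1, 3]))   # 1
```

This shows only the minimum-value rule; the threshold-annealing refinement described for the tracking step would wrap `pick_mother` in a loop over increasing thresholds.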
As illustrated in Fig. 3A, the boundary between a daughter and a mother cell is weaker than that between parent cells. Based on this property, we assigned daughter cells to mother cells by evaluating the strength of the boundary between each daughter and its mother candidates located in the vicinity of the daughter cell. To stably evaluate boundary strength, daughter-mother association was performed after each target daughter had grown so that its area exceeded a prescribed threshold. To evaluate the boundary strength between the target daughter cell D_0 and its mother candidate M_k, the boundary region R_k between D_0 and M_k was defined (Fig. 3B) using the 3 × 3 mask (Fig. 3C) and the edge-enhanced version of the frame that included D_0 and M_k. The edge-enhanced frame (middle panel of Fig. 2A) was first binarized with a suitable threshold to produce a binarized boundary image. The boundary region R_k between D_0 and M_k was then defined as a set of pixels each of whose 3 × 3 vicinities included pixels belonging to D_0, M_k, or the boundary of cells, as illustrated in the left panel of Fig. 3C; the set excluded a pixel whose 3 × 3 vicinity included pixels belonging to the other candidate cells or the background, as shown in the right panel of Fig. 3C. The boundary strength between D_0 and M_k, denoted as B_k, was then evaluated as the median pixel value in the corresponding boundary region R_k on the acquired bright-field raw frame. B_k was evaluated for each candidate M_k located in a prescribed vicinity of the target daughter D_0, similar to what is shown in Fig. 2C. Finally, the parent of each daughter was determined using B_k for each candidate. As in the tracking process, the simplest way to determine the mother of a daughter D_0 was to find the candidate M_k whose B_k was the minimum of all the candidates. However, because such a simple strategy increased identification errors, the annealing technique was utilized again. The parent cell was determined for each daughter cell whose minimum B_k was less than a prescribed threshold, and the threshold was gradually increased by a prescribed step size to identify the parents of the remainder of the daughter cells.
Fig. 4 (caption fragment; the opening is missing from the source text. Panel labels: Cell #1 and its progeny; Cell #2 and its progeny; 1st, 51st and 85th frames). [...] is not identified in the next frame; lower right panel, manual correction that associates the target cell with the one in the red circle in the next frame. (C) Error propagation between frames. Red cells represent the progeny of cell #2. A daughter-mother association error in the 51st frame propagates to successive frames, which produces erroneous identifications of the progeny shown in the ellipse in the 85th frame. Manual correction of the error once in the 51st frame prevents all successive errors, leading to correct identifications in the 85th frame.
Manual correction for possible errors After applying cell region extraction, cell tracking, and the daughter-mother cell association step to each frame of a set of time-lapse images, all the required properties of each cell were identified, including its area, perimeter, center of gravity, correspondence to a cell in the previous frame, and mother cell. These properties were stored for use in fluorescence analysis. However, several kinds of errors may occur during cell region extraction, the tracking step, and/or the daughter-mother association step, which can result in the assignment of erroneous properties to each cell. These errors were corrected manually. For example, errors in the cell extraction process include cell over- and under-extraction and erroneous cell fragmentation or concatenation (Fig. 4A). These errors were manually corrected using a mouse-driven painting tool in our system. Erroneous association between cells was another type of error (Fig. 4B), which was corrected by directly editing the corresponding property (the corresponding cell in the previous frame or mother cell). Figure 4C shows the propagation of an error between frames, where progeny of cell #2 are shown in red. The error in the daughter-mother association in the 51st frame affects successive daughter-mother associations, which produces an erroneous identification of the progeny unless a manual correction is performed in the 51st frame. A single manual correction in the 51st frame can prevent error propagation in successive frames.
Fluorescence analysis and generation of dendrogram Because each cell was tracked over numerous frames and every daughter-mother association was identified, fluorescence analysis could be performed automatically by evaluating the fluorescence intensities for each cell using the pixel intensity in each cell region on each fluorescence image.
The result was recorded in an Excel file, similar to manual input, and the tree diagram was completed (Fig. 5). Differently colored circles are present on each branch because, in the manual analysis, cells were numbered in the order of birth, whereas in the automatic analysis system they were not numbered until the cell size exceeded a certain value. The expression level of cell region i is indicated by the evaluation value F_i. When ECFP was used, the fluorescence luminance was Y_i, the background fluorescence intensity was Y_bg, and the maximum luminance value was Y_max (arbitrary). Likewise, when mCherry was used, the fluorescence luminance was C_i, the background fluorescence intensity was C_bg, and the maximum luminance value was C_max (user-specified). Using this information, F_i is defined as [equation missing from the source text]. Figure 5 shows an example of the dendrograms obtained after full manual analysis (upper) and by the software system with manual error correction (lower). Some branches differed at the top and bottom of the dendrograms, although we were able to output almost the same tree diagram. The difference in branching was due to cell numbering, but otherwise there was no essential difference between the dendrograms.
Fig. 5. Dendrogram generation. Upper panel, the dendrogram obtained after full manual analysis; lower panel, the dendrogram obtained using the software with manual error correction. Circles on the vertical lines indicate identical cells in different frames, and branching indicates cell division. The ECFP signal is shown in green, and the EYFP signal in yellow. Cells that were no longer tracked or dead are shown as gray circles. The daughter-mother relationships of cells that divided from those in the first frame are set manually and shown as blue circles to distinguish them from those identified using the analysis method in Fig. 3.
Experimental results Software-based computer-aided analysis was performed using the same data as used for manual analysis (101 bright-field images, 11 fluorescence images, 400 min tracking time) (Fig. 6). In cell extraction, the system took 2 min to identify a total of 1,708 cells across all frames, and 15 min to identify and correct the three split errors. One reason for the division error was that high-intensity nuclei were regarded as cell boundaries. In cell tracking, an identical cell appearing in two consecutive frames was identified 1,650 times and daughter-mother relationships were identified 56 times, and these analyses took 30 sec. It took 30 min to identify and correct one pair of mistakes of an identical cell appearing in two consecutive frames and four pairs of mistakes in daughter-mother cell associations. Daughter-mother mistakes occurred frequently when the outline of the daughter cells was blurred in the bright field. Finally, it took 3 min to output the dendrogram. The total analysis time was about 50 min.
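As a quick arithmetic check on the reported timings, the per-step times can be summed and compared with the manual analysis. The step labels are paraphrased, and the 15 h manual figure is treated as a lower bound because the manual analysis is described as taking more than 15 h.

```python
# Per-step times in minutes, as reported for the software-assisted analysis.
STEPS_MIN = {
    "cell extraction": 2,
    "correcting split errors": 15,
    "tracking and daughter-mother association": 0.5,  # 30 sec
    "correcting association errors": 30,
    "dendrogram output": 3,
}

MANUAL_HOURS = 15  # lower bound for the fully manual analysis


def total_minutes(steps):
    """Sum the per-step times."""
    return sum(steps.values())


def speedup(manual_hours, automated_minutes):
    """Ratio of manual to software-assisted analysis time."""
    return manual_hours * 60 / automated_minutes


if __name__ == "__main__":
    total = total_minutes(STEPS_MIN)
    print(total)                                   # 50.5
    print(round(total / 60, 1))                    # 0.8
    print(round(speedup(MANUAL_HOURS, total), 1))  # 17.8
```

The sum of 50.5 min is consistent with the stated totals of about 50 min and 0.8 h, and corresponds to a reduction of at least roughly 18-fold relative to manual analysis.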

CONCLUSION
This study reports new software that greatly reduces the analysis time for time-lapse cell images of Saccharomyces cerevisiae. The software enables screening of silencing-related factors in S. cerevisiae and will contribute to the identification of genes affected by the heterochromatic region changes that occur between mother cells and daughter cells across generations.