General Paper

Analysis Method for Digital PCR Data on Two Dimensional Orthogonal Coordinate (*x*, *y*) by Converting to Two Dimensional Polar Coordinate (*r*, *θ*) Using EXCEL Macro

2019 Volume 5

Abstract

An absolute quantitative analysis method has recently been developed as a
third-generation polymerase chain reaction (PCR) method for fractionated DNA. The method is
designed to determine the number of DNA molecules in a target DNA sample by counting the
number of PCR products obtained from fractionated DNA. We applied an EXCEL Macro to convert
the two dimensional orthogonal coordinate (*x*, *y*) fluorescent signal plot data obtained
by a digital PCR device
to two dimensional polar coordinate

1 INTRODUCTION

It has been demonstrated that data from digital PCR analysis [1,2,3,4], in which the wild type and mutant type are
labeled with a probe/Hexachloro Fluorescein (HEX) and a probe/Fluorescein Amidite (FAM),
respectively, generate approximately 15,000–20,000 fluorescent signals, which are
recorded in the two dimensional orthogonal coordinate (*x*, *y*) to determine the extent
of target gene amplification [5]. The presence of a gene mutation
and the mutation rate can be detected with high sensitivity in an extremely small amount of target sample
by plotting the intensity of mutant type (Mt) with HEX fluorescent signal in the

Figure 1.

Fluorescent signal plots of dPCR in two dimensional orthogonal coordinates
(*x*, *y*). Sample is ALK wild type/resistant mutation
type (L1196M) DNA. A: Mutation Type region (Mt region); B: No Template Control region
(NTC region); C: Wild Type region (Wt region). HEX (Hexachloro Fluorescein): a probe for
Wild type DNA; FAM (Fluorescein Amidite): a probe for Mutation type DNA. HEX fluorescent
signals are plotted on the *x* axis and FAM fluorescent signals
are plotted in

2 METHOD AND RESULTS

2.1 Digital PCR data in two dimensional orthogonal coordinate (*x*, *y*)

It has been generally demonstrated that the extent of the target gene amplification can
be monitored by plotting the intensity of the fluorescence signals of Wt/HEX in the
*x* axis and those of Mt/FAM in the

Figure 2.

Histogram of two dimensional orthogonal coordinate (HEX axis). Sample is ALK wild
type/resistant mutation type (L1196M) DNA. In the histogram of the
*x* (HEX) axis, A (Mt region) overlaps with B (NTC
region); therefore, the number of HEX fluorescent signal plots in A (Mt region) cannot be
counted accurately. Two separate histograms of

Figure 3.

Histogram of two dimensional orthogonal coordinate (FAM axis). Sample is ALK wild
type/resistant mutation type (L1196M) DNA. In the histogram of the
*y* (FAM) axis, C (Wt region) overlaps with B (NTC
region), as in Figure 2; therefore, the number of
FAM fluorescent signal plots in C (Wt region) cannot be counted accurately. Two
separate histograms of

The origin (*x* = 0, *y* = 0) in two
dimensional orthogonal coordinate

Figure 4.

Conversion of dPCR fluorescent signal plots from two dimensional orthogonal
coordinate (*x*, *y*) to two dimensional polar
coordinate (*r*, *θ*). Origin
(*x* = 0,

*r* = SQRT(X^2+Y^2)

*θ* = DEGREES(ATAN(Y/X))
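The two formulas above can be sketched in Python as follows. Note that ATAN(Y/X) on its own returns angles only between −90° and 90° and is undefined for X = 0, so covering the full 0°–360° range used in the *θ* histogram requires some quadrant correction. How the macro handles this is not shown in this text; the sketch below assumes the four-quadrant arctangent `math.atan2` with a wrap into [0°, 360°). The function name `to_polar` is hypothetical.

```python
import math

def to_polar(x, y):
    """Convert an orthogonal point (x, y) to polar (r, theta in degrees).

    Assumption: theta is wrapped into [0, 360) using atan2; the formula
    DEGREES(ATAN(Y/X)) alone only covers (-90, 90) degrees.
    """
    r = math.hypot(x, y)                            # same as SQRT(X^2 + Y^2)
    theta = math.degrees(math.atan2(y, x)) % 360.0  # four-quadrant angle
    return r, theta

# Illustrative points only, not measured dPCR data
r1, theta1 = to_polar(3.0, 4.0)    # first quadrant
r2, theta2 = to_polar(-1.0, 0.0)   # negative x axis
r3, theta3 = to_polar(0.0, -2.0)   # negative y axis, where Y/X is undefined
```

With plain ATAN(Y/X), points on opposite sides of the origin, such as (1, 1) and (−1, −1), would receive the same angle; the four-quadrant form keeps them 180° apart.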

Figure 5.

Histogram of two dimensional polar coordinate (*θ*, 0°–360°). Sample
is ALK wild type/resistant mutation type (L1196M) DNA. Frequency of the three regions
A (Mt region), B (NTC region) and C (Wt region) by histogram of two dimensional polar
coordinate (*θ*, 0°–360°)

After the coordinate conversion, the numbers of fluorescent signal plots in A (Mt region), B (NTC region) and C (Wt region) are calculated automatically at each of the 90%, 95% and 99% confidence intervals using the normal distribution probability density function “NORMDIST” in EXCEL. A gene mutation is determined by the existence of fluorescent signal plots in the Mt region, and the gene mutation rate is calculated as Mt fluorescent signal plots/(Mt fluorescent signal plots + Wt fluorescent signal plots). Figure 6 shows the EXCEL Macro program execution result for ALK wild type/resistant mutation type (L1196M) DNA (10,000 copies), and Figure 7 shows the result for ALK wild type/resistant mutation type (L1196M) DNA (Mt/Wt; 10/1,000 copies).
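One way to picture counting plots within a confidence interval is sketched below in Python. This is an assumption about the approach, not the macro itself: the *θ* values of a single region are treated as one normal distribution, and a plot is counted when it lies within the central 90%, 95% or 99% interval around the mean. The function name `count_within_interval` and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def count_within_interval(thetas, level):
    """Count angles inside the central `level` interval of a fitted normal.

    Assumption: the region's theta values follow one normal distribution;
    this is not necessarily how the macro applies NORMDIST.
    """
    mu, sigma = mean(thetas), stdev(thetas)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    lower, upper = mu - z * sigma, mu + z * sigma
    return sum(lower <= t <= upper for t in thetas)

# Illustrative angles only (degrees), not measured dPCR data
thetas = [float(i) for i in range(101)]
counts = {level: count_within_interval(thetas, level)
          for level in (0.90, 0.95, 0.99)}
```

A wider interval can only keep the count the same or raise it, which matches the way the plot numbers reported for Figures 6 and 7 grow from the 90% to the 99% interval.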

Figure 6.

Result of EXCEL macro program execution using the converted two dimensional polar
coordinate (*r*, *θ*). Sample is ALK wild type/resistant
mutation type (L1196M) DNA (10,000 copies). The numbers of fluorescent signal plots in A (Mt
region) and C (Wt region) are calculated accurately at the same time and are
shown in the table at the upper right. In this case, a gene mutation is
detected by the existence of fluorescent signal plots in A (Mt region), and the gene
mutation rates are 51.7% (2315/4475), 52.1% (2475/4747) and 51.8% (2568/4954) at the
90%, 95% and 99% confidence intervals, respectively. “center” in the table gives
the center coordinates (*x*, *y*) of the A, B and C regions,
respectively. The inner, middle and outer ellipses show the areas at
the 90%, 95% and 99% confidence intervals, respectively.

Figure 7.

Result of EXCEL macro program execution using the converted two dimensional polar
coordinate (*r*, *θ*). Sample is ALK wild type/resistant
mutation type (L1196M) DNA (Mt/Wt; 10/1,000 copies). The numbers of fluorescent signal plots in
A (Mt region) and C (Wt region) are calculated accurately at the same time and are
shown in the table at the upper right. In this case, although there are
very few fluorescent signal plots, a gene mutation is observed from the existence of
fluorescent signal plots in A (Mt region), and the gene mutation rates are 1.1%
(6/528), 1.3% (7/555) and 1.2% (7/565) at the 90%, 95% and 99% confidence intervals,
respectively. “center” in the table gives the center coordinates (*x*,
*y*) of the A, B and C regions, respectively. As in Figure 6, the inner, middle and outer
ellipses show the areas at the 90%, 95% and
99% confidence intervals, respectively.
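The mutation rates quoted in the captions of Figures 6 and 7 can be rechecked with a few lines of Python. The sketch reads each denominator as Mt + Wt plots, following the formula Mt/(Mt + Wt) given in the text; the function name `mutation_rate` is hypothetical.

```python
def mutation_rate(mt_plots, wt_plots):
    """Gene mutation rate Mt / (Mt + Wt), returned as a fraction."""
    total = mt_plots + wt_plots
    if total == 0:
        raise ValueError("no Mt or Wt plots counted")
    return mt_plots / total

# (Mt plots, Mt + Wt plots) as reported in the captions of Figures 6 and 7
reported = {
    "Figure 6": {90: (2315, 4475), 95: (2475, 4747), 99: (2568, 4954)},
    "Figure 7": {90: (6, 528), 95: (7, 555), 99: (7, 565)},
}

for figure, by_level in reported.items():
    for level, (mt, total) in by_level.items():
        rate = mutation_rate(mt, total - mt)  # Wt = total - Mt
        print(f"{figure}, {level}% interval: {100 * rate:.1f}%")
```

The printed values reproduce the reported 51.7%, 52.1% and 51.8% for Figure 6 and 1.1%, 1.3% and 1.2% for Figure 7.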

Whether target gene mutations are present can be judged promptly and easily by visualizing the fluorescent signal plot data from genetic analysis through the following EXCEL VBA macro procedure. All steps are automated except Steps 1) and 9), which were input and evaluated manually.

[Operating Procedure]

1) Data uptake in two dimensional orthogonal coordinate (*x*, *y*)

2) Determine centroids for corresponding A (Mt region), B (NTC region) and C (Wt
region) by *k*-means clustering method

3) Determine the center of gravity of the triangle formed by the centroids of A (Mt region), B (NTC region) and C (Wt region)

4) Translate the origin of two dimensional orthogonal coordinate
(*x*, *y*) to the center of gravity

5) Conversion to two dimensional polar coordinate (*r*, *θ*)*

* *r* = SQRT(X^2+Y^2),

6) Histogram preparation based on *θ* (degrees)

7) Calculate the numbers of fluorescent signal plots distributed in A (Mt region), B (NTC region) and C (Wt region) within the 90%, 95% and 99% confidence intervals using the NORMDIST function

8) Re-conversion** to two dimensional orthogonal coordinate (*x*,
*y*) and display of elliptical graphs for the 90%, 95% and 99% confidence
intervals

** *x* =

9) Determine gene mutation and calculate gene mutation rates at the 90%, 95% and 99% confidence intervals, respectively
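Steps 2) to 4) of the procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' VBA code: the *k*-means routine is a plain Lloyd-style iteration, the initial centers are supplied by the caller (the procedure does not say how the macro chooses them), and the function names and toy data are hypothetical.

```python
import math

def kmeans(points, centers, iterations=50):
    """Plain Lloyd-style k-means on (x, y) tuples.

    Assumption: initial centers come from the caller; the macro's own
    initialization is not described in the procedure.
    """
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # assign each point to its nearest current center
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        updated = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else c                  # an empty cluster keeps its center
            for g, c in zip(groups, centers)
        ]
        if updated == centers:           # no center moved: converged
            break
        centers = updated
    return centers

def triangle_centroid(a, b, c):
    """Center of gravity of the triangle with vertices a, b and c."""
    return ((a[0] + b[0] + c[0]) / 3.0, (a[1] + b[1] + c[1]) / 3.0)

def shift_origin(points, origin):
    """Translate points so that `origin` becomes (0, 0)."""
    ox, oy = origin
    return [(x - ox, y - oy) for x, y in points]

# Toy (HEX, FAM) intensities, not data from the paper:
# two Mt-like, two NTC-like and two Wt-like plots
points = [(1, 9), (2, 10), (1, 1), (2, 2), (9, 1), (10, 2)]
centers = kmeans(points, centers=[(0, 10), (0, 0), (10, 0)])
origin = triangle_centroid(*centers)
shifted = shift_origin(points, origin)
```

After the shift, the three clusters lie in different directions around the new origin, so they occupy different ranges of the angle *θ* once the plots are converted to polar coordinates.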

3 DISCUSSION

In quantitative analysis from histograms in HEX (*x* axis) and
FAM (

REFERENCES

- [1] X. Shi, C. Tang, W. Wang, D. Zhou, Z. Lu, Electrophoresis, 31, 528 (2010). doi:10.1002/elps.200900362, PMID: 20119960
- [2] E. A. Ottesen, J. W. Hong, S. R. Quake, J. R. Leadbetter, Science, 314, 1464 (2006). doi:10.1126/science.1131370, PMID: 17138901
- [3] B. Vogelstein, K. W. Kinzler, Proc. Natl. Acad. Sci. USA, 96, 9236 (1999). doi:10.1073/pnas.96.16.9236, PMID: 10430926
- [4] S. Dube, J. Qin, R. Ramakrishnan, PLoS One, 3, e2876 (2008). doi:10.1371/journal.pone.0002876, PMID: 18682853
- [5] B. J. Hindson, K. D. Ness, D. A. Masquelier, P. Belgrader, N. J. Heredia, A. J. Makarewicz, I. J. Bright, M. Y. Lucero, A. L. Hiddessen, T. C. Legler, T. K. Kitano, M. R. Hodel, J. F. Petersen, P. W. Wyatt, E. R. Steenblock, P. H. Shah, L. J. Bousse, C. B. Troup, J. C. Mellen, D. K. Wittmann, N. G. Erndt, T. H. Cauley, R. T. Koehler, A. P. So, S. Dube, K. A. Rose, L. Montesclaros, S. Wang, D. P. Stumbo, S. P. Hodges, S. Romine, F. P. Milanovich, H. E. White, J. F. Regan, G. A. Karlin-Neumann, C. M. Hindson, S. Saxonov, B. W. Colston, Anal. Chem., 83, 8604 (2011). doi:10.1021/ac202028g, PMID: 22035192
- [6] Kyoritsu Sugakukosiki, Kyoritsu Shuppan, pp. 111 (1975).
- [7] S. Neilson, “k-Means Cluster Analysis in Microsoft Excel” (2011). http://www.neilson.co.za/k-means-cluster-analysis-in-microsoft-excel/
- [8] H. Steinhaus, Bull. Acad. Polon. Sci., 4, 801 (1957) (in French).
- [9] E. W. Forgy, Biometrics, 21, 768 (1965).
- [10] J. B. MacQueen, “Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability”, Berkeley, University of California Press, 1, 281–297 (1967).

© 2019 Society of Computer Chemistry, Japan