Frequency and Correlation of Nearest Neighboring Nucleotides in Human Genome
-
Abstract
Zipf's approach in linguistics is used to analyze the statistical features of the frequency and correlation of the 16 nearest neighboring nucleotides (AA, AC, AG, …, TT) in 12 human chromosomes (Y, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, and 12). Two statistical features of nearest neighboring nucleotides in the human genome are found: (i) the frequency distribution is a linear function, and (ii) the correlation distribution is an inverse function. The coefficients of the linear function and the inverse function depend on the GC content. This work proposes the correlation distribution of nearest neighboring nucleotides for the first time and extends the descriptor of nearest neighboring nucleotides.
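As a rough illustration of the frequency part of this analysis, the sketch below counts the 16 nearest neighboring nucleotide pairs in a sequence, ranks them by relative frequency in the Zipf manner, and computes the GC content. It rests on assumptions not stated in the abstract: pairs are counted with an overlapping window of step 1, pairs containing characters other than A, C, G, T (such as N) are skipped, and ties in frequency are broken alphabetically. The function names are hypothetical, and the correlation measure is not sketched because its definition is not given here.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# The 16 nearest neighboring nucleotide pairs: AA, AC, AG, ..., TT
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]


def pair_frequencies(seq):
    """Relative frequency of each pair, using an overlapping window (assumption)."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        # Skip pairs with ambiguous or unknown bases such as N (assumption)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    total = sum(counts.values())
    return {p: (counts[p] / total if total else 0.0) for p in PAIRS}


def rank_by_frequency(freqs):
    """Zipf-style ranking: (rank, pair, frequency), rank 1 is the most frequent."""
    # Ties are broken alphabetically by pair (assumption)
    ordered = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, pair, f) for rank, (pair, f) in enumerate(ordered, start=1)]


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of the sequence."""
    valid = [b for b in seq.upper() if b in BASES]
    return sum(b in "GC" for b in valid) / len(valid) if valid else 0.0


if __name__ == "__main__":
    toy = "AACGTTGCAAGC"  # toy sequence, not real chromosome data
    for rank, pair, f in rank_by_frequency(pair_frequencies(toy)):
        print(rank, pair, round(f, 3))
    print("GC content:", gc_content(toy))
```

Applied to a whole chromosome, the ranked frequencies could then be plotted against rank to check whether the relation looks linear, and the fitted coefficients compared across chromosomes with different GC content.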
-
-