Detecting periodic patterns
in unevenly spaced gene expression time series
using Lomb-Scargle periodograms


Earl F. Glynn1, Jie Chen1,2*, Arcady R. Mushegian1,3


1Stowers Institute for Medical Research, 1000 East 50th Street, Kansas City, MO  64110 USA.

2Department of Mathematics and Statistics, University of Missouri-Kansas City, 5100 Rockhill Road,
Kansas City, MO  64110  USA.
   *Corresponding author.

3Department of Microbiology, University of Kansas Medical Center, Kansas City, KS  66160  USA



ABSTRACT

Motivation
Periodic patterns in time series resulting from biological experiments are of great interest. The commonly used Fast Fourier Transform algorithm is applicable only when data are evenly spaced and when no values are missing, which is not always the case in high-throughput measurements. The choice of statistic to evaluate the significance of the periodic patterns for unevenly spaced gene expression time series has not been well substantiated.

Methods
The Lomb-Scargle periodogram approach is used to search time series of gene expression to quantify the periodic behavior of every gene represented on the DNA array. The Lomb-Scargle periodogram analysis provides a direct method to treat missing values and unevenly spaced time points. We propose the combination of a Lomb-Scargle test statistic for periodicity and a multiple hypothesis testing procedure with controlled false discovery rate to detect significant periodic gene patterns.
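
As an illustration of the approach described above, the following is a minimal Python sketch, not the authors' R implementation. It assumes the classical normalized Lomb-Scargle periodogram, Scargle's peak significance formula p = 1 - (1 - exp(-z))^M, and the Benjamini-Hochberg procedure as one common way to control the false discovery rate. The function names, the choice of M (here the number of test frequencies), the frequency grid, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Normalized Lomb-Scargle power at each frequency in freqs (all > 0).

    Missing values (NaN in y) are dropped, so uneven spacing and gaps
    need no interpolation.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~np.isnan(y)
    t, y = t[keep], y[keep]
    y = y - y.mean()                      # center the series
    var = y.var(ddof=1)                   # sample variance
    omega = 2.0 * np.pi * np.asarray(freqs, dtype=float)
    power = np.empty_like(omega)
    for i, w in enumerate(omega):
        # The offset tau makes the result invariant to shifts in time.
        tau = np.arctan2(np.sin(2 * w * t).sum(),
                         np.cos(2 * w * t).sum()) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s)) / (2 * var)
    return power

def peak_p_value(z, m):
    """Probability that the highest of m independent peaks exceeds z
    under the null hypothesis of pure noise: 1 - (1 - exp(-z))**m."""
    return -np.expm1(m * np.log1p(-np.exp(-z)))

def benjamini_hochberg(pvals, q=0.05):
    """Boolean mask of hypotheses rejected at false discovery rate q."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()   # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

# Hypothetical demo: a noisy 24 hr cosine sampled at 40 uneven time points.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 96, 40))                  # hours
y = np.cos(2 * np.pi * t / 24) + rng.normal(0, 0.3, t.size)
freqs = np.linspace(1 / 96, 1 / 8, 200)              # cycles per hour
power = lomb_scargle(t, y, freqs)
best = np.argmax(power)
print("dominant period (hr):", 1 / freqs[best])
print("p-value of the peak:", peak_p_value(power[best], freqs.size))
```

In a whole-array analysis, one p-value per probe would be computed this way and the full vector passed to `benjamini_hochberg` to decide which probes to call periodic. Treating every test frequency as independent is a conservative simplification.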

Results
We analyzed Bozdech's Plasmodium falciparum gene expression dataset. In the Quality Control Dataset of 5080 expression patterns, we found 4112 periodic probes. In addition, we identified 243 probes with periodic expression in the Complete Dataset, which could not be examined in the original study by the FFT analysis due to an excessive number of missing values. While most periodic genes had a period of about 48 hr, some had a period close to 24 hr. Our approach should be applicable for detection and quantification of periodic patterns in any unevenly spaced gene expression time series data.


Figures
Fig. 1 Simulated cosine signal taken on unevenly spaced time points mixed with Gaussian noise (48 hr period)
Fig. 2 Simulated cosine signal taken on unevenly spaced time points (mixed with Gaussian noise) with single dominant period (24 hr)
Fig. 3 Simulated cosine signal taken on unevenly spaced time points (mixed with Gaussian noise) with multiple periods (8, 24, and 48 hr)
Fig. 4 Simulated Gaussian noise taken at unevenly spaced time points with no periodicity
Fig. 5 Periodic gene sets identified by the two methods (Bozdech vs Lomb-Scargle)
Fig. 6 Periodic gene expression pattern of i3518_1 in Bozdech's Plasmodium falciparum gene expression dataset
Fig. 7 Non-periodic gene expression pattern of j167_5 in the Plasmodium falciparum gene expression dataset

Stowers Institute for Medical Research
Bioinformatics Center

Updated
22 Nov 2005