Characterization of Chronic Fatigue Syndrome
Using Affective Disorder and Immune System Pathways

Earl F. Glynn1, Hua Li1, Chris Seidel1, Frank Emmert-Streib1, Jie Chen2, Arcady R. Mushegian1,3

1 Stowers Institute for Medical Research
2 University of Missouri -- Kansas City
3 University of Kansas Medical Center

Online Supplement


PowerPoint Slides (9.3 MB)
PDF of Slides (451 KB)
Original Abstract (outdated)

presented at
Critical Assessment of Microarray Data Analysis "CAMDA '06"
Duke University
9 June 2006


Wichita Chronic Fatigue Syndrome Study
Data Analysis and Results

CAMDA 2006 Conference Datasets

Clinical Survey Data Analysis

  • Chronic Fatigue Syndrome (CFS) disease state clusters from Reeves et al. (2005), extracted from survey data (Classification.csv).  The cluster classifications "Worst", "Middle", and "Least" were used for the comparisons in the Gene Expression and SNP analyses below.

Notes about clinical data

Blood Data Analysis

Notes about clinical data

Gene Expression Data Analysis

  1. Candidate Gene Lists
    [Lists selected since they span both CFS Gene Expression and CFS SNP data, and identify genes likely related to psychological and neuroendocrine causes/exclusions of CFS.]
  2. Match array probe IDs to Gene IDs using Bioconductor biomaRt:
    21,950 matches for 19,700 probes.
    GeneInfo-2006-04-28.zip and R script.
  3. Load raw expression data, examine and prepare for analysis.  R Scripts: 0b-GeneExpression-Selected.R, 1-LoadData.R, 2-Zeros.R, 3-ControlProbes.R, 4-Replicates.R, 5-PatientScaled.R
  4. Analyze Hattori Gene Set:  Final Results:  ExpressionStats4Final.xls
    [R-Scripts:  1-MatchBiomartWithAffectiveDisorderGenes.R, 2-ExtractExpressionData.R, 3-AnalyzeExpressionData.R.  Kruskal-Wallis results for all:  ExpressionStats1Raw.csv]
  5. Analyze PNI Gene Set:  Final Results:  ExpressionStats4Final.xls
    [R-Scripts: 1-MatchBiomartWithPNIGenes.R, 2-ExtractExpressionData.R, 3-AnalyzeExpressionData.R. Kruskal-Wallis results for all: ExpressionStats1Raw.csv]
  6. Aggregate Results:  Summary of Differentially Expressed Genes, FinalInfo.xls, with 8 genes in Worst vs Least (Only), 5 genes in Worst vs Middle, and 50 genes in Worst-or-Middle vs Least.  The table shows Probe IDs, Gene Names, Nucleotide and Protein GI numbers, KEGG Pathways, and descriptive information.
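
The Kruskal-Wallis comparisons named in steps 4 and 5 can be illustrated with a minimal sketch. This is not the authors' code: their analysis used the R scripts listed above, which are not reproduced here. The function name, the sample values, and the use of Python are all assumptions made for illustration. The sketch ranks one probe's expression values across the three clusters ("Worst", "Middle", "Least"), computes the tie-corrected H statistic, and uses the chi-square approximation for the p-value, which has a closed form when there are exactly three groups.

```python
import math
from collections import Counter


def kruskal_wallis_3(worst, middle, least):
    """Return (H, p) for a three-group Kruskal-Wallis rank-sum test.

    Hypothetical helper for illustration only. The p-value relies on the
    chi-square approximation with 2 degrees of freedom, whose survival
    function is exp(-H / 2).
    """
    groups = (worst, middle, least)
    if any(len(g) == 0 for g in groups):
        raise ValueError("each group needs at least one value")

    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    # Assign mid-ranks: tied values share the average of their positions.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j

    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    weighted = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = Counter(pooled)
    correction = 1.0 - sum(t ** 3 - t for t in ties.values()) / (
        n_total ** 3 - n_total
    )
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    h /= correction

    p_value = math.exp(-h / 2.0)  # chi-square survival function, df = 2
    return h, p_value


# Made-up expression values for a single probe, one list per cluster.
h_stat, p_val = kruskal_wallis_3([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(h_stat, p_val)
```

With groups this small the chi-square approximation is rough; an exact or permutation p-value would be the more careful choice.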

KEGG Pathways in aggregate results compare favorably to those in:

  • Hong Fang, et al, Gene expression profile exploration of a large dataset on chronic fatigue syndrome, Pharmacogenomics (2006), 7(3), 429-440
  • Toni Whistler, et al, Gene expression correlates of unexplained fatigue, Pharmacogenomics (2006), 7(3), 395-405

Notes about Gene Expression Data

SNP (Single Nucleotide Polymorphism) Data Analysis

  1. SNPNumeric.csv [R scripts to reformat original data: SNP-Reformat1.R and SNP-Reformat2.R]
  2. Single Marker Analysis and statistical details
  3. Hardy-Weinberg Equilibrium Analysis Results Summary and Details, plus R Script
  4. Bagged Logic Regression Analysis
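
As an illustration of the Hardy-Weinberg Equilibrium step in the list above, here is a minimal sketch of the standard one-degree-of-freedom chi-square test for a biallelic SNP. It is not the authors' R script: the function name, the genotype counts, and the choice of Python are assumptions made for illustration. The sketch estimates the allele frequency from the observed genotype counts, derives the expected counts under equilibrium, and compares the two.

```python
import math


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Return (chi2, p) for a Hardy-Weinberg test on one biallelic SNP.

    Hypothetical helper for illustration only. Arguments are the observed
    counts of the AA, AB, and BB genotypes.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")

    # Allele frequency of A: each AA carries two copies, each AB one.
    p = (2 * n_aa + n_ab) / (2.0 * n)
    q = 1.0 - p

    # Expected genotype counts under equilibrium: p^2, 2pq, q^2 times n.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        raise ValueError("monomorphic marker; the test is undefined")

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Chi-square survival function for df = 1 is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up genotype counts for a single SNP.
chi2_stat, p_val = hardy_weinberg_chi2(30, 50, 20)
print(chi2_stat, p_val)
```

For SNPs with rare genotypes the chi-square approximation becomes unreliable, and an exact test is generally preferred.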

Genes in "Worst-Least" logic regression disjuncts compare favorably to "Importance of the genes based on their SNPs," Table 4, Benjamin N Goertzel, et al, Combinations of single nucleotide polymorphisms in neuroendocrine effector and receptor genes predict chronic fatigue syndrome, Pharmacogenomics (2006), 7(3), 475-483

Notes about SNP data

 

Updated
12 June 2006