The data

There are three kinds of information: genotypic, map and phenotypic. The formats are not unlike those from the Wellcome Trust Case Control Consortium (http://www.wtccc.org.uk/info/data_formats.shtml).

We illustrate with data from chromosome 7.

The map information

The map file lists, for each SNP, a numeric identifier, the SNP name, the chromosome number, the position on the chromosome, and the two alleles at that SNP.

5012435 rs6583338       7       141322  A       G
1612556 rs4451252       7       145340  C       T
5086570 rs7806592       7       149081  C       T
1679309 rs4281072       7       149266  C       T
6735129 rs9642259       7       155737  C       G
8516302 rs11771879      7       158399  C       T
5034463 rs6964622       7       158577  A       G
5039071 rs6969424       7       159284  A       C
8518356 rs11970804      7       160848  G       T
6750991 rs10226966      7       166807  C       T
8508684 rs11764261      7       180194  A       G
6852728 rs10435479      7       187608  C       T
5081622 rs7801619       7       189089  A       G
5086229 rs7806249       7       189202  C       T
1583901 rs4430042       7       189293  A       T
1641608 rs4298450       7       189660  A       G
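
As an illustration of how such a file could be read, here is a minimal Python sketch. The field names (snp_id, position, allele_a, allele_b) are our own labels rather than names defined by the data provider, and the reading of the first column as an internal identifier and the fourth as a base-pair position is an assumption based on the listing above.

```python
from typing import NamedTuple


class MapRecord(NamedTuple):
    snp_id: int     # first column; assumed to be an internal numeric identifier
    name: str       # SNP name (rs number)
    chrom: str      # chromosome
    position: int   # fourth column; assumed to be the base-pair position
    allele_a: str
    allele_b: str


def parse_map_line(line: str) -> MapRecord:
    """Split one whitespace-delimited map line into a typed record."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    snp_id, name, chrom, position, allele_a, allele_b = fields
    return MapRecord(int(snp_id), name, chrom, int(position), allele_a, allele_b)


# Three lines copied from the listing above.
sample = """\
5012435 rs6583338       7       141322  A       G
1612556 rs4451252       7       145340  C       T
5086570 rs7806592       7       149081  C       T
"""

records = [parse_map_line(line) for line in sample.splitlines() if line.strip()]
by_name = {record.name: record for record in records}
print(by_name["rs4451252"])
```

Indexing the records by SNP name makes it easy to look up the position and alleles of a SNP when working through the genotype file.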

The genotypic data

These are in the so-called long, skinny format, in which data are organised by SNP name, individual ID and genotype, with one line per SNP and individual. The file shown below is in ASCII text format and has been compressed into .gz format.

rs3094315       WTCCC139239     AA      0.1426
rs6672353       WTCCC139239     GG      0.07488
rs4040617       WTCCC139239     AA      0.02626
rs2980300       WTCCC139239     CC      0.09944
rs2905036       WTCCC139239     TT      0.2563
rs4245756       WTCCC139239     CC      0.0116
rs4075116       WTCCC139239     CT      0.08164
rs9442385       WTCCC139239     NN      0.7471
rs10907175      WTCCC139239     AA      0.01882
rs2887286       WTCCC139239     CT      0.08472
rs6603781       WTCCC139239     GG      0.06592
rs11260562      WTCCC139239     AG      0.02318
rs6685064       WTCCC139239     CC      0.05305
rs307378        WTCCC139239     NN      0.5644
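
The long format is convenient for storage, but many analyses need one row per individual. The following Python sketch shows one way to regroup the records; it is not the procedure used to produce the files. It assumes that "NN" marks a genotype that could not be called, and it keeps the fourth column only as an unnamed numeric score, since its meaning is not stated here.

```python
from collections import defaultdict

MISSING = "NN"  # assumption: "NN" marks a genotype that could not be called


def read_long(lines):
    """Yield (snp, individual, genotype, score) from long-format lines."""
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        snp, individual, genotype, score = fields
        yield snp, individual, genotype, float(score)


def to_wide(records):
    """Regroup long records into {individual: {snp: genotype or None}}."""
    wide = defaultdict(dict)
    for snp, individual, genotype, _score in records:
        wide[individual][snp] = None if genotype == MISSING else genotype
    return dict(wide)


# Four lines copied from the listing above.
sample = [
    "rs3094315       WTCCC139239     AA      0.1426",
    "rs4075116       WTCCC139239     CT      0.08164",
    "rs9442385       WTCCC139239     NN      0.7471",
    "rs2887286       WTCCC139239     CT      0.08472",
]

wide = to_wide(read_long(sample))
n_missing = sum(g is None for g in wide["WTCCC139239"].values())
print(wide["WTCCC139239"], n_missing)
```

For the compressed file itself, the same functions can be fed from `gzip.open(path, "rt")`, where `path` is a placeholder for the actual file name.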

The phenotypic data

The phenotypic data are relatively simple in this case, much like the data from other analyses.

  obs:         7,686
 vars:            22                          11 Jan 2007 15:02
 size:       630,252 (94.0% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
wtccc_id        str11  %11s                   SAMPLE_ID
case            str1   %9s                    CASE
cohort          str1   %9s                    COHORT
sample          str1   %9s                    SAMPLE
sex             byte   %9.0g                  1=male,2=female
age             float  %9.0g
weight          float  %9.0g                  participant's weight in kg
height          float  %9.0g                  participant's height in cm
waist           float  %9.0g                  Participant's waist in cm
hip             float  %9.0g                  participant's hip in cm
bmi             float  %9.0g                  participant's body-mass index
                                                in kg/m^2 at 1HC
fev1            int    %9.0g                  higher of the 2 fev1
                                                measurements in l (1HC)
diastol         float  %9.0g                  average of the 2 diastolic BP
                                                measures in mmHg (1HC)
systol          float  %9.0g                  average of the 2 systolic BP
                                                readings in mmHg (1HC)
cholestl        float  %9.0g                  total cholesterol in mmol/l
                                                (1HC)
ldl             float  %9.0g                  LDL cholesterol in mmol/l (1HC)
hdl             float  %9.0g                  HDL cholesterol in mmol/l (1HC)
triglyc         float  %9.0g                  triglycerides in mmol/l (1HC)
hba1c           float  %9.0g                  Hb A1C as a % (1HC)
menostat        byte   %22.0g                 Menopausal status - 1=pre-,
                                                2=peri-(within 1 year of last
                                                period), 3=post-(2 to
menoage         float  %9.0g
fibrinog        float  %22.0g                 FIBRINOGEN
-------------------------------------------------------------------------------

Some rules of rotation, as defined in rotate180 and used by exclude.sas

A1      H12
A2      H11
A3      H10
A4      H9
A5      H8
A6      H7
A7      H6
A8      H5
A9      H4
A10     H3
A11     H2
A12     H1
B1      G12
B2      G11
B3      G10
B4      G9
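
The pairs listed above are consistent with turning a plate of 8 rows (A to H) and 12 columns through 180 degrees. The Python sketch below reproduces that mapping under this assumption; it is not the SAS code of rotate180 itself, and the plate dimensions are inferred from the listed pairs.

```python
ROWS = "ABCDEFGH"  # assumption: 8 rows, inferred from the pairs listed above
N_COLS = 12        # assumption: 12 columns


def rotate180(well: str) -> str:
    """Return the well position after rotating the plate by 180 degrees."""
    row, col = well[0].upper(), int(well[1:])
    if row not in ROWS or not 1 <= col <= N_COLS:
        raise ValueError(f"not a valid well position: {well!r}")
    new_row = ROWS[len(ROWS) - 1 - ROWS.index(row)]  # A <-> H, B <-> G, ...
    new_col = N_COLS + 1 - col                       # 1 <-> 12, 2 <-> 11, ...
    return f"{new_row}{new_col}"


for well in ("A1", "A12", "B4"):
    print(well, rotate180(well))
```

Applying the function twice returns the original position, which gives a simple check on any table of rotation rules.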

Individuals to be excluded

     +------------------------------+
     |          id           reason |
     |------------------------------|
  1. | WTCCC139256   het,cr         |
  2. | WTCCC139274   related,cr     |
  3. | WTCCC139281   het            |
  4. | WTCCC139285   related,cr     |
  5. | WTCCC139286   related,cr     |
     |------------------------------|
  6. | WTCCC139291   het,cr         |
  7. | WTCCC139567   cr             |
  8. | WTCCC139602   cr             |
  9. | WTCCC139605   cr             |
 10. | WTCCC139608   cr             |
     |------------------------------|
 11. | WTCCC139822   cr             |
 12. | WTCCC139886   cr             |
 13. | WTCCC139889   cr             |
 14. | WTCCC139891   cr             |
 15. | WTCCC139896   cr             |

...
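
A listing in this layout can be turned into a lookup table of IDs and reason codes. The Python sketch below is one possible approach, assuming the reason codes are comma-separated as shown; the helper name parse_exclusions and the sample ID list are hypothetical.

```python
import re
from collections import Counter

# Matches data rows such as "  1. | WTCCC139256   het,cr         |";
# header and border rows have no leading row number and are skipped.
ROW = re.compile(r"^\s*\d+\.\s*\|\s*(\S+)\s+(\S+)\s*\|\s*$")


def parse_exclusions(lines):
    """Return {id: set of reason codes} from a listing like the one above."""
    exclusions = {}
    for line in lines:
        match = ROW.match(line)
        if match:
            exclusions[match.group(1)] = set(match.group(2).split(","))
    return exclusions


# The first block of the listing above.
listing = """\
     +------------------------------+
     |          id           reason |
     |------------------------------|
  1. | WTCCC139256   het,cr         |
  2. | WTCCC139274   related,cr     |
  3. | WTCCC139281   het            |
  4. | WTCCC139285   related,cr     |
  5. | WTCCC139286   related,cr     |
     |------------------------------|
"""

exclusions = parse_exclusions(listing.splitlines())
reason_counts = Counter(code for codes in exclusions.values() for code in codes)

# Hypothetical list of sample IDs to filter; WTCCC139239 is not in the listing.
ids = ["WTCCC139239", "WTCCC139256", "WTCCC139281"]
kept = [sample_id for sample_id in ids if sample_id not in exclusions]
print(reason_counts, kept)
```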

C58.maf.gz

SNP     A1      A2      MAF
rs3094315       3       1       0.162272
rs6672353       0       3       0
rs4040617       3       1       0.133288
rs2980300       4       2       0.160204
rs2905036       2       4       0.000338066
rs4245756       4       2       0.000675676
rs4075116       2       4       0.282138
rs9442385       4       3       0.0697279
rs10907175      2       1       0.0851351
rs2887286       2       4       0.171419
rs6603781       1       3       0.11413
rs11260562      1       3       0.0734072
rs6685064       4       2       0.0728814
rs307378        4       3       0.0278533
rs1695824       1       2       0.477027
rs3766180       2       4       0.283108
rs6603791       3       1       0.321959
rs7540231       3       1       0.176411

...
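
The alleles in this file are coded numerically. Comparing them with the genotypes listed earlier (for example rs6672353, coded 0 and 3, with genotype GG) suggests the coding 1=A, 2=C, 3=G, 4=T, with 0 for an allele that was not observed. This is an inference from the listings, not a documented fact. Under that assumption, the Python sketch below decodes the alleles and selects SNPs above a minor allele frequency threshold; the threshold of 0.01 is purely illustrative.

```python
# Assumed coding, inferred by comparing this file with the genotype listing.
ALLELE_CODES = {"0": None, "1": "A", "2": "C", "3": "G", "4": "T"}


def parse_maf(lines):
    """Parse a whitespace-delimited MAF table with a header line."""
    rows = []
    header_seen = False
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if not header_seen:
            header_seen = True  # first non-blank line is "SNP A1 A2 MAF"
            continue
        snp, a1, a2, maf = fields
        rows.append({
            "snp": snp,
            "a1": ALLELE_CODES[a1],
            "a2": ALLELE_CODES[a2],
            "maf": float(maf),
        })
    return rows


# Header and four lines copied from the listing above.
sample = """\
SNP     A1      A2      MAF
rs3094315       3       1       0.162272
rs6672353       0       3       0
rs2905036       2       4       0.000338066
rs4075116       2       4       0.282138
"""

rows = parse_maf(sample.splitlines())
MAF_THRESHOLD = 0.01  # illustrative value, not taken from the source
common = [row["snp"] for row in rows if row["maf"] >= MAF_THRESHOLD]
print(common)
```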