There are three kinds of information: genotypic, map and phenotypic. The formats are not unlike those from the Wellcome Trust Case Control Consortium (http://www.wtccc.org.uk/info/data_formats.shtml).
We illustrate with data from chromosome 7.
The map data include SNP names, their chromosome number and position, and the alleles at these SNPs.
5012435  rs6583338   7  141322  A  G
1612556  rs4451252   7  145340  C  T
5086570  rs7806592   7  149081  C  T
1679309  rs4281072   7  149266  C  T
6735129  rs9642259   7  155737  C  G
8516302  rs11771879  7  158399  C  T
5034463  rs6964622   7  158577  A  G
5039071  rs6969424   7  159284  A  C
8518356  rs11970804  7  160848  G  T
6750991  rs10226966  7  166807  C  T
8508684  rs11764261  7  180194  A  G
6852728  rs10435479  7  187608  C  T
5081622  rs7801619   7  189089  A  G
5086229  rs7806249   7  189202  C  T
1583901  rs4430042   7  189293  A  T
1641608  rs4298450   7  189660  A  G
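As a rough illustration of how such map records might be read, the Python sketch below groups whitespace-separated tokens six at a time. The field names (`snp_id`, `position`, and so on) are assumptions made for this example; the text does not name the columns, and the meaning of the leading numeric identifier is not stated.

```python
from typing import NamedTuple


class MapRecord(NamedTuple):
    snp_id: str     # leading numeric identifier (meaning not stated in the text)
    name: str       # SNP name (rs number)
    chrom: str      # chromosome number
    position: int   # assumed to be the position on the chromosome
    allele_a: str   # first listed allele
    allele_b: str   # second listed allele


def parse_map(text: str) -> list[MapRecord]:
    """Parse map records from whitespace-separated text, six tokens per record."""
    tokens = text.split()
    if len(tokens) % 6 != 0:
        raise ValueError("expected a multiple of six tokens")
    records = []
    for i in range(0, len(tokens), 6):
        snp_id, name, chrom, pos, a, b = tokens[i:i + 6]
        records.append(MapRecord(snp_id, name, chrom, int(pos), a, b))
    return records


# Two records copied from the excerpt above.
sample = """5012435 rs6583338 7 141322 A G
1612556 rs4451252 7 145340 C T"""
records = parse_map(sample)
print(records[0].name, records[0].position)
```

Reading by token rather than by line is a deliberate choice here: it tolerates records that have been wrapped across several lines.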
The genotypic data are in the so-called long, skinny format, in which records are organised by SNP name, individual ID and genotype. The file shown below is ASCII text and has been compressed into .gz format.
rs3094315   WTCCC139239  AA  0.1426
rs6672353   WTCCC139239  GG  0.07488
rs4040617   WTCCC139239  AA  0.02626
rs2980300   WTCCC139239  CC  0.09944
rs2905036   WTCCC139239  TT  0.2563
rs4245756   WTCCC139239  CC  0.0116
rs4075116   WTCCC139239  CT  0.08164
rs9442385   WTCCC139239  NN  0.7471
rs10907175  WTCCC139239  AA  0.01882
rs2887286   WTCCC139239  CT  0.08472
rs6603781   WTCCC139239  GG  0.06592
rs11260562  WTCCC139239  AG  0.02318
rs6685064   WTCCC139239  CC  0.05305
rs307378    WTCCC139239  NN  0.5644
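A minimal Python sketch of reading such a compressed long-format file and pivoting it to one row per individual follows. It assumes one record per line with four whitespace-separated fields, that the fourth field is a numeric score of some kind, and that `NN` marks a missing call; none of these is stated explicitly in the text. The function names are hypothetical.

```python
import gzip
import io
from collections import defaultdict


def read_long_genotypes(fileobj):
    """Yield (snp, individual, genotype, score) from a gzip-compressed stream."""
    with gzip.open(fileobj, mode="rt") as handle:
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            snp, individual, genotype, score = fields
            yield snp, individual, genotype, float(score)


def to_wide(records, missing="NN"):
    """Pivot long records into {individual: {snp: genotype or None}}."""
    wide = defaultdict(dict)
    for snp, individual, genotype, _score in records:
        wide[individual][snp] = None if genotype == missing else genotype
    return dict(wide)


# Two records from the excerpt above, compressed in memory for the demo.
raw = (
    "rs3094315 WTCCC139239 AA 0.1426\n"
    "rs9442385 WTCCC139239 NN 0.7471\n"
).encode("ascii")
wide = to_wide(read_long_genotypes(io.BytesIO(gzip.compress(raw))))
print(wide)
```

With a real file, a path can be passed to `gzip.open` in place of the in-memory buffer.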
The phenotypic data are relatively simple in this case, much like the data in any other analysis.
  obs:         7,686
 vars:            22                          11 Jan 2007 15:02
 size:       630,252 (94.0% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
wtccc_id        str11  %11s                   SAMPLE_ID
case            str1   %9s                    CASE
cohort          str1   %9s                    COHORT
sample          str1   %9s                    SAMPLE
sex             byte   %9.0g                  1=male,2=female
age             float  %9.0g
weight          float  %9.0g                  participant's weight in kg
height          float  %9.0g                  participant's height in cm
waist           float  %9.0g                  Participant's waist in cm
hip             float  %9.0g                  participant's hip in cm
bmi             float  %9.0g                  participant's body-mass index
                                                in kg/m^2 at 1HC
fev1            int    %9.0g                  higher of the 2 fev1
                                                measurements in l (1HC)
diastol         float  %9.0g                  average of the 2 diastolic BP
                                                measures in mmHg (1HC)
systol          float  %9.0g                  average of the 2 systolic BP
                                                readings in mmHg (1HC)
cholestl        float  %9.0g                  total cholesterol in mmol/l
                                                (1HC)
ldl             float  %9.0g                  LDL cholesterol in mmol/l (1HC)
hdl             float  %9.0g                  HDL cholesterol in mmol/l (1HC)
triglyc         float  %9.0g                  triglycerides in mmol/l (1HC)
hba1c           float  %9.0g                  Hb A1C as a % (1HC)
menostat        byte   %22.0g                 Menopausal status - 1=pre-,
                                                2=peri-(within 1 year of last
                                                period), 3=post-(2 to
menoage         float  %9.0g
fibrinog        float  %22.0g                 FIBRINOGEN
-------------------------------------------------------------------------------
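Since the listing gives weight in kg, height in cm and BMI in kg/m^2, a simple consistency check on the phenotypic data could look like the Python sketch below. The function name and the input values are hypothetical and are not taken from the data set.

```python
def bmi_from(weight_kg: float, height_cm: float) -> float:
    """Body-mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0  # convert cm to m
    return weight_kg / height_m ** 2


# Hypothetical participant: 70 kg, 175 cm.
value = bmi_from(70.0, 175.0)
print(round(value, 2))
```

A recomputed value could then be compared with the recorded `bmi` to flag records in which the units or entries look inconsistent.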
A1 H12
A2 H11
A3 H10
A4 H9
A5 H8
A6 H7
A7 H6
A8 H5
A9 H4
A10 H3
A11 H2
A12 H1
B1 G12
B2 G11
B3 G10
B4 G9
     +------------------------------+
     |          id           reason |
     |------------------------------|
  1. | WTCCC139256           het,cr |
  2. | WTCCC139274       related,cr |
  3. | WTCCC139281              het |
  4. | WTCCC139285       related,cr |
  5. | WTCCC139286       related,cr |
     |------------------------------|
  6. | WTCCC139291           het,cr |
  7. | WTCCC139567               cr |
  8. | WTCCC139602               cr |
  9. | WTCCC139605               cr |
 10. | WTCCC139608               cr |
     |------------------------------|
 11. | WTCCC139822               cr |
 12. | WTCCC139886               cr |
 13. | WTCCC139889               cr |
 14. | WTCCC139891               cr |
 15. | WTCCC139896               cr |
...
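One way to summarise a listing like this is to tally how often each reason code occurs. The Python sketch below assumes that the `reason` column holds comma-separated codes, which is how the entries appear; the meaning of the individual codes is not given in the text and is not interpreted here. The function name is hypothetical.

```python
from collections import Counter


def tally_reasons(rows):
    """Count each comma-separated reason code across (id, reason) rows."""
    counts = Counter()
    for _sample_id, reason in rows:
        counts.update(code.strip() for code in reason.split(",") if code.strip())
    return counts


# Four rows copied from the listing above.
rows = [
    ("WTCCC139256", "het,cr"),
    ("WTCCC139274", "related,cr"),
    ("WTCCC139281", "het"),
    ("WTCCC139567", "cr"),
]
counts = tally_reasons(rows)
print(counts.most_common())
```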
SNP         A1  A2  MAF
rs3094315   3   1   0.162272
rs6672353   0   3   0
rs4040617   3   1   0.133288
rs2980300   4   2   0.160204
rs2905036   2   4   0.000338066
rs4245756   4   2   0.000675676
rs4075116   2   4   0.282138
rs9442385   4   3   0.0697279
rs10907175  2   1   0.0851351
rs2887286   2   4   0.171419
rs6603781   1   3   0.11413
rs11260562  1   3   0.0734072
rs6685064   4   2   0.0728814
rs307378    4   3   0.0278533
rs1695824   1   2   0.477027
rs3766180   2   4   0.283108
rs6603791   3   1   0.321959
rs7540231   3   1   0.176411
...
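For orientation, a minor allele frequency (MAF) of this kind can be computed from two-letter genotype calls roughly as in the Python sketch below. It assumes biallelic SNPs and that `NN` denotes a missing call. The numeric allele codes are assumed to follow the common 1/2/3/4 = A/C/G/T convention with 0 for an unobserved allele; this appears consistent with the genotypes shown earlier (for example, rs6672353 is called GG and has A2 = 3), but it is not stated in the text. The genotype list in the example is hypothetical.

```python
from collections import Counter

# Assumed numeric allele coding: 1/2/3/4 = A/C/G/T, 0 = allele not observed.
ALLELE_CODES = {"0": None, "1": "A", "2": "C", "3": "G", "4": "T"}


def minor_allele_frequency(genotypes, missing="NN"):
    """Return the MAF of a biallelic SNP, or None if every call is missing."""
    counts = Counter()
    for call in genotypes:
        if call == missing:
            continue  # missing calls contribute no alleles
        counts.update(call)  # each call contributes its two alleles
    total = sum(counts.values())
    if total == 0:
        return None
    # Everything that is not the most common allele counts as minor.
    return (total - max(counts.values())) / total


# Hypothetical calls for one SNP across five individuals.
maf = minor_allele_frequency(["AA", "AG", "GG", "AA", "NN"])
print(maf, ALLELE_CODES["3"])
```

Defining the minor count as the total minus the most common allele means that a monomorphic SNP yields a MAF of 0, matching the `0` entry in the table.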