
The chr 6 MHC band that is — and isn't — there

Side-by-side genome-wide layout of genome-wide-significant (GWS) hits from the PGC3 wave 3 European-cohort (EUR) and Lam et al. 2019 East-Asian-cohort (EAS) schizophrenia GWAS. The headline finding from the companion paper is visible at a glance once you toggle between cohorts: the chr 6 MHC region is dense with EUR-cohort GWS hits and conspicuously empty of EAS-cohort hits at current sample sizes.

EUR GWS SNPs: 21,723
EUR in MHC band: 4,529
EAS GWS SNPs: 1,730
EAS in MHC band: 0
Legend: EAS GWS hit; EAS & typed; EUR GWS hit; EUR & typed; chr 6 MHC (25–34 Mb)

What you're seeing

Each row is a chromosome (1–22), drawn proportional to its length on GRCh37. Each dot is a SNP that reached genome-wide significance (p < 5×10⁻⁸) in either the PGC3 wave 3 EUR-cohort schizophrenia GWAS (Trubetskoy et al. 2022, n=71,554 cases / 97,863 controls) or the Lam et al. 2019 EAS-cohort schizophrenia GWAS (n=22,778 cases / 35,362 controls). Use the cohort toggle in the legend to flip between the two. Larger filled dots mark SNPs that are also typed on the 23andMe v5 chip (the chip-PRS subset). The dashed red band on chr 6 highlights the extended MHC region (25–34 Mb).

Toggle to EUR only: the chr 6 MHC region is the densest band in the genome, with 4,529 of 21,723 GWS SNPs (~21%) falling in that single 9-Mb window. Toggle to EAS only: the same band is empty (0 of 1,730 GWS SNPs). Toggle to Both: the contrast is immediate. The dominant EAS clusters, chr 12q24 (123–124 Mb), chr 10q24 BORCS7/AS3MT/NT5C2 (104–105 Mb), and multiple loci on chr 2 and chr 3, are present but not MHC-scale. The MHC absence in EAS does not mean the MHC has no role in schizophrenia globally. Two factors explain it: the specific tag SNPs reaching significance in EUR have minor allele frequencies under 1% in East Asians (Lam et al. note rs13194504 EAS MAF <1% vs 9% EUR; the C4-BS schizophrenia allele is uncommon in Han Chinese), and the EAS sample size is currently a fraction of the EUR cohort.
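
The band counts quoted above can be re-derived from the per-SNP records with a few lines of Python. This is a minimal sketch, assuming each record carries `chr` and `pos` (base pairs, GRCh37) as described under Data & reproducibility; the helper names `in_mhc` and `mhc_share` and the toy records are hypothetical, not part of the page's own code.

```python
# Extended MHC band as drawn on the plot: chr 6, 25-34 Mb (GRCh37).
MHC_CHR = "6"
MHC_START = 25_000_000
MHC_END = 34_000_000


def in_mhc(record):
    """True if a record falls inside the extended MHC band on chr 6."""
    # Assumption: chr may be stored as int or str, so compare as strings.
    return str(record["chr"]) == MHC_CHR and MHC_START <= record["pos"] <= MHC_END


def mhc_share(records):
    """Return (SNPs in band, total SNPs, fraction in band)."""
    n_total = len(records)
    n_mhc = sum(1 for r in records if in_mhc(r))
    share = n_mhc / n_total if n_total else 0.0
    return n_mhc, n_total, share


# Toy records (made-up rsIDs and positions, for illustration only).
toy = [
    {"rsid": "rs_toy_1", "chr": 6, "pos": 28_500_000},
    {"rsid": "rs_toy_2", "chr": 6, "pos": 40_000_000},
    {"rsid": "rs_toy_3", "chr": "12", "pos": 123_500_000},
    {"rsid": "rs_toy_4", "chr": "6", "pos": 31_000_000},
]
print(mhc_share(toy))
```

Run against the real records, the same function should return the counts shown in the summary: 4,529 of 21,723 for EUR (a share of about 0.21) and 0 of 1,730 for EAS.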

Why this matters for chip-PRS

If you compute a single-subject polygenic risk score from a 23andMe export against an EUR-cohort GWAS, ~66% of the typed GWS SNPs land in the MHC, and they are in long-range LD across the entire 9-Mb extended-MHC block. The "polygenic" score is essentially a homozygosity readout of one tag haplotype. Against an ancestry-matched East Asian GWAS, that LD-inflation source is absent at current sample sizes — so the chip-PRS signal is more distributed and (in a narrow technical sense) more interpretable, even though the broader ancestry-cohort-mismatch problem remains. The companion paper (PDF, 12 pages) develops this in detail.
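
To make the concentration effect concrete, the sketch below measures how much of a naive additive score (the sum of beta times effect-allele dosage, with no LD clumping) comes from SNPs inside the MHC band. It is an illustration under stated assumptions, not the companion paper's method: the function name `mhc_fraction_of_score`, the `beta` field name, and all toy values are hypothetical.

```python
# Extended MHC band: chr 6, 25-34 Mb (GRCh37).
MHC_CHR, MHC_START, MHC_END = "6", 25_000_000, 34_000_000


def in_mhc(snp):
    """True if the SNP lies inside the extended MHC band."""
    return str(snp["chr"]) == MHC_CHR and MHC_START <= snp["pos"] <= MHC_END


def mhc_fraction_of_score(snps, dosages):
    """Share of the summed |beta * dosage| contributed by MHC SNPs.

    snps:    list of dicts with rsid, chr, pos, beta (assumed field names)
    dosages: dict rsid -> effect-allele count (0, 1 or 2) for one subject
    """
    total = 0.0
    mhc = 0.0
    for snp in snps:
        dosage = dosages.get(snp["rsid"])
        if dosage is None:  # SNP not typed for this subject
            continue
        contribution = abs(snp["beta"] * dosage)
        total += contribution
        if in_mhc(snp):
            mhc += contribution
    return mhc / total if total else 0.0


# Toy example: three MHC SNPs on one haplotype (same dosage, as long-range
# LD would produce) plus two independent SNPs elsewhere.
toy_snps = [
    {"rsid": "rs_toy_m1", "chr": 6, "pos": 26_000_000, "beta": 0.05},
    {"rsid": "rs_toy_m2", "chr": 6, "pos": 29_000_000, "beta": 0.05},
    {"rsid": "rs_toy_m3", "chr": 6, "pos": 32_000_000, "beta": 0.05},
    {"rsid": "rs_toy_a", "chr": 12, "pos": 123_500_000, "beta": 0.05},
    {"rsid": "rs_toy_b", "chr": 10, "pos": 104_500_000, "beta": 0.05},
]
toy_dosages = {"rs_toy_m1": 2, "rs_toy_m2": 2, "rs_toy_m3": 2,
               "rs_toy_a": 1, "rs_toy_b": 1}
fraction = mhc_fraction_of_score(toy_snps, toy_dosages)
print(round(fraction, 2))
```

In the toy data the three correlated MHC SNPs all carry the same dosage, so one haplotype is counted three times and accounts for three quarters of the score even though every SNP has the same effect size.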

Data & reproducibility

EAS source: Lam 2019 PGC EAS schizophrenia summary statistics on Figshare (file: daner_natgen_pgc_eas.gz). EUR source: PGC3 wave 3 EUR-cohort summary statistics on Figshare (Trubetskoy et al. 2022, PGC3_SCZ_wave3.european.autosome.public.v3.vcf.tsv.gz). Both were filtered to p < 5e-8, with chromosome and position kept verbatim from the source files. The 23andMe-typed marker is computed by intersecting GWS rsIDs against the v5 chip's rsID set. The chromosome length scale is based on the GRCh37/hg19 reference assembly.
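
The filtering and chip-intersection steps described above could look roughly like the following. This is a sketch, not the page's actual preprocessing script: the function names, the tab delimiter, the skipping of `##` metadata lines, and the column names passed in the example are all assumptions that should be checked against the header of whichever summary-statistics file is used.

```python
import csv
import gzip
import io

GWS_THRESHOLD = 5e-8  # genome-wide significance cutoff used on this page


def filter_gws(lines, chip_rsids, rsid_col, chr_col, pos_col, p_col, delimiter="\t"):
    """Keep rows with p < 5e-8 and flag rsIDs present in the chip's rsID set."""
    # Assumption: any '##' metadata lines precede the column header row.
    data_lines = (ln for ln in lines if not ln.startswith("##"))
    reader = csv.DictReader(data_lines, delimiter=delimiter)
    hits = []
    for row in reader:
        try:
            p = float(row[p_col])
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric p-value
        if p < GWS_THRESHOLD:
            hits.append({
                "rsid": row[rsid_col],
                "chr": row[chr_col],        # kept as it appears in the source
                "pos": int(row[pos_col]),   # parsed to int for plotting
                "p": p,
                "typed": row[rsid_col] in chip_rsids,
            })
    return hits


def filter_gws_file(path, chip_rsids, **columns):
    """Same filter, reading a gzip-compressed summary-statistics file."""
    with gzip.open(path, "rt") as handle:
        return filter_gws(handle, chip_rsids, **columns)


# Toy input with made-up rows and hypothetical column names.
toy_text = (
    "SNP\tCHR\tBP\tP\n"
    "rs_toy_a\t6\t28000000\t1e-12\n"
    "rs_toy_b\t1\t1000\t0.2\n"
    "rs_toy_c\t12\t123500000\t4e-9\n"
)
hits = filter_gws(io.StringIO(toy_text), {"rs_toy_c"},
                  rsid_col="SNP", chr_col="CHR", pos_col="BP", p_col="P")
print(hits)
```

Passing the column names as parameters keeps one function usable for both source files, whose headers differ; the effect-size column is left out here and would be added the same way.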

The pre-processed data.json (EAS) here and eur_gws.json (EUR, reused from the chip-PRS explorer) are plain JSON with one entry per GWS SNP: rsid, chr, pos, effect size (OR for EAS, beta for EUR), p, and typed. Anyone can rerun against a different GWAS by swapping the source file and re-running the same preprocessing logic.