Stephen Burgess
@stevesphd.bsky.social
690 followers 170 following 110 posts
Medical statistician, work with genetic data to disentangle causation from correlation. Author of book on Mendelian randomization.
stevesphd.bsky.social
Thanks to Janne for leading this, and the team at @FinnGen_FI led by @johanneskettune for allowing us to perform bespoke analyses in their cohort!
stevesphd.bsky.social
Negative studies are difficult to interpret (and publish) - there are legitimate reasons why the result may not replicate in a different study population. However, we did not see encouraging evidence from our attempted replication analysis.
stevesphd.bsky.social
While we cannot rule out low power, we did not find any associations between PCSK9 variants and breast cancer survival in datasets other than the original Cell paper. In contrast, variants in the HMGCR gene were associated with breast cancer survival.
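The low-power caveat can be made concrete with a rough power sketch for a per-allele (additive) survival test. The death counts for BCAC come from the thread; the MAF and hazard ratio are hypothetical illustrative values, and the FinnGen death count is a pure placeholder (the thread does not report it):

```python
from math import erf, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def power_additive(hr: float, maf: float, deaths: int, z_alpha: float = 1.96) -> float:
    """Approximate power for a per-allele Cox test.

    Uses the standard approximation var(beta_hat) ~= 1 / (2*maf*(1-maf)*deaths).
    """
    se = 1 / sqrt(2 * maf * (1 - maf) * deaths)
    return norm_cdf(abs(log(hr)) / se - z_alpha)

# BCAC: 7,531 breast-cancer-specific deaths (from the thread);
# MAF = 0.3 and HR = 1.1 are hypothetical illustrative values.
print(f"BCAC power:    {power_additive(1.1, 0.3, 7531):.2f}")

# FinnGen: 4,648 patients, but the number of deaths is not stated in the
# thread -- 500 below is a purely hypothetical placeholder.
print(f"FinnGen power: {power_additive(1.1, 0.3, 500):.2f}")
```

Under these assumed values, the BCAC analysis is well powered for a modest per-allele effect, while a FinnGen-sized analysis is not, consistent with the caveat above.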
stevesphd.bsky.social
For the BCAC data, we weren't able to replicate the original analysis exactly - we couldn't restrict to older women, or those with Stage 2/3 cancer, or consider a recessive model. For FinnGen, we were able to replicate the original analysis exactly - but the sample size was much lower.
stevesphd.bsky.social
We did not replicate their finding in any analysis using published consortium (BCAC) data from Morra et al on 91,686 breast cancer cases with 7531 breast cancer-specific deaths, or in FinnGen (4648 breast cancer patients).
stevesphd.bsky.social
There may be good reasons for this, but from a purely statistical point of view, reporting associations for a single SNP under a non-additive model and restricted participant eligibility raises concerns of possible selective reporting.
stevesphd.bsky.social
We were curious why they did not examine genetic associations in large publicly-available datasets on breast cancer survival. Additionally, they considered associations using a recessive allele model, and limited to women over 50 with Stage 2 or 3 breast cancer.
stevesphd.bsky.social
In conclusion, we need to be cautious when using population-based biobanks for investigating rare diseases. Case definitions should be developed that do not rely only on hospital episode statistics and ICD code records.
stevesphd.bsky.social
4) PAH prevalence in the All of Us dataset based on electronic health records is far higher than expected, and many identified "cases" do not have a corresponding medication prescription consistent with PAH.
[Figure: Venn diagram]
stevesphd.bsky.social
3) MR investigations for the effect of similar (but aetiologically distinct) conditions on PAH risk demonstrate effects in the population-based biobanks, but not in the clinically-validated dataset. This suggests that the population-based biobanks suffer from case contamination.
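The contamination mechanism can be illustrated with a small simulation (not the authors' actual analysis; all numbers are made up): if EHR-coded "PAH" absorbs some cases of a similar condition, a genetic score for that condition appears associated with "PAH" even though true PAH is independent of it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
g = rng.standard_normal(n)  # genetic score for the contaminant condition

# True PAH (~50 per million) is independent of g
pah = rng.random(n) < 50e-6

# A similar-but-distinct condition is more common and strongly influenced by g
p_other = 1 / (1 + np.exp(-(-7 + 1.5 * g)))
other = rng.random(n) < p_other

# Suppose 10% of that condition's patients are miscoded as PAH in the EHR
labelled = pah | (other & (rng.random(n) < 0.10))

print(f"mean score, true PAH cases:   {g[pah].mean():+.2f}")
print(f"mean score, labelled 'cases': {g[labelled].mean():+.2f}")
```

With these assumed rates, miscoded patients outnumber true cases, so the labelled case set inherits the genetics of the contaminating condition, producing spurious biobank-only associations.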
stevesphd.bsky.social
2) GWAS hits from population-based biobanks do not validate in the clinically-validated dataset, even accounting for lower power. These hits also do not have biological support.
stevesphd.bsky.social
In this work, we show: 1) GWAS hits from a clinically-validated dataset for PAH with biological support do not validate in population-based biobanks, despite the larger sample size and more "cases".
stevesphd.bsky.social
For common diseases, the answer may be yes. But for rare diseases such as pulmonary arterial hypertension (PAH, ~50 cases per million), even a tiny misclassification rate (0.01% of the population) means that false cases can outnumber true ones, contaminating results.
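The arithmetic behind that claim, as a back-of-envelope sketch (illustrative numbers only):

```python
# Assumed: true PAH prevalence ~50 per million; 0.01% of the remaining
# population wrongly coded as cases.
per_million = 1_000_000
true_prevalence = 50 / per_million      # ~0.005%
misclassification_rate = 0.0001         # 0.01% of non-cases miscoded

true_cases = true_prevalence * per_million                          # 50
false_cases = misclassification_rate * (per_million - true_cases)   # ~100

contamination = false_cases / (true_cases + false_cases)
print(f"{false_cases:.0f} false vs {true_cases:.0f} true cases per million")
print(f"Contamination fraction: {contamination:.0%}")
```

So a misclassification rate that sounds negligible (0.01%) produces roughly two false cases for every true one.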
stevesphd.bsky.social
Population-based biobanks allow epidemiological analyses to be performed in large sample sizes, including genome-wide association studies (GWAS). These have taken over from smaller disease-specific cohorts, often constructed using clinically-validated outcome data. Is bigger better?
stevesphd.bsky.social
Thanks to @amymariemason and @BarWoolf for working on this together, and to @ChatGPTapp for helping to get the ball rolling with the writing, even if we overruled you in many places!
stevesphd.bsky.social
...but the initial text needed a lot of work - it struggled to synthesize the ideas, and the structure was not great. Maybe a better prompt? Some of the ideas we seeded in the prompt ended up less important in the eventual submission.
stevesphd.bsky.social
But it did cut down the overall writing time - I would estimate by around 50%. This is a topic that has been in my head for several years, and I don't think I would have got round to writing it otherwise. It was much better at writing the abstract and cover letter...
stevesphd.bsky.social
To be honest, I was a bit disappointed with the draft - in particular, the simulation study was incorrect and quite limited in scope (we hoped it would do well with this). We ended up re-writing large chunks of text, although some vestiges remain in the final submission.
stevesphd.bsky.social
A subtext to this work is that it is the first manuscript I've written where the first draft was generated by ChatGPT - we used the Deep Research function. The AI prompt is in the appendix, and we will share the full machine-written draft (pre-edits) with the community.
stevesphd.bsky.social
...as for context stratification, the subgroups differ based on other factors by definition - as they come from different centres. In conclusion, the idea may work in some cases, but even when it does, it is somewhat limited in scope and interpretation.