There are few areas of medicine where conviction has outpaced evidence quite as dramatically as adolescent gender care. For over a decade, thousands of young people were referred to the Gender Identity Development Service (GIDS), with some receiving puberty blockers and others progressing to cross-sex hormones. Despite this, long-term outcome tracking has been lacking.

That failure now sits at the centre of the debate. The Cass Review highlighted the absence of robust longitudinal data and described the attempt to link the records of the roughly 9,000 former GIDS patients to their outcomes in adulthood, an effort that initially faltered for want of cooperation.

Now, with statutory authority enabling data linkage as we outlined in Part 3, the question is no longer whether we can examine the cohort; it is whether it can be done properly. Few paediatric services have access to a dataset of this size and clinical specificity.

Properly structured, it could answer several vital questions. But it must begin with something deceptively simple: case ascertainment.

Who is included in this cohort? Who was referred for assessment, completed it, or disengaged before starting treatment? Who declined blockers or chose private care? Who transferred to adult services? Without a clear denominator, analyses may be distorted, and selective inclusion could undermine confidence from the start. Case ascertainment is not glamorous, but it is a foundational part of epidemiology: You cannot analyse what you cannot define.
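To illustrate what a "clear denominator" means in practice, here is a minimal Python sketch. It assumes each referral can be assigned to exactly one pathway; the pathway labels, the patient identifiers and the `tabulate_cohort` helper are all hypothetical and are not drawn from any GIDS dataset or protocol. The point is simply that every referred person is counted once, including those whose pathway is unknown.

```python
from collections import Counter

# Hypothetical, mutually exclusive pathway labels (an assumption for
# illustration; real categories would be set by the study protocol).
PATHWAYS = {
    "disengaged_before_assessment",
    "assessed_no_medical_treatment",
    "received_blockers",
    "unknown",  # kept explicitly so nobody silently drops out of the denominator
}


def tabulate_cohort(records):
    """Count each referred patient exactly once, by pathway.

    records: iterable of (patient_id, pathway) pairs.
    Raises ValueError on a duplicate patient or an unrecognised pathway,
    because either would distort the denominator.
    """
    seen = set()
    counts = Counter()
    for patient_id, pathway in records:
        if patient_id in seen:
            raise ValueError(f"duplicate patient: {patient_id}")
        if pathway not in PATHWAYS:
            raise ValueError(f"unrecognised pathway: {pathway}")
        seen.add(patient_id)
        counts[pathway] += 1
    return counts


# Made-up records, for illustration only.
example_records = [
    ("P001", "received_blockers"),
    ("P002", "disengaged_before_assessment"),
    ("P003", "assessed_no_medical_treatment"),
    ("P004", "unknown"),
]
example_counts = tabulate_cohort(example_records)
denominator = sum(example_counts.values())
print(denominator, dict(example_counts))
```

The design choice worth noting is the explicit "unknown" category: excluding people with missing pathway information is exactly the kind of selective inclusion that undermines confidence.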

Once the cohort is properly assembled, the next task is classification. Gender dysphoria in adolescence is not a single phenotype. If the linkage study can stratify patients by age at referral, sex assigned at birth, neurodevelopmental profile, mental health history, and pubertal stage, we may begin to see distinct subgroups with different trajectories.

From this, the natural history then follows. What happens over time? How many referred young people who did not receive medical treatment later accessed adult gender services? How many disengaged entirely? How many stabilised without hormones? What proportion of those who received puberty blockers progressed to cross-sex hormones?

The size of the sample is important: nine thousand individuals is large by paediatric standards. This number is more than enough to identify significant differences in common outcomes, such as progression to hormone therapy or hospital admissions. However, it may not have the statistical power needed to draw definitive conclusions about rare events, such as suicide mortality, and that limitation should be recognised.
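The contrast between common and rare outcomes can be made concrete with a rough power calculation. The sketch below uses the standard normal approximation for a two-sided comparison of two proportions. Everything numerical in it is an assumption for illustration: the even 4,500 versus 4,500 split, the outcome rates, and the `approx_power` helper itself. None of these figures come from the GIDS data.

```python
from statistics import NormalDist


def approx_power(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Uses the normal approximation with an unpooled standard error and
    ignores the negligible contribution of the opposite tail.
    """
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p1 - p2) / se - z_crit)


# Hypothetical scenario: 9,000 people split evenly into two groups.
# Common outcome: 30% versus 35% (illustrative rates only).
power_common = approx_power(0.30, 0.35, 4500, 4500)
# Rare outcome: 0.1% versus 0.2% (illustrative rates only).
power_rare = approx_power(0.001, 0.002, 4500, 4500)

print(f"common outcome power: {power_common:.3f}")
print(f"rare outcome power:   {power_rare:.3f}")
```

Under these assumed rates, the power for the common outcome is close to 1, while for the rare outcome it falls well below one half, even though the rare-event rate doubles between groups. Real groups are unlikely to be evenly split, which would reduce power further.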

So far, we have described a case series, quite literally a series of cases with information recorded in clinical notes. A case series is descriptive: we have so many people aged X to Y whose characteristics and treatment are this and that. It is large and powerful, and, provided the data are properly recorded, we can do even more: build a comparative cohort.

The comparative cohort.

To transform this into something resembling a comparative cohort, researchers must go further. Children were not randomised to puberty blockers; clinicians selected patients based on judgment, severity and evolving criteria. That creates confounding by indication: the very factors that led to treatment may also influence outcome.

In epidemiology, if you are going to compare A with B, you must do your utmost to ensure A and B are broadly comparable. We say "broadly" because randomisation is the only process by which you can assure comparability, and randomisation is not possible here.

To reduce bias (confounding by indication), baseline variables are essential: age, Tanner stage, duration of dysphoria, comorbidities, social transition status, and family context. If these variables exist only in narrative clinic notes, as they likely do, then the study cannot rely solely on electronic linkage. It will require a structured, systematic review and extraction from the original medical records.

Going back to the notes is labour-intensive and introduces interpretive risks. But without it, attempts to compare treated and untreated groups risk being statistically sophisticated yet epidemiologically shallow. You cannot match like with like if you do not know who was alike in the first place.

Propensity score matching and related methods can help, but only if the baseline information is sufficiently detailed and consistently recorded. If the severity of dysphoria, clinician judgement of persistence or family dynamics were not systematically documented, then residual confounding will remain.
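For readers unfamiliar with the mechanics, the matching step can be sketched in a few lines of Python. This is a simplified illustration, not a description of how any linkage study will be analysed: it assumes propensity scores (the estimated probability of treatment given baseline variables) have already been computed, and it uses greedy one-to-one nearest-neighbour matching with a caliper. The scores, identifiers, caliper width and the `greedy_match` helper are all hypothetical.

```python
def greedy_match(treated, untreated, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on a propensity score.

    treated, untreated: dicts mapping an id to a propensity score.
    A treated person is matched to the closest still-available untreated
    person, but only if their scores differ by no more than the caliper.
    Returns (pairs, unmatched_treated_ids).
    """
    available = dict(untreated)  # copy so the caller's dict is untouched
    pairs, unmatched = [], []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        best = min(
            available,
            key=lambda k: abs(available[k] - t_score),
            default=None,
        )
        if best is not None and abs(available[best] - t_score) <= caliper:
            pairs.append((t_id, best))
            del available[best]  # match without replacement
        else:
            unmatched.append(t_id)
    return pairs, unmatched


# Made-up scores, for illustration only.
treated_scores = {"T1": 0.62, "T2": 0.80, "T3": 0.35}
untreated_scores = {"U1": 0.60, "U2": 0.33, "U3": 0.10, "U4": 0.58}

pairs, unmatched = greedy_match(treated_scores, untreated_scores)
print(pairs, unmatched)
```

In this toy example one treated person has no untreated counterpart within the caliper and is left unmatched. That is the method working as intended: it declines to compare people who are not alike. It also shows the dependency described above, because the scores are only as good as the baseline variables used to estimate them.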

Then there is the issue of time. Puberty unfolds over months and years; practice has evolved significantly since 2009, and the demographic shift in referrals occurred during that same period. Calendar-time bias and developmental confounding complicate interpretation. A young person treated in 2012 is not directly comparable to one treated a decade later. Analyses must account for this temporal drift or risk conflating change in practice with change in outcome. None of this is an argument against the linkage study; it is an argument for methodological discipline.
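A toy example shows how calendar time can manufacture an apparent effect. The Python sketch below compares a crude risk ratio with a Mantel-Haenszel risk ratio stratified by referral period. The counts are invented purely to demonstrate the arithmetic, the two "periods" are hypothetical, and stratification is only one of several ways an analysis might handle temporal drift.

```python
def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    t_events = sum(s[0] for s in strata)
    t_total = sum(s[1] for s in strata)
    u_events = sum(s[2] for s in strata)
    u_total = sum(s[3] for s in strata)
    return (t_events / t_total) / (u_events / u_total)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata.

    Each stratum is a tuple:
    (treated_events, treated_total, untreated_events, untreated_total)
    """
    numerator = 0.0
    denominator = 0.0
    for t_events, t_total, u_events, u_total in strata:
        stratum_size = t_total + u_total
        numerator += t_events * u_total / stratum_size
        denominator += u_events * t_total / stratum_size
    return numerator / denominator


# Invented counts: within each period the risk is identical in both groups,
# but treatment and the outcome both became more common in the later period.
hypothetical_strata = [
    (10, 100, 40, 400),   # earlier period: 10% risk in both groups
    (120, 400, 30, 100),  # later period: 30% risk in both groups
]

crude = crude_risk_ratio(hypothetical_strata)
adjusted = mantel_haenszel_risk_ratio(hypothetical_strata)
print(f"crude RR: {crude:.2f}, period-stratified RR: {adjusted:.2f}")
```

With these made-up numbers the crude comparison suggests the outcome is nearly twice as common in the treated group, yet within each period there is no difference at all. The apparent effect is entirely an artefact of when people were treated.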

If the GIDS cohort is grounded in robust case ascertainment and, where necessary, careful note review, then it might answer some of the vital questions. Does puberty suppression alter the likelihood of later hormone initiation? Do long-term mental health trajectories differ between treated and untreated groups once baseline severity is accounted for? Are certain subgroups more likely to experience regret? Does age at intervention matter?

For too long, gender care has lacked the registry infrastructure common in other high-stakes paediatric specialities, an omission that has proved to be a structural failure in NHS care. We now have an opportunity to correct it. Done properly, the GIDS cohort could transform uncertainty into evidence. Done superficially, it will deepen mistrust. Now that the data exists and is accessible, the responsibility is to use it rigorously, comprehensively, and without fear of where it leads.

This post was written by two old geezers who believe in making the most of what we have - if it’s possible.