Association between patient features on use of thrombolysis#

Aims#

Examine thrombolysis use (in all arrivals, and the subset of those who arrive within 4 hours of known onset) according to:

  • Disability (Rankin) before stroke

  • Stroke severity (NIHSS)

  • Gender

  • Ethnicity

  • Age group

  • Onset known

  • Arrival by ambulance

Examine thrombolysis use in the absence and presence of co-morbidities.

Note: The association of particular features with use of thrombolysis does not imply that these relationships are causal.

Examine reasons given for not giving thrombolysis.

Import libraries and data#

Data has been resticted to stroke teams with at least 300 admissions, with at least 10 patients receiving thrombolysis, over three years.

# import libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# import data
raw_data = pd.read_csv(
    './../data/2019-11-04-HQIP303-Exeter_MA.csv', low_memory=False)
headings = list(raw_data)
print (headings)
['StrokeTeam', 'PatientUID', 'Pathway', 'S1AgeOnArrival', 'MoreEqual80y', 'S1Gender', 'S1Ethnicity', 'S1OnsetInHospital', 'S1OnsetToArrival_min', 'S1OnsetDateType', 'S1OnsetTimeType', 'S1ArriveByAmbulance', 'S1AdmissionHour', 'S1AdmissionDay', 'S1AdmissionQuarter', 'S1AdmissionYear', 'CongestiveHeartFailure', 'Hypertension', 'AtrialFibrillation', 'Diabetes', 'StrokeTIA', 'AFAntiplatelet', 'AFAnticoagulent', 'AFAnticoagulentVitK', 'AFAnticoagulentDOAC', 'AFAnticoagulentHeparin', 'S2INR', 'S2INRHigh', 'S2INRNK', 'S2NewAFDiagnosis', 'S2RankinBeforeStroke', 'Loc', 'LocQuestions', 'LocCommands', 'BestGaze', 'Visual', 'FacialPalsy', 'MotorArmLeft', 'MotorArmRight', 'MotorLegLeft', 'MotorLegRight', 'LimbAtaxia', 'Sensory', 'BestLanguage', 'Dysarthria', 'ExtinctionInattention', 'S2NihssArrival', 'S2BrainImagingTime_min', 'S2StrokeType', 'S2Thrombolysis', 'Haemorrhagic', 'TimeWindow', 'Comorbidity', 'Medication', 'Refusal', 'Age', 'Improving', 'TooMildSevere', 'TimeUnknownWakeUp', 'OtherMedical', 'S2ThrombolysisTime_min', 'S2TIAInLastMonth']

Restrict original data to hospitals with at least 300 admissions + 10 thrombolysis patients#

keep = []

groups = raw_data.groupby('StrokeTeam') # creates a new object of groups of data

for index, group_df in groups: # each group has an index and a dataframe of data
    
    # Skip if total admissions less than 300 or total thrombolysis < 10
    admissions = group_df.shape[0]
    thrombolysis_received = group_df['S2Thrombolysis'] == 'Yes' 
    if (admissions < 300) or (thrombolysis_received.sum() < 10):
        continue
    
    else: 
        keep.append(group_df)

# Concatenate output
data = pd.DataFrame()
data = pd.concat(keep)

# Convert thromboilysis to boolean
data['thrombolysis_given'] = data['S2Thrombolysis'] == 'Yes'

print (data.shape)
(239505, 63)

Get out-of-hopsital stroke onset arrivals within 4 hours of known stroke onset#

# Get out of hospital arrivals
mask = data['S1OnsetInHospital'] == 'No'
data = data[mask]

# Get arrivals within 4 hours of known stroke onset
mask = data['S1OnsetToArrival_min'] <= 240
data_4hr = data[mask]

Breakdown of thrombolysis use by feature#

Rankin before stroke#

# All admissions
rankin_before_stroke_all = \
    data.groupby('S2RankinBeforeStroke').mean()['thrombolysis_given']
rankin_before_stroke_all.rename('All', inplace=True)

# Admissions within 4 hours of arrival
rankin_before_stroke_4hr = \
    data_4hr.groupby('S2RankinBeforeStroke').mean()['thrombolysis_given']
rankin_before_stroke_4hr.rename('4hr', inplace=True)

# Combine
rankin_before_stroke = \
    pd.concat([rankin_before_stroke_all, rankin_before_stroke_4hr], axis=1)
rankin_before_stroke = rankin_before_stroke.round(3)
rankin_before_stroke *= 100
rankin_before_stroke
All 4hr
S2RankinBeforeStroke
0 14.2 34.9
1 10.8 28.6
2 9.0 24.0
3 8.7 21.5
4 6.5 15.0
5 4.3 9.8
# Set up figure
fig = plt.figure(figsize=(5,5))
ax = fig.add_subplot()
x = rankin_before_stroke.index
y1 = rankin_before_stroke['All']
y2 = rankin_before_stroke['4hr'] 
ax.plot(x, y1, label='All arrivals')
ax.plot(x, y2, label='4 Hour arrivals')
ax.set_xlabel('Disability (Rankin) before stroke')
ax.set_ylabel('Thrombolysis use (%)')
ax.legend()
plt.tight_layout()
plt.savefig('output/thrombolysis_use_by_rankin_before_stroke.jpg', dpi=300)
plt.show()
../_images/04_thrombolysis_general_12_0.png

Stroke severity#

# All admissions
NIHSS_all = data.groupby('S2NihssArrival').mean()['thrombolysis_given']
NIHSS_all.rename('All', inplace=True)

# Admissions within 4 hours of arrival
NIHSS_4hr = data_4hr.groupby('S2NihssArrival').mean()['thrombolysis_given']
NIHSS_4hr.rename('4hr', inplace=True)

# Combine
NIHSS = pd.concat([NIHSS_all, NIHSS_4hr], axis=1) * 100
NIHSS = NIHSS.round(1)
NIHSS
All 4hr
S2NihssArrival
0.0 0.6 2.1
1.0 1.1 3.8
2.0 2.5 8.3
3.0 4.9 15.0
4.0 9.7 28.7
5.0 14.0 36.8
6.0 16.4 39.9
7.0 18.8 42.4
8.0 21.0 44.7
9.0 21.6 44.2
10.0 21.9 43.4
11.0 24.0 46.1
12.0 23.9 43.8
13.0 24.7 43.9
14.0 25.9 45.9
15.0 25.9 45.0
16.0 26.4 44.4
17.0 26.2 44.3
18.0 26.6 45.1
19.0 26.7 46.2
20.0 23.7 42.3
21.0 24.3 42.6
22.0 24.1 41.4
23.0 25.0 44.2
24.0 24.7 42.7
25.0 22.2 39.3
26.0 17.5 31.0
27.0 17.6 31.8
28.0 12.5 24.1
29.0 12.6 24.0
30.0 9.8 19.4
31.0 8.9 18.1
32.0 8.3 16.5
33.0 6.0 12.7
34.0 4.9 10.0
35.0 4.2 7.7
36.0 2.4 5.0
37.0 3.6 6.8
38.0 5.7 9.9
39.0 6.5 11.4
40.0 0.0 0.0
41.0 0.0 0.0
42.0 1.6 2.7
# Set up figure
fig = plt.figure(figsize=(5,5))
ax = fig.add_subplot()
x = NIHSS.index
y1 = NIHSS['All']
y2 = NIHSS['4hr'] 
ax.plot(x, y1, label='All arrivals')
ax.plot(x, y2, label='4 Hour arrivals')
ax.set_xlabel('Stroke severity (NIHSS) on arrival')
ax.set_ylabel('Thrombolysis use (%)')
ax.legend()

plt.tight_layout()
plt.savefig('output/thrombolysis_use_by_arrival_nihss.jpg', dpi=300)
plt.show()
../_images/04_thrombolysis_general_16_0.png

Show thrombolysis use by NIHSS cutoff of 0-10 or 11+

# All arrivals

mask = (data['S2NihssArrival'] >= 0) & (data['S2NihssArrival'] <= 10)
thrombolisis_use = data[mask]['thrombolysis_given'].mean()
print (
    f'Thrombolysis use in NIHSS 0-10, all arrivals: {thrombolisis_use:0.3f}')

mask = data['S2NihssArrival'] >= 11
thrombolisis_use = data[mask]['thrombolysis_given'].mean()
print (
    f'Thrombolysis use in NIHSS 11+, all arrivals: {thrombolisis_use:0.3f}')
Thrombolysis use in NIHSS 0-10, all arrivals: 0.086
Thrombolysis use in NIHSS 11+, all arrivals: 0.231
# 4 Hour arrivals
mask = (data_4hr['S2NihssArrival'] >= 0) & (data_4hr['S2NihssArrival'] <= 10)
thrombolisis_use = data_4hr[mask]['thrombolysis_given'].mean()
print (
    f'Thrombolysis use in NIHSS 0-10, 4 hour arrivals: {thrombolisis_use:0.3f}')

mask = data_4hr['S2NihssArrival'] >= 11
thrombolisis_use = data_4hr[mask]['thrombolysis_given'].mean()
print (
    f'Thrombolysis use in NIHSS 11+, 4 hour arrivals: {thrombolisis_use:0.3f}')
Thrombolysis use in NIHSS 0-10, 4 hour arrivals: 0.245
Thrombolysis use in NIHSS 11+, 4 hour arrivals: 0.412

Gender#

# All admissions
gender_all = data.groupby('S1Gender').mean()['thrombolysis_given']
gender_all.rename('All', inplace=True)

# Admissions within 4 hours of arrival
gender_4hr = data_4hr.groupby('S1Gender').mean()['thrombolysis_given']
gender_4hr.rename('4hr', inplace=True)

# Combine
gender = pd.concat([gender_all, gender_4hr], axis=1) * 100
gender = gender.round(1)
gender
All 4hr
S1Gender
Female 11.3 28.2
Male 12.3 30.8

Ethnicity#

# All admissions
ethnicity_all = data.groupby('S1Ethnicity').mean()['thrombolysis_given']
ethnicity_all.rename('All', inplace=True)

# Admissions within 4 hours of arrival
ethnicity_4hr = data_4hr.groupby('S1Ethnicity').mean()['thrombolysis_given']
ethnicity_4hr.rename('4hr', inplace=True)

# Combine
ethnicity = pd.concat([ethnicity_all, ethnicity_4hr], axis=1) * 100
ethnicity = ethnicity.round(1)
ethnicity
All 4hr
S1Ethnicity
Asian 12.1 30.0
Black 10.2 29.2
Mixed 11.7 32.3
Other 14.4 34.3
White 11.6 29.1

Age group#

# All admissions
age_group_all = data.groupby('S1AgeOnArrival').mean()['thrombolysis_given']
age_group_all.rename('All', inplace=True)

# Admissions within 4 hours of arrival
age_group_4hr = data_4hr.groupby('S1AgeOnArrival').mean()['thrombolysis_given']
age_group_4hr.rename('4hr', inplace=True)

# Combine
age_group = pd.concat([age_group_all, age_group_4hr], axis=1) * 100
age_group = age_group.round(3)

# Re-order
order = ['[15,20)', '[20,25)', '[25,30)', '[30,35)', '[35,40)',
       '[40,45)', '[45,50)', '[50,55)', '[55,60)', '[60,65)', '[65,70)',
       '[70,75)', '[75,80)', '[80,85)', '[85,90)', '[90,95)', '[95,100)',
        '[100,120]']
age_group = age_group.loc[order]
age_group.set_index(np.arange(15, 101, 5), inplace=True)
age_group
All 4hr
15 17.470 42.029
20 13.402 33.621
25 15.371 37.229
30 14.624 36.164
35 15.697 36.794
40 15.274 37.671
45 14.863 36.277
50 14.189 36.692
55 13.091 34.348
60 12.642 33.238
65 12.843 32.862
70 12.999 32.381
75 12.314 30.396
80 11.298 28.106
85 9.958 24.600
90 8.510 20.752
95 8.630 20.101
100 7.929 17.883
# Set up figure
fig = plt.figure(figsize=(5,5))
ax = fig.add_subplot()
x = age_group.index
y1 = age_group['All']
y2 = age_group['4hr']
ax.plot(x, y1, label='All arrivals')
ax.plot(x, y2, label='4 Hour arrivals')
ax.set_xlabel('Age band (lower bound)')
ax.set_ylabel('Thrombolysis use (%)')
ax.legend()
plt.tight_layout()
plt.savefig('output/thrombolysis_use_by_age.jpg', dpi=300)
plt.show()
../_images/04_thrombolysis_general_29_0.png

Onset known#

Get fractions for each onset type

counts = data.groupby('S1OnsetTimeType').count()['StrokeTeam']
fractions = counts/counts.sum() * 100
fractions = fractions.round(1)
fractions
S1OnsetTimeType
Best estimate    33.7
Not known        33.1
Precise          33.1
Name: StrokeTeam, dtype: float64
# All admissions
onset_known_all = data.groupby('S1OnsetTimeType').mean()['thrombolysis_given']
onset_known_all.rename('All', inplace=True)

# Admissions within 4 hours of arrival
onset_known_4hr = data_4hr.groupby('S1OnsetTimeType').mean()['thrombolysis_given']
onset_known_4hr.rename('4hr', inplace=True)

# Combine
onset_known = pd.concat([onset_known_all, onset_known_4hr], axis=1) * 100
onset_known = onset_known.round(1)

# Re-order
order = ['Not known', 'Best estimate', 'Precise']
onset_known = onset_known.loc[order]
onset_known
All 4hr
S1OnsetTimeType
Not known 0.4 NaN
Best estimate 6.3 14.0
Precise 28.8 39.0

Arrival by ambulance#

arrive_by_ambo = np.mean(data['S1ArriveByAmbulance'] == 'Yes') * 100
print (f'Percent arriving by ambulance: {arrive_by_ambo:0.1f}')
Percent arriving by ambulance: 81.4
# All admissions
ambulance_all = data.groupby('S1ArriveByAmbulance').mean()['thrombolysis_given']
ambulance_all.rename('All', inplace=True)

# Admissions within 4 hours of arrival
ambulance_4hr = data_4hr.groupby('S1ArriveByAmbulance').mean()['thrombolysis_given']
ambulance_4hr.rename('4hr', inplace=True)

# Combine
ambulance = pd.concat([ambulance_all, ambulance_4hr], axis=1) * 100
ambulance = ambulance.round(1)
ambulance
All 4hr
S1ArriveByAmbulance
No 3.8 21.4
Yes 13.7 30.3

Thrombolysis use by comorbidities#

Only examine effect of comorbidity on thrombolysis use of those arriving within 4 hours of known onset.

Analyse by comorbidity ‘Yes’ or ‘No’. Ignore ‘No but’.

comords = ['CongestiveHeartFailure', 'Hypertension', 'AtrialFibrillation', 
           'Diabetes', 'StrokeTIA', 'AFAntiplatelet', 'AFAnticoagulent', 
           'AFAnticoagulentVitK', 'AFAnticoagulentDOAC', 
           'AFAnticoagulentHeparin']

# Loop through comorbidities and get results
results = []
for comord in comords: 
    result = data_4hr.groupby(comord).mean()['thrombolysis_given']
    results.append({'No': result['No'] * 100, 
                    'Yes': result['Yes'] * 100})
    
comord_df = pd.DataFrame(results, index=comords)
comord_df['Ratio'] = comord_df['Yes'] / comord_df['No']
comord_df = comord_df.round(3)
comord_df
No Yes Ratio
CongestiveHeartFailure 29.709 26.081 0.878
Hypertension 30.331 28.826 0.950
AtrialFibrillation 32.034 19.931 0.622
Diabetes 30.066 27.241 0.906
StrokeTIA 32.151 22.506 0.700
AFAntiplatelet 18.455 28.757 1.558
AFAnticoagulent 34.253 8.976 0.262
AFAnticoagulentVitK 31.009 11.942 0.385
AFAnticoagulentDOAC 32.254 5.677 0.176
AFAnticoagulentHeparin 30.068 7.576 0.252
# Set up figure
fig = plt.figure(figsize=(8,5))
ax = fig.add_subplot()
x = comord_df.index
y = comord_df['Ratio']

ax.barh(x, y)
ax.set_xlabel(
    'Ratio of thrombolysis use\n(co-morbidity present / comorbidity absent)')
ax.set_ylabel('Comorbidity')
ax.grid(axis='x')

plt.tight_layout()
plt.savefig('output/thrombolysis_use_by_comorbidity.jpg', dpi=300)
plt.show()
../_images/04_thrombolysis_general_41_0.png

Reasons for not giving thrombolysis#

Reasons given for not using thrombolysis (more than one may be ticked for each patient).

Only examine reasons for those arriving within 4 hours of known onset.

reasons = ['Haemorrhagic', 'TimeWindow', 'Comorbidity', 'Medication', 'Refusal',
           'Age', 'Improving', 'TooMildSevere', 'TimeUnknownWakeUp', 
           'OtherMedical']

# Restrict data to non-thrombolysed patients
mask = data_4hr['thrombolysis_given'] == False
no_thromolysis_4hr = data_4hr[mask]
# Convert Yes/No in reasons to boolean
reasons_boolean = no_thromolysis_4hr[reasons] == 'Yes'
# Get average of 'Yes' values as percentage
reasons = reasons_boolean.mean() * 100
reasons = reasons.round(1)
# Sort by value
reasons.sort_values(ascending=True, inplace=True)

reasons
Refusal               0.9
Age                   1.2
TimeWindow            7.1
Medication            7.7
Comorbidity           9.9
OtherMedical         15.2
TimeUnknownWakeUp    17.5
Improving            17.8
Haemorrhagic         21.7
TooMildSevere        22.0
dtype: float64
# Set up figure
fig = plt.figure(figsize=(8,5))
ax = fig.add_subplot()
x = reasons.index
y = reasons.values

ax.barh(x, y)
ax.set_xlabel('Presence of reason for no thrombolysis (%)')
ax.set_ylabel('Reason for no thrombolysis')
ax.grid(axis='x')

plt.tight_layout()
plt.savefig('output/thrombolysis_non_use_reasons.jpg', dpi=300)
plt.show()
../_images/04_thrombolysis_general_44_0.png

Observations#

  • Disability before stroke: The higher the level of disability before stroke (as described by the modified Rankin Scale) the lower the use of thrombolysis. As a percentage of all out-of-hospital onset arrivals, thrombolysis use drops from 14.2% in those with no disability before stroke to 4.3% of those with a modified Rankin Scale of 5. As a percentage of out-of-hospital onset arrivals arriving within 4 hours of known stroke onset, thrombolysis use drops from 34.9% in those with no disability before stroke to 9.8% of those with a modified Rankin Scale of 5.

  • Stroke severity: The use of thrombolysis varies very significantly with stroke severity (NIHSS) on arrival (Figure 2). Thrombolysis use is very low at extreme NIHSS, withy a plateau of use between NIHSS of approximately 6 to 25.

  • Gender: There is a small association between gender on use of thrombolysis. 12.3% of all male arrivals receive thrombolysis, compared with 11.3% for females. Of those arriving within 4 hours of known stroke onset 30.8% of all male arrivals receive thrombolysis, compared with 28.2% for females.

  • Ethnicity: There is some association between ethnicity on thrombolysis use in all arrivals, with black people having the lowest use of thrombolysis (10.2% vs 11.6% - 12.1% for white, asian and mixed race people), but little variation in use of thrombolysis in those arriving within 4 hours of known stroke onset, suggesting the cause of lower use of thrombolysis is likely to be the lower proportion of black people arriving at hospital within 4 hours of known stroke onset.

  • Age: Thrombolysis use declines with age. For ages 25-55, thrombolysis use is about 15% of all arrivals and 35% of those arriving within 4 hours of known stroke onset. Above this age there is a decline in use, for example in the age band 85-89, use is about 10% of all arrivals and 25% of those arriving within 4 hours of known stroke onset

  • Knowledge of time of onset: Knowledge of onset is split almost evenly (33% to 34% each) for ‘not known’, ‘best estimate’, and ‘precise’. Type of knowledge of stroke onset shoes a significant association with use of thrombolysis. Of those arriving within 4 hours of known onset, 39.0% receive thrombolysis if the time is recorded as being known precisely, compared with 14.0% receiving thrombolysis if the time is a best estimate.

  • Arrival by ambulance: 8.4% of arrivals are by ambulance. There is a significant effect arrival model on use of thrombolysis. For all arrivals, use of thrombolysis is 13.7% for arrivals by ambulance, and 3.8% for other arrivals.

  • Presence of co-morbidities: The presence or absence of co-co-morbidities can be a strong indicator of the use of thrombolysis. For example those on anticoagulant therapies receive thrombolysis less than those who do not, but those on anti-platelet therapies receive it more.

  • Stated reasons for not giving thrombolysis:: Haemorrhagic stroke or a stroke that is too mild/sever are the most common causes (each present in 22% of those not treated). An improving condition is given as the reason for no treatment in 17% of non-treated patients. It is noteworthy that patient refusal is given as a reason for non-treatment in only 0.9% of untreated patients