FiO2 analysis

Data cleaning

Merge languages

The first task is to combine the survey responses from all languages into a single file.
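A minimal sketch of this merge step, assuming each language was exported as its own table with the same columns. The two small frames, the `record_id` column, and the `language` tag are illustrative assumptions, not the actual exports.

```python
import pandas as pd

# Hypothetical per-language exports; in practice these would be read from
# files (for example with pd.read_csv). Column names are illustrative.
english = pd.DataFrame({"record_id": [1, 2], "fat_name": ["Hospital A", "Hospital B"]})
french = pd.DataFrame({"record_id": [1], "fat_name": ["Hospital C"]})

# Tag every row with its source language so the origin is not lost,
# then stack the frames into one table with a fresh index.
frames = {"english": english, "french": french}
combined = pd.concat(
    [df.assign(language=lang) for lang, df in frames.items()],
    ignore_index=True,
)
print(len(combined))
```

Keeping a `language` column makes it possible to trace a row back to its source file later, for instance when record IDs repeat across languages.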

No labels

Here is the combined language file, without labels.

Labels

And the combined language file with labels.

Split into facility and individual level answers

The next task is to separate the combined language files into two parts, the questions about the facility, and the questions about individual practice.
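One way to sketch this split is to keep two explicit column lists and select each from the combined table. The lists below are assumptions for illustration (only a few of the real columns are shown, and `record_id` is a hypothetical key).

```python
import pandas as pd

# Small stand-in for the combined language file.
combined = pd.DataFrame({
    "record_id": [1, 2],
    "fat_name": ["Hospital A", "Hospital B"],
    "num_ana_mach": [4, 2],
    "age": [35, 41],
    "sex": [2, 1],
})

# Assumed assignment of columns to the two levels.
facility_cols = ["record_id", "fat_name", "num_ana_mach"]
individual_cols = ["record_id", "age", "sex"]

# .copy() gives independent frames, so later cleaning of one
# does not touch the other.
facilities = combined[facility_cols].copy()
individuals = combined[individual_cols].copy()
```

Each frame could then be written out with `DataFrame.to_csv`.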

facilities_labels.csv and individuals_labels.csv created in processed_data folder

Remove duplicate facilities

Facility name cleaning

There are a lot of facilities with similar or duplicate names.

Let’s start by listing out all the facility names to see what the patterns are.

there are 170 unique facility names

We will replace similar names as follows:

# map fat_name to hospital_mapping
facilities['fat_name'] = facilities['fat_name'].map(hospital_mapping)
there are 118 unique facility names
we have reduced the number of unique facility names from 170 to 118
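One point worth checking with `Series.map` and a dictionary: any name that is not a key in the dictionary becomes NaN. The sketch below, with made-up names and a made-up `hospital_mapping`, keeps the original name whenever no replacement is listed. It is only needed if the mapping does not cover every facility name.

```python
import pandas as pd

# Made-up names for illustration only.
facilities = pd.DataFrame(
    {"fat_name": ["City Hospital", "city hosp.", "Lakeside Clinic"]}
)
hospital_mapping = {"city hosp.": "City Hospital"}

before = facilities["fat_name"].nunique()

# map() returns NaN for names missing from the dictionary,
# so fall back to the original name in those rows.
mapped = facilities["fat_name"].map(hospital_mapping)
facilities["fat_name"] = mapped.fillna(facilities["fat_name"])

after = facilities["fat_name"].nunique()
print(f"reduced the number of unique facility names from {before} to {after}")
```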

Duplicate entries by same individual

Exact duplicates

These are easy to remove: we drop entries whose facility answers, name, and email are all identical to those of another entry.

before dropping duplicates, the number of entries is 204
the number of duplicate entries by same individual is 16
after dropping duplicates, the number of entries is 188
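A minimal sketch of the exact-duplicate step, assuming the respondent's name and email sit in columns called `name` and `email` (hypothetical names) next to the facility answers.

```python
import pandas as pd

# Illustrative entries: the first two rows are identical in every column.
entries = pd.DataFrame({
    "name": ["A. Person", "A. Person", "B. Person"],
    "email": ["a@example.org", "a@example.org", "b@example.org"],
    "fat_name": ["City Hospital", "City Hospital", "City Hospital"],
    "num_ana_mach": [4, 4, 5],
})

n_before = len(entries)
# duplicated() flags every repeat of an earlier, fully identical row.
n_dupes = int(entries.duplicated().sum())
deduped = entries.drop_duplicates().reset_index(drop=True)

print(f"before dropping duplicates, the number of entries is {n_before}")
print(f"the number of duplicate entries by same individual is {n_dupes}")
print(f"after dropping duplicates, the number of entries is {len(deduped)}")
```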

Remaining duplicates

No contact information

For the hospitals that still have duplicates, we want to be able to contact someone to verify.

Therefore, we will get rid of duplicate hospitals that have NEITHER a contact email NOR a contact phone number.

after dropping duplicates, the number of entries is 168
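The rule above can be sketched as two boolean masks combined with `&`. The `email` and `phone` column names are assumptions. Note one consequence of the rule as written: if every entry for a duplicated hospital lacks contact details, that hospital disappears entirely.

```python
import pandas as pd

facilities = pd.DataFrame({
    "fat_name": ["City Hospital", "City Hospital", "Lakeside Clinic"],
    "email": [None, "a@example.org", None],
    "phone": [None, None, None],
})

# keep=False marks every row of a hospital that appears more than once.
is_dup = facilities.duplicated(subset="fat_name", keep=False)
# True only when BOTH contact fields are missing.
no_contact = facilities["email"].isna() & facilities["phone"].isna()

# Drop rows that are duplicated AND cannot be contacted; hospitals that
# appear once are kept even without contact details.
facilities = facilities[~(is_dup & no_contact)].reset_index(drop=True)
```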

Drop by number of missing values

  • Now we will calculate the number of missing values in each row.
  • Then, for each hospital, we will keep only the rows with the fewest missing values for that hospital and drop the rest.
after filtering, the number of facilities is 155
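The two steps above can be sketched with a row-wise missing count and a per-hospital minimum. The data are made up. Rows that tie for the fewest missing values are all kept, so this step alone does not guarantee one row per hospital.

```python
import pandas as pd

facilities = pd.DataFrame({
    "fat_name": ["City Hospital", "City Hospital", "Lakeside Clinic"],
    "num_ana_mach": [4, None, None],
    "fio2_analys": [1, None, 2],
})

# Number of missing values in each row.
n_missing = facilities.isna().sum(axis=1)

# Smallest missing count within each hospital, broadcast back to every row.
min_missing = n_missing.groupby(facilities["fat_name"]).transform("min")

# Keep only the most complete row(s) of each hospital.
facilities = facilities[n_missing == min_missing].reset_index(drop=True)
print(f"after filtering, the number of facilities is {len(facilities)}")
```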

Drop by completed column

Next, we will drop the duplicate facility entries whose completed column is not 2 (complete).

the number of records in the dataframe is 155
after dropping incomplete facilities, the number of facilities is 130

Drop by timestamp

Next, we will find people who submitted more than one completed survey and keep only their response with the latest timestamp.

Keeping only facilities with the latest timestamp: 124
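A sketch of the latest-timestamp rule, assuming respondents are identified by an `email` column and the submission time is in a `timestamp` column (both names are assumptions).

```python
import pandas as pd

responses = pd.DataFrame({
    "email": ["a@example.org", "a@example.org", "b@example.org"],
    "fat_name": ["City Hospital", "City Hospital", "Lakeside Clinic"],
    "timestamp": ["2023-01-05 10:00", "2023-02-01 09:30", "2023-01-20 14:00"],
})
# Parse the strings so that sorting is chronological, not alphabetical.
responses["timestamp"] = pd.to_datetime(responses["timestamp"])

# Sort oldest to newest, then keep the last (newest) row per respondent.
latest = (
    responses.sort_values("timestamp")
    .drop_duplicates(subset="email", keep="last")
    .sort_index()
    .reset_index(drop=True)
)
print(f"Keeping only facilities with the latest timestamp: {len(latest)}")
```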

Drop duplicate records that were not verified

Fred called the remaining hospitals for verification. The duplicate records dropped here belong to facilities that were not called and therefore received no updated information.

there are 0 duplicate facilities

Facility Analysis

Describe hospital distribution

count
Nigeria         22
Uganda          19
Tanzania         9
South Africa     6
Namibia          5
Liberia          5
Rwanda           5
Ethiopia         5
Madagascar       3
Burkina Faso     3
Kenya            3
Sudan            2
Zambia           2
Zimbabwe         2
Mozambique       2
Réunion          2
Congo            1
Mayotte          1
Burundi          1
Ghana            1
South Sudan      1
Eswatini         1
Cameroon         1
Name: count, dtype: int64

Geocoding

Mapping

[Interactive Plotly map of the facility locations; not rendered in this export]

Categorical variable analysis

What are the characteristics of the responding facilities?

HTML file created: table.html
Section  Count (proportion)
Facility Level
  Tertiary 42 (0.41)
  Referral 29 (0.28)
  General 24 (0.24)
  Primary care 7 (0.07)
Hospital funding model
  Public 71 (0.7)
  Private 13 (0.13)
  Public-private partnership 10 (0.1)
  Mission (private not-for-profit) 8 (0.08)
Setting or location of hospital
  Urban 63 (0.62)
  Peri-urban 21 (0.21)
  Rural 18 (0.18)
Type of hospital
  Academic (Affiliated with a training institution) 71 (0.7)
  Non-academic 31 (0.3)
Anesthesia provider category
  Specialist physician anesthesia provider 75 (0.74)
  Non-physician anesthesia provider (with formal training) 24 (0.24)
  Non-specialist physician anesthesia provider like a surgeon or medical officer 2 (0.02)
  Providers with no formal training (trained on the job) 1 (0.01)
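For reference, a "count (proportion)" cell like the ones in this table can be produced from a single categorical column with `value_counts`. This is a sketch on made-up data, not the code that generated table.html.

```python
import pandas as pd

# Made-up facility levels for four facilities.
level = pd.Series(
    ["Tertiary", "Tertiary", "Referral", "General"], name="Facility Level"
)

counts = level.value_counts()
# normalize=True turns counts into proportions of the non-missing answers.
props = level.value_counts(normalize=True).round(2)

# Combine both into one "count (proportion)" string per category;
# the two Series are aligned on the category labels.
summary = counts.astype(str) + " (" + props.astype(str) + ")"
print(summary)
```

Because the proportions are rounded independently, they may not sum to exactly 1 within a section.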

Numeric columns

count mean std min 25% 50% 75% max
num_ana_mach 102.0 5.176471 4.106049 0.0 2.0 4.0 8.0 20.0
fio2_analys 102.0 3.137255 3.653397 0.0 1.0 2.0 4.0 20.0
num_oxymet 102.0 7.215686 10.484904 0.0 3.0 5.0 8.0 100.0
num_21 102.0 3.235294 3.661890 0.0 1.0 2.0 5.0 20.0
num_fio2_var 102.0 3.392157 3.912804 0.0 1.0 2.0 4.0 20.0
num_100 102.0 4.705882 4.255662 0.0 1.0 3.5 8.0 20.0
num_peep 102.0 4.107843 3.891872 0.0 1.0 3.0 5.0 20.0
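A summary with one row per variable, as shown here, matches what a transposed `DataFrame.describe()` returns. A small sketch on made-up values for two of the columns:

```python
import pandas as pd

# Made-up values; only two of the numeric columns are shown.
facilities = pd.DataFrame({
    "num_ana_mach": [2, 4, 8],
    "fio2_analys": [0, 2, 4],
})
numeric_cols = ["num_ana_mach", "fio2_analys"]

# describe() puts the statistics in rows; .T flips the result so each
# variable becomes a row and each statistic a column.
summary = facilities[numeric_cols].describe().T
print(summary)
```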

Histograms:

Select all that apply

This is the count and proportion of ‘checked’ answers for each option of the select-all-that-apply questions, including the sources of oxygen.

Count Percentage ("Checked") Source
13 43 0.421569 No access to medical air
18 42 0.411765 All our machines can provide FiO2 ranges betwe...
17 42 0.411765 FiO2 analyzer is unavailable or dysfunctional
1 36 0.352941 Piping from the manifold
4 34 0.333333 Cylinders (externally purchased)
0 32 0.313725 Piping from plant
15 24 0.235294 Supply of medical gases
8 20 0.196078 Piping from the manifold
3 18 0.176471 Cylinders (filled at hospital plant)
7 16 0.156863 Piping from the plant
16 14 0.137255 Missing components like hoses
5 13 0.127451 Anesthesia machine in-built concentrator
12 10 0.098039 Anesthesia machine entrains air
2 10 0.098039 Stand-alone concentrator
10 9 0.088235 Cylinders (externally purchased)
14 6 0.058824 Purity of medical gases
9 6 0.058824 Stand-alone air compressor
11 3 0.029412 Cylinders (filled at the hospital plant)
6 1 0.009804 No access to oxygen
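A table of this shape can be built by comparing every checkbox column against the label "Checked". The sketch below assumes one column per option holding the strings "Checked" or "Unchecked"; the data are made up.

```python
import pandas as pd

# One column per checkbox option, one row per facility (made-up answers).
answers = pd.DataFrame({
    "Piping from plant": ["Checked", "Unchecked", "Checked", "Unchecked"],
    "Stand-alone concentrator": ["Unchecked", "Unchecked", "Checked", "Unchecked"],
})

# Boolean frame: True where the option was ticked.
checked = answers.eq("Checked")

table = (
    pd.DataFrame({
        "Count": checked.sum(),                    # number of ticks per option
        'Percentage ("Checked")': checked.mean(),  # share of rows ticked (0 to 1)
    })
    .rename_axis("Source")
    .reset_index()
    .sort_values("Count", ascending=False)
)
print(table)
```

The mean of a boolean column is a proportion between 0 and 1, which is why the "Percentage" column holds values such as 0.42 rather than 42.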

Bar plots of counts, and stacked bars including ‘checked’ and ‘unchecked’

Individual analysis

Cleaning

there are 170 unique facility names
there are 118 unique facility names
we have reduced the number of unique facility names from 170 to 118
the number of records in the dataframe is 204
after dropping incomplete facilities, the number of facilities is 152
Keeping only facilities with the latest timestamp: 145
before dropping duplicates, the number of entries is 145
the number of duplicate entries by same individual is 0
after dropping duplicates, the number of entries is 102

GA with ETT

  GA with ETT  Count  Percentage ("Checked")
0         Yes    133                0.917241
1          No     12                0.082759

Abdominal surgeries per week

count    129.000000
mean      16.598450
std       16.847301
min        0.200000
25%        5.000000
50%       12.000000
75%       20.000000
max      100.000000
Name: abd_sur_wkly, dtype: float64

Carrier gases

                            Carrier gases  Count  Percentage ("Checked")
0                       Pure oxygen alone     71                0.550388
1                  Oxygen and medical air     48                0.372093
2  Oxygen, Nitrous oxide, and Medical air      6                0.046512
3                Oxygen and nitrous oxide      4                0.031008

Other frequency questions

Anesthesia provider work experience

count    129.000000
mean       5.752171
std        5.387062
min        0.200000
25%        2.000000
50%        4.000000
75%        7.000000
max       29.000000
Name: anes_prov_exper, dtype: float64

Hindrances

Count Percentage ("Checked") Source
0 64 0.441379 Lack of FiO2 analyzer
1 30 0.206897 Limited training (knowledge and skill)
2 58 0.400000 Other limitations of the anesthesia machine
3 55 0.379310 Access to medical gases (availability and cost)
4 6 0.041379 Hospital and or national practice policies and...
5 43 0.296552 I am able to use variable ranges of FiO2 at my...

Sex

sex
2.0    89
1.0    40
Name: count, dtype: int64

Age

count    129.000000
mean      37.937984
std        7.764857
min       20.000000
25%       34.000000
50%       37.000000
75%       42.000000
max       69.000000
Name: age, dtype: float64