\n",
- "Important: this step could have a huge influence on reporting rates!\n",
- "Activity can be evaluated over 1 year or across all years, based on grouping: group_by(OU_ID, YEAR):\n",
- "\n",
- " - With YEAR → “active that year”\n",
- " - Without YEAR → “ever active over the entire extracted period”\n",
- "\n",
- ""
- ],
- "id": "a598e4b7"
+ "tags": []
+ },
+ "source": [
+ "## 3. Reporting rate computations"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7d62cdb6",
+ "metadata": {},
+ "source": [
+ "#### 3.0. Define start and end period based on routine data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3bc2e76a-b5c7-4c71-90f2-c66926ca560a",
+ "metadata": {
+ "papermill": {
+ "duration": 0.044172,
+ "end_time": "2026-01-16T10:23:59.862224",
+ "exception": false,
+ "start_time": "2026-01-16T10:23:59.818052",
+ "status": "completed"
},
- {
- "cell_type": "code",
- "metadata": {
- "papermill": {
- "duration": 0.173961,
- "end_time": "2026-01-16T10:24:05.948136",
- "exception": false,
- "start_time": "2026-01-16T10:24:05.774175",
- "status": "completed"
- },
- "tags": [],
- "vscode": {
- "languageId": "r"
- }
- },
- "source": [
- "# Flag facilities with at least one report in the year\n",
- "facility_master_routine_01 <- facility_master_routine %>%\n",
- " group_by(OU_ID, YEAR) %>%\n",
- " mutate(ACTIVE_THIS_YEAR = max(ACTIVE_THIS_PERIOD, na.rm = TRUE)) %>% # use max() to flag if ACTIVE_THIS_PERIOD is 1 at least once\n",
- " ungroup()"
- ],
- "execution_count": null,
- "outputs": [],
- "id": "002e7fbf-1f68-4419-be2d-f16d8c72936d"
+ "tags": [],
+ "vscode": {
+ "languageId": "r"
+ }
+ },
+ "outputs": [],
+ "source": [
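+ "# Analysis window: earliest and latest PERIOD (YYYYMM) present in the routine data\n",
+ "PERIOD_START <- min(dhis2_routine$PERIOD, na.rm = TRUE)\n",
+ "PERIOD_END <- max(dhis2_routine$PERIOD, na.rm = TRUE)\n",
+ "\n",
+ "# One entry per calendar month between start and end, formatted as YYYYMM\n",
+ "period_vector <- format(seq(ym(PERIOD_START), ym(PERIOD_END), by = \"month\"), \"%Y%m\")\n",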
+ "cat(glue(\"Start period: {PERIOD_START} \\nEnd period: {PERIOD_END} \\nPeriods count: {length(period_vector)}\"))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "526bc3af-01c1-4ddc-b3b9-077354e57559",
+ "metadata": {
+ "papermill": {
+ "duration": 0.000109,
+ "end_time": "2026-01-16T10:23:59.862555",
+ "exception": false,
+ "start_time": "2026-01-16T10:23:59.862446",
+ "status": "completed"
},
- {
- "cell_type": "markdown",
- "metadata": {
- "papermill": {
- "duration": 0.000098,
- "end_time": "2026-01-16T10:24:05.948452",
- "exception": false,
- "start_time": "2026-01-16T10:24:05.948354",
- "status": "completed"
- },
- "tags": []
- },
- "source": [
- "#### 3.5. Compute Weighting factor based on \"volume of activity\""
- ],
- "id": "160c08ec-cc9a-4e1a-99ec-f703db83a71d"
+ "tags": []
+ },
+ "source": [
+ "#### 3.1. Build master table (all PERIOD x OU)\n",
+ "The master table contains every combination of period and organisation unit, so facility-periods without any report still appear as rows."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9308197a-0852-4d34-8888-cf5564f35a9d",
+ "metadata": {
+ "papermill": {
+ "duration": 0.289128,
+ "end_time": "2026-01-16T10:24:00.151791",
+ "exception": false,
+ "start_time": "2026-01-16T10:23:59.862663",
+ "status": "completed"
},
- {
- "cell_type": "code",
- "metadata": {
- "papermill": {
- "duration": 0.520673,
- "end_time": "2026-01-16T10:24:06.469233",
- "exception": false,
- "start_time": "2026-01-16T10:24:05.948560",
- "status": "completed"
- },
- "tags": [],
- "vscode": {
- "languageId": "r"
- }
- },
- "source": [
- "log_msg(glue(\"Computing volume of activity using indicator: {paste(VOLUME_ACTIVITY_INDICATORS, collapse=', ')}\"))\n",
- "\n",
- "# Compute MEAN_REPORTED_CASES_BY_HF as total cases over months with activity\n",
- "mean_monthly_cases <- dhis2_routine %>% \n",
- " mutate(total_cases_by_hf_month = rowSums(across(all_of(VOLUME_ACTIVITY_INDICATORS)), na.rm = TRUE)) %>%\n",
- " group_by(ADM2_ID, OU_ID) %>% \n",
- " summarise(\n",
- " total_cases_by_hf_year = sum(total_cases_by_hf_month, na.rm = TRUE),\n",
- " number_of_reporting_months = length(which(total_cases_by_hf_month > 0)),\n",
- " .groups = \"drop\"\n",
- " ) %>% \n",
- " mutate(MEAN_REPORTED_CASES_BY_HF = total_cases_by_hf_year / number_of_reporting_months) %>%\n",
- " select(ADM2_ID, OU_ID, MEAN_REPORTED_CASES_BY_HF)\n",
- "\n",
- "mean_monthly_cases_adm2 <- mean_monthly_cases %>% \n",
- " select(ADM2_ID, MEAN_REPORTED_CASES_BY_HF) %>% \n",
- " group_by(ADM2_ID) %>% \n",
- " summarise(SUMMED_MEAN_REPORTED_CASES_BY_ADM2 = sum(MEAN_REPORTED_CASES_BY_HF, na.rm=TRUE), \n",
- " NR_OF_HF = n())\n",
- "\n",
- "# Compute weights\n",
- "hf_weights <- mean_monthly_cases %>% \n",
- " left_join(mean_monthly_cases_adm2, by = \"ADM2_ID\") %>%\n",
- " mutate(WEIGHT = MEAN_REPORTED_CASES_BY_HF / SUMMED_MEAN_REPORTED_CASES_BY_ADM2 * NR_OF_HF)\n",
- "\n",
- "# Join with rest of data\n",
- "facility_master_routine_02 <- facility_master_routine_01 %>%\n",
- " left_join(hf_weights %>% select(OU_ID, WEIGHT), by = c(\"OU_ID\"))"
- ],
- "execution_count": null,
- "outputs": [],
- "id": "4420e559-4134-4fc3-8950-9972ebede00e"
+ "tags": [],
+ "vscode": {
+ "languageId": "r"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "log_msg(glue(\"Building master table with periods from {PERIOD_START} to {PERIOD_END}. Periods count: {length(period_vector)}\"))\n",
+ "\n",
+ "facility_master <- dhis2_pyramid_formatted %>%\n",
+ " rename(\n",
+ " OU_ID = glue::glue(\"LEVEL_{config_json$SNT_CONFIG$ANALYTICS_ORG_UNITS_LEVEL}_ID\"),\n",
+ " OU_NAME = glue::glue(\"LEVEL_{config_json$SNT_CONFIG$ANALYTICS_ORG_UNITS_LEVEL}_NAME\"),\n",
+ " ADM2_ID = str_replace(ADMIN_2, \"NAME\", \"ID\"),\n",
+ " ADM2_NAME = all_of(ADMIN_2),\n",
+ " ADM1_ID = str_replace(ADMIN_1, \"NAME\", \"ID\"),\n",
+ " ADM1_NAME = all_of(ADMIN_1)\n",
+ " ) %>%\n",
+ " select(ADM1_ID, ADM1_NAME, ADM2_ID, ADM2_NAME, OU_ID, OU_NAME, OPENING_DATE, CLOSED_DATE) %>%\n",
+ " distinct() %>%\n",
+ " # One row per facility and period\n",
+ " tidyr::crossing(PERIOD = period_vector) %>%\n",
+ " mutate(PERIOD = as.numeric(PERIOD)) # numeric PERIOD, as used for the join with dhis2_routine"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d5af25ad-f17c-4cdc-ac96-908af49fe558",
+ "metadata": {
+ "papermill": {
+ "duration": 0.000114,
+ "end_time": "2026-01-16T10:24:00.152094",
+ "exception": false,
+ "start_time": "2026-01-16T10:24:00.151980",
+ "status": "completed"
},
- {
- "cell_type": "markdown",
- "metadata": {
- "papermill": {
- "duration": 0.000108,
- "end_time": "2026-01-16T10:24:06.469622",
- "exception": false,
- "start_time": "2026-01-16T10:24:06.469514",
- "status": "completed"
- },
- "tags": []
- },
- "source": [
- "#### 3.6. Compute Weighted variables"
- ],
- "id": "2fed8529-70e9-4e2e-a498-fe3dd7499bb3"
+ "tags": []
+ },
+ "source": [
+ "#### 3.2. Identify \"Active\" facilities\n",
+ "\n",
+ "A facility is considered **active** in a period if it **reports** a zero or positive value for at least one of the selected indicators (**\"Activity indicators\"**). Only **non-null** values (not `NA`) are counted, so empty submissions are not treated as valid reporting.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7b279d27",
+ "metadata": {
+ "vscode": {
+ "languageId": "r"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "log_msg(glue(\"Assessing facility reporting activity based on the following indicators: {paste(ACTIVITY_INDICATORS, collapse=', ')}\"))\n",
+ "\n",
+ "facility_master_routine <- left_join(\n",
+ " facility_master,\n",
+ " # any_of() keeps only the indicators present in dhis2_routine; all_of() would fail on a missing one (GP 2026-02-04)\n",
+ " dhis2_routine %>% select(OU_ID, PERIOD, any_of(DHIS2_INDICATORS)),\n",
+ " by = c(\"OU_ID\", \"PERIOD\")\n",
+ " ) %>%\n",
+ " mutate(\n",
+ " YEAR = as.numeric(substr(PERIOD, 1, 4)),\n",
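+ " # Active (1) if at least one activity indicator is non-NA and >= 0 in this period\n",
+ " ACTIVE_THIS_PERIOD = ifelse(\n",
+ " rowSums(!is.na(across(all_of(ACTIVITY_INDICATORS))) & across(all_of(ACTIVITY_INDICATORS)) >= 0) > 0, 1, 0),\n",
+ " COUNT = 1 # Each facility-period row counts as 1\n",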
+ " )"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "89c3e5c8-4a4e-497d-9d75-2aed2e8fe619",
+ "metadata": {
+ "papermill": {
+ "duration": 0.000107,
+ "end_time": "2026-01-16T10:24:01.626760",
+ "exception": false,
+ "start_time": "2026-01-16T10:24:01.626653",
+ "status": "completed"
},
- {
- "cell_type": "code",
- "metadata": {
- "papermill": {
- "duration": 0.483413,
- "end_time": "2026-01-16T10:24:06.953139",
- "exception": false,
- "start_time": "2026-01-16T10:24:06.469726",
- "status": "completed"
- },
- "tags": [],
- "vscode": {
- "languageId": "r"
- }
- },
- "source": [
- "log_msg(glue(\"Computing weighted variables for reporting rate calculation.\"))\n",
- "\n",
- "facility_master_routine_02$ACTIVE_THIS_PERIOD_W <- facility_master_routine_02$ACTIVE_THIS_PERIOD * facility_master_routine_02$WEIGHT\n",
- "facility_master_routine_02$COUNT_W <- facility_master_routine_02$COUNT * facility_master_routine_02$WEIGHT \n",
- "facility_master_routine_02$OPEN_W <- facility_master_routine_02$OPEN * facility_master_routine_02$WEIGHT\n",
- "facility_master_routine_02$ACTIVE_THIS_YEAR_W <- facility_master_routine_02$ACTIVE_THIS_YEAR * facility_master_routine_02$WEIGHT\n",
- "\n",
- "dim(facility_master_routine_02)\n",
- "head(facility_master_routine_02, 2)"
- ],
- "execution_count": null,
- "outputs": [],
- "id": "216f7658-c1da-44e4-9f4f-fdb44fd40259"
+ "tags": []
+ },
+ "source": [
+ "#### 3.3. Identify `OPEN` facilities (denominator)\n",
+ "The \"OPEN\" variable indicates whether a facility is considered structurally open for a given reporting period.\n",
+ "\n",
+ "A facility is flagged as open (OPEN = 1) for a period if both of the following conditions are met:\n",
+ "1. No explicit closure in the facility name. The facility name does not contain closure keywords such as “CLOTUR”, “FERMÉ”, “FERMEE”, or similar.\n",
+ "\n",
+ "2. The period falls within the facility’s opening and closing dates. The opening date is known and is not after the reporting period, and the closing date is either missing or falls after the reporting period.\n",
+ "\n",
+ "If either of these conditions is not met, the facility is considered not open (OPEN = 0) for that period."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0b71f1d8-2048-4b62-865c-9acfe61b5b89",
+ "metadata": {
+ "papermill": {
+ "duration": 1.317923,
+ "end_time": "2026-01-16T10:24:02.944800",
+ "exception": false,
+ "start_time": "2026-01-16T10:24:01.626877",
+ "status": "completed"
},
- {
- "cell_type": "markdown",
- "metadata": {
- "papermill": {
- "duration": 0.000172,
- "end_time": "2026-01-16T10:24:06.953755",
- "exception": false,
- "start_time": "2026-01-16T10:24:06.953583",
- "status": "completed"
- },
- "tags": []
- },
- "source": [
- "#### 3.7. Aggregate data at ADM2 level"
- ],
- "id": "9c0367f7-91cd-4524-abe4-11adf2fcea02"
+ "tags": [],
+ "vscode": {
+ "languageId": "r"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "facility_master_routine <- facility_master_routine %>%\n",
+ " mutate(\n",
+ " period_date = as.Date(ym(PERIOD)),\n",
+ " \n",
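+ " # Flag facilities explicitly marked as closed in their name.\n",
+ " # Note: the suffix group is optional, so any name containing \"FERM\" matches (including \"FERMÉ\" and \"FERMEE\")\n",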
+ " NAME_CLOSED = str_detect(\n",
+ " toupper(OU_NAME),\n",
+ " \"CLOTUR|FERM(E|EE)?\"\n",
+ " ),\n",
+ "\n",
+ " # Check whether the facility is open during the period using open/close dates.\n",
+ " # A missing OPENING_DATE means not open; a missing CLOSED_DATE means still open\n",
+ " OPEN_BY_DATE = \n",
+ " !(is.na(OPENING_DATE) | as.Date(OPENING_DATE) > period_date |\n",
+ " (!is.na(CLOSED_DATE) & as.Date(CLOSED_DATE) <= period_date)\n",
+ " ),\n",
+ " \n",
+ " # Final definition of an open facility for the period:\n",
+ " # not explicitly closed by name and within opening/closing dates\n",
+ " OPEN = ifelse(\n",
+ " !NAME_CLOSED & OPEN_BY_DATE,\n",
+ " 1, 0\n",
+ " )\n",
+ " )"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "657fd6ca",
+ "metadata": {},
+ "source": [
+ "#### 3.4. Identify \"Active\" facilities for each YEAR (denominator)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a598e4b7",
+ "metadata": {},
+ "source": [
+ "