Skip to content

Add measures parameter to data_tabulate()#689

Open
elinw wants to merge 6 commits into
easystats:mainfrom
elinw:measures
Open

Add measures parameter to data_tabulate()#689
elinw wants to merge 6 commits into
easystats:mainfrom
elinw:measures

Conversation

@elinw

@elinw elinw commented Jun 12, 2026

Copy link
Copy Markdown

There were a few ways to approach this. The cumulative values depends on having the valid values, and it could make sense to just set any not included measures to NULL at the end, the way I did for valid.

I added a test for measures = "raw" ... I could add for other variations also.

Fixes #685

@elinw

elinw commented Jun 12, 2026

Copy link
Copy Markdown
Author

#685

Comment thread R/data_tabulate.R
Comment thread R/data_tabulate.R Outdated
}

out$`Raw %` <- 100 * out$N / sum(out$N)
if ("raw" %in% measures){

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use space around infix-operators :-)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixed, not sure why it doesn't show as outdated.

Comment thread tests/testthat/test-data_tabulate.R
Comment thread R/data_tabulate.R Outdated
#' @param measures Optional character vector, indicating the types of
#' percents to be included. Only applies to frequencies, i.e. when `by` is
#' `NULL`. Can be `"raw"` (includes `NA` values), `"valid"` (excludes `NA` values)
#' or `"cumulative"` (excludes `NA` vlues).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo (vlues)

This comment was marked as duplicate.

@etiennebacher etiennebacher left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On top of the other comments to address above, please add a bullet point in NEWS.md and add an example for the new argument in the docs of data_tabulate().

(@strengejacke I hid all the copilot comments because they were just duplicates of your comments and adding noise to this PR)

@elinw

elinw commented Jun 12, 2026

Copy link
Copy Markdown
Author

I was wondering if it should be documented that measures = c() will return the frequencies only.
Also, that made me wonder if "N" should also be optional.

@elinw elinw requested a review from etiennebacher June 12, 2026 19:07
@codecov

codecov Bot commented Jun 14, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.10%. Comparing base (89daeba) to head (7c6542e).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #689   +/-   ##
=======================================
  Coverage   92.09%   92.10%           
=======================================
  Files          76       76           
  Lines        7727     7736    +9     
=======================================
+ Hits         7116     7125    +9     
  Misses        611      611           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@etiennebacher etiennebacher left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if it should be documented that measures = c() will return the frequencies only.

Yes I think so.

Also, that made me wonder if "N" should also be optional.

I don't know about that but if it's not included then I think this measures parameter should be renamed shares. @strengejacke WDYT?


The workflow for code styling fails. We use Air to automatically format code so you can install it and run it on the project to fix this failure.

Comment thread R/data_tabulate.R
Comment on lines +93 to +95
#' # exclude the cumulative percent column
#' data_tabulate(efc$c172code, remove_na = TRUE)
#'

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new example should show how to use measures instead of remove_na.

Comment thread R/data_tabulate.R
Comment on lines +44 to +45
#' `NULL`. Can be `"raw"` (includes `NA` values), `"valid"` (excludes `NA` values)
#' or `"cumulative"` (excludes `NA` values).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#' `NULL`. Can be `"raw"` (includes `NA` values), `"valid"` (excludes `NA` values)
#' or `"cumulative"` (excludes `NA` values).
#' `NULL`. Can be several choices among `"raw"` (includes `NA` values),
#' `"valid"` (excludes `NA` values) and `"cumulative"` (excludes `NA` values).

)
})

test_that("data_tabulate data.frame with measures", {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests for measures = c(), measures = NULL and measures = "foo".

Comment thread NEWS.md
Comment on lines +23 to +24
* `data_tabulate()` gain a `measures` argument to allow selection of columns to
display ("raw", "valid", and "cumulative") (#689).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add your github username if you want: (#689, <username>)

@etiennebacher etiennebacher changed the title Add measures parameter to control columns displayed in frequency tables. Add measures parameter to data_tabulate() Jun 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cumulative distribution should be optional

4 participants