Commit 8fd9f88

committed
add readme
1 parent b14e2b1 commit 8fd9f88

2 files changed

Lines changed: 605 additions & 34 deletions

File tree

README.Rmd

Lines changed: 236 additions & 13 deletions
@@ -16,39 +16,262 @@ knitr::opts_chunk$set(
1616
# simMultiCov
1717

1818
<!-- badges: start -->
19+
[![R-CMD-check](https://github.com/BmBaczkowski/simMultiCov/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/BmBaczkowski/simMultiCov/actions/workflows/R-CMD-check.yaml)
20+
[![Codecov test coverage](https://codecov.io/gh/BmBaczkowski/simMultiCov/branch/main/graph/badge.svg)](https://app.codecov.io/gh/BmBaczkowski/simMultiCov?branch=main)
1921
<!-- badges: end -->
2022

21-
The goal of simMultiCov is to ...
23+
**simMultiCov** provides tools for simulating multilevel (clustered) covariates with flexible correlation structures. It supports continuous, binary, and ordinal covariates, allowing researchers to generate realistic clustered data for methodological studies, power analyses, and simulation-based research.
24+
25+
## Features
26+
27+
- **Multiple covariate types**: Simulate continuous, binary, and ordinal covariates
28+
- **Flexible correlation structures**: Specify separate within-cluster and between-cluster correlations
29+
- **Latent variable approach**: Binary and ordinal covariates are generated using latent normal distributions
30+
- **Unequal cluster sizes**: Support for varying numbers of observations per cluster
31+
- **Multiple datasets**: Generate multiple independent datasets in a single call
32+
- **Reproducible simulations**: Optional seed parameter for reproducibility
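
As a rough illustration of the latent variable approach listed above, the following base R sketch generates a clustered ordinal variable by thresholding a latent normal variable whose variance is split into between-cluster and within-cluster parts. It shows the general technique under stated assumptions and is not simMultiCov's internal code; all variable names are illustrative.

``` r
# Illustrative sketch only; not simMultiCov's internal implementation.
set.seed(1)

n_clusters   <- 200
cluster_size <- 50
icc          <- 0.2               # latent-scale intraclass correlation
probs        <- c(0.2, 0.3, 0.5)  # target category probabilities

# Latent normal with total variance 1: between part (icc) + within part (1 - icc)
cluster_id <- rep(seq_len(n_clusters), each = cluster_size)
u <- rnorm(n_clusters, sd = sqrt(icc))                      # cluster-level effect
e <- rnorm(n_clusters * cluster_size, sd = sqrt(1 - icc))   # observation-level residual
latent <- u[cluster_id] + e

# Standard normal cut points that reproduce the target probabilities
thresholds <- qnorm(cumsum(probs)[-length(probs)])

# Cut the latent variable into ordered categories 1, 2, 3
ordinal <- cut(latent, breaks = c(-Inf, thresholds, Inf), labels = FALSE)

# Observed category proportions; these should be close to probs
observed <- as.numeric(prop.table(table(ordinal)))
round(observed, 2)
```

A binary covariate is the same construction with a single threshold at `qnorm(1 - prob)`.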
2233

2334
## Installation
2435

25-
You can install the development version of simMultiCov like so:
36+
You can install the development version of simMultiCov from GitHub with:
37+
38+
``` r
39+
# install.packages("devtools")
40+
devtools::install_github("BmBaczkowski/simMultiCov")
41+
```
42+
43+
Or using the `remotes` package:
2644

2745
``` r
28-
# FILL THIS IN! HOW CAN PEOPLE INSTALL YOUR DEV PACKAGE?
46+
# install.packages("remotes")
47+
remotes::install_github("BmBaczkowski/simMultiCov")
2948
```
3049

31-
## Example
50+
## Quick Start
3251

33-
This is a basic example which shows you how to solve a common problem:
52+
### Basic Example
3453

3554
```{r example}
3655
library(simMultiCov)
37-
## basic example code
56+
57+
# Define covariates
58+
covs <- make_covariates(
59+
covariates = list(
60+
make_continuous("age", mean = 50, total_var = 100, icc = 0.2),
61+
make_binary("treatment", prob = 0.5, icc = 0.1),
62+
make_ordinal("satisfaction", probs = c(0.2, 0.3, 0.5), icc = 0.15)
63+
),
64+
correlations = list(
65+
define_correlation("age", "treatment", corr_within = 0.1, corr_between = 0.05)
66+
)
67+
)
68+
69+
# View the specification
70+
print(covs)
71+
```
72+
73+
### Simulate Data
74+
75+
```{r simulate}
76+
# Simulate a single dataset with 10 clusters of 20 observations each
77+
df <- simulate(covs, n_clusters = 10, cluster_size = 20, seed = 123)
78+
79+
# View the first few rows
80+
head(df)
81+
82+
# Check the data structure
83+
str(df)
84+
```
85+
86+
### Unequal Cluster Sizes
87+
88+
```{r unequal}
89+
# Simulate with varying cluster sizes
90+
df_unequal <- simulate(
91+
covs,
92+
n_clusters = 5,
93+
cluster_size = c(10, 15, 20, 25, 30),
94+
seed = 456
95+
)
96+
97+
# View cluster sizes
98+
table(df_unequal$cluster)
99+
```
100+
101+
### Multiple Datasets
102+
103+
```{r multiple}
104+
# Generate 3 independent datasets
105+
datasets <- simulate(
106+
covs,
107+
n_clusters = 5,
108+
cluster_size = 10,
109+
n_datasets = 3,
110+
seed = 789
111+
)
112+
113+
# Check the number of datasets
114+
length(datasets)
115+
116+
# Each dataset is a data frame
117+
class(datasets[[1]])
118+
```
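
Because the result is a list of data frames, it can be processed with ordinary list tools. The sketch below uses a stand-in list built in base R (so it runs without the package) to show one summary per dataset and a stacked long-format table; the column names `cluster` and `age` are borrowed from the examples above, and the `dataset` identifier column is an assumption added here for illustration.

``` r
# Stand-in for a list of simulated datasets: three small data frames
set.seed(42)
datasets <- lapply(1:3, function(i) {
  data.frame(
    cluster = rep(1:5, each = 10),
    age     = rnorm(50, mean = 50, sd = 10)
  )
})

# One summary value per dataset
means <- vapply(datasets, function(d) mean(d$age), numeric(1))

# Stack all datasets into one data frame with a dataset identifier column
stacked <- do.call(
  rbind,
  Map(function(d, i) cbind(dataset = i, d), datasets, seq_along(datasets))
)

means
table(stacked$dataset)
```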
119+
120+
## Detailed Examples
121+
122+
### Continuous Covariates
123+
124+
```{r continuous}
125+
# Create a continuous covariate with ICC of 0.3
126+
ability <- make_continuous(
127+
name = "ability",
128+
mean = 100,
129+
total_var = 225,
130+
icc = 0.3
131+
)
132+
133+
# Build covariate specification
134+
covs <- make_covariates(covariates = list(ability))
135+
136+
# Simulate data
137+
df <- simulate(covs, n_clusters = 8, cluster_size = 15, seed = 111)
138+
139+
# Verify ICC
140+
library(dplyr)
141+
df %>%
142+
group_by(cluster) %>%
143+
summarise(cluster_mean = mean(ability)) %>%
144+
summarise(
145+
between_var = var(cluster_mean),
146+
total_var = var(df$ability),
147+
icc = between_var / total_var
148+
)
149+
```
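
The variance of cluster means also contains a share of the within-cluster variance, so the ratio above is only a rough check. A common alternative is the one-way ANOVA estimator of the ICC, sketched below in base R. `icc_anova()` is a hypothetical helper written for this README, not a simMultiCov function, and the toy data are generated directly with `rnorm()` so the block runs on its own.

``` r
# Hypothetical helper (not part of simMultiCov): one-way ANOVA estimator of the ICC
icc_anova <- function(y, cluster) {
  cluster <- as.factor(cluster)
  n_i <- as.numeric(table(cluster))   # cluster sizes
  k   <- length(n_i)                  # number of clusters
  N   <- sum(n_i)                     # total number of observations

  cluster_means <- tapply(y, cluster, mean)
  msb <- sum(n_i * (cluster_means - mean(y))^2) / (k - 1)  # between mean square
  msw <- sum((y - ave(y, cluster))^2) / (N - k)            # within mean square

  # Adjusted average cluster size; equals the common size when clusters are equal
  n0 <- (N - sum(n_i^2) / N) / (k - 1)

  (msb - msw) / (msb + (n0 - 1) * msw)
}

# Toy data with a known ICC of 0.3 and total variance 225
set.seed(2)
k <- 200
m <- 20
cluster <- rep(seq_len(k), each = m)
y <- 100 +
  rnorm(k, sd = sqrt(0.3 * 225))[cluster] +   # between-cluster part
  rnorm(k * m, sd = sqrt(0.7 * 225))          # within-cluster part

est <- icc_anova(y, cluster)
est
```

With only a handful of clusters, as in the example above, any ICC estimate is noisy, so a close match to the target should not be expected from a single small dataset.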
150+
151+
### Binary Covariates
152+
153+
```{r binary}
154+
# Create a binary treatment variable
155+
treatment <- make_binary(
156+
name = "treatment",
157+
prob = 0.6,
158+
icc = 0.15,
159+
labels = c("Control", "Treatment")
160+
)
161+
162+
# Build specification
163+
covs <- make_covariates(covariates = list(treatment))
164+
165+
# Simulate data
166+
df <- simulate(covs, n_clusters = 6, cluster_size = 20, seed = 222)
167+
168+
# Check treatment distribution
169+
table(df$treatment)
170+
171+
# Check ICC
172+
df %>%
173+
group_by(cluster) %>%
174+
summarise(cluster_prop = mean(treatment == "Treatment")) %>%
175+
summarise(
176+
between_var = var(cluster_prop),
177+
total_prop = mean(df$treatment == "Treatment"),
178+
icc = between_var / (total_prop * (1 - total_prop))
179+
)
180+
```
181+
182+
### Ordinal Covariates
183+
184+
```{r ordinal}
185+
# Create an ordinal satisfaction variable
186+
satisfaction <- make_ordinal(
187+
name = "satisfaction",
188+
probs = c(0.15, 0.25, 0.35, 0.25),
189+
icc = 0.2,
190+
labels = c("VeryLow", "Low", "High", "VeryHigh")
191+
)
192+
193+
# Build specification
194+
covs <- make_covariates(covariates = list(satisfaction))
195+
196+
# Simulate data
197+
df <- simulate(covs, n_clusters = 10, cluster_size = 25, seed = 333)
198+
199+
# Check distribution
200+
table(df$satisfaction)
38201
```
39202

40-
What is special about using `README.Rmd` instead of just `README.md`? You can include R chunks like so:
203+
### Correlated Covariates
41204

42-
```{r cars}
43-
summary(cars)
205+
```{r correlated}
206+
# Create multiple correlated covariates
207+
covs <- make_covariates(
208+
covariates = list(
209+
make_continuous("income", mean = 50000, total_var = 100000000, icc = 0.25),
210+
make_continuous("education", mean = 16, total_var = 4, icc = 0.2),
211+
make_binary("employed", prob = 0.7, icc = 0.1)
212+
),
213+
correlations = list(
214+
define_correlation("income", "education", corr_within = 0.5, corr_between = 0.6),
215+
define_correlation("income", "employed", corr_within = 0.3, corr_between = 0.2),
216+
define_correlation("education", "employed", corr_within = 0.4, corr_between = 0.3)
217+
)
218+
)
219+
220+
# View the specification
221+
summary(covs)
222+
223+
# Simulate data
224+
df <- simulate(covs, n_clusters = 15, cluster_size = 30, seed = 444)
225+
226+
# Check correlations
227+
cor(df[, c("income", "education")])
44228
```
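
The overall correlation mixes within-cluster and between-cluster association. One way to look at the two levels separately is to correlate the deviations from cluster means (within) and the cluster means themselves (between). The base R sketch below does this; `split_correlation()` is a hypothetical helper, not a simMultiCov function, and the toy data are constructed by hand so that the two levels have opposite signs.

``` r
# Hypothetical helper (not part of simMultiCov): total, within, and between correlations
split_correlation <- function(x, y, cluster) {
  x_mean <- ave(x, cluster)   # cluster mean of x, repeated for each observation
  y_mean <- ave(y, cluster)   # cluster mean of y, repeated for each observation
  c(
    total   = cor(x, y),
    within  = cor(x - x_mean, y - y_mean),
    between = cor(tapply(x, cluster, mean), tapply(y, cluster, mean))
  )
}

# Toy data: negative association inside each cluster, positive across clusters
cluster <- c(1, 1, 1, 2, 2, 2)
x <- c(1, 2, 3, 11, 12, 13)
y <- c(3, 2, 1, 13, 12, 11)

res <- split_correlation(x, y, cluster)
res
```

The between-cluster value is computed from observed cluster means, which still carry some within-cluster sampling noise, so it is an approximation to the specified `corr_between` rather than an exact estimate.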
45229

46-
You'll still need to render `README.Rmd` regularly, to keep `README.md` up-to-date. `devtools::build_readme()` is handy for this.
230+
## Use Cases
231+
232+
**simMultiCov** is particularly useful for:
233+
234+
- **Methodological research**: Testing the performance of multilevel models under various data-generating conditions
235+
- **Power analysis**: Determining required sample sizes for multilevel studies
236+
- **Teaching**: Demonstrating multilevel data structures and analysis techniques
237+
- **Simulation studies**: Generating realistic clustered data for Monte Carlo simulations
238+
- **Sensitivity analysis**: Assessing how violations of assumptions affect model performance
239+
240+
## Package Structure
241+
242+
The package provides the following main functions:
47243

48-
You can also embed plots, for example:
244+
- `make_continuous()`: Define continuous covariates
245+
- `make_binary()`: Define binary covariates
246+
- `make_ordinal()`: Define ordinal covariates
247+
- `define_correlation()`: Specify correlations between covariates
248+
- `make_covariates()`: Combine covariate specifications into a complete structure
249+
- `simulate()`: Generate simulated data from a covariate specification
49250

50-
```{r pressure, echo = FALSE}
251+
## Getting Help
51252

253+
- **Documentation**: Run `?function_name` for detailed help on any function
254+
- **Issues**: Report bugs or request features on [GitHub Issues](https://github.com/BmBaczkowski/simMultiCov/issues)
255+
- **Questions**: Ask questions on [GitHub Discussions](https://github.com/BmBaczkowski/simMultiCov/discussions)
256+
257+
## Citation
258+
259+
If you use simMultiCov in your research, please cite:
260+
261+
``` r
262+
citation("simMultiCov")
52263
```
53264

54-
In that case, don't forget to commit and push the resulting figure files, so they display on GitHub and CRAN.
265+
## License
266+
267+
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
268+
269+
## Contributing
270+
271+
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
272+
273+
1. Fork the repository
274+
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
275+
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
276+
4. Push to the branch (`git push origin feature/AmazingFeature`)
277+
5. Open a Pull Request
