---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(mxsem)
set.seed(123)
```

# mxsem

<!-- badges: start -->
[![Total Downloads](https://cranlogs.r-pkg.org/badges/grand-total/mxsem)](https://cranlogs.r-pkg.org/badges/grand-total/mxsem)
[![R-CMD-check](https://github.com/jhorzek/mxsem/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jhorzek/mxsem/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->

**mxsem** provides a **lavaan**-like (Rosseel, 2012) syntax to implement structural equation models (SEM)
with **OpenMx** (Boker et al., 2011). The objective is to simplify fitting basic SEM with
**OpenMx**, while also unlocking some very useful advanced features. For instance,
**mxsem** allows for parameter transformations and definition
variables. However, **mxsem** is intentionally incomplete in order to focus
on simplicity. The main function (`mxsem()`) is similar to **lavaan**'s `sem()` function
in that it tries to set up parts of the model automatically (e.g., adding variances
or scaling the latent variables).

> **Warning**: The syntax and settings of **mxsem** may differ from
**lavaan** in some cases. See `vignette("Syntax", package = "mxsem")` for more details
on the syntax and the default arguments.

## Alternatives

**mxsem** is not the first package providing a **lavaan**-like syntax for **OpenMx**.
You will find similar functions in the following packages:

- [**metaSEM**](https://github.com/mikewlcheung/metasem) (Cheung, 2015) provides a `lavaan2RAM`
function that can be combined with the `create.mxModel` function. This combination
offers more features than **mxsem**. For instance, constraints of the form `a < b`
are supported. In **mxsem** such constraints require algebras (e.g., `!diff; a := b - exp(diff)`).
- [**umx**](https://github.com/tbates/umx) (Bates et al., 2019)
provides the `umxRAM` and `umxLav2RAM` functions that can translate single **lavaan**-style
statements (e.g., `eta =~ y1 + y2 + y3`)
or entire **lavaan** models to **OpenMx** models.
- [**tidySEM**](https://github.com/cjvanlissa/tidySEM) (van Lissa, 2023) provides the
`as_ram` function to translate **lavaan** syntax to **OpenMx** and also implements a unified syntax to
specify both **lavaan** and **OpenMx** models. Additionally, it works well with the
**tidyverse**.
- [**ezMx**](https://github.com/OpenMx/ezMx) (Bates & Prindle, 2014) simplifies fitting SEM with **OpenMx**
and also provides a translation of **lavaan** models to **OpenMx** with the
`lavaan.to.OpenMx` function.
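
The algebra-based workaround for `a < b` mentioned in the **metaSEM** item can be checked numerically. The following is a minimal base-R sketch, not **mxsem** code: the value of `b` and the grid of `diff` values are made up for illustration. Because `exp(diff)` is always positive, `a = b - exp(diff)` stays below `b` no matter which value the optimizer picks for the unconstrained parameter `diff`.

```r
# hypothetical value for the parameter b
b <- 0.8

# hypothetical values the unconstrained parameter `diff` could take
diff_values <- seq(-5, 5, by = 0.5)

# a is defined as a transformation of b and diff (a := b - exp(diff))
a <- b - exp(diff_values)

# exp() is strictly positive, so a is always smaller than b
all(a < b)
```

The price of this reparameterization is that `a` can approach `b` but never equal it.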

Because **mxsem** implements the syntax parser from scratch, it can extend the
**lavaan** syntax to account for specific **OpenMx** features. This enables
[implicit transformations](#transformations) with curly braces.

## Citation

Cite **OpenMx** (Boker et al., 2011) for the modeling and **lavaan** for the
syntax (Rosseel, 2012). To cite **mxsem**, check `citation("mxsem")`.

## Installation

**mxsem** is available from CRAN:

```{r, eval = FALSE}
install.packages("mxsem")
```

The newest version of the package can be installed from GitHub using the following commands in R:

```{r, eval = FALSE}
if(!require(devtools)) install.packages("devtools")
devtools::install_github("jhorzek/mxsem",
                         ref = "main")
```
Because **mxsem** uses Rcpp, you will need a C++ compiler (e.g., by installing
Rtools on Windows, Xcode on macOS, or build-essential on Linux).

## Example

The following example is directly adapted from **lavaan**:

```{r, eval = TRUE, results = 'hide', message=FALSE, warning=FALSE}
library(mxsem)
model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a1*y2 + b*y3 + c1*y4
     dem65 =~ y5 + a2*y6 + b*y7 + c2*y8

  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60

  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

mxsem(model = model,
      data  = OpenMx::Bollen) |>
  mxTryHard() |>
  summary()
```
<details>
<summary>Show summary</summary>


```{r, include=FALSE}
fit_mx <- mxsem(model = model,
                data  = OpenMx::Bollen) |>
  mxTryHard()
```
```{r,echo=FALSE}
summary(fit_mx)
```

</details>

## Adding bounds

Lower and upper bounds can be added to any of the parameters in the model.
The following example places a lower bound on the loading `a1` and an upper bound on the loading `a2`:
```{r, eval = TRUE, results = 'hide', message=FALSE, warning=FALSE}
library(mxsem)
model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a1*y2 + b*y3 + c1*y4
     dem65 =~ y5 + a2*y6 + b*y7 + c2*y8

  # lower bound on a1
     a1 > 0
  # upper bound on a2
     a2 < 10.123
'

mxsem(model = model,
      data  = OpenMx::Bollen,
      # use latent variances to scale the model
      scale_loadings = FALSE,
      scale_latent_variances = TRUE) |>
  mxTryHard() |>
  summary()
```

<details>
<summary>Show summary</summary>

```{r, include=FALSE}
fit_mx <- mxsem(model = model,
                data  = OpenMx::Bollen,
                # use latent variances to scale the model
                scale_loadings         = FALSE,
                scale_latent_variances = TRUE) |>
  mxTryHard()
```
```{r,echo=FALSE}
summary(fit_mx)
```

</details>

**mxsem** adds lower bounds to all variances by default. To remove these
lower bounds, set `lbound_variances = FALSE` when calling `mxsem()`.

## Definition Variables

Definition variables are, for instance, used in latent growth curve models when
the time intervals between observations are different for the subjects
in the data set. Here is an example, where the variables `t_1`-`t_5` indicate
the person-specific times of observation:

```{r, eval = TRUE}
library(mxsem)
set.seed(3489)
dataset <- simulate_latent_growth_curve(N = 100)
head(dataset)
```

In **OpenMx**, parameters can be set to the values found in the columns of
the data set with the `data.` prefix. This is used in the following to fix
the loadings of a latent slope variable on the observations to the times
recorded in `t_1`-`t_5`:
```{r, eval = TRUE, results = 'hide', message=FALSE, warning=FALSE}
library(mxsem)
model <- "
  # specify latent intercept
     I =~ 1*y1 + 1*y2 + 1*y3 + 1*y4 + 1*y5
  # specify latent slope
     S =~ data.t_1 * y1 + data.t_2 * y2 + data.t_3 * y3 + data.t_4 * y4 + data.t_5 * y5

  # specify means of latent intercept and slope
     I ~ int*1
     S ~ slp*1

  # set intercepts of manifest variables to zero
     y1 ~ 0*1; y2 ~ 0*1; y3 ~ 0*1; y4 ~ 0*1; y5 ~ 0*1;
  "

mxsem(model = model,
      data  = dataset) |>
  mxTryHard() |>
  summary()
```

<details>
<summary>Show summary</summary>

```{r, include=FALSE}
fit_mx <- mxsem(model = model,
                data  = dataset) |>
  mxTryHard()
```
```{r,echo=FALSE}
summary(fit_mx)
```

</details>
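
To make the role of the definition variables more concrete, here is a small base-R sketch of what the model above implies for each person. It does not use **mxsem**; the observation times and the values for `int` and `slp` are hypothetical. Each person gets their own loading matrix, with the slope loadings replaced by that person's observation times.

```r
# hypothetical observation times for two persons (columns t_1 to t_5)
times <- rbind(
  person_1 = c(0, 1.0, 2.0, 3.0, 4.0),
  person_2 = c(0, 0.5, 2.5, 3.5, 6.0)
)

# hypothetical latent means (labeled int and slp in the model syntax)
int <- 1.0
slp <- 0.5

# person-specific loading matrix: intercept loadings are fixed to 1,
# slope loadings are the observation times of that person
loadings_for <- function(t_i) cbind(I = 1, S = t_i)

# model-implied means of y1 to y5 are Lambda_i %*% c(int, slp),
# because the manifest intercepts are fixed to zero
implied_means <- t(apply(times, 1, function(t_i) {
  drop(loadings_for(t_i) %*% c(int, slp))
}))

implied_means
```

Both persons share the same latent means, but their implied trajectories differ because the loadings differ.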

## Transformations

Sometimes, one may want to express one parameter as a function of other parameters.
In moderated non-linear factor analysis, for example, model parameters are often
expressed in terms of a covariate k. For instance, the effect $a$ of $\xi$ on $\eta$
could be expressed as $a = a_0 + a_1\times k$.
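
As a quick illustration of this formula, the base-R sketch below computes the person-specific effect for a few cases. The values of `a0`, `a1`, and `k` are hypothetical and only serve to show that every person can end up with a different effect.

```r
# hypothetical baseline effect and moderation effect
a0 <- 0.7
a1 <- -0.2

# hypothetical covariate values for four persons
k <- c(0, 1, 0, 1)

# person-specific effect of xi on eta: a = a0 + a1 * k
a <- a0 + a1 * k
a
```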

```{r, eval = TRUE}
library(mxsem)
set.seed(9820)
dataset <- simulate_moderated_nonlinear_factor_analysis(N = 100)
head(dataset)
```

**mxsem** currently supports two ways of specifying such transformations. First,
they can be specified explicitly. To this end, the parameters $a_0$ and $a_1$
must first be initialized with `!a0` and `!a1`. Additionally, the transformation
must be defined with `a := a0 + a1*data.k`.
```{r, eval = TRUE, results = 'hide', message=FALSE, warning=FALSE}
model <- "
  # loadings
     xi =~ x1 + x2 + x3
     eta =~ y1 + y2 + y3
  # regression
     eta ~ a*xi

  # we need two new parameters: a0 and a1. These are created as follows:
     !a0
     !a1
  # Now, we redefine a to be a0 + k*a1, where k is found in the data
     a := a0 + data.k*a1
"

fit_mx <- mxsem(model = model,
                data  = dataset) |>
  mxTryHard()

summary(fit_mx)

# get just the value for parameter a:
mxEval(expression = a, model = fit_mx)
```

<details>
<summary>Show summary</summary>

```{r, include=FALSE}
fit_mx <- mxsem(model = model,
                data = dataset) |>
  mxTryHard()
```
```{r,echo=FALSE}
summary(fit_mx)
# get just the value for parameter a:
mxEval(expression = a,
       model      = fit_mx)
```

</details>

Alternatively, the transformations can be defined implicitly by placing the
algebra in curly braces and directly inserting it in the syntax in place
of the parameter label. This is inspired by the approach in **metaSEM** (Cheung, 2015).
```{r, eval = TRUE, results = 'hide', message=FALSE, warning=FALSE}
model <- "
  # loadings
     xi =~ x1 + x2 + x3
     eta =~ y1 + y2 + y3
  # regression
     eta ~ {a0 + a1*data.k} * xi
"

mxsem(model = model,
      data  = dataset) |>
  mxTryHard() |>
  summary()
```

<details>
<summary>Show summary</summary>

```{r, include=FALSE}
fit_mx <- mxsem(model = model,
                data  = dataset) |>
  mxTryHard()
```
```{r,echo=FALSE}
summary(fit_mx)
```

</details>

You can also provide custom names for the algebra results:
```{r, eval = TRUE, results = 'hide', message=FALSE, warning=FALSE}
model <- "
  # loadings
     xi  =~ x1 + x2 + x3
     eta =~ y1 + y2 + y3
  # regression
     eta ~ {a := a0 + a1*data.k} * xi
"

fit_mx <- mxsem(model = model,
                data  = dataset) |>
  mxTryHard()

summary(fit_mx)

# get just the value for parameter a:
mxEval(expression = a,
       model      = fit_mx)
```

<details>
<summary>Show summary</summary>

```{r, include=FALSE}
fit_mx <- mxsem(model = model,
                data  = dataset) |>
  mxTryHard()
```

```{r,echo=FALSE}
summary(fit_mx)
# get just the value for parameter a:
mxEval(expression = a,
       model      = fit_mx)
```

</details>

## Adapting the Model

`mxsem()` returns an `mxModel` object that can be adapted further by users familiar
with **OpenMx**.
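
One possible adaptation is to request a likelihood-based confidence interval for a labeled parameter. The sketch below assumes that the fitted object can be passed to the standard **OpenMx** functions `mxModel()`, `mxCI()`, and `mxRun()` like any other `mxModel`; the reduced model and the choice of the loading `a1` are for illustration only.

```r
library(OpenMx)
library(mxsem)

model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a1*y2 + b*y3 + c1*y4
'

fit <- mxsem(model = model,
             data  = OpenMx::Bollen) |>
  mxTryHard()

# add a confidence interval request for the labeled loading a1
# and re-run the model with interval estimation switched on
fit_ci <- mxModel(fit, mxCI("a1")) |>
  mxRun(intervals = TRUE)

# the interval is reported in the CI element of the summary
summary(fit_ci)$CI
```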

## Troubleshooting

Sometimes things may go wrong. One way to figure out where **mxsem** messed up
is to look at the parameter table generated internally. This parameter table
is not returned by default. See `vignette("create_parameter_table", package = "mxsem")`
for more details.

Another potential source of problems is the default labels used by **mxsem** to indicate directed
and undirected effects. These are based on Unicode characters. If you see parameter
labels similar to `"eta\u2192y1"` in your output, this indicates that your editor cannot display
Unicode characters. In this case, you can customize the labels as follows:

```{r, eval = TRUE, results = 'hide', message=FALSE, warning=FALSE}
library(mxsem)
model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a1*y2 + b*y3 + c1*y4
     dem65 =~ y5 + a2*y6 + b*y7 + c2*y8
'

mxsem(model      = model,
      data       = OpenMx::Bollen,
      directed   = "_TO_",
      undirected = "_WITH_") |>
  mxTryHard() |>
  summary()
```
<details>
<summary>Show summary</summary>


```{r, include=FALSE}
fit_mx <- mxsem(model = model,
                data  = OpenMx::Bollen,
                directed   = "_TO_",
                undirected = "_WITH_") |>
  mxTryHard()
```
```{r,echo=FALSE}
summary(fit_mx)
```

</details>
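
The following base-R sketch shows the difference between the two kinds of labels and how to check whether the current session uses a UTF-8 locale. It assumes that a label is built as source variable, separator, target variable, which matches the `"eta\u2192y1"` example above; the exact labels produced by **mxsem** may differ.

```r
# directed-effect label with the default Unicode arrow (U+2192)
default_label <- paste0("eta", "\u2192", "y1")

# the same label built with the ASCII separator passed via `directed`
custom_label <- paste0("eta", "_TO_", "y1")

# TRUE if the current R session runs in a UTF-8 locale
l10n_info()[["UTF-8"]]

print(default_label)
print(custom_label)
```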


## References

* Bates, T. C., Maes, H., & Neale, M. C. (2019). umx: Twin and Path-Based Structural Equation Modeling in R. Twin Research and Human Genetics, 22(1), 27–41. https://doi.org/10.1017/thg.2019.2
* Bates, T. C., & Prindle, J. J. (2014). ezMx. https://github.com/OpenMx/ezMx
* Boker, S. M., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., Spies, J., Estabrook, R., Kenny, S., Bates, T., Mehta, P., & Fox, J. (2011).
OpenMx: An Open Source Extended Structural Equation Modeling Framework. Psychometrika, 76(2), 306–317. https://doi.org/10.1007/s11336-010-9200-6
* Cheung, M. W.-L. (2015). metaSEM: An R package for meta-analysis using structural equation modeling. Frontiers in Psychology, 5. https://doi.org/10.3389/fpsyg.2014.01521
* Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02
* van Lissa, C. J. (2023). tidySEM: Tidy Structural Equation Modeling. R package version 0.2.4, https://cjvanlissa.github.io/tidySEM/.