poseidon-dev/TestSmallestLength.Rmd at master · Bai-Li-NOAA/poseidon-dev · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
---
title: "Testing atlantisom: do we ever get small fish in length comps?"
author: "Sarah Gaichas and Christine Stawitz"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output: html_document
bibliography: "packages.bib"
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
# automatically create a bib database for R packages
knitr::write_bib(c(
  .packages(), 'knitr', 'rmarkdown', 'tidyr', 'dplyr', 'ggplot2',
  'data.table', 'here', 'ggforce', 'ggthemes'
), 'packages.bib')
```

## Introduction

Here we further test the function `calc_age2length` to see if small fish ever appear with full census sampling. If they don't, perhaps the default CV in the function needs modification? This question is triggered by the [sampling for NOBA Greenland halibut](https://sgaichas.github.io/poseidon-dev/NOBAStdSurvLengthCompTest.html), in which fish below 50 cm didn't show up.

For all setup, etc, please see previous files Full methods are explained [here](https://sgaichas.github.io/poseidon-dev/TrueBioTest.html) and  [here](https://sgaichas.github.io/poseidon-dev/TrueLengthCompTest.html).

This page has visualizations for the NOBA model example, CERES Global Sustainability. For full explanation of methods, see the file linked at the beginning of each section.

```{r message=FALSE, warning=FALSE}
library(tidyr)
require(dplyr)
library(ggplot2)
library(data.table)
library(here)
library(ggforce)
library(ggthemes)
library(atlantisom)
```

```{r initialize}

initCCA <- FALSE
initNEUS <- FALSE
initNOBA <- TRUE

if(initCCA) source(here("config/CCConfig.R"))

if(initNEUS) source(here("config/NEUSConfig.R"))

if(initNOBA) source(here("config/NOBAConfig.R"))

```

```{r get_names, message=FALSE, warning=FALSE}
#Load functional groups
funct.groups <- load_fgs(dir=d.name,
                         file_fgs = functional.groups.file)
#Get just the names of active functional groups
funct.group.names <- funct.groups %>%
  filter(IsTurnedOn == 1) %>%
  select(Name) %>%
  .$Name

```

```{r load_Rdata, message=FALSE, warning=FALSE}

if(initCCA) {
  truth.file <- "outputCCV3run_truth.RData"
  load(file.path(d.name, truth.file))
  truth <- result
}

if(initNEUS) {
  truth.file <- "outputneusDynEffort_Test1_run_truth.RData"
  load(file.path(d.name, truth.file))
  truth <- result
}

if(initNOBA){
  truth.file <- "outputnordic_runresults_01run_truth.RData"
  load(file.path(d.name, truth.file))
  truth <- result
}

```

## Simulate a survey part 4: sample for length composition (testing revised function)

Full methods are explained [here](https://sgaichas.github.io/poseidon-dev/StdSurvLengthCompTest.html).

We will apply examples here to only one species, Greenland halibut in NOBA, which grows to a large size.

To create a census, the user specifies the timing of the survey, which species are captured, the spatial coverage of the survey, the species-specific survey efficiency ("q"), and the selectivity at age for each species. The following settings should achieve a survey that samples all Atlantis model output timesteps, all fish and shark species, and all model polygons, with perfect efficiency and full selectivity for all ages.


```{r census-spec, message=FALSE, warning=FALSE}

# should return a perfectly scaled survey
effic1 <- data.frame(species=funct.group.names,
                     efficiency=rep(1.0,length(funct.group.names)))

# should return all lengths fully sampled (Atlantis output is 10 age groups per spp)
selex1 <- data.frame(species=rep(funct.group.names, each=10),
                     agecl=rep(c(1:10),length(funct.group.names)),
                     selex=rep(1.0,length(funct.group.names)*10))

# should return all model areas
boxpars <- load_box(d.name, box.file)
boxall <- c(0:(boxpars$nbox - 1))

# generalized timesteps all models
runpar <- load_runprm(d.name, run.prm.file)
noutsteps <- runpar$tstop/runpar$outputstep
stepperyr <- if(runpar$outputstepunit=="days") 365/runpar$toutinc

timeall <- c(0:noutsteps)

# define set of species we expect surveys to sample (e.g. fish only? vertebrates?)
# for ecosystem indicator work test all species, e.g.
survspp <- funct.group.names

# for length and age groups lets just do fish and sharks
# NOBA model has InvertType, changed to GroupType in file, but check Atlantis default
if(initNOBA) funct.groups <- rename(funct.groups, GroupType = InvertType)

survspp <- funct.groups$Name[funct.groups$IsTurnedOn==1 &
                           funct.groups$GroupType %in% c("FISH", "SHARK")]

```

Here we use `create_survey` on the numbers output of `run_truth` to create the survey census of age composition (for just one species in this case). The `sample_fish` applies the median for aggregation and does not apply multinomial sampling if `sample=FALSE` in the function call.

Because we don't want to wait 24 hours for this, we will look at only the first 112 time steps.

```{r stdsurveyNbased-GHR, echo=TRUE}

ss.name <- funct.group.names[funct.group.names == "Green_halibut"]

# get survey nums with full (no) selectivity
ss_survey_testNall <- create_survey(dat = truth$nums,
                                 time = c(0:111),
                                 species = ss.name,
                                 boxes = boxall,
                                 effic = effic1,
                                 selex = selex1)

# this one is high but not equal to total for numerous groups
effNhigh <- data.frame(species=survspp, effN=rep(1e+8, length(survspp)))

# apply default sample fish as before to get numbers
ss_numsallhigh <- sample_fish(ss_survey_testNall, effNhigh)


# aggregate true resn per survey design
ss_aggresnall <- aggregateDensityData(dat = truth$resn,
                                 time = c(0:111),
                                 species = ss.name,
                                 boxes = boxall)

# aggregate true structn per survey design
ss_aggstructnall <- aggregateDensityData(dat = truth$structn,
                                 time = c(0:111),
                                 species = ss.name,
                                 boxes = boxall)

#dont sample these, just aggregate them using median
ss_structnall <- sample_fish(ss_aggstructnall, effNhigh, sample = FALSE)

ss_resnall <-  sample_fish(ss_aggresnall, effNhigh, sample = FALSE)

```

Length sample with user specified max length bin (200 cm):

```{r userset-maxlen, echo=TRUE}

ss_length_census <- calc_age2length(structn = ss_structnall,
                                 resn = ss_resnall,
                                 nums = ss_numsallhigh,
                                 biolprm = truth$biolprm, fgs = truth$fgs,
                                 maxbin = 200,
                                 CVlenage = 0.1, remove.zeroes=TRUE)


```

We should get the upper end of Greenland halibut with a 200cm max length bin. What I am concerned about here is whether we get very small fish at any time step:

```{r sslengthsamp2-test}

lfplot <- ggplot(ss_length_census$natlength, aes(upper.bins)) +
  geom_bar(aes(weight = atoutput)) +
  theme_tufte() +
  labs(subtitle = paste(scenario.name, ss_length_census$natlength$species))

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 1, scales="free_y")
lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 2, scales="free_y")
lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 3, scales="free_y")
lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 4, scales="free_y")
lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 5, scales="free_y")
lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 6, scales="free_y")
lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 7, scales="free_y")

```

Are there fish <50 cm long at any output timestep? If not, does this make sense? There appear to be a few, such as timestep 4. Hard to see at this level.

Aggregate length comp over all of these time steps (first 22+ years):

```{r agglengthcomp}

lfplot

```

This suggests that the smallest Greenland turbot in the ecosystem are ~25 cm over all of the timesteps we checked. Does this make sense?

## Quick test with saved CCA census output

Here we check the CCA model (Atlantis Summit Common Scenario 1) saved length census output for small species at Isaac's suggestion:

```{r loadCCAlengthcomp, echo=TRUE}

source(here("config/CCConfig.R"))

length_censussurvsamp <- readRDS(file.path(d.name, paste0(scenario.name, "length_censussurvsamp.rds")))

```

Sardine:
```{r plotCCAsardine}
ss_length_censussurvsamp <- length_censussurvsamp$natlength %>%
  filter(species == "Pacific_sardine")

lfplot <- ggplot(ss_length_censussurvsamp, aes(upper.bins)) +
  geom_bar(aes(weight = atoutput)) +
  theme_tufte() +
  labs(subtitle = paste(scenario.name, ss_length_censussurvsamp$species))

lfplot
```

Anchovy:
```{r plotCCAanchovy}
ss_length_censussurvsamp <- length_censussurvsamp$natlength %>%
  filter(species == "Anchovy")

lfplot <- ggplot(ss_length_censussurvsamp, aes(upper.bins)) +
  geom_bar(aes(weight = atoutput)) +
  theme_tufte() +
  labs(subtitle = paste(scenario.name, ss_length_censussurvsamp$species))

lfplot
```

Herring:
```{r plotCCAherring}
ss_length_censussurvsamp <- length_censussurvsamp$natlength %>%
  filter(species == "Herring")

lfplot <- ggplot(ss_length_censussurvsamp, aes(upper.bins)) +
  geom_bar(aes(weight = atoutput)) +
  theme_tufte() +
  labs(subtitle = paste(scenario.name,ss_length_censussurvsamp$species))

lfplot
```

How do these three look?

Do we need to change something about the length comp estimation?