---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# programets
<!-- badges: start -->
[Lifecycle: stable](https://lifecycle.r-lib.org/articles/stages.html#stable)
<!-- badges: end -->
**programets** is an R package for collecting and analyzing academic impact metrics for NIH-funded research projects. It aggregates data from multiple public sources to help researchers, program managers, and evaluators understand how their projects engage with the broader research community over time.
## What Does It Do?
The package provides unified access to:
- **NIH RePORTER** - Project metadata, funding details, and associated publications
- **iCite** - Citation metrics including Relative Citation Ratio (RCR) for PubMed publications
- **GitHub** - Repository metrics (stars, forks, commits, contributors, issues, PRs) for projects tagged with NIH Core Project Numbers
- **Google Analytics** - Web traffic and engagement data for project websites
- **Europe PMC** - Literature search across millions of publications
## Installation
Install the development version from GitHub:
``` r
# install.packages("pak")
pak::pak("nih-cfde/programets")
```
**Requirements:**
- R >= 4.1.0
- For GitHub data: A Personal Access Token is recommended for higher rate limits
## Quick Start
### 1. Query NIH RePORTER for Project Information
Retrieve comprehensive project metadata including publications, funding details, and principal investigators:
```{r reporter_example}
library(programets)
# Get project info for one or more NIH Core Project Numbers
proj_info <- get_core_project_info(c("u24ca289073"))
proj_info |> colnames()
```
Available fields include: project title, abstract, funding amounts, dates, PIs, publications (PMIDs), and more.
### 2. Get GitHub Repository Metrics
If your project repositories are tagged with NIH Core Project Numbers as topics, you can collect engagement metrics:
```{r git_example}
# Fetch GitHub metrics for repos tagged with your project number
df <- get_github_by_topic(c("u24ca289073"))
df |> colnames()
```
Metrics include: stars, watchers, forks, open/closed issues, open/closed PRs, commit count, contributor count, and more.
**Tip:** Tag your GitHub repositories with your NIH Core Project Number (e.g., `u24ca289073`) to enable discovery.
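Topic-based discovery can be previewed without the package. The sketch below builds a GitHub REST search URL for repositories carrying a given topic, assuming the standard `/search/repositories` endpoint with a `topic:` qualifier. `topic_search_url()` is a hypothetical helper for illustration, not a programets function, and how `get_github_by_topic()` queries GitHub internally is not stated here.

```r
# Hypothetical helper (not part of programets): build a GitHub search URL
# for repositories tagged with a given topic.
topic_search_url <- function(topic) {
  # GitHub topics are lowercase, so normalize the project number first
  query <- paste0("topic:", tolower(trimws(topic)))
  paste0(
    "https://api.github.com/search/repositories?q=",
    utils::URLencode(query, reserved = TRUE)
  )
}

# Builds the URL only; no network request is made
topic_search_url("U24CA289073")
```

Opening the resulting URL in a browser is a quick way to confirm that a repository has been tagged correctly before collecting metrics.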
### 3. Get Citation Metrics from iCite
Calculate impact metrics including the Relative Citation Ratio (RCR) for your publications:
```{r icite_example, eval=FALSE}
# Get citation metrics for PubMed IDs
pmids <- c(26001965, 25015380)
citation_data <- icite(pmids)
# View RCR and other metrics
dplyr::select(citation_data, pmid, title, year,
              relative_citation_ratio, citation_count)
```
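Once citation data is in hand, a portfolio-level summary is often more useful than per-paper rows. The following is a minimal sketch in base R, assuming the result has the `relative_citation_ratio` and `citation_count` columns selected above and that papers without an RCR appear as `NA`. The data frame uses placeholder values for illustration only, and `summarize_rcr()` is a hypothetical helper, not a programets function.

```r
# Placeholder values for illustration only; these are not real iCite results.
citation_data <- data.frame(
  pmid = c(1, 2, 3),
  relative_citation_ratio = c(0.5, 1.2, NA),  # NA stands for a paper without an RCR
  citation_count = c(4, 30, 0)
)

# Hypothetical helper (not part of programets): one-row summary of a
# set of publications.
summarize_rcr <- function(df) {
  rcr <- df$relative_citation_ratio
  data.frame(
    n_pubs = nrow(df),
    n_with_rcr = sum(!is.na(rcr)),
    median_rcr = median(rcr, na.rm = TRUE),
    # Share of papers with an RCR above 1, ignoring papers without an RCR
    share_above_1 = mean(rcr > 1, na.rm = TRUE),
    total_citations = sum(df$citation_count)
  )
}

summarize_rcr(citation_data)
```

Reporting the median alongside the share above 1 keeps a single highly cited paper from dominating the summary.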
### 4. Search Europe PMC for Publications
Query millions of publications with flexible search syntax:
```{r epmc_example, eval=FALSE}
# Search for publications related to CRISPR
crispr_pubs <- epmc_search(query = "crispr", page_limit = 2)
# Search by author
author_pubs <- epmc_search(query = 'AUTH:"Smith J"', page_limit = 1)
```
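Field-qualified queries such as `AUTH:"Smith J"` can be combined with Boolean operators. The sketch below assembles such a query string in base R. `build_epmc_query()` is a hypothetical helper, not a programets function, and `PUB_YEAR` is assumed here to be a supported Europe PMC search field alongside `AUTH`.

```r
# Hypothetical helper (not part of programets): combine field-qualified
# terms into a single Europe PMC query string.
build_epmc_query <- function(..., operator = "AND") {
  terms <- c(...)
  fields <- names(terms)
  # Unnamed arguments are treated as free-text terms
  if (is.null(fields)) fields <- rep("", length(terms))
  # Quote values containing whitespace so they are searched as phrases
  values <- ifelse(grepl("\\s", terms), paste0('"', terms, '"'), terms)
  parts <- ifelse(nzchar(fields), paste0(fields, ":", values), values)
  paste(parts, collapse = paste0(" ", operator, " "))
}

query <- build_epmc_query("crispr", AUTH = "Smith J", PUB_YEAR = "2020")
query
```

With these inputs the helper yields `crispr AND AUTH:"Smith J" AND PUB_YEAR:2020`, which can then be passed as the `query` argument of `epmc_search()`.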
### 5. Access Google Analytics Data
Retrieve web traffic data for project websites (requires authentication):
```{r ga_example, eval=FALSE}
# Authenticate with Google (first time only)
googleAnalyticsR::ga_auth()
# Get traffic data
traffic <- ga_dataframe(
  property_id = "123456789",
  start_date = "2024-01-01",
  end_date = "2024-12-31",
  metrics = c("activeUsers", "sessions"),
  dimensions = c("date", "country")
)
```
## Authentication
### GitHub
For increased API rate limits, set up a Personal Access Token:
```{r github_token, eval=FALSE}
# Create a token with: usethis::create_github_token()
# Then use it in your calls:
df <- get_github_by_topic(c("u24ca289073"), token = "your_token_here")
```
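Hard-coding a token in a script risks committing it to version control. A common alternative is to read it from an environment variable, as in the sketch below. `github_token_from_env()` is a hypothetical helper, not a programets function, and `GITHUB_PAT` and `GITHUB_TOKEN` are assumed as variable names because they are widely used conventions.

```r
# Hypothetical helper (not part of programets): look up a GitHub token from
# environment variables instead of hard-coding it in a script.
github_token_from_env <- function(vars = c("GITHUB_PAT", "GITHUB_TOKEN")) {
  for (v in vars) {
    value <- Sys.getenv(v, unset = "")
    if (nzchar(value)) return(value)
  }
  NULL  # no token found in any of the listed variables
}

token <- github_token_from_env()
if (is.null(token)) {
  message("No GitHub token found; set GITHUB_PAT to raise your rate limit.")
}
```

When a token is found, it can be supplied through the `token` argument shown above. Whether `get_github_by_topic()` accepts `token = NULL` is not stated here, so check `?get_github_by_topic` before passing a missing token through.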
### Google Analytics
First-time setup requires authentication:
```{r ga_auth, eval=FALSE}
# Opens browser for Google account authorization
googleAnalyticsR::ga_auth()
# For non-interactive use, see DEVELOPER.md for service account setup
```
## Use Cases
- **Impact Reporting**: Generate comprehensive reports combining publication citations, web traffic, and GitHub engagement
- **Trend Analysis**: Track how metrics evolve over time in response to publications, presentations, or events
- **Portfolio Management**: Compare metrics across multiple projects or funding opportunities
- **Compliance**: Document project outputs and community engagement for progress reports
## Documentation
- **Function Reference**: See `?get_core_project_info`, `?icite`, `?get_github_by_topic`, etc.
- **Vignettes**: Run `browseVignettes("programets")` to browse detailed workflows
- **Developer Notes**: See `DEVELOPER.md` for advanced setup (service accounts, encryption)
## Getting Help
- File issues at: https://github.com/nih-cfde/programets/issues
- Review examples in the vignettes
- Check function documentation with `?function_name`
## Authors
- Sean Davis (seandavi@gmail.com)
- David Mayer (david.mayer@cuanschutz.edu)
## License
MIT