This repository contains web server access logs and error logs from the websites operated by the Cloud and Distributed Systems Laboratory (CDSL). The logs cover the period from April 2023 to February 2026.
All IP addresses in the logs have been anonymized. Each original IP address has been replaced with a unique identifier in the format IP###### (e.g., IP001234). The mapping is consistent within the dataset — the same original IP address always maps to the same anonymized identifier — so per-client request patterns remain analyzable without exposing real addresses.
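Because the mapping is consistent, requests can be grouped by anonymized identifier. The following is a minimal sketch, assuming the identifier is the first space-delimited token of each line (as in the example lines in this README) and always has exactly six digits. The function name `count_requests_per_client` and the sample lines are hypothetical and are not taken from the dataset.

```python
import re
from collections import Counter

# Assumption: identifiers are "IP" followed by exactly six digits (e.g. IP001234).
ANON_IP = re.compile(r"IP\d{6}")


def count_requests_per_client(lines):
    """Count log lines per anonymized client identifier.

    Lines whose first token is not an anonymized identifier are skipped.
    """
    counts = Counter()
    for line in lines:
        token = line.split(" ", 1)[0]
        if ANON_IP.fullmatch(token):
            counts[token] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    'IP001234 - - [31/Dec/2025:00:04:20 +0900] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"',
    'IP000042 - - [31/Dec/2025:00:04:21 +0900] "GET /a HTTP/1.1" 404 0 "-" "curl/8.0"',
    'IP001234 - - [31/Dec/2025:00:04:25 +0900] "GET /b HTTP/1.1" 200 128 "-" "curl/8.0"',
    "garbage line without an identifier",
]

for client, n in count_requests_per_client(sample).most_common():
    print(client, n)
```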
Logs are organized by month, then by daily backup snapshot:

    YYYYMM/
        log-backup-YYYYMMDDHHMMSS/          # Access logs
            access.<service>.log.1
            ...
        log-error-backup-YYYYMMDDHHMMSS/    # Error logs (available from October 2024)
            error.<service>.log.1
            ...

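The layout above can be walked programmatically. The sketch below assumes the directory names follow the `YYYYMM/` and `log-backup-YYYYMMDDHHMMSS/` patterns exactly, and that the 14-digit suffix is a timestamp. The helper names `describe_snapshot` and `iter_log_files` are hypothetical, and the directory name in the usage line is made up.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches "log-backup-<14 digits>" and "log-error-backup-<14 digits>".
SNAPSHOT_DIR = re.compile(r"log-(error-)?backup-(\d{14})")


def describe_snapshot(path):
    """Return (month, kind, snapshot_time) for a snapshot directory, or None.

    `month` is the name of the parent directory (YYYYMM), and `kind` is
    "access" or "error" depending on the directory name.
    """
    path = Path(path)
    m = SNAPSHOT_DIR.fullmatch(path.name)
    if m is None:
        return None
    kind = "error" if m.group(1) else "access"
    taken = datetime.strptime(m.group(2), "%Y%m%d%H%M%S")
    return path.parent.name, kind, taken


def iter_log_files(root):
    """Yield (month, kind, snapshot_time, file_path) for each log file under root."""
    for snap in sorted(Path(root).glob("*/log-*backup-*")):
        info = describe_snapshot(snap)
        if info is None or not snap.is_dir():
            continue
        for f in sorted(snap.iterdir()):
            if f.is_file():
                yield (*info, f)


# Made-up directory name, for illustration only.
print(describe_snapshot("202512/log-error-backup-20251231000500"))
```

`iter_log_files` takes the path of a local checkout of the repository as `root`.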
All HTTP access logs use the Nginx combined log format, extended with a trailing "$http_x_forwarded_for" field:
$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"
Example:
IP143527 - - [31/Dec/2025:00:04:20 +0900] "HEAD / HTTP/1.1" 405 0 "https://doktor.tak-cslab.org/" "Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)"
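A line in this format can be split into named fields with a regular expression. The sketch below is one possible parser, not an official tool for the dataset. It treats the trailing `$http_x_forwarded_for` field as optional, because the example line above does not include it. It also assumes quoted fields contain no unescaped double quotes, and that month names can be parsed in an English or C locale. The function name `parse_http_line` is hypothetical.

```python
import re
from datetime import datetime

HTTP_LINE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>.*?)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>.*?)" '
    r'"(?P<http_user_agent>.*?)"'
    r'(?: "(?P<http_x_forwarded_for>.*?)")?'  # optional trailing field
)


def parse_http_line(line):
    """Parse one HTTP access log line into a dict, or return None if it does not match."""
    m = HTTP_LINE.fullmatch(line.rstrip("\n"))
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    rec["body_bytes_sent"] = int(rec["body_bytes_sent"])
    # %b is locale-dependent; this assumes English month abbreviations.
    rec["time_local"] = datetime.strptime(rec["time_local"], "%d/%b/%Y:%H:%M:%S %z")
    return rec


example = (
    'IP143527 - - [31/Dec/2025:00:04:20 +0900] "HEAD / HTTP/1.1" 405 0 '
    '"https://doktor.tak-cslab.org/" '
    '"Mozilla/5.0+(compatible; UptimeRobot/2.0; http://www.uptimerobot.com/)"'
)
rec = parse_http_line(example)
print(rec["remote_addr"], rec["status"], rec["request"])
```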
The TCP stream proxy logs (access.nisl-1-stream.log.1) use a different format produced by the Nginx stream module:
$remote_addr [$time_local] $protocol $status $bytes_sent $bytes_received $session_time
Example:
IP277299 [21/Dec/2025:10:30:27 +0900] TCP 502 0 0 3.012
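The stream format has no quoted fields, so a simpler pattern is enough. As a sketch under the same assumptions as any hand-written parser (single spaces between fields, English month abbreviations), with the hypothetical function name `parse_stream_line`:

```python
import re
from datetime import datetime

STREAM_LINE = re.compile(
    r'(?P<remote_addr>\S+) \[(?P<time_local>[^\]]+)\] '
    r'(?P<protocol>\S+) (?P<status>\d{3}) '
    r'(?P<bytes_sent>\d+) (?P<bytes_received>\d+) '
    r'(?P<session_time>\d+(?:\.\d+)?)'
)


def parse_stream_line(line):
    """Parse one stream proxy log line into a dict, or return None if it does not match."""
    m = STREAM_LINE.fullmatch(line.rstrip("\n"))
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    rec["bytes_sent"] = int(rec["bytes_sent"])
    rec["bytes_received"] = int(rec["bytes_received"])
    rec["session_time"] = float(rec["session_time"])  # seconds
    # %b is locale-dependent; this assumes English month abbreviations.
    rec["time_local"] = datetime.strptime(rec["time_local"], "%d/%b/%Y:%H:%M:%S %z")
    return rec


rec = parse_stream_line("IP277299 [21/Dec/2025:10:30:27 +0900] TCP 502 0 0 3.012")
print(rec["protocol"], rec["status"], rec["session_time"])
```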
| Filename | URL | Available From | Description |
|---|---|---|---|
| access.doktor.log.1 | https://doktor.tak-cslab.org | Apr 2023 | Academic paper search web service built on a microservice architecture. Source code: cdsl-research/doktor-v2 |
| access.wp.log.1 | https://ja.tak-cslab.org/ | Apr 2023 | Public-facing WordPress website. Contains the lab blog and research introduction pages. |
| access.rudder.log.1 | https://rudder.tak-cslab.org/ | Apr 2023 | Internal lab introduction website. Access is controlled via Google Account SSO. |
| access.nisl.tak-cslab.org.log.1 | http://nisl.tak-cslab.org | Apr 2023 | HTTP reverse proxy that fronts a Kubernetes cluster exposed for joint research purposes. |
| access.lily.log.1 | — | May 2023 | Internal website operated by the lab. |
| access.clematis.tak-cslab.org.log.1 | https://clematis.tak-cslab.org | Nov 2023 | Internal website operated by the lab. |
| access.nisl-1-stream.log.1 | — | Mar 2024 | TCP-level stream proxy for direct Kubernetes API access (port 6443). Uses Nginx stream module format. |
| access.forsythia.tak-cslab.org.log.1 | https://forsythia.tak-cslab.org | Apr 2024 | Internal website operated by the lab. |
| access.harvest.tak-cslab.org.log.1 | https://harvest.tak-cslab.org | Nov 2024 | Experimental website used for research purposes. |
Error logs are available from October 2024 onwards. They are stored in separate log-error-backup-* directories within each monthly folder.
| Filename | Description |
|---|---|
| error.nisl.tak-cslab.org.log.1 | Nginx error log for the NISL HTTP reverse proxy |
| error.nisl-1-stream.log.1 | Nginx error log for the NISL TCP stream proxy |
| error.clematis.tak-cslab.org.log.1 | Nginx error log for the Clematis site |
| error.forsythia.tak-cslab.org.log.1 | Nginx error log for the Forsythia site |
| error.harvest.tak-cslab.org.log.1 | Nginx error log for the Harvest site |
| error.lily.log.1 | Nginx error log for the Lily site |
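Some filenames carry a `.tak-cslab.org` domain suffix and others do not, so grouping files by service needs a small normalization step. The sketch below assumes every log file is named `access.<service>.log.1` or `error.<service>.log.1`, as listed above. Stripping the domain suffix to obtain a short service name is a convention chosen for this example, not something the dataset prescribes. The function name `split_log_name` is hypothetical.

```python
import re

LOG_NAME = re.compile(r"(?P<kind>access|error)\.(?P<name>.+)\.log\.1")
DOMAIN_SUFFIX = ".tak-cslab.org"


def split_log_name(filename):
    """Return (kind, service) for a log filename, or None if it does not match.

    `kind` is "access" or "error"; `service` is the middle part of the
    filename with any ".tak-cslab.org" suffix removed.
    """
    m = LOG_NAME.fullmatch(filename)
    if m is None:
        return None
    name = m.group("name")
    if name.endswith(DOMAIN_SUFFIX):
        name = name[: -len(DOMAIN_SUFFIX)]
    return m.group("kind"), name


for fn in ("access.nisl.tak-cslab.org.log.1", "access.nisl-1-stream.log.1", "error.lily.log.1"):
    print(fn, "->", split_log_name(fn))
```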
Total: 6,082 files / 5,736,834 log entries
Access logs:

| Service | Files | Entries |
|---|---|---|
| doktor | 623 | 529,155 |
| wp | 953 | 4,786,964 |
| rudder | 478 | 478 |
| nisl (HTTP) | 780 | 30,232 |
| lily | 546 | 38,287 |
| clematis | 543 | 230,985 |
| nisl-1-stream (TCP) | 331 | 5,857 |
| forsythia | 378 | 56,270 |
| harvest | 218 | 38,715 |
| Total | 4,850 | 5,716,943 |

Error logs:

| Service | Files | Entries |
|---|---|---|
| nisl (HTTP) | 206 | 1,273 |
| lily | 203 | 2,030 |
| clematis | 204 | 246 |
| nisl-1-stream (TCP) | 231 | 5,847 |
| forsythia | 203 | 406 |
| harvest | 185 | 10,089 |
| Total | 1,232 | 19,891 |
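The totals can be cross-checked against the per-service rows. The short sketch below copies the figures from the two tables above into dictionaries (the variable and function names are hypothetical) and sums them.

```python
# (files, entries) per service, copied from the tables above.
access_logs = {
    "doktor": (623, 529_155),
    "wp": (953, 4_786_964),
    "rudder": (478, 478),
    "nisl (HTTP)": (780, 30_232),
    "lily": (546, 38_287),
    "clematis": (543, 230_985),
    "nisl-1-stream (TCP)": (331, 5_857),
    "forsythia": (378, 56_270),
    "harvest": (218, 38_715),
}
error_logs = {
    "nisl (HTTP)": (206, 1_273),
    "lily": (203, 2_030),
    "clematis": (204, 246),
    "nisl-1-stream (TCP)": (231, 5_847),
    "forsythia": (203, 406),
    "harvest": (185, 10_089),
}


def totals(table):
    """Sum the file and entry counts over all services in a table."""
    files = sum(f for f, _ in table.values())
    entries = sum(e for _, e in table.values())
    return files, entries


access_total = totals(access_logs)
error_total = totals(error_logs)
print("access:", access_total)
print("error: ", error_total)
print("all:   ", (access_total[0] + error_total[0], access_total[1] + error_total[1]))
```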
If you use this dataset in your research, please cite it as follows:
BibTeX:
@misc{cdsl-website-access-logs,
author = {{Cloud and Distributed Systems Laboratory}},
title = {Website Access Logs --- Cloud and Distributed Systems Laboratory},
year = {2026},
howpublished = {\url{https://github.com/cdsl-research/website-access-logs-public}},
note = {Accessed: \today}
}
APA:
Cloud and Distributed Systems Laboratory. (2026). Website Access Logs — Cloud and Distributed Systems Laboratory [Dataset]. GitHub. https://github.com/cdsl-research/website-access-logs-public
| Service | Access Log Period | Error Log Period |
|---|---|---|
| doktor | Apr 2023 – Feb 2026 | — |
| wp (WordPress) | Apr 2023 – Feb 2026 | — |
| rudder | Apr 2023 – Feb 2026 | — |
| nisl (HTTP) | Apr 2023 – Feb 2026 | Oct 2024 – Feb 2026 |
| lily | May 2023 – Feb 2026 | Oct 2024 – Feb 2026 |
| clematis | Nov 2023 – Feb 2026 | Oct 2024 – Feb 2026 |
| nisl-1-stream (TCP) | Mar 2024 – Feb 2026 | Oct 2024 – Feb 2026 |
| forsythia | Apr 2024 – Feb 2026 | Oct 2024 – Feb 2026 |
| harvest | Nov 2024 – Feb 2026 | Oct 2024 – Feb 2026 |