Skip to content

Commit 890492d

Browse files
authored
Modified Vedmaka's PR with Jeffrey's suggestions (#12)
* Squashed commit of the following: commit d701cde Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:50:37 2025 +0200 Phan! commit 21f3e73 Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:48:59 2025 +0200 Phan commit 7a52559 Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:45:55 2025 +0200 Phan commit 4dac2fa Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:45:17 2025 +0200 Optimises config values retrival commit 5362e2b Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:44:41 2025 +0200 Allows for prefixed page names match, updates README.md commit 393e473 Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:41:38 2025 +0200 Phan commit c79c0d5 Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:40:29 2025 +0200 Code style commit 8c3c670 Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 18:38:40 2025 +0200 Updates README.md with details on Configuration variables commit 9d90dce Author: Vedmaka <god.vedmaka@gmail.com> Date: Thu Oct 23 13:57:36 2025 +0200 Add configuration options for crawler protected special pages and improves fast deny logic * ensure mobilediff is in the default list * get the correct variable for 'denyFast' * change preference config and function names around 418 HTTP header rename CrawlerProtectionDenyFast to CrawlerProtectionUse418 rename denyAccessFast() to denyAccessWith418() The function still sets an internal variable $denyFast to show the intent of a short circuit. * add Unit Test to cover the $denyFast branch New Test: testSpecialPageCallsDenyAccessWith418WhenConfigured Purpose: Tests that when an anonymous user accesses a protected special page and CrawlerProtectionUse418 config is enabled, the denyAccessWith418() method is called Coverage: Verifies the conditional branch if ( $denyFast ) at line 112 Assertions: - Confirms denyAccessWith418() is called exactly once - Confirms denyAccess() is still called after the 418 response - Verifies the method returns false to abort execution Supporting Changes: - Updated namespaced-stubs.php: Added MediaWikiServices stub with configuration support for CrawlerProtectedSpecialPages and CrawlerProtectionUse418 - Fixed existing tests: Added denyAccessWith418 to the mocked methods list to prevent actual header modification during tests All 19 tests are now passing, including the new test that specifically covers the $denyFast branch. * refactor magic word "Special:" to a constant variable * normalize list of specials and perform a single in_array check * update README Note that on merge, the extension page https://www.mediawiki.org/wiki/Extension:CrawlerProtection should be updated. * change tabs to spaces on new code * Add resetForTesting() in stub, and tearDown() in test Called automatically after ever test, the tearDown method ensures that the MediaWikiServices singleton is reset to null avoiding test pollution * use I'm a teapot in HTTP header and message body * reformat lines wrapped at column 80 * Add docker-compose-ci and run ci on branch * phpcbf fixes for MediaWiki coding standards * expand require-dev; add scripts section parrot the dev requirements of MediaWiki so tools are more easily accessible under different scenarios * used for local development when running composer phpcs From inside the extension directory, this configuration is used. the GitHub Actions workflow doesn't use it because it specifies the standard directly on the command line with its own --standard parameter. * Add Config interface and update HooksTest for testUse418 flag ## Understanding the tearDown() Method ### Purpose and Context The tearDown() method is a PHPUnit lifecycle hook that runs automatically after each individual test method completes. This ensures that tests remain isolated from each other by cleaning up any state changes that occurred during test execution. In MediaWiki extension development, this is particularly important because the framework uses singleton patterns and global state that can leak between tests, potentially causing test pollution. ### Method Signature The method is declared as protected, which means it's accessible to the test class and any subclasses, but not from outside the class hierarchy. The void return type indicates this method doesn't return any value—it performs cleanup operations as a side effect. This signature follows PHPUnit's conventions for test lifecycle methods. ### Parent Class Cleanup The first operation, parent::tearDown(), calls the parent class's tearDown implementation. This is crucial because it ensures that any cleanup logic defined in PHPUnit's base test classes (like MediaWikiIntegrationTestCase) executes properly. Skipping this call could result in incomplete cleanup and unpredictable test behavior. ### Test Configuration Reset The code checks if MediaWikiServices has a testUse418 property and resets it to false. This property is a test-specific flag (controlling whether to use HTTP 418 status codes in tests). The property existence check using property_exists() is defensive programming—it prevents errors if this test-specific property doesn't exist in certain MediaWiki versions or test environments. ### Service Container Reset The final block resets MediaWiki's service container using resetForTesting(). This is critical because MediaWiki uses a dependency injection container that caches service instances as singletons. Without resetting this between tests, modifications to services in one test would affect subsequent tests. The method existence check makes the code compatible with test environments where MediaWikiServices might be a stub without the full reset functionality. Cross-Version Compatibility Pattern Notice how this code uses multiple defensive checks (property_exists(), method_exists()) rather than assuming certain properties or methods exist. This is a common pattern when writing tests that need to work across different MediaWiki versions, where the internal API may vary. It's also necessary here because the test is working with test doubles/stubs that may not implement the full MediaWikiServices interface. - Change type hinting to comply with MediaWiki coding standards replaced all object type hints with the proper PHPUnit mock object type \PHPUnit\Framework\MockObject\MockObject. This satisfies the MediaWiki coding standards which require specific type declarations instead of generic object. - skip test when it is not neccessary - change expectations to match code paths * do not ignore 'build' - it is a submodule and needs tracking * satisfy MediaWiki coding standards with function docblocks * simplify codesniffs - add doc comments to control skipped sniffs - change to extension dir to pick up our configuration automatically - use stdclass instead of object typehint for rule conformance * remove bad syntax (comments) from yaml * fix CI in GitHub environment * Add docs and fix "last" CI error * Disable the ClassMatchesFilename phpcs sniff for our namespaced-stubs * remove the global config setup since these are unit tests, not integration tests * refine setup and teardown of tests ## Docker with stubs: Uses the stub MediaWikiServices which provides config via the anonymous Config class ## GitHub Actions with real MediaWiki: Sets $GLOBALS['wgCrawlerProtectedSpecialPages'] and $GLOBALS['wgCrawlerProtectionUse418'] in setUp(), which GlobalVarConfig can read The setUp() method sets the globals before each test (only in MediaWiki environment), and tearDown() cleans them up after each test. This ensures tests don't pollute each other and the config is available when needed. * skip Services tests in GitHub Actions with real MediaWiki All the tests that access MediaWikiServices::getInstance() through the real Hooks::onSpecialPageBeforeExecute method now skip when running in MediaWiki's test environment. In the GitHub Actions environment with real MediaWiki: testRevisionTypeBlocksAnonymous - passes (doesn't access config) testRevisionTypeAllowsLoggedIn - passes (doesn't access config) testNonRevisionTypeAlwaysAllowed - passes (doesn't access config) testSpecialPageBlocksAnonymous - skipped (would access config) testSpecialPageAllowsLoggedIn - skipped (would access config) testUnblockedSpecialPageAllowsAnonymous - skipped (would access config) testSpecialPageCallsDenyAccessWith418WhenConfigured - skipped (would access config) In the Docker stub environment: All 19 tests run successfully The tests still provide coverage in the Docker environment where they're designed to work with stubs, while avoiding the "premature service access" errors in GitHub Actions CI.
1 parent 333596c commit 890492d

15 files changed

Lines changed: 597 additions & 21 deletions

File tree

.github/CI-SETUP.md

Lines changed: 144 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,144 @@
1+
# CI Setup - Docker-Based Local Testing
2+
3+
## ✅ Prerequisites
4+
5+
- Docker
6+
- Docker Compose
7+
- Make
8+
- Git
9+
10+
## 🚀 Quick Start (Recommended)
11+
12+
### 1. Add docker-compose-ci as submodule
13+
```bash
14+
cd /home/greg/src/CrawlerProtection
15+
16+
# Remove build/ from .gitignore if present (it's a submodule now)
17+
sed -i '/^build\/$/d' .gitignore
18+
19+
# Add the submodule
20+
git submodule add https://github.com/gesinn-it-pub/docker-compose-ci.git build
21+
git add .gitignore .gitmodules build Makefile
22+
git commit -m "Add docker-compose-ci for local testing"
23+
```
24+
25+
### 2. Initialize submodule (for fresh clones)
26+
```bash
27+
# When cloning the repo in the future, use:
28+
git clone --recursive https://github.com/freephile/CrawlerProtection.git
29+
30+
# Or if already cloned without --recursive:
31+
git submodule update --init --recursive
32+
```
33+
34+
### 3. Run CI Tests
35+
The `Makefile` is already configured. Just run:
36+
```bash
37+
make ci # Run all CI checks
38+
make ci-coverage # Run with coverage
39+
make bash # Enter container to run commands manually
40+
make down # Stop containers
41+
```
42+
43+
## 🔧 What Gets Tested
44+
45+
The `make ci` command runs:
46+
- **Lint** - PHP syntax checking (parallel-lint)
47+
- **PHPCS** - Code style validation (MediaWiki standards)
48+
- **PHPUnit** - Unit tests
49+
50+
All in a container with the correct PHP version, extensions, and MediaWiki setup!
51+
52+
## 📋 Common Commands
53+
54+
```bash
55+
# Run all tests
56+
make ci
57+
58+
# Run specific tests inside container
59+
make bash
60+
> composer phpcs # Code style check
61+
> composer phpcbf # Auto-fix code style
62+
> composer phpunit # Run PHPUnit tests
63+
> composer test # Run phpcs + phpunit
64+
65+
# Test with different MediaWiki versions
66+
MW_VERSION=1.39 make ci
67+
MW_VERSION=1.43 PHP_VERSION=8.3 make ci
68+
69+
# Clean up
70+
make down
71+
make clean
72+
```
73+
74+
## 🌐 Access Wiki in Browser
75+
76+
Create `build/docker-compose.override.yml`:
77+
```yaml
78+
services:
79+
wiki:
80+
ports:
81+
- 8080:8080
82+
```
83+
84+
Then start: `make up` and visit http://localhost:8080
85+
86+
## 🔄 Update Docker CI
87+
88+
```bash
89+
git submodule update --init --remote
90+
```
91+
92+
## 📝 Environment Variables
93+
94+
Create `.env` file to customize:
95+
```bash
96+
MW_VERSION=1.43
97+
PHP_VERSION=8.2
98+
DB_TYPE=sqlite
99+
EXTENSION=CrawlerProtection
100+
```
101+
102+
## ⚡ Quick Fixes Before Commit
103+
104+
```bash
105+
# Auto-fix code style issues
106+
make bash
107+
> composer phpcbf
108+
109+
# Check what will fail in CI
110+
make ci
111+
```
112+
113+
## 🐛 Troubleshooting
114+
115+
**"build directory not found"**
116+
```bash
117+
git submodule update --init --remote
118+
```
119+
120+
**"Container keeps restarting"**
121+
```bash
122+
make down
123+
make clean
124+
make ci
125+
```
126+
127+
**"Permission denied"**
128+
```bash
129+
sudo chmod -R 777 cache/
130+
```
131+
132+
## 🎯 GitHub Actions Setup
133+
134+
Your `.github/workflows/ci.yml` already exists and will run automatically on:
135+
- Pushes to `main` or `specialPageList` branches
136+
- All pull requests
137+
138+
Check results at: https://github.com/freephile/CrawlerProtection/actions
139+
140+
## 🔗 Resources
141+
142+
- [docker-compose-ci documentation](https://github.com/gesinn-it-pub/docker-compose-ci)
143+
- [MediaWiki coding conventions](https://www.mediawiki.org/wiki/Manual:Coding_conventions)
144+
- Your GitHub Actions: https://github.com/freephile/CrawlerProtection/actions

.github/DOCKER-CI-QUICKREF.md

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
# Docker CI Quick Reference
2+
3+
## Setup (One Time)
4+
```bash
5+
git submodule add https://github.com/gesinn-it-pub/docker-compose-ci.git build
6+
git add .gitignore .gitmodules build Makefile TESTING.md
7+
git commit -m "Add docker-compose-ci for local testing"
8+
```
9+
10+
## Daily Use
11+
12+
```bash
13+
make ci # Run all checks (before commit)
14+
make bash # Fix issues manually
15+
> composer phpcbf # Auto-fix code style
16+
make down # Clean up
17+
```
18+
19+
## All Commands
20+
21+
| Command | Purpose |
22+
|---------|---------|
23+
| `make ci` | Run all CI checks (lint, phpcs, phpunit) |
24+
| `make bash` | Enter container shell |
25+
| `make up` | Start wiki (http://localhost:8080) |
26+
| `make down` | Stop all containers |
27+
| `make clean` | Remove all containers and volumes |
28+
29+
## Inside Container (`make bash`)
30+
31+
| Command | Purpose |
32+
|---------|---------|
33+
| `composer test` | Run phpcs + phpunit |
34+
| `composer phpcs` | Check code style |
35+
| `composer phpcbf` | Fix code style automatically |
36+
| `composer phpunit` | Run unit tests |
37+
38+
## Test Different Versions
39+
40+
```bash
41+
MW_VERSION=1.39 PHP_VERSION=8.1 make ci # Test MW 1.39 + PHP 8.1
42+
MW_VERSION=1.43 PHP_VERSION=8.3 make ci # Test MW 1.43 + PHP 8.3
43+
```
44+
45+
## Troubleshooting
46+
47+
```bash
48+
make down && make clean # Nuclear option: clean everything
49+
git submodule update --init --remote # Update docker-compose-ci
50+
```
51+
52+
## See Also
53+
54+
- Full docs: `.github/CI-SETUP.md`
55+
- Testing guide: `TESTING.md`
56+
- Your CI runs: https://github.com/freephile/CrawlerProtection/actions

.github/workflows/ci.yml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ on:
77
push:
88
branches:
99
- main
10+
- specialPageList # Add your development branch
1011
pull_request:
1112

1213
env:
@@ -58,7 +59,7 @@ jobs:
5859
- name: Lint
5960
run: ./vendor/bin/parallel-lint --exclude node_modules --exclude vendor extensions/${{ env.EXTNAME }}
6061
- name: PHP Code Sniffer
61-
run: ./vendor/bin/phpcs -sp --standard=vendor/mediawiki/mediawiki-codesniffer/MediaWiki extensions/${{ env.EXTNAME }}
62+
run: cd extensions/${{ env.EXTNAME }} && ../../vendor/bin/phpcs -sp
6263

6364
security:
6465
name: Static Analysis

.gitignore

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,8 @@ __pycache__/
88

99
# Distribution / packaging
1010
.Python
11-
build/
11+
# do not ignore build because it is a submodule
12+
# build/
1213
develop-eggs/
1314
dist/
1415
downloads/

.gitmodules

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
[submodule "build"]
2+
path = build
3+
url = https://github.com/gesinn-it-pub/docker-compose-ci.git

.phpcs.xml

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
<?xml version="1.0"?>
2+
<ruleset name="CrawlerProtection">
3+
<description>MediaWiki coding standards for CrawlerProtection extension</description>
4+
5+
<rule ref="../../vendor/mediawiki/mediawiki-codesniffer/MediaWiki">
6+
<exclude name="MediaWiki.Commenting.FunctionComment.MissingDocumentationProtected" />
7+
<exclude name="MediaWiki.Commenting.FunctionComment.MissingDocumentationPublic" />
8+
</rule>
9+
10+
<!-- Paths to check -->
11+
<file>.</file>
12+
13+
<!-- Paths to ignore -->
14+
<exclude-pattern>*/build/*</exclude-pattern>
15+
<exclude-pattern>*/vendor/*</exclude-pattern>
16+
<exclude-pattern>*/node_modules/*</exclude-pattern>
17+
<exclude-pattern>*.phan/*</exclude-pattern>
18+
19+
<!-- Show progress -->
20+
<arg name="colors"/>
21+
<arg name="encoding" value="UTF-8"/>
22+
<arg name="extensions" value="php"/>
23+
<arg value="sp"/>
24+
25+
<!-- Memory limit -->
26+
<ini name="memory_limit" value="512M"/>
27+
</ruleset>

Makefile

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
-include .env
2+
export
3+
4+
# setup for docker-compose-ci build directory
5+
ifeq (,$(wildcard ./build/))
6+
$(shell git submodule update --init --remote)
7+
endif
8+
9+
EXTENSION=CrawlerProtection
10+
11+
# docker images
12+
MW_VERSION?=1.43
13+
PHP_VERSION?=8.2
14+
DB_TYPE?=mysql
15+
DB_IMAGE?="mariadb:11.2"
16+
17+
# composer
18+
# Enables "composer update" inside of extension
19+
# Leave empty/unset to disable, set to "true" to enable
20+
COMPOSER_EXT?=
21+
22+
# nodejs
23+
# Enables node.js related tests and "npm install"
24+
# Leave empty/unset to disable, set to "true" to enable
25+
NODE_JS?=
26+
27+
include build/Makefile

README.md

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,2 +1,22 @@
11
# CrawlerProtection
2-
Protect wikis against crawler bots
2+
3+
Protect wikis against crawler bots. CrawlerProtection denies **anonymous** user
4+
access to certain MediaWiki action URLs and SpecialPages which are resource
5+
intensive.
6+
7+
# Configuration
8+
9+
* `$wgCrawlerProtectedSpecialPages` - array of special pages to protect
10+
(default: `[ 'mobilediff', 'recentchangeslinked', 'whatlinkshere' ]`).
11+
Supported values are special page names or their aliases regardless of case.
12+
You do not need to use the 'Special:' prefix. Note that you can fetch a full
13+
list of SpecialPages defined by your wiki using the API and jq with a simple
14+
bash one-liner like
15+
`curl -s "[YOURWIKI]api.php?action=query&meta=siteinfo&siprop=specialpagealiases&format=json" | jq -r '.query.specialpagealiases[].aliases[]' | sort`
16+
Of course certain Specials MUST be allowed like Special:Login so do not block
17+
everything.
18+
* `$wgCrawlerProtectionUse418` - drop denied requests in a quick way via
19+
`die();` with
20+
[418 I'm a teapot](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/418)
21+
code (default: `false`)
22+

TESTING.md

Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
# Local CI Testing with Docker
2+
3+
This extension uses [docker-compose-ci](https://github.com/gesinn-it-pub/docker-compose-ci) for local testing.
4+
5+
## Quick Start
6+
7+
```bash
8+
# One-time setup (if not already done)
9+
git submodule update --init --recursive
10+
11+
# Run all CI checks (lint, phpcs, phpunit)
12+
make ci
13+
14+
# Auto-fix code style issues
15+
make bash
16+
> composer phpcbf
17+
18+
# Stop containers
19+
make down
20+
```
21+
22+
## Why Docker?
23+
24+
- ✅ Same environment as GitHub Actions CI
25+
- ✅ Correct PHP version, extensions, and MediaWiki automatically
26+
- ✅ No need to install MediaWiki locally
27+
- ✅ Test against multiple MW/PHP versions easily
28+
- ✅ Isolated from your local system
29+
30+
## Common Commands
31+
32+
```bash
33+
make ci # Run all CI checks
34+
make bash # Enter container shell
35+
make up # Start wiki (http://localhost:8080)
36+
make down # Stop containers
37+
make clean # Remove containers and volumes
38+
```
39+
40+
## Test Different Versions
41+
42+
```bash
43+
# Test with MediaWiki 1.39 and PHP 8.1
44+
MW_VERSION=1.39 PHP_VERSION=8.1 make ci
45+
46+
# Test with MediaWiki 1.43 and PHP 8.3
47+
MW_VERSION=1.43 PHP_VERSION=8.3 make ci
48+
```
49+
50+
## Available Composer Scripts
51+
52+
Inside the container (`make bash`):
53+
54+
```bash
55+
composer test # Run phpcs + phpunit
56+
composer phpcs # Check code style
57+
composer phpcbf # Fix code style
58+
composer phpunit # Run unit tests
59+
```
60+
61+
## Update Docker CI
62+
63+
```bash
64+
git submodule update --init --remote
65+
```
66+
67+
See `.github/CI-SETUP.md` and `.github/DOCKER-CI-QUICKREF.md` for more details.

0 commit comments

Comments
 (0)