Skip to content
144 changes: 127 additions & 17 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,31 +1,141 @@
# Whitelist approach - ignore everything except specified files
# This approach provides better security by denying all files by default
# and explicitly allowing only essential development files

# ========================================
# DENY ALL BY DEFAULT
# ========================================
/*
Comment thread
MarjovanLier marked this conversation as resolved.
Outdated

# ========================================
# ALLOW DIRECTORY TRAVERSAL (CRITICAL)
# ========================================
# Without this pattern, Git cannot traverse subdirectories
# to check for whitelisted files within them
!*/

# ========================================
# CORE APPLICATION FILES
# ========================================
!*.php
!composer.json
!LICENSE

# ========================================
# DOCUMENTATION
# ========================================
!README.md
!CONTRIBUTING.md
!CHANGELOG.md

# ========================================
# SOURCE CODE & TESTS
# ========================================
!src/
!src/**
!tests/
!tests/**

# ========================================
# CONFIGURATION FILES
# ========================================
!phpunit.xml
!phpcs.xml
Comment thread
coderabbitai[bot] marked this conversation as resolved.
Outdated
!phpstan.neon
!psalm.xml
!phpmd.xml
!pint.json
!rector.php
!infection.json5

Comment thread
MarjovanLier marked this conversation as resolved.
# ========================================
# CI/CD & GITHUB
# ========================================
!.github/
!.github/**
!.pre-commit-config.yaml
Comment thread
MarjovanLier marked this conversation as resolved.
!.codacy.yaml

# ========================================
# DOCKER & INFRASTRUCTURE
# ========================================
!Dockerfile
!docker-compose.yml

# ========================================
# DEVELOPMENT SCRIPTS
# ========================================
!*.sh

# ========================================
# NODE.JS CONFIGURATION (if present)
# ========================================
!package.json
!commitlint.config.js

# ========================================
# ADDITIONAL CONFIGURATIONS
# ========================================
!.coderabbit.yaml
!.dockerignore
!.pr_agent.toml
!sweep.yaml

# ========================================
# GIT CONFIGURATION
# ========================================
!.gitignore
!.gitattributes
!.gitmessage

# ========================================
# EXPLICITLY DENIED ITEMS
# (These remain ignored even with whitelist)
# ========================================
# Dependencies and lock files
vendor/
node_modules/
composer.lock
vendor
tests/temp
.idea
package-lock.json

# Cache and temporary files
.phpunit.cache
.phpunit.result.cache
.php-cs-fixer.cache
reports

.qodo
*.tmp

# Qodana
# Build artifacts and reports
reports/
.qodana/
qodana.yaml
qodana.sarif.json
.qodana/

# Temporary files
commit_messages.txt
*.tmp
# IDE and editor files
.idea/
.vscode/
*.swp
*.swo

# AI tooling directories (private)
.claude/
.claude-flow/
.github
Comment thread
MarjovanLier marked this conversation as resolved.
Outdated
.hive-mind/
.kilocode/
.roo/
.qodo/

# Private documentation
CLAUDE.local.md
AGENTS.md

# Docker
# Docker overrides
.docker/
docker-compose.override.yml

# Pre-commit
# Pre-commit cache
.pre-commit/

# Node modules
node_modules/
package-lock.json
.php-cs-fixer.cache
# System files
.DS_Store
Thumbs.db
117 changes: 88 additions & 29 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,17 +4,23 @@

- [Introduction](#introduction)
- [Features](#features)
- [Performance Benchmarks](#performance-benchmarks)
- [Installation](#installation)
- [Usage](#usage)
- [Advanced Usage](#advanced-usage)
- [Testing](#testing)
- [Testing & Quality Assurance](#testing--quality-assurance)
- [System Requirements](#system-requirements)
- [Contributing](#contributing)
- [Support](#support)

## Introduction

Welcome to the `StringManipulation` library, a robust and efficient PHP toolkit designed to enhance string handling in
your PHP projects. With its user-friendly interface and performance-oriented design, this library is an essential
addition for developers looking to perform complex string manipulations with ease.
Welcome to the `StringManipulation` library, a high-performance PHP 8.3+ toolkit designed for complex and efficient
string handling. Following a recent suite of O(n) optimisations, the library is now **2-5x faster**, making it one of
the most powerful and reliable solutions for developers who require speed and precision in their PHP applications.

This library specialises in Unicode handling, data normalisation, encoding conversion, and validation with comprehensive
testing and quality assurance.

[![Packagist Version](https://img.shields.io/packagist/v/marjovanlier/stringmanipulation)](https://packagist.org/packages/marjovanlier/stringmanipulation)
[![Packagist Downloads](https://img.shields.io/packagist/dt/marjovanlier/stringmanipulation)](https://packagist.org/packages/marjovanlier/stringmanipulation)
Expand All @@ -25,20 +31,46 @@ addition for developers looking to perform complex string manipulations with eas
[![Phan Enabled](https://img.shields.io/badge/Phan-enabled-brightgreen.svg?style=flat)](https://github.com/phan/phan/)
[![Psalm Enabled](https://img.shields.io/badge/Psalm-enabled-brightgreen.svg?style=flat)](https://psalm.dev/)
[![codecov](https://codecov.io/github/MarjovanLier/StringManipulation/graph/badge.svg?token=lBTpWlSq37)](https://codecov.io/github/MarjovanLier/StringManipulation)
[![Qodana](https://github.com/MarjovanLier/StringManipulation/actions/workflows/qodana_code_quality.yml/badge.svg)](https://github.com/MarjovanLier/StringManipulation/actions/workflows/qodana_code_quality.yml)

## Features

- **Search Words**: Transform strings into a search-optimised format for database queries, removing unnecessary
characters and optimising for search engine algorithms.
- **Name Fix**: Standardise last names by capitalising the first letter of each part of the name and handling prefixes
correctly, ensuring consistency across your data.
- **UTF-8 to ANSI**: Convert UTF-8 encoded characters to their ANSI equivalents, facilitating compatibility with systems
that do not support UTF-8.
- **Remove Accents**: Strip accents and special characters from strings to normalise text, making it easier to search
and compare.
- **Date Validation**: Ensure date strings conform to specified formats and check for logical consistency, such as
correct days in a month.
- **`removeAccents()`**: Efficiently strips accents and diacritics to normalise text. Powered by O(n) optimisations
using hash table lookups, this high-performance feature makes text comparison and searching faster than ever (981,436+
ops/sec).
- **`searchWords()`**: Transforms strings into a search-optimised format ideal for database queries. This
high-performance function intelligently removes irrelevant characters and applies single-pass algorithms to improve
search accuracy (387,231+ ops/sec).
- **`nameFix()`**: Standardises names by capitalising letters and correctly handling complex prefixes. Its
performance-oriented design with consolidated regex operations ensures consistent data formatting at scale (246,197+
ops/sec).
- **`utf8Ansi()`**: Convert UTF-8 encoded characters to their ANSI equivalents with comprehensive Unicode mappings,
facilitating compatibility with legacy systems.
- **`isValidDate()`**: Comprehensive date validation utility that ensures date strings conform to specified formats and
validates logical consistency.
- **Comprehensive Unicode/UTF-8 Support**: Built from the ground up to handle a wide range of international characters
with optimised character mappings, ensuring your application is ready for a global audience.

## Performance Benchmarks

The library has undergone extensive performance tuning, resulting in **2-5x speed improvements** through O(n)
optimisation algorithms. Our benchmarks demonstrate the library's capability to handle high-volume data processing
efficiently:

| Method | Performance | Optimisation Technique |
Comment thread
MarjovanLier marked this conversation as resolved.
|-------------------|----------------------|---------------------------------|
| `removeAccents()` | **981,436+ ops/sec** | Hash table lookups with strtr() |
| `searchWords()` | **387,231+ ops/sec** | Single-pass combined mapping |
| `nameFix()` | **246,197+ ops/sec** | Consolidated regex operations |

*Benchmarks measured on standard development environments. Actual performance may vary based on hardware, string length,
Comment thread
MarjovanLier marked this conversation as resolved.
and complexity.*

**Key Optimisation Features:**

- O(n) complexity algorithms for all core methods
- Static caching for character mapping tables
- Single-pass string transformations
- Minimal memory allocation in critical paths

## Installation

Expand Down Expand Up @@ -77,7 +109,6 @@ $fixedName = StringManipulation::nameFix('mcdonald');
echo $fixedName; // Outputs: 'McDonald'
```


### Search Words

This feature optimises strings for database queries by removing unnecessary characters and optimising for search engine
Expand Down Expand Up @@ -135,7 +166,6 @@ $isValidDate = StringManipulation::isValidDate('2023-02-29', 'Y-m-d');
echo $isValidDate ? 'Valid' : 'Invalid'; // Outputs: 'Invalid'
```


## Advanced Usage

For more complex string manipulations, consider chaining functions to achieve unique transformations. For instance, you
Expand Down Expand Up @@ -164,31 +194,60 @@ steps:

Thank you for your interest in improving our library!

## Testing
## Testing & Quality Assurance

To ensure the reliability and functionality of your string manipulations, it's recommended to run the entire test suite
with the following command:
We are committed to delivering reliable, high-quality code. Our library is rigorously tested using a comprehensive suite
of tools to ensure stability and correctness.

```bash
./vendor/bin/phpunit
```
### Docker-Based Testing (Recommended)

To run specific tests or test suites, you can use PHPUnit flags to filter tests. For example, to run tests in a specific
file:
For a consistent and reliable testing environment, we recommend using Docker. Our Docker setup includes PHP 8.3 with all
required extensions:

```bash
./vendor/bin/phpunit --filter testFileName
# Run complete test suite
docker-compose run --rm test-all

# Run individual test suites
docker-compose run --rm test-phpunit # PHPUnit tests
docker-compose run --rm test-phpstan # Static analysis
docker-compose run --rm test-code-style # Code style
docker-compose run --rm test-infection # Mutation testing
```

Comment thread
MarjovanLier marked this conversation as resolved.
And to run tests matching a specific name pattern:
### Local Testing

If you have a local PHP 8.3+ environment configured:

```bash
./vendor/bin/phpunit --filter '/::testNamePattern$/'
# Complete test suite
composer tests

# Individual tests
./vendor/bin/phpunit --filter testClassName
./vendor/bin/phpunit --filter '/::testMethodName$/'
```

### Our Quality Suite Includes:

- **PHPUnit**: 166 comprehensive tests with 100% code coverage ensuring functional correctness
- **Mutation Testing**: 88% Mutation Score Indicator (MSI) with Infection, guaranteeing our tests are robust and
meaningful
- **Static Analysis**: Proactive bug detection using:
- PHPStan (level max, strict rules)
- Psalm (level 1, 99.95% type coverage)
- Phan (clean analysis results)
- PHPMD (mess detection)
- **Code Style**: Automated formatting with Laravel Pint (PSR compliance)
- **Performance Benchmarks**: Continuous performance monitoring with comprehensive benchmarking suite

## System Requirements

- PHP 8.3 or later.
- **PHP 8.3 or later** (strict typing enabled)
- **`mbstring` extension** for multi-byte string operations
- **`intl` extension** for internationalisation and advanced Unicode support
- **Enabled `declare(strict_types=1);`** for robust type safety
- **Composer** for package management

## Support

Expand Down
36 changes: 18 additions & 18 deletions composer.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name": "marjovanlier/stringmanipulation",
"description": "A PHP library for efficient string manipulation, focusing on data normalisation, encoding conversion and validation.",
"description": "High-performance PHP 8.3+ string manipulation library featuring O(n) algorithms with up to 5x speed improvements. Provides Unicode-aware operations including searchWords(), nameFix(), utf8Ansi(), removeAccents(), and isValidDate() with comprehensive testing infrastructure.",
"keywords": [
"string manipulation",
"performance",
Expand Down Expand Up @@ -46,32 +46,32 @@
},
"require-dev": {
"enlightn/security-checker": ">=2.0",
"infection/infection": ">=0.29.14",
"laravel/pint": ">=1.22.1",
"phan/phan": ">=5.4.5",
"infection/infection": ">=0.31.2",
"laravel/pint": ">=1.24.0",
"phan/phan": ">=5.5.1",
"php-parallel-lint/php-parallel-lint": ">=1.4.0",
"phpmd/phpmd": ">=2.15",
"phpstan/extension-installer": ">=1.4.3",
"phpstan/phpstan": ">=2.1.17",
"phpstan/phpstan-strict-rules": ">=2.0.4",
"phpstan/phpstan": ">=2.1.22",
"phpstan/phpstan-strict-rules": ">=2.0.6",
"phpunit/phpunit": ">=11.0.9|>=12.0.2",
Comment thread
MarjovanLier marked this conversation as resolved.
"psalm/plugin-phpunit": ">=0.19.3",
"rector/rector": ">=2.0.16",
"rector/rector": ">=2.1.4",
"roave/security-advisories": "dev-latest",
"vimeo/psalm": ">=6.7"
},
"scripts-descriptions": {
"test:code-style": "Check code for stylistic consistency.",
"test:composer-validate": "Ensure 'composer.json' is valid and consistent.",
"test:infection": "Conduct mutation testing for robustness.",
"test:lint": "Search for syntax errors and problematic patterns.",
"test:phan": "Perform static analysis with Phan to identify code issues.",
"test:phpmd": "Detect bugs and suboptimal code with PHP Mess Detector.",
"test:phpstan": "Use PHPStan for static analysis and bug detection.",
"test:phpunit": "Execute PHPUnit tests to verify code functionality.",
"test:psalm": "Run Psalm to find errors and improve code quality.",
"test:rector": "Apply automated code quality enhancements with Rector.",
"test:vulnerabilities-check": "Scan dependencies for known security vulnerabilities."
"test:code-style": "Check code for stylistic consistency using Laravel Pint",
"test:composer-validate": "Validate composer.json schema, dependencies, and configuration integrity with strict validation",
"test:infection": "Execute comprehensive mutation testing to verify test quality and code robustness against logic modifications",
"test:lint": "Perform syntax validation and identify deprecated PHP patterns across all source files",
"test:phan": "Execute Phan static analysis for type safety, dead code detection, and PHP compatibility validation",
"test:phpmd": "Analyse code complexity, design patterns, and identify potential bugs using PHP Mess Detector rules",
"test:phpstan": "Perform advanced static analysis with PHPStan for type checking, null safety, and logic validation",
"test:phpunit": "Run comprehensive PHPUnit test suite (166 tests) with strict type checking and edge case coverage",
"test:psalm": "Execute Psalm static analysis for advanced type inference, purity checking, and security validation",
"test:rector": "Analyse code for modernisation opportunities and PHP 8.3+ feature adoption using Rector rules",
"test:vulnerabilities-check": "Scan all dependencies for known CVE vulnerabilities and security advisories using Enlightn Security Checker"
},
"scripts": {
"post-update-cmd": [
Expand Down
Loading