cleanframe — simple pandas-based data cleanup

cleanframe detects common column types (email, phone, date, name) and normalizes them automatically using rule-based fixers and validators.

Quick start

Install dependencies

python -m pip install pandas

Run unit tests

python -m pip install pytest
pytest -q

Use the library from a script

python - <<'PY'
import sys
# 'datacleaner' is a relative path, so run this from the repository root
sys.path.insert(0, 'datacleaner')
import pandas as pd
from cleanframe import fix

df = pd.DataFrame({
    'email': ['AASHAY@GMAIL.COM', 'vansh@EXAMPLE.com', 'bad'],
    'phone': ['800-123-4567', '(800) 333-4444', 'nope'],
    'date': ['2023-13-01', '2020-02-29', 'invalid'],
    'name': ['aashay', 'VANSH', '3rd street'],
})
cleaned = fix(df)
print(cleaned)
PY

Usage note

To clean a CSV file, call cleanframe.fix from a Python script and write the resulting DataFrame to disk.
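
As a rough illustration of the note above, the sketch below reads a CSV, passes the DataFrame through a cleaning function, and writes the result back to disk. The helper name clean_csv, its parameters, and the choice to read every column as text are assumptions made for this example; they are not part of cleanframe. The cleaning function is passed in as an argument so the helper does not depend on how cleanframe is installed.

```python
import pandas as pd


def clean_csv(src_path, dst_path, fixer):
    """Read a CSV, clean it with `fixer`, and write the result to `dst_path`.

    `fixer` is any callable that takes a DataFrame and returns a DataFrame,
    for example cleanframe.fix. This helper is hypothetical, not library API.
    """
    # Read every column as text (an assumption of this sketch) so values such
    # as phone numbers are not converted to numbers before cleaning.
    df = pd.read_csv(src_path, dtype=str)
    cleaned = fixer(df)
    # index=False keeps the pandas row index out of the output file.
    cleaned.to_csv(dst_path, index=False)
    return cleaned
```

With cleanframe importable as in the quick start, a call might look like clean_csv('raw.csv', 'cleaned.csv', fix), where both file names are placeholders.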

Removed files and tools

Several helper and experimental scripts have been removed from this repo. Run the tests with pytest, and import cleanframe.fix in your Python code to perform cleaning.

Design notes

Detection: heuristics based on validators and pattern matching
Fixing: deterministic normalization, no external APIs or ML
Invalid values become NaN after cleaning (safe for later processing)
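
The three notes above can be illustrated for a single column type. The sketch below is not cleanframe's actual code: the regular expression, the function names, and the 0.5 detection threshold are all assumptions chosen for the example. It shows pattern-based detection, deterministic normalization, and NaN for values that fail validation.

```python
import re

import numpy as np
import pandas as pd

# Deliberately loose pattern (an assumption for this sketch):
# something@something.something, with no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email_column(series, threshold=0.5):
    """Detection: treat the column as email if enough non-null values match."""
    values = [str(v).strip() for v in series.dropna()]
    if not values:
        return False
    matches = sum(1 for v in values if EMAIL_RE.match(v))
    return matches / len(values) >= threshold


def fix_email_column(series):
    """Fixing: strip and lowercase valid emails; anything else becomes NaN."""
    def fix_one(value):
        if pd.isna(value):
            return np.nan
        text = str(value).strip().lower()
        return text if EMAIL_RE.match(text) else np.nan

    return series.map(fix_one)


emails = pd.Series(['AASHAY@GMAIL.COM', 'vansh@EXAMPLE.com', 'bad'])
if looks_like_email_column(emails):
    print(fix_email_column(emails))
```

Because invalid entries end up as NaN rather than as arbitrary strings, later steps can handle them with the usual pandas tools such as dropna or isna.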
