Privacy Filter
Detect and redact personally identifiable information (PII) in text using OpenAI’s privacy-filter — a 1.5B-parameter (50M active) bidirectional token classifier.
The model returns both the detected entities (with confidence scores and character
offsets) and a redacted version of the input where each span is replaced with its
category tag, e.g. [private_email].
Supported PII categories
- private_person — names, aliases
- private_email — email addresses
- private_phone — phone numbers
- private_address — street addresses, cities, ZIPs
- private_url — personal URLs and handles
- private_date — birthdays and sensitive dates
- account_number — bank, card, and account IDs
- secret — API keys, passwords, tokens
Inputs
- text — the text you want to scan for PII.
- score_threshold — minimum confidence (0.0–1.0) for a span to be flagged. Default is 0.5.
  - 0.3 — compliance / audit mode: catch everything for human review
  - 0.5 — balanced default
  - 0.9 — automated redaction pipeline: only act when very sure
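The threshold trade-off can also be applied client-side as a post-processing step. A minimal sketch, assuming entities arrive as dicts with a `score` field (the exact response schema is an assumption for illustration):

```python
# Hypothetical entity dicts shaped like the filter's output; the field
# names here are assumptions for illustration.
entities = [
    {"category": "private_person", "score": 0.97, "text": "Ada Lovelace"},
    {"category": "private_date", "score": 0.41, "text": "Dec 10"},
]

def above_threshold(entities, score_threshold=0.5):
    """Keep only spans at or above the confidence threshold."""
    return [e for e in entities if e["score"] >= score_threshold]

# At 0.5 only the high-confidence name survives; at 0.3 the borderline
# date span is also kept for human review.
print([e["category"] for e in above_threshold(entities, 0.5)])
```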
Output
The model returns two things:
- entities — a list of detected PII spans, each with its category, confidence score, the matched text, and its start/end character positions in the input.
- redacted — the original text with every detected span replaced by its category in square brackets (e.g.
[private_email]).
Adjacent spans of the same category are merged so multi-word PII like full names appear as a single entity.
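To make the output contract concrete, here is a sketch of how the `redacted` string can be derived from `entities`, including the merge of adjacent same-category spans. The entity shape and the "separated only by whitespace" adjacency rule are assumptions for illustration, not the model's documented internals:

```python
def redact(text, entities):
    """Replace each detected span with its [category] tag.

    Adjacent spans of the same category (assumed here to mean spans
    separated only by whitespace) are merged into a single entity.
    """
    # Sort by position, then merge same-category neighbours.
    spans = sorted(entities, key=lambda e: e["start"])
    merged = []
    for e in spans:
        if (merged and merged[-1]["category"] == e["category"]
                and text[merged[-1]["end"]:e["start"]].strip() == ""):
            merged[-1] = {**merged[-1], "end": e["end"]}
        else:
            merged.append(dict(e))
    # Apply replacements right-to-left so earlier offsets stay valid.
    out = text
    for e in reversed(merged):
        out = out[:e["start"]] + f'[{e["category"]}]' + out[e["end"]:]
    return out

text = "Contact Ada Lovelace at ada@example.com"
entities = [
    {"category": "private_person", "start": 8, "end": 11},   # "Ada"
    {"category": "private_person", "start": 12, "end": 20},  # "Lovelace"
    {"category": "private_email", "start": 24, "end": 39},   # the address
]
print(redact(text, entities))
# → "Contact [private_person] at [private_email]"
```

Note that the two name tokens collapse into one `[private_person]` tag, matching the merge behavior described above.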
Use cases
- Logging & observability — strip PII from request logs before they hit your storage
- Customer support pipelines — redact tickets before training or analytics
- LLM input/output guardrails — sanitize prompts and completions in real time
- Dataset preparation — clean user-generated content prior to model training
- Compliance workflows — surface candidate PII for human reviewers (use a low threshold)
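For the logging use case, one common pattern is a `logging.Filter` that rewrites each record before it reaches any handler. A minimal sketch; `redact_pii` is a hypothetical stand-in for a call to the privacy filter that returns its `redacted` output:

```python
import logging

def redact_pii(message: str) -> str:
    # Hypothetical stand-in: in practice this would call the
    # privacy-filter model and return its `redacted` string.
    return message.replace("ada@example.com", "[private_email]")

class PIIRedactingFilter(logging.Filter):
    """Redact PII from log records before any handler sees them."""
    def filter(self, record: logging.LogRecord) -> bool:
        # Fold args into the message, then redact the result.
        record.msg = redact_pii(record.getMessage())
        record.args = ()
        return True

logger = logging.getLogger("app")
logger.addFilter(PIIRedactingFilter())
logger.warning("login from ada@example.com")
# the record's message is now "login from [private_email]"
```

For automated pipelines like this, pair the filter with a high score_threshold so false positives don't garble logs.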
Tips
- For automated redaction (no human in the loop), use a higher threshold like 0.9 to minimize false positives.
- For audit / review pipelines, use a lower threshold like 0.3 to maximize recall — a human can dismiss false positives later.
- The model is multilingual-friendly thanks to its bidirectional tokenizer, but works best on English-style PII patterns.
Limitations
- Token-level classifier — does not perform entity linking or de-duplication across documents.
- Confidence scores are calibrated, but rare PII formats (e.g. unusual international phone numbers, non-Latin names) may need a lower threshold.
- Does not detect contextual PII (e.g. “the patient” in a medical note) — only surface-form spans.