lucataco/privacy-filter

OpenAI Privacy Filter is a bidirectional token classifier for detecting and masking personally identifiable information (PII) in text

Run time and cost

This model runs on Nvidia T4 GPU hardware. We don't yet have enough runs of this model to provide performance information.

Readme

Privacy Filter

Detect and redact personally identifiable information (PII) in text using OpenAI’s privacy-filter — a 1.5B-parameter (50M active) bidirectional token classifier.

The model returns both the detected entities (with confidence scores and character offsets) and a redacted version of the input where each span is replaced with its category tag, e.g. [private_email].

Supported PII categories

  • private_person — names, aliases
  • private_email — email addresses
  • private_phone — phone numbers
  • private_address — street addresses, cities, ZIPs
  • private_url — personal URLs and handles
  • private_date — birthdays and sensitive dates
  • account_number — bank, card, and account IDs
  • secret — API keys, passwords, tokens

Inputs

  • text — the text you want to scan for PII.
  • score_threshold — minimum confidence (0.0–1.0) for a span to be flagged. Default is 0.5. Suggested starting points:
      • 0.3 — compliance / audit mode: catch everything for human review
      • 0.5 — balanced default
      • 0.9 — automated redaction pipeline: only act when very sure
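
As a minimal sketch, a call through the Replicate Python client might look like the following. The model identifier is this page's; the input text is made up for illustration, and the exact shape of what gets printed depends on the model's output schema:

```python
import replicate

# Run the privacy filter on a short, made-up snippet.
output = replicate.run(
    "lucataco/privacy-filter",
    input={
        "text": "Contact Jane Doe at jane.doe@example.com or +1 555 0100.",
        "score_threshold": 0.5,  # the balanced default described above
    },
)

print(output)
```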

Output

The model returns two fields:

  • entities — a list of detected PII spans, each with its category, confidence score, the matched text, and its start/end character positions in the input.
  • redacted — the original text with every detected span replaced by its category in square brackets (e.g. [private_email]).

Adjacent spans of the same category are merged, so multi-word PII such as a full name appears as a single entity.
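
As a rough sketch of consuming that result, assuming key names that mirror the description above ("entities", "redacted", "category", "score", "text", "start", "end" are assumptions, not a documented schema):

```python
# `output` is the result of replicate.run above. The key names below
# mirror this README's description but are assumptions about the schema.
for entity in output["entities"]:
    print(
        f'{entity["category"]}: "{entity["text"]}" '
        f'(score={entity["score"]:.2f}, chars {entity["start"]}-{entity["end"]})'
    )

print(output["redacted"])
# e.g. "Contact [private_person] at [private_email] or [private_phone]."
```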

Use cases

  • Logging & observability — strip PII from request logs before they hit your storage
  • Customer support pipelines — redact tickets before training or analytics
  • LLM input/output guardrails — sanitize prompts and completions in real time (see the sketch after this list)
  • Dataset preparation — clean user-generated content prior to model training
  • Compliance workflows — surface candidate PII for human reviewers (use a low threshold)
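
For the guardrail use case, a minimal wrapper might look like this. It is a sketch assuming the same output shape as above; `sanitize` is a hypothetical helper, not part of any library:

```python
import replicate

def sanitize(text: str, threshold: float = 0.9) -> str:
    """Hypothetical helper: redact PII from `text` before it reaches an LLM.

    Uses the high threshold recommended below for automated redaction;
    the "redacted" key is an assumption about the output schema.
    """
    output = replicate.run(
        "lucataco/privacy-filter",
        input={"text": text, "score_threshold": threshold},
    )
    return output["redacted"]

# Example: scrub a user prompt before forwarding it to an LLM.
print(sanitize("My card number is 4111 1111 1111 1111, can you help?"))
```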

Tips

  • For automated redaction (no human in the loop), use a higher threshold like 0.9 to minimize false positives.
  • For audit / review pipelines, use a lower threshold like 0.3 to maximize recall — a human can dismiss false positives later.
  • The model is reasonably multilingual-friendly, but it works best on English-style PII patterns.

Limitations

  • Token-level classifier — does not perform entity linking or de-duplication across documents.
  • Confidence scores are calibrated, but rare PII formats (e.g. unusual international phone numbers, non-Latin names) may need a lower threshold.
  • Does not detect contextual PII (e.g. “the patient” in a medical note) — only surface-form spans.