Update app.py
app.py
CHANGED
@@ -84,9 +84,17 @@ def anonymize(text, min_len=3):
 
 title = "PII Masking"
 description = """
-In many applications, PII is easy to remove from databases.
-
-
+In many applications, personally identifiable information (PII) is easy to remove from databases since a column may contain specific PII.
+Common techniques like hashing also allow the identity of these values to be preserved without exposing the contents of the value.
+
+However, it can be less straightforward to remove from unstructured text data, where PII may or may not be present.
+Further, text may contain multiple types of PII that present an increased risk of exposure when coupled together.
+For example, a name and IP address together may be used to pinpoint a specific person's location.
+Hashing the data outright is not an option since consumers of these data often prefer to work with the raw text data.
+Thus, preserving privacy in raw text data remains a challenge.
+
+This space applies both rule-based and ML-based approaches to remove names, phone numbers, emails, and IP addresses from raw text.
+This app accepts raw text and returns the same text, but with PII replaced with special tokens that preserve some characteristics of the masked entities without revealing their contents.
 """
 
 gr.Interface(
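
Note: the new description mentions a rule-based step that replaces emails and IP addresses with special tokens. The body of `anonymize` is not visible in this hunk, so the following is only a minimal sketch of what such a step might look like. The function name `mask_pii_sketch`, both regular expressions, and the `<EMAIL>` / `<IP_ADDRESS>` token format are assumptions made for illustration and are not taken from app.py. The only characteristic the tokens preserve here is the entity type.

```python
import re

# Hypothetical patterns, not taken from app.py. They are deliberately simple
# and will miss edge cases (IPv6, quoted local parts, out-of-range octets).
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask_pii_sketch(text):
    """Replace emails and IPv4 addresses with type-preserving placeholder tokens."""
    # Emails are masked first so the IPv4 pattern only sees what is left.
    text = EMAIL_RE.sub("<EMAIL>", text)
    text = IPV4_RE.sub("<IP_ADDRESS>", text)
    return text


if __name__ == "__main__":
    print(mask_pii_sketch("Contact jane@example.com from 192.168.0.1."))
```

Names and phone numbers are left out on purpose: names usually need the ML-based (NER) half that the description refers to, and phone number formats vary too much for a short pattern to be trustworthy.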