Log obfuscation
To protect the data passed to the bot, JAICP provides log obfuscation — converting sensitive data into an unreadable format.
Description
Log obfuscation allows removing sensitive data from client and bot phrases. The data is masked in:
To do this, create rules that define which data is masked. Each rule applies only to the projects you select.
Each rule uses a regular expression, which you should write so that it matches only the relevant information. When the regular expression matches part of a text, the matched part is replaced with the string specified in the rule settings.
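The match-and-replace behavior described above can be sketched as follows. This is only an illustration in Python: the `ObfuscationRule` class, the project names, and the simplified email pattern are hypothetical and are not taken from JAICP, whose rules are configured in the interface rather than in code.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ObfuscationRule:
    """Hypothetical stand-in for a rule; not a JAICP data structure."""
    name: str
    pattern: str        # regular expression that finds the sensitive data
    replacement: str    # string written to the log instead of the match
    projects: set = field(default_factory=set)  # projects the rule applies to

    def apply(self, project: str, phrase: str) -> str:
        # Leave the phrase untouched for projects the rule is not enabled in.
        if project not in self.projects:
            return phrase
        return re.sub(self.pattern, self.replacement, phrase)


# Deliberately simplified email pattern, for illustration only.
email_rule = ObfuscationRule(
    name="Email",
    pattern=r"[\w.+-]+@[\w-]+(\.[\w-]+)+",
    replacement="***",
    projects={"support-bot"},
)

phrase = "Write to me at jane.doe@example.com please"
masked = email_rule.apply("support-bot", phrase)
untouched = email_rule.apply("sales-bot", phrase)
print(masked)     # Write to me at *** please
print(untouched)  # Write to me at jane.doe@example.com please
```

Only the matched part of the phrase is replaced, and the phrase is left as is in a project the rule is not applied to.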
Rules
To create an obfuscation rule, click the icon and go to the Log obfuscation section.
- System rules
  - Ready-made rules for Email and Bank card (VISA/Master). Disabled by default.
  - You cannot edit the regular expressions set for system rules.
  - You can choose the replacement format and projects to which the rule will be applied.
- User rules
  - Create custom rules.
  - You can create regular expressions yourself and customize the replacement format.
Regular expressions
Characters for composing regular expressions can be divided into several groups:
Boundary matches
Character | Description |
---|---|
^ | The beginning of a line |
$ | The end of a line |
\b | A word boundary |
\B | A non-word boundary |
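To see how the boundary matches behave, here is a small sketch using Python's `re` module. It assumes that the engine used for obfuscation rules treats these constructs the same way, which the documentation does not state explicitly. Note that in Python `^` and `$` anchor to the whole string unless the `re.MULTILINE` flag is set.

```python
import re

text = "cat concat cats"

# \b requires a word boundary on both sides, so only the standalone word matches.
whole_words = re.findall(r"\bcat\b", text)
print(whole_words)  # ['cat']

# \B requires the absence of a boundary, so only "cat" inside "concat" matches.
inner_positions = [m.start() for m in re.finditer(r"\Bcat", text)]
print(inner_positions)  # [7]

# ^ and $ anchor the pattern to the beginning and the end of the text.
starts_with_cat = bool(re.search(r"^cat", text))
ends_with_cats = bool(re.search(r"cats$", text))
starts_with_concat = bool(re.search(r"^concat", text))
print(starts_with_cat, ends_with_cats, starts_with_concat)  # True True False
```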
Character classes
Character | Description |
---|---|
. | Any character except a newline character |
(a\|b) | Capturing group: a or b |
[abc] | Matches a, b, or c |
[a-m] | Range, matches characters from a to m |
[^a] | Negation, matches everything except a |
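The rows of this table can be tried out with a short sketch in Python's `re` module. The sample strings are made up for illustration, and the sketch assumes that the engine behind obfuscation rules handles these constructs in the same way.

```python
import re

# . matches any single character except a newline.
dot = re.findall(r"b.t", "bat bit b\nt")
print(dot)  # ['bat', 'bit']

# (a|e) is a group that matches either alternative.
group = [m.group(0) for m in re.finditer(r"gr(a|e)y", "gray grey groy")]
print(group)  # ['gray', 'grey']

# [abc] matches exactly one of the listed characters.
listed = re.findall(r"[abc]", "a1b2c3d4")
print(listed)  # ['a', 'b', 'c']

# [a-m] matches one character in the range from a to m.
ranged = re.findall(r"[a-m]", "hello")
print(ranged)  # ['h', 'e', 'l', 'l']

# [^a] matches any single character except a.
negated = re.findall(r"[^a]", "banana")
print(negated)  # ['b', 'n', 'n']
```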
Predefined character classes
Character | Description |
---|---|
\w | A word character |
\W | A non-word character |
\d | A digit |
\D | A non-digit |
\s | A whitespace character |
\S | A non-whitespace character |
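A brief sketch of the predefined classes, again in Python's `re` module and with a made-up phrase. One assumption to keep in mind: in Python 3, `\w` and `\d` also match non-ASCII letters and digits, and other engines may define these classes more narrowly.

```python
import re

phrase = "Order 66, room B-12"

digits = re.findall(r"\d+", phrase)        # runs of digits
words = re.findall(r"\w+", phrase)         # runs of word characters
non_words = re.findall(r"\W", phrase)      # single non-word characters
chunks = re.findall(r"\S+", phrase)        # runs of non-whitespace characters
digits_only = re.sub(r"\D", "", phrase)    # drop everything that is not a digit
pieces = re.split(r"\s+", phrase)          # split on whitespace

print(digits)       # ['66', '12']
print(words)        # ['Order', '66', 'room', 'B', '12']
print(non_words)    # [' ', ',', ' ', ' ', '-']
print(chunks)       # ['Order', '66,', 'room', 'B-12']
print(digits_only)  # 6612
print(pieces)       # ['Order', '66,', 'room', 'B-12']
```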
Quantifiers
Quantifiers always follow a character or group of characters. Quantifiers may be:
- Greedy — match the longest group. If the first match attempt fails, the matcher backs off the input string by one character and tries again, repeating the process until a match is found or there are no more characters left to back off from.
- Lazy — match the shortest group.
- Possessive — match the longest group without backing off.
Greedy | Lazy | Possessive | Description |
---|---|---|---|
X? | X?? | X?+ | Zero times or once |
X* | X*? | X*+ | Zero or more times |
X+ | X+? | X++ | One or more times |
X{n} | X{n}? | X{n}+ | Exactly n times |
X{n,} | X{n,}? | X{n,}+ | At least n times |
X{n,m} | X{n,m}? | X{n,m}+ | At least n but no more than m times |
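The difference between the three quantifier types is easiest to see on a concrete input. The sketch below uses Python's `re` module as a stand-in for the engine behind obfuscation rules. Possessive quantifiers are supported in Python only from version 3.11, so that part is guarded by a version check.

```python
import re
import sys

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ takes as much as possible, then backs off to the last ">".
greedy = re.findall(r"<.+>", text)
# Lazy: .+? takes as little as possible, stopping at the first ">".
lazy = re.findall(r"<.+?>", text)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# The same distinction applies to bounded quantifiers.
bounded_greedy = re.findall(r"\d{2,3}", "1 22 333 4444")
bounded_lazy = re.findall(r"\d{2,3}?", "1 22 333 4444")
print(bounded_greedy)  # ['22', '333', '444']
print(bounded_lazy)    # ['22', '33', '44', '44']

# Possessive: \d++ takes all the digits and never gives any back,
# so the trailing "5" in the pattern can no longer match.
possessive_supported = sys.version_info >= (3, 11)
if possessive_supported:
    greedy_match = re.search(r"\d+5", "12345")
    possessive_match = re.search(r"\d++5", "12345")
    print(greedy_match is not None)  # True
    print(possessive_match is None)  # True
```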
How to use
Consider the following example: you need to mask the mobile phone numbers of American customers.
To do this, go to the Log obfuscation tab and click Create rule. Fill in the Name and Replace format fields.
In Regular expression, write the following pattern:
\+1[- ]*\(?[- ]*(\d{3}[- ]*\)?([- ]*\d){7}|\d\d[- ]*\d\d[- ]*\)?([- ]*\d){6})
Let's break this regular expression down part by part:
1. `\+1` matches only numbers that start with +1, since only American phone numbers are considered in this example. The plus sign is a special character, so it is escaped with `\` to be interpreted as a normal character.
2. Here and elsewhere, the expression `[- ]*` allows the client to put zero or more hyphens or spaces at that position, in this case after the country code.
3. `\(?` allows an optional opening parenthesis.
4. The expression `\d{3}[- ]*\)?([- ]*\d){7}` first matches a 3-digit telephone area code, which can be followed by spaces, hyphens, and/or a closing parenthesis. It then matches exactly 7 more digits, each of which can be preceded by spaces or hyphens.
5. The expression from item 4 is followed by the character `|`, which means that the following pattern for the part after the country code is also allowed: `\d\d[- ]*\d\d[- ]*\)?([- ]*\d){6}`. Here the first four digits are written as two pairs, followed by six more digits.
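The same breakdown can be written as a commented pattern. The sketch below uses Python's `re.VERBOSE` flag, which ignores whitespace outside character classes and allows comments. It assumes the pattern behaves the same way in the engine used for obfuscation rules. The last two sample strings are made up to show a number matched only by the second alternative and a number that is too short.

```python
import re

FLAT = r"\+1[- ]*\(?[- ]*(\d{3}[- ]*\)?([- ]*\d){7}|\d\d[- ]*\d\d[- ]*\)?([- ]*\d){6})"

COMMENTED = re.compile(r"""
    \+1                 # country code; the plus sign is escaped
    [- ]*               # zero or more hyphens or spaces
    \(?                 # optional opening parenthesis
    [- ]*
    (
        \d{3}           # 3-digit area code
        [- ]*\)?        # optional separators and closing parenthesis
        ([- ]*\d){7}    # 7 more digits, each optionally preceded by separators
      |
        \d\d[- ]*\d\d   # alternative: the first four digits written as two pairs
        [- ]*\)?
        ([- ]*\d){6}    # followed by 6 more digits
    )
""", re.VERBOSE)

samples = [
    "+16123456789",
    "+1 612 345 67 89",
    "+1-612-123-45-67",
    "+1 (61 23) 456789",   # matched by the second alternative only
    "+1 612 345",          # too few digits
]

# Both spellings of the pattern should agree on every sample.
flat_results = [bool(re.fullmatch(FLAT, s)) for s in samples]
commented_results = [bool(COMMENTED.fullmatch(s)) for s in samples]
print(flat_results)       # [True, True, True, True, False]
print(commented_results)  # [True, True, True, True, False]
```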
Examples of phone numbers that will be obfuscated by the created rule (each starts with +1, as the pattern requires):
+1(612)123-45-67
+16123456789
+1 612 345 67 89
+1-612-123-45-67
...
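As a rough check of the rule, the pattern can be applied to sample phrases with Python's `re.sub`. This is a sketch under assumptions: the `<phone>` replacement string and the phrases are hypothetical, and the actual replacement in JAICP is whatever you enter in the Replace format field.

```python
import re

PHONE_PATTERN = r"\+1[- ]*\(?[- ]*(\d{3}[- ]*\)?([- ]*\d){7}|\d\d[- ]*\d\d[- ]*\)?([- ]*\d){6})"
REPLACEMENT = "<phone>"  # hypothetical value of the Replace format field


def obfuscate(phrase: str) -> str:
    """Replace every phone number matched by the pattern."""
    return re.sub(PHONE_PATTERN, REPLACEMENT, phrase)


single = obfuscate("Call me at +1 612 345 67 89 after 5 pm")
double = obfuscate("+1-612-123-45-67 and +16123456789")
no_plus = obfuscate("My number is 16123456789")

print(single)   # Call me at <phone> after 5 pm
print(double)   # <phone> and <phone>
print(no_plus)  # My number is 16123456789
```

The last phrase is left unchanged because the pattern requires the leading +. If numbers written without it should also be masked, the rule would need a different pattern.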