Log obfuscation

To protect the data passed to the bot, JAICP provides log obfuscation — converting sensitive data into an unreadable format.

caution
The log obfuscation settings are available only to users with the SECURITY_ADMIN role.

Description

Log obfuscation allows removing sensitive data from client and bot phrases. The data is masked in:

To do this, you need to create rules by which such data will be masked. The rules can be applied only to the projects that you choose.

You need to customize regular expressions for matching and replacing only the relevant information. When a regular expression matches a text, the text is replaced with the string specified in the rule settings.
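The match-and-replace behavior described above can be sketched outside JAICP as follows. This is only an illustration: the email pattern is a simplified, hypothetical one (not the regular expression of the JAICP system rule), and Python's re module stands in for the engine that JAICP actually uses.

```python
import re

# Hypothetical rule settings: a simplified email pattern and a replacement string.
# Neither value is taken from JAICP itself.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
REPLACEMENT = "***"


def obfuscate(text: str) -> str:
    """Replace every match of the pattern with the replacement string."""
    return EMAIL_PATTERN.sub(REPLACEMENT, text)


print(obfuscate("Write to me at jane.doe@example.com, please"))
# Write to me at ***, please
```

Text that does not match the pattern is returned unchanged, which is why the pattern should match only the relevant information.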

Rules

To create an obfuscation rule, click the  icon and go to the Log obfuscation section.

  • System rules
    • Ready-made rules for Email and Bank card (VISA/Master). Disabled by default.
    • You cannot edit the regular expressions set for system rules.
    • You can choose the replacement format and projects to which the rule will be applied.
  • User rules
    • Create custom rules.
    • You can create regular expressions yourself and customize the replacement format.
tip
To check the correctness of the created rule, click Test and enter the data you want to obfuscate.

Regular expressions

tip
A regular expression is a pattern for matching and replacing strings.

Characters for composing regular expressions can be divided into several groups:

Boundary matches

Character   Description
^           The beginning of a line
$           The end of a line
\b          A word boundary
\B          A non-word boundary
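
A short sketch of how boundary matches change what gets replaced. It uses Python's re module as a stand-in (an assumption; the engine behind JAICP may differ in details), and the sample text is made up.

```python
import re

text = "cat concat cats"

# \b on both sides: only the standalone word "cat" is replaced.
whole_word = re.sub(r"\bcat\b", "#", text)
print(whole_word)   # # concat cats

# \B before "cat": only the occurrence inside another word is replaced.
inside_word = re.sub(r"\Bcat", "#", text)
print(inside_word)  # cat con# cats

# ^ and $ anchor the pattern to the beginning and the end of the line.
starts_with_cat = bool(re.search(r"^cat", text))
ends_with_cat = bool(re.search(r"cat$", text))
print(starts_with_cat, ends_with_cat)  # True False
```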

Character classes

Character   Description
.           Any character except a newline character
(a|b)       Capturing group: a or b
[abc]       Matches a, b, or c
[a-m]       Range, matches characters from a to m
[^a]        Negation, matches everything except a
tip
Range boundaries are included: [a-m] also matches a and m.
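
The character classes above can be compared on one made-up sample string. The sketch assumes Python's re module behaves like the JAICP engine for these basic constructs.

```python
import re

text = "bag big bug beg"

results = {
    "b[ai]g": re.sub(r"b[ai]g", "*", text),     # a or i in the middle
    "b(a|u)g": re.sub(r"b(a|u)g", "*", text),   # group: a or u
    "b[a-i]g": re.sub(r"b[a-i]g", "*", text),   # range from a to i, inclusive
    "b[^a]g": re.sub(r"b[^a]g", "*", text),     # anything except a
    "b.g": re.sub(r"b.g", "*", text),           # any character
}

for pattern, masked in results.items():
    print(f"{pattern:10} {masked}")
# b[ai]g     * * bug beg
# b(a|u)g    * big * beg
# b[a-i]g    * * bug *
# b[^a]g     bag * * *
# b.g        * * * *
```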

Predefined character classes

Character   Description
\w          A word character
\W          A non-word character
\d          A digit
\D          A non-digit
\s          A whitespace character
\S          A non-whitespace character
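
A minimal illustration of the predefined classes on a made-up string, again assuming Python's re module as a stand-in for the JAICP engine.

```python
import re

text = "Order 66, room B-12"

digits_masked = re.sub(r"\d", "#", text)   # every digit becomes #
digits_only = re.sub(r"\D", "", text)      # every non-digit is removed
chunks = re.split(r"\s+", text)            # split on runs of whitespace
words = re.findall(r"\w+", text)           # runs of word characters

print(digits_masked)  # Order ##, room B-##
print(digits_only)    # 6612
print(chunks)         # ['Order', '66,', 'room', 'B-12']
print(words)          # ['Order', '66', 'room', 'B', '12']
```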

Quantifiers

Quantifiers always follow a character or group of characters. Quantifiers may be:

  • Greedy — match the longest group. If the first match attempt fails, the matcher backs off the input string by one character and tries again, repeating the process until a match is found or there are no more characters left to back off from.
  • Lazy — match the shortest group.
  • Possessive — match the longest group without backing off.

Greedy   Lazy      Possessive   Description
X?       X??       X?+          Zero times or once
X*       X*?       X*+          Zero or more times
X+       X+?       X++          One or more times
X{n}     X{n}?     X{n}+        Exactly n times
X{n,}    X{n,}?    X{n,}+       At least n times
X{n,m}   X{n,m}?   X{n,m}+      At least n but no more than m times
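
The difference between greedy and lazy quantifiers is easiest to see on a string where both can succeed. The sketch below uses Python's re module as a stand-in for the JAICP engine (an assumption) and made-up sample strings. Possessive quantifiers are left out because Python supports them only from version 3.11.

```python
import re

text = "<b>one</b> and <b>two</b>"

# Greedy: .* takes as much as possible, so one match spans both tags.
greedy = re.findall(r"<b>.*</b>", text)
# Lazy: .*? takes as little as possible, so each tag is matched separately.
lazy = re.findall(r"<b>.*?</b>", text)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']

# The same applies to counted quantifiers.
greedy_digits = re.findall(r"\d{2,4}", "123456")
lazy_digits = re.findall(r"\d{2,4}?", "123456")

print(greedy_digits)  # ['1234', '56']
print(lazy_digits)    # ['12', '34', '56']
```

For obfuscation rules this matters because a greedy pattern may mask more text than intended.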

How to use

Consider the following example: you need to mask the mobile phone numbers of American customers.

To do this, go to the Log obfuscation tab and click Create rule. Fill in the Name and Replace format fields.

In Regular expression, write the following pattern:

\+1[- ]*\(?[- ]*(\d{3}[- ]*\)?([- ]*\d){7}|\d\d[- ]*\d\d[- ]*\)?([- ]*\d){6})

Let’s decompose this regular expression and take a look at each part:

  1. \+1 matches only numbers starting with +1, since only American phone numbers are considered in this example.
caution
Metacharacters such as + must be escaped with \ to be interpreted as literal characters.
  2. Here and elsewhere, the expression [- ]* allows zero or more hyphens or spaces. In this position, they can follow the country code.
  3. \(? allows an optional opening parenthesis.
  4. The expression \d{3}[- ]*\)?([- ]*\d){7} first matches a 3-digit telephone area code, which can be followed by spaces, hyphens, and/or a closing parenthesis. Then it matches exactly 7 more digits, each of which can be preceded by spaces or hyphens.
  5. The expression from item 4 is followed by the character |, which introduces an alternative pattern for the same part of the number, that is, everything after the country code: \d\d[- ]*\d\d[- ]*\)?([- ]*\d){6}. It also matches 10 digits in total, but allows the closing parenthesis after the first four digits.

Examples of phone numbers that will be obfuscated by the created rule:

+1(612)123-45-67
+16123456789
+1 612 345 67 89
+1-612-123-45-67
...
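
The rule can also be checked outside JAICP before it is saved. The sketch below applies the pattern from this section with Python's re module, which is an assumption made for illustration: the engine used by JAICP may differ in details. The replacement string `<phone>` and the function name `mask` are hypothetical.

```python
import re

# The pattern from this section, unchanged.
PHONE = re.compile(
    r"\+1[- ]*\(?[- ]*(\d{3}[- ]*\)?([- ]*\d){7}|\d\d[- ]*\d\d[- ]*\)?([- ]*\d){6})"
)


def mask(text: str, replacement: str = "<phone>") -> str:
    """Replace every phone number matched by the rule with the replacement."""
    return PHONE.sub(replacement, text)


samples = [
    "+16123456789",
    "+1 612 345 67 89",
    "+1-612-123-45-67",
    "+1(612)123-45-67",
]
for sample in samples:
    print(mask(f"Call me at {sample} tomorrow"))
# Each line prints: Call me at <phone> tomorrow

# The pattern requires the leading +, so this number is left as is.
print(mask("Call me at 16123456789 tomorrow"))
# Call me at 16123456789 tomorrow
```

Because the pattern starts with \+1, a number written without the plus sign is not obfuscated. If such numbers also need to be masked, the plus sign in the rule has to be made optional.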