Anonymizing User Names in Accounting Data

To keep individual user names from appearing in your accounting data in Elasticsearch, use the scripts described below to set up an anonymization pipeline.

Provide a mapping of user names to anonymized groups, e.g. departments, and the scripts replace each individual user name with its group name in any accounting data that is uploaded to Elasticsearch.


Transforming the Data

The UserToPrincipal upload script

  1. analyzes your document,

  2. transforms the data into the correct syntax,

  3. stores the data in the following file:

    seal-kibana/configuration/scripts/anonymization/UserToPrincipal_data.ndjson
    
  4. and uploads this file to Elasticsearch.

You can start the upload script

  • manually, whenever you need to anonymize uploaded data:

    • Linux:

      seal-kibana/configuration/scripts/anonymization/UserToPrincipal_data.sh
      
    • Windows:

      seal-kibana/configuration/scripts/anonymization/UserToPrincipal_data.ps1

  • via Cron job:

    The init-anonymization script anonymizes uploaded data on a regular basis; see Setting up a Monthly Upload.

Hint - usage

Run the script with the -h or -help option to display its usage information.


Anonymizing the User

The mapping between each uploaded userID and its principalID is stored in an index in Elasticsearch.

Every time a document is uploaded to the accounting index, the Elastic Stack's internal Data Enrichment process looks up the userID in the document and replaces it with the corresponding principalID.
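
The replacement step can be pictured with a short Python sketch. It is only an illustration of the lookup, not the enrichment process that runs inside Elasticsearch. The function name `anonymize_document`, the sample field `pages`, and the handling of unmapped users (left unchanged) are assumptions made for this example.

```python
from typing import Dict


def anonymize_document(doc: dict, user_to_principal: Dict[str, str]) -> dict:
    """Return a copy of doc whose userID is replaced by its principalID.

    Illustrative only: the real replacement happens inside Elasticsearch.
    """
    result = dict(doc)  # shallow copy, the input document is not modified
    user_id = result.get("userID")
    if user_id in user_to_principal:
        result["userID"] = user_to_principal[user_id]
    # Assumption: a userID without a mapping entry is left unchanged here.
    return result


# Hypothetical mapping and accounting document.
mapping = {"jdoe": "Sales"}
print(anonymize_document({"userID": "jdoe", "pages": 3}, mapping))
```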


Initializing the Anonymization

Enable the anonymization of the accounting data by running the following script:

  • Linux:

    seal-kibana/configuration/scripts/anonymization/init-anonymization.sh
    
  • Windows:

    seal-kibana/configuration/scripts/anonymization/init-anonymization.ps1
    

The script sets up all required components in Elasticsearch and initializes the anonymization pipeline.

Hint - usage

Run the script with the -h or -help option to display its usage information.


Uploading Data

To anonymize users, Elasticsearch needs a mapping from each userID to its principalID.

You can upload a mapping table by using the following script:

  • Linux:

    seal-kibana/configuration/scripts/anonymization/UserToPrincipal.sh
    
  • Windows:

    seal-kibana/configuration/scripts/anonymization/UserToPrincipal.ps1
    

By default, the script uploads data from the following file:

seal-kibana/configuration/scripts/anonymization/UserToPrincipal_data.ndjson

The file needs to be formatted as follows:

{ "userID": "<your first userID>", "principalID": "<anonymized name for the first userID>" }
{ "userID": "<your second userID>", "principalID": "<anonymized name for the second userID>" }
...

The file must end with a newline.
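
Before uploading, it may help to check the file against the format rules above. The following Python sketch is not part of the delivered scripts. The function name `check_mapping_text` is hypothetical, and treating blank lines as errors is a conservative assumption rather than a documented requirement.

```python
import json
from typing import List

REQUIRED_KEYS = ("userID", "principalID")


def check_mapping_text(text: str) -> List[str]:
    """Return a list of problems found in the NDJSON mapping text."""
    problems = []
    if not text.endswith("\n"):
        problems.append("file does not end with a newline")
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # Assumption: blank lines are reported rather than ignored.
            problems.append(f"line {number}: blank line")
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append(f"line {number}: invalid JSON ({err.msg})")
            continue
        if not isinstance(entry, dict):
            problems.append(f"line {number}: not a JSON object")
            continue
        for key in REQUIRED_KEYS:
            if not isinstance(entry.get(key), str):
                problems.append(f"line {number}: missing or non-string {key}")
    return problems
```

To check a file, read it as UTF-8 text and pass the content to the function, for example `check_mapping_text(open(path, encoding="utf-8").read())`. An empty list means no problems were found.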

Caution - stop Filebeat

Stop Filebeat before you start the upload script!

Any accounting data that Filebeat sends to Elasticsearch while the upload script is running will not be properly anonymized.


Pausing Filebeat

You can pause Filebeat with the following script:

  • Linux:

    seal-filebeat/packaging/linux/rpm/resource/pause_filebeat.sh

  • Windows:

    seal-filebeat/packaging/windows/msi/resource/pause_filebeat.ps1

If you have several Filebeat installations/instances uploading accounting data, you have to pause each one separately.

Hint - usage

Run the script with the -h or -help option to display its usage information.


Setting up a Monthly Upload

  1. Start the init-anonymization script with the option -m:

    • Linux:

      seal-kibana/configuration/scripts/anonymization/init-anonymization.sh -m
      
    • Windows:

      seal-kibana/configuration/scripts/anonymization/init-anonymization.ps1 -m
      

    This sets up a monthly upload on the first day of the month at 00:05 (five minutes past midnight).

  2. Start the pause_filebeat script with the option -i:

    • Linux:

      seal-filebeat/packaging/linux/rpm/resource/pause_filebeat.sh -i
      
    • Windows:

      seal-filebeat/packaging/windows/msi/resource/pause_filebeat.ps1 -i
      

    Filebeat is stopped on the first day of the month from 00:00 to 00:30.

Hint - default time

The start time of 00:05 is the default for the monthly upload script.

If you change the default start time of the init-anonymization.xxx script, you must adjust the pause window of the pause_filebeat.xxx script accordingly.

If you have several Filebeat installations/instances uploading accounting data, you have to pause each one separately.
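
When you change either time, a quick check that the upload still starts inside the Filebeat pause window can catch mistakes. The Python sketch below is a stand-alone illustration and does not read any configuration from the scripts. The function name `upload_within_pause` is hypothetical, and the check assumes that the pause window does not cross midnight.

```python
from datetime import time


def upload_within_pause(upload_start: time, pause_start: time, pause_end: time) -> bool:
    """Return True if the upload starts while Filebeat is paused.

    Assumption: pause_start is earlier than pause_end on the same day.
    """
    return pause_start <= upload_start < pause_end


# Default times from this page: upload at 00:05, pause from 00:00 to 00:30.
print(upload_within_pause(time(0, 5), time(0, 0), time(0, 30)))
```

The check only covers the start of the upload. Whether the upload also finishes before the pause ends depends on how long it runs in your environment.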

