If we need to anonymize some sensible or personal data strings from our PostgreSQL database, then the bellow function will do the work for you. This postgres anonymize function is coded mainly to anonymize strings by replacing middle of the string with asterisks (*) and optionally keep the email domain if required.
String anonymize() examples
SELECT anonymize('email.address@example.com',1,1); returns "e***********************m" SELECT anonymize('email.address@example.com',2,2); returns "em*********************om" SELECT anonymize('email.address@example.com',1,1,true); returns "e***********s@example.com"
Clearly enough I hope, basically you need to provide the string that need to be anonymized, how many chars to keep from the string start, how many chars to keep from the string end and an optionally true value to inform the function that your string is an email address and you want to keep the top-level domain from that.
Of course, you can use the function for a display only purpose or to update database tables, eg UPDATE table_clients set email = anonymize(email,1,1,true) where id = 1;
or UPDATE table_clients set details = replace('email.address@example.com',anonymize('email.address@example.com',1,1,true));
The FUNCTION anonymize()
CREATE OR REPLACE FUNCTION anonymize( i_str text, -- text to be anonymized str_start integer, -- no of chars to keep at start str_end integer, -- no of chars to keep at end str_keep_email_tld boolean default false -- keep email domain ) RETURNS text AS $BODY$ DECLARE str_len integer; str_anon text; str_array text[]; BEGIN IF str_keep_email_tld THEN str_array = string_to_array(i_str,'@'); i_str = str_array[1]; str_len = length(i_str); str_anon = concat_ws('',substr(i_str,1,str_start),repeat('*',str_len-str_end-str_start),substr(i_str,str_len-str_end+1, str_len),'@'||str_array[2]); ELSE str_len = length(i_str); str_anon = concat_ws('',substr(i_str,1,str_start),repeat('*',str_len-str_end-str_start),substr(i_str,str_len-str_end+1, str_len)); END IF; RETURN str_anon; END $BODY$ LANGUAGE plpgsql IMMUTABLE COST 1;
Using the above Postres custom anonymize function against UNIQUE INDEX columns
Let's suppose you have a user's database table where you keep usernames, maybe in a form of an email address (used as username), and you need to anonymize one or multiple accounts. I assume your username column has a lowercase/uppercase unique index to prevent duplicates. If you run the anonymize function against one or multiple users there is a chance that the anonymized version of two usernames to be the same. So we need another anonymize function to be used for that. Let's name this new function anonymize_unique(). But I will let you to build this function, with some hints: use the table pk column (ID) and generate a md5 hash from the username id column. Or you can go further and keep the email domain from the username email, resulting something like that: v1) a-unique-32char-md5-hash or v2) a-unique-32char-md5-hash@example.com. What do you say, are you in?
That's all folks