Can I use seed files for masking/generation functions?
Yes. Certain masking and anonymization functions can pull values from seed files. Seed files can be in either .CSV or .TXT format. Several column delimiters are accepted, but for .TXT files we recommend separating values with a newline, and for .CSV files we suggest either a comma or a semicolon to separate columns.
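To make the two file shapes concrete, here is a minimal Python sketch of how such files could be read. It only illustrates the layouts and is not how Privacy itself parses seed files. The sample contents, the header row in the CSV sample, and the function names read_txt_seed and read_csv_seed are assumptions made for this example.

```python
import csv
import io

# Hypothetical .TXT seed file: one value per line.
TXT_SEED = "Stockholm\nOslo\nHelsinki\n"

# Hypothetical .CSV seed file: semicolon-separated columns.
# The header row is an assumption made for this sketch.
CSV_SEED = "first_name;last_name\nAda;Lovelace\nAlan;Turing\n"


def read_txt_seed(text):
    """Return one seed value per non-empty line."""
    return [line for line in text.splitlines() if line.strip()]


def read_csv_seed(text, delimiter=","):
    """Return the seed file as a list of rows, each a list of column values."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


cities = read_txt_seed(TXT_SEED)
names = read_csv_seed(CSV_SEED, delimiter=";")
```

Here cities is a flat list of values, while names is a list of rows, which is the shape the multi-column function relies on.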
Masking
In the masking part of Privacy, two functions use seed files directly. To access them, add the Generate… container function on the column of your choice, and then select either Value from seed file or Value from multi-column seed file.
These two functions are largely the same, with one distinction. With the Value from multi-column seed file function, you can supply a seed file that has multiple columns, and you must specify which column in your target data is replaced with which column from the seed file.
When you use the multi-column seed file function, the affected columns in your target database are handled together, row by row. If a target row has more than one affected column, all of its replacement values are taken from a single row of the seed file.
For example, if your seed file contains both First Name and Last Name values, the existing pairs in the file are kept together when you replace the names in your target database.
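The row grouping described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Privacy's implementation: the column names, the column_map structure, the mask_rows function, and the random choice of seed row are all hypothetical.

```python
import random

# Hypothetical seed rows, as if already read from a multi-column seed file.
seed_rows = [
    {"first_name": "Ada", "last_name": "Lovelace"},
    {"first_name": "Alan", "last_name": "Turing"},
    {"first_name": "Grace", "last_name": "Hopper"},
]

# Hypothetical target data with two affected columns (fname, lname).
target_rows = [
    {"id": 1, "fname": "John", "lname": "Smith"},
    {"id": 2, "fname": "Jane", "lname": "Doe"},
]

# Mapping of target column -> seed file column (an assumed structure).
column_map = {"fname": "first_name", "lname": "last_name"}


def mask_rows(rows, seeds, mapping, rng):
    """Replace the mapped columns of each row with values from ONE seed row."""
    masked = []
    for row in rows:
        seed_row = rng.choice(seeds)  # a single seed row per target row
        new_row = dict(row)           # unmapped columns are left as they are
        for target_col, seed_col in mapping.items():
            new_row[target_col] = seed_row[seed_col]
        masked.append(new_row)
    return masked


masked = mask_rows(target_rows, seed_rows, column_map, random.Random(0))
```

Because every mapped column of a target row is filled from the same seed row, a masked row can contain "Ada Lovelace" but never a mixed pair such as "Ada Turing".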
Character set
When using a seed file, you need to specify in the function which character set the seed file is encoded in. The default is UTF-8, but you can change it to whatever encoding your seed file uses.
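The short Python sketch below shows why the character set setting matters. It is a generic illustration of text decoding, not Privacy's own code, and the sample names and the read_seed_values function are assumptions made for this example.

```python
def read_seed_values(raw_bytes, encoding="utf-8"):
    """Decode seed file bytes with the given character set, one value per line."""
    text = raw_bytes.decode(encoding)
    return [line for line in text.splitlines() if line.strip()]


# Hypothetical seed file saved as Latin-1 (ISO-8859-1) rather than UTF-8.
raw = "Zoë\nJosé\n".encode("latin-1")

# Reading with the encoding the file was saved in gives the intended values.
values_ok = read_seed_values(raw, encoding="latin-1")

# Reading the same bytes with the UTF-8 default fails for this file.
try:
    read_seed_values(raw)
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False
```

If the character set does not match the file, accented or non-Latin characters either fail to decode, as here, or come out garbled.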
Generation
In generation, the functions work the same way as in the masking part of Privacy, but to select them you must first create a generation set to contain the functions.