Amazon Web Services (AWS) offers a wide range of (serverless) cloud databases and cloud data warehouses that can be used to host data. Many of these are directly based on proprietary SQL databases. In general, using these databases in tandem with DATPROF tools is not an issue. However, some key differences exist that may hinder deployments. On this page we go over the notable exceptions to our support, along with additional information that you can use to determine which services to choose.
Relational database service (RDS)
The Relational Database Service is fully supported, as long as the version of the database offered falls under our current support. This includes AWS-specific variants such as Aurora (MySQL) and Aurora (PostgreSQL).
Databases hosted in EC2
We fully support databases directly hosted in EC2 instances, as this is functionally no different from hosting the database on an in-house server. For a list of possible database options please refer to the specific product compatibility pages.
Connecting to AWS Databases
To connect to one of the above-mentioned databases, select the underlying database type (MySQL or PostgreSQL) in the connection editor of Subset or Privacy, or in the environment editor of Runtime, and supply the necessary connection data.
Make sure that your VPC configuration allows connections to the database from the server that hosts Runtime, or from the IP address from which you edit and deploy Privacy / Subset.
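Before configuring a connection, it can help to confirm that the database port is reachable from the machine in question. The following is a minimal sketch, not part of any DATPROF tool: the function name `can_reach` is hypothetical, and it only checks that a TCP connection can be opened. It does not validate credentials or database permissions.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A False result usually points at network configuration (for example
    VPC rules or routing) rather than at the database itself.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket;
        # the context manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Run this from the Runtime server or from the workstation used for Privacy / Subset, passing the database endpoint and port. The default ports are 3306 for MySQL and 5432 for PostgreSQL, unless the instance was configured otherwise.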
A notable exception to what we support is Redshift, AWS' cloud data warehousing solution. Redshift is based on a modified version of PostgreSQL 8.0.2, which adds restrictions on the modification of constraints.
Because DATPROF support for PostgreSQL starts at version 9.6, we currently do not support the deployment of our tools on a Redshift environment. Furthermore, due to the way Redshift's architecture is designed, its databases can struggle with a high number of concurrent modifications to the dataset. Because our applications generate dynamic SQL code to modify an environment, it is likely that deploying our tools would cause significant performance issues.
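The version requirement can be illustrated with a small check against the string returned by `SELECT version()`. This is a sketch under assumptions: the helper names are hypothetical, the 9.6 minimum is taken from the text above, and the sample version strings are illustrative rather than copied from a real environment.

```python
import re
from typing import Tuple

# Minimum PostgreSQL version, as stated in the text above.
MIN_SUPPORTED = (9, 6)


def postgres_version(version_string: str) -> Tuple[int, int]:
    """Extract (major, minor) from the output of SELECT version()."""
    match = re.search(r"PostgreSQL (\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError("not a PostgreSQL version string: %r" % version_string)
    major = int(match.group(1))
    # Treat a missing minor component as 0.
    minor = int(match.group(2) or 0)
    return major, minor


def meets_minimum(version_string: str) -> bool:
    """Return True if the reported version is 9.6 or later."""
    return postgres_version(version_string) >= MIN_SUPPORTED
```

A Redshift cluster reports a PostgreSQL 8.0.2 base version, so `meets_minimum` returns `False` for it, while a version string for PostgreSQL 9.6 or later passes the check. Passing this check only concerns the version number; for full compatibility details, refer to the product compatibility pages.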
Because Redshift loads its data from other sources, a workable alternative is to offload the data used in Redshift onto a supported database, such as an Amazon RDS instance, and execute masking in that environment. Afterwards, the masked data can be loaded back into Redshift.
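One possible way to move the data out of and back into Redshift is through Amazon S3, using Redshift's `UNLOAD` and `COPY` commands. The sketch below only builds the two SQL statements. The function names, bucket, IAM role and table are hypothetical placeholders, and the step of importing the files into RDS and masking them there is not shown.

```python
def build_unload(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build a Redshift UNLOAD statement that writes query results to S3 as CSV."""
    # The query is passed to UNLOAD as a quoted string, so single quotes
    # inside it have to be doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV HEADER;"
    )


def build_copy(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a Redshift COPY statement that loads CSV files from S3 into a table."""
    # IGNOREHEADER 1 skips the header row present in each CSV file.
    return (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    )


# Hypothetical placeholder values, for illustration only.
ROLE = "arn:aws:iam::123456789012:role/ExampleRole"
unload_sql = build_unload(
    "SELECT * FROM customers WHERE country = 'NL'",
    "s3://example-bucket/unload/customers_",
    ROLE,
)
copy_sql = build_copy("customers", "s3://example-bucket/masked/customers_", ROLE)
```

The statements are built as plain strings so they can be reviewed before being run in a SQL client. The inputs are not validated here, so this should only be used with trusted values.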
The databases listed below are currently not supported:
Quantum Ledger Database
Most of these are unsupported because they are NoSQL databases, which our tools currently do not support. The same alternative as described above applies to all of them: unload the data onto a supported database, and mask or subset it there. Certain data, such as JSON or XML files, can also be unloaded and modified using DATPROF File Masking before being reloaded into a NoSQL database instance.
For specific compatibility questions, please contact us.