This screen is the start for any DATPROF Analyze project. It enables you to  select a connection for the database and to import or synchronize metadata. 

The first thing to do is to click the link "Edit Connection" to open the Connection Editor.


Connection Editor

In order to analyze (parts of) a database, a connection to the database is required. This connection is used for importing and synchronizing metadata and will be used to execute the analyze functions on that database.

Datprof Analyze supports three types of databases: Oracle, MS SQL Server and DB2 for Linux, Unix and Windows.

A database connection can be specified, selected and saved in the Connection Editor. The left side of the screen displays the saved connections. The right side of the screen shows the connection details of the connection that is used for the current project, this connection is known as Project Connection. A selected connection becomes the actual Project Connection by pressing the button “Use this connection”.

When defining a connection, the following parameters are required:


Oracle

  • Connection Type: ORACLE.

  • Host: Hostname or IP-address of the database server.

  • Port: TCP Port

  • SID / Service: Unique database identification.

  • Username: The user name to login.

  • Password: The password


Microsoft SQL Server

  • Connection Type: MSSQL.
  • Host: Hostname or IP-address of the database server.
  • Port: Port  of the SQL Server database.
    By default this is 1433. 
    When empty, the default port will be used.
  • Instance: If a named instance is used, fill in this field with the name of the instance. This property is optional, then the default instance on the host is used.
  • Database: Name of the database
  • Use Windows Authentication: Use your windows authentication to log on to the SQL Server database. Using the Windows authentication for both Source and Target connection can cause a “double hop” . 


    When Windows Authentication is unchecked:

  • Username: The user name to login.

  • Password: The password.

DB2 for Linux, Unix and Windows

  • Database type:  IBM DB2 for Linux, Unix and Windows.
  • Host: Hostname or IP-address of the database server.
  • Port: TCP port on which the database server is listening.

Database: Name of the target database.

Test, Save, New

With the button Test,  the selected connection can be tested. This option will connect to the database using the specified connection details. The result is shown to the user.

The Save button ensures that the specified Project Connection is stored in the list of saved connections.

All connections are stored in the file “Analyze_connections.lst” in  “<Windows User>\AppData\Local\DATPROF\”.

The button New will empty the input fields allowing the user to define a new connection.  You can also modify an existing entry and save that. A new entry will be added.

The Save password checkbox ensures that the login information is stored and does not need to be re-entered in next sessions.

(info)
 Whenever a connection is saved in your project, new connections are restricted to the Database type you saved.


Configuration, Profiles and Comment

On the Settings screen you find three other areas: The tab Configuration, the tab Profiles and the Comment field.


Configuration

In Configuration you can set the parameter Parallel Settings. It's default is 8. Lower will slow down the process.

Every function is one process in parallel.

Setting this to 1 realy slows down the process.

Profiles

Profiles are named expressions to identify your data.

After gathering statistics you can execute the Profiling functionality to identify your data in the database by checking the 20 most frequent values per column per table.

After running the profiling functionality you can get an overview of the columns having simular Profiles.

Profiling is usefull to identify the location of your (sensitive) data. It is primarely not meant to identify dataquality issues.


You can create your own profiles. The are saved as a xml file in  APPDATA\local\datprof\<AnalyzeFolder>analyze_profiles.xml 

A profile has a name. This will show up in reports so keep it short and descriptive.

A profile has a type: "Values", "Contains" , "Regex",

  • Values, have a comma seperated list of values, (No spaces around the commas)
  • Contains: A list a values possible in the column.
  • Regex: use a regular expression to identify data matching a certain pattern.  For an introduction about regular expression check: http://www.rexegg.com/

A Profile has a percentage set. If the number of occurences matching the profile matches the percentage it will be recognised as a match. 

The percentage is calculated as a the number of matching occurences compared to the total number of occurences in the top 20 most frequent found values.

Below GENDER is set to a percentage of 100. So if only 1 occurence contains a different value.

 A Profile can have a description.

Comment

Here you can enter your comments to keep track of the work you have done.


Importing Metadata

Using the button Import metadata (or using the menu option Project à Import and synchronize à Import metadata wizard), a wizard for importing meta data of the specified database is started. Importing the metadata through this wizard consists of three steps:


  1. Selecting the schemes and tables;
  2. Selecting the relations (foreign keys);
  3. Importing the specified metadata.

Step 1: Selecting the schemas and tables


During this first step, the user can select the tables to be imported. The screen consists of two parts.

The left part shows the metadata available in the database. In the dropdown box (top left) the scheme containing the tables can be selected. In the list below the user can select one or more tables which will be imported.

The right part shows the selected tables to imported.

With the single arrow, the selected tables are added to the list of tables to import. The double arrow will add all tables from the selected scheme in the list of tables to import.

The Next button moves the wizard to the next step.


Step 2: Selecting relations (foreign keys)

The next dialog allows the user to select the foreign key relations (FK's) to import. This dialog has the same format as the previous dialog. The list of available FK’s are derived from the selected tables. Again, the user can select one or more FK’s and add them to the list FK’s to import.

Normally you will select them all.


The Next button moves to the next step.

Step 3: Importing the specified metadata

The last dialog of the wizard allows the user to actually import the selected metadata (tables and FK's). By clicking the Startbutton, the application will retrieve the selected metadata from the database.

The progress of the import can be monitored with the Progress Bar.

By clicking the Logging button, the application shows the logging during import. This logging also provides input for resolving any errors that occurred during the import.



Synchronizing metadata

The Metadata Synchronization Wizard helps the user to process any changes to the datamodel of the target database, ensuring that anonymization can be executed at the changed datamodel. Synchronization can take place through a DME file or an ODBC database connection (Project Connection).

In the example below we use the ODBC Database connection.

 


The next screen shows a summary of the differences between the metadata from the target database and the metadata already imported into the application.

(info) The database information is leading.


Changes may include tables, columns and Foreign Keys. In time they can be added, changed or deleted.

(warning)The Synchronization wizard does not distinguish between self-defined Foreign Keys and removed Foreign Keys from the database. They are both marked for deletion in the wizard.

By clicking the Next button, a dialog will be opened for each type of change, where the user can decide whether he wants to process the change in his project.

 

These dialogs have the same format as the dialogs for importing metadata.

The last screen allows the user to process the selected changes in his project.

 The progress of the synchronization can be monitored with the Progress Bar.




This screen is the start for any DATPROF Analyze project. It enables you to  select a connection for the database and to import or synchronize metadata. 

The first thing to do is to click the link "Edit Connection" to open the Connection Editor.


Connection Editor

In order to analaze (parts of) a database, a connection to the database is required. This connection is used for the importing and synchronizing the metadata and will be used to execute the analyse functions on that database.

Datprof Analyze supports two types of database: Oracle and MS SQL Server

DB2 for Linux, Unix and Windows

.

A database connection can be specified, selected and saved in the Connection Editor. The left side of the screen displays the saved connections. The right side of the screen shows the connection details of the connection that is used for the current project, this connection is known as Project Connection. A selected connection becomes the actual Project Connection by pressing the button “Use this connection”.

When defining a connection, the following parameters are required:


Oracle

  • Connection Type: ORACLE.

  • Host: Hostname or IP-address of the database server.

  • Port: TCP Port

  • SID / Service: Unique database identification.

  • Username: The user name to login.

  • Password: The password


Microsoft SQL Server

  • Connection Type: MSSQL.
  • Host: Hostname or IP-address of the database server.
  • Port: Port  of the SQL Server database.
    By default this is 1433. 
    When empty, the default port will be used.
  • Instance: If a named instance is used, fill in this field with the name of the instance. This property is optional, then the default instance on the host is used.
  • Database: Name of the database
  • Use Windows Authentication: Use your windows authentication to log on to the SQL Server database. Using the Windows authentication for both Source and Target connection can cause a “double hop” . 


    When Windows Authentication is unchecked:

  • Username: The user name to login.

  • Password: The password.

DB2 for Linux, Unix and Windows

  • Database type:  IBM DB2 for Linux, Unix and Windows.
  • Host: Hostname or IP-address of the database server.
  • Port: TCP port on which the database server is listening.

Database: Name of the target database.

Test, Save, New

The Save password option ensures that the login information is stored and does not need to be re-entered in next sessions.With the button Test,  the selected connection can be tested. This option will connect to the database using the specified connection details. The result is shown to the user.

The Save button ensures that the specified Project Connection is stored in the list of saved connections.

All connections are stored in the file “Analyze_connections.lst” in  “<Windows User>\AppData\Local\DATPROF\”.

The button New will empty the input fields allowing the user to define a new connection.  You can also modify an existing entry and save that. A new entry will be added.

(info)
 Whenever a connection is saved in your project, new connections are restricted to the Database type you saved.


Configuration, Profiles and Comment

On the Settings screen you find three other areas: The tab Configuration, the tab Profiles and the Comment field.


Configuration

In Configuration you can set the parameter Parallel Settings. It's default is 8. Lower will slow donw the process.

Every function is one process in parallel.

Setting this to 1 realy slows down the process.

Profiles

Profiles are named expressions to identify your data.

After gathering statistics you can execute the Profiling functionality to identify your data in the database by checking the 20 most frequent values per column per table.

After running the profiling functionality you can get an overview of the columns having simular Profiles.

Profiling is usefull to identify the location of your (sensitive) data. It is primarely not meant to identify dataquality issues.


You can create your own profiles. The are saved as a xml file in  APPDATA\local\datprof\<AnalyzeFolder>analyze_profiles.xml 

A profile has a name. This will show up in reports so keep it short and descriptive.

A profile has a type: "Values", "Contains" , "Regex",

  • Values, have a comma seperated list of values, (No spaces around the commas)
  • Contains: A list a values possible in the column.
  • Regex: use a regular expression to identify data matching a certain pattern.  For an introduction about regular expression check: http://www.rexegg.com/

A Profile has a percentage set. If the number of occurences matching the profile matches the percentage it will be recognised as a match. 

The percentage is calculated as a the number of matching occurences compared to the total number of occurences in the top 20 most frequent found values.

 Below GENDER is set to a percentage of 100. So if only 1 occurence contains a different value.

A Profile can have a description.

Comment

Here you can enter your comments to keep track of the work you have done.


Importing Metadata

Using the button Import metadata (or using the menu option Project à Import and synchronize à Import metadata wizard), a wizard for importing meta data of the specified database is started. Importing the metadata through this wizard consists of three steps:


  1. Selecting the schemes and tables;
  2. Selecting the relations (foreign keys);
  3. Importing the specified metadata.

Step 1: Selecting the schemas and tables


During this first step, the user can select the tables to be imported. The screen consists of two parts.

The left part shows the metadata available in the database. In the dropdown box (top left) the scheme containing the tables can be selected. In the list below the user can select one or more tables which will be imported.

The right part shows the selected tables to imported.

With the single arrow, the selected tables are added to the list of tables to import. The double arrow will add all tables from the selected scheme in the list of tables to import.

The Next button moves the wizard to the next step.


Step 2: Selecting relations (foreign keys)

The next dialog allows the user to select the foreign key relations (FK's) to import. This dialog has the same format as the previous dialog. The list of available FK’s are derived from the selected tables. Again, the user can select one or more FK’s and add them to the list FK’s to import.

Normally you will select them all.


The Next button moves to the next step.

Step 3: Importing the specified metadata

The last dialog of the wizard allows the user to actually import the selected metadata (tables and FK's). By clicking the Startbutton, the application will retrieve the selected metadata from the database.

The progress of the import can be monitored with the Progress Bar.

By clicking the Logging button, the application shows the logging during import. This logging also provides input for resolving any errors that occurred during the import.



Synchronizing metadata

The Metadata Synchronization Wizard helps the user to process any changes to the datamodel of the target database, ensuring that anonymization can be executed at the changed datamodel. Synchronization can take place through a DME file or an ODBC database connection (Project Connection).

In the example below we use the ODBC Database connection.

 


The next screen shows a summary of the differences between the metadata from the target database and the metadata already imported into the application.

(info) The database information is leading.


Changes may include tables, columns and Foreign Keys. In time they can be added, changed or deleted.

(warning)The Synchronization wizard does not distinguish between self-defined Foreign Keys and removed Foreign Keys from the database. They are both marked for deletion in the wizard.

By clicking the Next button, a dialog will be opened for each type of change, where the user can decide whether he wants to process the change in his project.

 

These dialogs have the same format as the dialogs for importing metadata.

The last screen allows the user to process the selected changes in his project.

 The progress of the synchronization can be monitored with the Progress Bar.