In this part of the application the user can specify the analyze funtions on the columns of the imported metadata and Analyze the data in the database using these functions.

At the left side of the window you see the imported tables or  views. The right part shows the details and statistics of this table or view.

To analyze one or more tables or views you should press the "Run Statistics" button below on the right side of the window.

There are three icons giving you 3 options.

From left to right:

  • Run Statistics
    : The Analyse run will start.
  • Pause Statistics
    : The run will pause, a pop-up will appear. To continue press "OK".
  • Abort Statistics
    : The analyse run will abort.

To run statistics a connection to the database is required.

(info) After running Statistics it is recommended to save the results. (File→ Save or CTRL-S)


Tables and views at the left side

At the left side of the screen the imported tables and views are shown. 

From top to bottom you see:

  • A search box. This enables you to filter the tables or views in this screen. You can use % as a wildcard.
  • A dropdown with the available schemas. This only applies if more then one schema is imported.
  • The Tables and views.  They can be identified by the icon. 
     for tables, 
     for views.

You can select one or more rows.

Use the right mouse button to

  • Count rows. The amount will appear right from the table or view name.
  • Clear statistics: Clear statistics , including count, from the selected tables or views.
  • Clear profiles: Clear the matrching profiles for the selected tables or views.



Statistics at the right side

The right part of the screen contains 3 Tabs.  

Tab Statistics

This will give statistics date for the tables or views. Only selected tables or views are analysed.

Some of the columns provide characterstics of the field, they are availble without running the anlayse. They are based upon the import.

After an analyze run the other columns are filled.

Statistics are gathered on the data in the columns.

  • # Nulls: The amount of NULL values in this column.
  • # Distinct: The number of unique values in this column
  • Min: Show the lowest value in this column ("A" is before "a", "-1" is before "1")
  • Max: Show the highest value in this column.
  • # Shortest: Show the shortest value in this column. For numerics also the shortest length is given where the the value 11 has the length 2 and the value -2 also has the lentgth 2.
  • # Longest: Show the longest value in this column. For numerics also the shortest length is given where the the value 11 has the length 2 and the value -2 also has the lentgth 2.
  • Least frequent: Show the least occuring value in this column.
  • Most frequent: Show the most frequent occuring value in this column.

Filtering

If you select a table and simply press "Run Statsitics" the analyze is run on the content of the whole table.

To obtain statistics on a selction of your data a filter can be given.

In the center part of the window click the "Filter" tab and a textbox will open. In this textbox a SQL "where" clause can be given.

  • Running Statistics using a filter will analyze only the filtered data.
  • The resulting amount is shown in the Summary below after "Data filing".

  • A filter can be given for every table or view.
  • Filters are saved in the project
  • (info) Filters are NOT removed when clearing Statistic data.
  • (info) The Number of records in the left panel still shows the total number of unfiltered records in this table.

Tab Summary

The summary tab shows one or twe diagrams. The data filling diagram shows the filing of this column. The other diagram is only visible for Numeric columns. It shows the number of positive, negative and "0" records in this column.

Below the diagram part you find three blocks of information: Ordering, Length, Frequency

For all three blocks yoy see the most extreme results. The other tabs focusses on these results.

Below an example for an alphanumeric and a numeric colum , both with the filter on LAST_NAME (like 'A%').


Tab Ordering


It shows two panels. One for the Minimum values and one for the maximum values. It shows the value (from low to high and vice versa), the length and the number of occurences.

Doubleclick on the record to show the found record(s). You can adjust this filter and run the query if you like.


Tab Length

This shows two panels. One for the shortest and one for the longest values. It shows also the number of occurences.

Doubleclick on the record to show the found record(s). You can adjust this filter and run the query if you like.

Tab Frequency

(The example below is taken from the last_name column to show some variation (smile))

This shows two panels. One for the least frequent and one for the most frequent found values. The values and the number of occurences are shown.

Doubleclick on the record to show the found record(s). You can adjust this filter and run the query if you like.


Tab Profiling

This screen Shows per table the Profiles matching the data.

In the example below the column EMAIL and GENDER have matching Profiles. In this case it is easy beacuse the clumn names are descriptive.

Having columns named ZGKLZ125 and ZGKLA124 makes the use of profile rather helpful.



Tab Data

This screen gives a SQL window where you can provide a "where" clause to filter the data in the selected table. It will show the results in a grid. You can sort the grid by clicking the headers.

(info) Only 100 records are retrieved by default. Since all data is transferred to the client rasing this value may result in memory issues