Choose a start table

When developing a subset template, one of the first things we have to think about is the start table and a start filter. The start table is our entry point in a complex data model. This table and the start filter greatly determine how large our subset will be. Each subset template will have only one start table. From this start table it will filter other data that is direct or indirect related to the starting table. When asking testers of developers which data they need, they will probably say something like: "Customers that have these and these properties" or "All contracts that have these attributes.". In most cases this will be a good indication for a good start table and filter.

Now let's select a start table for our first subset template

Open the Start table drop down list
Choose the PRODUCTION.CUSTOMERS value (CUSTOMERS table in the PRODUCTION schema

Great, now lets move on the start filter option.

Start filter

The following part of the training is bit technical, but really important to understand. The start filter is a really flexible approach to make any selection from your database.

The start filter is a part of a SQL query that filter out data in the selected start table. This filter will be appended to a generated query like this:

CODE

select customer_id, logical_key,title,... from PRODUCTION.CUSTOMERS <-- The start filter will be appended here.

Let's say we want to select all customers where the last start begins with the letter 'B', we can enter the following start filter:

CODE

where last_name like 'B%'

When the template is executed against the database, DATPROF Subset will append the start filter to the generated query like this:

CODE

select customer_id, logical_key,title,... from PRODUCTION.CUSTOMERS where last_name like 'B%'

You will understand that the flexible SQL start filter will accept any sorts of filters or a combination of those. Here are some examples.

Start filter examples

List of specific customers

CODE

where customer_id in (93601,93602,93603,93604,93605,93606,93607,93608,93609)

Looking in a driving table

CODE

where customer_id in (select relation_id from schema.test_data_driving_table)

Joining other tables

CODE

join PRODUCTION.ORDERS@TEST_LINK ord
on customers.customer_id = ord.customer_id
where ord.plan_date > to_date('2005-03-31','YYYY-MM-DD')

In the previous example, you may have noticed that for the ORDERS table, we specify the schema and database link. This is really important because all queries are executed on the target database! Without the schema and database link, the query will look for ORDERS data in the target database instead of the source database! So make sure that when joining tables, the right schema prefix and database link are used!

For our first subset template we will use a simple start filter.

Enter the following start filter and click Count start filter
CODE
```
where last_name like 'B%'
```
After counting the amount of rows, you will get a popup with the amount of rows and percentage of the original table. This will give a bit of an indication how large your subset will be.

If you got stuck and watch a video how to select a start table and enter a start filter. Watch the following video clip. Make sure to make it full screen for a better viewing experience.