How-To: Display Null or Missing Values

How the Platform Works

The GoodData platform is designed to perform inner joins between datasets for metric/measure calculation. Here’s an example of what this looks like:

Say we have the following datasets and fields in the LDM:

LDM Example

Test Dimension Data Values:

Test Dimension Data Values

Test Score Data Values:

Test Score Data Values

If we wanted to view all SUM(Score) by Test Type, only Test Types with scores will appear:

SUM(Score) by Test Type

If you want to view all Test Types, regardless of whether they have an associated record in the Test Score table, there are two ways to display these values: apply a metric definition change for a single dimension you want to join against or generate the missing rows in the ETL.

Option 1: Metric change for a single dimension

If there is a single dimension you would like to compare the measure against (i.e., Test Dimension), you could modify the measure to simulate a LEFT OUTER JOIN from that dimension to the fact.

Previous Measure:

SELECT SUM(Score)

Modified Measure:

SELECT IFNULL(SUM(Score), 0) + 0
Modified Measure Example

The missing records will be populated with a non-null measure value.

Option 2: Generating missing rows in ETL

If there are multiple dimensions you would like to compare the measure against in the same report/insight, you have to populate the missing records in the ETL. The cartesian product of all of the dimensions you would like to slice and dice by will need to be loaded into the fact table. New data volume in the fact table with this missing row generation will increase based on the number of attributes you would like to slice and dice by, the cardinality of each of these attributes, and the original data volume of the fact table.

The updated Test Score fact table in our example will be:

Updated Test Score Fact Table

Using the original measure of SUM(Score), the report of this measure by Test Type would be the following:

Test Type SUM(Score)
System 180
Regression -

As another example, let’s add another dimension to the LDM:

Additional Dimension in LDM

Say the Test Score data values are as follows, you would like to report on every combination of Test Type and Test Taken Month/Year Date for Q1 of 2019 (Jan, Feb, Mar 2019).

Test Score Data Values

Because there are multiple dimensions you would like to compare the scores against (Test Type and Test Taken Date), you will need to populate the missing data in the backend. The updated Test Score data values will now be:

Updated Test Score Values

Now, the report of SUM(Score) by Test Type, Test Taken Date where Test Taken Date = Q1 2019 would be the following:

Test Type Test Taken Date SUM(Score)
System 1/1/2019 80
System 2/1/2019 100
System 3/1/2019 -
Regression 1/1/2019 -
Regression 2/1/2019 -
Regression 3/1/2019 -