Benford’s Law compares the frequency distribution of leading digits to its (empirically proven) natural counterpart. This can then be used to detect fraud and errors.

Dynamic Benford's Law measures in Power BI and Power Pivot

The green columns show how often each number should be the first digit in numbers that should follow the Benford-distribution. In black you’ll see the actual distribution of first digits within my table. Lastly, the red line shows the percental absolute deviations between actual and Benford values.

In this example, there is a relatively high occurrence of numbers starting with 4 and 5. So this could be a sign for fraudulent manipulations.

The Benford Distribution

First you need a table with the Benford-distribution. Just load it as a disconnected table to your data model and name it “BenfordTable”. The “Value”-field from this table will be taken as x-axis for the visualisations. As the Benford-distribution is logarithmic, it can quickly be created with the following DAX-code:

DAX
BenfordTable = 
ADDCOLUMNS (
    GENERATESERIES ( 1, 9 ),
    "BenfordDistribution", LOG10 ( 1 + ( 1 / [Value] ) )
)

Benford’s Law Measure

The measure calculates how often a number starts with one of the BenfordNumbers (1..9) compared to the total number of rows in the FactTable.

DAX
BenfordMeasure =
VAR BenfordNumber = MAX ( BenfordTable[Value] )
VAR CountBenfordNumber =
    CALCULATE (
        COUNTROWS ( FactTable ),
        LEFT ( FactTable[Value], 1 ) * 1 = BenfordNumber
    )
VAR CountTotal = COUNTROWS ( FactTable )
RETURN DIVIDE ( CountBenfordNumber, CountTotal )

If there are blank in the Value-column of the table to be analyzed, the measure has to be adjusted by filtering them out in the VAR CountTotal: ( CALCULATE(COUNTROWS(FactTable), FactTable[Value] <> BLANK()) )

Please note that you can create as many measures as you need in one model. So if you have multiple columns to investigate, just write a measure for each.

Benford’s Law Variance Measure

To calculate the difference between target and actual, I use a MAXX-aggregation. This returns the maximum difference there is for a number. I also use this in a card visual if I want to add a data driven alert. So I don’t have to check and eyeball the chart regularly, but can just use this in a card visual in a dashboard. Then I’ll set a threshold value for the alert and will not miss any alarming developments.

DAX
Deviation % = MAXX (
    ADDCOLUMNS (
        BenfordTable,
        "Diff", ABS ( [BenfordMeasure] - BenfordTable[BenfordDistribution] ) / BenfordTable[BenfordDistribution]
    ),
    [Diff]
)

Why DAX?

I’m using measures here instead of a calculated column (in the Benford-Table) because this allows me to filter and slice my table. This allows for making advanced and flexible analysis like comparing different values against each other or over time against the Benford distribution:

Dynamic Benford's Law measures in Power BI and Power Pivot

  • No labels