tl;dr – iBeta plays a critical role in baseline product assessment, but DHS RIVR has emerged as the gold-standard buying guide for PAD / liveness technology in remote identity validation use cases.

The Department of Homeland Security Science & Technology Directorate’s recent Remote Identity Validation Rally (RIVR) test is a big deal. It is the first open, head-to-head test of the performance of leading face matching, face liveness / presentation attack detection (“PAD”), and document verification technology providers in remote identity validation (“IDV”).

Across these tests, face matching and document verification are fairly easy to contextualize. In short: 

  • Face matching: Very similar to NIST FRTE tests, but based on different data sources: RIVR assesses selfies and photos of driver’s licenses (the common basis for identity assertion in remote, digital IDV), as compared to NIST FRTE’s visa, border crossing, mugshot, and webcam data. 
  • Document verification: Simply put, there was no open or closed assessment of document verification technologies prior to RIVR. There is nothing to compare it against, because this is the first time these technologies have been formally assessed. 

But PAD is a bit more difficult to contextualize, as RIVR is characteristically different from iBeta testing, which has been broadly used for years as the “stamp of approval” for biometric presentation attack detection technologies. To summarize: iBeta determines that technologies meet a certain set of thresholds across a certain set of tests, whereas RIVR PAD assesses performance across both a set of thresholds as well as on a relative basis between providers. Here, we will dive deep into this difference. 

A metaphor: iBeta is to DHS RIVR PAD what UL (Underwriters Laboratories) is to Consumer Reports

When you buy a computer, smartphone, or other piece of electronics, it will always come with a certification from UL, TÜV, or another relevant test group, performed in accordance with a range of standards and regulatory requirements, such as those from ISO and the FCC, along with markings like CE. These certifications are critical: they make sure that products function to a certain level, especially in terms of safety and reliability, and allow you to purchase with confidence. From the perspective of hardware manufacturers, these tests can be frustrating and costly, but they also often lead to a better product in the end. 

While the UL mark is critical, it doesn’t give consumers an understanding of relative performance. The UL mark doesn’t show which computer will process graphics faster, which smartphone will take better pictures, or what stereo will deliver better sound quality. In other words, UL is necessary, but it doesn’t inform consumers on what product they should buy. That’s the purview of Consumer Reports (or Wirecutter, or your favorite technology website or magazine). 

This is an appropriate way to look at iBeta as it relates to biometric presentation attack detection. iBeta helps purchasers confirm that technologies have met certain thresholds, but it is not a buying guide. We now have that for PAD, and it is DHS RIVR. 

What iBeta tests, and why it matters

So what exactly does iBeta do? It is an NVLAP-accredited test group that performs a standard set of tests in accordance with the ISO/IEC 30107-3 standard for presentation attack detection testing. Notably, ISO 30107 does not set definitions for different attack levels or thresholds for acceptability. iBeta defines these itself, in what has become a de facto standard over the course of time and many vendor tests. 

You can go to iBeta’s website and see a list of vendors who have passed their thresholds for acceptance on what they have determined to be Level 1, Level 2, and Level 3 presentation attacks. This is useful: it sets a trusted baseline for tested products. It also has its limits: 

  1. The public reporting from iBeta only confirms that vendors have passed its thresholds, which are fairly tight for accepting attacks (reported as APCER, the rate at which presentation attacks are wrongly accepted, with the threshold set to 0%) and fairly loose for rejecting genuine users (reported as BPCER, the rate at which bona fide presentations are wrongly rejected, with the threshold often set to 15%). So it is up to vendors to report their BPCER, and many do not.
  2. The test does not push vendors to extremes: Paravision Liveness, for instance, demonstrated a 0% APCER and 0% BPCER on iBeta Level 2. Tests that push products hard (i.e., to non-zero results) have value to vendors and customers alike, in terms of both differentiation and opportunities for improvement. 
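To make the two metrics above concrete, here is a minimal sketch of how APCER and BPCER are computed and checked against iBeta-style thresholds. The function names, sample counts, and outcomes are illustrative, not drawn from any actual test report; the 0% / 15% thresholds mirror those described above.

```python
def apcer(attack_results):
    """Attack Presentation Classification Error Rate:
    fraction of attack presentations wrongly accepted as bona fide."""
    return sum(1 for accepted in attack_results if accepted) / len(attack_results)

def bpcer(bona_fide_results):
    """Bona fide Presentation Classification Error Rate:
    fraction of genuine presentations wrongly rejected as attacks."""
    return sum(1 for accepted in bona_fide_results if not accepted) / len(bona_fide_results)

# True = the system accepted the presentation as live / bona fide.
attacks = [False] * 300                  # 300 attack attempts, all correctly rejected
bona_fide = [True] * 95 + [False] * 5    # 100 genuine users, 5 wrongly rejected

print(f"APCER: {apcer(attacks):.1%}")    # 0.0%
print(f"BPCER: {bpcer(bona_fide):.1%}")  # 5.0%

# iBeta-style pass/fail check against the thresholds described above.
passes = apcer(attacks) <= 0.0 and bpcer(bona_fide) <= 0.15
print("Meets iBeta-style thresholds:", passes)
```

Note that a vendor with a BPCER anywhere up to 15% would pass just the same, which is exactly why the pass/fail result alone says little about the genuine-user experience.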

This is not to find fault with iBeta, but to contextualize its value. Again, the goal of UL is not to push products to extremes, but to give customers a baseline of confidence. iBeta is funded by vendors, just like UL. This doesn’t create a conflict of interest; someone has to pay for the testing.

It should also be noted that iBeta tests across use cases and modalities, covering embedded technologies (like camera modules) and SDKs alike, across fingerprint, face, and beyond. Once again, this is similar to UL, which tests a vast range of consumer hardware products rather than offering a focused, relative assessment. 

Why DHS RIVR PAD is different 

While iBeta tests vendors on a case-by-case basis, DHS RIVR PAD is a large-scale evaluation across a range of leading vendors following an application and selection process. This government-funded effort is a true asset to the biometric identity industry, which is otherwise too small to (for instance) support a subscription- or ad-supported test like Consumer Reports or Wirecutter. 

DHS RIVR PAD reports using the same core metrics as iBeta (i.e., APCER, BPCER) as well as a few additional metrics (e.g., test speed, user sentiment) and sets its own thresholds for success. Most critically, it reports not only whether vendors met those thresholds, but also on their specific quantitative performance. 
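The difference between the two reporting styles can be sketched as follows. The vendor aliases, metric values, and thresholds below are invented for illustration; they do not reflect any actual RIVR or iBeta results.

```python
# Hypothetical results for three aliased vendors (APCER / BPCER as fractions).
vendors = {
    "Alias-A": {"apcer": 0.00, "bpcer": 0.03},
    "Alias-B": {"apcer": 0.00, "bpcer": 0.09},
    "Alias-C": {"apcer": 0.02, "bpcer": 0.01},
}

APCER_MAX, BPCER_MAX = 0.0, 0.15  # illustrative thresholds

# Pass/fail view (iBeta-style): only reveals who cleared the bar.
passed = [name for name, m in vendors.items()
          if m["apcer"] <= APCER_MAX and m["bpcer"] <= BPCER_MAX]

# Relative view (RIVR-style): ranks everyone by quantitative performance.
ranked = sorted(vendors.items(), key=lambda kv: (kv[1]["apcer"], kv[1]["bpcer"]))

print("Passed thresholds:", passed)
print("Relative ranking:", [name for name, _ in ranked])
```

In the pass/fail view, Alias-A and Alias-B are indistinguishable; only the relative view shows that one rejects genuine users three times as often as the other. That gap between the two views is precisely what a buying guide adds over a certification.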

Because of its construction, DHS RIVR reports on technology providers by alias, and vendors can then choose to disclose their results (as Paravision has done). If results are not publicly disclosed, customers should ask their vendor whether they submitted to DHS RIVR PAD and, if so, under which alias or aliases (for vendors with multiple submissions) they appear in DHS’s evaluation report. 

In conclusion 

iBeta (not to mention other test groups like Ingenium and BixeLab) fulfills an important role in understanding the baseline functional capabilities of biometric presentation attack detection. But it is not a relative buying guide, any more than UL is. DHS RIVR PAD is the first of its kind, and it is now the gold-standard buying guide for integrators of PAD technology in remote IDV.