During World War II, Abraham Wald, a US statistician tasked with improving aircraft safety, proposed a paradoxical solution to boost survivor rates. Unlike fellow researchers who had decided to shore up the security of areas of the aircraft that were often found damaged upon return, Wald proposed they reinforce areas that remained in perfect condition post-mission. He argued that the planes that returned were selected for; particular areas on these planes remained unscathed as any damage to these areas would cause the plane to crash. Damaged areas, however, could suffer impairment without a total loss of aircraft functionality. In what became known as survivorship bias, Wald invoked a common concept underlying all forms of data analysis and interpretation: data has a genealogy and is affected by externalities during the collection process. You can never truly divorce data from its context.

Healthcare claims are an exceptionally biased source of clinical data for this reason. Providers are incentivized to bill for services that will pass adjudication and can record information with this in mind. This oftentimes leads to various forms of fraud, waste, and abuse. If a member is ineligible for a particular procedure that the doctor deems necessary, the provider may change the diagnosis or procedure codes billed to create a covered claim. These data survive the clinical encounter despite being an inaccurate reflection of the visit and provided service. While checking for coherence between all available information in the patient’s health history, encompassing all their claims, can expose some inconsistencies, claims data is ultimately curated by the provider-patient relationship.

Bringing in a variety of other data sources for a more holistic view can help to address this problem and catch bad actors. Here at Alivia, we can use medical records to verify claims, even despite the immense scale of such records. The greater the contextual view, the more comprehensive the data analysis. Wald was lucky enough to have a known, limited set of areas to potentially reinforce on US aircraft. With fraud, however, we only know about and can address the schemes that have been previously documented. In data science, we term these at present undiscovered schemes unknown unknowns. Thankfully, new data-rich APIs are rolled out routinely nowadays, and the current IoT landscape is burgeoning with more and more devices with live data streams. Alivia’s real-time processing can use these new sources to probe into the great unknown, deciphering the truth within your medical claims.