The challenge of non-events
It is relatively easy to analyze healthcare data for activities that have occurred, but difficult to confirm when an activity has not. If a patient is diagnosed with a disease, we know about it because it is indicated in their medical record. However, a medical record without a diagnosis does not guarantee that the patient is free of the disease. Many of the activities that are important to researchers relate to things that didn’t occur, or are at least closely related to the “dark matter” in the data. A vast amount of interesting information is hidden in the murky area of missing data.
A recent example involved the analysis of a population of diabetes patients. We segmented the data set into six-month periods to identify patients who did or did not have an HbA1c test in each period. The algorithm identified a surprisingly large population of non-performers in the 2008 data, approximately three times larger than expected. The query attempted to pull all of the patients in the database who met the criteria for our definition of diabetes. For simplicity’s sake, this definition meant that the patient had been diagnosed at some point and was currently alive. The definition was flawed: having data about a diagnosis did not guarantee that we also had the data needed to confirm the non-event of missing an HbA1c test within the last six months.
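To make the segmentation concrete, here is a minimal sketch of how non-performers might be flagged per six-month period. It is not the query we ran; the patient IDs, the test dates, the calendar half-year boundaries, and the function names are all assumptions for illustration.

```python
from datetime import date

def half_year_index(d):
    """Map a date to (year, half): half is 1 for Jan-Jun, 2 for Jul-Dec (assumed boundaries)."""
    return (d.year, 1 if d.month <= 6 else 2)

def non_performers(cohort, hba1c_tests, period):
    """Return cohort patient IDs with no HbA1c test recorded in the given period."""
    tested = {pid for pid, d in hba1c_tests if half_year_index(d) == period}
    return sorted(set(cohort) - tested)

# Hypothetical cohort: everyone who meets the naive "diagnosed and alive" definition.
cohort = ["p1", "p2", "p3"]

# Hypothetical HbA1c facts as (patient_id, test_date) pairs.
hba1c_tests = [
    ("p1", date(2008, 2, 14)),
    ("p2", date(2008, 9, 3)),
    ("p1", date(2008, 11, 20)),
]

first_half = non_performers(cohort, hba1c_tests, (2008, 1))
second_half = non_performers(cohort, hba1c_tests, (2008, 2))
print(first_half, second_half)
```

Note that the hypothetical patient p3 is flagged in both periods even though the data contains nothing about p3 at all. The sketch cannot tell a skipped test from a patient who simply left the system, which is exactly the flaw in the definition.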
The first problem we identified was that a substantial number of patients in the population of about 100,000 had stopped receiving care in the health system at some point after their diagnosis during the 15-year period covered by the data. This meant that the non-event of skipping an HbA1c test required validation against a second non-event of not receiving any care at all. It was relatively easy to resolve this problem by eliminating patients from the query who had no facts in the data warehouse for the previous five years. Patients with a chronic disease should have had healthcare visits within this time frame; however, the choice of five years was arbitrary. Had we instead excluded patients without a visit during just the previous year, our results might have been suspect.
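The activity filter can be sketched roughly as follows. This is an illustration under assumptions, not our warehouse query: the mapping of patient to most recent fact date, the report date, and the helper names are hypothetical. Only the five-year and one-year windows come from the discussion above.

```python
from datetime import date

def years_before(d, years):
    """Return the date `years` earlier, falling back to Feb 28 for a Feb 29 start."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)

def active_patients(last_fact_date, as_of, window_years=5):
    """Keep patients whose most recent fact of any kind falls inside the window."""
    cutoff = years_before(as_of, window_years)
    return sorted(pid for pid, d in last_fact_date.items() if d >= cutoff)

# Hypothetical most-recent fact (visit, lab, order, etc.) per diagnosed patient.
last_fact_date = {
    "p1": date(2008, 11, 20),
    "p2": date(2005, 3, 1),
    "p3": date(2001, 6, 15),
}
as_of = date(2008, 12, 31)

five_year = active_patients(last_fact_date, as_of, window_years=5)
one_year = active_patients(last_fact_date, as_of, window_years=1)
print(five_year, one_year)
```

Changing the window from five years to one drops the hypothetical patient p2 from the denominator, which shows how sensitive the non-performer count is to an arbitrary choice.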
One consideration was to analyze enrollment data from insurance payer files. However, that data wasn’t readily accessible from the EMR, and it was incomplete because it omitted patients without coverage from major commercial payers. The payer files would have had PCP data, but likely nothing about patients who stopped receiving care from one health system while having a PCP in another.
It was also a significant challenge to capture patients who came in and out of the system only when they were extremely sick. These patients deemed themselves healthy enough to go without a doctor’s visit or a lab test for a couple of years, then suddenly showed up for one and disappeared again.
Another non-event occurred among patients who frequently relocated. There was a field in the data warehouse to track each address in a patient’s history. However, it came from EMPI registration data, and most patients did not update their registration, especially when departing the system.
Despite having a field for vital status in each record, we could rarely verify with certainty whether a patient was alive. It is a morbid thought, but at what visit to the physician would a patient report that they had passed? Death was consistently verifiable only when it occurred in a hospital or when information was received from Social Security records. Inevitably, some families had a financial motive to conceal a death, because Social Security checks arrived by mail for living elderly citizens with limited assets.
Regardless of the challenges, these were the patients who needed to be identified for important testing and monitoring of their chronic disease state. How were we to know who was a patient when they did not consistently appear in the health system as facts? We had to identify patterns among both the visits that occurred and the non-events. These were linked pieces of information operating in opposite worlds: a lack of visits could be dispelled by a single visit. The patterns were also not universal for interpreting quality. Patients of different age, gender, health, race, and economic background could be expected to have very different levels of activity in the system, so no single rule could apply to all patients and all measures of performance.
Clinical intelligence work often deals with unknown non-events that are never registered as facts in data sets. This creates doubt about the quality of the analyses and recommendations.
These are just a few of the many reasons for chart abstraction in core measures. When structured medical record fields lack the event data needed to comply with a measure, that does not guarantee that the event never occurred, nor does it eliminate the possibility that it was documented in a clinical note. The only way to eliminate doubt from a small set of cases is to analyze the full chart of each patient according to CMS guidelines.
I encounter instances of non-events on a daily basis. In many cases, they are a function of whether or not we have up-to-date data. It can often take up to 30 days to receive information from a source system through a monthly load. Many source systems aren’t capable of providing instant data feeds for analysis, or it is cost-prohibitive to do so. This requires us to run reports against information that is near real-time, meaning it is not up-to-the-minute but is still recent enough to be relevant. An example would be determining whether or not a patient received a lab test that was ordered two months ago. We are unable to provide 100% confidence for non-events without up-to-the-minute data. However, if we have a record of a test being performed, we are 100% certain that it occurred. In many instances, we need to account for latency in the reporting and provide tolerances for the lack of information. Grace periods are required to account for these uncertainties.
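One way to picture a grace period is as a three-way classification rather than a yes/no answer. The sketch below is an assumption-laden illustration: the 30-day load lag comes from the monthly load described above, while the 14-day window for completing an ordered test, the status labels, and the function name are hypothetical.

```python
from datetime import date, timedelta

def classify_order(order_date, result_date, as_of,
                   expected_days=14, load_lag_days=30):
    """Classify a lab order as performed, unknown, or likely not performed.

    expected_days is an assumed window for the patient to complete the test;
    load_lag_days is the worst-case delay before a result reaches the warehouse.
    """
    if result_date is not None:
        return "performed"  # a recorded event is certain
    # Earliest date by which a completed test would be visible in the warehouse.
    deadline = order_date + timedelta(days=expected_days + load_lag_days)
    if as_of < deadline:
        return "unknown"  # still inside the grace period
    return "likely_not_performed"

as_of = date(2009, 3, 5)
done = classify_order(date(2009, 1, 5), date(2009, 1, 12), as_of)
overdue = classify_order(date(2009, 1, 5), None, as_of)
too_soon = classify_order(date(2009, 2, 10), None, as_of)
print(done, overdue, too_soon)
```

Even the last label is deliberately hedged as "likely": the grace period reduces false non-events caused by reporting latency, but it cannot rule out a test performed outside the system.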
While it is challenging enough to access the data we have, it is critical to think carefully about the data we don’t have and the impact its absence has on our results. Maybe we need more data? I certainly find it odd that in a world where many people tweet ten times a day, we have tremendous uncertainty about individual health.
Dan Housman
Managing Director, Analytical Applications
Labels: Clinical Intelligence




