The Divurgent Team worked diligently at HIMSS’14 to crack the code on the reality of healthcare “big data” and the vendors in the data management space. It wasn’t easy – most readers will agree there is much confusion and lots of players in this space. The barrier to entry can be low: produce a data model, have a story about populating the database, have some BI tools to visualize it – and you’re in. Nonetheless, the ones we talked to were real indeed, and working hard to make a difference. The market agrees (confusion or not), and is buying at a rapid pace.
First, a few words about the term “big data”: Wikipedia defines “Big Data” as data sets so large that conventional tools can no longer be employed. Not true among provider organizations of course – the data sets are big, but not that big. The term is usually applied to the need to aggregate and make sense of large amounts of disparate data (clinical, costs, claims, genomics, etc.).
Reflecting on our review, it occurred to me that much has not changed since I built my first data warehouse with one of the Big Six Firms back in the 80’s. There is a certain set of requirements that are just as true today as they were back in the days of COBOL and RPGII, including:
- A great data model;
- Data Mapping (data element by data element) from transactional systems to your data repository;
- Populating the model (or more precisely, the physical schema);
- Normalizing disparate data that use non-standard coding schemes, etc.;
- Educating users on the precise definitions of the data (i.e., achieving data literacy);
- A great visualization tool to deliver value from what can be an enormous investment (this capability was not around in the 80’s, of course).
By the way, the customer (or the professional services firm the customer hires) is responsible for items 2 – 6. The vendors do have great data models, and many have adapters (out-of-the-box ETL for the major EHRs and other popular vendor systems). Many have drag-and-drop capabilities to help you with 2 – 6. Many have professional services or partners that can help. And many have experience with vendors like Epic, Cerner, Meditech, Lawson, Kronos, etc. But alas, the heavy lifting is mostly done by the customer.
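Items 2 through 4 in the list above (mapping, populating, normalizing) are where most of that heavy lifting sits. As a rough illustration only, the sketch below maps fields from two source systems into one target layout and translates local codes into a shared vocabulary. Every system name, field name, and code value here is hypothetical and not taken from any vendor's product.

```python
# Hypothetical element-by-element map:
# (source system, source field) -> warehouse column.
FIELD_MAP = {
    ("ehr", "PAT_ID"): "patient_id",
    ("ehr", "SEX_C"): "sex",
    ("billing", "acct_patient"): "patient_id",
    ("billing", "gender"): "sex",
}

# Hypothetical local coding schemes, translated to one shared vocabulary.
SEX_CODES = {
    "ehr": {"1": "F", "2": "M"},
    "billing": {"female": "F", "male": "M"},
}


def map_record(source: str, record: dict) -> dict:
    """Rename mapped fields and normalize coded values for one source record."""
    row = {}
    for field, value in record.items():
        target = FIELD_MAP.get((source, field))
        if target is None:
            continue  # unmapped elements are simply dropped in this sketch
        if target == "sex":
            # Unrecognized local codes fall back to "U" (unknown).
            value = SEX_CODES[source].get(str(value).lower(), "U")
        row[target] = value
    return row


ehr_row = map_record("ehr", {"PAT_ID": "A100", "SEX_C": 1, "ROOM": "4B"})
billing_row = map_record("billing", {"acct_patient": "A100", "gender": "Female"})
```

Both calls yield the same normalized row even though the two sources name and code the data differently. In a real project the map would run to thousands of elements, which is why this work dominates cost and schedule.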
Divurgent is working on a taxonomy to take away as much ambiguity in the marketplace as possible. There are data warehouses, population management tools, and pretty much everything else across the spectrum. We can say this about many of the market-leading “point solution” entries: they come from good pedigrees (Explorys – Cleveland Clinic, Health Catalyst – Intermountain Health, Health Care DataWorks – Ohio State University, etc.), many are relatively new (roughly two years as a household name for the vendors mentioned above), and some come with sophisticated data governance models (Health Catalyst and IBI, for example).
Let us go through a few representative vendor products as a first installment in a series on Big Data (a term we just debunked, but since everybody “gets it,” it makes for a convenient shorthand).
IBM and Oracle. Both contain beautiful, comprehensive data models that, if printed, would cover the floor of HIMSS. IBM relies on its partners to help populate them; Oracle looks to its professional services group. Both are optimized for their own database and server products. Both have the business rules for data manipulation embedded in the ETL. Overall, these are not cheap solutions, but in the end, a provider should have a very robust EDW.
Health Catalyst. A data aggregator with an interesting architecture: for each transactional data set of interest (e.g. one for Clarity, one for Lawson, etc.), a data mart is first stood up that mirrors the source, albeit with more meaningful naming conventions. In a “late binding” process, SQL queries are then used to populate dashboards, reports, etc., and to manipulate the data wherever normalization is needed. If source data changes, the SQL code is modified rather than the ETL layer. Several “starter” dashboards are included. Finally, they have an interesting story about nurturing an ecosystem of developers who can build applications that bolt onto the Health Catalyst database. Health Catalyst also provides an interesting data governance tool, Atlas, which provides a way of managing data definitions across the enterprise without forcing major data conversions at the initial ETL stage.
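To give a feel for the late-binding idea in general terms, here is a minimal sketch using Python's built-in `sqlite3` module. It is not Health Catalyst's implementation; the table, view, column names, and code values are all invented for illustration. Source data is landed as a near-mirror of the transactional table, and the renaming and normalization happen later, in a SQL view that the reporting query reads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: land the source data as a near-mirror of the transactional table.
# No business rules are applied at load time.
conn.execute(
    "CREATE TABLE src_encounters (enc_id TEXT, pat_sex_c TEXT, total_chg REAL)"
)
conn.executemany(
    "INSERT INTO src_encounters VALUES (?, ?, ?)",
    [("E1", "1", 1200.0), ("E2", "2", 800.0), ("E3", "1", 500.0)],
)

# Step 2: bind meaning late, in a view. If the source coding changes,
# only this SQL is edited; the load step above stays untouched.
conn.execute(
    """
    CREATE VIEW v_encounters AS
    SELECT enc_id AS encounter_id,
           CASE pat_sex_c WHEN '1' THEN 'F' WHEN '2' THEN 'M' ELSE 'U' END AS sex,
           total_chg AS total_charges
    FROM src_encounters
    """
)


def charges_by_sex(connection: sqlite3.Connection) -> dict:
    """Report query of the kind a dashboard would run against the view."""
    rows = connection.execute(
        "SELECT sex, SUM(total_charges) FROM v_encounters GROUP BY sex ORDER BY sex"
    )
    return dict(rows.fetchall())


summary = charges_by_sex(conn)
```

The trade-off in this style is that normalization logic lives in queries rather than in the load process, so it is easier to change but has to be governed carefully.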
Explorys. A data aggregator with the primary intent of providing rich population health functionality, including PMPM metrics, predictive costing, and outreach capabilities. As with most of these tools, the emphasis is on data ranging from the high level down to the physician/procedure/cost level. Two of their founders originally founded Everstream, a sophisticated analytics solution for the media industry. Our visit to the McKesson booth occurred at a time when no MedVentive representatives were available; however, we place them in roughly the same category as Explorys (by the way, these two are the leading third-party vendors of population health management “bolt-ons” for the Epic platform).
Truven. A roll-up of interesting point solutions bought by Thomson Reuters before the transition to Truven, including clinical surveillance, clinical decision support, analytics, and population management. They are working on a common repository to link these solutions together. Their subject matter experts are very knowledgeable (many of them clinicians), and their products are time-tested. We would put Optum in roughly the same category, albeit with a broader product line, especially with their acquisition of Humedica.
Anyone who has been following this market will know that we have barely scratched the surface. The list is long: InterSystems, Information Builders, Health Care DataWorks, Caradigm, and Dimensional Insight all have a unique story to tell. Stay tuned for our next installment.
For those considering a buy in the short term, we would suggest networking, and then networking some more. As noted, much of the cost and risk is associated with labor and time to value. The best source for this information is organizations that have gone through (or, more likely in this nascent market, are going through) the process.