Editor’s note: Today’s guest post comes from AI for healthcare platform Lumiata. Here’s the story of how they use Google Cloud to power their platform—performing data prepping, model building, and deployment to tackle inherent challenges in healthcare organizations.
If ever there was a year for healthcare innovation, 2020 was it. At Lumiata, we’ve been on a mission to deliver smarter, more cost-effective healthcare since 2013, but the COVID-19 pandemic added new urgency to our vision of making artificial intelligence (AI) easy and accessible. Using AI in healthcare went from a nice-to-have to a must-have for healthcare organizations. Just imagine how differently you could plan or assess risk if you could identify communities that are more likely to develop long-term effects or co-morbidities and end up in the hospital.
Our Lumiata AI Platform helps healthcare organizations use AI to improve quality of care, minimize risk, and reduce costs, without having to build those predictive analytics capabilities themselves. We’re making it possible for healthcare organizations to benefit from the latest technologies to derive valuable insights from their data at scale—and ultimately offer better care to their patients.
As an AI-first organization, we’ve been able to try new things to meet our customers’ changing needs, and our Google Cloud infrastructure lets us experiment and develop solutions quickly (check out our previous post on why we chose Google Cloud to help us deliver AI).
What our customers need: fast, accessible AI
AI provides an enormous opportunity for healthcare organizations, with an almost unlimited number of practical applications. But AI isn’t a solution you can simply switch on: it requires implementing the right, often purpose-built, solutions to extract the right insights from your data.
AI in healthcare is the next frontier, but the transformation can be slow. Without Lumiata’s help, many organizations struggle to operationalize AI, from data prepping to model building and deployment, even if they have identified the problems they would like to solve. Having advanced data science teams isn’t enough—you need to establish a fast, flexible, and resilient infrastructure to deploy projects. Healthcare AI projects are often plagued by a lack of understanding of the complexity of the high-dimensional data and what it takes to simplify it, as well as what’s required from engineering to productionize AI. In addition, it can be difficult to get the appropriate buy-in to make the changes needed to be successful.
Many healthcare organizations also build technology using waterfall methodologies, which lack the feedback loops and continual improvement needed to deliver the promised results. Without fast proof that AI is worth the investment, many projects fail before they’ve even started.
This is where Lumiata comes in. Our goal is to get customers up and running with the ability to perform fast queries and accurate, AI- and ML-driven predictions in a few weeks. The wealth of healthcare data is ripe for generating AI-powered insights and predictions, but it’s often trapped in legacy systems. Also, many organizations simply don’t have the resources to build everything themselves. We provide predictive products to healthcare businesses looking to leverage machine learning without the heavy lift by offering low- to no-code data modeling tools and solutions, all based on Google Cloud. That way, organizations are empowered to get started and run models even when they don’t have the team to do it themselves.
We selected Google Cloud because of its security infrastructure, intuitive AI tools, and multi-cloud application management technologies. BigQuery, Google’s serverless data warehouse, enables us to provide access to huge amounts of data. With Google Cloud Dataflow and Apache Beam, we built a data ingestion and extraction process to join and normalize disparate patient records and datasets. The entire system is built on Google Kubernetes Engine, allowing us to scale quickly to meet infrastructure requirements, and Kubeflow helps us develop and deliver our machine learning pipelines.
Additionally, Google Cloud’s fully managed services mean we don’t have to think about building and operating our infrastructure. Instead, we invest our resources in doing more work for our customers and addressing their data needs.
Let’s take a look at how Google Cloud helps us deliver AI solutions to our customers through the steps of a typical ML building process.
1. Data prepping—from raw input to a 360-degree view
Healthcare organizations often suffer from data silos that lack interoperability, so there’s no true understanding of the total amount of data and insights available. Few companies have a comprehensive longitudinal person record (LPR) of every person’s health history.
When it comes to machine learning, cleaning and preparing data is where teams spend the majority of their time, and the work is slow. In addition, on-premises environments don’t provide enough elasticity. You need to move quickly, and only the cloud has the capacity to support data prepping for AI.
We’ve created a data preparation process that takes raw, messy data and transforms it into fully prepped data for machine learning. Our data management pipeline ingests raw, disparate patient datasets and turns them into what we call Lumiata Person360 records. Using BigQuery and Dataflow, we ingest raw data dumps, link them with existing or synthesized identifiers, and then validate, clean, and normalize the data based on medications, procedures, diagnosis codes, and lab results. The data is then tied into a single person record and tagged with proprietary disease codes.
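The validate, normalize, and group steps described above can be pictured with a small, plain-Python sketch. This is not Lumiata's pipeline: the `PersonRecord` class, the row fields (`person_id`, `kind`, `code`), and the cleanup rule in `normalize_code` are all hypothetical names chosen for illustration, and a production version would run as a distributed Dataflow job rather than an in-memory loop.

```python
from dataclasses import dataclass, field

# Hypothetical record kinds, mirroring the four data types named in the text.
KNOWN_KINDS = ("medication", "procedure", "diagnosis", "lab")


@dataclass
class PersonRecord:
    """Hypothetical stand-in for a consolidated, per-person record."""
    person_id: str
    codes: dict = field(
        default_factory=lambda: {kind: set() for kind in KNOWN_KINDS}
    )


def normalize_code(code: str) -> str:
    """Illustrative cleanup only: trim whitespace, uppercase, drop dots."""
    return code.strip().upper().replace(".", "")


def build_person_records(raw_rows):
    """Validate, normalize, and group raw rows into one record per person."""
    records = {}
    for row in raw_rows:
        person_id = (row.get("person_id") or "").strip()
        code = row.get("code") or ""
        kind = row.get("kind")
        # Validation: skip rows with no identifier, no code, or unknown kind.
        if not person_id or not code.strip() or kind not in KNOWN_KINDS:
            continue
        record = records.setdefault(person_id, PersonRecord(person_id))
        # Sets deduplicate codes that differ only in formatting.
        record.codes[kind].add(normalize_code(code))
    return records


rows = [
    {"person_id": "p1", "kind": "diagnosis", "code": " e11.9 "},
    {"person_id": "p1", "kind": "diagnosis", "code": "E119"},
    {"person_id": "p1", "kind": "medication", "code": "metformin"},
    {"person_id": "", "kind": "diagnosis", "code": "I10"},      # no identifier
    {"person_id": "p2", "kind": "imaging", "code": "XR-01"},    # unknown kind
]
records = build_person_records(rows)
```

In this toy run, the two differently formatted diagnosis rows for `p1` collapse into a single normalized code, and the two invalid rows are dropped. Grouping by person identifier is the part that maps most naturally onto a keyed grouping step in a distributed pipeline.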
Our automated pipeline gives us incredible speed to intake data, and Google Cloud ensures we can scale to handle massive datasets seamlessly. For instance, we’ve been able to take 63 million person records (roughly 2.5 terabytes of data) and run them through our entire data management pipeline in less than four hours.
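To put those figures in perspective, here is a quick back-of-the-envelope calculation. It assumes decimal terabytes (1 TB = 10^12 bytes) and treats the four hours as an upper bound, so the rates below are minimum sustained rates, not measured values.

```python
records = 63_000_000   # person records, from the text
terabytes = 2.5        # approximate input size, from the text
hours = 4              # upper bound on run time, from the text

seconds = hours * 3600

# Minimum sustained throughput needed to finish within four hours.
records_per_second = records / seconds
megabytes_per_second = terabytes * 1_000_000 / seconds  # assumes 1 TB = 10^6 MB

# Average size of one person record under the same assumption.
bytes_per_record = terabytes * 1e12 / records

print(f"{records_per_second:,.0f} records/s")        # 4,375 records/s
print(f"{megabytes_per_second:.1f} MB/s")            # 173.6 MB/s
print(f"{bytes_per_record / 1000:.1f} KB per record")  # 39.7 KB per record
```

In other words, the pipeline sustains at least a few thousand person records per second end to end, with each record averaging roughly 40 KB.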
As healthcare organizations handle protected health information and must ensure Health Insurance Portability and Accountability Act (HIPAA) compliance, it’s imperative that we have the highest level of security and compliance at all times. To ensure this, we deploy each single-tenant instance of the entire platform as its own Google Cloud Platform project with its own Kubernetes cluster, networking, buckets, BigQuery tables, and services.
2. Removing the burden of training data models
One of the biggest challenges with model building is developing the infrastructure that enables transparent access to various data sources. Infrastructure setup takes time and resources, and often creates complexity when determining how to manage diverse data representations, architecture, and data quality. This is compounded by the fact that ML pipelines must continuously scale as the data for analysis increases. Ultimately, we don’t want our customers to have to worry about the underlying infrastructure.
We use Kubernetes and Kubeflow to build scalable ML pipelines and deep learning architectures that can support massive datasets. Our platform unlocks millions of input variables (machine learning features) from Person360 patient records and mixes them with our own internal 110 million-member data asset. We then use this data for training our complex data models to predict cost, risk, disease onset, and medical events.
Google’s AI Platform also makes it easier for us to experiment faster with large training datasets from our 120 million records. For instance, we have shifted from more traditional machine learning (like gradient boosted decision trees) toward larger deep learning models that can make multiple predictions across more than 140 medical conditions and analyze them across a specific time dimension.
The real value here is the speed at which we can serve our customers, from the time they drop their data into our platform to the delivery of the first datasets. Our automated machine learning pipeline enables us to reduce the time it takes to deliver the first outputs from months to weeks. For instance, we can now train our models with feature matrices containing 11 million people in less than two hours, all without having to waste time setting up infrastructure for distributed training.
3. Deploying and serving models in production
Productizing complex ML models comes with its own host of challenges. After models are trained and ready for deployment, it can be difficult to maintain consistency as you scale model deployments to meet new use cases or requirements across the organization.
Our data science and machine learning engineering teams run offline experiments (outside of Kubeflow) using the Google AI Platform, allowing a single team member to run numerous experiments a day. Once we have a model that works, we version the training pipeline, pre-trained models, and inference pipelines before deploying them onto Kubeflow. The Lumiata AI Platform allows us to benefit from serverless and distributed training: our data scientists are training more models per week, and we have made quicker leaps forward using BERT-based deep learning models.
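One common way to keep a training pipeline, a pre-trained model, and an inference pipeline in step is to pin all three versions in a single manifest and derive one release identifier from it. The sketch below illustrates that general idea only. The `ReleaseManifest` class, its field names, and the version strings are hypothetical and are not taken from Lumiata's platform or from Kubeflow.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReleaseManifest:
    """Hypothetical manifest pinning the three versioned artifacts together."""
    training_pipeline: str
    pretrained_model: str
    inference_pipeline: str

    def release_id(self) -> str:
        """Short, deterministic ID derived from the pinned versions."""
        # sort_keys keeps the serialized form stable across runs.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


release = ReleaseManifest(
    training_pipeline="train-v3",     # illustrative version labels
    pretrained_model="model-v7",
    inference_pipeline="infer-v2",
)
same_release = ReleaseManifest("train-v3", "model-v7", "infer-v2")
new_model = ReleaseManifest("train-v3", "model-v8", "infer-v2")
```

Because the identifier depends on all three versions, changing any one of them yields a new release ID, which makes it easy to tell exactly which combination is running in a given deployment.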
Building on top of Kubernetes and Kubeflow gives us a fast, scalable path to deploy and serve models to our customers. Kubeflow’s reusable components allow us to scale without having to build from scratch. There’s no need to worry about the nuances of training, tuning, and deploying models—customers can simply upload their data and get predictions out on the other end.
Running ML and AI in production
The real impact of simplifying AI implementation is that it opens up previously undiscovered paths for improvement.
For instance, we recently launched Pharmacy Intelligence, an AI-powered intervention tool that leverages pharmacy visit data to improve chronic disease management. We partnered with FGC Health, a Canadian retail pharmacy chain, to help them identify diabetic patients at risk for cardiovascular complications who have gaps in care. The tool then recommends a simple, actionable intervention, such as a visit to a specialist, drug titration, or adjustments to their existing drug regimen. This is a wonderful example of how using AI to address common gaps in care has the power to save lives.
As a company, we see Google Cloud as the core of our platform, enabling us to innovate more rapidly. As a result, we’re delivering solutions for new and interesting problems, such as claims payment integrity, predicting hospital admission and readmission, and identifying disease progression fingerprints for more personalized care. We’re helping healthcare companies become smarter, more powerful, and more effective—leveraging the information they already have in new ways to power the next generation of patient care.