Research News

03/20/2026

AI-powered solution developed in collaboration with IBM improves reproducibility in cell culturing

The predictive model, tested in gut epithelial cell samples, provides a path toward benchmarking—not eyeball estimates—in the field.


Primary cell culture systems, in which cells taken directly from human tissue (in vivo) are grown in the laboratory (in vitro), give scientists an opportunity to develop new platforms for experimentation. Although researchers have made several advances in this area, one challenge remains: reproducibility.

A team of researchers at Cleveland Clinic, led by Thaddeus Stappenbeck, MD, PhD, collaborated with researchers at IBM, led by Jianying Hu, PhD—through the Discovery Accelerator, a ten-year partnership between Cleveland Clinic and IBM—to meet this challenge. With their teams’ combined expertise, they created a cutting-edge machine learning solution capable of robust comparisons between in vitro and in vivo systems.

The findings, published in PNAS, represent a novel approach that could apply to similar biomedical research in other fields.

Benchmarking and reproducibility advance discovery

Primary cell cultures, which are living cells taken directly from an organism and grown in lab dishes, are a common tool for researchers to learn more about disease. These in vitro models also allow researchers to understand how cells function outside of the body. Important as they are for research, these cells are difficult to grow, and their consistency depends on growth conditions.

A key to advancing scientific research is the ability of labs and researchers anywhere in the world to consistently reproduce experiments using the same variables, methods and steps, and to measure their results against those of others. These comparisons rely on a process called benchmarking. As applied to in vitro culture methods, Dr. Stappenbeck describes benchmarking as a structured, multiscale comparison of cells in the culture system against a common reference point, one at which corresponding cells taken directly from the body can be analyzed in the same way.

Growing gut epithelial cells reliably has been possible only since 2008, and there is still no rigorously validated method for doing so. Researchers comparing cells in vivo and in vitro have typically estimated the degree of likeness between the two samples by “eyeballing it,” or by showing that basic markers are retained in vitro. This approach can lead to disparate results that are hard to reconcile.

Illustration (two panels): a mockup of a cell type that shows low fidelity (left panel, with purple arrows pointing to many different in vivo cell types) versus a cell type that aligns with high fidelity to an in vivo cell type (right panel, with a single green arrow).

AI, IBM and building a foundation model

Dr. Stappenbeck, chair of Cleveland Clinic Research’s Inflammation & Immunity department, and his team used in vivo cells from non-diseased gut epithelial tissue samples. Using an adapted air–liquid interface culture system, which encourages the cells to organize and mature in ways that resemble real tissue, they then grew cells in vitro from those samples. Based on the appearance of those in vitro cells, the researchers hypothesized that they would be a good match to the in vivo cells, but they needed a way to test that hypothesis.

The next step was to develop a system that could process large amounts of high-dimensional data and provide a quantitative measure of the similarity between the two sets of cells.
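To make the idea of a quantitative similarity measure concrete, here is a minimal sketch. It is not the method used in the study: it assumes each cell has already been reduced to a numeric embedding vector, compares each in vitro cell to the average (centroid) of each in vivo cell type by cosine similarity, and reports the share of in vitro cells that land on a single in vivo type. The function names, the `fidelity_score` definition and the toy data are all hypothetical and generated at random purely for illustration.

```python
import numpy as np

def centroid_similarity(reference, ref_labels, query):
    """Cosine similarity of each query cell to each reference cell-type centroid."""
    labels = np.asarray(ref_labels)
    types = sorted(set(labels.tolist()))
    centroids = np.stack([reference[labels == t].mean(axis=0) for t in types])
    # Normalize rows so that a dot product equals cosine similarity.
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    return types, q @ c.T

def fidelity_score(similarity):
    """Hypothetical score: fraction of query cells whose best match
    is the single most common reference cell type."""
    best = similarity.argmax(axis=1)
    counts = np.bincount(best, minlength=similarity.shape[1])
    return counts.max() / len(best)

# Synthetic toy data (assumption): three made-up in vivo cell types in 8 dimensions.
rng = np.random.default_rng(0)
centers = np.eye(3, 8) * 6.0
reference = np.vstack([c + rng.normal(size=(50, 8)) for c in centers])
ref_labels = np.repeat(["type_a", "type_b", "type_c"], 50)

# One in vitro population sits near a single in vivo type; the other does not.
focused_cells = centers[0] + rng.normal(size=(40, 8))
scattered_cells = rng.normal(size=(40, 8))

_, sim_focused = centroid_similarity(reference, ref_labels, focused_cells)
_, sim_scattered = centroid_similarity(reference, ref_labels, scattered_cells)
high = fidelity_score(sim_focused)
low = fidelity_score(sim_scattered)
print(f"focused population:   {high:.2f}")
print(f"scattered population: {low:.2f}")
```

A score near 1 corresponds to the high-fidelity case, in which cultured cells map to one in vivo cell type, while a lower score corresponds to cells spread across many types. A real analysis would start from model-derived representations of single-cell RNA sequencing data rather than random vectors.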

Dr. Hu, IBM Fellow and Director of Healthcare and Life Sciences, and her global team created AI algorithms and tools for this research using one of IBM’s Biomedical Foundation Models, called BMFM-RNA. The model was pre-trained on general, publicly available single-cell RNA sequencing data and then fine-tuned on data specific to this project. The results not only revealed significant similarities between cells in vitro and in vivo but also, for the first time, quantified these similarities for benchmarking purposes. This suggests that the approach could help scientists better understand how genes work together to maintain and repair the lining of the human intestine. The model’s success enhances the potential for research in other organs and body systems.

The future of predictive modeling

While a computational model doesn’t replace real clinical data, Dr. Stappenbeck believes that it represents an opportunity to increase the rigor of reproducibility in the field and make benchmarking a standard practice.

“It’s exciting to use high-dimensional data, like single-cell RNA sequencing, for these incredibly robust comparisons. These comparisons are done in the context of, quite literally, millions of other cells that have been analyzed from the human body,” he says. “We’re grateful for Jianying and the team at IBM for partnering with us to share their creation of this biomedical foundation model. This is going to become a crucial part of our repertoire for ultimately using in vitro human models to develop new patient treatments.”

Dr. Hu echoes the notion that the work described in the PNAS article is a concrete demonstration of how such a model can help address what she calls “some of the thorniest challenges in science.”

“By combining Cleveland Clinic’s expertise in advanced human cell models with IBM’s computational algorithmic development capabilities, we’ve created a rigorous framework that allows scientists to more reliably assess the fidelity of novel in vitro experimental systems in a scalable manner,” she explains.

The framework, she believes, “paves the way for significant acceleration of the cycle of scientific discovery.”