The Essential Researcher

On most research studies across KU Medical Center, there is a biostatistician laboring behind the scenes to translate research data into medical evidence.

In May of 2021, a team of 13 researchers made an important discovery for pregnant women and their doctors. Led by Susan Carlson, Ph.D., a professor in the Department of Dietetics and Nutrition at the University of Kansas School of Health Professions and a University Distinguished Professor, the scientists showed that pregnant women who take a daily 1,000 milligram supplement of docosahexaenoic acid (DHA), a dose well above the standard recommended 200 milligrams, can decrease their risk of having the baby prematurely. This finding was significant for women and for public health: Complications from preterm births are the number-one cause of infant death and disability in young children.

The researchers published their study in a high-profile clinical journal produced by The Lancet. After it was released, news articles about the findings appeared on many related sites for consumers and practitioners. The articles typically quoted first-author Carlson, as well as other nutritionists/physicians, while the second author on the study, Byron Gajewski, Ph.D., in the Department of Biostatistics & Data Science at the University of Kansas School of Medicine, was rarely mentioned.

Susan Carlson, Ph.D., and Byron Gajewski, Ph.D.

Along with Carlson and a neonatologist affiliated with the University of Cincinnati, Gajewski was one of the study’s three principal investigators. Carlson has been collaborating with Gajewski on DHA studies since 2004, but their work on this study was a true partnership.

“I am a nutritionist and someone who has studied DHA for a lot of years, so I knew the [research] questions that needed to be asked,” said Carlson. “And Byron knew the best way to answer them. He made the decisions related to how many participants were needed and the analytical plan.”

It’s not that Gajewski’s work was invisible to the journalists who wrote about the study. Biostatistics, the application of statistics to health-related fields, is the language in which medical evidence is communicated: numerically, with data.

For the DHA study, for example, a key piece of evidence reported was that among women who began the study with low blood levels of DHA, the rate of preterm birth before 34 weeks of gestation was 2% for women who took 1,000 milligrams of DHA versus 4% for those who took 200 milligrams.

But such results are usually communicated by investigators such as Carlson, rather than the biostatisticians who quantified and analyzed them. Mainstream media aim to convey the “so what” of research, rather than the mathematical methods critical to it.

Many people have never heard of biostatistics — not even now, during what is likely the biggest public health event of our lifetimes. Biostatistics has long been fundamental to public health. It was biostatisticians who translated the data from the COVID-19 vaccine clinical trials into evidence that doctors could use to take action.

But biostatisticians do so much more than analyze results. They are typically involved from the inception of the project, helping devise laboratory studies and clinical trials that are statistically rigorous, ethical and set up to reduce or eliminate bias and comply with protocols and regulations.

“Sometimes we hear, ‘You guys crunch the numbers,’” said Gajewski. “But it is more than that. It’s the design, it’s the decisions that go into what we’re going to do and what we’re going to analyze — and it’s also doing all that in a very efficient way.”

IN PLAIN SIGHT

Those decisions help scientists to answer such questions as: Is this new drug or vaccine effective? How does that effectiveness vary according to a person’s age and ethnicity? How long are people with a specific type of cancer likely to live? What are the risk factors for kidney disease? Are those factors different for women versus men?

Nonetheless, biostatistics is a field often hidden in plain sight. That’s true for the discipline in general, and it’s also true for the Department of Biostatistics & Data Science at KU Medical Center, where faculty not only collaborate extensively with researchers across the medical center’s three schools (Medicine, Nursing and Health Professions) as well as The University of Kansas Cancer Center and the University of Kansas Alzheimer’s Disease Research Center — but also train future biostatisticians.

Then there’s this irony: the department that few hear about also has, in terms of student enrollment, the largest Ph.D. program at KU Medical Center. The department’s online master’s and certificate programs, launched in 2015 in conjunction with the Edwards campus, have a current enrollment of more than 150 students, and 117 master’s degrees have been granted so far.

Meanwhile, according to data provided by the Office of Enterprise Analytics, the department itself (faculty, staff and graduate assistants) grew 58% from the 2016-17 academic year to the 2021-22 academic year, while the medical center overall grew 12%.

That’s especially impressive when you consider that the department didn’t even exist until 2007, when department chair Matthew Mayo, Ph.D., founded it.

Mayo had wanted to create a biostatistics department since his arrival at KU Medical Center in 1998, when he was hired as an assistant professor in the Department of Preventive Medicine and soon also served as director of biostatistics at the Kansas Masonic Cancer Research Institute, the research arm of The University of Kansas Cancer Center. In 2002, Mayo was awarded a five-year $1.5 million grant from the National Cancer Institute (NCI) to establish the Biostatistics and Informatics Shared Resource, which consists of faculty and staff who support researchers with study design, data collection and management and statistical analyses.

Probably nobody understood the value of that shared resource more than Roy Jensen, M.D., who was recruited by KU in 2004 with a mission: to develop KU’s cancer center into an NCI-designated center. Institutions with NCI designation have met rigorous standards for research, prevention or treatment. Jensen knew he needed biostatistical support to achieve his goal.

“Shared resources are a very critical element to cancer centers because they fund expertise and technological capabilities that most investigators cannot support within their own labs. Very few principal investigators can fund a full-time biostatistician to oversee their data research activities,” said Jensen, vice chancellor and director of The University of Kansas Cancer Center, which was awarded NCI designation in 2012. “So, for a whole host of reasons, I felt like investing in biostatistics was a great way to elevate the level of research and to build our cancer research portfolio.”

“The growth of the department has coincided with the growth of research at KU Medical Center — but especially with the growth of KU Cancer Center,” Mayo said. “Without the cancer center, we would not be half the size we are today.”

In 2006, the Kansas Board of Regents gave Mayo approval to create an academic department of biostatistics, which would train biostatisticians through degree programs as well as offer support for researchers across KU Medical Center. Mayo continued to recruit statisticians and expanded an online project registration system so that researchers from all parts of the medical center could request the help of a biostatistician.

In 2020, the 23 faculty members of the biostatistics department, supported by numerous staff within the department, were involved in 501 new and ongoing projects with 204 researchers at KU Medical Center.

“I know of no other department of this size in the country that collaborates with as many investigators on as many projects,” Mayo said. “And we’re not a large biostatistics department. We’re in the bottom third in size, but historically we’ve been in the top third to top half in the number of projects.”

ITS OWN FIELD

The expansion of the department also mirrors the growth of the field itself. The U.S. Bureau of Labor Statistics projects the employment of statisticians to grow 33% from 2020 to 2030, much faster than the 8% average for all occupations. The demand for biostatisticians in particular is fueled by ever-larger data sets generated by electronic health records and genomics research.

In 2019, the department added “& Data Science” to its name to reflect the work the department does using scientific methods and algorithms to discover patterns in and extract meaning from data, especially massive data sets.

“There’s more and more data out there, and it’s not as simple as punching a button on a software tool to get the right answer,” said Mayo. “You need to be sure you’re analyzing it as it should be analyzed, that you’ve designed things properly and that you’ve collected the right data to answer your question. There’s a lot of work that goes into that.”

This is not to say that the need for biostatistics is a recent development. The American Statistical Association, founded in 1839, is the second-oldest professional society in the United States (the oldest, the Massachusetts Medical Society, was founded in 1781).

When the Communicable Disease Center (now the Centers for Disease Control and Prevention, or CDC) was created in 1946 to prevent malaria from spreading across the country, it borrowed some of the statistical methods that Florence Nightingale, who became the first female member of the Royal Statistical Society as well as a famous nurse, had developed in the 1850s for her study of death rates and sanitation in a British military hospital.

In 1956, a CDC report investigating staphylococcus infections in newborns was the first to use two basic statistical measures: a chi-square test, which assesses how well the observed data conform to a particular model or underlying relationship, and a p value, which is the probability that a result at least as extreme or unusual as the one observed would occur if a particular hypothesis or research theory were true. In the early 1960s, a t-test, which determines whether there is a significant difference between the means (averages) of two groups, was used in a CDC investigation of mononucleosis in Kentucky.
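To make the chi-square test and the p value concrete, here is a minimal sketch in Python. The counts are invented for illustration and are not taken from the 1956 report, and the function name chi_square_2x2 is hypothetical. It computes the Pearson chi-square statistic for a two-by-two table without a continuity correction, then converts it to a p value using the fact that, with one degree of freedom, the tail probability equals erfc(sqrt(statistic / 2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if rows and columns were unrelated
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # A 2x2 table has 1 degree of freedom, so P(chi2 >= stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: infected vs. not infected newborns in two nurseries
stat, p = chi_square_2x2(12, 88, 4, 96)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

A small p value here would suggest that the difference in infection rates between the two hypothetical nurseries is unlikely to be due to chance alone.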

Today, medical researchers learn such basic statistical methods, and others, routinely in their training programs. But new, more sophisticated statistical techniques emerge all the time.

Biostatistics is its own field. Keeping up with it cannot be done as a mere side gig. Janet Pierce, Ph.D., APRN, CCRN, FAAN, a professor in the University of Kansas School of Nursing and a University Distinguished Professor, took several statistics courses when she was in graduate school in the late 1980s. She remembers the early 1990s, when it was not uncommon for a study to skip a power analysis, a calculation that determines the minimum sample size needed, such as the number of participants needed in a clinical trial. Today, knowing how to “power the study” is expected, as is having a biostatistician on applications for research grants.
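As a rough illustration of what a power analysis involves, the sketch below uses a common textbook approximation for comparing two proportions. The function name, the 5% significance level and the 80% power target are assumptions chosen for the example, and the 4% and 2% event rates are borrowed from the DHA result quoted earlier purely as planning inputs. It is not the calculation used for any study described here.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a difference
    between event rates p1 and p2 with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance_sum / (p1 - p2) ** 2
    return math.ceil(n)

# Illustrative planning inputs only: 4% event rate vs. 2%
n_per_group = sample_size_two_proportions(0.04, 0.02)
print(n_per_group)
```

The formula shows why rare outcomes and small differences demand large studies: the required sample size grows with the square of the inverse of the difference being sought.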

“When I review grant applications, and I see that they’re going to do something fairly complicated, and they don’t have a statistician, I might give them a poor score as reviewer, or I might suggest collaborating with a biostatistician,” she said. “They’re a very valuable part of a research team.”

A WORLD OF DIFFERENCE

The overarching question biostatisticians are often trying to answer is, “Is there a difference?” The difference might be in the rate of alcohol addiction between urban versus rural populations, or in breast cancer patients’ health when they take a new drug, or in the cognitive function of people with dementia when they take a vitamin supplement. More specifically, biostatisticians are trying to find out if there is a difference that is not due to some factor outside of their hypothesis.

An article in the Spring/Summer 2021 issue of Kansas Medicine + Science (“New Kidney, Better Brain Health”) featured a study led by Aditi Gupta, M.D., M.S., an associate professor in the Department of Internal Medicine. Then a fairly new physician-scientist, Gupta had received a career development grant from the National Institutes of Health in 2017 to study the relationship between cognition, brain health and kidney disease.

Jonathan Mahnken, Ph.D., and Aditi Gupta, M.D., M.S.

One of Gupta’s mentors on the grant was Jonathan Mahnken, Ph.D., professor in the Department of Biostatistics & Data Science. Mahnken had already completed projects in nephrology and with the KU Alzheimer’s Disease Research Center, so he had some familiarity with both clinical areas of the study. He and graduate students Robert Montgomery, now a research assistant professor in the department, and Palash Sharma were charged with coming up with the right statistical approach to use for Gupta’s research.

The difference the researchers were trying to detect was in brain health and function after a kidney transplant among people with chronic kidney disease, who often struggle with concentration and memory. It was a straightforward question, but answering it was complicated.

For one thing, the data in the study were inherently “noisy,” pointed out Mahnken. “Noise” in statistics refers to any source of variation in the data measurements other than the “signal,” which is the variation scientists are trying to detect.

“With a randomized controlled trial, where you flip a coin and assign participants randomly to groups, if you see a difference, then the difference is very likely attributable to whatever feature it was that made the groups different,” said Mahnken. “But when you’re dealing with data like what Dr. Gupta had, observational data, you don’t have the benefit of dividing the pre-transplant versus post-transplant group in some random fashion because that depends on things like, does a transplant kidney become available [for that participant]? And are they healthy enough for a transplant? There are just all kinds of factors at play.”

Moreover, the measurements for the study participants (including brain imaging and cerebral blood flow) were taken at baseline and then at three months and 12 months after the transplant. Those repeated measures could also be a source of noise, as variation within one participant’s series of measurements could happen not because of the transplant, but because of other factors such as a change in their sleep patterns or exercise habits.

Meanwhile, “some people might have been first assessed six months before their transplant, and others a year before they could get a transplant,” Gupta said. “How do you adjust for this time difference in your analysis?”

Mahnken, Montgomery and Sharma employed a statistical approach known as a linear mixed model, which allowed them to separate out the noise between participants and within each participant’s own measures and to account for the timing variations.
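The idea of separating noise between participants from noise within each participant can be sketched with a much simpler special case than the model the team fitted. The Python example below assumes a random-intercept model with balanced data (every participant measured at the same visits) and uses classical method-of-moments (ANOVA) estimates. The scores and the function name are hypothetical, and the sketch does not handle the uneven timing that the study's analysis had to account for.

```python
from statistics import mean

def random_intercept_anova(data):
    """data: one list per participant, one score per visit (balanced design).
    Returns (visit_means, var_between, var_within) for the model
        score = visit effect + participant effect + residual noise."""
    n = len(data)       # number of participants
    k = len(data[0])    # number of visits per participant
    grand = mean(y for row in data for y in row)
    visit_means = [mean(row[j] for row in data) for j in range(k)]
    subject_means = [mean(row) for row in data]

    # Variation of participant averages around the grand mean
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    # Variation left after removing participant and visit effects
    ss_resid = sum(
        (data[i][j] - subject_means[i] - visit_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    ms_subject = ss_subject / (n - 1)
    ms_resid = ss_resid / ((n - 1) * (k - 1))

    var_within = ms_resid
    # E[ms_subject] = var_within + k * var_between; truncate at zero
    var_between = max(0.0, (ms_subject - ms_resid) / k)
    return visit_means, var_between, var_within

# Hypothetical cognitive scores at baseline, 3 months and 12 months
scores = [
    [48, 52, 55],
    [60, 61, 66],
    [39, 45, 46],
    [55, 58, 63],
]
visit_means, var_between, var_within = random_intercept_anova(scores)
print(visit_means, round(var_between, 1), round(var_within, 1))
```

In this toy data most of the variation is between participants. Once that is set aside, the change across visits stands out against a much smaller within-participant noise level, which is the intuition behind the mixed-model approach.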

“Without that method, it would have been hard to detect the difference,” Mahnken said. “Because we were able to separate it out, we were able to make a better inference.”

The inference turned into the first study demonstrating that kidney transplants can reverse some of the brain abnormalities that accompany advanced kidney disease.

SO RANDOM

Pierce has worked with several biostatisticians during her tenure at KU. For the last four years, she’s been working with Francisco Diaz, Ph.D., professor of biostatistics, on research related to a heart condition that accounts for half of all heart failure cases. People with this form of heart failure suffer from fatigue and lack of energy.

Francisco Diaz, Ph.D.

In their latest project, Pierce is investigating whether supplements of Ubiquinol, a form of coenzyme Q10 (CoQ10) that protects cells from damage, or D-ribose, a type of sugar that the body makes from food, would improve patients’ energy levels and shortness of breath. Pierce wants to compare the effectiveness of each kind of supplement to the other and to taking no supplement at all. She also wants to know what would happen if a patient were given both supplements simultaneously. Would the benefit be better than when each supplement was given alone? If so, by how much?

Diaz has built a linear regression model to test for that synergistic interaction. Linear regression models the relationship between variables by applying a linear equation to observed data. For this study, the model includes indicators for Ubiquinol and D-ribose treatment as independent variables, as well as the product of these two indicators (the interaction), to investigate changes in quality of life and other treatment outcomes from the beginning of the study through 12 weeks.
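A minimal sketch of a regression with two treatment indicators and their product might look like the following. The data and the function name are invented for illustration, and Diaz's actual model may include other terms. When the only predictors are the two indicators and their interaction, the model is saturated, so the least-squares coefficients can be read directly from the four group means without any matrix algebra.

```python
from statistics import mean

def fit_2x2_interaction(rows):
    """rows: iterable of (ubiquinol, d_ribose, outcome) with 0/1 indicators.
    Returns least-squares coefficients (b0, b_u, b_d, b_ud) for
        outcome = b0 + b_u*U + b_d*D + b_ud*(U*D).
    The model is saturated, so the fit reproduces the four group means."""
    cells = {(0, 0): [], (1, 0): [], (0, 1): [], (1, 1): []}
    for u, d, y in rows:
        cells[(u, d)].append(y)
    m = {key: mean(values) for key, values in cells.items()}

    b0 = m[(0, 0)]                                   # neither supplement
    b_u = m[(1, 0)] - b0                             # Ubiquinol alone
    b_d = m[(0, 1)] - b0                             # D-ribose alone
    b_ud = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + b0    # extra effect of both together
    return b0, b_u, b_d, b_ud

# Hypothetical 12-week changes in a quality-of-life score
rows = [
    (0, 0, 1), (0, 0, 3),
    (1, 0, 4), (1, 0, 6),
    (0, 1, 3), (0, 1, 5),
    (1, 1, 9), (1, 1, 11),
]
b0, b_u, b_d, b_ud = fit_2x2_interaction(rows)
print(b0, b_u, b_d, b_ud)
```

A positive interaction coefficient means the combined benefit exceeds the sum of the two individual benefits, which is the synergy the study is designed to test. A real analysis would also need standard errors to judge whether that coefficient differs from zero.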

“Linear regression is very old, actually,” said Diaz. “But it’s a methodology that has been evolving over the years, along with computer methods to implement it. Someone who worked with linear regression 60 years ago would not recognize the methodology today.”

Diaz’s primary interest, in fact, is in random effects linear models. Unlike fixed effects models, which have components that represent only the entire population as a whole, random effects models incorporate special components that represent persons as individuals in addition to measured variables that remain constant, such as gender or ethnicity, or variables that can change, such as a person’s weight or quality of life. This allows researchers to analyze data measured over time in longitudinal studies. Moreover, the special components make these models useful for personalized medicine, Diaz said.

“These types of studies are more complex and difficult to analyze, so we use random effects, which is much more sophisticated,” said Diaz.

SMALLER, STRONGER, FASTER

Biostatistics not only can help answer critical research questions in medicine and public health, but it also can make that research more efficient and ethical.

For the DHA study with Carlson, Gajewski designed and implemented a Bayesian adaptive design. Bayesian methods allow researchers to analyze data as the research progresses and then update the probability of their hypothesis accordingly. Adaptive design means that, using prespecified rules, researchers can adjust the randomization in order to move participants into what is proving to be the clinically advantageous part of the trial. In the DHA trial, that meant getting more study participants into the higher-dose group.
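The logic of adaptive randomization can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the flat Beta(1, 1) priors, the interim counts, the 20% to 80% allocation limits and the function names. None of it is taken from the DHA trial protocol. The sketch updates a Beta posterior for the preterm birth rate in each arm, estimates by simulation the probability that the higher dose has the lower rate, and uses that probability to weight the next round of assignments.

```python
import random

def prob_high_dose_better(events_high, n_high, events_low, n_low,
                          draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_high < rate_low) under independent
    Beta(1, 1) priors updated with the observed event counts."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_high = rng.betavariate(1 + events_high, 1 + n_high - events_high)
        rate_low = rng.betavariate(1 + events_low, 1 + n_low - events_low)
        if rate_high < rate_low:
            wins += 1
    return wins / draws

def next_allocation(p_better, floor=0.2, ceiling=0.8):
    """Prespecified rule (hypothetical): share of new participants assigned
    to the high-dose arm, kept within fixed limits."""
    return min(ceiling, max(floor, p_better))

# Hypothetical interim data: preterm births out of participants in each arm
p_better = prob_high_dose_better(events_high=2, n_high=100,
                                 events_low=8, n_low=100)
share_high = next_allocation(p_better)
print(f"P(high dose better) = {p_better:.2f}, next share to high dose = {share_high:.2f}")
```

Because the rule is fixed before the trial starts, the allocation shifts toward the better-performing arm only as evidence accumulates, and the same posterior probability could feed a prespecified early-stopping threshold.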

Janet Pierce, Ph.D., APRN, CCRN, FAAN

Being able to make these kinds of adjustments to the study as it’s happening can also save money and time. The protocol for the DHA study included the option of stopping the trial early if they were already getting a clear signal.

Gajewski has also developed and applied Bayesian adaptive designs to research in breast cancer, smoking cessation and severe traumatic brain injury.

“I aim to design trials that will be smaller, stronger, faster and place more trial participants eventually on the better performing treatments, and I’m passionate about it,” Gajewski said.

That kind of commitment is often cited by researchers at KU Medical Center about the biostatisticians they work with. Carlson remembers when she and Gajewski finished another study, a precursor to the most recent DHA trial.

“I think Byron wrote three or four papers off that study, and he also used it with his Ph.D. students,” said Carlson. “He felt an ownership of the trial as the biostatistician.”

Mayo recalled a time when biostatisticians weren’t so valued.

“The applied nature of our field wasn’t truly respected early on, but today people realize that you just can’t talk about theory and mathematical and statistical models,” said Mayo. “You’ve got to really get your hands dirty, do the work and analyze it and interpret it. That’s why we’re here.”


University of Kansas Medical Center
