AI Resources
High Performance Computing (HPC)
High performance computing and quality data form the backbone of AI, enabling both innovation and the use of the latest tools in research.
HPC is essential to AI for a few reasons:
- Large data sets: HPC enables efficient processing of the large data sets that AI frequently depends on.
- Computational complexity: Deep learning algorithms take a lot of computational resources to train and to use. HPC enables these algorithms to learn and work faster.
- Scalability: As AI adoption increases, AI workloads grow. HPC helps ensure that powerful chips are available to run AI algorithms without every user needing an extremely expensive computer.
KU Medical Center researchers have a few options for high performance computing:
- Azure Databricks in the KU Medical Center Digital Research Platform, maintained by Research Informatics. It is pay-as-you-go but offers on-demand compute that can match the needs of specific projects. It ensures that any researcher can demonstrate access to the necessary compute for grants, and it allows researchers to work with protected health information or other restricted data.
- The KU Center for Research Computing has two resources available to KU Medical Center researchers:
  - The KU Community Cluster is a shared cluster made of hardware purchased by different researchers. Those who purchase compute for the cluster can use its shared resources. It cannot be used with identifiable human data.
  - Hawk is a smaller cluster that is compatible with research health information or research identifiable data. It can therefore be used with data collected from studies, but not with protected health information.
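The data restrictions described above can be summarized as a small lookup. The Python sketch below encodes only what this page states. The category labels, the `compatible_resources` function and the assumption that a resource accepting more restricted data also accepts less restricted data are illustrative, not official policy, so data classification should still be confirmed with the teams that maintain each resource.

```python
from enum import Enum


class DataKind(Enum):
    """Hypothetical labels for the data categories mentioned on this page."""
    PROTECTED_HEALTH_INFORMATION = "phi"
    RESEARCH_IDENTIFIABLE = "research_identifiable"
    NOT_IDENTIFIABLE = "not_identifiable"


# Data kinds each resource is described as compatible with.
# Assumption: a resource that accepts a more restricted kind
# also accepts the less restricted kinds.
COMPATIBILITY = {
    "Azure Databricks (Digital Research Platform)": {
        DataKind.PROTECTED_HEALTH_INFORMATION,
        DataKind.RESEARCH_IDENTIFIABLE,
        DataKind.NOT_IDENTIFIABLE,
    },
    "Hawk": {DataKind.RESEARCH_IDENTIFIABLE, DataKind.NOT_IDENTIFIABLE},
    "KU Community Cluster": {DataKind.NOT_IDENTIFIABLE},
}


def compatible_resources(kind: DataKind) -> list[str]:
    """Return the resources whose description allows the given data kind."""
    return sorted(name for name, kinds in COMPATIBILITY.items() if kind in kinds)


if __name__ == "__main__":
    for kind in DataKind:
        print(kind.name, "->", compatible_resources(kind))
```

Under these assumptions, protected health information maps only to Azure Databricks, while research identifiable data maps to Azure Databricks and Hawk.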
KU Medical Center Research Data Lakehouse (RDL)
The KU Medical Center RDL is a cloud-based central repository for research data. It runs on Azure Databricks and uses the Spark distributed computing platform to process very large data sets efficiently. The RDL uses Delta Lake to enable robust data management. The RDL allows integration of a vast array of data, such as electronic medical records, genetic tests, ECGs, clinical imaging, wearable device data, geospatial data and more.
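To give a feel for why distributed computing helps with very large data, the Python sketch below shows the general split, process and combine pattern that engines such as Spark apply across many machines. It is an illustration only: it uses threads on a single machine and the standard library, not the Spark API, and the function names and sample values are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(records, n_parts):
    """Divide the records into n_parts roughly equal partitions."""
    return [records[i::n_parts] for i in range(n_parts)]


def partial_stats(partition):
    """Work done independently on one partition: a count and a sum."""
    return len(partition), sum(partition)


def distributed_mean(records, n_parts=4):
    """Compute per-partition results concurrently, then combine them into a mean."""
    partitions = split(records, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = list(pool.map(partial_stats, partitions))
    total_count = sum(count for count, _ in results)
    total_sum = sum(subtotal for _, subtotal in results)
    return total_sum / total_count


if __name__ == "__main__":
    heart_rates = [62, 71, 88, 95, 67, 74, 80, 59]  # made-up sample values
    print(distributed_mean(heart_rates))  # prints 74.5
```

Because each partition is summarized independently, the same pattern scales from a handful of values to data sets that no single computer could hold in memory.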
Data Science Tools
Both Databricks and the KU Community Cluster provide access to advanced analytics, connected to HPC, so that researchers can build and use AI models for research. Users have access to industry-standard libraries such as TensorFlow and PyTorch. On Databricks, users can leverage MLflow to manage AI model development and deployment. Additionally, Databricks users can interact with popular generative AI foundation models or use Mosaic AI to improve those models for specific use cases.
Guidance
Guidelines for Using Generative Artificial Intelligence – Provides faculty, staff, students and affiliates with clear directives on the ethical and responsible use of generative artificial intelligence (GenAI) tools within the University of Kansas.
KU Center for Teaching Excellence AI Resources – Provides guidance on many topics related to AI, including using Generative AI as a tutor, effective prompting, academic integrity, frequently asked questions and more.
Research Informatics has internal resources available to KUMC researchers, staff and students via the myKUMC intranet.