The COVID-19 pandemic has inspired an unprecedented convergence of scientific research, driven in part by publishers choosing to allow open public access to many research papers and data sets relevant to COVID-19 (the disease) and SARS-CoV-2 (the viral agent). The sheer volume of this data presents both a practical challenge -- how should scientists find the information most relevant to them? -- and a valuable case study in the difficulties of curating multi-disciplinary information spaces unified around a common scientific theme: in this case, determining how best to approach COVID-19 and to mitigate the global SARS-CoV-2 pandemic. In this context, Innovative Data Integration and Conceptual Space Modeling for COVID, Cancer, and Cardiac Care provides a critical and analytic overview of the COVID-19 data ecosystem: the different genres of data which are marshaled toward scientific investigation of COVID-19 and SARS-CoV-2; how this data is obtained, consumed, analyzed, and interpreted; how research data supports scientific claims pertaining to COVID-19's biological and epidemiological mechanisms and trajectory; and how these scientific claims should translate into public policy. The authors examine the operational logistics of COVID-19 data: its structures, protocols, analytic methodology, and empirical significance. Readers will learn to identify the distinct scientific disciplines, each of which is aimed at one particular facet of COVID-19 research -- molecular biology, genomics, radiology, epidemiology, clinical informatics, and their various subfields -- and, for each of these disciplines, review its distinct paradigms for data acquisition, analysis, and modeling. The point of these expositions is to bring the reader from conceiving of "data" as something abstract and amorphous to understanding data as the building blocks of scientific research and biomedical claims.
One way to supply this backstory is to examine COVID-19 data from the viewpoint of software engineering: to demonstrate the methodology for data acquisition and management from the perspective of programmers implementing software which manipulates COVID-19 data. This explication examines the data structures, file formats, API protocols, and other technical details intrinsic to writing code that works with COVID-19 data as a digital artifact. The primary goal of these discussions is to help scientists (who may be well-versed in data structures relevant to their specialization but less so vis-à-vis other disciplines), as well as policy makers, to better understand the technical chains which transform COVID-19 information from the realm of laboratories and experiments to the realm of public policy and public health. Aside from the empirical focus geared toward scientists and policy makers, Cross-Disciplinary Data Integration and Conceptual Space Models for COVID-19 also provides a more theoretical and IT-focused case study in data curation and integration. From this perspective, the COVID-19 data ecosystem is a concrete example through which theories of cross-disciplinary data integration are presented. There are two distinct phenomena which render inter-disciplinary data integration significant for COVID-19 in particular, and clinical/biomedical practices in general. First, certain forms of analysis explicitly combine information or statistical parameters from distinct subject areas. For example, in addition to epidemiological models of SARS-CoV-2 within an entire population, it is important to study the present or projected spread of the disease among different social groups, identified by age, gender, race, economic status, and so forth. This form of analysis therefore merges epidemiological and sociodemographic data and methods.
This is an example of analysis in which data that is typically represented via different schemas -- and accessed via different protocols -- must be pooled into a single algorithmic or computational environment. This book therefore examines cross-disciplinary analyses along these lines as case studies of data integration on a procedural level: how computer code can obtain and marshal heterogeneous data into a common form suitable for qualitative and quantitative analyses.
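To make the idea of procedural integration concrete, the following is a minimal sketch, not taken from the book, of how code might pool epidemiological and sociodemographic records that arrive in different schemas. All names and figures here (`case_rows`, `population_rows`, `GroupIncidence`, `integrate`, the regions and counts) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical epidemiological records: dictionaries keyed by field name.
case_rows = [
    {"region": "A", "age_band": "0-39", "cases": 120},
    {"region": "A", "age_band": "40+", "cases": 300},
    {"region": "B", "age_band": "0-39", "cases": 80},
]

# Hypothetical sociodemographic records in a different schema:
# positional tuples of (region, age band, population).
population_rows = [
    ("A", "0-39", 60000),
    ("A", "40+", 40000),
    ("B", "0-39", 50000),
]


@dataclass(frozen=True)
class GroupIncidence:
    """Common form shared by both sources after integration."""
    region: str
    age_band: str
    cases: int
    population: int

    @property
    def per_100k(self) -> float:
        # Incidence per 100,000 people in this group.
        return self.cases * 100_000 / self.population


def integrate(case_rows, population_rows):
    """Join the two sources on (region, age band) into one record type."""
    population = {(region, band): n for region, band, n in population_rows}
    merged = []
    for row in case_rows:
        key = (row["region"], row["age_band"])
        if key in population:  # skip groups with no demographic match
            merged.append(
                GroupIncidence(row["region"], row["age_band"],
                               row["cases"], population[key])
            )
    return merged


records = integrate(case_rows, population_rows)
for rec in records:
    print(rec.region, rec.age_band, rec.per_100k)
```

The design choice illustrated is that each source keeps its native schema until a single join step maps both into one shared record type, on which analyses such as per-group incidence can then operate.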
The second context where multi-disciplinary integration becomes relevant operates at a higher level: the development of heterogeneous information spaces which can absorb data from many environments, evincing a variety of disciplinary orientations. The rationale for such heterogeneous repositories is often practical and logistical: institutions have operational reasons for curating a single, comprehensive data ecosystem that is shared by multiple information producers and consumers, such as a "Semantic Data Lake." In these situations, one large central repository will take the place of numerous narrower, domain-specific databases. A central repository may be subdivided into smaller components implementing narrower protocols -- e.g., a clinical software network may provide diagnostic images via a PACS (Picture Archiving and Communication System) service, and treatment/outcome data via an EMR (Electronic Medical Record) architecture. It is understood that the structure and use of data in these two environments (PACS and EMR) are very different. Nevertheless, institutions will often unify these systems into a single data platform for logistical reasons: it is more convenient for doctors and researchers to have a single access point, a single login account, a single query framework, etc., which accesses the totality of information used across the organization's activities. These institutional repositories present challenges which are different from granular syntheses of heterogeneous data into a single procedural/algorithmic context. Disparate data structures in a heterogeneous archive, such as a "Data Lake," may never be directly combined in a single computation. Nevertheless, Data Lakes and th