With the proliferation of databases, both local and public, of genomics and imaging data in cancer, there are ample opportunities for analysts to engage in imaging-based analyses of cancer datasets. While the data availability seems quite promising, in this talk we will describe the myriad issues that arise when attempting data integration in this setting. We will describe appropriate scientific estimands to consider, as well as the scientific questions they could answer. Various datasets and modalities will be used to illustrate the concepts. Time permitting, we will discuss the importance of workflows for getting a better handle on the capabilities of machine learning tools in this area.