
1 Why Do We Need A New Class?

The current implementation of the @sensitivity slot in a PharmacoSet has two main limitations.

Firstly, it does not natively support dose-response experiments with multiple drugs and/or cancer cell lines. As a result, we have not been able to include such data in a PharmacoSet thus far.

Secondly, drug combination data has the potential to scale to high dimensionality. As a result, we need a highly performant object so that computations on such data can be completed in a timely manner.

2 Design Philosophy

The current use case is supporting drug and cell-line combinations in PharmacoGx, but we wanted to create something flexible enough to fit other use cases. As such, the current class makes no mention of drugs or cell lines, nor anything specifically related to bioinformatics or computational biology. Rather, we tried to design a general-purpose data structure that could support high-dimensional data for any use case.

Our design takes the best aspects of the SummarizedExperiment and MultiAssayExperiment classes and implements them using the data.table package, which provides an R API to a rich set of tools for high-performance data processing implemented in C.
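To make the role of data.table more concrete, here is a minimal sketch of long-format dose-response data stored in a data.table. It is not the LongTable implementation: the table `dose_response`, its column names, and the values are hypothetical and chosen only for illustration. It shows two operations that data.table performs efficiently on such data, a keyed lookup and a grouped aggregation.

```r
library(data.table)

# Hypothetical toy data: one row per (drug, cell line, dose) measurement.
# All names and values here are assumptions made for illustration.
dose_response <- data.table(
  drug      = rep(c("drugA", "drugB"), each = 4),
  cell_line = rep(c("cell1", "cell2"), times = 4),
  dose      = rep(c(0.1, 0.1, 1, 1), times = 2),
  viability = c(95, 90, 60, 55, 98, 97, 80, 75)
)

# Setting a key sorts the table by reference and enables fast
# binary-search subsetting on the key columns.
setkey(dose_response, drug, cell_line)

# Keyed lookup: all measurements for one drug and cell line combination.
drugA_cell1 <- dose_response[.("drugA", "cell1")]

# Grouped aggregation: mean viability per drug and cell line.
summary_dt <- dose_response[
  , .(mean_viability = mean(viability)),
  by = .(drug, cell_line)
]

print(drugA_cell1)
print(summary_dt)
```

Because every combination of identifiers is a row rather than a matrix dimension, adding a further identifier (for example a second drug) only adds a column, which is one reason a long format suits high-dimensional designs.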

3 Anatomy of a LongTable

3.1 Class Diagram

We have borrowed directly from the SummarizedExperiment class for the rowData, colData, metadata and assays slot names. We also implemented the SummarizedExperiment accessor generics for the LongTable.
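For readers unfamiliar with the SummarizedExperiment interface that these names come from, the sketch below builds a small SummarizedExperiment and calls the four accessors named above. It deliberately uses SummarizedExperiment itself rather than a LongTable, so it only illustrates the borrowed naming; the toy matrix, the annotation columns, and the metadata entry are hypothetical.

```r
library(SummarizedExperiment)

# Hypothetical toy assay: 3 rows by 2 columns.
counts <- matrix(
  1:6, nrow = 3,
  dimnames = list(c("r1", "r2", "r3"), c("c1", "c2"))
)

se <- SummarizedExperiment(
  assays   = list(counts = counts),
  rowData  = DataFrame(drug = c("A", "B", "C"),
                       row.names = c("r1", "r2", "r3")),
  colData  = DataFrame(cell_line = c("X", "Y"),
                       row.names = c("c1", "c2")),
  metadata = list(note = "toy example")
)

# The four accessors whose names the text says were borrowed.
rowData(se)    # annotations describing each row
colData(se)    # annotations describing each column
assays(se)     # list-like container of measurement matrices
metadata(se)   # free-form list of experiment-level information
```

According to the text, the same generics are implemented for a LongTable, so code written against these accessor names should look familiar when working with the new class.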

3.2 Object Structure and Cardinality

There are, however, some important differences from the SummarizedExperiment class that make this object more flexible when dealing with high-dimensional data.