Keywords: TCR sequencing, next-generation sequencing, time change analysis, pattern recognition
Cancer immunotherapy has demonstrated significant clinical activity across multiple cancer types. T cells are a crucial component of the adaptive immune system and are thought to mediate antitumor immunity. T cells recognize specific antigens through the T cell receptor (TCR), which is the product of somatic V(D)J gene recombination together with the addition or deletion of nontemplated bases at the recombination junctions. Next-generation sequencing of the TCR is used as a platform to profile the TCR repertoire. We developed an analysis pipeline to track and examine the TCR repertoire over time by focusing on V and J gene segments. Because few or no clones are shared among subjects, clone-level comparison across subjects is limited; working at the level of V and J gene segments overcomes this limitation and allows statistical inference across subjects directly. We developed a customized clustering workflow: 1) Pattern Recognition, which uses changepoint analysis to characterize how the abundance of each combination of V and J gene segments changes over time; 2) Feature Selection, which uses random forest to select the important V and J gene segments; and 3) Hierarchical Clustering, which distinguishes subjects based on the selected V and J gene segments. TCR sequence data from serial peripheral blood mononuclear cell samples from a group of cancer patients receiving immunotherapy were used for illustration.
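As a rough illustration of the first step of the workflow, the sketch below collapses clone counts into V-J pair frequencies at each time point and then looks for a single shift in mean abundance. It is a minimal sketch under stated assumptions, not the pipeline itself: the function names, the input layout (tuples of V gene, J gene, and count), the gene labels, and the counts are all hypothetical, and the least-squares single-split search stands in for whichever changepoint method is actually used.

```python
from collections import defaultdict


def vj_frequencies(clones):
    """Collapse clone counts into V-J pair frequencies for one sample.

    `clones` is an iterable of (v_gene, j_gene, count) tuples; this input
    layout is an assumption made for the example.
    """
    totals = defaultdict(float)
    for v_gene, j_gene, count in clones:
        totals[(v_gene, j_gene)] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {pair: c / grand_total for pair, c in totals.items()}


def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def single_changepoint(series):
    """Find the split that best separates a series into two constant-mean parts.

    Returns (k, gain): the segments are series[:k] and series[k:], and gain is
    the reduction in squared error relative to fitting one mean to the whole
    series. Returns (None, 0.0) when no split improves the fit. No penalty or
    significance test is applied here; a real analysis would need one.
    """
    n = len(series)
    total = sum_squared_error(series)
    best_k, best_gain = None, 0.0
    for k in range(1, n):
        gain = total - sum_squared_error(series[:k]) - sum_squared_error(series[k:])
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain


# Hypothetical serial samples from one subject (made-up gene labels and counts).
timepoints = [
    [("TRBV5-1", "TRBJ2-7", 10), ("TRBV20-1", "TRBJ1-1", 90)],
    [("TRBV5-1", "TRBJ2-7", 12), ("TRBV20-1", "TRBJ1-1", 88)],
    [("TRBV5-1", "TRBJ2-7", 50), ("TRBV20-1", "TRBJ1-1", 50)],
    [("TRBV5-1", "TRBJ2-7", 55), ("TRBV20-1", "TRBJ1-1", 45)],
]

freqs = [vj_frequencies(sample) for sample in timepoints]
series = [f.get(("TRBV5-1", "TRBJ2-7"), 0.0) for f in freqs]
k, gain = single_changepoint(series)
print(k, round(gain, 4))
```

With these made-up counts the best split falls at `k = 2`, between the second and third samples. Because the series is built from V-J pair frequencies rather than individual clones, the same pair can be compared across subjects even when they share no clones, and a per-pair summary such as the split position or the size of the shift could plausibly serve as an input to the later feature selection and clustering steps.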