
Pachter Lab

Research highlights

Single cell data preprocessing

bus runtime

The analysis of single-cell RNA-seq data involves a series of steps: (1) pre-processing of reads to associate them with their cells of origin, (2) optional collapsing of reads according to unique molecular identifiers (UMIs), (3) counting of features from the reads to produce a feature-cell matrix, and (4) analysis of the matrix to compare and contrast cells.

The pre-processing of single-cell RNA-seq data involves a series of steps: (1) associating reads with their cells of origin, (2) accounting for PCR amplification of molecules by collapsing reads according to unique molecular identifiers (UMIs), and (3) counting features from the reads to produce a feature-cell matrix. Some of these steps are procedurally straightforward but computationally demanding. Others are statistical in nature and require technology-specific models. We have recently introduced a format for single-cell RNA-seq data called the BUS (Barcode, UMI, Set) format, which facilitates the development of modular workflows that address these challenges. It is described in (Melsted et al., 2019). BUS files can be generated from single-cell RNA-seq data produced with any technology using pseudoalignment software. We have implemented a command in kallisto v0.45.0 called bus that efficiently generates BUS files from any single-cell RNA-seq technology. Tools for manipulating BUS files are provided as part of the bustools package. We have been working on building efficient workflows for pre-processing of single-cell RNA-seq data using this infrastructure.
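The three pre-processing steps can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the algorithm used by kallisto or bustools: the records, barcodes, UMIs, and gene names are made up, and each record carries a single gene name, whereas a real BUS record stores a set of compatible transcripts. The function name count_features is hypothetical.

```python
from collections import defaultdict

# Toy records in the spirit of the BUS format: (barcode, UMI, feature).
# Assumption: a single gene name stands in for the "set" field of a real record.
records = [
    ("AAAC", "TTG", "geneA"),
    ("AAAC", "TTG", "geneA"),  # PCR duplicate: same barcode, UMI, and feature
    ("AAAC", "CCA", "geneA"),
    ("AAAC", "GGT", "geneB"),
    ("GGGT", "TTG", "geneA"),
]


def count_features(records):
    """Collapse duplicate (barcode, UMI, feature) triples, then count per cell."""
    # Step 2: reads sharing barcode, UMI, and feature count as one molecule.
    unique_molecules = set(records)
    # Steps 1 and 3: group by barcode (cell) and count molecules per feature.
    counts = defaultdict(lambda: defaultdict(int))
    for barcode, _umi, feature in unique_molecules:
        counts[barcode][feature] += 1
    return {barcode: dict(features) for barcode, features in counts.items()}


matrix = count_features(records)
```

Here matrix is a nested dictionary keyed by barcode and then by feature, which is a sparse representation of the feature-cell matrix. The sketch ignores sequencing errors in barcodes and UMIs, which is one of the places where statistical, technology-specific models come in.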

The Barcode, UMI, Set format and BUStools (Bioinformatics 2019).


Open source hardware for bioinstrumentation

poseidon system

We are interested in the development of open source hardware with a focus on instruments used in the wetlab.

Our first open source hardware project is the poseidon syringe pump and microscope system, an alternative to commercial systems that reduces costs from thousands of dollars to less than $400. The system can be assembled in an hour. It uses 3D-printed parts and common components that can be easily purchased from Amazon or other retailers. The microscope and pumps can be used together in microfluidics experiments, or independently for other applications. The pumps and microscope can be run from a computer or Raspberry Pi with an easy-to-use GUI.

As we developed the poseidon system, our goal was not only to make a good piece of hardware, but also to have a system that would be easy for other people to build and contribute to. The guidelines we developed along the way, together with a brief story of the project, are described in the blog post Open sourcing bioinstruments.


Single cell experimental methods


We believe in tackling the challenges of genomics with the best methods. While we are primarily a computational group, we have developed experimental expertise because some of the problems we’ve tackled were best addressed with experimental solutions. Our first experimental protocol is a universal sample multiplexing method for single-cell RNA-seq in which methanol-fixed cells are chemically labeled with identifying DNA tags. The tagging mechanism uses a one-pot, two-step click chemistry reaction. The complete method, together with the demonstration of a 96-plex perturbation experiment, is described in the 2018 bioRxiv preprint Highly Multiplexed Single-Cell RNA-seq for Defining Cell Population and Transcriptional Spaces.

A quick overview of the method and generated data is described in the blog post The benefits of multiplexing.
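To illustrate how identifying DNA tags could be used downstream, the following Python sketch assigns each cell to a sample from its tag counts. It is a simplified illustration and not the published analysis: it assumes one tag per sample, and the function name assign_sample, the 0.8 threshold, and the tag and barcode names are all hypothetical.

```python
from collections import Counter


def assign_sample(tag_counts, min_fraction=0.8):
    """Return the dominant sample tag for a cell, or None if it is ambiguous.

    Assumption: each sample is marked by a single tag, and a cell is assigned
    only when that tag accounts for at least min_fraction of its tag reads.
    """
    total = sum(tag_counts.values())
    if total == 0:
        return None  # no tag reads observed for this cell
    tag, top_count = Counter(tag_counts).most_common(1)[0]
    return tag if top_count / total >= min_fraction else None


# Made-up tag counts per cell barcode.
cells = {
    "AAAC": {"tag01": 40, "tag02": 2},   # clearly tag01
    "GGGT": {"tag01": 10, "tag02": 11},  # mixed signal, e.g. a possible doublet
    "TTTA": {},                          # no tags detected
}

assignments = {barcode: assign_sample(counts) for barcode, counts in cells.items()}
```

Cells left unassigned (None) would typically be set aside before comparing samples. A real workflow would choose the threshold from the data rather than fixing it in advance.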