Research project in optimization of scientific software
In the Noble lab we develop algorithmic approaches for analysis of large, complex genomic and proteomic datasets. Several of our proteomics tools have workflows that are amenable to parallelization. We would like to recruit undergraduate researchers to extend at least two of the existing tools with robust, production-grade parallel computational capability, using threading or GPUs. This work can be done for research credit (e.g., CSE 499).
The specific projects are as follows:
1) Previously, we greatly improved the statistical calibration of a standard proteomics scoring function by using dynamic programming (DP) to calculate the distribution of scores over all possible (>10e+20) peptides. The DP is computationally expensive, but the calculation for each spectrum is independent of all other spectra, so the spectra could be processed in parallel.
2) We are also adapting the DP to calculate score distributions for a more complex class of peptides that have cross-linked structures. This requires two rounds of combinatorial summation over scores from the DP distribution, and is currently very slow. However, the structure of the combinatorial summation appears ideally suited to acceleration with a GPU.
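To give a feel for the first project, the following is a minimal C++ sketch of per-spectrum parallelism with std::thread. It is not the lab's code: the Spectrum struct, the integer mass bins, the evidence vector, and the functions scoreDistribution and allDistributions are all hypothetical, and the DP is a toy that only counts residue sequences by total score. The point it illustrates is that each spectrum's DP reads only its own input and writes only its own output slot, so no locking is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical toy spectrum (an assumption for illustration only).
struct Spectrum {
    std::vector<int> evidence;  // evidence[m] >= 0: score gained when a prefix ends at mass bin m
    int targetMass;             // total peptide mass, in integer bins
};

// Toy DP: count[m][s] = number of residue sequences whose prefix mass is m
// and whose accumulated score is s. Returns the row for m == targetMass,
// i.e. how many candidate peptides achieve each score.
// Assumes every residue mass is a positive integer.
std::vector<std::uint64_t> scoreDistribution(const Spectrum& sp,
                                             const std::vector<int>& residueMasses) {
    assert(sp.evidence.size() > static_cast<std::size_t>(sp.targetMass));
    // Upper bound on any score: each mass bin is visited at most once.
    const int maxScore = std::accumulate(
        sp.evidence.begin(), sp.evidence.begin() + sp.targetMass + 1, 0);
    std::vector<std::vector<std::uint64_t>> count(
        sp.targetMass + 1, std::vector<std::uint64_t>(maxScore + 1, 0));
    count[0][0] = 1;  // the empty prefix
    for (int m = 0; m < sp.targetMass; ++m) {
        for (int s = 0; s <= maxScore; ++s) {
            if (count[m][s] == 0) continue;
            for (int r : residueMasses) {
                const int m2 = m + r;
                if (m2 > sp.targetMass) continue;
                count[m2][s + sp.evidence[m2]] += count[m][s];
            }
        }
    }
    return count[sp.targetMass];
}

// Run the DP for every spectrum, striding the spectra across numThreads workers.
// Each worker writes only to its own out[i] entries, so there are no data races.
std::vector<std::vector<std::uint64_t>> allDistributions(
    const std::vector<Spectrum>& spectra,
    const std::vector<int>& residueMasses,
    unsigned numThreads) {
    std::vector<std::vector<std::uint64_t>> out(spectra.size());
    numThreads = std::max(1u, numThreads);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < spectra.size(); i += numThreads)
                out[i] = scoreDistribution(spectra[i], residueMasses);
        });
    }
    for (auto& w : workers) w.join();
    return out;
}
```

A caller might pass std::thread::hardware_concurrency() as numThreads. The static striding used here is the simplest possible schedule; if DP cost varies a lot between spectra, a shared work queue or a thread pool would likely balance the load better.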
Programming will be primarily in C++. Prior experience with threading or with GPU programming environments such as CUDA is desirable.
Please contact Prof. Bill Noble (wnoble@uw.edu) for more information.