Despite recent advances driven by deep learning, de novo peptide sequencing is still constrained by incomplete peptide fragmentation and insufficient protein digestion in current single-protease proteomic experiments. Here, we present a software system, DiNovo, for high-coverage and high-confidence de novo peptide sequencing that leverages the complementarity of mirror proteases. DiNovo is powered by several innovative algorithms: a mirror-spectra recognition algorithm that does not depend on pre-sequencing, two sequencing algorithms based on deep learning and graph theory, respectively, and target-decoy mapping, a method for evaluating sequencing results that requires no prior peptide identification. Compared with trypsin used alone, DiNovo with two pairs of mirror proteases yields two to three times as many high-confidence sequenced amino acids. Compared with previous single-protease de novo sequencing algorithms, DiNovo achieves much higher sequence coverage. DiNovo also shows great potential as a practical and powerful alternative to database search for peptide identification with quality control.
Publication:
NATURE COMMUNICATIONS
http://dx.doi.org/10.1038/s41467-026-70224-6
Authors:
Zixuan Cao
State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
These authors contributed equally: Zixuan Cao, Xueli Peng, Di Zhang, Piyu Zhou, Li Kang
Xueli Peng
State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Di Zhang
School of Computer Science and Technology, Shandong University of Technology, Zibo, China
Piyu Zhou
State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Li Kang
State Key Laboratory of Medical Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
Program of Environmental Toxicology, School of Public Health, China Medical University, Shenyang, China
Hao Chi
University of Chinese Academy of Sciences, Beijing, China
Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Ruitao Wu
School of Computer Science and Technology, Shandong University of Technology, Zibo, China
Zhiyuan Cheng
State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Yao Zhang
State Key Laboratory of Medical Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
Jiaxing Dai
State Key Laboratory of Medical Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
Yanchang Li
State Key Laboratory of Medical Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
Lijin Yao
School of Computer Science and Technology, Shandong University of Technology, Zibo, China
Xinming Li
School of Computer Science and Technology, Shandong University of Technology, Zibo, China
Yaoyu He
State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Jinghan Yang
State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
Haipeng Wang
School of Computer Science and Technology, Shandong University of Technology, Zibo, China
e-mail: hpwang@sdut.edu.cn
Ping Xu
State Key Laboratory of Medical Proteomics, National Center for Protein Sciences (Beijing), Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing, China
Program of Environmental Toxicology, School of Public Health, China Medical University, Shenyang, China
e-mail: xuping_bprc@126.com
Yan Fu
State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
University of Chinese Academy of Sciences, Beijing, China
e-mail: yfu@amss.ac.cn