Tmpri2-005.7z
These files typically contain curated sequences of proteins that cross cell membranes, used to distinguish between transmembrane helices, signal peptides, and globular domains.
The "TmPri" (Transmembrane Primary) naming convention is standard for the benchmark sets used to develop a leading deep learning tool for protein structure prediction.
Authors: Jeppe Hallgren, Konstantinos D. Tsirigos, et al.
Journal: Nature Communications (2022).
The "-005" suffix often indicates a specific cross-validation fold (e.g., the 5th split of the data) used during model training to check that the model's accuracy holds across different protein families.
Where to Find the Data
If you are looking for the contents of this specific archive for replication or research, they are usually hosted on:
The primary research group's resource page.
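Once an archive is in hand, the naming pattern described above can be checked programmatically. The following is a minimal sketch, assuming the file name follows a `<dataset>-<fold>.7z` layout; the pattern, the `ArchiveName` type, and the `parse_archive_name` helper are hypothetical names introduced for illustration, not part of any published tool.

```python
import re
from typing import NamedTuple, Optional


class ArchiveName(NamedTuple):
    """Parts of an archive name (hypothetical structure for illustration)."""
    dataset: str  # e.g. "Tmpri2"
    fold: int     # e.g. 5, read from the zero-padded suffix "005"


# Assumed layout: <dataset>-<digits>.7z. This is a guess at the convention,
# not a documented specification.
_ARCHIVE_PATTERN = re.compile(r"(?P<dataset>[A-Za-z0-9]+)-(?P<fold>\d+)\.7z")


def parse_archive_name(filename: str) -> Optional[ArchiveName]:
    """Return the dataset name and fold index, or None if the name does not fit."""
    match = _ARCHIVE_PATTERN.fullmatch(filename)
    if match is None:
        return None
    return ArchiveName(match.group("dataset"), int(match.group("fold")))


if __name__ == "__main__":
    print(parse_archive_name("Tmpri2-005.7z"))
```

If the suffix turns out to mean something other than a fold index (a version or part number, for instance), the integer is still parsed the same way; only its interpretation changes.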