{ "cells": [ { "cell_type": "markdown", "id": "b9961d52", "metadata": {}, "source": [ "# DreaMS-Fluorine" ] }, { "cell_type": "markdown", "id": "b21de2cb", "metadata": {}, "source": [ "## 1. Obtain model weights\n", "\n", "Please contact us at roman.bushuiev@uochb.cas.cz to request the DreaMS-Fluorine model weight files (`dreams_fluorine_epoch=1-step=7000.ckpt` and `dreams_fluorine_epoch=30-step=111000.ckpt`). After receiving them, place both files in `/DreaMS/dreams/models/pretrained/`.\n", "\n", "Note that DreaMS-Fluorine was trained using NIST20, so we can share the weights only with NIST library license holders. Please attach your NIST license or order confirmation to the email.\n", "\n", "## 2. Run DreaMS-Fluorine\n", "\n", "Execute the following command, where `--in_dir` specifies the folder containing `.mzML` or `.mgf` files (here, `data/MSV000099559`). The script will generate `dreams_fluorine_predictions.csv` in the same folder, containing predicted fluorine-presence probabilities for each MS/MS spectrum in each file. The current model version supports positive-mode data only.\n", "\n", "```bash\n", "python3 dreams/cli.py dreams_fluorine --in_dir data/MSV000099559\n", "```\n", "\n", "## 3. Examine the predictions" ] }, { "cell_type": "code", "execution_count": 2, "id": "e9380bec", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | RT | charge | file_name | polarity | precursor_mz | precursor_target_mz | scan_number | spectrum | window_lo | window_uo | F_preds_111k_steps | F_preds_7k_steps | dformat | tag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 810.063540 | 1 | MO23S_030.mzML | 1 | 304.891639 | 304.891632 | 4153 | [[81.07015991210938, 83.0491714477539, 84.9596... | 0.5 | 0.5 | 0.923519 | 0.210867 | A | Only 111k checkpoint > 0.9 hit |
| 1 | 34.545791 | 1 | MO23S_030.mzML | 1 | 241.999878 | 241.999878 | 133 | [[81.07012939453125, 84.95979309082031, 84.964... | 0.5 | 0.5 | 0.921477 | 0.217751 | A | Only 111k checkpoint > 0.9 hit |
| 2 | 514.087512 | 1 | MO23S_030.mzML | 1 | 204.138540 | 204.138535 | 2640 | [[78.58903503417969, 79.05422973632812, 84.044... | 0.5 | 0.5 | 0.913897 | 0.604710 | A | Only 111k checkpoint > 0.9 hit |
| 3 | 810.776700 | 1 | MO23S_027.mzML | 1 | 328.915390 | 328.915405 | 4179 | [[81.06999206542969, 87.28488159179688, 90.947... | 0.5 | 0.5 | 0.903643 | 0.140945 | A | Only 111k checkpoint > 0.9 hit |
| 4 | 809.980320 | 1 | MO23S_027.mzML | 1 | 304.891539 | 304.891541 | 4175 | [[84.95977783203125, 85.2956314086914, 90.0553... | 0.5 | 0.5 | 0.878663 | 0.227668 | A | Only 111k checkpoint > 0.75 hit |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 11125 | 745.386000 | 2 | MO23S_027.mzML | 1 | 615.455358 | 615.455383 | 3859 | [[79.05448150634766, 80.05440521240234, 81.069... | 0.5 | 0.5 | 0.000000 | 0.000000 | A | NaN |
| 11126 | 745.794840 | 2 | MO23S_027.mzML | 1 | 593.442417 | 593.442444 | 3861 | [[80.05455780029297, 81.06998443603516, 83.085... | 0.5 | 0.5 | 0.000000 | 0.000000 | A | NaN |
| 11127 | 747.321360 | 2 | MO23S_027.mzML | 1 | 571.429528 | 571.429504 | 3868 | [[80.05457305908203, 81.06989288330078, 83.085... | 0.5 | 0.5 | 0.000000 | 0.000000 | A | NaN |
| 11128 | 748.533720 | 2 | MO23S_027.mzML | 1 | 549.416251 | 549.416260 | 3874 | [[80.05448150634766, 81.06990814208984, 81.132... | 0.5 | 0.5 | 0.000000 | 0.000000 | A | NaN |
| 11129 | 749.549340 | 2 | MO23S_027.mzML | 1 | 527.403199 | 527.403198 | 3879 | [[77.15992736816406, 77.27590942382812, 80.054... | 0.5 | 0.5 | 0.000000 | 0.000000 | A | NaN |
11130 rows × 14 columns
\n", "