Dev #20

Open
wants to merge 32 commits into
base: main

3d79d78
add performer attention
jkobject Jan 31, 2025
5802742
few debugs
jkobject Feb 1, 2025
97cdc44
more debugs
jkobject Feb 1, 2025
5995495
more debugs
jkobject Feb 6, 2025
82e77ad
debug
jkobject Feb 6, 2025
860d2b6
adding esm2 finetuning and
jkobject Feb 8, 2025
7fbf6e3
classic forgot
jkobject Feb 8, 2025
c08243b
wip
jkobject Feb 10, 2025
771eb12
debug issue
jkobject Feb 19, 2025
0b31119
adding to model.py
jkobject Feb 20, 2025
04beb13
Merge branch 'dev' of https://github.com/cantinilab/scPRINT into dev
jkobject Feb 20, 2025
c67c50d
add wip gnn
jkobject Feb 21, 2025
b84e7e9
Merge remote-tracking branch 'origin/dev' into gnn
jkobject Feb 21, 2025
23923d9
actually adding a KL loss and setting up better package versions
jkobject Feb 21, 2025
49bde83
testing models
jkobject Feb 21, 2025
9b84e91
debug
jkobject Feb 21, 2025
8a6d12a
better params for get knn cells and sota denoising
jkobject Feb 22, 2025
71a7326
Merge branch 'dev' of https://github.com/cantinilab/scPRINT into dev
jkobject Feb 22, 2025
934fb4a
debug knn cells
jkobject Feb 24, 2025
2027beb
debugging cli, and slurm runs
jkobject Feb 24, 2025
334b9ef
updating defaults
jkobject Feb 26, 2025
572c558
debug of a key circular import and start to solve gene emb
jkobject Feb 28, 2025
9c950d3
debug embedding gene issue
jkobject Feb 28, 2025
beb49c8
debug gene encoder and making it load prev model versions
jkobject Mar 1, 2025
373eadc
Merge branch 'dev' of https://github.com/cantinilab/scPRINT into dev
jkobject Mar 1, 2025
c1a7c5e
wip on the large task front
jkobject Mar 5, 2025
0c2525a
Merge remote-tracking branch 'origin/main' into dev
jkobject Mar 6, 2025
73278a2
merge solve
jkobject Mar 6, 2025
92871bc
some debug
jkobject Mar 9, 2025
a387597
Merge commit 'b06953e4e0b0d44d2d53b2c759f3747f60045cfe' into dev
jkobject Mar 9, 2025
cd92495
debug
jkobject Mar 9, 2025
9243b8e
debug
jkobject Mar 10, 2025
7 changes: 4 additions & 3 deletions config/ablation_study.yml
@@ -20,8 +20,8 @@ trainer:
- class_path: lightning.pytorch.loggers.WandbLogger
init_args:
project: ${project}
- save_dir: /lustre/fswork/projects/rech/xeg/uat95fg/ #/pasteur/zeus/projets/p02/ml4ig_hot/Users/jkalfon/ #/data/log/
- offline: False
+ save_dir: /pasteur/zeus/projets/p02/ml4ig_hot/Users/jkalfon/ #/lustre/fswork/projects/rech/xeg/uat95fg/ #/pasteur/zeus/projets/p02/ml4ig_hot/Users/jkalfon/ #/data/log/
+ offline: True
callbacks:
#- class_path: lightning.pytorch.callbacks.StochasticWeightAveraging
# init_args:
@@ -53,6 +53,7 @@ model:
prenorm: True
fused_mlp: False
fused_bias_fc: False
+ depth_atinput: False
drop_path_rate: 0
freeze_embeddings: True
normalization: log
@@ -63,7 +64,7 @@ data:
- NCBITaxon:9606
- NCBITaxon:10090
gene_position_tolerance: 10_000
- gene_embeddings: /lustre/fswork/projects/rech/xeg/uat95fg/gene_embeddings.parquet
+ gene_embeddings: ./data/main/gene_embeddings.parquet # /lustre/fswork/projects/rech/xeg/uat95fg/gene_embeddings.parquet
collection_name: scPRINT-V2 full #scPRINT-V2 (good quality)
how: random expr
max_len: 2200
14 changes: 8 additions & 6 deletions config/base_v2.yml
@@ -8,7 +8,7 @@ log_graph: True
trainer:
precision: 16-mixed
strategy: ddp_find_unused_parameters_true
- gradient_clip_val: 40
+ gradient_clip_val: 10
log_every_n_steps: 100
limit_train_batches: 20000
gradient_clip_algorithm: norm
@@ -33,7 +33,7 @@ trainer:
save_last: True
scprint_training:
run_full_forward: True
- noise: [0.6]
+ noise: [0.9]
do_ecs: True
do_cce: True
mask_ratio: ["TF"]
@@ -42,9 +42,10 @@ scprint_training:
scprint_early_stopping:
patience: 4
model:
- dropout: 0.1
- num_heads_kv: 2
+ dropout: 0
+ num_heads_kv: 4
transformer: flash
+ finetune_gene_emb: True
mvc_decoder: inner product
residual_in_fp32: True
checkpointing: True
@@ -68,6 +69,7 @@ model:
# self_reported_ethnicity_ontology_term_id: 10
# sex_ontology_term_id: 2
# organism_ontology_term_id: 8
+ class_compression: "none"
data:
organisms:
- NCBITaxon:9606
@@ -76,10 +78,10 @@ data:
gene_embeddings: ./data/main/gene_embeddings.parquet
collection_name: scPRINT-V2 full #scPRINT-V2 (good quality) # scPRINT-V2 (some)
how: random expr
- max_len: 2200
+ max_len: 3200
pin_memory: True
prefetch_factor: 3
- metacell_mode: 0.2
+ metacell_mode: 0
weight_scaler: 200
do_gene_pos: ./data/main/biomart_pos.parquet
add_zero_genes: 0
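
For readers skimming the diff, the hyperparameter changes to config/base_v2.yml can be summarized programmatically. The following is a minimal Python sketch, not part of the PR: the values are copied from the hunks above, while the dotted key names (and the nesting they imply, which cannot be confirmed because indentation is lost in this view) and the helper `summarize_changes` are assumptions made for illustration only.

```python
# Values copied from the config/base_v2.yml hunks. The dotted key names are an
# illustrative flattening (nesting is assumed, not confirmed by the diff view).
OLD = {
    "trainer.gradient_clip_val": 40,
    "scprint_training.noise": [0.6],
    "model.dropout": 0.1,
    "model.num_heads_kv": 2,
    "data.max_len": 2200,
    "data.metacell_mode": 0.2,
}
NEW = {
    "trainer.gradient_clip_val": 10,
    "scprint_training.noise": [0.9],
    "model.dropout": 0,
    "model.num_heads_kv": 4,
    "model.finetune_gene_emb": True,
    "model.class_compression": "none",
    "data.max_len": 3200,
    "data.metacell_mode": 0,
}


def summarize_changes(old, new):
    """Hypothetical helper: return (changed, added).

    changed maps key -> (old value, new value) for keys present in both
    dicts with different values; added maps key -> value for keys that
    exist only in the new dict.
    """
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    added = {k: v for k, v in new.items() if k not in old}
    return changed, added


changed, added = summarize_changes(OLD, NEW)
for key, (before, after) in changed.items():
    print(f"{key}: {before} -> {after}")
for key, value in added.items():
    print(f"{key}: (new) {value}")
```

Read this way, the PR tightens gradient clipping, raises the training noise level, disables dropout and metacell mode, and lengthens the input context, while introducing two new model options.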
3 changes: 3 additions & 0 deletions docs/notebooks/cancer_usecase.ipynb
@@ -35,6 +35,9 @@
"5. [Denoising and differential expression](#denoising-and-differential-expression)\n",
"6. [Gene network inference](#gene-network-inference)\n",
"\n",
+ "Please note that although your adata needs to be processed in a similar way to what is shown in `1.`, you can perform the analyses in `2-3-4`, `5`, or `6` in any order, as they are independent of each other.\n",
+ "\n",
+ "\n",
"> In the notebook [cancer_usecase_part2.ipynb](./cancer_usecase_part2.ipynb) you will see how to analyse cell type specific gene regulatory networks.\n",
"\n"
]
2,830 changes: 2,830 additions & 0 deletions notebooks/bench_embedding_new.ipynb

Large diffs are not rendered by default.
