-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Monocle3 as input to tradeSeq #247
Comments
Could you please check how |
Hi, sorry, If I should not enter this thread, but I am having same error following the mentioned vignette. I sure have zero weights in my dataset frequently. I am wondering how to sort his out. My data looks like this this is summary of cell weights |
Hi @mahwish2 It is fine to have weights of zero, but the sum of the weights across all lineages should not be zero. In that case, we cannot assign a cell to any lineage. |
Hi All,
I am trying to use Monocle3 as input to tradeSeq. But i met following error:
sce <- tradeSeq::fitGAM(counts = counts,
pseudotime = pseudotime,
cellWeights = cellWeights)
#Error in .assignCells(cellWeights) :
Some cells have no positive cell weights.
Full codes were as following;
-----------https://statomics.github.io/tradeSeq/articles/Monocle.html
#########Extracting the pseudotimes and cell weights for tradeSeq
library(tradeSeq)
library(RColorBrewer)
library(SingleCellExperiment)
For reproducibility
palette(brewer.pal(8, "Dark2"))
#counts <- as.matrix(countMatrix)
#data(celltype, package = "tradeSeq")
library(magrittr)
note the colnames(counts) must match rownames(meta)
data <- GetAssayData(scRNA66, assay = 'RNA', slot = 'counts')
cell_metadata <- [email protected]
gene_annotation <- data.frame(gene_short_name = rownames(data))
rownames(gene_annotation) <- rownames(data)
cds <- new_cell_data_set(expression_data = as.matrix(data),
cell_metadata = cell_metadata,
gene_metadata = gene_annotation)
access the gene names
rownames(cds)[1:5]
access the meta data
head(colData(cds))
access the assay data
head(assay(cds)[,1:3])
preprocess_cds does both 1) normalization and 2) preliminary dimension reduction. By default 1) gene expression count for each cell is divided by the total counts of that cell, multiplied by a scale factor, and log-transformed. This is done to address differences in sequencing depths for different cells. 2) PCA is used to calculate a lower dimensional space that will be used as the input for downstream steps like UMAP visualization.
cds <- preprocess_cds(cds,
num_dim = 30)
look at the percentage of variance explained by our principal components
plot_pc_variance_explained(cds)
Note here we are using the results of PCA dimension reduction in the last step and visualizing the results in 2 dimensions.
umap.fast_sgd=FALSE and cores = 1 are needed for reproducible results
cds <- reduce_dimension(cds,
preprocess_method = 'PCA',
umap.fast_sgd=FALSE,
cores = 1)
let's take another look at our cell data set object
cds
let's examine our new
reducedDimNames
slot!head(reducedDims(cds)$UMAP)
view the cell types in our UMAP plot
plot_cells(cds,
color_cells_by = "celltype",
show_trajectory_graph = FALSE,
group_label_size = 3,
cell_size = 0.5)
Grouping cells into clusters is an important step in identifying the cell types represented in your data. Monocle uses a technique called community detection to group cells into cluster and partitions.
cds <- cluster_cells(cds,
k = 20,
partition_qval = 0.05)
plot_cells(cds,
color_cells_by = "partition",
show_trajectory_graph = FALSE,
group_label_size = 3,
cell_size = 0.5)
additionally we will run learn_graph to calculate trajectories on our subset!
cds <- learn_graph(cds,
use_partition = FALSE,
close_loop = TRUE)
now that we have subset out data, re-run our monocle workflow, and calculated
trajectories, let's see where we should set our root node!
plot_cells(cds,
color_cells_by = "celltype",
cell_size = 0.5,
labels_per_group = 0)
cluster.before.traj <-plot_cells(cds,
color_cells_by = "celltype",
label_groups_by_cluster = T,
group_label_size = 5) +
theme(legend.position = "right")
cluster.before.traj
head(colData(cds))
table(colData(cds)$celltype)
a helper function to identify the root principal points:
get_earliest_principal_node <- function(cds, celltype="C2_immNK1"){
cell_ids <- which(colData(cds)[, "celltype"] == "C2_immNK1")
closest_vertex <-
cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex
closest_vertex <- as.matrix(closest_vertex[colnames(cds), ])
root_pr_nodes <-
igraph::V(principal_graph(cds)[["UMAP"]])$name[as.numeric(names
(which.max(table(closest_vertex[cell_ids,]))))]
root_pr_nodes
}
cds = order_cells(cds, root_pr_nodes=get_earliest_principal_node(cds))
#cds <- order_cells(cds, reduction_method = "UMAP",
root_cells = colnames(cds[, clusters(cds) == 5]))
plot_cells(cds, color_cells_by = "pseudotime", label_groups_by_cluster = T,
label_branch_points = T, label_roots = F, label_leaves = F)
plot pseudo time after choosing root nodes
(defined as the bottom in the cycling progenitors cell type)
plot_cells(cds,
show_trajectory_graph = T,
color_cells_by = "pseudotime",
graph_label_size=3,
cell_size = 0.5)
plot_cells(cds,
show_trajectory_graph = T,
color_cells_by = "celltype",
graph_label_size=3,
cell_size = 0.5)
##############
how are our wild-type and mutant cells distributed in this graph?
plot_cells(cds_2,
color_cells_by = "treat",
show_trajectory_graph = F,
label_cell_groups = F,
cell_size = 0.5)
cds@colData$pseudotime=pseudotime(cds)
We find all the cells that are close to the starting point
cell_ids <- colnames(cds)[cds@colData$celltype == "C2_immNK1"]
closest_vertex <- cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex
closest_vertex <- as.matrix(closest_vertex[colnames(cds), ])
closest_vertex <- closest_vertex[cell_ids, ]
closest_vertex <- as.numeric(names(which.max(table(closest_vertex))))
mst <- principal_graph(cds)$UMAP
root_pr_nodes <- igraph::V(mst)$name[closest_vertex]
We compute the trajectory
cds <- order_cells(cds, root_pr_nodes = root_pr_nodes)
plot_cells(cds, color_cells_by = "pseudotime")
library(magrittr)
Get the closest vertice for every cell
y_to_cells <- principal_graph_aux(cds)$UMAP$pr_graph_cell_proj_closest_vertex %>%
as.data.frame()
y_to_cells$cells <- rownames(y_to_cells)
y_to_cells$Y <- y_to_cells$V1
Get the root vertices
It is the same node as above
root <- cds@principal_graph_aux$UMAP$root_pr_nodes
Get the other endpoints
endpoints <- names(which(igraph::degree(mst) == 1))
endpoints <- endpoints[!endpoints %in% root]
For each endpoint
cellWeights <- lapply(endpoints, function(endpoint) {
We find the path between the endpoint and the root
path <- igraph::shortest_paths(mst, root, endpoint)$vpath[[1]]
path <- as.character(path)
We find the cells that map along that path
df <- y_to_cells[y_to_cells$Y %in% path, ]
df <- data.frame(weights = as.numeric(colnames(cds) %in% df$cells))
colnames(df) <- endpoint
return(df)
}) %>% do.call(what = 'cbind', args = .) %>%
as.matrix()
rownames(cellWeights) <- colnames(cds)
pseudotime <- matrix(pseudotime(cds), ncol = ncol(cellWeights),
nrow = ncol(cds), byrow = FALSE)
counts <- as.matrix(data)
counts <- as.matrix(Biobase::exprs(cds))
sce <- tradeSeq::fitGAM(counts = counts,
pseudotime = pseudotime,
cellWeights = cellWeights)
#Error in .assignCells(cellWeights) :
Some cells have no positive cell weights.
i will be greatly appreciated if you can help me!
best
jiang
The text was updated successfully, but these errors were encountered: