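Single-cell transcriptome data analysis || scanpy tutorial: PAGA trajectory inference

Single-cell transcriptome data analysis || scanpy tutorial: preprocessing and clustering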

Mention trajectory inference and most people's first impression is monocle's trajectory plot, which most likely looks like this:

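If clustering in single-cell transcriptome analysis looks for the discrete properties of cells, then trajectory inference is an attempt to find the continuity of cell differentiation. Why would differentiation be both discrete and continuous? That is a historical issue. Differentiation is of course continuous; explaining heterogeneity through clustering is really a compromise forced on us. Every cell is unique and no cell is an island: that is our slogan, but ideal and reality rarely line up.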

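monocle offers an inspiring set of trajectory methods that try, in a simple and blunt way, to bridge this canyon between ideal and reality. In monocle's world the trajectory and the cell map are separate: the map lives in tsne/umap, while the trajectory lives in another reduced-dimensional space. Is there a dimensionality-reduction technique that goes one step further? scanpy's PAGA (graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells), which we introduce today, is one attempt in this direction. It infers cell trajectories while preserving the cell map: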

How does it achieve this? In essence, it unifies the spatial structure used for clustering and for trajectory inference.

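Partition-based graph abstraction (PAGA) produces a topology-preserving map of single cells. After dimensionality reduction of the high-dimensional gene expression data, a kNN graph is built from a distance metric on the neighborhood relations (PCA and Euclidean distance). The kNN graph is partitioned at the desired resolution, where each partition represents a group of connected cells. The Louvain algorithm is typically used for this, although the partitions can be obtained in any other way. The PAGA graph is obtained by associating one node with each partition and connecting the nodes with weighted edges, where each weight is a statistical measure of the connectivity between two partitions. By discarding spurious low-weight edges, the PAGA graph reveals the denoised topology of the data at the chosen resolution, including its connected and disconnected regions.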
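To make the idea of "weighted edges between partitions" concrete, here is a minimal plain-Python sketch on a toy graph. It scores each pair of clusters by the ratio of observed to expected inter-cluster edges under a simple random-connection null model. This is a simplified illustration, not scanpy's actual connectivity model; the function name `cluster_connectivity`, the toy graph and the threshold value are all made up for the example.

```python
from collections import Counter
from itertools import combinations

def cluster_connectivity(edges, labels):
    """Ratio of observed to expected inter-cluster edges (simplified null model)."""
    n_edges = len(edges)
    # Total number of edge ends ("degree") that fall into each cluster.
    degree_sum = Counter()
    for u, v in edges:
        degree_sum[labels[u]] += 1
        degree_sum[labels[v]] += 1
    # Observed number of edges running between two different clusters.
    observed = Counter()
    for u, v in edges:
        a, b = labels[u], labels[v]
        if a != b:
            observed[tuple(sorted((a, b)))] += 1
    weights = {}
    for a, b in combinations(sorted(degree_sum), 2):
        # Expected edge count if edge ends were paired at random.
        expected = degree_sum[a] * degree_sum[b] / (2 * n_edges)
        weights[(a, b)] = observed[(a, b)] / expected
    return weights

# Toy kNN graph: three triangles (clusters A, B, C).
# A and B share two edges, B and C share one, A and C share none.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (6, 7), (7, 8), (6, 8),
         (2, 3), (1, 4), (5, 6)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "C", 7: "C", 8: "C"}

weights = cluster_connectivity(edges, labels)
threshold = 0.1  # arbitrary cut-off for this toy example
paga_like_edges = {pair: w for pair, w in weights.items() if w > threshold}
print(paga_like_edges)
```

The A-C pair gets weight 0 and is dropped by the threshold, which is the "discard low-weight edges" step in miniature.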

Right, let's see how PAGA works in scanpy, shall we?

First we import our data:


import numpy as np
import pandas as pd
import scanpy as sc
import seaborn as sns
import matplotlib.pyplot as pl  # needed below for pl.figure, pl.subplots and pl.savefig




sc.settings.verbosity = 3             # verbosity: errors (0), warnings (1), info (2), hints (3)
sc.logging.print_versions()
sc.settings.set_figure_params(dpi=80)

scanpy==1.4.5.1 
anndata==0.7.1 
umap==0.3.10 
numpy==1.16.5 
scipy==1.3.1 
pandas==0.25.1 
scikit-learn==0.21.3 
statsmodels==0.10.1 
python-igraph==0.8.0

results_file = 'E:/learnscanpy/write/pbmc3k.h5ad'  # forward slashes: '\l', '\w', '\p' in a normal string are invalid escape sequences
help(sc.read_10x_mtx)
adata = sc.read_10x_mtx(
    'E:/learnscanpy/data/filtered_feature_bc_matrix',  # the directory with the `.mtx` file
    var_names='gene_symbols',                  # use gene symbols for the variable names (variables-axis index)
    cache=True) 
adata.var_names_make_unique()  # this is unnecessary if using `var_names='gene_ids'` in `sc.read_10x_mtx`

adata
Out[282]: 
AnnData object with n_obs × n_vars = 5025 × 33694 
    var: 'gene_ids', 'feature_types'

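Here we use the zheng17 preprocessing recipe simply because it is easy; you can also run the preprocessing step by step yourself, as in Single-cell transcriptome data analysis || scanpy tutorial: preprocessing and clustering.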

sc.pp.recipe_zheng17(adata)
sc.tl.pca(adata, svd_solver='arpack')
sc.pp.neighbors(adata, n_neighbors=4, n_pcs=20)
sc.tl.leiden(adata)
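Everything after this point rests on the neighbor graph, so here is a minimal sketch of what a kNN graph with Euclidean distance looks like, in plain Python. It is an exact brute-force search on made-up 2D points; `knn_graph` and the coordinates are hypothetical, and sc.pp.neighbors itself works differently under the hood (it also computes connectivity weights).

```python
import math

def knn_graph(points, k):
    """Return {i: [indices of the k nearest other points]} using Euclidean distance."""
    neighbors = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, sorted ascending.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors[i] = [j for _, j in dists[:k]]
    return neighbors

# Hypothetical low-dimensional coordinates (stand-ins for PCA scores of six cells).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (5.0, 5.3)]
graph = knn_graph(points, k=2)
print(graph)
```

With a small k such as the n_neighbors=4 used above, each cell only links to its closest neighbors, so the two far-apart groups in the toy data never connect.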

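You may ask what kind of cell QC procedure recipe_zheng17 actually performs. That's easy: look at its help documentation, where it is all written out for you.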

help(sc.pp.recipe_zheng17)

Help on function recipe_zheng17 in module scanpy.preprocessing._recipes:

recipe_zheng17(adata: anndata._core.anndata.AnnData, n_top_genes: int = 1000, log: bool = True, plot: bool = False, copy: bool = False) -> Union[anndata._core.anndata.AnnData, NoneType]
    Normalization and filtering as of [Zheng17]_.
    
    Reproduces the preprocessing of [Zheng17]_ – the Cell Ranger R Kit of 10x
    Genomics.
    
    Expects non-logarithmized data.
    If using logarithmized data, pass `log=False`.
    
    The recipe runs the following steps
    
    .. code:: python
    
        sc.pp.filter_genes(adata, min_counts=1)         # only consider genes with more than 1 count
        sc.pp.normalize_per_cell(                       # normalize with total UMI count per cell
             adata, key_n_counts='n_counts_all'
        )
        filter_result = sc.pp.filter_genes_dispersion(  # select highly-variable genes
            adata.X, flavor='cell_ranger', n_top_genes=n_top_genes, log=False
        )
        adata = adata[:, filter_result.gene_subset]     # subset the genes
        sc.pp.normalize_per_cell(adata)                 # renormalize after filtering
        if log: sc.pp.log1p(adata)                      # log transform: adata.X = log(adata.X + 1)
        sc.pp.scale(adata)                              # scale to unit variance and shift to zero mean
    
    
    Parameters
    ----------
    adata
        Annotated data matrix.
    n_top_genes
        Number of genes to keep.
    log
        Take logarithm.
    plot
        Show a plot of the gene dispersion vs. mean relation.
    copy
        Return a copy of `adata` instead of updating it.
    
    Returns
    -------
    Returns or updates `adata` depending on `copy`.
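To make the steps in the help text concrete, here is a minimal plain-Python sketch of the normalize, log-transform and scale steps on a toy matrix. It skips the gene filtering and highly-variable-gene selection. The toy counts and `target_sum=100` are made up, and the scaling uses the population standard deviation, which may differ from what scanpy does internally.

```python
import math
from statistics import mean, pstdev

def normalize_per_cell(counts, target_sum):
    """Scale every cell (row) so that its counts add up to target_sum."""
    return [[x * target_sum / sum(row) for x in row] for row in counts]

def log1p_matrix(matrix):
    """Apply log(x + 1) to every entry."""
    return [[math.log1p(x) for x in row] for row in matrix]

def scale_genes(matrix):
    """Shift each gene (column) to zero mean and scale it to unit variance."""
    scaled_columns = []
    for col in zip(*matrix):
        mu, sd = mean(col), pstdev(col)
        # A constant gene has no variance; map it to zeros instead of dividing by 0.
        scaled_columns.append([(x - mu) / sd if sd > 0 else 0.0 for x in col])
    return [list(row) for row in zip(*scaled_columns)]

counts = [[1, 3, 0], [10, 20, 10], [0, 2, 2]]  # toy matrix: 3 cells x 3 genes
normalized = normalize_per_cell(counts, target_sum=100)
scaled = scale_genes(log1p_matrix(normalized))
print(scaled)
```

After normalization the first two cells have the same value for the first gene even though their raw counts differ by a factor of ten, which is the point of correcting for sequencing depth.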

Try adding the plot parameter (plot=True): it works!

sc.tl.draw_graph(adata)
sc.pl.draw_graph(adata, color='leiden', legend_loc='on data')

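At this point you might ask: we have not run a 2D embedding such as tsne or umap, so which space does this cell map live in? That is a question for sc.tl.draw_graph: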

adata
Out[292]: 
AnnData object with n_obs × n_vars = 5025 × 1000 
    obs: 'n_counts_all', 'leiden'
    var: 'gene_ids', 'feature_types', 'n_counts'
    uns: 'log1p', 'pca', 'neighbors', 'leiden', 'draw_graph', 'leiden_colors'
    obsm: 'X_pca', 'X_draw_graph_fr'
    varm: 'PCs'

help(sc.tl.draw_graph)
Help on function draw_graph in module scanpy.tools._draw_graph:

draw_graph(adata: anndata._core.anndata.AnnData, layout: scanpy._compat.Literal_ = 'fa', init_pos: Union[str, bool, NoneType] = None, root: Union[int, NoneType] = None, random_state: Union[int, mtrand.RandomState, NoneType] = 0, n_jobs: Union[int, NoneType] = None, adjacency: Union[scipy.sparse.base.spmatrix, NoneType] = None, key_added_ext: Union[str, NoneType] = None, copy: bool = False, **kwds)
    Force-directed graph drawing [Islam11]_ [Jacomy14]_ [Chippada18]_.
    
    An alternative to tSNE that often preserves the topology of the data
    better. This requires to run :func:`~scanpy.pp.neighbors`, first.
    
    The default layout ('fa', `ForceAtlas2`) [Jacomy14]_ uses the package |fa2|_
    [Chippada18]_, which can be installed via `pip install fa2`.
    
    `Force-directed graph drawing`_ describes a class of long-established
    algorithms for visualizing graphs.
    It has been suggested for visualizing single-cell data by [Islam11]_.
    Many other layouts as implemented in igraph [Csardi06]_ are available.
    Similar approaches have been used by [Zunder15]_ or [Weinreb17]_.
    
    .. |fa2| replace:: `fa2`
    .. _fa2: https://github.com/bhargavchippada/forceatlas2
    .. _Force-directed graph drawing: https://en.wikipedia.org/wiki/Force-directed_graph_drawing
    
    Parameters
    ----------
    adata
        Annotated data matrix.
    layout
        'fa' (`ForceAtlas2`) or any valid `igraph layout
        <http://igraph.org/c/doc/igraph-Layout.html>`__. Of particular interest
        are 'fr' (Fruchterman Reingold), 'grid_fr' (Grid Fruchterman Reingold,
        faster than 'fr'), 'kk' (Kamadi Kawai', slower than 'fr'), 'lgl' (Large
        Graph, very fast), 'drl' (Distributed Recursive Layout, pretty fast) and
        'rt' (Reingold Tilford tree layout).
    root
        Root for tree layouts.
    random_state
        For layouts with random initialization like 'fr', change this to use
        different intial states for the optimization. If `None`, no seed is set.
    adjacency
        Sparse adjacency matrix of the graph, defaults to
        `adata.uns['neighbors']['connectivities']`.
    key_added_ext
        By default, append `layout`.
    proceed
        Continue computation, starting off with 'X_draw_graph_`layout`'.
    init_pos
        `'paga'`/`True`, `None`/`False`, or any valid 2d-`.obsm` key.
        Use precomputed coordinates for initialization.
        If `False`/`None` (the default), initialize randomly.
    copy
        Return a copy instead of writing to adata.
    **kwds
        Parameters of chosen igraph layout. See e.g. `fruchterman-reingold`_
        [Fruchterman91]_. One of the most important ones is `maxiter`.
    
        .. _fruchterman-reingold: http://igraph.org/python/doc/igraph.Graph-class.html#layout_fruchterman_reingold
    
    Returns
    -------
    Depending on `copy`, returns or updates `adata` with the following field.
    
    **X_draw_graph_layout** : `adata.obsm`
        Coordinates of graph layout. E.g. for layout='fa' (the default),
        the field is called 'X_draw_graph_fa'

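This is an alternative to tsne. The default layout is computed by the fa2 library, and the drawing method is force-directed graph drawing; the help text gives the relevant links. In the run above fa2 was not installed, so scanpy fell back to the 'fr' layout, which is why the coordinates are stored as X_draw_graph_fr.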
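For a feel of what "force-directed" means, here is a tiny Fruchterman-Reingold-style layout in plain Python: all nodes repel each other, nodes joined by an edge attract each other, and a cooling "temperature" limits how far a node moves per step. This is only a teaching sketch under my own parameter choices (`n_iter`, the 0.1 starting temperature, the toy graph), not the igraph or fa2 implementation that scanpy calls.

```python
import math
import random

def force_layout(n_nodes, edges, n_iter=200, seed=0):
    """Tiny Fruchterman-Reingold-style layout; returns one (x, y) tuple per node."""
    rng = random.Random(seed)
    pos = [[rng.random(), rng.random()] for _ in range(n_nodes)]
    k = math.sqrt(1.0 / n_nodes)  # preferred distance between nodes in a unit square

    def unit_vector(a, b):
        """Direction from b to a, plus the (clamped) distance between them."""
        dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
        dist = max(math.hypot(dx, dy), 1e-9)
        return dx / dist, dy / dist, dist

    for step in range(n_iter):
        disp = [[0.0, 0.0] for _ in range(n_nodes)]
        # Every pair of nodes repels each other.
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                ux, uy, dist = unit_vector(i, j)
                push = k * k / dist
                disp[i][0] += ux * push
                disp[i][1] += uy * push
                disp[j][0] -= ux * push
                disp[j][1] -= uy * push
        # Nodes joined by an edge attract each other.
        for u, v in edges:
            ux, uy, dist = unit_vector(u, v)
            pull = dist * dist / k
            disp[u][0] -= ux * pull
            disp[u][1] -= uy * pull
            disp[v][0] += ux * pull
            disp[v][1] += uy * pull
        # Move each node by at most the current temperature, which cools linearly.
        temperature = 0.1 * (1 - step / n_iter)
        for i in range(n_nodes):
            length = max(math.hypot(disp[i][0], disp[i][1]), 1e-9)
            move = min(length, temperature)
            pos[i][0] += disp[i][0] / length * move
            pos[i][1] += disp[i][1] / length * move
    return [tuple(p) for p in pos]

# Two triangles joined by a single edge, a stand-in for two groups of cells.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
layout = force_layout(6, edges)
print(layout)
```

Because the random start is seeded, the same call always gives the same layout, which is also what the random_state argument of sc.tl.draw_graph is for.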

So let's see what the umap structure looks like:

sc.tl.umap(adata)
sc.pl.umap(adata, color=['leiden'])

Since we have done umap, let's look at the tsne result as well:

sc.tl.tsne(adata)
sc.pl.tsne(adata, color=['leiden'])

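Sure enough, tsne is much slower; I waited quite a while...

Optional: denoising the graph

To denoise the graph, we represent it in diffusion map space (rather than PCA space). Computing distances within a few diffusion components amounts to denoising the graph: we keep only the first few spectral components. This is very similar to denoising a data matrix with PCA. Note that this step is not necessary for PAGA, clustering, or pseudotime estimation.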

sc.tl.diffmap(adata)
computing Diffusion Maps using n_comps=15(=n_dcs)
computing transitions
    finished (0:00:00)
    eigenvalues of transition matrix
    [1.         1.         1.         0.9997476  0.9994019  0.9990253
     0.9986317  0.99379796 0.9936346  0.9925993  0.99070626 0.9898019
     0.98812026 0.9873447  0.9869766 ]
    finished: added
    'X_diffmap', diffmap coordinates (adata.obsm)
    'diffmap_evals', eigenvalues of transition matrix (adata.uns) (0:00:00)

sc.pp.neighbors(adata, n_neighbors=10, use_rep='X_diffmap')
sc.tl.draw_graph(adata)
sc.pl.draw_graph(adata, color='leiden', legend_loc='on data')

It is still a bit messy, but somewhat cleaner:

Clustering and PAGA

Actually we have already clustered; otherwise where would the labels come from?

# sc.tl.louvain(adata, resolution=1.0)
# sc.tl.leiden(adata)
sc.tl.paga(adata, groups='leiden')
running PAGA
    finished: added
    'paga/connectivities', connectivities adjacency (adata.uns)
    'paga/connectivities_tree', connectivities subtree (adata.uns) (0:00:00)

adata.var_names
Out[304]: 
Index(['AP006222.2', 'MEGF6', 'RP11-46F15.2', 'GPR153', 'RP5-1113E3.3',
       'RP4-734G22.3', 'MASP2', 'RP3-467K16.2', 'RP4-680D5.2', 'AKR7A3',
       ...
       'MT-CO1', 'MT-CO2', 'MT-ATP6', 'MT-CO3', 'MT-ND3', 'MT-ND4', 'MT-ND5',
       'MT-CYB', 'AC145212.2', 'AC011043.1'],
      dtype='object', length=1000)

adata
Out[308]: 
AnnData object with n_obs × n_vars = 5025 × 1000 
    obs: 'n_counts_all', 'leiden'
    var: 'gene_ids', 'feature_types', 'n_counts'
    uns: 'log1p', 'pca', 'neighbors', 'leiden', 'draw_graph', 'leiden_colors', 'umap', 'diffmap_evals', 'paga', 'leiden_sizes'
    obsm: 'X_pca', 'X_draw_graph_fr', 'X_umap', 'X_tsne', 'X_diffmap'
    varm: 'PCs'

sc.pl.paga(adata, color=['leiden', 'MEGF6', 'MT-CO2', 'AKR7A3'])
--> added 'pos', the PAGA positions (adata.uns['paga'])

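What we see is a skeleton graph, in fact an undirected network. Node color represents the cluster, node size represents the number of cells in that cluster, and the edges all share one color, with thicker edges meaning a stronger connection between two clusters. There is no notion of pseudotime here. A trajectory is a relationship between groups, which may or may not be temporal; human intuition tends to treat time as one-directional, yet every cell has its own direction of differentiation. A simple skeleton graph like this makes us rethink the basic fact of cell differentiation! Below, however, we will infer a pseudotime structure on top of it.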

adata.obs['leiden'].cat.categories
Out[306]: 
Index(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
       '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24',
       '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35'],
      dtype='object')

When you meet a core function, read its documentation even if you have no particular questions:

help(sc.tl.paga)
Help on function paga in module scanpy.tools._paga:

paga(adata: anndata._core.anndata.AnnData, groups: Union[str, NoneType] = None, use_rna_velocity: bool = False, model: scanpy._compat.Literal_ = 'v1.2', copy: bool = False)
    Mapping out the coarse-grained connectivity structures of complex manifolds [Wolf19]_.
    
    By quantifying the connectivity of partitions (groups, clusters) of the
    single-cell graph, partition-based graph abstraction (PAGA) generates a much
    simpler abstracted graph (*PAGA graph*) of partitions, in which edge weights
    represent confidence in the presence of connections. By tresholding this
    confidence in :func:`~scanpy.pl.paga`, a much simpler representation of the
    manifold data is obtained, which is nonetheless faithful to the topology of
    the manifold.
    
    The confidence should be interpreted as the ratio of the actual versus the
    expected value of connetions under the null model of randomly connecting
    partitions. We do not provide a p-value as this null model does not
    precisely capture what one would consider "connected" in real data, hence it
    strongly overestimates the expected value. See an extensive discussion of
    this in [Wolf19]_.
    
    .. note::
        Note that you can use the result of :func:`~scanpy.pl.paga` in
        :func:`~scanpy.tl.umap` and :func:`~scanpy.tl.draw_graph` via
        `init_pos='paga'` to get single-cell embeddings that are typically more
        faithful to the global topology.
    
    Parameters
    ----------
    adata
        An annotated data matrix.
    groups
        Key for categorical in `adata.obs`. You can pass your predefined groups
        by choosing any categorical annotation of observations. Default:
        The first present key of `'leiden'` or `'louvain'`.
    use_rna_velocity
        Use RNA velocity to orient edges in the abstracted graph and estimate
        transitions. Requires that `adata.uns` contains a directed single-cell
        graph with key `['velocity_graph']`. This feature might be subject
        to change in the future.
    model
        The PAGA connectivity model.
    copy
        Copy `adata` before computation and return a copy. Otherwise, perform
        computation inplace and return `None`.
    
    Returns
    -------
    **connectivities** : :class:`numpy.ndarray` (adata.uns['connectivities'])
        The full adjacency matrix of the abstracted graph, weights correspond to
        confidence in the connectivities of partitions.
    **connectivities_tree** : :class:`scipy.sparse.csr_matrix` (adata.uns['connectivities_tree'])
        The adjacency matrix of the tree-like subgraph that best explains
        the topology.
    
    Notes
    -----
    Together with a random walk-based distance measure
    (e.g. :func:`scanpy.tl.dpt`) this generates a partial coordinatization of
    data useful for exploring and explaining its variation.
    
    .. currentmodule:: scanpy
    
    See Also
    --------
    pl.paga
    pl.paga_path
    pl.paga_compare

We see the use_rna_velocity parameter, so scanpy can also make use of RNA velocity data.

Let's pretend we have already annotated the cell types:

# rename_categories avoids assigning to .cat.categories, which newer pandas versions no longer allow
adata.obs['leiden_anno'] = adata.obs['leiden'].cat.rename_categories(
    ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10/Ery', '11', '12',
     '13', '14', '15', '16/Stem', '17', '18', '19/Neu', '20/Mk', '21', '22/Baso', '23', '24/Mo',
     '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35'])
sc.tl.paga(adata, groups='leiden_anno')
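Typing out 36 category names by hand is easy to get wrong, so here is a small sketch that builds the same list from a dictionary of annotations. The labels are copied from the pretend annotation above and are placeholders, not real cell-type calls; the variable names are my own.

```python
# The 36 leiden categories shown earlier are the strings '0' to '35'.
cluster_ids = [str(i) for i in range(36)]

# Placeholder labels copied from this post, keyed by cluster id.
annotations = {"10": "Ery", "16": "Stem", "19": "Neu", "20": "Mk", "22": "Baso", "24": "Mo"}

# Annotated clusters become 'id/label'; all others keep their plain id.
annotated = [f"{cid}/{annotations[cid]}" if cid in annotations else cid
             for cid in cluster_ids]
print(annotated[16], annotated[17])
```

The resulting list has the same order and length as the original categories, so it can be passed to pandas' rename_categories in place of the hand-written list.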

sc.pl.paga(adata, threshold=0.03, show=False)

Recompute the embedding using the PAGA result, so that the single-cell layout is consistent with the PAGA structure:

sc.tl.draw_graph(adata, init_pos='paga')
sc.pl.draw_graph(adata, color=['leiden_anno','MEGF6', 'MT-CO2', 'AKR7A3'], legend_loc='on data')

drawing single-cell graph using layout 'fa'
WARNING: Package 'fa2' is not installed, falling back to layout 'fr'.To use the faster and better ForceAtlas2 layout, install package 'fa2' (`pip install fa2`).

    finished: added
    'X_draw_graph_fr', graph_drawing coordinates (adata.obsm) (0:00:18)

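Of course this plot is not what we hoped for; it still looks somewhat messy. Following the hint in the warning, perhaps we should pip install fa2, so we did, and looked at the result again: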

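forceatlas2 is a port of Gephi's Force Atlas 2 layout algorithm to Python 2 and Python 3 (with wrappers for NetworkX and igraph). According to the project, it is the fastest Python implementation, with most of the features complete. It also supports the Barnes-Hut approximation for maximum speedup. ForceAtlas2 is a very fast force-directed graph layout algorithm. It is used to spatialize a weighted undirected graph in 2D (edge weights define the strength of the connection). The implementation is based on the original paper and the corresponding gephi-java code. Compared with networkx's Fruchterman Reingold algorithm (spring layout) it is really quite fast, and it scales well to large numbers of nodes (>10000).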

Back to help(sc.tl.draw_graph): the layout option supports a range of igraph layouts:

layout
        'fa' (`ForceAtlas2`) or any valid `igraph layout
        <http://igraph.org/c/doc/igraph-Layout.html>`__. Of particular interest
        are 'fr' (Fruchterman Reingold), 'grid_fr' (Grid Fruchterman Reingold,
        faster than 'fr'), 'kk' (Kamadi Kawai', slower than 'fr'), 'lgl' (Large
        Graph, very fast), 'drl' (Distributed Recursive Layout, pretty fast) and
        'rt' (Reingold Tilford tree layout).

sc.tl.draw_graph(adata, init_pos='paga',layout= 'lgl')
sc.pl.draw_graph(adata, color=['leiden_anno','MEGF6', 'MT-CO2', 'AKR7A3'], legend_loc='on data')

sc.tl.draw_graph(adata, init_pos='paga',layout= 'kk')
sc.pl.draw_graph(adata, color=['leiden_anno','MEGF6', 'MT-CO2', 'AKR7A3'], legend_loc='on data')

Let's stick with fa. One can't help wondering what monocle's layout is, and why it looks so clean.

pl.figure(figsize=(8, 2))
for i in range(28):
    pl.scatter(i, 1, c=sc.pl.palettes.zeileis_28[i], s=200)
pl.show()

zeileis_colors = np.array(sc.pl.palettes.zeileis_28)
new_colors = np.array(adata.uns['leiden_anno_colors'])

new_colors[[16]] = zeileis_colors[[12]]  # Stem colors / green
new_colors[[10, 17, 5, 3, 15, 6, 18, 13, 7, 12]] = zeileis_colors[[5, 5, 5, 5, 11, 11, 10, 9, 21, 21]]  # Ery colors / red
new_colors[[20, 8]] = zeileis_colors[[17, 16]]  # Mk early Ery colors / yellow
new_colors[[4, 0]] = zeileis_colors[[2, 8]]  # lymph progenitors / grey
new_colors[[22]] = zeileis_colors[[18]]  # Baso / turquoise
new_colors[[19, 14, 2]] = zeileis_colors[[6, 6, 6]]  # Neu / light blue
new_colors[[24, 9, 1, 11]] = zeileis_colors[[0, 0, 0, 0]]  # Mo / dark blue
new_colors[[21, 23]] = zeileis_colors[[25, 25]]  # outliers / grey

adata.uns['leiden_anno_colors'] = new_colors
sc.pl.paga_compare(
    adata, threshold=0.03, title='', right_margin=0.2, size=10, edge_width_scale=0.5,
    legend_fontsize=12, fontsize=12, frameon=False, edges=True, save=True)

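Next we draw pseudotime on this cell map. Although it is called pseudotime inference, it requires specifying a group as the starting point. Real samples, however, sometimes have no clear starting point, and sometimes development happens in parallel, even though in the long run all the cells of an organism come from a single cell. How to determine this starting point is like the question of Newton's first mover of the universe: you have to step outside the data to think about it. Here, to get the job done, we simply pick a developmental starting point.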

adata.uns['iroot'] = np.flatnonzero(adata.obs['leiden_anno']  == '16/Stem')[0]
sc.tl.dpt(adata)

adata
Out[332]: 
AnnData object with n_obs × n_vars = 5025 × 1000 
    obs: 'n_counts_all', 'leiden', 'leiden_anno', 'dpt_pseudotime'
    var: 'gene_ids', 'feature_types', 'n_counts'
    uns: 'log1p', 'pca', 'neighbors', 'leiden', 'draw_graph', 'leiden_colors', 'umap', 'diffmap_evals', 'paga', 'leiden_sizes', 'leiden_anno_sizes', 'leiden_anno_colors', 'iroot'
    obsm: 'X_pca', 'X_draw_graph_fa', 'X_umap', 'X_tsne', 'X_diffmap', 'X_draw_graph_lgl', 'X_draw_graph_kk'
    varm: 'PCs'

adata.obs['dpt_pseudotime']
Out[333]: 
AAACCCAAGCGTATGG-1    0.661545
AAACCCAGTCCTACAA-1    0.670825
AAACCCATCACCTCAC-1    0.000000
AAACGCTAGGGCATGT-1    0.839952
AAACGCTGTAGGTACG-1    0.816287
  
TTTGTTGCAGGTACGA-1    0.826858
TTTGTTGCAGTCTCTC-1    0.792751
TTTGTTGGTAATTAGG-1    0.329366
TTTGTTGTCCTTGGAA-1    0.905214
TTTGTTGTCGCACGAC-1    0.771131
Name: dpt_pseudotime, Length: 5025, dtype: float32
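To see why a root is needed, here is a deliberately crude stand-in for pseudotime: the number of hops from the root cell along the neighbor graph, rescaled to [0, 1]. This is plain breadth-first search, not the diffusion pseudotime that sc.tl.dpt computes; the toy graph, the labels and the function name `hop_pseudotime` are made up for illustration.

```python
from collections import deque

def hop_pseudotime(adjacency, root):
    """Breadth-first hop distance from root, rescaled to the range [0, 1]."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    longest = max(dist.values()) or 1  # avoid dividing by 0 for a single-cell graph
    # In this sketch, cells that cannot be reached from the root get infinity.
    return {n: dist[n] / longest if n in dist else float("inf") for n in adjacency}

# Toy data: five cells with cluster labels and an undirected neighbor graph.
labels = ["Stem", "Stem", "Ery", "Ery", "Mo"]
adjacency = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}

# First cell of the chosen root cluster, mirroring np.flatnonzero(...)[0] above.
root = labels.index("Stem")
pseudotime = hop_pseudotime(adjacency, root)
print(pseudotime)
```

As in the real output above, where one cell has dpt_pseudotime 0.000000, the root cell sits at 0 and every other value only has meaning relative to that choice.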

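This makes us wonder: since in the long run all the cells of an organism come from one cell, could we place a distant point (one that does not actually exist in the data) as the developmental origin of a dataset? Would that reflect the reality of trajectory inference better? As far as I know, no such algorithm exists yet.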

sc.tl.draw_graph(adata, init_pos='paga')
sc.pl.draw_graph(adata, color=['leiden_anno', 'dpt_pseudotime'], legend_loc='on data')

You can define custom developmental paths:

paths = [('erythrocytes', [16, 12, 7, 13, 18, 6, 5, 10]),
         ('neutrophils', [16, 0, 4, 2, 14, 19]),
         ('monocytes', [16, 0, 4, 11, 1, 9, 24])]
adata.obs['distance'] = adata.obs['dpt_pseudotime']
adata.obs['clusters'] = adata.obs['leiden_anno'] 
adata.uns['clusters_colors'] = adata.uns['leiden_anno_colors']
gene_names = adata.var_names[1:10]
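The paths above are written as integers. Assuming these are positions in the category list of the grouping used for PAGA (here the pretend annotation from earlier), a quick sanity check is to translate each path back into names and confirm it starts and ends where you expect. The sketch below rebuilds the category list by hand so that it runs on its own.

```python
# Rebuild the annotated category list used earlier ('0'..'35' with a few labels).
categories = [str(i) for i in range(36)]
for i, name in {10: "Ery", 16: "Stem", 19: "Neu", 20: "Mk", 22: "Baso", 24: "Mo"}.items():
    categories[i] = f"{i}/{name}"

paths = [("erythrocytes", [16, 12, 7, 13, 18, 6, 5, 10]),
         ("neutrophils", [16, 0, 4, 2, 14, 19]),
         ("monocytes", [16, 0, 4, 11, 1, 9, 24])]

# Translate every index in a path into its category name.
named_paths = {descr: [categories[i] for i in path] for descr, path in paths}
for descr, names in named_paths.items():
    print(descr, "->", " > ".join(names))
```

All three paths start at 16/Stem and end at the cluster carrying the matching label, which is what the path descriptions promise.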

adata
Out[338]: 
AnnData object with n_obs × n_vars = 5025 × 1000 
    obs: 'n_counts_all', 'leiden', 'leiden_anno', 'dpt_pseudotime', 'distance', 'clusters'
    var: 'gene_ids', 'feature_types', 'n_counts'
    uns: 'log1p', 'pca', 'neighbors', 'leiden', 'draw_graph', 'leiden_colors', 'umap', 'diffmap_evals', 'paga', 'leiden_sizes', 'leiden_anno_sizes', 'leiden_anno_colors', 'iroot', 'clusters_colors'
    obsm: 'X_pca', 'X_draw_graph_fa', 'X_umap', 'X_tsne', 'X_diffmap', 'X_draw_graph_lgl', 'X_draw_graph_kk'
    varm: 'PCs'

_, axs = pl.subplots(ncols=3, figsize=(6, 2.5), gridspec_kw={'wspace': 0.05, 'left': 0.12})
pl.subplots_adjust(left=0.05, right=0.98, top=0.82, bottom=0.2)
for ipath, (descr, path) in enumerate(paths):
    _, data = sc.pl.paga_path(
        adata, path, gene_names,
        show_node_names=False,
        ax=axs[ipath],
        ytick_fontsize=12,
        left_margin=0.15,
        n_avg=50,
        annotations=['distance'],
        show_yticks=True if ipath==0 else False,
        show_colorbar=False,
        color_map='Greys',
        groups_key='clusters',
        color_maps_annotations={'distance': 'viridis'},
        title='{} path'.format(descr),
        return_data=True,
        show=False)
    data.to_csv('./write/paga_path_{}.csv'.format(descr))
pl.savefig('./figures/paga_path_paul15.pdf')
pl.show()

Look closely at this figure: does it remind you of monocle's plots of how genes change along different fates?


Last edited: 2020.03.26 07:58:05
