Package 'Azimuth'

Title: A Shiny App Demonstrating a Query-Reference Mapping Algorithm for Single-Cell Data
Description: Azimuth uses an annotated reference dataset to automate the processing, analysis, and interpretation of a new single-cell RNA-seq or ATAC-seq experiment. Azimuth leverages a 'reference-based mapping' pipeline that inputs a counts matrix and performs normalization, visualization, cell annotation, and differential expression (biomarker discovery).
Authors: Andrew Butler [aut] , Charlotte Darby [aut] , Yuhan Hao [aut] , Austin Hartman [aut] , Paul Hoffman [aut, cre] , Jaison Jain [ctb] , Gesmira Molla [aut] , Rahul Satija [aut] , Satija Lab and Collaborators [fnd]
Maintainer: Paul Hoffman <[email protected]>
License: GPL-3 | file LICENSE
Version: 0.5.0
Built: 2024-11-12 06:35:53 UTC
Source: https://github.com/satijalab/azimuth


Azimuth: A Shiny App Demonstrating a Query-Reference Mapping Algorithm for Single-Cell Data

Description

Azimuth uses an annotated reference dataset to automate the processing, analysis, and interpretation of a new single-cell RNA-seq or ATAC-seq experiment. Azimuth leverages a 'reference-based mapping' pipeline that inputs a counts matrix and performs normalization, visualization, cell annotation, and differential expression (biomarker discovery).

Package options

Azimuth uses the following options to control the behavior of the app. Users can provide these as named arguments to AzimuthApp through dots (...), specify them in the config file, or set them globally with options().

App options

The following options control app behavior

Azimuth.app.default_adt

ADT to select by default in feature/violin plot

Azimuth.app.default_gene

Gene to select by default in feature/violin plot

Azimuth.app.default_metadata

Default metadata transferred from reference.

Azimuth.app.demodataset

Path to data file (in any Azimuth-supported format) to automatically load when the user clicks a button. The button is only available in the UI if this option is set

Azimuth.app.googlesheet

Google Sheet identifier (appropriate for use with googlesheets4::gs4_get) to write log records. Logging is only enabled if this and other google* options are set

Azimuth.app.googletoken

Path to directory containing Google Authentication token file. Logging is only enabled if this and other google* options are set

Azimuth.app.googletokenemail

Email address corresponding to the Google Authentication token file. Logging is only enabled if this and other google* options are set

Azimuth.app.max_cells

Maximum number of cells allowed to upload

Azimuth.app.metadata_notransfer

Metadata to annotate in reference but not transfer to query

Azimuth.app.mito

Regular expression pattern indicating mitochondrial features in query object

Azimuth.app.plotseed

Seed to shuffle colors for cell types

Azimuth.app.reference

URL or directory path to reference dataset; see LoadReference for more details

Azimuth.app.refuri

URL for publicly available reference dataset, used in the downloadable analysis script in case Azimuth.app.reference points to a directory

Azimuth.app.refdescriptor

Provide (as a string) the html to render the reference description on the welcome page

Azimuth.app.welcomebox

Provide (as a string) the code to render the box on the welcome page (quotes escaped). Example:

box(
  h3(\"Header\"),
  \"body text\",
  a(\"link\", href=\"www.satijalab.org\", target=\"_blank\"),
  width = 12
)
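As a minimal sketch of how such a string might be prepared in an R session, the snippet below stores the box code as one string and checks that it parses as R code before registering it. The variable name welcome is illustrative, and the shortened box content is a placeholder.

```r
# Box code held as a single string; inner quotes are escaped
welcome <- "box(h3(\"Header\"), \"body text\", width = 12)"

# Check that the string is syntactically valid R before handing it over
parsed <- parse(text = welcome)
length(parsed)  # 1 top-level expression

# Register it globally; it could also be passed to AzimuthApp through ...
options(Azimuth.app.welcomebox = welcome)
```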

Azimuth.app.homologs

URL or path to file containing the human/mouse homolog table.

Azimuth.app.metatableheatmap

Display the meta.data table as a heatmap rather than in tabular form. Defaults to FALSE

Azimuth.app.overlayedreference

Display the mapped query on top of the greyed-out reference in the 'Cell Plots' tab. Defaults to FALSE

Control options

These options control mapping and analysis behavior

Azimuth.map.ncells

Minimum number of cells required to accept uploaded file; defaults to 100

Azimuth.map.ngenes

Minimum number of genes in common with reference to accept uploaded file; defaults to 250

Azimuth.map.nanchors

Minimum number of anchors that must be found to complete mapping. Defaults to 50

Azimuth.map.panchorscolors

Configure the valuebox on the main page corresponding to the values for failure, warning, and success for the fraction of unique query cells that participate in anchor pairs. Failure corresponds to [0, Azimuth.map.panchorscolors[1]), warning to [Azimuth.map.panchorscolors[1], Azimuth.map.panchorscolors[2]), and success is >= Azimuth.map.panchorscolors[2]. Defaults to c(5, 15)

Azimuth.map.postmapqccolors

Configure the valuebox on the main page corresponding to the values for failure, warning, and success for the post-mapping cluster-based QC metric. Failure corresponds to [0, Azimuth.map.postmapqccolors[1]), warning to [Azimuth.map.postmapqccolors[1], Azimuth.map.postmapqccolors[2]), and success is >= Azimuth.map.postmapqccolors[2]. Defaults to c(0.15, 0.25)
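The half-open intervals used by the two *colors options can be sketched in base R as follows. The helper classify_metric is hypothetical and is not part of Azimuth; it only illustrates how a two-element threshold vector splits a metric into failure, warning, and success.

```r
# Hypothetical helper, not Azimuth code: classify a metric against a
# two-element threshold vector such as c(5, 15) or c(0.15, 0.25)
classify_metric <- function(value, thresholds) {
  stopifnot(length(thresholds) == 2, thresholds[1] <= thresholds[2])
  if (value < thresholds[1]) {
    "failure"   # value in [0, thresholds[1])
  } else if (value < thresholds[2]) {
    "warning"   # value in [thresholds[1], thresholds[2])
  } else {
    "success"   # value >= thresholds[2]
  }
}

classify_metric(3, c(5, 15))          # "failure"
classify_metric(5, c(5, 15))          # "warning"
classify_metric(0.25, c(0.15, 0.25))  # "success"
```

Note that a value exactly equal to a threshold falls into the higher category, because the lower bound of each interval is inclusive.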

Azimuth.map.postmapqcds

Set the amount of query random downsampling to perform before computing the mapping QC metric. Defaults to 5000

Azimuth.map.ntrees

Annoy (approximate nearest neighbor) n.trees parameter; defaults to 20

Azimuth.map.ndims

Number of dimensions to use in FindTransferAnchors and TransferData; defaults to 50

Azimuth.de.mincells

Minimum number of cells per cluster for differential expression; defaults to 15

Azimuth.de.digits

Number of digits to round differential expression table to; defaults to 3

Azimuth.sct.ncells, Azimuth.sct.nfeats

Number of cells and features to use for SCTransform, respectively. Defaults to 2000 for each

External options

The following options are used by external dependencies that have an effect on Azimuth's behavior. Refer to original package documentation for more details

shiny.maxRequestSize

User-configurable; used for controlling the maximum file size of uploaded datasets. Defaults to 500 MB

DT.options

User-configurable; used for controlling biomarker table outputs. Defaults to setting pageLength to 10

future.globals.maxSize

Non-configurable; used for parallelization. Defaults to Azimuth.app.max_cells * 320000
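A short sketch of setting the two user-configurable external options globally is shown below. The specific values (1000 MB, five rows per page, 100,000 cells) are arbitrary examples, not recommendations; shiny.maxRequestSize is expressed in bytes.

```r
# Raise the upload limit to 1000 MB (the option is given in bytes)
options(shiny.maxRequestSize = 1000 * 1024^2)

# Show five rows per page in DT tables instead of ten
options(DT.options = list(pageLength = 5))

# Size that future.globals.maxSize would take under the stated formula
# for a hypothetical Azimuth.app.max_cells of 100,000
max_cells <- 100000
globals_max <- max_cells * 320000  # 3.2e10 bytes
```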

Author(s)

Maintainer: Paul Hoffman [email protected] (ORCID)

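Authors:

  • Andrew Butler

  • Charlotte Darby

  • Yuhan Hao

  • Austin Hartman

  • Gesmira Molla

  • Rahul Satija

Other contributors:

  • Jaison Jain [contributor]

  • Satija Lab and Collaborators [funder]

See Also

Useful links:

  • https://github.com/satijalab/azimuth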


Launch the mapping app

Description

Launch the mapping app

Usage

AzimuthApp(config = NULL, ...)

Arguments

config

Path to JSON-formatted configuration file specifying options; for an example config file, see system.file("resources", "config.json", package = "Azimuth")

...

Options to set, see ?`Azimuth-package` for details on Azimuth-provided options

Value

None, launches the mapping Shiny app

Specifying options

R options can be provided as named arguments to AzimuthApp through dots (...), set in a config file, or set globally. Arguments provided to AzimuthApp through dots take precedence if the same option is provided in a config file. Options provided through dots or a config file take precedence if the same option was set globally.

Options in the Azimuth.app namespace can be specified using a shorthand notation in both the config file and as arguments to AzimuthApp. For example, the option Azimuth.app.reference can be shortened to reference in the config file or as an argument to AzimuthApp.
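The precedence and shorthand rules in this section can be illustrated with a small base R sketch. The function resolve_options is hypothetical and is not how Azimuth itself resolves options; it also assumes, for simplicity, that any name without a dot is shorthand for the Azimuth.app namespace. The reference path is a placeholder.

```r
# Hypothetical illustration of the precedence rules; not Azimuth's own code
resolve_options <- function(global = list(), config = list(), dots = list()) {
  expand <- function(x) {
    if (length(x) == 0) {
      return(x)
    }
    # Assumption: names without a dot, e.g. "reference", are shorthand
    # for "Azimuth.app.reference"
    short <- !grepl(".", names(x), fixed = TRUE)
    names(x)[short] <- paste0("Azimuth.app.", names(x)[short])
    x
  }
  # Later sources overwrite earlier ones: global < config < dots
  utils::modifyList(utils::modifyList(global, expand(config)), expand(dots))
}

resolved <- resolve_options(
  global = list(Azimuth.app.max_cells = 50000, Azimuth.map.ncells = 100),
  config = list(max_cells = 75000, reference = "path/to/reference"),
  dots   = list(max_cells = 100000)
)
resolved$Azimuth.app.max_cells  # value from dots wins
resolved$Azimuth.map.ncells     # global value survives, nothing overrides it
```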

See Also

Azimuth-package

Examples

if (interactive()) {
  AzimuthApp(system.file("resources", "config.json", package = "Azimuth"))
}

Create a Seurat object compatible with Azimuth.

Description

Create a Seurat object compatible with Azimuth.

Usage

AzimuthBridgeReference(
  object,
  reference.reduction = "spca",
  bridge.ref.reduction = "ref.spca",
  bridge.query.reduction = "slsi",
  laplacian.reduction = "lap",
  refUMAP = "wnn.umap",
  refAssay = "SCT",
  dims = 1:50,
  plotref = "wnn.umap",
  plot.metadata = NULL,
  ori.index = NULL,
  colormap = NULL,
  assays = c("Bridge", "RNA"),
  metadata = NULL,
  reference.version = "0.0.0",
  verbose = FALSE
)

Arguments

object

Seurat object

refUMAP

Name of UMAP in reference to use for mapping

refAssay

Name of SCTAssay to use in reference

dims

Dimensions to use in reference neighbor finding

plotref

Either the name of the DimReduc in the provided Seurat object to use for the plotting reference or the DimReduc object itself.

plot.metadata

A data.frame of discrete metadata fields for the cells in the plotref.

ori.index

Index of the cells used in mapping in the original object on which UMAP was run. Only needed if UMAP was run on a different set of cells.

colormap

A list of named and ordered vectors specifying the colors and levels for the metadata. See CreateColorMap for help generating your own.

assays

Assays to retain for transfer

metadata

Metadata to retain for transfer

reference.version

Version of the Azimuth reference

verbose

Display progress/messages

refDR

Name of DimReduc in reference to use for mapping

k.param

Defines k for the k-nearest neighbor algorithm

Value

Returns a Seurat object with AzimuthData stored in the tools slot for use with Azimuth.


AzimuthData

Description

The AzimuthData class is used to store reference info needed for Azimuth

Slots

plotref

DimReduc object containing UMAP for plotting and projection. This should also contain the cell IDs in the misc slot

colormap

Vector of id-color mapping for specifying the plots.

seurat.version

Version of Seurat used in reference construction

azimuth.version

Version of Azimuth used in reference construction

reference.version

Version of the Azimuth reference


Create a Seurat object compatible with Azimuth.

Description

Create a Seurat object compatible with Azimuth.

Usage

AzimuthReference(
  object,
  refUMAP = "umap",
  refDR = "spca",
  refAssay = "SCT",
  dims = 1:50,
  k.param = 31,
  plotref = "umap",
  plot.metadata = NULL,
  ori.index = NULL,
  colormap = NULL,
  assays = NULL,
  metadata = NULL,
  reference.version = "0.0.0",
  verbose = FALSE
)

Arguments

object

Seurat object

refUMAP

Name of UMAP in reference to use for mapping

refDR

Name of DimReduc in reference to use for mapping

refAssay

Name of SCTAssay to use in reference

dims

Dimensions to use in reference neighbor finding

k.param

Defines k for the k-nearest neighbor algorithm

plotref

Either the name of the DimReduc in the provided Seurat object to use for the plotting reference or the DimReduc object itself.

plot.metadata

A data.frame of discrete metadata fields for the cells in the plotref.

ori.index

Index of the cells used in mapping in the original object on which UMAP was run. Only needed if UMAP was run on a different set of cells.

colormap

A list of named and ordered vectors specifying the colors and levels for the metadata. See CreateColorMap for help generating your own.

assays

Assays to retain for transfer

metadata

Metadata to retain for transfer

reference.version

Version of the Azimuth reference

verbose

Display progress/messages

Value

Returns a Seurat object with AzimuthData stored in the tools slot for use with Azimuth.
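One possible workflow around AzimuthReference is sketched below: build the reference, validate it, and save it. The wrapper build_and_save_reference, the metadata column name "celltype", the version string, and the folder name are all placeholders; ref.obj stands for a fully processed Seurat object that already contains the named assay and reductions.

```r
# Sketch only: `ref.obj` is a hypothetical, fully processed Seurat object.
# The metadata column "celltype" and the folder name are placeholders.
build_and_save_reference <- function(ref.obj, folder = "azimuth_reference") {
  ref <- Azimuth::AzimuthReference(
    object = ref.obj,
    refUMAP = "umap",
    refDR = "spca",
    refAssay = "SCT",
    dims = 1:50,
    metadata = "celltype",
    reference.version = "1.0.0"
  )
  # Check the AzimuthData stored in the tools slot before writing to disk
  Azimuth::ValidateAzimuthReference(ref)
  # Write the reference and its neighbor index into one folder
  Azimuth::SaveAzimuthReference(object = ref, folder = folder)
  invisible(ref)
}
```

Defining the steps as a function keeps the sketch loadable without data; calling it requires Azimuth and a suitable Seurat object.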


Converts gene names of query to match type/species of reference names (human or mouse).

Description

Converts gene names of query to match type/species of reference names (human or mouse).

Usage

ConvertGeneNames(object, reference.names, homolog.table)

Arguments

object

Object to convert, must contain only RNA counts matrix

reference.names

Gene names of reference

homolog.table

Location of file (or URL) containing table with human/mouse homologies

Value

query object with converted feature names, likely subsetted
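The idea of converting names and ending up with a subset of features can be sketched in base R. Everything here is illustrative: the table layout, the column names mouse and human, and the gene lists are assumptions, and this is not the implementation used by ConvertGeneNames.

```r
# Hypothetical homolog table; the real file's layout is not described here
homologs <- data.frame(
  mouse = c("Cd3e", "Ms4a1", "GeneA"),
  human = c("CD3E", "MS4A1", NA),
  stringsAsFactors = FALSE
)
query.genes <- c("Cd3e", "Ms4a1", "GeneA", "Xist")
reference.names <- c("CD3E", "MS4A1", "XIST")

# Translate each query gene; genes without a table entry become NA
converted <- homologs$human[match(query.genes, homologs$mouse)]

# Keep only genes that were translated and that exist in the reference
keep <- !is.na(converted) & converted %in% reference.names
mapping <- stats::setNames(converted[keep], query.genes[keep])
mapping
```

Only two of the four query genes survive, which is why the returned object is described as likely subsetted.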


Create an AzimuthData object

Description

Create an auxiliary AzimuthData object for storing necessary info when generating an Azimuth reference.

Usage

CreateAzimuthData(
  object,
  plotref = "umap",
  plot.metadata = NULL,
  colormap = NULL,
  reference.version = "0.0.0"
)

Arguments

object

Seurat object

plotref

Either the name of the DimReduc in the provided Seurat object to use for the plotting reference or the DimReduc object itself.

plot.metadata

A data.frame of discrete metadata fields for the cells in the plotref.

colormap

A list of named and ordered vectors specifying the colors and levels for the metadata. See CreateColorMap for help generating your own.

reference.version

Version of the Azimuth reference

Value

Returns an AzimuthData object


Create A Color Map

Description

Create mapping between IDs and colors to use with reference plotting in Azimuth

Usage

CreateColorMap(object, ids = NULL, colors = NULL, seed = NULL)

Arguments

object

Seurat object

ids

Vector of IDs to link to colors

colors

Vector of colors to use

seed

Set to randomly shuffle color assignments

Value

A named vector of colors
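For orientation, the shape of such a mapping can be built by hand in base R. This sketch does not call CreateColorMap; the IDs, the palette, and the metadata field name celltype are placeholders, and the shuffle only mimics the spirit of the seed argument.

```r
# Hypothetical cell type labels for one metadata field
ids <- c("B", "CD4 T", "CD8 T", "NK")

# One color per ID from a base R palette
colors <- grDevices::hcl.colors(length(ids), palette = "Dark 3")

# Optionally shuffle the assignment reproducibly
set.seed(42)
colors <- sample(colors)

# Named vector: names are IDs, values are colors
cmap <- stats::setNames(colors, ids)

# A list with one named vector per metadata field, the shape that the
# colormap argument of the reference-building functions describes
colormap <- list(celltype = cmap)
```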


Get Azimuth color mapping

Description

Pull ID-color mapping for Azimuth plotting

Usage

GetColorMap(object, ...)

## S3 method for class 'AzimuthData'
GetColorMap(object, ...)

## S3 method for class 'Seurat'
GetColorMap(object, slot = "AzimuthReference", ...)

Arguments

object

An object

...

Arguments passed to other methods

slot

Name of tool

Value

A named vector specifying the colors for all reference IDs


Get Azimuth plotref

Description

Pull DimReduc used in Azimuth plotting/projection

Usage

GetPlotRef(object, ...)

## S3 method for class 'AzimuthData'
GetPlotRef(object, ...)

## S3 method for class 'Seurat'
GetPlotRef(object, slot = "AzimuthReference", ...)

Arguments

object

An object

...

Arguments passed to other methods

slot

Name of tool

Value

A DimReduc object


Get transcripts modified from Signac::GeneActivity

Description

Get transcripts modified from Signac::GeneActivity

Usage

GetTranscripts(
  object,
  assay = NULL,
  features = NULL,
  extend.upstream = 2000,
  extend.downstream = 0,
  biotypes = "protein_coding",
  max.width = 5e+05,
  process_n = 2000,
  gene.id = FALSE,
  verbose = TRUE
)

Arguments

object

A Seurat object

assay

Name of assay to use. If NULL, use the default assay

features

Genes to include. If NULL, use all protein-coding genes in the annotations stored in the object

extend.upstream

Number of bases to extend upstream of the TSS

extend.downstream

Number of bases to extend downstream of the TTS

biotypes

Gene biotypes to include. If NULL, use all biotypes in the gene annotation.

max.width

Maximum allowed gene width for a gene to be quantified. Setting this parameter avoids quantifying extremely long transcripts, which can take a relatively long time to process. If NULL, do not filter genes based on width.

process_n

Number of regions to load into memory at a time, per thread. Processing more regions at once can be faster but uses more memory.

gene.id

Record gene IDs in output matrix rather than gene name.

verbose

Display messages

Value

Transcripts


Load the extended reference RDS file for bridge integration

Description

Read in a precomputed extended reference. This function can read either from URLs or a file path. The function looks for a file called ext.Rds for the extended reference Seurat object

Usage

LoadBridgeReference(path, seconds = 10L)

Arguments

path

Path or URL to the RDS file

seconds

Timeout to check for URLs in seconds

Value

A list with two entries:

map

The extended reference Seurat object

plot

The reference Seurat object (for plotting)

Examples

## Not run: 
# Load from a URL
ref <- LoadBridgeReference("https://seurat.nygenome.org/references/pbmc")
# Load a file from the path to a directory 
ref2 <- LoadBridgeReference("path/")
# Load a file directly
ref3 <- LoadBridgeReference("ext.Rds")

## End(Not run)

Load file input into a Seurat object

Description

Take a file and load it into a Seurat object. Supports a variety of file types and always returns a Seurat object

Usage

LoadFileInput(path, bridge = FALSE)

Arguments

path

Path to input data

Details

LoadFileInput supports several file types to be read in as Seurat objects. File type is determined by extension, matched in a case-insensitive manner. See the sections below for details about supported file types, required extensions, and specifics of how data is loaded
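Case-insensitive matching on the extension can be sketched in base R as follows. The helper detect_input_type is hypothetical and only mirrors the extension rules listed in this entry; it does not read any data.

```r
# Hypothetical helper mirroring the extension rules in this entry
detect_input_type <- function(path) {
  # Lowercase the extension so that ".H5AD" and ".h5ad" match alike
  ext <- tolower(tools::file_ext(path))
  switch(
    ext,
    h5 = "10X H5",
    rds = "Rds",
    h5seurat = "h5Seurat",
    h5ad = "AnnData H5AD",
    stop("Unsupported file extension: ", ext)
  )
}

detect_input_type("pbmc.H5AD")   # "AnnData H5AD"
detect_input_type("counts.Rds")  # "Rds"
```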

Value

A Seurat object

10X H5 File (extension h5)

10X HDF5 files are supported for all versions of Cell Ranger; data is read in using Read10X_h5. Note: for multi-modal 10X HDF5 files, only the first matrix is read in

Rds File (extension rds)

Rds files are supported as long as they contain one of the following data types:

  • A Seurat object

  • An S4 Matrix object

  • An S3 matrix object

  • A data.frame object

For S4 Matrix, S3 matrix, and data.frame objects, a Seurat object will be made with CreateSeuratObject using the default arguments

h5Seurat File (extension h5seurat)

h5Seurat files and all of their features are fully supported. They are read in via LoadH5Seurat. Note: only the “counts” matrices are read in and only the default assay is kept

AnnData H5AD File (extension h5ad)

Only H5AD files from AnnData v0.7 or higher are supported. Data is read from the H5AD file in the following manner:

  • The counts matrix is read from “/raw/X”; if “/raw/X” is not present, the matrix is read from “/X”

  • Feature names are read from feature-level metadata. Feature-level metadata must be an HDF5 group; HDF5 compound datasets are not supported. If counts are read from “/raw/X”, feature names are looked for in “/raw/var”; if counts are read from “/X”, feature names are looked for in “/var”. In both cases, feature names are read from the dataset specified by the “_index” attribute, “_index” dataset, or “index” dataset, in that order

  • Cell names are read from cell-level metadata. Cell-level metadata must be an HDF5 group; HDF5 compound datasets are not supported. Cell-level metadata is read from “/obs”. Cell names are read from the dataset specified by the “_index” attribute, “_index” dataset, or “index” dataset, in that order

  • Cell-level metadata is read from the “/obs” dataset. Columns will be returned in the same order as in the “column-order”, if present, or in alphabetical order. If a dataset named “__categories” is present, then all datasets in “__categories” will serve as factor levels for datasets present in “/obs” with the same name (e.g., a dataset named “/obs/__categories/leiden” will serve as the levels for “/obs/leiden”). Row names will be set as cell names as described above. All datasets in “/obs” will be loaded except for “__categories” and the cell names dataset
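The lookup order described in the list above can be sketched without opening a file. The helpers below are hypothetical and work on a plain character vector of paths; codes_to_factor additionally assumes that categorical columns are stored as 0-based integer codes, with negative codes marking missing values, which is how AnnData writes them but is not stated in this entry.

```r
# Hypothetical helpers; `paths` stands for the paths present in a file
pick_counts_path <- function(paths) {
  if ("/raw/X" %in% paths) {
    "/raw/X"
  } else if ("/X" %in% paths) {
    "/X"
  } else {
    stop("No counts matrix found at /raw/X or /X")
  }
}

# Feature metadata lives beside whichever matrix was chosen
pick_var_path <- function(counts.path) {
  switch(counts.path, "/raw/X" = "/raw/var", "/X" = "/var")
}

# Rebuild a factor from integer codes and a "__categories" levels vector.
# Assumption: codes are 0-based and negative codes mean missing.
codes_to_factor <- function(codes, categories) {
  idx <- codes + 1L
  idx[idx < 1L] <- NA
  factor(categories[idx], levels = categories)
}

counts.path <- pick_counts_path(c("/X", "/obs", "/var"))
pick_var_path(counts.path)  # "/var"
leiden <- codes_to_factor(c(0L, 2L, 1L), c("a", "b", "c"))
```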


Load obs from an H5AD file

Description

Read in only the metadata of an H5AD file and return a data.frame object

Usage

LoadH5ADobs(path, cell.groups = NULL)

Load the reference RDS files

Description

Read in a reference Seurat object and annoy index. This function can read either from URLs or a file path. In order to read properly, there must be the following files:

  • “ref.Rds” for the downsampled reference Seurat object (for mapping)

  • “idx.annoy” for the nearest-neighbor index object

Usage

LoadReference(path, seconds = 10L)

Arguments

path

Path or URL to the two RDS files

seconds

Timeout to check for URLs in seconds

Value

A list with two entries:

map

The downsampled reference Seurat object (for mapping)

plot

The reference Seurat object (for plotting)

Examples

## Not run: 
# Load from a URL
ref <- LoadReference("https://seurat.nygenome.org/references/pbmc")
# Load from a directory
ref2 <- LoadReference("/var/www/html")

## End(Not run)

Get Azimuth reference version number

Description

Pull the reference version information

Usage

ReferenceVersion(object, ...)

## S3 method for class 'AzimuthData'
ReferenceVersion(object, ...)

## S3 method for class 'Seurat'
ReferenceVersion(object, slot = "AzimuthReference", ...)

Arguments

object

Seurat or AzimuthData object

...

Not used

slot

Name of the version to pull. Can be "seurat.version", "azimuth.version", or "reference.version".

Value

A character string specifying the reference version


Run Azimuth annotation

Description

Run Azimuth annotation

Usage

## S3 method for class 'Seurat'
RunAzimuth(
  query,
  reference,
  query.modality = "RNA",
  annotation.levels = NULL,
  umap.name = "ref.umap",
  do.adt = FALSE,
  verbose = TRUE,
  assay = NULL,
  k.weight = 50,
  n.trees = 20,
  mapping.score.k = 100,
  ...
)

## S3 method for class 'character'
RunAzimuth(query, ...)

RunAzimuth(query, ...)

Arguments

query

Seurat object, or a path to one of the following file types:

  • A .h5 matrix

  • A .rds file containing a Seurat object

  • A .h5ad anndata object

  • A .h5seurat object

reference

Name of reference to map to or a path to a directory containing ref.Rds and idx.annoy

annotation.levels

List of annotation levels to map. If not specified, all will be mapped.

umap.name

Name of the UMAP reduction in the returned object

do.adt

Whether to transfer the ADT assay

assay

Name of the query assay

Value

Returns a Seurat object containing reference reductions and cell type annotations
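A minimal calling pattern might look like the sketch below. The wrapper annotate_query is hypothetical, and the reference folder and file names are placeholders; the query may be a Seurat object or a path to one of the supported file types, and the matching S3 method is chosen by RunAzimuth.

```r
# Sketch only: the default folder name is a placeholder
annotate_query <- function(query, reference = "azimuth_reference") {
  # `query` may be a Seurat object or a character path; RunAzimuth
  # dispatches to the matching S3 method
  Azimuth::RunAzimuth(
    query = query,
    reference = reference,
    umap.name = "ref.umap",
    verbose = FALSE
  )
}

# Possible usage, given Azimuth and a saved reference folder:
# annotated <- annotate_query("query.h5ad")
```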


Run Azimuth annotation for ATAC query

Description

Run Azimuth annotation for ATAC query

Usage

## S3 method for class 'Seurat'
RunAzimuthATAC(
  query,
  reference,
  fragment.path = NULL,
  annotation.levels = NULL,
  umap.name = "ref.umap",
  verbose = TRUE,
  assay = NULL,
  k.weight = 50,
  n.trees = 20,
  mapping.score.k = 100,
  dims.atac = 2:50,
  dims.rna = 1:50
)

## S3 method for class 'character'
RunAzimuthATAC(query, ...)

RunAzimuthATAC(query, ...)

Arguments

query

Seurat object, or a path to one of the following file types:

  • A .h5 matrix

  • A .rds file containing a Seurat object

  • A .h5ad anndata object

  • A .h5seurat object

reference

Name of reference to map to or a path to a directory containing ext.Rds

annotation.levels

List of annotation levels to map. If not specified, all will be mapped.

umap.name

Name of the UMAP reduction in the returned object

assay

Name of the query assay

dims.atac

Dimensions

dims.rna

Dimensions

do.adt

Whether to transfer the ADT assay

Value

Returns a Seurat object containing reference reductions and cell type annotations


Save Azimuth references and neighbors index to same folder

Description

Save Azimuth references and neighbors index to same folder

Usage

SaveAzimuthReference(object = NULL, folder = NULL)

Arguments

object

An Azimuth reference

folder

Path to save Azimuth reference to; defaults to file.path(getwd(), "azimuth_reference/")

...

Arguments passed on to base::saveRDS

ascii

a logical. If TRUE or NA, an ASCII representation is written; otherwise (default), a binary one is used. See the comments in the help for save.

version

the workspace format version to use. NULL specifies the current default version (3). The only other supported value is 2, the default from R 1.4.0 to R 3.5.0.

compress

a logical specifying whether saving to a named file is to use "gzip" compression, or one of "gzip", "bzip2" or "xz" to indicate the type of compression to be used. Ignored if file is a connection.

refhook

a hook function for handling reference objects.

Value

Invisibly returns file

See Also

saveRDS() readRDS()

Examples

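## Not run: 
# Make an Azimuth reference from a processed Seurat object
obj.azimuth <- AzimuthReference(object)

# Save the reference and its neighbor index to one folder
SaveAzimuthReference(object = obj.azimuth, folder = "azimuth_reference")

# Run Azimuth against the saved reference
query <- RunAzimuth(query = query, reference = "azimuth_reference")

## End(Not run)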

Set Azimuth color mapping

Description

Set ID-color mapping for Azimuth plotting

Usage

SetColorMap(object, ...)

## S3 method for class 'AzimuthData'
SetColorMap(object, value, ...)

## S3 method for class 'Seurat'
SetColorMap(object, slot = "AzimuthReference", value, ...)

Arguments

object

An object

...

Arguments passed to other methods

value

New colormap to assign

slot

Name of tool

Value

An object with the colormap slot set


Validate References for Azimuth

Description

Validate aspects of a Seurat object to be used as an Azimuth reference

Usage

ValidateAzimuthReference(object, ad.name = "AzimuthReference")

Arguments

object

Seurat object

ad.name

Name in the tools slot containing the AzimuthData object.

Value

No return value