ggseqlogo - A 'ggplot2' Extension for Drawing Publication-Ready Sequence Logos
The extensive range of functions provided by this package makes it possible to draw highly versatile sequence logos. Features include, but are not limited to, modifying the colour schemes and fonts used to draw the logo, generating multiple logo plots, and aiding the visualisation with annotations. Sequence logos can easily be combined with other 'ggplot2' plots.
11.76 score 227 stars 10 dependents 1.1k scripts 7.8k downloads
margins - Marginal Effects for Model Objects
An R port of Stata's 'margins' command, which can be used to calculate marginal (or partial) effects from model objects.
glm, linear-models, marginal-effects, partial-effects, regression, stata-command
11.71 score 265 stars 7 dependents 1.5k scripts 20k downloads
RcppRoll - Efficient Rolling / Windowed Operations
Provides fast and efficient routines for common rolling / windowed operations. Routines for the efficient computation of windowed mean, median, sum, product, minimum, maximum, standard deviation and variance are provided.
cpp
11.08 score 82 stars 71 dependents 2.0k scripts 34k downloads
monocle3 - Clustering, Differential Expression, and Trajectory Analysis for Single-Cell RNA-Seq
Monocle 3 performs clustering, differential expression and trajectory analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle 3 also performs differential expression analysis, clustering, visualization, and other useful tasks on single-cell expression data. It is designed to work with RNA-Seq data, but could be used with other types as well.
software, singlecell, rnaseq, atacseq, normalization, preprocessing, dimensionreduction, visualization, qualitycontrol, clustering, classification, annotation, geneexpression, differentialexpression, single-cell-rna-seq, cpp
9.79 score 453 stars 2 dependents 3.2k scripts
dotwhisker - Dot-and-Whisker Plots of Regression Results
Create quick and easy dot-and-whisker plots of regression results. It takes as input either (1) a coefficient table in standard form or (2) one (or a list of) fitted model objects (of any type that has methods implemented in the 'parameters' package). It returns 'ggplot' objects that can be further customized using tools from the 'ggplot2' package. The package also includes helper functions for tasks such as rescaling coefficients or relabeling predictor variables. See more methodological discussion of the visualization and data management methods used in this package in Kastellec and Leoni (2007) <doi:10.1017/S1537592707072209> and Gelman (2008) <doi:10.1002/sim.3107>.
graphics, plot, regression-models
9.49 score 59 stars 1 dependents 844 scripts 6.9k downloads
rsparse - Statistical Learning on Sparse Matrices
Implements many algorithms for statistical learning on sparse matrices - matrix factorizations, matrix completion, elastic net regressions, factorization machines. 'rsparse' also enhances the 'Matrix' package by providing methods for multithreaded <sparse, dense> matrix products and native slicing of sparse matrices in Compressed Sparse Row (CSR) format. Algorithms for regression problems: 1) Elastic Net regression via Follow The Proximally-Regularized Leader (FTRL) Stochastic Gradient Descent (SGD), as per McMahan et al. (<doi:10.1145/2487575.2488200>); 2) Factorization Machines via SGD, as per Rendle (2010, <doi:10.1109/ICDM.2010.127>). Algorithms for matrix factorization and matrix completion: 1) Weighted Regularized Matrix Factorization (WRMF) via Alternating Least Squares (ALS), paper by Hu, Koren, Volinsky (2008, <doi:10.1109/ICDM.2008.22>); 2) Maximum-Margin Matrix Factorization via ALS, paper by Rennie, Srebro (2005, <doi:10.1145/1102351.1102441>); 3) Fast Truncated Singular Value Decomposition (SVD), Soft-Thresholded SVD, Soft-Impute matrix completion via ALS, paper by Hastie, Mazumder et al. (2014, <doi:10.48550/arXiv.1410.2596>); 4) Linear-Flow matrix factorization, from 'Practical linear models for large-scale one-class collaborative filtering' by Sedhain, Bui, Kawale et al. (2016, ISBN:978-1-57735-770-4); 5) GlobalVectors (GloVe) matrix factorization via SGD, paper by Pennington, Socher, Manning (2014, <https://aclanthology.org/D14-1162/>). The package is reasonably fast and memory efficient: it can work with large datasets of millions of rows and millions of columns. This is particularly useful for practitioners working on recommender systems.
collaborative-filtering, factorization-machines, matrix-completion, matrix-factorization, recommender-system, sparse-matrices, svd, openblas, cpp, openmp
9.16 score 180 stars 27 dependents 57 scripts 7.0k downloads
CellChat - Inference and analysis of cell-cell communication from single-cell and spatially resolved transcriptomics data
An open-source R tool that infers, visualizes, and analyzes cell-cell communication networks from scRNA-seq and spatially resolved transcriptomics data.
cell-cell-communication, cell-cell-interaction, microenvironment, single-cell-analysis, spatial-transcriptomics, cpp
8.82 score 616 stars 1 dependents 2.4k scripts
SeuratWrappers - Community-Provided Methods and Extensions for the Seurat Object
SeuratWrappers is a collection of community-provided methods and extensions for Seurat, curated by the Satija Lab at NYGC. These methods comprise functionality not presently found in Seurat, and are able to be updated much more frequently.
community, single-cell-analysis, single-cell-genomics
8.59 score 364 stars 4.2k scripts
BPCells - Single Cell Counts Matrices to PCA
Efficient operations for single cell ATAC-seq fragments and RNA counts matrices. Interoperable with standard file formats, and introduces efficient bit-packed formats that allow large storage savings and increased read speeds.
zlib, hdf5, cpp
8.14 score 286 stars 5 dependents 720 scripts
SeuratDisk - Interfaces for HDF5-Based Single Cell File Formats
The h5Seurat file format is specifically designed for the storage and analysis of multi-modal single-cell and spatially-resolved expression experiments, for example, from CITE-seq or 10X Visium technologies. It holds all molecular information and associated metadata, including (for example) nearest-neighbor graphs, dimensional reduction information, spatial coordinates and image data, and cluster labels. We also support rapid and on-disk conversion between h5Seurat and AnnData objects, with the goal of enhancing interoperability between Seurat and Scanpy.
hdf5-format, single-cell-genomics, single-cell-rna-seq
7.78 score 175 stars 1 dependents 5.7k scripts
SeuratData - Install and Manage Seurat Datasets
Single cell RNA sequencing datasets can be large, consisting of matrices that contain expression data for several thousand features across several thousand cells. This package is designed to make it easy to install, manage, and learn about various single-cell datasets, provided as Seurat objects and distributed as independent packages.
datasets, single-cell, single-cell-genomics
7.49 score 154 stars 4.5k scripts
penalized - L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model
Fitting possibly high dimensional penalized regression models. The penalty structure can be any combination of an L1 penalty (lasso and fused lasso), an L2 penalty (ridge) and a positivity constraint on the regression coefficients. The supported regression models are linear, logistic and Poisson regression and the Cox Proportional Hazards model. Cross-validation routines allow optimization of the tuning parameters.
openblas, cpp
7.38 score 4 stars 17 dependents 367 scripts 6.4k downloads
speedglm - Fitting Linear and Generalized Linear Models to Large Data Sets
Fitting linear models and generalized linear models to large data sets by updating algorithms.
7.08 score 7 stars 13 dependents 585 scripts 15k downloads
RcisTarget - RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions
RcisTarget identifies transcription factor binding motifs (TFBS) over-represented on a gene list. In a first step, RcisTarget selects DNA motifs that are significantly over-represented in the surroundings of the transcription start site (TSS) of the genes in the gene-set. This is achieved by using a database that contains genome-wide cross-species rankings for each motif. The motifs are then annotated to TFs, and those that have a high Normalized Enrichment Score (NES) are retained. Finally, for each motif and gene-set, RcisTarget predicts the candidate target genes (i.e. genes in the gene-set that are ranked above the leading edge).
generegulation, motifannotation, transcriptomics, transcription, genesetenrichment, genetarget
6.89 score 49 stars 262 scripts
Azimuth - A Shiny App Demonstrating a Query-Reference Mapping Algorithm for Single-Cell Data
Azimuth uses an annotated reference dataset to automate the processing, analysis, and interpretation of a new single-cell RNA-seq or ATAC-seq experiment. Azimuth leverages a 'reference-based mapping' pipeline that inputs a counts matrix and performs normalization, visualization, cell annotation, and differential expression (biomarker discovery).
shiny-app, single-cell-genomics, single-cell-rna-seq, cpp
6.79 score 146 stars 762 scripts
loupeR - Converts Seurat objects to 10x Genomics Loupe files
Converts Seurat objects to 10x Genomics Loupe files. This is a second line to make the package checker not complain.
10xgenomics, single-cell-genomics
6.66 score 154 stars 1 dependents 124 scripts
DDRTree - Learning Principal Graphs with DDRTree
Project data into a reduced dimensional space and construct a principal graph from the reduced dimension.
cpp
6.19 score 6 stars 3 dependents 54 scripts 5.3k downloads
ArchR - Analyzing single-cell regulatory chromatin in R.
This package is designed to streamline scATAC analyses in R.
cpp
5.05 score 451 stars
RforMassSpectrometry - R for MassSpectrometry meta-package
The RforMassSpectrometry meta-package loads and manages the core packages of the R for Mass Spectrometry initiative, which provide efficient, thoroughly documented, tested and flexible R software for the analysis and interpretation of high throughput mass spectrometry assays.
4.70 score 20 stars 1 scripts
blaseRtools - R Tools for Blaser Lab Data Analysis
This is a repository of R tools for Single Cell RNA seq and other lab data analysis functions.
3.18 score 1 stars 25 scripts
tidyproteomics - An S3 data object and framework for common quantitative proteomic analyses
Creates a simple, universal S3 data structure for the post analysis of mass spectrometry based quantitative proteomic data. In addition, this package collects, adapts and organizes several useful algorithms and methods used in typical post analysis workflows.
5.01 score 47 stars 110 scripts
mhurdle - Multiple Hurdle Tobit Models
Estimation of models with a dependent variable left-censored at zero. Null values may be caused by a selection process (Cragg (1971) <doi:10.2307/1909582>), insufficient resources (Tobin (1958) <doi:10.2307/1907382>), or infrequency of purchase (Deaton and Irish (1984) <doi:10.1016/0047-2727(84)90067-7>).
4.65 score 16 scripts 5.5k downloads
GOsummaries - Word cloud summaries of GO enrichment analysis
A package to visualise Gene Ontology (GO) enrichment analysis results on gene lists arising from different analyses such as clustering or PCA. The significant GO categories are visualised as word clouds that can be combined with different plots summarising the underlying data.
geneexpression, clustering, go, visualization, cpp
3.89 score 11 stars 47 scripts 3 downloads
datascience.curriculum - Blaser Lab Data Science Online Course
What the package does (one paragraph).
3.68 score 1 stars 12 scripts
muRtools - Mueller's R tools
Fabian's custom plotting functions and utilities
1.90 score 1 stars 16 scripts
DAseq - Detecting regions of differential abundance between scRNA-seq datasets
DA-seq is a multiscale approach for detecting differential abundance (DA) subpopulations. In contrast to clustering-based approaches, our method can detect DA subpopulations that do not form well-separated clusters. For each cell, we compute a multiscale differential abundance score. These scores are based on the k nearest neighbors in the transcriptome space for various values of k.
4.49 score 40 stars 1 dependents 51 scripts
blaseRdata - Supporting Data for the blaseRtools Package
What the package does (one paragraph).
2.22 score 1 dependents 11 scripts
TxDb.Drerio.UCSC.danRer11.ensGene - Annotation package for TxDb object(s)
Exposes annotation databases generated from UCSC as TxDb objects.
annotationdata, genetics, txdb, danio_rerio
2.18 score 1 dependents
rdoc - Colourised R Documentation
Extends tools::Rd2txt() by adding customisable text and colour formatting to R documentation contents. If used from a terminal, output will be displayed via file.show(); otherwise, contents will be printed in sections. Also provides stand-in replacements for ?() and help().
cli, colorization, crayon, documentation, pretty-print
4.11 score 48 stars 54 scripts 13 downloads
muLogR - muLogR
Logging and code structuring for R. Based on code from the RnBeads package.
1.45 score 28 scripts
muReportR - muReportR
Generate HTML reports from R. Based on code from the RnBeads package.
1.00 score 1 scripts