RiemannianStats

Riemannian STATS: Statistical Analysis on Riemannian Manifolds

Riemannian STATS is a Python package that extends classical multivariate statistical methods to data lying in non-Euclidean spaces. It provides a general framework for Riemannian Principal Component Analysis (R-PCA), a method designed for datasets modeled as Riemannian manifolds. The foundational ideas are presented in the paper “Riemannian Principal Component Analysis” by Oldemar Rodríguez.

Unlike traditional PCA, which assumes a flat Euclidean geometry, R-PCA uses UMAP to define local distances and induce a Riemannian structure on any data table, whether structured or unstructured, real or synthetic. This enables geometry-aware dimensionality reduction and correlation analysis, even on datasets with complex topologies, non-linear relationships, or varying local densities.
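To make the idea of UMAP-defined local distances more concrete, the following is a minimal NumPy sketch of UMAP-style fuzzy neighbor similarities. It is a simplified illustration, not the umap-learn implementation and not the Riemannian STATS API: the function name `umap_style_similarities` and its parameters are hypothetical, and the exact construction used by the package may differ. Each point gets its own local scale, so the resulting similarities adapt to varying local density.

```python
import numpy as np

def umap_style_similarities(X, n_neighbors=5, n_iter=64):
    """Symmetric fuzzy kNN similarities in [0, 1], loosely following UMAP.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances in the ambient space.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    W = np.zeros((n, n))
    target = np.log2(n_neighbors)
    for i in range(n):
        # Indices of the k nearest neighbours of point i, excluding i itself.
        order = np.argsort(D[i])
        nbrs = order[order != i][:n_neighbors]
        d = D[i, nbrs]
        rho = d.min()  # distance to the closest neighbour
        # Binary search for a local scale sigma such that the
        # memberships of the k neighbours sum to log2(k).
        lo, hi = 1e-12, 1e6
        for _ in range(n_iter):
            sigma = 0.5 * (lo + hi)
            total = np.exp(-(d - rho) / sigma).sum()
            if total > target:
                hi = sigma
            else:
                lo = sigma
        W[i, nbrs] = np.exp(-(d - rho) / sigma)
    # Fuzzy union turns the directed weights into a symmetric matrix.
    return W + W.T - W * W.T

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
W = umap_style_similarities(X, n_neighbors=5)
local_dissimilarity = 1.0 - W  # small for close neighbours, 1 for unrelated points
```

In this sketch, `1 - W` plays the role of a locally scaled dissimilarity: two points are "close" relative to the density around them rather than in absolute Euclidean terms.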

Built on these principles, Riemannian STATS enables:

  • Transformation of data tables into Riemannian manifolds via UMAP-based metrics.

  • Riemannian correlation and covariance computation.

  • Extraction of Riemannian principal components.

  • Intuitive 2D/3D visualizations reflecting the manifold’s geometry.

  • Applications to high-dimensional data, image analysis, and clustering.
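The covariance, correlation, and component-extraction steps listed above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the package's API or the paper's exact definitions: `gaussian_dissimilarity` is a dense Gaussian-kernel stand-in for a UMAP-derived dissimilarity, and `weighted_components`, the choice of reference point, and the scaling of the difference vectors are all hypothetical simplifications.

```python
import numpy as np

def gaussian_dissimilarity(X):
    """Stand-in for a UMAP-derived dissimilarity: 1 - Gaussian kernel."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    scale = np.median(D[D > 0])
    return 1.0 - np.exp(-(D ** 2) / (2.0 * scale ** 2))

def weighted_components(X, rho, n_components=2):
    """Dissimilarity-weighted covariance, correlation, and component scores.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Reference point: the observation with the smallest total dissimilarity.
    c = int(np.argmin(rho.sum(axis=1)))
    # Differences from the reference point, scaled by dissimilarity to it.
    diffs = rho[:, c][:, None] * (X - X[c])
    cov = diffs.T @ diffs / n
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    # Eigen-decomposition of the symmetric correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Project the standardized, weighted differences onto the leading axes.
    scores = (diffs / std) @ eigvecs[:, order]
    return corr, eigvals[order], scores

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
rho = gaussian_dissimilarity(X)
corr, eigvals, scores = weighted_components(X, rho)
print(corr.shape, scores.shape)  # (4, 4) (60, 2)
```

The first two columns of `scores` could then be passed to any plotting library to obtain a 2D view of the data.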

The core idea is simple: treat your dataset not as flat but as curved, respecting its internal structure. This can yield more expressive models, more faithful visualizations, and more accurate statistical summaries.

The package is intended for researchers, data scientists, and developers who want geometry-aware tools for analyzing complex datasets.

User Guide