Seaborn Function Reference

This document provides a comprehensive reference for all major seaborn functions, organized by category.

Relational Plots

scatterplot()

Purpose: Create a scatter plot with points representing individual observations.

Key Parameters:

- data - DataFrame, array, or dict of arrays
- x, y - Variables for x and y axes
- hue - Grouping variable for color encoding
- size - Grouping variable for size encoding
- style - Grouping variable for marker style
- palette - Color palette name or list
- hue_order - Order for categorical hue levels
- hue_norm - Normalization for numeric hue (tuple or Normalize object)
- sizes - Size range for size encoding (tuple or dict)
- size_order - Order for categorical size levels
- size_norm - Normalization for numeric size
- markers - Marker style(s) (string, list, or dict)
- style_order - Order for categorical style levels
- legend - How to draw legend: "auto", "brief", "full", or False
- ax - Matplotlib axes to plot on

Example:

sns.scatterplot(data=df, x='height', y='weight',
                hue='gender', size='age', style='smoker',
                palette='Set2', sizes=(20, 200))

lineplot()

Purpose: Draw a line plot with automatic aggregation and confidence intervals for repeated measures.

Key Parameters:

- data - DataFrame, array, or dict of arrays
- x, y - Variables for x and y axes
- hue - Grouping variable for color encoding
- size - Grouping variable for line width
- style - Grouping variable for line style (dashes)
- units - Grouping variable for sampling units (no aggregation within units)
- estimator - Function for aggregating across observations (default: mean)
- errorbar - Method for error bars: "sd", "se", "pi", ("ci", level), ("pi", level), or None
- n_boot - Number of bootstrap iterations for CI computation
- seed - Random seed for reproducible bootstrapping
- sort - Sort data before plotting
- err_style - "band" or "bars" for error representation
- err_kws - Additional parameters for error representation
- markers - Marker style(s) for emphasizing data points
- dashes - Dash style(s) for lines
- legend - How to draw legend
- ax - Matplotlib axes to plot on

Example:

sns.lineplot(data=timeseries, x='time', y='signal',
             hue='condition', style='subject',
             errorbar=('ci', 95), markers=True)
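
To make the aggregation step concrete, the sketch below shows roughly what estimator plus errorbar=('ci', 95) compute for the repeated observations at one x position: a central estimate and a percentile bootstrap interval. It uses NumPy only; bootstrap_ci and the sample values are hypothetical names and data chosen for illustration, not seaborn's internal code.

```python
import numpy as np

def bootstrap_ci(values, estimator=np.mean, level=95, n_boot=1000, seed=0):
    """Percentile bootstrap interval for `estimator` (illustrative helper)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement and re-apply the estimator to each resample.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_stats = estimator(values[idx], axis=1)
    tail = (100 - level) / 2
    low, high = np.percentile(boot_stats, [tail, 100 - tail])
    return low, high

# Hypothetical repeated measurements taken at a single x position.
signal = np.array([4.1, 3.8, 4.4, 4.0, 4.2])
center = signal.mean()            # what estimator="mean" would plot
low, high = bootstrap_ci(signal)  # what the error band would span
print(f"mean={center:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

Fixing seed makes the interval reproducible, and raising n_boot trades run time for a more stable interval.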

relplot()

Purpose: Figure-level interface for drawing relational plots (scatter or line) onto a FacetGrid.

Key Parameters: All parameters from scatterplot() and lineplot(), plus:

- kind - "scatter" or "line"
- col - Categorical variable for column facets
- row - Categorical variable for row facets
- col_wrap - Wrap columns after this many columns
- col_order - Order for column facet levels
- row_order - Order for row facet levels
- height - Height of each facet in inches
- aspect - Aspect ratio (width = height * aspect)
- facet_kws - Additional parameters for FacetGrid

Example:

sns.relplot(data=df, x='time', y='measurement',
            hue='treatment', style='batch',
            col='cell_line', row='timepoint',
            kind='line', height=3, aspect=1.5)

Distribution Plots

histplot()

Purpose: Plot univariate or bivariate histograms with flexible binning.

Key Parameters:

- data - DataFrame, array, or dict
- x, y - Variables (y optional for bivariate)
- hue - Grouping variable
- weights - Variable for weighting observations
- stat - Aggregate statistic: "count", "frequency", "probability", "percent", "density"
- bins - Number of bins, bin edges, or method ("auto", "fd", "doane", "scott", "stone", "rice", "sturges", "sqrt")
- binwidth - Width of bins (overrides bins)
- binrange - Range for binning (tuple)
- discrete - Treat x as discrete (centers bars on values)
- cumulative - Compute cumulative distribution
- common_bins - Use same bins for all hue levels
- common_norm - Normalize across hue levels
- multiple - How to handle hue: "layer", "dodge", "stack", "fill"
- element - Visual element: "bars", "step", "poly"
- fill - Fill bars/elements
- shrink - Scale bar width (for multiple="dodge")
- kde - Overlay KDE estimate
- kde_kws - Parameters for KDE
- line_kws - Parameters for step/poly elements
- thresh - Minimum count threshold for bins
- pthresh - Minimum probability threshold
- pmax - Maximum probability for color scaling
- log_scale - Log scale for axis (bool or base)
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.histplot(data=df, x='measurement', hue='condition',
             stat='density', bins=30, kde=True,
             multiple='layer', alpha=0.5)
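
The stat options differ only in how raw bin counts are normalized. The NumPy sketch below reproduces those normalizations by hand on synthetic data; it is meant to illustrate the definitions, not to mirror seaborn's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(loc=10, scale=2, size=500)  # synthetic measurements

counts, edges = np.histogram(values, bins=30)   # stat="count"
widths = np.diff(edges)

probability = counts / counts.sum()             # bar heights sum to 1
percent = 100 * probability                     # bar heights sum to 100
frequency = counts / widths                     # observations per unit of x
density = counts / (counts.sum() * widths)      # bar areas sum to 1

print(probability.sum(), percent.sum(), (density * widths).sum())
```

Density is the natural choice when overlaying a KDE, since both then integrate to 1.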

kdeplot()

Purpose: Plot univariate or bivariate kernel density estimates.

Key Parameters:

- data - DataFrame, array, or dict
- x, y - Variables (y optional for bivariate)
- hue - Grouping variable
- weights - Variable for weighting observations
- palette - Color palette
- hue_order - Order for hue levels
- hue_norm - Normalization for numeric hue
- multiple - How to handle hue: "layer", "stack", "fill"
- common_norm - Normalize across hue levels
- common_grid - Use same grid for all hue levels
- cumulative - Compute cumulative distribution
- bw_method - Method for bandwidth: "scott", "silverman", or scalar
- bw_adjust - Bandwidth multiplier (higher = smoother)
- log_scale - Log scale for axis
- levels - Number or values for contour levels (bivariate)
- thresh - Minimum density threshold for contours
- gridsize - Grid resolution
- cut - Extension beyond data extremes (in bandwidth units)
- clip - Data range for curve (tuple)
- fill - Fill area under curve/contours
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

# Univariate
sns.kdeplot(data=df, x='measurement', hue='condition',
            fill=True, common_norm=False, bw_adjust=1.5)

# Bivariate
sns.kdeplot(data=df, x='var1', y='var2',
            fill=True, levels=10, thresh=0.05)
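
To show what bw_adjust does, here is a hand-rolled one-dimensional Gaussian KDE in NumPy. It assumes Scott's rule of thumb for the base bandwidth and then applies the multiplier; gaussian_kde_1d is a hypothetical helper written for this sketch, and seaborn's own estimator may differ in detail.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bw_adjust=1.0):
    """Evaluate a 1-D Gaussian KDE on `grid` (illustrative helper)."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    # Scott's rule of thumb for one dimension, scaled by the multiplier.
    bandwidth = samples.std(ddof=1) * n ** (-1 / 5) * bw_adjust
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average of the kernels centered on each sample.
    return kernels.sum(axis=1) / (n * bandwidth)

rng = np.random.default_rng(0)
samples = rng.normal(size=200)      # synthetic data
grid = np.linspace(-6, 6, 601)
dx = grid[1] - grid[0]

narrow = gaussian_kde_1d(samples, grid, bw_adjust=0.5)
wide = gaussian_kde_1d(samples, grid, bw_adjust=2.0)
print(narrow.max(), wide.max(), (narrow * dx).sum(), (wide * dx).sum())
```

Both curves integrate to about 1; the larger multiplier spreads the same mass over a smoother, flatter curve.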

ecdfplot()

Purpose: Plot empirical cumulative distribution functions.

Key Parameters:

- data - DataFrame, array, or dict
- x, y - Variables (specify one)
- hue - Grouping variable
- weights - Variable for weighting observations
- stat - "proportion" or "count"
- complementary - Plot complementary CDF (1 - ECDF)
- palette - Color palette
- hue_order - Order for hue levels
- hue_norm - Normalization for numeric hue
- log_scale - Log scale for axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.ecdfplot(data=df, x='response_time', hue='treatment',
             stat='proportion', complementary=False)
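
The quantity being drawn is simple to compute directly. This NumPy sketch (ecdf is a hypothetical helper, and the sample values are made up) illustrates how stat and complementary change the curve for unweighted data.

```python
import numpy as np

def ecdf(values, stat="proportion", complementary=False):
    """Return sorted values and their empirical CDF (illustrative helper)."""
    x = np.sort(np.asarray(values, dtype=float))
    counts = np.arange(1, x.size + 1)
    # "proportion" rescales the running count to end at 1.
    y = counts / x.size if stat == "proportion" else counts.astype(float)
    if complementary:
        y = y.max() - y  # share (or count) of observations above each x
    return x, y

response_time = [0.8, 1.2, 0.5, 2.0]  # hypothetical observations
x, y = ecdf(response_time)
_, y_comp = ecdf(response_time, complementary=True)
print(x, y, y_comp)
```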

rugplot()

Purpose: Plot tick marks showing individual observations along an axis.

Key Parameters:

- data - DataFrame, array, or dict
- x, y - Variable (specify one)
- hue - Grouping variable
- height - Height of ticks (proportion of axis)
- expand_margins - Add margin space for rug
- palette - Color palette
- hue_order - Order for hue levels
- hue_norm - Normalization for numeric hue
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.rugplot(data=df, x='value', hue='category', height=0.05)

displot()

Purpose: Figure-level interface for distribution plots onto a FacetGrid.

Key Parameters: All parameters from histplot(), kdeplot(), and ecdfplot(), plus:

- kind - "hist", "kde", "ecdf"
- rug - Show each observation with marginal ticks (as in rugplot())
- rug_kws - Parameters for rug plot
- col - Categorical variable for column facets
- row - Categorical variable for row facets
- col_wrap - Wrap columns
- col_order - Order for column facets
- row_order - Order for row facets
- height - Height of each facet
- aspect - Aspect ratio
- facet_kws - Additional parameters for FacetGrid

Example:

sns.displot(data=df, x='measurement', hue='treatment',
            col='timepoint', kind='kde', fill=True,
            height=3, aspect=1.5, rug=True)

jointplot()

Purpose: Draw a bivariate plot with marginal univariate plots.

Key Parameters:

- data - DataFrame
- x, y - Variables for x and y axes
- hue - Grouping variable
- kind - "scatter", "kde", "hist", "hex", "reg", "resid"
- height - Size of the figure (square)
- ratio - Ratio of joint to marginal axes
- space - Space between joint and marginal axes
- dropna - Drop missing values
- xlim, ylim - Axis limits (tuples)
- marginal_ticks - Show ticks on marginal axes
- joint_kws - Parameters for joint plot
- marginal_kws - Parameters for marginal plots
- hue_order - Order for hue levels
- palette - Color palette

Example:

sns.jointplot(data=df, x='var1', y='var2', hue='group',
              kind='scatter', height=6, ratio=4,
              joint_kws={'alpha': 0.5})

pairplot()

Purpose: Plot pairwise relationships in a dataset.

Key Parameters:

- data - DataFrame
- hue - Grouping variable for color encoding
- hue_order - Order for hue levels
- palette - Color palette
- vars - Variables to plot (default: all numeric)
- x_vars, y_vars - Variables for x and y axes (non-square grid)
- kind - "scatter", "kde", "hist", "reg"
- diag_kind - "auto", "hist", "kde", None
- markers - Marker style(s)
- height - Height of each facet
- aspect - Aspect ratio
- corner - Plot only lower triangle
- dropna - Drop missing values
- plot_kws - Parameters for non-diagonal plots
- diag_kws - Parameters for diagonal plots
- grid_kws - Parameters for PairGrid

Example:

sns.pairplot(data=df, hue='species', palette='Set2',
             vars=['sepal_length', 'sepal_width', 'petal_length'],
             corner=True, height=2.5)

Categorical Plots

stripplot()

Purpose: Draw a categorical scatterplot with jittered points.

Key Parameters:

- data - DataFrame, array, or dict
- x, y - Variables (one categorical, one continuous)
- hue - Grouping variable
- order - Order for categorical levels
- hue_order - Order for hue levels
- jitter - Amount of jitter: True, float, or False
- dodge - Separate hue levels side-by-side
- orient - "v" or "h" (usually inferred)
- color - Single color for all elements
- palette - Color palette
- size - Marker size
- edgecolor - Marker edge color
- linewidth - Marker edge width
- native_scale - Use numeric scale for categorical axis
- formatter - Formatter for categorical axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.stripplot(data=df, x='day', y='total_bill',
              hue='sex', dodge=True, jitter=0.2)

swarmplot()

Purpose: Draw a categorical scatterplot with non-overlapping points.

Key Parameters: Same as stripplot(), except:

- No jitter parameter
- size - Marker size (important for avoiding overlap)
- warn_thresh - Threshold for warning about too many points (default: 0.05)

Note: Computationally intensive for large datasets. As a rule of thumb, prefer stripplot() once a plot holds more than about 1,000 points.

Example:

sns.swarmplot(data=df, x='day', y='total_bill',
              hue='time', dodge=True, size=5)

boxplot()

Purpose: Draw a box plot showing quartiles and outliers.

Key Parameters:

- data - DataFrame, array, or dict
- x, y - Variables (one categorical, one continuous)
- hue - Grouping variable
- order - Order for categorical levels
- hue_order - Order for hue levels
- orient - "v" or "h"
- color - Single color for boxes
- palette - Color palette
- saturation - Color saturation intensity
- width - Width of boxes
- dodge - Separate hue levels side-by-side
- fliersize - Size of outlier markers
- linewidth - Box line width
- whis - IQR multiplier for whiskers (default: 1.5)
- notch - Draw notched boxes
- showcaps - Show whisker caps
- showmeans - Show mean value
- meanprops - Properties for mean marker
- boxprops - Properties for boxes
- whiskerprops - Properties for whiskers
- capprops - Properties for caps
- flierprops - Properties for outliers
- medianprops - Properties for median line
- native_scale - Use numeric scale
- formatter - Formatter for categorical axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.boxplot(data=df, x='day', y='total_bill',
            hue='smoker', palette='Set3',
            showmeans=True, notch=True)
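
The whis parameter is easiest to understand by computing the box statistics by hand. The sketch below follows the usual Tukey convention (whiskers reach the most extreme observation within whis * IQR of the box; anything beyond is drawn as an outlier). box_stats and the data are hypothetical, and quartile interpolation may differ slightly between libraries.

```python
import numpy as np

def box_stats(values, whis=1.5):
    """Quartiles, whisker ends, and outliers for one box (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = values[(values >= low_fence) & (values <= high_fence)]
    fliers = values[(values < low_fence) | (values > high_fence)]
    return {
        "q1": q1, "median": median, "q3": q3,
        # Whiskers stop at the last observation inside the fences.
        "whisker_low": inside.min(), "whisker_high": inside.max(),
        "fliers": fliers.tolist(),
    }

stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(stats)
```

Raising whis widens the fences, so fewer points are flagged as outliers.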

violinplot()

Purpose: Draw a violin plot combining boxplot and KDE.

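Key Parameters: Shares the data, grouping, and appearance parameters of boxplot() (but not box-specific ones such as whis or notch), plus:

- bw_method - KDE bandwidth method
- bw_adjust - KDE bandwidth multiplier
- cut - KDE extension beyond extremes
- density_norm - Normalization of violin width: "area", "count", "width"
- common_norm - Normalize density across all violins rather than within each one
- inner - "box", "quartile", "point", "stick", None
- split - Split violins for hue comparison
- scale - Deprecated since seaborn 0.13; use density_norm
- scale_hue - Deprecated since seaborn 0.13; use common_norm
- gridsize - KDE grid resolution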

Example:

sns.violinplot(data=df, x='day', y='total_bill',
               hue='sex', split=True, inner='quartile',
               palette='muted')

boxenplot()

Purpose: Draw enhanced box plot for larger datasets showing more quantiles.

Key Parameters: Same as boxplot(), plus:

- k_depth - "tukey", "proportion", "trustworthy", "full", or int
- outlier_prop - Proportion of data as outliers
- trust_alpha - Alpha for trustworthy depth
- showfliers - Show outlier points

Example:

sns.boxenplot(data=df, x='day', y='total_bill',
              hue='time', palette='Set2')

barplot()

Purpose: Draw a bar plot with error bars showing statistical estimates.

Key Parameters:

- data - DataFrame, array, or dict
- x, y - Variables (one categorical, one continuous)
- hue - Grouping variable
- order - Order for categorical levels
- hue_order - Order for hue levels
- estimator - Aggregation function (default: mean)
- errorbar - Error representation: "sd", "se", "pi", ("ci", level), ("pi", level), or None
- n_boot - Bootstrap iterations
- seed - Random seed
- units - Identifier for sampling units
- weights - Observation weights
- orient - "v" or "h"
- color - Single bar color
- palette - Color palette
- saturation - Color saturation
- width - Bar width
- dodge - Separate hue levels side-by-side
- err_kws - Parameters for the error bar lines (e.g., color, linewidth)
- errcolor - Error bar color (deprecated since seaborn 0.13; use err_kws)
- errwidth - Error bar line width (deprecated since seaborn 0.13; use err_kws)
- capsize - Error bar cap width
- native_scale - Use numeric scale
- formatter - Formatter for categorical axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.barplot(data=df, x='day', y='total_bill',
            hue='sex', estimator='median',
            errorbar=('ci', 95), capsize=0.1)
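
The non-bootstrap errorbar options can be worked out by hand. The sketch below assumes the default scaling (one standard deviation or standard error on either side, and a 95% percentile interval) and uses the sample standard deviation (ddof=1); error_interval and the data are hypothetical, for illustration only.

```python
import numpy as np

def error_interval(values, method="sd", level=95):
    """Interval drawn around the mean for a given method (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    center = values.mean()
    if method == "sd":   # spread of the observations
        spread = values.std(ddof=1)
        return center - spread, center + spread
    if method == "se":   # uncertainty of the mean
        spread = values.std(ddof=1) / np.sqrt(values.size)
        return center - spread, center + spread
    if method == "pi":   # percentile interval of the observations
        tail = (100 - level) / 2
        low, high = np.percentile(values, [tail, 100 - tail])
        return low, high
    raise ValueError(f"unknown method: {method!r}")

bills = [10.0, 12.0, 14.0, 16.0, 18.0]  # hypothetical values for one category
for method in ("sd", "se", "pi"):
    print(method, error_interval(bills, method))
```

"sd" and "pi" describe the spread of the data, while "se" and "ci" describe uncertainty in the estimate, so the latter shrink as the sample grows.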

countplot()

Purpose: Show counts of observations in each categorical bin.

Key Parameters: Same as barplot(), but:

- Only specify one of x or y (the categorical variable)
- No estimator or errorbar (shows counts)
- stat - "count" or "percent" ("proportion" and "probability" are also accepted as of seaborn 0.13)

Example:

sns.countplot(data=df, x='day', hue='time',
              palette='pastel', dodge=True)

pointplot()

Purpose: Show point estimates and confidence intervals with connecting lines.

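Key Parameters: Same as barplot(), plus:

- markers - Marker style(s)
- linestyles - Line style(s)
- scale - Scale for markers (deprecated since seaborn 0.13; set markersize and linewidth instead)
- join - Connect points with lines (deprecated since seaborn 0.13; pass linestyle="none" to omit the lines)
- capsize - Error bar cap width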

Example:

sns.pointplot(data=df, x='time', y='total_bill',
              hue='sex', markers=['o', 's'],
              linestyles=['-', '--'], capsize=0.1)

catplot()

Purpose: Figure-level interface for categorical plots onto a FacetGrid.

Key Parameters: All parameters from categorical plots, plus:

- kind - "strip", "swarm", "box", "violin", "boxen", "bar", "point", "count"
- col - Categorical variable for column facets
- row - Categorical variable for row facets
- col_wrap - Wrap columns
- col_order - Order for column facets
- row_order - Order for row facets
- height - Height of each facet
- aspect - Aspect ratio
- sharex, sharey - Share axes across facets
- legend - Whether to show legend
- legend_out - Place legend outside figure
- facet_kws - Additional FacetGrid parameters

Example:

sns.catplot(data=df, x='day', y='total_bill',
            hue='smoker', col='time',
            kind='violin', split=True,
            height=4, aspect=0.8)

Regression Plots

regplot()

Purpose: Plot data and a linear regression fit.

Key Parameters:

- data - DataFrame
- x, y - Variables or data vectors
- x_estimator - Apply estimator to x bins
- x_bins - Bin x for estimator
- x_ci - CI for binned estimates
- scatter - Show scatter points
- fit_reg - Plot regression line
- ci - CI for regression estimate (int or None)
- n_boot - Bootstrap iterations for CI
- units - Identifier for sampling units
- seed - Random seed
- order - Polynomial regression order
- logistic - Fit logistic regression
- lowess - Fit lowess smoother
- robust - Fit robust regression
- logx - Log-transform x
- x_partial, y_partial - Partial regression (regress out variables)
- truncate - Limit regression line to data range
- dropna - Drop missing values
- x_jitter, y_jitter - Add jitter to data
- label - Label for legend
- color - Color for all elements
- marker - Marker style
- scatter_kws - Parameters for scatter
- line_kws - Parameters for regression line
- ax - Matplotlib axes

Example:

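# order > 1, logistic, lowess, robust, and logx are mutually exclusive,
# so only the polynomial fit is requested here
sns.regplot(data=df, x='total_bill', y='tip',
            order=2, ci=95,
            scatter_kws={'alpha': 0.5})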
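
The order parameter requests a polynomial fit of that degree. A minimal NumPy sketch of the same idea, using np.polyfit on hypothetical noiseless data so the recovered coefficients are easy to check:

```python
import numpy as np

# Hypothetical noiseless quadratic relationship.
x = np.linspace(0, 10, 50)
y = 1.5 + 0.8 * x - 0.05 * x**2

# deg=2 plays the role of order=2; coefficients come back highest power first.
coeffs = np.polyfit(x, y, deg=2)
fitted = np.polyval(coeffs, x)
print(coeffs)
```

With real, noisy data the coefficients are estimates, and the ci band reflects how much they vary across bootstrap resamples.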

lmplot()

Purpose: Figure-level interface for regression plots onto a FacetGrid.

Key Parameters: All parameters from regplot(), plus:

- hue - Grouping variable
- col - Column facets
- row - Row facets
- palette - Color palette
- col_wrap - Wrap columns
- height - Facet height
- aspect - Aspect ratio
- markers - Marker style(s)
- sharex, sharey - Share axes
- hue_order - Order for hue levels
- col_order - Order for column facets
- row_order - Order for row facets
- legend - Whether to show legend
- legend_out - Place legend outside
- facet_kws - FacetGrid parameters

Example:

sns.lmplot(data=df, x='total_bill', y='tip',
           hue='smoker', col='time', row='sex',
           height=3, aspect=1.2, ci=None)

residplot()

Purpose: Plot residuals of a regression.

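Key Parameters: A subset of the regplot() parameters (data, x, y, x_partial, y_partial, order, robust, lowess, dropna, label, color, scatter_kws, line_kws, ax), with these differences:

- Always plots residuals (y - predicted) vs x
- Adds horizontal line at y=0
- lowess - Fit lowess smoother to residuals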

Example:

sns.residplot(data=df, x='x', y='y', lowess=True,
              scatter_kws={'alpha': 0.5})
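
The plotted quantity is the observed y minus the fitted value. This NumPy sketch computes the residuals of a simple linear fit on synthetic data (the slope, intercept, and noise level are arbitrary assumptions) to show what ends up on the y axis.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)  # synthetic data

# Ordinary least-squares line, then residual = observed - predicted.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

print(residuals.mean(), np.dot(residuals, x))
```

For a well-specified linear model the residuals scatter evenly around zero; a curved lowess line through them suggests a missing nonlinear term.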

Matrix Plots

heatmap()

Purpose: Plot rectangular data as a color-encoded matrix.

Key Parameters:

- data - 2D array-like data
- vmin, vmax - Anchor values for colormap
- cmap - Colormap name or object
- center - Value at colormap center
- robust - Use robust quantiles for colormap range
- annot - Annotate cells: True, False, or array
- fmt - Format string for annotations (e.g., ".2f")
- annot_kws - Parameters for annotations
- linewidths - Width of cell borders
- linecolor - Color of cell borders
- cbar - Draw colorbar
- cbar_kws - Colorbar parameters
- cbar_ax - Axes for colorbar
- square - Force square cells
- xticklabels, yticklabels - Tick labels (True, False, int, or list)
- mask - Boolean array to mask cells
- ax - Matplotlib axes

Example:

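import numpy as np

# Correlation matrix of the numeric columns
corr = df.corr(numeric_only=True)
# Hide the upper triangle (including the diagonal) to avoid duplicate cells
mask = np.triu(np.ones_like(corr, dtype=bool))
sns.heatmap(corr, mask=mask, annot=True, fmt='.2f',
            cmap='coolwarm', center=0, square=True,
            linewidths=1, cbar_kws={'shrink': 0.8})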
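
Cells where mask is True are not drawn. The small NumPy sketch below, on a made-up 3x3 correlation matrix, shows which cells the np.triu mask hides and how k=1 keeps the diagonal visible.

```python
import numpy as np

# Hypothetical symmetric correlation matrix.
corr = np.array([[1.0, 0.3, -0.2],
                 [0.3, 1.0, 0.6],
                 [-0.2, 0.6, 1.0]])

# True marks a hidden cell: upper triangle plus the diagonal.
mask = np.triu(np.ones_like(corr, dtype=bool))
# k=1 starts one step above the diagonal, so the diagonal stays visible.
mask_keep_diag = np.triu(np.ones_like(corr, dtype=bool), k=1)

# Equivalent view of what remains visible with the first mask.
visible = np.where(mask, np.nan, corr)
print(mask)
print(visible)
```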

clustermap()

Purpose: Plot a hierarchically-clustered heatmap.

Key Parameters: All parameters from heatmap(), plus:

- pivot_kws - Parameters for pivoting (if needed)
- method - Linkage method: "single", "complete", "average", "weighted", "centroid", "median", "ward"
- metric - Distance metric for clustering
- standard_scale - Standardize data: 0 (rows), 1 (columns), or None
- z_score - Z-score normalize data: 0 (rows), 1 (columns), or None
- row_cluster, col_cluster - Cluster rows/columns
- row_linkage, col_linkage - Precomputed linkage matrices
- row_colors, col_colors - Additional color annotations
- dendrogram_ratio - Ratio of dendrogram to heatmap
- colors_ratio - Ratio of color annotations to heatmap
- cbar_pos - Colorbar position (tuple: x, y, width, height)
- tree_kws - Parameters for dendrogram
- figsize - Figure size

Example:

sns.clustermap(data, method='average', metric='euclidean',
               z_score=0, cmap='viridis',
               row_colors=row_colors, col_colors=col_colors,
               figsize=(12, 12), dendrogram_ratio=0.1)
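
z_score and standard_scale both rescale the data before clustering, but in different ways. The NumPy sketch below applies each transformation row-wise (the axis=0 case) to a tiny made-up matrix. Using the sample standard deviation (ddof=1) is an assumption that mirrors the pandas default.

```python
import numpy as np

# Hypothetical data: two rows on very different scales.
data = np.array([[1.0, 2.0, 3.0],
                 [10.0, 20.0, 60.0]])

# z_score=0: subtract each row's mean and divide by its standard deviation.
row_mean = data.mean(axis=1, keepdims=True)
row_std = data.std(axis=1, ddof=1, keepdims=True)
z_rows = (data - row_mean) / row_std

# standard_scale=0: subtract each row's minimum and divide by its maximum,
# so every row spans the range 0 to 1.
shifted = data - data.min(axis=1, keepdims=True)
scaled_rows = shifted / shifted.max(axis=1, keepdims=True)

print(z_rows)
print(scaled_rows)
```

Either option keeps rows with large absolute values from dominating the distance metric; pass only one of the two.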

Multi-Plot Grids

FacetGrid

Purpose: Multi-plot grid for plotting conditional relationships.

Initialization:

g = sns.FacetGrid(data, row=None, col=None, hue=None,
                  col_wrap=None, sharex=True, sharey=True,
                  height=3, aspect=1, palette=None,
                  row_order=None, col_order=None, hue_order=None,
                  hue_kws=None, dropna=False, legend_out=True,
                  despine=True, margin_titles=False,
                  xlim=None, ylim=None, subplot_kws=None,
                  gridspec_kws=None)

Methods:

- map(func, *args, **kwargs) - Apply function to each facet
- map_dataframe(func, *args, **kwargs) - Apply function with full DataFrame
- set_axis_labels(x_var, y_var) - Set axis labels
- set_titles(template, **kwargs) - Set subplot titles
- set(**kwargs) - Set attributes on all axes
- add_legend(legend_data, title, label_order, **kwargs) - Add legend
- savefig(*args, **kwargs) - Save figure

Example:

g = sns.FacetGrid(df, col='time', row='sex', hue='smoker',
                  height=3, aspect=1.5, margin_titles=True)
g.map(sns.scatterplot, 'total_bill', 'tip', alpha=0.7)
g.add_legend()
g.set_axis_labels('Total Bill ($)', 'Tip ($)')
# With margin_titles=True, titles come from col_template and row_template
g.set_titles(col_template='{col_name}', row_template='{row_name}')

PairGrid

Purpose: Grid for plotting pairwise relationships in a dataset.

Initialization:

g = sns.PairGrid(data, hue=None, vars=None,
                 x_vars=None, y_vars=None,
                 hue_order=None, palette=None,
                 hue_kws=None, corner=False,
                 diag_sharey=True, height=2.5,
                 aspect=1, layout_pad=0.5,
                 despine=True, dropna=False)

Methods:

- map(func, **kwargs) - Apply function to all subplots
- map_diag(func, **kwargs) - Apply to diagonal
- map_offdiag(func, **kwargs) - Apply to off-diagonal
- map_upper(func, **kwargs) - Apply to upper triangle
- map_lower(func, **kwargs) - Apply to lower triangle
- add_legend(legend_data, **kwargs) - Add legend
- savefig(*args, **kwargs) - Save figure

Example:

# corner=True is omitted: it hides the upper triangle,
# so map_upper() would draw nothing
g = sns.PairGrid(df, hue='species', vars=['a', 'b', 'c', 'd'],
                 height=2.5)
g.map_upper(sns.scatterplot, alpha=0.5)
g.map_lower(sns.kdeplot)
g.map_diag(sns.histplot, kde=True)
g.add_legend()

JointGrid

Purpose: Grid for bivariate plot with marginal univariate plots.

Initialization:

g = sns.JointGrid(data=None, x=None, y=None, hue=None,
                  height=6, ratio=5, space=0.2,
                  dropna=False, xlim=None, ylim=None,
                  marginal_ticks=False, hue_order=None,
                  palette=None)

Methods:

- plot(joint_func, marginal_func, **kwargs) - Plot both joint and marginals
- plot_joint(func, **kwargs) - Plot joint distribution
- plot_marginals(func, **kwargs) - Plot marginal distributions
- refline(x, y, **kwargs) - Add reference line
- set_axis_labels(xlabel, ylabel, **kwargs) - Set axis labels
- savefig(*args, **kwargs) - Save figure

Example:

g = sns.JointGrid(data=df, x='x', y='y', hue='group',
                  height=6, ratio=5, space=0.2)
g.plot_joint(sns.scatterplot, alpha=0.5)
g.plot_marginals(sns.histplot, kde=True)
g.set_axis_labels('Variable X', 'Variable Y')
