
Seaborn Function Reference

This document provides a comprehensive reference for all major seaborn functions, organized by category.

Relational Plots

scatterplot()

Purpose: Create a scatter plot with points representing individual observations.

Key Parameters:
- data - DataFrame, array, or dict of arrays
- x, y - Variables for x and y axes
- hue - Grouping variable for color encoding
- size - Grouping variable for size encoding
- style - Grouping variable for marker style
- palette - Color palette name or list
- hue_order - Order for categorical hue levels
- hue_norm - Normalization for numeric hue (tuple or Normalize object)
- sizes - Size range for size encoding (tuple or dict)
- size_order - Order for categorical size levels
- size_norm - Normalization for numeric size
- markers - Marker style(s) (string, list, or dict)
- style_order - Order for categorical style levels
- legend - How to draw legend: "auto", "brief", "full", or False
- ax - Matplotlib axes to plot on

Example:

sns.scatterplot(data=df, x='height', y='weight',
                hue='gender', size='age', style='smoker',
                palette='Set2', sizes=(20, 200))

lineplot()

Purpose: Draw a line plot with automatic aggregation and confidence intervals for repeated measures.

Key Parameters:
- data - DataFrame, array, or dict of arrays
- x, y - Variables for x and y axes
- hue - Grouping variable for color encoding
- size - Grouping variable for line width
- style - Grouping variable for line style (dashes)
- units - Grouping variable for sampling units (no aggregation within units)
- estimator - Function for aggregating across observations (default: mean)
- errorbar - Method for error bars: "sd", "se", "pi", ("ci", level), ("pi", level), or None
- n_boot - Number of bootstrap iterations for CI computation
- seed - Random seed for reproducible bootstrapping
- sort - Sort data before plotting
- err_style - "band" or "bars" for error representation
- err_kws - Additional parameters for error representation
- markers - Marker style(s) for emphasizing data points
- dashes - Dash style(s) for lines
- legend - How to draw legend
- ax - Matplotlib axes to plot on

Example:

sns.lineplot(data=timeseries, x='time', y='signal',
             hue='condition', style='subject',
             errorbar=('ci', 95), markers=True)

relplot()

Purpose: Figure-level interface for drawing relational plots (scatter or line) onto a FacetGrid.

Key Parameters: All parameters from scatterplot() and lineplot(), plus:
- kind - "scatter" or "line"
- col - Categorical variable for column facets
- row - Categorical variable for row facets
- col_wrap - Wrap columns after this many columns
- col_order - Order for column facet levels
- row_order - Order for row facet levels
- height - Height of each facet in inches
- aspect - Aspect ratio (width = height * aspect)
- facet_kws - Additional parameters for FacetGrid

Example:

sns.relplot(data=df, x='time', y='measurement',
            hue='treatment', style='batch',
            col='cell_line', row='timepoint',
            kind='line', height=3, aspect=1.5)

Distribution Plots

histplot()

Purpose: Plot univariate or bivariate histograms with flexible binning.

Key Parameters:
- data - DataFrame, array, or dict
- x, y - Variables (y optional for bivariate)
- hue - Grouping variable
- weights - Variable for weighting observations
- stat - Aggregate statistic: "count", "frequency", "probability", "percent", "density"
- bins - Number of bins, bin edges, or method ("auto", "fd", "doane", "scott", "stone", "rice", "sturges", "sqrt")
- binwidth - Width of bins (overrides bins)
- binrange - Range for binning (tuple)
- discrete - Treat x as discrete (centers bars on values)
- cumulative - Compute cumulative distribution
- common_bins - Use same bins for all hue levels
- common_norm - Normalize across hue levels
- multiple - How to handle hue: "layer", "dodge", "stack", "fill"
- element - Visual element: "bars", "step", "poly"
- fill - Fill bars/elements
- shrink - Scale bar width (for multiple="dodge")
- kde - Overlay KDE estimate
- kde_kws - Parameters for KDE
- line_kws - Parameters for step/poly elements
- thresh - Minimum count threshold for bins
- pthresh - Minimum probability threshold
- pmax - Maximum probability for color scaling
- log_scale - Log scale for axis (bool or base)
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.histplot(data=df, x='measurement', hue='condition',
             stat='density', bins=30, kde=True,
             multiple='layer', alpha=0.5)

kdeplot()

Purpose: Plot univariate or bivariate kernel density estimates.

Key Parameters:
- data - DataFrame, array, or dict
- x, y - Variables (y optional for bivariate)
- hue - Grouping variable
- weights - Variable for weighting observations
- palette - Color palette
- hue_order - Order for hue levels
- hue_norm - Normalization for numeric hue
- multiple - How to handle hue: "layer", "stack", "fill"
- common_norm - Normalize across hue levels
- common_grid - Use same grid for all hue levels
- cumulative - Compute cumulative distribution
- bw_method - Method for bandwidth: "scott", "silverman", or scalar
- bw_adjust - Bandwidth multiplier (higher = smoother)
- log_scale - Log scale for axis
- levels - Number or values for contour levels (bivariate)
- thresh - Minimum density threshold for contours
- gridsize - Grid resolution
- cut - Extension beyond data extremes (in bandwidth units)
- clip - Data range for curve (tuple)
- fill - Fill area under curve/contours
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

# Univariate
sns.kdeplot(data=df, x='measurement', hue='condition',
            fill=True, common_norm=False, bw_adjust=1.5)

# Bivariate
sns.kdeplot(data=df, x='var1', y='var2',
            fill=True, levels=10, thresh=0.05)

ecdfplot()

Purpose: Plot empirical cumulative distribution functions.

Key Parameters:
- data - DataFrame, array, or dict
- x, y - Variables (specify one)
- hue - Grouping variable
- weights - Variable for weighting observations
- stat - "proportion" or "count" ("percent" is also accepted in seaborn 0.13+)
- complementary - Plot complementary CDF (1 - ECDF)
- palette - Color palette
- hue_order - Order for hue levels
- hue_norm - Normalization for numeric hue
- log_scale - Log scale for axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.ecdfplot(data=df, x='response_time', hue='treatment',
             stat='proportion', complementary=False)

rugplot()

Purpose: Plot tick marks showing individual observations along an axis.

Key Parameters:
- data - DataFrame, array, or dict
- x, y - Variable (specify one)
- hue - Grouping variable
- height - Height of ticks (proportion of axis)
- expand_margins - Add margin space for rug
- palette - Color palette
- hue_order - Order for hue levels
- hue_norm - Normalization for numeric hue
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.rugplot(data=df, x='value', hue='category', height=0.05)

displot()

Purpose: Figure-level interface for distribution plots onto a FacetGrid.

Key Parameters: All parameters from histplot(), kdeplot(), and ecdfplot(), plus:
- kind - "hist", "kde", "ecdf"
- rug - Add rug plot on marginal axes
- rug_kws - Parameters for rug plot
- col - Categorical variable for column facets
- row - Categorical variable for row facets
- col_wrap - Wrap columns
- col_order - Order for column facets
- row_order - Order for row facets
- height - Height of each facet
- aspect - Aspect ratio
- facet_kws - Additional parameters for FacetGrid

Example:

sns.displot(data=df, x='measurement', hue='treatment',
            col='timepoint', kind='kde', fill=True,
            height=3, aspect=1.5, rug=True)

jointplot()

Purpose: Draw a bivariate plot with marginal univariate plots.

Key Parameters:
- data - DataFrame
- x, y - Variables for x and y axes
- hue - Grouping variable
- kind - "scatter", "kde", "hist", "hex", "reg", "resid"
- height - Size of the figure (square)
- ratio - Ratio of joint to marginal axes
- space - Space between joint and marginal axes
- dropna - Drop missing values
- xlim, ylim - Axis limits (tuples)
- marginal_ticks - Show ticks on marginal axes
- joint_kws - Parameters for joint plot
- marginal_kws - Parameters for marginal plots
- hue_order - Order for hue levels
- palette - Color palette

Example:

sns.jointplot(data=df, x='var1', y='var2', hue='group',
              kind='scatter', height=6, ratio=4,
              joint_kws={'alpha': 0.5})

pairplot()

Purpose: Plot pairwise relationships in a dataset.

Key Parameters:
- data - DataFrame
- hue - Grouping variable for color encoding
- hue_order - Order for hue levels
- palette - Color palette
- vars - Variables to plot (default: all numeric)
- x_vars, y_vars - Variables for x and y axes (non-square grid)
- kind - "scatter", "kde", "hist", "reg"
- diag_kind - "auto", "hist", "kde", None
- markers - Marker style(s)
- height - Height of each facet
- aspect - Aspect ratio
- corner - Plot only lower triangle
- dropna - Drop missing values
- plot_kws - Parameters for non-diagonal plots
- diag_kws - Parameters for diagonal plots
- grid_kws - Parameters for PairGrid

Example:

sns.pairplot(data=df, hue='species', palette='Set2',
             vars=['sepal_length', 'sepal_width', 'petal_length'],
             corner=True, height=2.5)

Categorical Plots

stripplot()

Purpose: Draw a categorical scatterplot with jittered points.

Key Parameters:
- data - DataFrame, array, or dict
- x, y - Variables (one categorical, one continuous)
- hue - Grouping variable
- order - Order for categorical levels
- hue_order - Order for hue levels
- jitter - Amount of jitter: True, float, or False
- dodge - Separate hue levels side-by-side
- orient - "v" or "h" (usually inferred)
- color - Single color for all elements
- palette - Color palette
- size - Marker size
- edgecolor - Marker edge color
- linewidth - Marker edge width
- native_scale - Use numeric scale for categorical axis
- formatter - Formatter for categorical axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.stripplot(data=df, x='day', y='total_bill',
              hue='sex', dodge=True, jitter=0.2)

swarmplot()

Purpose: Draw a categorical scatterplot with non-overlapping points.

Key Parameters: Same as stripplot(), except:
- No jitter parameter
- size - Marker size (important for avoiding overlap)
- warn_thresh - Threshold for warning about too many points (default: 0.05)

Note: swarmplot() is computationally intensive for large datasets. Prefer stripplot() when plotting more than about 1000 points.

Example:

sns.swarmplot(data=df, x='day', y='total_bill',
              hue='time', dodge=True, size=5)

boxplot()

Purpose: Draw a box plot showing quartiles and outliers.

Key Parameters:
- data - DataFrame, array, or dict
- x, y - Variables (one categorical, one continuous)
- hue - Grouping variable
- order - Order for categorical levels
- hue_order - Order for hue levels
- orient - "v" or "h"
- color - Single color for boxes
- palette - Color palette
- saturation - Color saturation intensity
- width - Width of boxes
- dodge - Separate hue levels side-by-side
- fliersize - Size of outlier markers
- linewidth - Box line width
- whis - IQR multiplier for whiskers (default: 1.5)
- notch - Draw notched boxes
- showcaps - Show whisker caps
- showmeans - Show mean value
- meanprops - Properties for mean marker
- boxprops - Properties for boxes
- whiskerprops - Properties for whiskers
- capprops - Properties for caps
- flierprops - Properties for outliers
- medianprops - Properties for median line
- native_scale - Use numeric scale
- formatter - Formatter for categorical axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.boxplot(data=df, x='day', y='total_bill',
            hue='smoker', palette='Set3',
            showmeans=True, notch=True)

violinplot()

Purpose: Draw a violin plot combining boxplot and KDE.

Key Parameters: Same as boxplot(), plus:
- bw_method - KDE bandwidth method
- bw_adjust - KDE bandwidth multiplier
- cut - KDE extension beyond extremes
- density_norm - Normalization of violin width: "area", "count", "width" (replaces the scale parameter, deprecated in seaborn 0.13)
- common_norm - Normalize density across all violins (replaces the scale_hue parameter, deprecated in seaborn 0.13)
- inner - "box", "quart", "point", "stick", None
- split - Split violins for hue comparison
- gridsize - KDE grid resolution

Example:

sns.violinplot(data=df, x='day', y='total_bill',
               hue='sex', split=True, inner='quartile',
               palette='muted')

boxenplot()

Purpose: Draw enhanced box plot for larger datasets showing more quantiles.

Key Parameters: Same as boxplot(), plus:
- k_depth - "tukey", "proportion", "trustworthy", "full", or int
- outlier_prop - Proportion of data as outliers
- trust_alpha - Alpha for trustworthy depth
- showfliers - Show outlier points

Example:

sns.boxenplot(data=df, x='day', y='total_bill',
              hue='time', palette='Set2')

barplot()

Purpose: Draw a bar plot with error bars showing statistical estimates.

Key Parameters:
- data - DataFrame, array, or dict
- x, y - Variables (one categorical, one continuous)
- hue - Grouping variable
- order - Order for categorical levels
- hue_order - Order for hue levels
- estimator - Aggregation function (default: mean)
- errorbar - Error representation: "sd", "se", "pi", ("ci", level), ("pi", level), or None
- n_boot - Bootstrap iterations
- seed - Random seed
- units - Identifier for sampling units
- weights - Observation weights
- orient - "v" or "h"
- color - Single bar color
- palette - Color palette
- saturation - Color saturation
- width - Bar width
- dodge - Separate hue levels side-by-side
- err_kws - Line properties for error bars, e.g. color and linewidth (replaces errcolor and errwidth, deprecated in seaborn 0.13)
- capsize - Error bar cap width
- native_scale - Use numeric scale
- formatter - Formatter for categorical axis
- legend - Whether to show legend
- ax - Matplotlib axes

Example:

sns.barplot(data=df, x='day', y='total_bill',
            hue='sex', estimator='median',
            errorbar=('ci', 95), capsize=0.1)

countplot()

Purpose: Show counts of observations in each categorical bin.

Key Parameters: Same as barplot(), but:
- Only specify one of x or y (the categorical variable)
- No estimator or errorbar (shows counts)
- stat - "count", "percent", "proportion", or "probability"

Example:

sns.countplot(data=df, x='day', hue='time',
              palette='pastel', dodge=True)

pointplot()

Purpose: Show point estimates and confidence intervals with connecting lines.

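Key Parameters: Same as barplot(), plus:
- markers - Marker style(s)
- linestyles - Line style(s)
- capsize - Error bar cap width

Note: The scale and join parameters are deprecated as of seaborn 0.13. Control marker and line size with Matplotlib keyword arguments such as markersize and linewidth, and pass linestyle="none" to draw points without connecting lines.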

Example:

sns.pointplot(data=df, x='time', y='total_bill',
              hue='sex', markers=['o', 's'],
              linestyles=['-', '--'], capsize=0.1)

catplot()

Purpose: Figure-level interface for categorical plots onto a FacetGrid.

Key Parameters: All parameters from categorical plots, plus:
- kind - "strip", "swarm", "box", "violin", "boxen", "bar", "point", "count"
- col - Categorical variable for column facets
- row - Categorical variable for row facets
- col_wrap - Wrap columns
- col_order - Order for column facets
- row_order - Order for row facets
- height - Height of each facet
- aspect - Aspect ratio
- sharex, sharey - Share axes across facets
- legend - Whether to show legend
- legend_out - Place legend outside figure
- facet_kws - Additional FacetGrid parameters

Example:

sns.catplot(data=df, x='day', y='total_bill',
            hue='smoker', col='time',
            kind='violin', split=True,
            height=4, aspect=0.8)

Regression Plots

regplot()

Purpose: Plot data and a linear regression fit.

Key Parameters:
- data - DataFrame
- x, y - Variables or data vectors
- x_estimator - Apply estimator to x bins
- x_bins - Bin x for estimator
- x_ci - CI for binned estimates
- scatter - Show scatter points
- fit_reg - Plot regression line
- ci - CI for regression estimate (int or None)
- n_boot - Bootstrap iterations for CI
- units - Identifier for sampling units
- seed - Random seed
- order - Polynomial regression order
- logistic - Fit logistic regression
- lowess - Fit lowess smoother
- robust - Fit robust regression
- logx - Log-transform x
- x_partial, y_partial - Partial regression (regress out variables)
- truncate - Limit regression line to data range
- dropna - Drop missing values
- x_jitter, y_jitter - Add jitter to data
- label - Label for legend
- color - Color for all elements
- marker - Marker style
- scatter_kws - Parameters for scatter
- line_kws - Parameters for regression line
- ax - Matplotlib axes

Example:

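# order > 1, logistic, lowess, robust, and logx are mutually exclusive;
# combining order=2 with robust=True raises a ValueError
sns.regplot(data=df, x='total_bill', y='tip',
            order=2, ci=95,
            scatter_kws={'alpha': 0.5})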
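
The model options of regplot() (order greater than 1, logistic, lowess, robust, logx) cannot be combined with each other. The sketch below, on synthetic data invented for illustration, fits a quadratic and then shows that adding robust=True to a polynomial fit is rejected.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import numpy as np
import seaborn as sns

# Hypothetical data with a quadratic trend plus noise
rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=80)
y = 0.5 * x**2 - x + rng.normal(scale=2.0, size=80)

# A second-order polynomial fit without a confidence band
ax = sns.regplot(x=x, y=y, order=2, ci=None, scatter_kws={"alpha": 0.5})

# Combining two model options is expected to raise ValueError
try:
    sns.regplot(x=x, y=y, order=2, robust=True)
    raised = False
except ValueError:
    raised = True
```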

lmplot()

Purpose: Figure-level interface for regression plots onto a FacetGrid.

Key Parameters: All parameters from regplot(), plus:
- hue - Grouping variable
- col - Column facets
- row - Row facets
- palette - Color palette
- col_wrap - Wrap columns
- height - Facet height
- aspect - Aspect ratio
- markers - Marker style(s)
- sharex, sharey - Share axes
- hue_order - Order for hue levels
- col_order - Order for column facets
- row_order - Order for row facets
- legend - Whether to show legend
- legend_out - Place legend outside
- facet_kws - FacetGrid parameters

Example:

sns.lmplot(data=df, x='total_bill', y='tip',
           hue='smoker', col='time', row='sex',
           height=3, aspect=1.2, ci=None)

residplot()

Purpose: Plot residuals of a regression.

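Key Parameters: A subset of the regplot() parameters (data, x, y, x_partial, y_partial, order, robust, dropna, label, color, scatter_kws, line_kws, ax), with these differences:
- Always plots residuals (y - predicted) vs x
- Adds horizontal line at y=0
- lowess - Fit lowess smoother to residuals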

Example:

sns.residplot(data=df, x='x', y='y', lowess=True,
              scatter_kws={'alpha': 0.5})

Matrix Plots

heatmap()

Purpose: Plot rectangular data as a color-encoded matrix.

Key Parameters:
- data - 2D array-like data
- vmin, vmax - Anchor values for colormap
- cmap - Colormap name or object
- center - Value at colormap center
- robust - Use robust quantiles for colormap range
- annot - Annotate cells: True, False, or array
- fmt - Format string for annotations (e.g., ".2f")
- annot_kws - Parameters for annotations
- linewidths - Width of cell borders
- linecolor - Color of cell borders
- cbar - Draw colorbar
- cbar_kws - Colorbar parameters
- cbar_ax - Axes for colorbar
- square - Force square cells
- xticklabels, yticklabels - Tick labels (True, False, int, or list)
- mask - Boolean array to mask cells
- ax - Matplotlib axes

Example:

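import numpy as np

# Correlation matrix (numeric columns only; non-numeric columns raise in pandas 2.x)
corr = df.corr(numeric_only=True)
# Hide the upper triangle, including the diagonal
mask = np.triu(np.ones_like(corr, dtype=bool))
sns.heatmap(corr, mask=mask, annot=True, fmt='.2f',
            cmap='coolwarm', center=0, square=True,
            linewidths=1, cbar_kws={'shrink': 0.8})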
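
To show what the mask does, here is a self-contained sketch on random data (column names a to d are placeholders). Cells where the mask is True are not drawn or annotated, so a 4 x 4 correlation matrix with the upper triangle and diagonal masked leaves 6 visible cells.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(6)
data = pd.DataFrame(rng.normal(size=(50, 4)), columns=list("abcd"))

corr = data.corr()
# True marks cells to hide: the upper triangle plus the diagonal
mask = np.triu(np.ones_like(corr, dtype=bool))

ax = sns.heatmap(corr, mask=mask, annot=True, fmt=".2f",
                 cmap="coolwarm", center=0, square=True)

visible_cells = int((~mask).sum())  # cells left unmasked
annotated_cells = len(ax.texts)     # one text label per visible cell
```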

clustermap()

Purpose: Plot a hierarchically-clustered heatmap.

Key Parameters: All parameters from heatmap(), plus:
- pivot_kws - Parameters for pivoting (if needed)
- method - Linkage method: "single", "complete", "average", "weighted", "centroid", "median", "ward"
- metric - Distance metric for clustering
- standard_scale - Standardize data: 0 (rows), 1 (columns), or None
- z_score - Z-score normalize data: 0 (rows), 1 (columns), or None
- row_cluster, col_cluster - Cluster rows/columns
- row_linkage, col_linkage - Precomputed linkage matrices
- row_colors, col_colors - Additional color annotations
- dendrogram_ratio - Ratio of dendrogram to heatmap
- colors_ratio - Ratio of color annotations to heatmap
- cbar_pos - Colorbar position (tuple: left, bottom, width, height)
- tree_kws - Parameters for dendrogram
- figsize - Figure size

Example:

sns.clustermap(data, method='average', metric='euclidean',
               z_score=0, cmap='viridis',
               row_colors=row_colors, col_colors=col_colors,
               figsize=(12, 12), dendrogram_ratio=0.1)
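
clustermap() returns a grid object from which the clustered ordering can be read back. The following sketch assumes SciPy is installed (clustering depends on it) and uses a small random matrix with invented row and column labels.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(7)
data = pd.DataFrame(
    rng.normal(size=(8, 5)),
    index=[f"gene_{i}" for i in range(8)],      # hypothetical row labels
    columns=[f"sample_{j}" for j in range(5)],  # hypothetical column labels
)

g = sns.clustermap(data, method="average", metric="euclidean",
                   z_score=0, cmap="viridis", figsize=(6, 6))

# Row and column positions after hierarchical clustering
row_order = g.dendrogram_row.reordered_ind
col_order = g.dendrogram_col.reordered_ind
clustered_rows = data.index[row_order]
```

The reordered indices are useful when the same ordering has to be applied to a separate table or annotation.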

Multi-Plot Grids

FacetGrid

Purpose: Multi-plot grid for plotting conditional relationships.

Initialization:

g = sns.FacetGrid(data, row=None, col=None, hue=None,
                  col_wrap=None, sharex=True, sharey=True,
                  height=3, aspect=1, palette=None,
                  row_order=None, col_order=None, hue_order=None,
                  hue_kws=None, dropna=False, legend_out=True,
                  despine=True, margin_titles=False,
                  xlim=None, ylim=None, subplot_kws=None,
                  gridspec_kws=None)

Methods:
- map(func, *args, **kwargs) - Apply function to each facet, passing the named columns as positional arrays
- map_dataframe(func, *args, **kwargs) - Apply function to each facet, passing that facet's subset as the data keyword argument
- set_axis_labels(x_var, y_var) - Set axis labels
- set_titles(template, **kwargs) - Set subplot titles
- set(**kwargs) - Set attributes on all axes
- add_legend(legend_data, title, label_order, **kwargs) - Add legend
- savefig(*args, **kwargs) - Save figure

Example:

g = sns.FacetGrid(df, col='time', row='sex', hue='smoker',
                  height=3, aspect=1.5, margin_titles=True)
g.map(sns.scatterplot, 'total_bill', 'tip', alpha=0.7)
g.add_legend()
g.set_axis_labels('Total Bill ($)', 'Tip ($)')
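# With margin_titles=True, titles come from col_template and row_template
g.set_titles(col_template='{col_name}', row_template='{row_name}')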
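
For plotting functions that take a data argument, map_dataframe() is often the more convenient method. This is a minimal sketch on synthetic data (column names borrowed from the example above, values invented), checking that each facet receives only its own subset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(8)
df = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, size=40),
    "tip": rng.uniform(1, 10, size=40),
    "time": np.repeat(["Lunch", "Dinner"], 20),
})

g = sns.FacetGrid(df, col="time", height=3, aspect=1.2)
# Each facet's subset is passed to scatterplot as data=...
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip", alpha=0.7)
g.set_axis_labels("Total Bill ($)", "Tip ($)")
g.set_titles(col_template="{col_name}")

# Count the points drawn in each facet
points_per_facet = {
    name: len(ax.collections[0].get_offsets())
    for name, ax in g.axes_dict.items()
}
```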

PairGrid

Purpose: Grid for plotting pairwise relationships in a dataset.

Initialization:

g = sns.PairGrid(data, hue=None, vars=None,
                 x_vars=None, y_vars=None,
                 hue_order=None, palette=None,
                 hue_kws=None, corner=False,
                 diag_sharey=True, height=2.5,
                 aspect=1, layout_pad=0.5,
                 despine=True, dropna=False)

Methods:
- map(func, **kwargs) - Apply function to all subplots
- map_diag(func, **kwargs) - Apply to diagonal
- map_offdiag(func, **kwargs) - Apply to off-diagonal
- map_upper(func, **kwargs) - Apply to upper triangle
- map_lower(func, **kwargs) - Apply to lower triangle
- add_legend(legend_data, **kwargs) - Add legend
- savefig(*args, **kwargs) - Save figure

Example:

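# corner=True is omitted here: it removes the upper-triangle axes,
# so map_upper() would have nothing to draw on
g = sns.PairGrid(df, hue='species', vars=['a', 'b', 'c', 'd'],
                 height=2.5)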
g.map_upper(sns.scatterplot, alpha=0.5)
g.map_lower(sns.kdeplot)
g.map_diag(sns.histplot, kde=True)
g.add_legend()

JointGrid

Purpose: Grid for bivariate plot with marginal univariate plots.

Initialization:

g = sns.JointGrid(data=None, x=None, y=None, hue=None,
                  height=6, ratio=5, space=0.2,
                  dropna=False, xlim=None, ylim=None,
                  marginal_ticks=False, hue_order=None,
                  palette=None)

Methods:
- plot(joint_func, marginal_func, **kwargs) - Plot both joint and marginals
- plot_joint(func, **kwargs) - Plot joint distribution
- plot_marginals(func, **kwargs) - Plot marginal distributions
- refline(x, y, **kwargs) - Add reference line
- set_axis_labels(xlabel, ylabel, **kwargs) - Set axis labels
- savefig(*args, **kwargs) - Save figure

Example:

g = sns.JointGrid(data=df, x='x', y='y', hue='group',
                  height=6, ratio=5, space=0.2)
g.plot_joint(sns.scatterplot, alpha=0.5)
g.plot_marginals(sns.histplot, kde=True)
g.set_axis_labels('Variable X', 'Variable Y')