matminer.figrecipes.plotly package

Submodules

matminer.figrecipes.plotly.make_plots module

class matminer.figrecipes.plotly.make_plots.PlotlyFig(df=None, plot_mode='offline', plot_title=None, x_title=None, y_title=None, hovermode='closest', filename='auto', show_offline_plot=True, username=None, api_key=None, textsize=25, ticksize=25, fontfamily='Courier', height=None, width=None, scale=None, margins=100, pad=0, marker_scale=1.0, text_scale=1.0, tick_scale=1.0, x_scale='linear', y_scale='linear', hoverinfo='x+y+text')

Bases: object

__init__(df=None, plot_mode='offline', plot_title=None, x_title=None, y_title=None, hovermode='closest', filename='auto', show_offline_plot=True, username=None, api_key=None, textsize=25, ticksize=25, fontfamily='Courier', height=None, width=None, scale=None, margins=100, pad=0, marker_scale=1.0, text_scale=1.0, tick_scale=1.0, x_scale='linear', y_scale='linear', hoverinfo='x+y+text')

Class for making Plotly plots

Args:
df (DataFrame): A pandas dataframe object which can be used to generate several plots.
plot_mode: (str) One of the following:
  1. ‘offline’: creates and saves plots on the local disk,
  2. ‘notebook’: embeds plots in an IPython/Jupyter notebook,
  3. ‘online’: saves the plot in your online Plotly account,
  4. ‘static’: saves a static image of the plot locally,
  5. ‘return’: any plotting method returns its Plotly Figure object. Useful for fine-tuning the plot.
NOTE: Both ‘online’ and ‘static’ modes require either the fields ‘username’ and ‘api_key’ or a Plotly credentials file.
plot_title: (str) title of plot
x_title: (str) title of x-axis
y_title: (str) title of y-axis
hovermode: (str) determines the mode of hover interactions. Can be ‘x’/‘y’/‘closest’/False
filename: (str) name/filepath of plot file
show_offline_plot: (bool) automatically open the plot (the plot is saved either way); only applies to ‘offline’ mode.
username: (str) plotly account username
api_key: (str) plotly account API key
textsize: (int) size of text of plot title and axis titles
ticksize: (int) size of ticks
fontfamily: (str) HTML font family - the typeface that will be applied by the web browser. The web browser will only be able to apply a font if it is available on the system on which it operates. Provide multiple font families, separated by commas, to indicate the preference in which to apply fonts if they aren’t available on the system. The plotly service (at https://plot.ly or on-premise) generates images on a server, where only a select number of fonts are installed and supported. These include “Arial”, “Balto”, “Courier New”, “Droid Sans”, “Droid Serif”, “Droid Sans Mono”, “Gravitas One”, “Old Standard TT”, “Open Sans”, “Overpass”, “PT Sans Narrow”, “Raleway”, “Times New Roman”.
height: (float) output height (in pixels)
width: (float) output width (in pixels)
scale: (float) Increase the resolution of the image by scale amount, eg: 3. Only valid for PNG and JPEG images.
margins (float or [float]): Specify the margin (in px) with a list [top, bottom, right, left], or a number which will set all margins.
pad: (float) Sets the amount of padding (in px) between the plotting area and the axis lines
marker_scale (float): scale the size of all markers w.r.t. defaults
x_scale: (str) Sets the x axis scaling type. Select from ‘linear’, ‘log’, ‘date’, ‘category’.
y_scale: (str) Sets the y axis scaling type. Select from ‘linear’, ‘log’, ‘date’, ‘category’.
hoverinfo: (str) Any combination of “x”, “y”, “z”, “text”, “name” joined with a “+” OR “all” or “none” or “skip”. Examples: “x”, “y”, “x+y”, “x+y+z”, “all”. Determines which trace information appears on hover. If none or skip are set, no information is displayed upon hovering. But, if none is set, click and hover events are still fired.

Returns: None
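
As a rough illustration of two of the argument formats above, the following sketch expands a margins value into the t/b/r/l/pad keys of Plotly's layout margin and checks a hoverinfo string against the documented rules. The helpers expand_margins and is_valid_hoverinfo are hypothetical names introduced here; they are not part of matminer, and how PlotlyFig processes these arguments internally is not shown in this page.

```python
# Hypothetical helpers (not part of matminer) illustrating the documented
# formats of the 'margins' and 'hoverinfo' arguments.

HOVER_FLAGS = {"x", "y", "z", "text", "name"}
HOVER_SPECIAL = {"all", "none", "skip"}


def expand_margins(margins, pad=0):
    """Expand a number or [top, bottom, right, left] list into a margin dict."""
    if isinstance(margins, (int, float)):
        top = bottom = right = left = margins
    else:
        if len(margins) != 4:
            raise ValueError("margins must be a number or [top, bottom, right, left]")
        top, bottom, right, left = margins
    # 't', 'b', 'r', 'l' and 'pad' are the keys of Plotly's layout margin.
    return {"t": top, "b": bottom, "r": right, "l": left, "pad": pad}


def is_valid_hoverinfo(hoverinfo):
    """Return True for 'all'/'none'/'skip' or distinct flags joined with '+'."""
    if hoverinfo in HOVER_SPECIAL:
        return True
    parts = hoverinfo.split("+")
    return all(p in HOVER_FLAGS for p in parts) and len(set(parts)) == len(parts)


print(expand_margins(100))
print(expand_margins([80, 60, 40, 120], pad=5))
print(is_valid_hoverinfo("x+y+text"))
```

Note the list order [top, bottom, right, left] stated in the argument description; it differs from the CSS convention, so passing an explicit list is worth double-checking.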

bar(data=None, cols=None, x=None, y=None, labels=None, barmode='group', colors=None, bargap=None)

Create a bar chart using Plotly.

Can be used with x and y arguments or with a dataframe (passed as ‘data’ or taken from constructor).

Args:
data (DataFrame): The column names will become the ‘x’ axis. The rows will become sets of bars (e.g., 3 rows = 3 sets of bars for each x point).
cols ([str]): A list of strings specifying columns of a DataFrame passed into the constructor to be used as data. Should not be used with ‘data’.
x (list or [list]): A list containing ‘x’ axis values. Can be a list of lists if there is more than one set of bars.
y (list or [list]): A list containing ‘y’ values. Can be a list of lists if there is more than one set of bars (more than one set of data for each ‘x’ axis value).
labels (str or [str]): Defines the label for each set of bars. If str, defines the column of the DataFrame to use for labelling. The column’s entry for a row will be the label for that row. If it is a list of strings, should be used with x and y, and defines the label for each set of bars.
barmode: Defines how sets of bars are displayed. Can be set to “group” or “stack”.
colors ([str]): The list of colors to use for each set of bars. The length of this list should be equal to the number of rows (sets of bars) present in your data.
bargap (int/float): Separation between bars.

Returns:
A Plotly bar chart object.
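
The mapping described above (column names on the x axis, one set of bars per row) can be sketched with plain Python containers. This is only an illustration under stated assumptions: build_bar_figure is a hypothetical helper, lists stand in for a DataFrame, and the output is a Plotly-style figure dictionary rather than whatever object bar() actually returns.

```python
# Plain lists stand in for a DataFrame here; build_bar_figure is a
# hypothetical helper, not matminer's implementation.

def build_bar_figure(columns, rows, labels, barmode="group", colors=None, bargap=None):
    """Build a Plotly-style figure dict with one bar trace per row."""
    if len(rows) != len(labels):
        raise ValueError("need one label per row (set of bars)")
    if colors is not None and len(colors) != len(rows):
        raise ValueError("need one color per row (set of bars)")
    traces = []
    for i, (row, label) in enumerate(zip(rows, labels)):
        # Column names are the x positions; the row supplies the bar heights.
        trace = {"type": "bar", "x": list(columns), "y": list(row), "name": label}
        if colors is not None:
            trace["marker"] = {"color": colors[i]}
        traces.append(trace)
    layout = {"barmode": barmode}
    if bargap is not None:
        layout["bargap"] = bargap
    return {"data": traces, "layout": layout}


fig = build_bar_figure(
    columns=["Fe", "Ni", "Cu"],
    rows=[[1.0, 2.0, 3.0], [1.5, 2.5, 3.5]],
    labels=["run A", "run B"],
    barmode="stack",
)
print(len(fig["data"]), fig["layout"])
```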

create_plot(fig)

Creates a plotly plot based on its dictionary representation. The modes of plotting are:

  1. offline: Makes an offline HTML file.
  2. notebook: Embeds the plot in a Jupyter notebook.
  3. online: Sends the plot to Plotly; requires credentials.
  4. static: Creates a static image of the plot.
  5. return: Returns the dictionary representation of the plot.

Args:
fig: (dictionary) contains data and layout information

Returns:
A Plotly Figure object (if self.plot_mode = ‘return’)
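
To make the phrase "dictionary representation" concrete, here is a minimal sketch of a figure dictionary with data and layout entries, together with a check against the five mode names listed above. make_figure_dict and check_plot_mode are hypothetical helpers written for this illustration; the layout keys used (title, hovermode, xaxis, yaxis) are standard Plotly layout attributes, but which of them PlotlyFig sets is an assumption.

```python
import json

PLOT_MODES = ("offline", "notebook", "online", "static", "return")


def make_figure_dict(traces, plot_title=None, x_title=None, y_title=None,
                     hovermode="closest", x_scale="linear", y_scale="linear"):
    """Assemble a Plotly-style figure dict with 'data' and 'layout' keys."""
    layout = {
        "title": plot_title,
        "hovermode": hovermode,
        "xaxis": {"title": x_title, "type": x_scale},
        "yaxis": {"title": y_title, "type": y_scale},
    }
    return {"data": list(traces), "layout": layout}


def check_plot_mode(plot_mode):
    """Raise ValueError for a mode outside the five documented ones."""
    if plot_mode not in PLOT_MODES:
        raise ValueError("unknown plot_mode: %r" % (plot_mode,))
    return plot_mode


fig = make_figure_dict(
    [{"type": "scatter", "mode": "markers", "x": [1, 2, 3], "y": [4, 1, 9]}],
    plot_title="Example", x_title="x", y_title="y", y_scale="log",
)
# The dict is plain data, so it can be serialized or inspected directly.
print(json.dumps(fig, indent=2))
```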

data_from_col(col, df=None)

Tries to get data based on the column name in the dataframe, and returns an informative error if that fails.

Args:
col (str): column name to look for

Returns: (pd.Series or col itself)
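
A minimal sketch of this kind of lookup is given below, with a plain dict standing in for the dataframe. The function name data_from_col_sketch is hypothetical, and two behaviors are assumptions rather than documented facts: that the "informative error" is a KeyError listing the available columns, and that "col itself" is returned whenever col is not a string.

```python
def data_from_col_sketch(col, table):
    """Return table[col] if col is a column name, else return col unchanged."""
    if isinstance(col, str):
        if col not in table:
            # Assumption: report the available columns to make the error useful.
            raise KeyError(
                "column %r not found; available columns: %s"
                % (col, ", ".join(sorted(table)))
            )
        return table[col]
    # Assumption: anything that is not a string is treated as data already.
    return col


table = {"volume": [10.2, 11.5], "K_VRH": [120.0, 95.0]}
print(data_from_col_sketch("volume", table))
print(data_from_col_sketch([1, 2, 3], table))
```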

heatmap_plot(data, x_labels=None, y_labels=None, colorscale='Viridis', colorscale_range=None, annotations_text=None, annotations_text_size=20, annotations_color='white')

Make a heatmap plot, either using 2D arrays of values, or a dataframe.

Args:
data: (array) an array of arrays. For example, in case of a pandas dataframe ‘df’, data=df.values.tolist()
x_labels: (array) an array of strings to label the heatmap columns
y_labels: (array) an array of strings to label the heatmap rows
colorscale: (str/array) Sets the colorscale. The colorscale must be an array containing arrays mapping a normalized value to an rgb, rgba, hex, hsl, hsv, or named color string. At minimum, a mapping for the lowest (0) and highest (1) values is required. For example, [[0, 'rgb(0,0,255)'], [1, 'rgb(255,0,0)']]. Alternatively, colorscale may be a palette name string from the following list: Greys, YlGnBu, Greens, YlOrRd, Bluered, RdBu, Reds, Blues, Picnic, Rainbow, Portland, Jet, Hot, Blackbody, Earth, Electric, Viridis
colorscale_range: (array) Sets the minimum (first array item) and maximum value (second array item) of the colorscale
annotations_text: (array) an array of arrays, with each value being a string annotation to the corresponding value in ‘data’
annotations_text_size: (int) size of annotation text
annotations_color: (str/array) color of annotation text - accepts similar formats as other color variables

Returns: A Plotly heatmap plot Figure object.
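
The explicit colorscale format described above is easy to get wrong (each entry must be its own [position, color] pair), so a small validation sketch may help. check_colorscale is a hypothetical helper, not a matminer or Plotly function. The heatmap trace dict uses Plotly's zmin/zmax attributes for the color range; whether heatmap_plot forwards colorscale_range through these attributes is an assumption.

```python
def check_colorscale(colorscale):
    """Validate an explicit colorscale given as [[position, color], ...]."""
    if len(colorscale) < 2:
        raise ValueError("need at least mappings for 0 and 1")
    positions = []
    for entry in colorscale:
        if len(entry) != 2:
            raise ValueError("each entry must be [position, color]: %r" % (entry,))
        position, color = entry
        if not isinstance(color, str):
            raise ValueError("color must be a string: %r" % (color,))
        positions.append(position)
    if positions[0] != 0 or positions[-1] != 1:
        raise ValueError("first position must be 0 and last must be 1")
    if positions != sorted(positions):
        raise ValueError("positions must be in increasing order")
    return colorscale


blue_to_red = check_colorscale([[0, "rgb(0,0,255)"], [1, "rgb(255,0,0)"]])

# A heatmap trace as a plain dict; zmin/zmax are Plotly's names for the
# limits of the color range.
trace = {
    "type": "heatmap",
    "z": [[1, 2], [3, 4]],
    "x": ["a", "b"],
    "y": ["row 1", "row 2"],
    "colorscale": blue_to_red,
    "zmin": 0,
    "zmax": 5,
}
print(trace["colorscale"])
```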

histogram(data=None, cols=None, orientation='vertical', histnorm='count', n_bins=None, start=None, end=None, size=None, colors=None, bargap=0)

Creates a Plotly histogram. If multiple series of data are available, will create an overlaid histogram.

For n_bins, start, end, size, colors, and bargap, all defaults are Plotly defaults.

Args:
data (DataFrame or list): A dataframe containing at least one numerical column. Also accepts lists of numerical values. If None, uses the dataframe passed into the constructor.
cols ([str]): A list of strings specifying the columns of the dataframe to use. Each column will be represented with its own histogram in the overlay.
orientation (str): Determines whether histogram is oriented horizontally or vertically. Use “vertical” or “horizontal”.
histnorm: The technique for creating the plot. Can be “probability density”, “probability”, “density”, or “” (count).
n_bins (int): The number of bins to include on each plot.
start (float or list): The list of starting points for each histogram’s bins (if overlaid). If only one series of data is present or all series should have the same value, a single float/int determines the starting point.
end (float or list): The list of ending points for each histogram’s bins (if overlaid). If only one series of data is present or all series should have the same value, a single float/int determines the ending point.
size (float or list): The list of sizes of each histogram’s bins (if overlaid). If only one series of data is present or all series should have the same value, a single float/int determines the size of the bins.
colors (str or list): The list of colors for each histogram (if overlaid). If only one series of data is present or all series should have the same value, a single str determines the color of the bins.
bargap (float or list): The gaps between bars for all histograms shown.

Returns:
Plotly histogram figure.
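
The "single value or one value per series" convention used by start, end, size and colors can be sketched as follows. The helpers broadcast and histogram_traces are hypothetical and only illustrate the convention; the trace keys (type, x, xbins with start/end/size, marker color) and the 'overlay' barmode are standard Plotly attributes, but this is not matminer's implementation.

```python
def broadcast(value, n):
    """Repeat a scalar n times, or check that a list has n entries."""
    if isinstance(value, (list, tuple)):
        if len(value) != n:
            raise ValueError("expected %d values, got %d" % (n, len(value)))
        return list(value)
    return [value] * n


def histogram_traces(series, start=None, end=None, size=None, colors=None):
    """Build one Plotly-style histogram trace dict per data series."""
    n = len(series)
    starts, ends, sizes = broadcast(start, n), broadcast(end, n), broadcast(size, n)
    color_list = broadcast(colors, n)
    traces = []
    for values, s, e, w, c in zip(series, starts, ends, sizes, color_list):
        trace = {"type": "histogram", "x": list(values)}
        # Only pass bin settings that were given; the rest stay Plotly defaults.
        xbins = {k: v for k, v in (("start", s), ("end", e), ("size", w)) if v is not None}
        if xbins:
            trace["xbins"] = xbins
        if c is not None:
            trace["marker"] = {"color": c}
        traces.append(trace)
    return traces


traces = histogram_traces(
    [[1, 2, 2, 3], [2, 3, 3, 4]],
    start=0, end=[4, 5], size=1, colors=["blue", "red"],
)
layout = {"barmode": "overlay"}  # overlaid histograms share one set of axes
print(traces[1]["xbins"])
```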

scatter_matrix(data=None, cols=None, colbar=None, marker=None, text=None, **kwargs)

Create a scatter matrix plot from dataframes using Plotly.

Args:
data (DataFrame or list): A dataframe containing at least one numerical column. Also accepts lists of numerical values. If None, uses the dataframe passed into the constructor.
cols ([str]): A list of strings specifying the columns of the dataframe to use.
colbar: (str) name of the column used for colorbar
marker (dict): if size is set, it will override the automatic size
text (see PlotlyFig.xy_plot documentation):
**kwargs: keyword arguments of scatterplot. Forbidden args are ‘size’, ‘color’ and ‘colorscale’ in ‘marker’. See example below.

Returns: a Plotly scatter matrix plot

Example for more control over markers:

    from matminer.figrecipes.plotly.make_plots import PlotlyFig
    from matminer.datasets.dataframe_loader import load_elastic_tensor

    df = load_elastic_tensor()
    pf = PlotlyFig()
    # 'colbar' is the keyword name given in the signature above.
    pf.scatter_matrix(
        df[['volume', 'G_VRH', 'K_VRH', 'poisson_ratio']],
        colbar='poisson_ratio',
        text=df['material_id'],
        marker={'symbol': 'diamond', 'size': 8,
                'line': {'width': 1, 'color': 'black'}},
        colormap='Viridis',
        title='Elastic Properties Scatter Matrix')

violin(data=None, cols=None, group_col=None, groups=None, title=None, colors=None, use_colorscale=False)

Create a violin plot using Plotly.

Args:
data: (DataFrame or list) A dataframe containing at least one numerical column. Also accepts lists of numerical values. If None, uses the dataframe passed into the constructor.
cols: ([str]) The labels for the columns of the dataframe to be included in the plot. Not used if data is passed in as list.
group_col: (str) Name of the column containing the group for each row, if it exists. Used only if there is one entry in cols.
groups: ([str]) All group names to be included in the violin plot. Used only if there is one entry in cols.
title: (str) Title of the violin plot
colors: (str/tuple/list/dict) either a plotly scale name (Greys, YlGnBu, Greens, etc.), an rgb or hex color, a color tuple, a list/dict of colors. An rgb color is of the form ‘rgb(x, y, z)’ where x, y and z belong to the interval [0, 255] and a color tuple is a tuple of the form (a, b, c) where a, b and c belong to [0, 1]. If colors is a list, it must contain valid color types as its members. If colors is a dictionary, its keys must represent group names, and corresponding values must be valid color types (str).
use_colorscale: (bool) Only applicable if grouping by another variable. Will implement a colorscale based on the first 2 colors of param colors. This means colors must be a list with at least 2 colors in it (Plotly colorscales are accepted since they map to a list of two rgb colors)

Returns: A Plotly violin plot Figure object.
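
The two numeric color forms described for colors differ only in scale: a color tuple has components in [0, 1], while an rgb string has components in [0, 255]. The sketch below converts one to the other and pairs group names with colors. tuple_to_rgb and colors_for_groups are hypothetical helpers, and the rule that a list must supply at least one color per group is an assumption of this sketch, not a documented requirement.

```python
def tuple_to_rgb(color):
    """Convert a tuple with components in [0, 1] to an 'rgb(x, y, z)' string."""
    if len(color) != 3 or not all(0 <= c <= 1 for c in color):
        raise ValueError("expected three components in [0, 1]: %r" % (color,))
    x, y, z = (int(round(c * 255)) for c in color)
    return "rgb(%d, %d, %d)" % (x, y, z)


def colors_for_groups(groups, colors):
    """Map each group name to a color from a list or dict of colors."""
    if isinstance(colors, dict):
        # Dict keys must be group names, as stated in the argument description.
        missing = [g for g in groups if g not in colors]
        if missing:
            raise ValueError("no color given for groups: %s" % missing)
        return {g: colors[g] for g in groups}
    if len(colors) < len(groups):
        raise ValueError("need at least one color per group")
    return dict(zip(groups, colors))


print(tuple_to_rgb((1.0, 0.0, 0.2)))
print(colors_for_groups(["cubic", "hexagonal"], ["rgb(0, 0, 255)", "#ff0000"]))
```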

xy(xy_pairs, colorbar=None, labels=None, names=None, modes='markers', markers=None, lines=None, colorscale='Viridis', showlegends=None)

Make an XY scatter plot, either using arrays of values, or a dataframe.

Args:
xy_pairs (tuple or [tuple]): x & y columns of scatter plots, with possibly different lengths, are extracted from this arg.
  example 1: ([1, 2], [3, 4])
  example 2: [(df['x1'], df['y1']), (df['x2'], df['y2'])]
  example 3: [('x1', 'y1'), ('x2', 'y2')]
colorbar (list or np.ndarray or pd.Series): set the colorscale for the colorbar (list of numbers)
labels (list or [list]): annotations for individual scatter points; either the same for all traces, or set separately for each trace
names (str or [str]): list of trace names used for legend. By default, the column name (or trace if NA) is used if a pd.Series is passed.
modes (str or [str]): trace style; can be ‘markers’/‘lines’/‘lines+markers’
markers (dict or [dict]): gives the ability to fine-tune the marker of each scatter plot individually if a list of dicts is passed
lines (dict or [dict]): similar to markers, though only if mode=='lines'
colorscale: (str) Sets the colorscale (colormap). It can be an array containing arrays mapping a normalized value to an rgb, rgba, hex, hsl, hsv, or named color string. At minimum, a mapping for the lowest (0) and highest (1) values is required. Example: [[0, 'rgb(0,0,255)'], [1, 'rgb(255,0,0)']]. Alternatively, it may be a palette name from the following list: Greys, YlGnBu, Greens, YlOrRd, Bluered, RdBu, Reds, Blues, Jet, Picnic, Rainbow, Portland, Hot, Blackbody, Earth, Electric, Viridis
showlegends (bool or [bool]): indicates whether to show the legend for each trace (or simply turns it on/off for all traces if not a list)

Returns: A Plotly Scatter plot Figure object.
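
The accepted shapes of xy_pairs (a single tuple, or a list of tuples) and the "one value or one per trace" convention of modes, markers and showlegends can be sketched like this. normalize_xy_pairs and per_trace are hypothetical helpers; treating a bare tuple as a single (x, y) pair follows example 1 above but is otherwise an assumption about how xy() interprets its input.

```python
def normalize_xy_pairs(xy_pairs):
    """Return a list of (x, y) pairs from a single tuple or a list of tuples."""
    if isinstance(xy_pairs, tuple):
        # A bare tuple is taken to be one (x, y) pair, as in example 1.
        xy_pairs = [xy_pairs]
    pairs = []
    for pair in xy_pairs:
        if len(pair) != 2:
            raise ValueError("each entry must be an (x, y) pair: %r" % (pair,))
        pairs.append((pair[0], pair[1]))
    return pairs


def per_trace(value, n_traces):
    """Repeat a single setting for every trace, or check a per-trace list."""
    if isinstance(value, list):
        if len(value) != n_traces:
            raise ValueError("expected %d entries, got %d" % (n_traces, len(value)))
        return value
    return [value] * n_traces


# Two traces of different lengths, sharing one mode.
pairs = normalize_xy_pairs([([1, 2], [3, 4]), ([1, 2, 3], [9, 8, 7])])
modes = per_trace("markers", len(pairs))
traces = [
    {"type": "scatter", "x": x, "y": y, "mode": mode}
    for (x, y), mode in zip(pairs, modes)
]
print(traces)
```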

xy_plot(x_col, y_col, text=None, color='rgba(70, 130, 180, 1)', size=6, colorscale='Viridis', legend=None, showlegend=False, mode='markers', marker='circle', marker_fill='fill', hoverinfo='x+y+text', add_xy_plot=None, marker_outline_width=0, marker_outline_color='black', linedash='solid', linewidth=2, lineshape='linear', error_type=None, error_direction=None, error_array=None, error_value=None, error_symmetric=True, error_arrayminus=None, error_valueminus=None)

Make an XY scatter plot, either using arrays of values, or a dataframe.

Args:
x_col: (array) x-axis values, which can be a list/array/dataframe column
y_col: (array) y-axis values, which can be a list/array/dataframe column
text: (str/array) text to use when hovering over points; a single string, or an array of strings, or a dataframe column containing text strings
color: (str/array) in the format of (i) a color name (eg: “red”), or (ii) an RGBA string (eg: “rgba(255, 0, 0, 0.8)”), where the last number represents the marker opacity/transparency, which must be between 0.0 and 1.0, (iii) a hexadecimal code (eg: “FFBAD2”), or (iv) the name of a dataframe numeric column to set the marker color scale to
size: (int/array) marker size in the format of (i) a constant integer size, or (ii) the name of a dataframe numeric column to set the marker size scale to. In the latter case, scaled Z-scores are used.
colorscale: (str) Sets the colorscale. The colorscale must be an array containing arrays mapping a normalized value to an rgb, rgba, hex, hsl, hsv, or named color string. At minimum, a mapping for the lowest (0) and highest (1) values is required. For example, [[0, 'rgb(0,0,255)'], [1, 'rgb(255,0,0)']]. Alternatively, colorscale may be a palette name string from the following list: Greys, YlGnBu, Greens, YlOrRd, Bluered, RdBu, Reds, Blues, Picnic, Rainbow, Portland, Jet, Hot, Blackbody, Earth, Electric, Viridis
legend: (str) plot legend
mode: (str) marker style; can be ‘markers’/‘lines’/‘lines+markers’
marker: (str) Shape of marker symbol. For all options, please see
marker_fill: (str) Shape fill of marker symbol. Options are “fill”/“open”/“dot”/“open-dot”
hoverinfo: (str) Any combination of “x”, “y”, “z”, “text”, “name” joined with a “+” OR “all” or “none” or “skip”. Examples: “x”, “y”, “x+y”, “x+y+z”, “all”. Default: “x+y+text”. Determines which trace information appears on hover. If none or skip are set, no information is displayed upon hovering. But, if none is set, click and hover events are still fired.
showlegend: (bool) show legend or not
add_xy_plot: (list) of dictionaries, each of which contains additional data to add to the xy plot. Keys are names of arguments to the original xy_plot method - required keys are ‘x_col’, ‘y_col’, ‘text’, ‘mode’, ‘name’, ‘color’, ‘size’. Values are corresponding argument values in the same format as for the original xy_plot. Use None for values not to be set, else a KeyError will be raised. Optional keys are ‘marker’ and ‘marker_fill’ (same format as root keys)
marker_outline_width: (int) thickness of marker outline
marker_outline_color: (str/array) color of marker outline - accepts similar formats as other color variables
linedash: (str) sets the dash style of a line. Options are ‘solid’/‘dash’
linewidth: (int) sets the line width (in px)
lineshape: (str) determines the line shape. With “spline” the lines are drawn using spline interpolation
error_type: (str) Determines the rule used to generate the error bars. Options are:
  1. “data”: bar lengths are set in the variables ‘error_array’/‘error_arrayminus’,
  2. “percent”: bar lengths correspond to a percentage of underlying data. Set this percentage in the variables ‘error_value’/‘error_valueminus’,
  3. “constant”: bar lengths are of a constant value. Set this constant in the variables ‘error_value’/‘error_valueminus’.
error_direction: (str) direction of error bar, “x”/“y”
error_array: (list/array/series) Sets the data corresponding to the length of each error bar. Values are plotted relative to the underlying data.
error_value: (float) Sets the value of either the percentage (if error_type is set to “percent”) or the constant (if error_type is set to “constant”) corresponding to the lengths of the error bars.
error_symmetric: (bool) Determines whether or not the error bars have the same length in both directions (top/bottom for vertical bars, left/right for horizontal bars)
error_arrayminus: (list/array/series) Sets the data corresponding to the length of each error bar in the bottom (left) direction for vertical (horizontal) bars. Values are plotted relative to the underlying data.
error_valueminus: (float) Sets the value of either the percentage (if error_type is set to “percent”) or the constant (if error_type is set to “constant”) corresponding to the lengths of the error bars in the bottom (left) direction for vertical (horizontal) bars

Returns: A Plotly Scatter plot Figure object.
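
The interplay of the error_* arguments can be sketched as a function that assembles a Plotly-style error bar dictionary. error_bar_dict and attach_error_bars are hypothetical helpers written for this illustration. The dictionary keys (type, array, arrayminus, value, valueminus, symmetric, visible) and the trace keys error_x/error_y are standard Plotly attributes, but the assumption that xy_plot maps its arguments onto them one to one is not confirmed by this page.

```python
def error_bar_dict(error_type, error_array=None, error_value=None,
                   error_symmetric=True, error_arrayminus=None,
                   error_valueminus=None):
    """Build a Plotly-style error bar dict from xy_plot-like arguments."""
    if error_type not in ("data", "percent", "constant"):
        raise ValueError("error_type must be 'data', 'percent' or 'constant'")
    bars = {"type": error_type, "symmetric": error_symmetric, "visible": True}
    if error_type == "data":
        # 'data' bars take their lengths from arrays.
        if error_array is None:
            raise ValueError("'data' error bars need error_array")
        bars["array"] = list(error_array)
        if not error_symmetric and error_arrayminus is not None:
            bars["arrayminus"] = list(error_arrayminus)
    else:
        # 'percent' and 'constant' bars take their lengths from single values.
        if error_value is None:
            raise ValueError("%r error bars need error_value" % error_type)
        bars["value"] = error_value
        if not error_symmetric and error_valueminus is not None:
            bars["valueminus"] = error_valueminus
    return bars


def attach_error_bars(trace, error_direction, bars):
    """Store the error bar dict under 'error_x' or 'error_y' of a trace copy."""
    if error_direction not in ("x", "y"):
        raise ValueError("error_direction must be 'x' or 'y'")
    trace = dict(trace)
    trace["error_" + error_direction] = bars
    return trace


trace = {"type": "scatter", "mode": "markers", "x": [1, 2, 3], "y": [2.0, 4.1, 5.9]}
trace = attach_error_bars(
    trace, "y",
    error_bar_dict("data", error_array=[0.2, 0.3, 0.1],
                   error_symmetric=False, error_arrayminus=[0.1, 0.1, 0.2]),
)
print(trace["error_y"])
```

In this sketch the "minus" values are only used when error_symmetric is False, which mirrors the description of error_symmetric above.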

Module contents