graphpkg.static package
Module contents
- graphpkg.static.grid_classification_boundary(models_list: list, data: Optional[numpy.ndarray] = None, size: int = 4, n_plot_cols: int = 3, figsize: tuple = (5, 5), canvas_details: int = 50, canvas_opacity: float = 0.4, canvas_palette='coolwarm') -> None
Plot a grid of classification boundaries, one plot per ML model.
Only models that return 1D predictions are supported.
- Parameters
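models_list (list) – List of dictionaries, one per model, each with a "name" key (a label for the model) and a "function" key (the model's prediction function).
data (np.ndarray, optional) – Source data, restricted to 3 columns in total: 2 features and 1 target. Defaults to None.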
size (int, optional) – Size of canvas. Defaults to 4.
n_plot_cols (int, optional) – number of plot columns. Defaults to 3.
figsize (tuple, optional) – figure size. Defaults to (5, 5).
canvas_details (int, optional) – How detailed the boundary canvas should be. Defaults to 50.
canvas_opacity (float, optional) – Canvas transparency parameter. Defaults to 0.4.
canvas_palette (str, optional) – palette from matplotlib. Defaults to coolwarm.
- Raises
ValueError – If data does not have exactly 3 columns (2 features and 1 target).
Examples
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.tree import DecisionTreeClassifier
>>> from sklearn.datasets import make_classification
>>> from graphpkg.static import grid_classification_boundary
>>> X, y = make_classification(n_samples=500, n_features=2, random_state=25,
...                            n_informative=1, n_classes=2, n_clusters_per_class=1,
...                            n_repeated=0, n_redundant=0)
>>> lr_model = LogisticRegression().fit(X, y)
>>> dt_model = DecisionTreeClassifier().fit(X, y)
>>> models_list = [{
...     "name": "Logistic Regression Classifier",
...     "function": lr_model.predict
... }, {
...     "name": "Decision Tree Classifier",
...     "function": dt_model.predict
... }]
>>> grid_classification_boundary(models_list=models_list,
...                              data=np.hstack((X, y.reshape(-1, 1))),
...                              figsize=(7, 5), canvas_details=100)
>>> plt.show()
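The size and canvas_details parameters suggest that each boundary is drawn by evaluating the model on a regular grid of points. The sketch below shows one plausible way to build such a grid with NumPy. It is an assumption about the general approach, not graphpkg's actual implementation, and make_canvas and toy_predict are hypothetical names introduced only for illustration.

```python
import numpy as np


def make_canvas(size=4, canvas_details=50):
    """Return a square grid covering [-size, size] on both axes (assumed layout)."""
    axis = np.linspace(-size, size, canvas_details)
    xx, yy = np.meshgrid(axis, axis)
    # One row per grid point, two feature columns.
    points = np.column_stack((xx.ravel(), yy.ravel()))
    return xx, yy, points


def toy_predict(points):
    """Hypothetical stand-in for model.predict: class 1 where x0 + x1 > 0."""
    return (points[:, 0] + points[:, 1] > 0).astype(int)


xx, yy, points = make_canvas(size=4, canvas_details=50)
# The 1D predictions are reshaped back onto the grid for plotting.
labels = toy_predict(points).reshape(xx.shape)
```

A grid like labels could then be drawn with matplotlib's contourf, for example with alpha set to the canvas opacity and cmap set to the canvas palette. This also illustrates why the prediction function has to return a 1D array: it must reshape cleanly onto the grid.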
- graphpkg.static.multi_distplots(df: pandas.core.frame.DataFrame, n_cols: int = 4, bins: int = 20, kde: bool = True, class_col: Optional[str] = None, legend: bool = True, legend_loc: str = 'best', figsize: Optional[tuple] = None, palette: str = 'dark', grid_flag: bool = True, xticks_rotation: int = 60) -> None
Multiple distribution plots from a pandas dataframe.
Seaborn’s histplot is used for distribution with additional functionality to have multiple distributions in one grid.
- Parameters
df (pd.DataFrame) – Input dataframe.
n_cols (int, optional) – Number of columns in the grid. Defaults to 4.
bins (int, optional) – number of bins in distribution. Defaults to 20.
kde (bool, optional) – Whether to draw a kernel density estimate (KDE) curve over each distribution. Defaults to True.
class_col (str, optional) – class column name for distribution separation and legend. Defaults to None.
legend (bool, optional) – put legend or not. Defaults to True.
legend_loc (str, optional) – where to put legend, takes inputs similar to matplotlib.pyplot. Defaults to ‘best’.
figsize (tuple, optional) – figure size, similar to matplotlib.pyplot. Defaults to None.
palette (str, optional) – color palette, property from seaborn. Defaults to ‘dark’.
grid_flag (bool, optional) – put grid or not. Defaults to True.
xticks_rotation (int, optional) – xticks rotation angle. Defaults to 60.
Examples
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sklearn.datasets import fetch_california_housing
>>> from graphpkg.static import multi_distplots
>>> dataset = fetch_california_housing()
>>> df = pd.DataFrame(dataset.data, columns=dataset.feature_names)
>>> df['target'] = dataset.target
>>> multi_distplots(df, n_cols=2)
>>> plt.show()
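Since n_cols fixes the number of columns in the grid, the number of rows follows from how many dataframe columns are plotted. A minimal sketch of that arithmetic, assuming one subplot per plotted column; grid_shape is a hypothetical helper and not part of graphpkg.

```python
import math


def grid_shape(n_plots, n_cols=4):
    """Rows and columns needed to fit n_plots subplots, at most n_cols per row."""
    if n_plots < 1 or n_cols < 1:
        raise ValueError("n_plots and n_cols must be positive")
    n_cols = min(n_cols, n_plots)  # avoid empty columns when there are few plots
    n_rows = math.ceil(n_plots / n_cols)
    return n_rows, n_cols


# California housing: 8 features + 1 target = 9 columns, plotted 2 per row.
print(grid_shape(9, n_cols=2))  # (5, 2)
```

With the default n_cols=4, the same 9 columns would need 3 rows, the last of which is only partly filled.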
- graphpkg.static.plot_boxed_timeseries(df: pandas.core.frame.DataFrame, ts_col: str, data_col: str, box: Optional[str] = 'MONTH', figsize: Optional[tuple] = None)
Plot timeseries data integrated with boxplot to see window based data variation.
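- Parameters
df (pd.DataFrame) – Input dataframe.
ts_col (str) – Name of the timestamp column.
data_col (str) – Name of the data column.
box (str, optional) – Time window used for each box, for example ‘MONTH’ or ‘hour’. Defaults to ‘MONTH’.
figsize (tuple, optional) – Figure size. Defaults to None.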
- Returns
Matplotlib figure and axes.
- Return type
Figure, Axes
Examples
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import pandas as pd
>>> from graphpkg.static import plot_boxed_timeseries
>>> size = 1000
>>> df = pd.DataFrame({
...     "data": np.random.normal(size=(size,)) * 100,
...     "timestamps": pd.date_range(start='1/1/2018', periods=size, freq='min')
... })
>>> fig, ax = plot_boxed_timeseries(df, data_col='data', ts_col='timestamps',
...                                 box='hour', figsize=(10, 5))
>>> plt.tight_layout()
>>> plt.show()
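The idea of window-based variation can be illustrated without graphpkg. The sketch below groups one-minute samples by the calendar hour they fall in and computes the quartiles a box plot would summarize for each window. It assumes that box='hour' means grouping by calendar hour, which the source does not spell out, and it uses only the standard library.

```python
import random
import statistics
from collections import defaultdict
from datetime import datetime, timedelta

random.seed(0)
start = datetime(2018, 1, 1)
# 180 one-minute samples, i.e. three full hours of synthetic data.
samples = [(start + timedelta(minutes=i), random.gauss(0, 100)) for i in range(180)]

# Group values by the hour they fall in (the assumed "box" window).
windows = defaultdict(list)
for ts, value in samples:
    windows[ts.replace(minute=0, second=0, microsecond=0)].append(value)

# Each window yields the three quartiles that define the box of a box plot.
for window_start, values in sorted(windows.items()):
    q1, median, q3 = statistics.quantiles(values, n=4)
    print(window_start, round(q1, 1), round(median, 1), round(q3, 1))
```

Each printed row corresponds to one box: the spread between the first and third quartile shows how much the data varies within that window.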
- graphpkg.static.plot_classification_boundary(func: Callable, data: Optional[numpy.ndarray] = None, size: int = 4, n_plot_cols: int = 1, figsize: tuple = (5, 5), canvas_details: int = 50, canvas_opacity: float = 0.5, canvas_palette: str = 'coolwarm')
Plot classification model’s decision boundary.
- Parameters
func (function) – Prediction function of the ML model, for example model.predict.
data (np.ndarray, optional) – source data. restricted to 2 features and 1 target, in total 3 columns. Defaults to None.
size (int, optional) – size of canvas. Defaults to 4.
n_plot_cols (int, optional) – number of plot columns. Defaults to 1.
figsize (tuple, optional) – matplotlib figure size. Defaults to (5, 5).
canvas_details (int, optional) – how detailed the boundary should be. Defaults to 50.
canvas_opacity (float, optional) – Canvas transparency parameter. Defaults to 0.5.
canvas_palette (str, optional) – palette of canvas. Defaults to ‘coolwarm’.
- Raises
ValueError – If the input data’s shape is not (k,3), k=number of rows.
Examples
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.datasets import make_classification
>>> from graphpkg.static import plot_classification_boundary
>>> X, y = make_classification(n_samples=500, n_features=2, random_state=25,
...                            n_informative=1, n_classes=2, n_clusters_per_class=1,
...                            n_repeated=0, n_redundant=0)
>>> model = LogisticRegression().fit(X, y)
>>> plot_classification_boundary(func=model.predict,
...                              data=np.hstack((X, y.reshape(-1, 1))),
...                              canvas_details=100)
>>> plt.show()
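The data argument must have shape (k, 3): two feature columns followed by one target column. The sketch below shows how such an array can be assembled and checked before plotting. check_boundary_data is a hypothetical validator written for illustration; it mirrors the documented ValueError but is not graphpkg's own code.

```python
import numpy as np


def check_boundary_data(data):
    """Hypothetical validator: require a 2D array with exactly 3 columns."""
    data = np.asarray(data)
    if data.ndim != 2 or data.shape[1] != 3:
        raise ValueError(f"expected shape (k, 3), got {data.shape}")
    return data


X = np.random.default_rng(0).normal(size=(10, 2))  # 2 feature columns
y = (X[:, 0] > 0).astype(int)                      # 1 target column
# y is 1D, so it is reshaped to a column before stacking next to X.
data = check_boundary_data(np.hstack((X, y.reshape(-1, 1))))
```

Passing X alone, which has only two columns, would raise ValueError in this sketch.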
- graphpkg.static.plot_distribution(x: numpy.ndarray, kde: Optional[bool] = True, indicate_data: Optional[Union[list, numpy.ndarray]] = None, figsize: Optional[tuple] = None) -> None
Plot a distribution with additional information.
Combines a distribution plot and a box plot, drawn with matplotlib and seaborn.
- Parameters
x (np.ndarray) – input 1D array.
kde (Optional[bool], optional) – kde parameter from seaborn. Defaults to True.
indicate_data (Optional[Union[list, np.ndarray]], optional) – data points to observe/indicate in plot. Defaults to None.
figsize (Optional[tuple], optional) – figure size from matplotlib. Defaults to None.
- Raises
AssertionError – only 1d arrays are allowed for input.
Examples
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from graphpkg.static import plot_distribution
>>> x = np.random.normal(size=(200,))
>>> plot_distribution(x, indicate_data=[0.6])
>>> plt.show()
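Because only 1D arrays are accepted, a column vector of shape (n, 1) has to be flattened before it is passed as x. The sketch below shows that step with NumPy, along with one possible way to choose values for indicate_data. Using the 5th and 95th percentiles is purely an illustrative assumption, not something the library prescribes.

```python
import numpy as np

column = np.random.default_rng(0).normal(size=(200, 1))  # 2D column vector
x = column.ravel()                                        # flattened to shape (200,)

# Points to highlight: here the 5th and 95th percentiles of the sample.
indicate_data = np.percentile(x, [5, 95]).tolist()
```

Both x and indicate_data now match the documented parameter types: a 1D array and a plain list of values.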