Matplotlib for presenting results (plotting figures for papers: a matplotlib Jupyter notebook)

2023-12-12 07:58


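Reference: the official Jupyter notebook.

Part 1 covers matplotlib's built-in styles, which produce figures with different background looks.

Part 2 covers heatmaps. A heatmap lets you compare your own results with other models (colour intensity represents the size of the difference).

Part 3 uses TSNE to project high-dimensional data onto a 2D plot (I did not understand this part).

Part 4 draws stacked bar charts.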


Learning curves

Make matplotlib graphics show up inline.

In [8]:
%matplotlib inline

Import matplotlib. If you want to generate images without having a window appear (if you run your scripts on servers), use a non-interactive backend such as Agg (for PNGs), PDF, SVG or PS. To do so, uncomment the second line in the following cell.

In [9]:
import matplotlib
#matplotlib.use('Agg')
import matplotlib.pyplot as plt
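
As a rough sketch of the server-side setup described above, assuming a plain script rather than a notebook and a hypothetical output file name `example.png`, the backend is selected before pyplot is imported and the figure is written straight to disk:

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend; select it before importing pyplot
import matplotlib.pyplot as plt

# hypothetical output location, used only for this illustration
out_path = os.path.join(tempfile.mkdtemp(), 'example.png')

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [0.2, 0.5, 0.6])
fig.savefig(out_path)  # writes the PNG without opening a window
plt.close(fig)         # free the figure once it is saved

backend_name = matplotlib.get_backend().lower()
```

Closing each figure after saving keeps memory use flat when a script produces many plots.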

Matplotlib has different styles. Run the cell to check which styles are available.

In [10]:
plt.style.available
Out[10]:
[u'seaborn-darkgrid',u'Solarize_Light2',u'seaborn-notebook',u'classic',u'seaborn-ticks',u'grayscale',u'bmh',u'seaborn-talk',u'dark_background',u'ggplot',u'fivethirtyeight',u'_classic_test',u'seaborn-colorblind',u'seaborn-deep',u'seaborn-whitegrid',u'seaborn-bright',u'seaborn-poster',u'seaborn-muted',u'seaborn-paper',u'seaborn-white',u'fast',u'seaborn-pastel',u'seaborn-dark',u'seaborn',u'seaborn-dark-palette']

To set the style you want, use style.use.

In [11]:
matplotlib.style.use('seaborn-darkgrid')
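
Style names depend on the installed matplotlib version: from matplotlib 3.6 on, the seaborn styles are listed with a `seaborn-v0_8-` prefix, so `'seaborn-darkgrid'` may not be found on newer installs. A small sketch of a fallback, where the helper name `use_first_available` is hypothetical and not part of matplotlib:

```python
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

def use_first_available(candidates):
    """Apply the first style in `candidates` that this matplotlib lists; return its name."""
    for name in candidates:
        if name in plt.style.available:
            plt.style.use(name)
            return name
    # nothing matched: fall back to matplotlib's default style
    plt.style.use('default')
    return 'default'

chosen = use_first_available(['seaborn-darkgrid', 'seaborn-v0_8-darkgrid', 'ggplot'])
```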

plot_learning_curves plots the train, dev and test curves of a measure over time. Play around with the parameters to get the figure that suits you best. flist is a list of size 3: the first element is the list of train scores, the second the list of dev scores and the third the list of test scores.

In [12]:
def plot_learning_curves(fig_path, n_epochs, flist, style=''):
    measure = 'f1'
    steps_measure = 'epochs'
    plt.figure(dpi=400)
    plt.rcParams['font.size'] = 10
    plt.rcParams['axes.labelsize'] = 12
    plt.rcParams['axes.labelweight'] = 'bold'
    plt.rcParams['axes.titlesize'] = 12
    plt.rcParams['xtick.labelsize'] = 10
    plt.rcParams['ytick.labelsize'] = 10
    plt.rcParams['legend.fontsize'] = 10
    plt.rcParams['figure.titlesize'] = 12
    steps = range(1, n_epochs+1)
    plt.title('learning curves' + style)
    # flist = [train scores, dev scores, test scores]
    plt.plot(steps, flist[0], linewidth=1, color='#6699ff', linestyle='-', marker='o',
             markeredgecolor='black', markeredgewidth=0.5, label='train')
    plt.plot(steps, flist[1], linewidth=3, color='#ff4d4d', linestyle='-', marker='D',
             markeredgecolor='black', markeredgewidth=0.5, label='dev')
    plt.plot(steps, flist[2], linewidth=2, color='#ffcc66', linestyle='-', marker='s',
             markeredgecolor='black', markeredgewidth=0.5, label='test')
    plt.xlabel(steps_measure)
    plt.xticks(steps)
    plt.ylabel(measure)
    plt.legend(loc='best', numpoints=1, fancybox=True)
    # save before showing: after plt.show() the current figure may already be empty
    plt.savefig(fig_path)
    plt.show()
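
The function writes to whatever `fig_path` it is given, and saving a figure does not create missing folders, so a path such as `figs/learning_curve.png` needs `figs/` to exist first. A minimal sketch, assuming Python 3 and using a temporary directory as a stand-in for your project folder:

```python
import os
import tempfile

base_dir = tempfile.mkdtemp()            # stand-in for your project directory
fig_dir = os.path.join(base_dir, 'figs')
os.makedirs(fig_dir, exist_ok=True)      # no error if the directory already exists
fig_path = os.path.join(fig_dir, 'learning_curve.png')
```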

Let's make up an example to illustrate one figure with learning curves.

In [16]:
fig_path = 'figs/'
train_f1 = [0.2, 0.3, 0.4, 0.5, 0.6, 0.62, 0.65, 0.67, 0.68, 0.67, 0.69, 0.72, 0.721, 0.719, 0.72]
dev_f1 = [0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.57, 0.59, 0.6, 0.62, 0.61, 0.615, 0.614, 0.6159, 0.62]
test_f1 = [0.05, 0.15, 0.2, 0.25, 0.3, 0.4, 0.45, 0.43, 0.5, 0.51, 0.55, 0.52, 0.53, 0.525, 0.531]
flist = [train_f1, dev_f1, test_f1]
n_epochs = 15
plot_learning_curves(fig_path + 'learning_curve.png', n_epochs , flist)
<matplotlib.figure.Figure at 0x1074e1e50>

And the same figure with all available styles.

In [17]:
for st in plt.style.available:
    matplotlib.style.use(st)
    plot_learning_curves(fig_path + 'learning_curve.png', n_epochs , flist, '--' + st)
[Output: 25 learning-curve figures, one per available style]

In your scripts, evaluate your model on the train, dev and test data after every epoch (or after every 1K iterations) and append the corresponding scores to the three lists.
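
A sketch of how that bookkeeping might look inside a training script. The `evaluate` function here is a hypothetical stand-in that returns made-up numbers; in a real script it would run your model on the given split:

```python
import random

def evaluate(split, epoch):
    """Hypothetical stand-in for a real evaluation routine; returns a fake F1 score."""
    ceiling = {'train': 0.75, 'dev': 0.65, 'test': 0.55}[split]
    return min(ceiling, 0.05 * epoch + random.uniform(0.0, 0.02))

n_epochs = 15
train_f1, dev_f1, test_f1 = [], [], []
for epoch in range(1, n_epochs + 1):
    # ... one epoch of training would go here ...
    train_f1.append(evaluate('train', epoch))
    dev_f1.append(evaluate('dev', epoch))
    test_f1.append(evaluate('test', epoch))

# same order as plot_learning_curves expects: train, dev, test
flist = [train_f1, dev_f1, test_f1]
```

The resulting `flist` and `n_epochs` can be passed to plot_learning_curves unchanged.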

Heatmaps

Visualize LSTM outputs

Now import prettyplotlib.

In [18]:
import prettyplotlib as ppl

visu_lstm_outputs illustrates a heatmap of LSTM outputs. More examples can be found here: https://github.com/olgabot/prettyplotlib/wiki/Examples-with-code.

In [19]:
def visu_lstm_outputs(fig_path, LSTM_outputs, sentence_tokenized):
    plt.rcParams['xtick.labelsize'] = 20
    plt.rcParams['ytick.labelsize'] = 20
    slen = len(sentence_tokenized)
    fig, ax = ppl.subplots(1)
    fig.set_figheight(15)
    fig.set_figwidth(20)
    ppl.pcolormesh(fig, ax, LSTM_outputs)
    ax.set_xticks(np.arange(0.5, slen + 0.5, 1))
    ax.set_xticklabels(sentence_tokenized)
    ax.set_title('lstm outputs', fontsize = 25)
    plt.show()
    fig.savefig(fig_path)

Let's see how it works for a randomly generated example.

In [20]:
import numpy as np

sentence = 'today is a beautiful day'
sentence_tokenized = sentence.split(' ')
slen = len(sentence_tokenized)
LSTM_hidden_size = 15
LSTM_outputs = np.random.rand(LSTM_hidden_size, slen)
fig_path = "figs/heatmap_lstm_hidden.png"
visu_lstm_outputs(fig_path, LSTM_outputs, sentence_tokenized)
/Users/ana/tf_cpu/lib/python2.7/site-packages/matplotlib/__init__.py:800: MatplotlibDeprecationWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  mplDeprecation)
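
If prettyplotlib is not available in your environment, a similar heatmap can be sketched with plain matplotlib. This is an alternative under stated assumptions, not the notebook's own method: the function name `heatmap_plain`, the `RdBu_r` colormap and the temporary output path are all choices made for this illustration.

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

def heatmap_plain(fig_path, values, column_labels):
    """Draw `values` (rows x columns) as a heatmap with one tick label per column."""
    fig, ax = plt.subplots(figsize=(20, 15))
    mesh = ax.pcolormesh(values, cmap='RdBu_r')
    # put each label in the middle of its column
    ax.set_xticks(np.arange(0.5, len(column_labels) + 0.5, 1))
    ax.set_xticklabels(column_labels)
    ax.set_title('lstm outputs', fontsize=25)
    fig.colorbar(mesh, ax=ax)
    fig.savefig(fig_path)
    plt.close(fig)
    return mesh

tokens = 'today is a beautiful day'.split(' ')
values = np.random.rand(15, len(tokens))  # random stand-in for LSTM outputs
out_path = os.path.join(tempfile.mkdtemp(), 'heatmap_plain.png')
mesh = heatmap_plain(out_path, values, tokens)
```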

For a tensorflow model you need to retrieve the outputs with outputs_op = graph.get_operation_by_name(op_name).outputs[0], run the operation with outputs = session.run(outputs_op), and transpose the result with np.asarray(outputs).transpose().
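
The transpose step can be illustrated without TensorFlow. The sketch below assumes, purely for illustration, that the model returns one row per token (shape: sentence length x hidden size); the random array stands in for the result of session.run:

```python
import numpy as np

slen, hidden_size = 5, 15
# stand-in for the array returned by session.run(outputs_op): one row per token
outputs = np.random.rand(slen, hidden_size)

# the heatmap expects hidden units on the y-axis and tokens on the x-axis
LSTM_outputs = np.asarray(outputs).transpose()
```

After the transpose, column j of `LSTM_outputs` holds the hidden vector of token j, which matches the layout used by visu_lstm_outputs.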

Visualize improvements

If you compare your models with another model, you can visualize improvements and make your results table easier to interpret.

In [21]:
matplotlib.style.use('seaborn-white')

visu_improvements draws a heatmap of the improvements data and writes each improvement value in the centre of its cell.

In [22]:
def visu_improvements(improvements_data):
    y_labels = ['model1', 'model2', 'model3', 'model4']
    x_labels = ['measure1', 'measure2', 'measure3']
    fig, ax = plt.subplots(1)
    fig.set_figheight(15)
    fig.set_figwidth(30)
    #plt.pcolormesh(fig, ax, my_data)
    plt.pcolor(improvements_data, cmap=plt.cm.YlGn)
    ax.set_aspect('auto')
    for y in range(improvements_data.shape[0]):
        for x in range(improvements_data.shape[1]):
            plt.text(x + 0.5, y + 0.5, '%.2f' % improvements_data[y, x],
                     horizontalalignment='center',
                     verticalalignment='center',
                     size=30,
                     weight='bold')
    ax.set_yticks(np.arange(0.5, len(y_labels) + 0.5, 1))
    ax.set_yticklabels(y_labels, size=25, weight='bold')
    ax.set_xticks(np.arange(0.5, len(x_labels) + 0.5, 1))
    ax.set_xticklabels(x_labels, size=17, weight='bold')
    ax.tick_params(axis='both', labelsize=25)
    ax.set_title('improvements', fontsize = 40)
    cbar = plt.colorbar()
    cbar.ax.tick_params(labelsize=30)
    fig.savefig(fig_path)
Make a random example.
In [23]:
fig_path = "improvements.png"
n_models = 4
n_measures = 3
improvements_data = np.random.random_sample((n_models, n_measures))
np.savetxt(fname='improvements.txt', X=improvements_data, fmt='%10.5f', delimiter='\t')
visu_improvements(improvements_data)

In practice, you write the differences between your models and the other model to a text file (one row per model) and read it back as a numpy array.

In [24]:
improvements_data = np.loadtxt('improvements.txt', delimiter='\t')
visu_improvements(improvements_data)
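
A sketch of how such a file might be produced. The scores are invented for illustration, and the assumption is that every model is compared against a single baseline row:

```python
import os
import tempfile

import numpy as np

# hypothetical scores: rows are your models, columns are measures
my_scores = np.array([[0.81, 0.77, 0.79],
                      [0.84, 0.80, 0.82]])
baseline_scores = np.array([0.80, 0.75, 0.78])  # the model you compare against

# broadcasting subtracts the baseline from every row
improvements = my_scores - baseline_scores

path = os.path.join(tempfile.mkdtemp(), 'improvements.txt')
np.savetxt(fname=path, X=improvements, fmt='%10.5f', delimiter='\t')
loaded = np.loadtxt(path, delimiter='\t')
```

Because the file stores five decimals, the array read back matches the original only up to that precision.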

tSNE

Using tSNE you can visualize your high-dimensional vectors in 2D space. First learn how to use tSNE effectively: https://distill.pub/2016/misread-tsne/

In [25]:
def plot(candidates_tsne, n_classes, tags, ids, file_path):
    almost_black = '#262626'
    fig, ax = plt.subplots(1)
    # hard-coded for the example with 2 classes
    colors = ['red', 'blue']
    for k in range(n_classes):
        # sizes (number of points per class) is a global defined in the next cell
        begin = sum(sizes[:k])
        end = sum(sizes[:k]) + sizes[k]
        x = [candidates_tsne[i, 0] for i in range(begin, end)]
        y = [candidates_tsne[i, 1] for i in range(begin, end)]
        # marks every vector with "id (tag)", where id and tag could be anything you like, e.g. word (POS)
        # every vector point is colored with the corresponding class color
        text = [str(ids[i]) + " (" + str(tags[i]) + ")" for i in range(begin, end)]
        ax.scatter(x, y, label='class'+str(k+1), alpha=0.5, edgecolor=almost_black, facecolor=colors[k], linewidth=0.15)
        for i, txt in enumerate(text):
            ax.annotate(txt, (x[i], y[i]))
    # remove top and right axes
    spines_to_remove = ['top', 'right']
    for spine in spines_to_remove:
        ax.spines[spine].set_visible(False)
    ax.xaxis.set_ticks_position('none')
    ax.yaxis.set_ticks_position('none')
    spines_to_keep = ['bottom', 'left']
    for spine in spines_to_keep:
        ax.spines[spine].set_linewidth(0.5)
        ax.spines[spine].set_color(almost_black)
    # make axis almost black
    ax.xaxis.label.set_color(almost_black)
    ax.yaxis.label.set_color(almost_black)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.title.set_color(almost_black)
    ax.set_title('tsne', fontsize = 20)
    # make the legend background light gray
    light_grey = np.array([float(248)/float(255)]*3)
    legend = ax.legend(frameon=True, scatterpoints=1)
    rect = legend.get_frame()
    rect.set_facecolor(light_grey)
    rect.set_linewidth(0.0)
    # change the legend label colors to almost black
    texts = legend.texts
    for t in texts:
        t.set_color(almost_black)
    ax.grid(False)
    plt.show()
    fig.savefig(str(file_path), dpi=200)
    plt.close()

Make a random classification dataset.

In [26]:
import sklearn
from sklearn import datasets

n_classes = 2
X, y = datasets.make_classification(n_samples=10, n_features=20, class_sep=10)
types = [[] for _ in range(n_classes)]
sizes = [0]*n_classes
for i, l in enumerate(y):
    types[l].append(X[i])
    sizes[l] += 1
for i in range(n_classes):
    types[i] = np.asarray(types[i]).reshape((sizes[i], 20))

And plot.

In [27]:
from sklearn.manifold import TSNE

joint = np.concatenate(types, 0)
tsne = TSNE(init='pca', n_iter=5000)
candidates_tsne = tsne.fit_transform(joint)
file_path = "figs/tsne.png"
tags = [1]*sizes[0] + [2]*sizes[1]
ids = range(sum(sizes))
plot(candidates_tsne, n_classes, tags, ids, file_path)
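
One caveat for very small datasets: recent scikit-learn releases may reject a perplexity that is not smaller than the number of samples, and the default perplexity is 30, so a 10-sample example like the one above may raise an error there. The sketch below sets the perplexity explicitly. The value 5 and the fixed `random_state` are choices made for this illustration, and the iteration count is left at its default because the name of that argument differs between scikit-learn versions:

```python
import numpy as np
from sklearn import datasets
from sklearn.manifold import TSNE

X, y = datasets.make_classification(n_samples=10, n_features=20,
                                    class_sep=10, random_state=0)

# perplexity is kept below the number of samples (10 here)
tsne = TSNE(n_components=2, init='pca', perplexity=5, random_state=0)
embedded = tsne.fit_transform(X)  # one 2D point per input vector
```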

When running on servers, comment out plt.show in all examples.

Stacked bars plot

In [30]:
#from matplotlib.font_manager import FontProperties
from operator import add
In [91]:
def tags_stacked_bars(distr1, distr2):
    distr1.sort(key=lambda item: item[1], reverse=True)
    distr1_dict = dict(distr1)
    distr2.sort(key=lambda item: item[1], reverse=True)
    distr2_dict = dict(distr2)
    # plot tags which have positive frequency in the first distribution and higher than one in the second
    plot_tags = [tag for tag, freq in distr1 if freq > 0.0]
    other = []
    for tag, freq in distr2:
        if freq > 1.0 and tag not in plot_tags:
            plot_tags.append(tag)
        else:
            other.append(freq)
    distr1_pruned_dict = {}
    for key in plot_tags:
        if key in distr1_dict:
            distr1_pruned_dict[key] = distr1_dict[key]
        else:
            distr1_pruned_dict[key] = 0
    distr2_pruned_dict = {}
    for key in plot_tags:
        if key in distr2_dict:
            distr2_pruned_dict[key] = distr2_dict[key]
        else:
            distr2_pruned_dict[key] = 0
    distr1_freq = []
    distr2_freq = []
    for tag in plot_tags:
        distr1_freq.append(distr1_pruned_dict[tag])
        distr2_freq.append(distr2_pruned_dict[tag])
    plot_tags.append("other")
    distr1_freq.append(0.0)
    distr2_freq.append(np.mean(np.asarray(other)))
    # plot stacked bar
    width = 0.4
    height_cumulative = [0.0, 0.0]
    plots = []
    ind = [0, 0.5]
    fig = plt.figure(figsize=(11, 11))
    ax = fig.add_subplot(1, 1, 1)
    colors = ['#0066ff', '#ffcc99', '#adebad', '#ff5c33', '#ac3973', '#ffbf00', '#7979d2', '#00cc99']
    for k in range(len(plot_tags)):
        if k < 8:
            color = colors[k]
        else:
            color = np.random.rand(3)
        if k == 0:
            plots.append(ax.bar(ind, [distr1_freq[k], distr2_freq[k]], width, color=color, edgecolor='black'))
        else:
            plots.append(ax.bar(ind, [distr1_freq[k], distr2_freq[k]], width,
                                bottom=height_cumulative, color=color, edgecolor='black'))
        # list(...) so that the result is a real list under Python 3 as well
        height_cumulative = list(map(add, [distr1_freq[k], distr2_freq[k]], height_cumulative))
    plt.ylabel('median occurrence of a tag', fontsize=20)
    title = 'distribution of tags'
    plt.title(title, fontsize=25)
    plt.xticks([ind[0]+width/2.0, ind[1]+width/2], ["distr1", "distr2"], fontsize=20)
    plt.yticks(np.arange(0, sum(distr2_freq), 3))
    handles = [p[0] for p in plots]
    plt.legend(handles[::-1], plot_tags[::-1], prop={'size': 10}, loc='center left', bbox_to_anchor=(1, 0.5))
    fig_path = "figs/stacked_bars.png"
    plt.show()
    fig.savefig(fig_path)
In [92]:
distr1 = [["NN", 5.0], ["NP", 3.0], ["PP", 2.0], ["ADV", 2.0], ["S", 2.0], ["ART", 2.0], ["$.", 1.0], ["KON", 1.0], ["ADJA", 1.0], ["APPR", 1.0], ["VVINF", 1.0], ["VVFIN", 1.0], ["$,", 1.0], ["KOUS", 1.0], ["VP", 1.0], ["VAFIN", 1.0], ["PPER", 1.0], ["CVZ", 0.0], ["AVP", 0.0], ["CAP", 0.0], ["CVP", 0.0], ["NE", 0.0], ["CPP", 0.0], ["VMFIN", 0.0], ["PTKNEG", 0.0], ["VAPP", 0.0], ["APPO", 0.0], ["PRF", 0.0], ["VVIZU", 0.0], ["NM", 0.0], ["PDAT", 0.0], ["PIAT", 0.0], ["ADJD", 0.0], ["$[", 0.0], ["PTKVZ", 0.0], ["PRELS", 0.0], ["PIS", 0.0], ["ROOT", 0.0], ["PROAV", 0.0], ["APZR", 0.0], ["PPOSAT", 0.0], ["CO", 0.0], ["CNP", 0.0], ["PDS", 0.0], ["VVPP", 0.0], ["AP", 0.0], ["XY", 0.0], ["PWAV", 0.0], ["CS", 0.0], ["PTKANT", 0.0], ["VZ", 0.0], ["PTKZU", 0.0], ["CARD", 0.0], ["PWS", 0.0], ["VMINF", 0.0], ["MPN", 0.0], ["VAINF", 0.0], ["APPRART", 0.0], ["KOKOM", 0.0], ["PTKA", 0.0], ["PIDAT", 0.0], ["TRUNC", 0.0], ["KOUI", 0.0], ["CAVP", 0.0]]
distr2 = [["NN", 10.0], ["NP", 7.0], ["S", 7.0], ["PP", 5.0], ["ADV", 4.0], ["ADJA", 3.0], ["APPR", 3.0], ["VP", 3.0], ["ART", 3.0], ["VVINF", 2.0], ["VVFIN", 2.0], ["VAFIN", 2.0], ["PPER", 2.0], ["KON", 1.0], ["VMFIN", 1.0], ["ADJD", 1.0], ["PTKNEG", 1.0], ["$.", 1.0], ["$,", 1.0], ["CNP", 1.0], ["KOUS", 1.0], ["PDS", 1.0], ["VVPP", 1.0], ["AP", 1.0], ["CS", 1.0], ["APPRART", 1.0], ["CVZ", 0.0], ["AVP", 0.0], ["CAP", 0.0], ["PWAT", 0.0], ["CVP", 0.0], ["NE", 0.0], ["CPP", 0.0], ["VAPP", 0.0], ["APPO", 0.0], ["PRF", 0.0], ["VVIZU", 0.0], ["NM", 0.0], ["PDAT", 0.0], ["PIAT", 0.0], ["FM", 0.0], ["$[", 0.0], ["PTKVZ", 0.0], ["PRELS", 0.0], ["PIS", 0.0], ["ROOT", 0.0], ["PROAV", 0.0], ["TRUNC", 0.0], ["PPOSAT", 0.0], ["CO", 0.0], ["XY", 0.0], ["PWAV", 0.0], ["PTKANT", 0.0], ["VZ", 0.0], ["PTKZU", 0.0], ["CARD", 0.0], ["PWS", 0.0], ["PRELAT", 0.0], ["VVIMP", 0.0], ["VMINF", 0.0], ["MPN", 0.0], ["VAINF", 0.0], ["KOKOM", 0.0], ["PTKA", 0.0], ["PIDAT", 0.0], ["APZR", 0.0], ["KOUI", 0.0], ["CAVP", 0.0]]
tags_stacked_bars(distr1, distr2)
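
The core of the stacking logic is easier to see on a reduced example. The sketch below uses only the first three tags from the data above and keeps a running `bottom` array, which plays the role of height_cumulative; using a numpy array instead of map is a choice made for this illustration:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

tags = ['NN', 'NP', 'PP']
distr1_freq = [5.0, 3.0, 2.0]
distr2_freq = [10.0, 7.0, 5.0]

fig, ax = plt.subplots()
bottom = np.zeros(2)  # current top of the stack for each of the two bars
for tag, f1, f2 in zip(tags, distr1_freq, distr2_freq):
    heights = np.array([f1, f2])
    ax.bar([0, 0.5], heights, 0.4, bottom=bottom, label=tag, edgecolor='black')
    bottom = bottom + heights  # the next segment starts where this one ends
ax.legend()
plt.close(fig)
```

After the loop, `bottom` holds the total height of each bar, which is handy for choosing y-axis ticks.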





http://www.chinasem.cn/article/483834
