[Data Analysis] Loading ReGat's VQAFeatureDataset

2024-01-16 01:58

This article walks through how the VQAFeatureDataset class in ReGat loads its data, as a reference for developers working with the project.

1. VQAFeatureDataset

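This class is the ReGat project's subclass of PyTorch's Dataset (from torch.utils.data import Dataset). It loads the training and test sets when the model runs, and the data it loads become the arguments of the model's forward function, shown below:
regat.forward():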

    def forward(self, v, b, q, implicit_pos_emb, sem_adj_matrix,
                spa_adj_matrix, labels):
        """Forward

        v: [batch, num_objs, obj_dim]
        b: [batch, num_objs, b_dim]
        q: [batch_size, seq_length]
        pos: [batch_size, num_objs, nongt_dim, emb_dim]
        sem_adj_matrix: [batch_size, num_objs, num_objs, num_edge_labels]
        spa_adj_matrix: [batch_size, num_objs, num_objs, num_edge_labels]

        return: logits, not probs
        """
        w_emb = self.w_emb(q)  # question word embedding
        q_emb_seq = self.q_emb.forward_all(w_emb)  # [batch, q_len, q_dim]
        q_emb_self_att = self.q_att(q_emb_seq)  # add self-attention information

        # [batch_size, num_rois, out_dim]
        if self.relation_type == "semantic":  # semantic relations
            v_emb = self.v_relation.forward(v, sem_adj_matrix, q_emb_self_att)
        elif self.relation_type == "spatial":  # spatial relations
            v_emb = self.v_relation.forward(v, spa_adj_matrix, q_emb_self_att)
        else:  # implicit relations
            v_emb = self.v_relation.forward(v, implicit_pos_emb,
                                            q_emb_self_att)

        if self.fusion == "ban":  # fusion model 1
            joint_emb, att = self.joint_embedding(v_emb, q_emb_seq, b)
        elif self.fusion == "butd":  # fusion model 2
            q_emb = self.q_emb(w_emb)  # [batch, q_dim]
            joint_emb, att = self.joint_embedding(v_emb, q_emb)
        else:  # mutan, fusion model 3
            joint_emb, att = self.joint_embedding(v_emb, q_emb_self_att)

        if self.classifier:  # classification head
            logits = self.classifier(joint_emb)
        else:
            logits = joint_emb
        return logits, att

VQAFeatureDataset

Variables stored on self

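| Variable | Meaning and type | Source | Value |
|---|---|---|---|
| self.ans2label | answer-to-index mapping: dict | trainval_ans2label.pkl | {'net': 0, 'pitcher': 1, 'orange': 2, 'yes': 3, 'white': 4, ...} |
| self.label2ans | index-to-answer mapping: list | trainval_label2ans.pkl | ['net', 'pitcher', 'orange', 'yes', 'white', ...] |
| self.num_ans_candidates | number of candidate answers: int | len(self.ans2label) | 3129 |
| self.img_id2idx | image-id-to-index mapping: dict | imgid2idx.pkl | {218224: 0, 306670: 1, 208663: 2, 225177: 3, 467257: 4, ...} |
| self.features | image features: Tensor | hf.get('image_features') | shape [40504, 36, 2048] |
| self.normalized_bb | normalized spatial features of the region bounding boxes: Tensor | hf.get('spatial_features') | shape [40504, 36, 6] |
| self.bb | region bounding-box coordinates: Tensor | hf.get('image_bb') | shape [40504, 36, 4] |
| self.semantic_adj_matrix | semantic adjacency matrix | hf.get('semantic_adj_matrix') if that key exists in hf and relation_type is "semantic", otherwise None | |
| self.spatial_adj_matrix | spatial adjacency matrix | hf.get('image_adj_matrix') if that key exists in hf and relation_type is "spatial", otherwise None | |
| self.pos_boxes | per-image box offsets, used only in adaptive mode | hf.get('pos_boxes') if self.adaptive, otherwise None | None |
| self.entries | data entries: list | _load_dataset(dataroot, name, self.img_id2idx, self.label2ans) | length 214354 |
| self.nongt_dim | the nongt_dim dimension of pos in forward() | nongt_dim | 36 |
| self.emb_dim | position embedding dimension | pos_emb_dim | 64 |
| self.v_dim | image feature dimension | self.features.size(1 if self.adaptive else 2) | 2048 |
| self.s_dim | spatial feature dimension | self.normalized_bb.size(1 if self.adaptive else 2) | 6 |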
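The relationship between ans2label, label2ans and num_ans_candidates can be illustrated with a minimal sketch. It uses only the five sample answers listed in the table above, and the pickle file name is a hypothetical stand-in for the real cache files.

```python
import os
import pickle
import tempfile

# Toy stand-ins for the first entries shown in the table above.
label2ans = ['net', 'pitcher', 'orange', 'yes', 'white']
ans2label = {ans: idx for idx, ans in enumerate(label2ans)}

with tempfile.TemporaryDirectory() as cache_dir:
    # Hypothetical file name, not the real trainval_ans2label.pkl.
    path = os.path.join(cache_dir, 'toy_ans2label.pkl')
    with open(path, 'wb') as f:
        pickle.dump(ans2label, f)
    with open(path, 'rb') as f:  # the context manager closes the file handle
        loaded = pickle.load(f)

# The two structures are inverse mappings of each other.
num_ans_candidates = len(loaded)
assert all(label2ans[idx] == ans for ans, idx in loaded.items())
print(num_ans_candidates, loaded['yes'], label2ans[3])  # 5 3 yes
```

With the real cache files the same check should hold over all 3129 answers.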
import os
import pickle

import h5py
import numpy as np
import torch
from torch.utils.data import Dataset

import utils  # helper module from the ReGat code base (provides assert_eq)

# _load_dataset() is defined elsewhere in the ReGat code base.


class VQAFeatureDataset(Dataset):
    def __init__(self, name, dictionary, relation_type, dataroot='data',
                 adaptive=False, pos_emb_dim=64, nongt_dim=36):
        super(VQAFeatureDataset, self).__init__()
        assert name in ['train', 'val', 'test-dev2015', 'test2015']
        # Load the pkl files produced by preprocessing annotations.json
        ans2label_path = os.path.join(dataroot, 'cache',
                                      'trainval_ans2label.pkl')
        label2ans_path = os.path.join(dataroot, 'cache',
                                      'trainval_label2ans.pkl')
        # dict of the form {'w0': 0, 'w1': 1, ..., 'w3128': 3128}
        self.ans2label = pickle.load(open(ans2label_path, 'rb'))
        # list of the form ['w0', 'w1', ..., 'w3128']
        self.label2ans = pickle.load(open(label2ans_path, 'rb'))
        # number of candidate answers = 3129
        self.num_ans_candidates = len(self.ans2label)
        # Dictionary of 19901 words: idx2word is a list of words, word2idx
        # maps each word to its index, padding_idx = ntoken = 19901
        self.dictionary = dictionary
        self.relation_type = relation_type
        # whether the features use an adaptive 10-100 regions per image
        self.adaptive = adaptive
        prefix = '36'
        if 'test' in name:
            prefix = '_36'

        # Directory holding the hdf5 files
        h5_dataroot = dataroot+"/Bottom-up-features-adaptive"\
            if self.adaptive else dataroot+"/Bottom-up-features-fixed"
        # Directory holding the image-id files
        imgid_dataroot = dataroot+"/imgids"

        # Load imgid2idx.pkl into self.img_id2idx:
        # {id_0: 0, id_1: 1, ..., id_40503: 40503}
        self.img_id2idx = pickle.load(
            open(os.path.join(imgid_dataroot, '%s%s_imgid2idx.pkl' %
                              (name, '' if self.adaptive else prefix)), 'rb'))

        # Load the hdf5 file
        h5_path = os.path.join(h5_dataroot, '%s%s.hdf5' %
                               (name, '' if self.adaptive else prefix))
        print('loading features from h5 file %s' % h5_path)
        with h5py.File(h5_path, 'r') as hf:
            # self.features = np.array(hf.get('image_features'))
            self.features = np.array(hf.get('image_features'),
                                     dtype='float32')
            self.normalized_bb = np.array(hf.get('spatial_features'),
                                          dtype='float32')
            self.bb = np.array(hf.get('image_bb'), dtype='float32')
            print("hdf5 data loaded successfully!")
            if "semantic_adj_matrix" in hf.keys() \
                    and self.relation_type == "semantic":
                self.semantic_adj_matrix = np.array(
                    hf.get('semantic_adj_matrix'))
                print("Loaded semantic adj matrix from file...",
                      self.semantic_adj_matrix.shape)
            else:
                self.semantic_adj_matrix = None
                print("Setting semantic adj matrix to None...")
            if "image_adj_matrix" in hf.keys()\
                    and self.relation_type == "spatial":
                # load the spatial adjacency matrix from the file
                self.spatial_adj_matrix = np.array(hf.get('image_adj_matrix'))
                print("Loaded spatial adj matrix from file...",
                      self.spatial_adj_matrix.shape)
            else:
                self.spatial_adj_matrix = None
                print("Setting spatial adj matrix to None...")

            self.pos_boxes = None
            if self.adaptive:
                # Integer dtype: these offsets are used as slice bounds in
                # __getitem__, which fails for float32 values.
                self.pos_boxes = np.array(hf.get('pos_boxes'), dtype='int64')

        self.entries = _load_dataset(dataroot, name, self.img_id2idx,
                                     self.label2ans)
        self.tokenize()
        print("Data loaded successfully!")
        self.tensorize()
        self.nongt_dim = nongt_dim
        self.emb_dim = pos_emb_dim
        self.v_dim = self.features.size(1 if self.adaptive else 2)
        self.s_dim = self.normalized_bb.size(1 if self.adaptive else 2)

    def tokenize(self, max_length=14):
        """Tokenizes the questions.

        This will add q_token in each entry of the dataset.
        -1 represent nil, and should be treated as padding_idx in embedding
        """
        for entry in self.entries:
            tokens = self.dictionary.tokenize(entry['question'], False)
            tokens = tokens[:max_length]
            if len(tokens) < max_length:
                # Note here we pad to the back of the sentence
                padding = [self.dictionary.padding_idx] * \
                          (max_length - len(tokens))
                tokens = tokens + padding
            utils.assert_eq(len(tokens), max_length)
            entry['q_token'] = tokens

    def tensorize(self):
        # Convert every numpy array loaded in __init__ to a torch tensor
        self.features = torch.from_numpy(self.features)
        self.normalized_bb = torch.from_numpy(self.normalized_bb)
        self.bb = torch.from_numpy(self.bb)
        if self.semantic_adj_matrix is not None:
            self.semantic_adj_matrix = torch.from_numpy(
                self.semantic_adj_matrix).double()
        if self.spatial_adj_matrix is not None:
            self.spatial_adj_matrix = torch.from_numpy(
                self.spatial_adj_matrix).double()
        if self.pos_boxes is not None:
            self.pos_boxes = torch.from_numpy(self.pos_boxes)

        for entry in self.entries:
            question = torch.from_numpy(np.array(entry['q_token']))
            entry['q_token'] = question

            answer = entry['answer']
            if answer is not None:
                labels = np.array(answer['labels'])
                scores = np.array(answer['scores'], dtype=np.float32)
                if len(labels):
                    labels = torch.from_numpy(labels)
                    scores = torch.from_numpy(scores)
                    entry['answer']['labels'] = labels
                    entry['answer']['scores'] = scores
                else:
                    entry['answer']['labels'] = None
                    entry['answer']['scores'] = None

    def __getitem__(self, index):
        entry = self.entries[index]
        raw_question = entry["question"]
        image_id = entry["image_id"]

        question = entry['q_token']
        question_id = entry['question_id']
        if self.spatial_adj_matrix is not None:
            spatial_adj_matrix = self.spatial_adj_matrix[entry["image"]]
        else:
            spatial_adj_matrix = torch.zeros(1).double()
        if self.semantic_adj_matrix is not None:
            semantic_adj_matrix = self.semantic_adj_matrix[entry["image"]]
        else:
            semantic_adj_matrix = torch.zeros(1).double()
        if not self.adaptive:
            # fixed number of bounding boxes
            features = self.features[entry['image']]
            normalized_bb = self.normalized_bb[entry['image']]
            bb = self.bb[entry["image"]]
        else:
            # adaptive: pos_boxes holds the [start, end) rows of each image
            features = self.features[
                self.pos_boxes[entry['image']][0]:
                self.pos_boxes[entry['image']][1], :]
            normalized_bb = self.normalized_bb[
                self.pos_boxes[entry['image']][0]:
                self.pos_boxes[entry['image']][1], :]
            bb = self.bb[
                self.pos_boxes[entry['image']][0]:
                self.pos_boxes[entry['image']][1], :]

        answer = entry['answer']
        if answer is not None:
            labels = answer['labels']
            scores = answer['scores']
            target = torch.zeros(self.num_ans_candidates)
            if labels is not None:
                target.scatter_(0, labels, scores)
            return features, normalized_bb, question, target,\
                question_id, image_id, bb, spatial_adj_matrix,\
                semantic_adj_matrix

        else:
            # No answer available: question_id is returned twice, which keeps
            # the tuple the same length as in the labelled case.
            return features, normalized_bb, question, question_id,\
                question_id, image_id, bb, spatial_adj_matrix,\
                semantic_adj_matrix

    def __len__(self):
        return len(self.entries)
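The two indexing branches of __getitem__ differ only in how the boxes of one image are located. The sketch below illustrates this with toy tensors. The shapes, the values and the helper name boxes_for_image are assumptions made for illustration, not part of ReGat.

```python
import torch

# Fixed layout: one [num_boxes, dim] block per image (toy sizes).
fixed_features = torch.arange(2 * 3 * 4, dtype=torch.float32).reshape(2, 3, 4)

# Adaptive layout: the boxes of all images are stacked row-wise, and
# pos_boxes stores integer [start, end) row offsets per image (toy values).
adaptive_features = torch.arange(5 * 4, dtype=torch.float32).reshape(5, 4)
pos_boxes = torch.tensor([[0, 2], [2, 5]])  # image 0: 2 boxes, image 1: 3 boxes


def boxes_for_image(image_idx, adaptive):
    """Return the [num_boxes, dim] feature block of one image (hypothetical helper)."""
    if not adaptive:
        return fixed_features[image_idx]
    start, end = pos_boxes[image_idx].tolist()
    return adaptive_features[start:end, :]


print(boxes_for_image(1, adaptive=False).shape)  # torch.Size([3, 4])
print(boxes_for_image(0, adaptive=True).shape)   # torch.Size([2, 4])
print(boxes_for_image(1, adaptive=True).shape)   # torch.Size([3, 4])
```

This layout is also why v_dim and s_dim are read from dimension 1 in adaptive mode and from dimension 2 in fixed mode.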

entries
entries holds the data items. It is a list of 214354 items, and each item is a dict.
Each item looks like this:

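| Key | Meaning | Example value |
|---|---|---|
| question_id | question id | 42000 |
| image_id | image id | 42 |
| image | index of the image in the feature arrays | 37244 |
| question | question text | 'What color are the gym shoes?' |
| answer | answer: labels and scores | {'labels': tensor([4, 1594], dtype=torch.int32), 'scores': tensor([1.0000, 0.3000])} |
| q_token | question as a vector of token indices | tensor([0, 10, 68, 11, 2618, 225, 19901, 19901, 19901, 19901, 19901, 19901, 19901, 19901], dtype=torch.int32) |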
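The trailing run of 19901 in q_token comes from the padding step in tokenize(). Here is a minimal pure-Python sketch of that step, assuming the padding index 19901 and the maximum length 14 reported in this article. The helper name pad_tokens is hypothetical.

```python
PADDING_IDX = 19901  # padding index reported in this article
MAX_LENGTH = 14      # default max_length of tokenize()


def pad_tokens(tokens, max_length=MAX_LENGTH, padding_idx=PADDING_IDX):
    """Truncate to max_length, then pad at the back with padding_idx."""
    tokens = list(tokens[:max_length])
    tokens += [padding_idx] * (max_length - len(tokens))
    return tokens


# Token ids of 'What color are the gym shoes?' taken from the example row.
q_token = pad_tokens([0, 10, 68, 11, 2618, 225])
print(len(q_token))  # 14
print(q_token[6:])   # eight copies of 19901
```

Questions longer than 14 tokens are truncated rather than padded, so every q_token has the same length and can be stacked into a batch.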

2. Values returned by __getitem__ and passed to the model (fixed 36 regions per image)

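| Variable | Source |
|---|---|
| features | self.features[entry['image']], where entry['image'] is the index of the image in the feature arrays |
| normalized_bb | self.normalized_bb[entry['image']] |
| question | entry['q_token'] |
| target | torch.zeros(self.num_ans_candidates).scatter_(0, labels, scores) |
| question_id | entry['question_id'] |
| image_id | entry['image_id'] |
| bb | self.bb[entry['image']] |
| spatial_adj_matrix | self.spatial_adj_matrix[entry['image']] |
| semantic_adj_matrix | self.semantic_adj_matrix[entry['image']] |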
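The target entry is a soft-score vector over all candidate answers. The following sketch shows how scatter_ builds it from the labels and scores of the example entry above. One assumption to note: current PyTorch releases expect the index tensor of scatter_ to be int64, so the labels are created as int64 here, whereas the example entry shows int32.

```python
import torch

num_ans_candidates = 3129  # answer vocabulary size reported in this article

# Labels and scores of the example entry; int64 index is an assumption
# made so that scatter_ accepts it in current PyTorch releases.
labels = torch.tensor([4, 1594], dtype=torch.int64)
scores = torch.tensor([1.0, 0.3])

target = torch.zeros(num_ans_candidates)
target.scatter_(0, labels, scores)  # sets target[labels[i]] = scores[i]

print(target.shape)                       # torch.Size([3129])
print(target.nonzero().flatten().tolist())  # [4, 1594]
```

All other positions stay at zero, so the vector can be compared directly against the model's logits over the 3129 answers.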




http://www.chinasem.cn/article/610958
