A TensorFlow Prediction Model for the Titanic Shipwreck Data

2024-06-14 08:38

This article walks through building a TensorFlow prediction model on the Titanic shipwreck dataset, in the hope that it offers a useful reference for developers tackling similar problems. Follow along!

First, download the data from Kaggle:
https://www.kaggle.com/c/titanic/data
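Before any preprocessing, it helps to check which columns actually contain missing values. A quick look (assuming the files sit under the same ./tt/ directory used below):

import pandas as pd
raw = pd.read_csv('./tt/train.csv')
raw.head()           # first rows: PassengerId, Survived, Pclass, Name, Sex, ...
raw.isnull().sum()   # Age, Cabin and Embarked are the columns with missing entries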

import pandas as pd
import numpy as np
import os, sys

os.getcwd()
data = pd.read_csv('./tt/train.csv')
data.columns
# Keep the label plus the columns we will use as features
data = data[['Survived', 'Pclass', 'Sex', 'Age', 'SibSp',
             'Parch', 'Fare', 'Cabin', 'Embarked']]
# Fill missing ages with the mean age
data['Age'] = data['Age'].fillna(data['Age'].mean())
# Turn cabin strings into integer codes (missing cabins become -1)
data['Cabin'] = pd.factorize(data.Cabin)[0]
# Any remaining missing values (e.g. Embarked) become 0
data.fillna(0, inplace=True)
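Note how pd.factorize maps each distinct cabin string to an integer code and assigns -1 to missing values; a tiny illustration with made-up cabin values:

codes, uniques = pd.factorize(pd.Series(['C85', None, 'C123', 'C85']))
codes    # array([ 0, -1,  1,  0])
uniques  # Index(['C85', 'C123'], dtype='object')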
# One-hot encode passenger class
data['p1'] = np.array(data['Pclass'] == 1).astype(np.int32)
data['p2'] = np.array(data['Pclass'] == 2).astype(np.int32)
data['p3'] = np.array(data['Pclass'] == 3).astype(np.int32)
del data['Pclass']

data.Embarked.unique()

# One-hot encode the port of embarkation
data['e1'] = np.array(data['Embarked'] == 'S').astype(np.int32)
data['e2'] = np.array(data['Embarked'] == 'C').astype(np.int32)
data['e3'] = np.array(data['Embarked'] == 'Q').astype(np.int32)
del data['Embarked']

# Encode sex as 1 for male, 0 for female
data['Sex'] = [1 if x == 'male' else 0 for x in data.Sex]
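The manual dummy columns above can also be produced with pandas' built-in one-hot encoder. An equivalent sketch (the 'p'/'e' prefixes mirror the article's column names and are my choice, not a pandas default):

dummies = pd.get_dummies(data, columns=['Pclass', 'Embarked'], prefix=['p', 'e'])
# yields 0/1 columns like p_1, p_2, p_3, e_S, e_C, e_Q and drops the originals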

data.values.dtype  # quick sanity check: every column is numeric now
# Convert to NumPy arrays so the positional shuffling in the training loop works
data_train = data[['Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Cabin',
                   'p1', 'p2', 'p3', 'e1', 'e2', 'e3']].values
data_target = data['Survived'].values.reshape(len(data), 1)

np.shape(data_train), np.shape(data_target)

import tensorflow as tf

# TensorFlow 1.x graph API: placeholders for the 12 features and the binary label
x = tf.placeholder("float", shape=[None, 12])
y = tf.placeholder("float", shape=[None, 1])

# Logistic regression: one linear layer, thresholded at 0.5 for predictions
weight = tf.Variable(tf.random_normal([12, 1]))
bias = tf.Variable(tf.random_normal([1]))
output = tf.matmul(x, weight) + bias
pred = tf.cast(tf.sigmoid(output) > 0.5, tf.float32)

# Sigmoid cross-entropy computed on the raw logits
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=output))

train_step = tf.train.GradientDescentOptimizer(0.0003).minimize(loss)

accuracy = tf.reduce_mean(tf.cast(tf.equal(pred, y), tf.float32))
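A side note: the code above targets the TensorFlow 1.x graph API (tf.placeholder, tf.Session, tf.train). On TensorFlow 2.x, where those symbols were removed, the same script should still run through the v1 compatibility module by swapping the import:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()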

# Apply the same preprocessing steps to the test set
data_test = pd.read_csv('./tt/test.csv')
data_test.columns
data_test = data_test[['Pclass', 'Sex', 'Age', 'SibSp', 'Parch',
                       'Fare', 'Cabin', 'Embarked']].copy()
data_test['Age'] = data_test['Age'].fillna(data_test['Age'].mean())
data_test['Cabin'] = pd.factorize(data_test.Cabin)[0]
data_test.fillna(0, inplace=True)
data_test['Sex'] = [1 if x == 'male' else 0 for x in data_test.Sex]
data_test['p1'] = np.array(data_test['Pclass'] == 1).astype(np.int32)
data_test['p2'] = np.array(data_test['Pclass'] == 2).astype(np.int32)
data_test['p3'] = np.array(data_test['Pclass'] == 3).astype(np.int32)
data_test['e1'] = np.array(data_test['Embarked'] == 'S').astype(np.int32)
data_test['e2'] = np.array(data_test['Embarked'] == 'C').astype(np.int32)
data_test['e3'] = np.array(data_test['Embarked'] == 'Q').astype(np.int32)
del data_test['Pclass']
del data_test['Embarked']

# gender.csv (presumably Kaggle's gender_submission baseline) provides
# stand-in labels for the 418 test passengers
test_label = pd.read_csv('./tt/gender.csv')
test_label = np.reshape(test_label.Survived.values.astype(np.float32), (418, 1))

sess = tf.Session()
sess.run(tf.global_variables_initializer())
loss_train = []
train_acc = []
test_acc = []

for i in range(25000):
    # Reshuffle the training data each epoch
    index = np.random.permutation(len(data_target))
    data_train = data_train[index]
    data_target = data_target[index]
    # Mini-batch gradient descent with a batch size of 100
    for n in range(len(data_target) // 100 + 1):
        batch_xs = data_train[n * 100:n * 100 + 100]
        batch_ys = data_target[n * 100:n * 100 + 100]
        sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
    if i % 1000 == 0:
        loss_temp = sess.run(loss, feed_dict={x: batch_xs, y: batch_ys})
        loss_train.append(loss_temp)
        train_acc_temp = sess.run(accuracy, feed_dict={x: batch_xs, y: batch_ys})
        train_acc.append(train_acc_temp)
        # Also record test accuracy so the plot below has data for test_acc
        test_acc_temp = sess.run(accuracy, feed_dict={x: data_test.values, y: test_label})
        test_acc.append(test_acc_temp)
        print(loss_temp, train_acc_temp, test_acc_temp)
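With training finished, the pred node can also produce a Kaggle submission file. A minimal sketch (the submission.csv name and the re-read of test.csv for PassengerId are my additions, not part of the original walkthrough):

survived = sess.run(pred, feed_dict={x: data_test.values}).astype(np.int32).ravel()
submission = pd.DataFrame({
    'PassengerId': pd.read_csv('./tt/test.csv')['PassengerId'],
    'Survived': survived,
})
submission.to_csv('submission.csv', index=False)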

import matplotlib.pyplot as plt

plt.plot(loss_train, 'k-')
plt.title('train loss')
plt.show()

plt.plot(train_acc, 'b--', label='train_acc')
plt.plot(test_acc, 'r--', label='test_acc')
plt.title('acc')
plt.legend()
plt.show()

That concludes this article on a TensorFlow prediction model for the Titanic shipwreck data; we hope it proves helpful to fellow developers!



