Migrating Data into Django Model Tables: A Detailed Guide


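Data migration is a routine part of application development. This article walks through a custom Django management command that copies rows from a MySQL database into a Django model table. Scripting the migration makes it repeatable and avoids the tedium and mistakes of moving data by hand.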

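Project Overview

We will implement a Django management command that reads rows from a MySQL database and bulk-inserts them into a Django model table. Each batch is inserted inside a transaction so that it is written completely or not at all, and the current offset is recorded in a file so that an interrupted run can resume where it stopped.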

Code Walkthrough

First we define the Django management command. The complete code follows:

import os
import traceback

import mysql.connector
from django.core.management.base import BaseCommand
from django.db import transaction
from mysql.connector import Error

from myapp.models import HotSearchTermsReportABA

db_config = {
    # MySQL database configuration
}


class Command(BaseCommand):
    help = 'Migrate data into a Django model table'

    def handle(self, *args, **kwargs):
        try:
            db_conn = mysql.connector.connect(**db_config)
            db_cursor = db_conn.cursor()
            self.stdout.write(self.style.SUCCESS("Connected to the database"))
        except Error as e:
            self.stdout.write(self.style.ERROR(f"Error while connecting: {e}"))
            self.stdout.write(self.style.ERROR(str(traceback.format_exc())))
            return

        # Value stored in the period column ('最新' means 'latest')
        period = '最新'
        fetch_sql = f"""
            SELECT search_rank, search_term
            FROM hot_terms_table
            WHERE period = '{period}'
            LIMIT %s OFFSET %s;
        """

        # Number of rows per batch
        batch_size = 5000
        # Resume from the offset saved by a previous run (0 on the first run)
        offset = self.get_last_offset()
        total_rows_transferred = 0

        try:
            while True:
                db_cursor.execute(fetch_sql, (batch_size, offset))
                batch_data = db_cursor.fetchall()
                if not batch_data:
                    break  # No more rows: leave the loop

                # Convert batch_data into a list of HotSearchTermsReportABA objects
                objects = [
                    HotSearchTermsReportABA(search_rank=row[0], search_term=row[1])
                    for row in batch_data
                ]

                with transaction.atomic():  # Open a transaction for this batch
                    # Bulk-create the objects in the Django database
                    HotSearchTermsReportABA.objects.bulk_create(objects)

                # Advance the offset by the rows actually read and update the total
                offset += len(batch_data)
                total_rows_transferred += len(batch_data)
                self.stdout.write(self.style.SUCCESS(
                    f"{len(batch_data)} rows transferred in this batch."))
                self.stdout.write(self.style.SUCCESS(
                    f"{total_rows_transferred} rows transferred in total."))

                # Persist the offset to the file
                self.update_last_offset(offset)
        except Error as e:
            self.stdout.write(self.style.ERROR(f"Error during transfer: {e}"))
            self.stdout.write(self.style.ERROR(str(traceback.format_exc())))
        finally:
            # Close the cursor and the connection
            if db_cursor:
                db_cursor.close()
            if db_conn:
                db_conn.close()

    def get_last_offset(self):
        # Read the offset from the file
        offset_file = 'migration_offset.txt'
        if os.path.exists(offset_file):
            with open(offset_file, 'r') as file:
                return int(file.read().strip())
        return 0

    def update_last_offset(self, offset):
        # Write the offset to the file
        offset_file = 'migration_offset.txt'
        with open(offset_file, 'w') as file:
            file.write(str(offset))

# Run with: python manage.py migrate_data

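Code Analysis

Database Connection

The command first tries to connect to the MySQL database. If the connection fails, it catches the exception, prints the error and the traceback, and returns without doing any work.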

try:
    db_conn = mysql.connector.connect(**db_config)
    db_cursor = db_conn.cursor()
    self.stdout.write(self.style.SUCCESS("Connected to the database"))
except Error as e:
    self.stdout.write(self.style.ERROR(f"Error while connecting: {e}"))
    self.stdout.write(self.style.ERROR(str(traceback.format_exc())))
    return

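SQL Query and Data Extraction

Next, the code defines the SQL query that reads from the hot_terms_table table, using LIMIT and OFFSET to page through the rows. The period value is a fixed constant interpolated into the query text; a value that came from user input should be passed as a query parameter instead. Because the query has no ORDER BY clause, MySQL does not guarantee that rows come back in the same order on every page, so adding an ORDER BY on a unique column makes the paging reliable.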

period = '最新'
fetch_sql = f"""
    SELECT search_rank, search_term
    FROM hot_terms_table
    WHERE period = '{period}'
    LIMIT %s OFFSET %s;
"""
batch_size = 5000
offset = self.get_last_offset()
total_rows_transferred = 0
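
To see the LIMIT/OFFSET paging on its own, here is a minimal sketch that runs against an in-memory SQLite database instead of MySQL, so it can be tried without a server. The table contents, the `fetch_in_batches` helper, and the `ORDER BY search_rank` clause are assumptions made for the illustration and are not part of the original command. Note that `sqlite3` uses `?` placeholders where `mysql.connector` uses `%s`.

```python
import sqlite3


def fetch_in_batches(conn, batch_size, start_offset=0):
    """Yield (offset, rows) pages from hot_terms_table using LIMIT/OFFSET."""
    sql = (
        "SELECT search_rank, search_term FROM hot_terms_table "
        "ORDER BY search_rank "  # assumed ordering so pages are stable
        "LIMIT ? OFFSET ?"
    )
    offset = start_offset
    while True:
        rows = conn.execute(sql, (batch_size, offset)).fetchall()
        if not rows:
            break  # an empty page means the table is exhausted
        yield offset, rows
        offset += len(rows)  # advance by the rows actually read


# Demo data: 12 made-up rows in an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hot_terms_table (search_rank INTEGER, search_term TEXT)"
)
conn.executemany(
    "INSERT INTO hot_terms_table VALUES (?, ?)",
    [(i, f"term-{i}") for i in range(1, 13)],
)

pages = list(fetch_in_batches(conn, batch_size=5))
print([(offset, len(rows)) for offset, rows in pages])
# [(0, 5), (5, 5), (10, 2)]
```

Passing `start_offset=10` to the helper skips straight to the last page, which is the same mechanism the command relies on when it resumes from a saved offset.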

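Data Transfer and Transaction Handling

The code loops over the source table one page at a time, converts each page into Django model objects, and bulk-inserts them inside a transaction. The transaction makes each batch all-or-nothing: if an insert fails partway through, that batch is rolled back. The offset is advanced and saved only after the batch has been committed.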

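try:
    while True:
        db_cursor.execute(fetch_sql, (batch_size, offset))
        batch_data = db_cursor.fetchall()
        if not batch_data:
            break  # No more rows: leave the loop

        objects = [
            HotSearchTermsReportABA(search_rank=row[0], search_term=row[1])
            for row in batch_data
        ]

        with transaction.atomic():
            HotSearchTermsReportABA.objects.bulk_create(objects)

        # Advance the offset by the rows actually read
        offset += len(batch_data)
        total_rows_transferred += len(batch_data)
        self.stdout.write(self.style.SUCCESS(
            f"{len(batch_data)} rows transferred in this batch."))
        self.stdout.write(self.style.SUCCESS(
            f"{total_rows_transferred} rows transferred in total."))
        self.update_last_offset(offset)
except Error as e:
    self.stdout.write(self.style.ERROR(f"Error during transfer: {e}"))
    self.stdout.write(self.style.ERROR(str(traceback.format_exc())))
finally:
    # Close the cursor and the connection
    if db_cursor:
        db_cursor.close()
    if db_conn:
        db_conn.close()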
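
The all-or-nothing behaviour of a batch can be demonstrated without a Django project. The sketch below swaps in the standard-library `sqlite3` module, whose connection object commits or rolls back when used as a context manager; it illustrates the same idea as `transaction.atomic()` but is not Django's API. The `report` table and the `insert_batch` helper are hypothetical names chosen for the example.

```python
import sqlite3


def insert_batch(conn, rows):
    """Insert one batch atomically; return True on commit, False on rollback."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.executemany(
                "INSERT INTO report (search_rank, search_term) VALUES (?, ?)",
                rows,
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report ("
    "search_rank INTEGER NOT NULL, search_term TEXT NOT NULL)"
)

good_batch = [(1, "a"), (2, "b")]
bad_batch = [(3, "c"), (4, None)]  # second row violates NOT NULL

ok_good = insert_batch(conn, good_batch)
ok_bad = insert_batch(conn, bad_batch)
count = conn.execute("SELECT COUNT(*) FROM report").fetchone()[0]
print(ok_good, ok_bad, count)
# True False 2
```

The row `(3, "c")` is valid on its own, yet it does not appear in the table, because the failure of the row after it rolled back the whole batch.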

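Offset Management

To support resuming, the code writes the offset to a file after every batch. On the next run it reads the offset back from that file and continues the migration from where the previous run stopped.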

def get_last_offset(self):
    offset_file = 'migration_offset.txt'
    if os.path.exists(offset_file):
        with open(offset_file, 'r') as file:
            return int(file.read().strip())
    return 0

def update_last_offset(self, offset):
    offset_file = 'migration_offset.txt'
    with open(offset_file, 'w') as file:
        file.write(str(offset))
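
One possible hardening of the offset file, offered as a sketch rather than as part of the original command: writing to a temporary file and then renaming it over the real one means a crash in the middle of a write cannot leave a half-written offset behind. The function names `read_offset` and `write_offset` are hypothetical, and the demo uses a temporary directory instead of the project folder.

```python
import os
import tempfile


def read_offset(path):
    """Return the saved offset, or 0 if no offset file exists yet."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return 0


def write_offset(path, offset):
    """Write the offset to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(str(offset))
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # replaces the old file in a single step
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "migration_offset.txt")
    first = read_offset(demo_path)   # no file yet
    write_offset(demo_path, 5000)
    second = read_offset(demo_path)
    write_offset(demo_path, 10000)
    third = read_offset(demo_path)

print(first, second, third)
# 0 5000 10000
```

A corrupted file still raises `ValueError` here on purpose: silently restarting from 0 would insert every row a second time.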