Batch MD5 Calculation with md5sum in Git Bash

2024-01-20 10:52


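When very large or very important files are transferred over an unstable network, some bytes of a file may be corrupted. Computing the MD5 checksum of each file makes it possible to verify its integrity. For example, a junior schoolmate of mine reported that the current OpenStreetMap export package could not be extracted after being downloaded from a cloud drive, and suggested that I provide an MD5 checksum for every file.

When there are many files, the checksums have to be computed in batch. The simplest way is to use the md5sum that ships with Git (Git Bash on Windows).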

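1. Install git and open bash

Download git from https://git-scm.com/ and install it.

After installation, right-click the folder that was downloaded from the cloud drive and choose "Git Bash" to open a bash prompt in that folder.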

The md5sum help text can be displayed with:

$ md5sum --help
Usage: md5sum [OPTION]... [FILE]...
Print or check MD5 (128-bit) checksums.

With no FILE, or when FILE is -, read standard input.

  -b, --binary         read in binary mode (default unless reading tty stdin)
  -c, --check          read MD5 sums from the FILEs and check them
      --tag            create a BSD-style checksum
  -t, --text           read in text mode (default if reading tty stdin)
  -z, --zero           end each output line with NUL, not newline,
                       and disable file name escaping

The following five options are useful only when verifying checksums:
      --ignore-missing  don't fail or report status for missing files
      --quiet          don't print OK for each successfully verified file
      --status         don't output anything, status code shows success
      --strict         exit non-zero for improperly formatted checksum lines
  -w, --warn           warn about improperly formatted checksum lines

      --help     display this help and exit
      --version  output version information and exit

The sums are computed as described in RFC 1321.  When checking, the input
should be a former output of this program.  The default mode is to print a
line with checksum, a space, a character indicating input mode ('*' for binary,
' ' for text or where binary is insignificant), and name for each FILE.

Note: There is no difference between binary mode and text mode on GNU systems.

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Report any translation bugs to <https://translationproject.org/team/>
Full documentation <https://www.gnu.org/software/coreutils/md5sum>
or available locally via: info '(coreutils) md5sum invocation'
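
As a quick sanity check of the output format described in the help text, the following minimal sketch hashes a known string from standard input and keeps only the checksum field. The variable names line and sum are illustrative, not part of the original article.

```shell
#!/usr/bin/env bash
# Hash a known string read from standard input.
# md5sum prints "<checksum> <mode char><name>"; for standard input the name is "-".
line=$(printf 'hello' | md5sum)
echo "$line"

# Keep only the first field: the 32-character hexadecimal checksum.
# "%% *" strips everything from the first space to the end of the line.
sum=${line%% *}
echo "$sum"   # 5d41402abc4b2a76b9719d911017c592
```

If the printed checksum matches the well-known MD5 of "hello" shown in the comment, md5sum is installed and working in the current shell.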

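2. Batch-compute MD5 checksums

find, a common Linux command, enumerates files and can run a command on each of them.

The following command prints the MD5 of each file to the screen: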

$ find ./Arch*.* -exec md5sum {} \;
d060dd81785d957ae4e2bbd4f9ebeb4e *./ArchOSManjaro.7z.001
b7326e73452d3fbbc56a889f55aa9a14 *./ArchOSManjaro.7z.002
805c9ef68887953554c6c160c2a72eeb *./ArchOSManjaro.7z.003
#...
2cc5ab567abba1d7e3a284ec5c383d84 *./ArchOSManjaro.7z.059
$

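The following command writes the MD5 of each file to md5.txt. The redirection is placed after the complete find command, and > overwrites any earlier md5.txt so that re-running the command does not append duplicate lines:

$ find ./Arch*.* -exec md5sum {} \; > md5.txt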
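
A variant that may be useful when files also sit in subdirectories is sketched below. It is an illustration, not the author's command: the temporary directory, the Demo.7z.* file names, and the sort step are assumptions made so that the example is self-contained.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo directory with three small stand-in "volumes".
demo=$(mktemp -d)
cd "$demo"
for n in 001 002 003; do
    printf 'part %s\n' "$n" > "Demo.7z.$n"
done

# -type f skips directories, and -name matches in subdirectories as well.
# "{} +" passes many file names to one md5sum call instead of starting
# one process per file, as "{} \;" does.
# sort -k 2 orders the lines by file name, so that lists produced on
# different machines line up when they are compared later.
find . -type f -name 'Demo.7z.*' -exec md5sum {} + | sort -k 2 > md5.txt

cat md5.txt
lines=$(wc -l < md5.txt | tr -d ' ')
echo "$lines checksum lines written"
```

Sorting by name is a design choice for the later comparison step: a line-by-line comparison only works when both lists are in the same order.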

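3. Compare the two checksum files

Assume that the locally computed checksums are stored in check.txt and the reference checksums in md5.txt. The two files can then be compared with diff, whose help text is: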

$ diff --help
Usage: diff [OPTION]... FILES
Compare FILES line by line.

Mandatory arguments to long options are mandatory for short options too.
      --normal                  output a normal diff (the default)
  -q, --brief                   report only when files differ
  -s, --report-identical-files  report when two files are the same
  -c, -C NUM, --context[=NUM]   output NUM (default 3) lines of copied context
  -u, -U NUM, --unified[=NUM]   output NUM (default 3) lines of unified context
  -e, --ed                      output an ed script
  -n, --rcs                     output an RCS format diff
  -y, --side-by-side            output in two columns
  -W, --width=NUM               output at most NUM (default 130) print columns
      --left-column             output only the left column of common lines
      --suppress-common-lines   do not output common lines

  -p, --show-c-function         show which C function each change is in
  -F, --show-function-line=RE   show the most recent line matching RE
      --label LABEL             use LABEL instead of file name and timestamp
                                  (can be repeated)

  -t, --expand-tabs             expand tabs to spaces in output
  -T, --initial-tab             make tabs line up by prepending a tab
      --tabsize=NUM             tab stops every NUM (default 8) print columns
      --suppress-blank-empty    suppress space or tab before empty output lines
  -l, --paginate                pass output through 'pr' to paginate it

  -r, --recursive                 recursively compare any subdirectories found
      --no-dereference            don't follow symbolic links
  -N, --new-file                  treat absent files as empty
      --unidirectional-new-file   treat absent first files as empty
      --ignore-file-name-case     ignore case when comparing file names
      --no-ignore-file-name-case  consider case when comparing file names
  -x, --exclude=PAT               exclude files that match PAT
  -X, --exclude-from=FILE         exclude files that match any pattern in FILE
  -S, --starting-file=FILE        start with FILE when comparing directories
      --from-file=FILE1           compare FILE1 to all operands;
                                    FILE1 can be a directory
      --to-file=FILE2             compare all operands to FILE2;
                                    FILE2 can be a directory

  -i, --ignore-case               ignore case differences in file contents
  -E, --ignore-tab-expansion      ignore changes due to tab expansion
  -Z, --ignore-trailing-space     ignore white space at line end
  -b, --ignore-space-change       ignore changes in the amount of white space
  -w, --ignore-all-space          ignore all white space
  -B, --ignore-blank-lines        ignore changes where lines are all blank
  -I, --ignore-matching-lines=RE  ignore changes where all lines match RE

  -a, --text                      treat all files as text
      --strip-trailing-cr         strip trailing carriage return on input
      --binary                    read and write data in binary mode

  -D, --ifdef=NAME                output merged file with '#ifdef NAME' diffs
      --GTYPE-group-format=GFMT   format GTYPE input groups with GFMT
      --line-format=LFMT          format all input lines with LFMT
      --LTYPE-line-format=LFMT    format LTYPE input lines with LFMT
    These format options provide fine-grained control over the output
      of diff, generalizing -D/--ifdef.
    LTYPE is 'old', 'new', or 'unchanged'.  GTYPE is LTYPE or 'changed'.
    GFMT (only) may contain:
      %<  lines from FILE1
      %>  lines from FILE2
      %=  lines common to FILE1 and FILE2
      %[-][WIDTH][.[PREC]]{doxX}LETTER  printf-style spec for LETTER
        LETTERs are as follows for new group, lower case for old group:
          F  first line number
          L  last line number
          N  number of lines = L-F+1
          E  F-1
          M  L+1
      %(A=B?T:E)  if A equals B then T else E
    LFMT (only) may contain:
      %L  contents of line
      %l  contents of line, excluding any trailing newline
      %[-][WIDTH][.[PREC]]{doxX}n  printf-style spec for input line number
    Both GFMT and LFMT may contain:
      %%  %
      %c'C'  the single character C
      %c'\OOO'  the character with octal code OOO
      C    the character C (other characters represent themselves)

  -d, --minimal            try hard to find a smaller set of changes
      --horizon-lines=NUM  keep NUM lines of the common prefix and suffix
      --speed-large-files  assume large files and many scattered small changes
      --color[=WHEN]       color output; WHEN is 'never', 'always', or 'auto';
                             plain --color means --color='auto'
      --palette=PALETTE    the colors to use when --color is active; PALETTE is
                             a colon-separated list of terminfo capabilities

      --help               display this help and exit
  -v, --version            output version information and exit

FILES are 'FILE1 FILE2' or 'DIR1 DIR2' or 'DIR FILE' or 'FILE DIR'.
If --from-file or --to-file is given, there are no restrictions on FILE(s).
If a FILE is '-', read standard input.
Exit status is 0 if inputs are the same, 1 if different, 2 if trouble.

Report bugs to: bug-diffutils@gnu.org
GNU diffutils home page: <https://www.gnu.org/software/diffutils/>
General help using GNU software: <https://www.gnu.org/gethelp/>

Run the comparison:

$ diff  check.txt  md5.txt
36c36
< 23dfa036cd5d772f01173da67ebf2634 *./ArchOSManjaro.7z.029
---
> db2ce0bc5c39fd5d8f672f478788095c *./ArchOSManjaro.7z.029

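The output shows that the checksum of file No. 29 (ArchOSManjaro.7z.029) differs from the reference, so the local copy of that file is damaged.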
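
When the comparison is part of a script, the exit status of diff (0 if the inputs are the same, 1 if they differ) can drive the decision. The sketch below is one possible way to do this and is not from the original article: it rebuilds two short stand-in lists from the checksums shown in this article, and it assumes file names without spaces.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
cd "$demo"

# Stand-in lists built from the checksums shown in the article.
printf '%s\n' \
  'e155774f7dd158ded02d9a9aae68f5eb *./ArchOSManjaro.7z.028' \
  'db2ce0bc5c39fd5d8f672f478788095c *./ArchOSManjaro.7z.029' > md5.txt
printf '%s\n' \
  'e155774f7dd158ded02d9a9aae68f5eb *./ArchOSManjaro.7z.028' \
  '23dfa036cd5d772f01173da67ebf2634 *./ArchOSManjaro.7z.029' > check.txt

# diff -q only reports whether the files differ; its exit status is
# 0 when they are identical and 1 when they differ.
if diff -q check.txt md5.txt > /dev/null; then
    result="all files match"
else
    # Lines starting with "<" come from check.txt (the local list).
    # Field 3 is the file name; the leading "*" mode marker is removed.
    bad=$(diff check.txt md5.txt | awk '/^</ { sub(/^\*/, "", $3); print $3 }')
    result="mismatch: $bad"
fi
echo "$result"   # mismatch: ./ArchOSManjaro.7z.029
```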

With the -y option, diff shows the complete output in two columns, side by side:

$ diff -y check.txt  md5.txt
...
e155774f7dd158ded02d9a9aae68f5eb *./ArchOSManjaro.7z.028        e155774f7dd158ded02d9a9aae68f5eb *./ArchOSManjaro.7z.028
23dfa036cd5d772f01173da67ebf2634 *./ArchOSManjaro.7z.029      | db2ce0bc5c39fd5d8f672f478788095c *./ArchOSManjaro.7z.029
c8c32363ebd14a7eefce1cadaaa64def *./ArchOSManjaro.7z.030        c8c32363ebd14a7eefce1cadaaa64def *./ArchOSManjaro.7z.030
...

Lines that differ are marked with a vertical bar "|".
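
As an alternative to producing check.txt and running diff, the -c (--check) option listed in the md5sum help can verify the files directly against the reference list. The following is a minimal sketch under assumed names: a.bin and b.bin are hypothetical stand-in files created only for the demonstration.

```shell
#!/usr/bin/env bash
export LC_ALL=C                   # keep md5sum messages untranslated

demo=$(mktemp -d)
cd "$demo"

printf 'alpha\n' > a.bin
printf 'beta\n'  > b.bin
md5sum a.bin b.bin > md5.txt      # reference list, as the sender would publish it

printf 'corrupted\n' > b.bin      # simulate damage during download

# -c re-hashes every file named in md5.txt and compares the results.
# --quiet suppresses the "OK" lines, so only problems are printed.
report=$(md5sum -c --quiet md5.txt 2>/dev/null)
status=$?                         # non-zero when any file fails the check

echo "$report"
echo "exit status: $status"
```

With this approach the receiver only needs the reference md5.txt and the downloaded files, and the line order of the list does not matter.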




http://www.chinasem.cn/article/625780
