PSP - 配置 AlphaFold2 的高效 Tensorflow 运行环境

2024-03-27 15:50

本文主要是介绍PSP - 配置 AlphaFold2 的高效 Tensorflow 运行环境,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

欢迎关注我的CSDN:https://spike.blog.csdn.net/
本文地址:https://blog.csdn.net/caroline_wendy/article/details/130560538

AF2

AlphaFold2 是由 DeepMind 开发,可以根据蛋白质的氨基酸序列预测其三维结构,准确度经常可以与实验相媲美。DeepMind 和 EMBL 的欧洲生物信息学研究所合作,创建AlphaFold DB,免费向科学界提供这些预测结果。最新的数据库版本,包含了超过 200 万种蛋白质的结构预测,涵盖人类和其他 20 多种物种的蛋白质组。AlphaFold2 的核心是基于神经网络的计算模型,结合了蛋白质的物理和生物学知识,利用多序列比对(MSA)所设计出的深度学习算法。

1. Docker 环境

命令如下:

# 启动 nvidia-docker 环境
nvidia-docker run -it --name [docker-name] -v [...]:[...] [nvidia-base]:v1.0# 配置 conda
bash Miniconda3-py38_4.10.3-Linux-x86_64.sh
source ~/.bashrc# 创建 alphafold 环境
conda create --name alphafold python==3.8
conda update -n base conda
conda activate alphafold# 配置 conda 库
conda install -y -c conda-forge openmm==7.5.1 cudatoolkit==11.2.2 pdbfixer
conda install -y -c bioconda hmmer hhsuite==3.3.0 kalign2# 再次更新
conda install -y -c conda-forge openmm==7.7.0 
conda install -y -c conda-forge pdbfixer==1.8.1# 配置 pip 库, tensorflow-gpu 或 tensorflow-cpu,根据机器选择
pip install absl-py==1.0.0 biopython==1.79 chex==0.0.7 dm-haiku==0.0.9 dm-tree==0.1.6 immutabledict==2.0.0 jax==0.3.25 ml-collections==0.1.0 numpy==1.21.6 pandas==1.3.4 protobuf==3.20.1 scipy==1.7.0 tensorflow-gpu==2.9.0 -i https://pypi.tuna.tsinghua.edu.cn/simple/# 配置 pip jax 库
pip install --upgrade --no-cache-dir jax==0.3.25 jaxlib==0.3.25+cuda11.cudnn805 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html# 导出环境
export alphafold_path="$(pwd)"# 下载配置文件
wget -q -P $alphafold_path/alphafold/common/ https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt# 配置 openmm.patch
git checkout v2.3.1  # 最新版本删除 openmm.patch
cd ~/miniconda3/envs/alphafold/lib/python3.8/site-packages/
patch -p0 < $alphafold_path/docker/openmm.patch

测试 Tensorflow 是否安装成功,以及 GPU 是否启动:

python3  # 进入命令行import tensorflow as tfprint(f"is_gpu_available: {tf.test.is_gpu_available()}")
gpu_device_name = tf.test.gpu_device_name()
print(f"gpu_device_name: {gpu_device_name}")from tensorflow.python.client import device_lib 
# 列出所有的本地机器设备
local_device_protos = device_lib.list_local_devices()
# 只打印GPU设备
print(x) for x in local_device_protos if x.device_type == 'GPU'

保存和复用 docker,命令如下:

# 保存环境
docker ps -l
docker commit [container-id] af2:v1.0
docker save af2:v1.0 | gzip > af2_v1.tar.gz# 加载环境
docker image load -i af2_v1.tar.gz
nvidia-docker run -it --name [docker-name] -v [...]:[...] af2:v1.0

如需更换 Tensorflow 的 CPU 或 GPU 配置,先卸载再更新即可:

pip uninstall tensorflow-cpu tensorflow-estimator tensorflow-io-gcs-filesystem
pip install tensorflow-gpu==2.9.0 -i https://pypi.tuna.tsinghua.edu.cn/simple/

Bug1: OpenMM 相关 Bug

如遇 OpenMM Bug,以及解决方案:

openmm Bug 1: conda install -c conda-forge openmm==7.7.0

Traceback (most recent call last):File "run_alphafold.py", line 41, in <module>from alphafold.relax import relaxFile "alphafold/relax/relax.py", line 18, in <module>from alphafold.relax import amber_minimizeFile "alphafold/relax/amber_minimize.py", line 25, in <module>from alphafold.relax import cleanupFile "alphafold/relax/cleanup.py", line 23, in <module>from openmm import app
ModuleNotFoundError: No module named 'openmm'

pdbfixer Bug2:conda install -c conda-forge pdbfixer==1.8.1

Traceback (most recent call last):File "run_alphafold.py", line 41, in <module>from alphafold.relax import relaxFile "alphafold/relax/relax.py", line 18, in <module>from alphafold.relax import amber_minimizeFile "alphafold/relax/amber_minimize.py", line 25, in <module>from alphafold.relax import cleanupFile "alphafold/relax/cleanup.py", line 22, in <module>import pdbfixerFile "/root/miniconda3/envs/alphafold/lib/python3.8/site-packages/pdbfixer/__init__.py", line 2, in <module>from .pdbfixer import PDBFixerFile "/root/miniconda3/envs/alphafold/lib/python3.8/site-packages/pdbfixer/pdbfixer.py", line 38, in <module>from simtk.openmm.app.internal.pdbstructure import PdbStructure
ModuleNotFoundError: No module named 'simtk.openmm.app.internal'

参考:PSP - 替换 MSA 数据库 以及 OpenMM 和 mmCIF 异常

Bug2: Collecting package metadata (repodata.json): / Killed

参考:StackOverflow - Collecting package metadata (repodata.json): / Killed

显存 RAM 过低,提升显存 0.5GB 至 8GB + 即可。

2. 配置数据库

参考:官方GitHub:GitHub - deepmind/alphafold

2.1 AlphaFold2 Model

目前,最新版本 (2023.5.7) 是 alphafold_params_2022-12-06

下载命令:

mkdir params
cd params/
wget -P . https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar   # 5.2G
tar --extract --verbose --file="alphafold_params_2022-12-06.tar" --directory="." --preserve-permissions

模型参数说明:

MP

2.2 Small BFD

下载命令:

mkdir small_bfd
cd small_bfd/
wget -P . https://storage.googleapis.com/alphafold-databases/reduced_dbs/bfd-first_non_consensus_sequences.fasta.gz  # 9.6G
gunzip "bfd-first_non_consensus_sequences.fasta.gz"

2.3 数据库配置

其他数据库,根据工程自行下载。将已有的数据库,配置到一个数据文件夹中,可以使用软连接的方式,即 ln -s,数据库如下:

bfd/					# 多个文件的相同前缀
mgnify/				# fa文件,64G
params/   		# 模型参数,最新版本2022-12-06,monomer,monomer-ptm,multimer_v3
pdb70/				# 文件夹
pdb_mmcif/		# 文件夹
pdb_seqres/		# multimer使用txt,208M
small_bfd/		# bfd的fasta文件,17G
uniprot/			# fasta文件,98G,注意版本信息
uniref30/			# 多个文件的相同前缀,注意日期
uniref90/			# fasta文件,59G

3. 配置脚本

修改运行脚本:run_alphafold.sh

修改数据库配置,注意 uniref30 的不同版本信息,配置如下:

# Path and user config (change me if required)
uniref90_database_path="$data_dir/uniref90/uniref90.fasta"
uniprot_database_path="$data_dir/uniprot/uniprot.fasta"
mgnify_database_path="$data_dir/mgnify/mgy_clusters_2022_05.fa"
bfd_database_path="$data_dir/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt"
small_bfd_database_path="$data_dir/small_bfd/bfd-first_non_consensus_sequences.fasta"
# uniref30_database_path="$data_dir/uniref30/UniRef30_2021_03"
uniref30_database_path="$data_dir/uniref30/uniclust30_2018_08"
pdb70_database_path="$data_dir/pdb70/pdb70"
pdb_seqres_database_path="$data_dir/pdb_seqres/pdb_seqres.txt"
template_mmcif_dir="$data_dir/pdb_mmcif/mmcif_files"
obsolete_pdbs_path="$data_dir/pdb_mmcif/obsolete.dat"

修改 MSA 搜索工具位置,配置如下:

hhblits_binary_path="/root/miniconda3/envs/alphafold/bin/hhblits"
hhsearch_binary_path="/root/miniconda3/envs/alphafold/bin/hhsearch"
jackhmmer_binary_path="/root/miniconda3/envs/alphafold/bin/jackhmmer"
kalign_binary_path="/root/miniconda3/envs/alphafold/bin/kalign"

修改 数据库位置 与 最大模版日期,配置如下:

if [[ "$data_dir" == "" || "$output_dir" == "" || "$fasta_path" == "" || "$max_template_date" == "" ]] ; thendata_dir=[my data dir];max_template_date="2022-04-01";
fi

搜索 MSA 的过程,在 AF2 推理运行中,占用时间较长,修改优先使用已有 MSA 文件,如下:

if [[ "$use_precomputed_msas" == "" ]] ; thenuse_precomputed_msas="true"
fi

4. 配置源码

加速搜索 MSA 的过程,需要修改 CPU 数量,默认是8个。查询 Linux 的 GPU 数量,如下:

lscpu | grep 'CPU(s):' | head -1 | awk '{print $2}'   # 查询 CPU 数量

修改文件 alphafold/data/tools/hhblits.py,如下:

                binary_path: str,databases: Sequence[str],
-               n_cpu: int = 4,
+               n_cpu: int = [your num],n_iter: int = 3,e_value: float = 0.001,maxseq: int = 1_000_000,

修改文件 alphafold/data/tools/hmmsearch.py,如下:

       cmd = [self.binary_path,'--noali',  # Don't include the alignment in stdout.
-          '--cpu', '8'
+          '--cpu', '[your num]']# If adding flags, we have to do so before the output and input:if self.flags:

修改文件 alphafold/data/tools/jackhmmer.py,如下:

                binary_path: str,database_path: str,
-               n_cpu: int = 8,
+               n_cpu: int = [your num],n_iter: int = 1,e_value: float = 0.0001,z_value: Optional[int] = None,

也可以修改 monomer_casp14 模式的默认模型,由 monomer 替换为 monomer_ptm,如下:

-MODEL_PRESETS['monomer_casp14'] = MODEL_PRESETS['monomer']
+# MODEL_PRESETS['monomer_casp14'] = MODEL_PRESETS['monomer']
+MODEL_PRESETS['monomer_casp14'] = MODEL_PRESETS['monomer_ptm']

其中,pTM 模型:

pTM models were fine-tuned to produce pTM (predicted TM-score) and (PAE) predicted aligned error values alongside their structure predictions.

pTM 模型经过微调 (基于monomer),在进行结构预测时,产生 pTM(预测的TM得分)和 PAE(预测的对齐误差)值。

5. 推理序列

推理命令:

bash run_alphafold.sh -o mydata/output/ -f mydata/query.fasta -m monomer_casp14 -c full_dbs

seq:

>dummy_sequence
GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE

主要输出:

  1. pdb,最好的结构是 ranked_0.pdb
  2. ranking_debug.json,pdb排名
  3. msas,搜索出的 MSA 文件,即mgnify_hits.stopdb_hits.hhrbfd_uniref_hits.a3muniref90_hits.sto 等。
  4. timings.json,运行耗时。

其中 ranking_debug.json,如下:

{"plddts": {"model_1_pred_0": 86.26850453604357,"model_2_pred_0": 85.06505646965638,"model_3_pred_0": 87.40822765097714,"model_4_pred_0": 84.71053426936133,"model_5_pred_0": 82.69870802756033},"order": ["model_3_pred_0","model_1_pred_0","model_2_pred_0","model_4_pred_0","model_5_pred_0"]
}

其中timings.json,如下:

{"features": 103.40737819671631,"process_features_model_1_pred_0": 3.8775177001953125,"predict_and_compile_model_1_pred_0": 116.74437546730042,"relax_model_1_pred_0": 11.63992977142334,"process_features_model_2_pred_0": 1.3910491466522217,"predict_and_compile_model_2_pred_0": 114.51620531082153,"relax_model_2_pred_0": 5.43536114692688,"process_features_model_3_pred_0": 1.1890630722045898,"predict_and_compile_model_3_pred_0": 87.88086938858032,"relax_model_3_pred_0": 5.768261194229126,"process_features_model_4_pred_0": 1.1486437320709229,"predict_and_compile_model_4_pred_0": 87.95040488243103,"relax_model_4_pred_0": 5.295060873031616,"process_features_model_5_pred_0": 1.2103533744812012,"predict_and_compile_model_5_pred_0": 88.90721249580383,"relax_model_5_pred_0": 5.518966436386108
}

输出的最优PDB结构,如下:

PDB

参考

  1. GitHub - deepmind/alphafold
  2. GitHub - kalininalab/alphafold_non_docker

源码如下:

#!/bin/bashusage() {echo ""echo "Please make sure all required parameters are given"echo "Usage: $0 <OPTIONS>"echo "Required Parameters:"echo "-d <data_dir>         Path to directory of supporting data"echo "-o <output_dir>       Path to a directory that will store the results."echo "-f <fasta_paths>      Path to FASTA files containing sequences. If a FASTA file contains multiple sequences, then it will be folded as a multimer. To fold more sequences one after another, write the files separated by a comma"echo "-t <max_template_date> Maximum template release date to consider (ISO-8601 format - i.e. YYYY-MM-DD). Important if folding historical test sets"echo "Optional Parameters:"echo "-g <use_gpu>          Enable NVIDIA runtime to run with GPUs (default: true)"echo "-r <run_relax>        Whether to run the final relaxation step on the predicted models. Turning relax off might result in predictions with distracting stereochemical violations but might help in case you are having issues with the relaxation stage (default: true)"echo "-e <enable_gpu_relax> Run relax on GPU if GPU is enabled (default: true)"echo "-n <openmm_threads>   OpenMM threads (default: all available cores)"echo "-a <gpu_devices>      Comma separated list of devices to pass to 'CUDA_VISIBLE_DEVICES' (default: 0)"echo "-m <model_preset>     Choose preset model configuration - the monomer model, the monomer model with extra ensembling, monomer model with pTM head, or multimer model (default: 'monomer')"echo "-c <db_preset>        Choose preset MSA database configuration - smaller genetic database config (reduced_dbs) or full genetic database config (full_dbs) (default: 'full_dbs')"echo "-p <use_precomputed_msas> Whether to read MSAs that have been written to disk. WARNING: This will not check if the sequence, database or configuration have changed (default: 'false')"echo "-l <num_multimer_predictions_per_model> How many predictions (each with a different random seed) will be generated per model. E.g. if this is 2 and there are 5 models then there will be 10 predictions per input. Note: this FLAG only applies if model_preset=multimer (default: 5)"echo "-b <benchmark>        Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time required for inferencing many proteins (default: 'false')"echo ""exit 1
}while getopts ":d:o:f:t:g:r:e:n:a:m:c:p:l:b:" i; docase "${i}" ind)data_dir=$OPTARG;;o)output_dir=$OPTARG;;f)fasta_path=$OPTARG;;t)max_template_date=$OPTARG;;g)use_gpu=$OPTARG;;r)run_relax=$OPTARG;;e)enable_gpu_relax=$OPTARG;;n)openmm_threads=$OPTARG;;a)gpu_devices=$OPTARG;;m)model_preset=$OPTARG;;c)db_preset=$OPTARG;;p)use_precomputed_msas=$OPTARG;;l)num_multimer_predictions_per_model=$OPTARG;;b)benchmark=$OPTARG;;esac
done# Parse input and set defaults
if [[ "$data_dir" == "" || "$output_dir" == "" || "$fasta_path" == "" || "$max_template_date" == "" ]] ; thenusage
fiif [[ "$benchmark" == "" ]] ; thenbenchmark=false
fiif [[ "$use_gpu" == "" ]] ; thenuse_gpu=true
fiif [[ "$gpu_devices" == "" ]] ; thengpu_devices=0
fiif [[ "$run_relax" == "" ]] ; thenrun_relax="true"
fiif [[ "$enable_gpu_relax" == "" ]] ; thenenable_gpu_relax="true"
fiif [[ "$enable_gpu_relax" == true && "$use_gpu" == true ]] ; thenuse_gpu_relax="true"
elseuse_gpu_relax="false"
fiif [[ "$num_multimer_predictions_per_model" == "" ]] ; thennum_multimer_predictions_per_model=5
fiif [[ "$model_preset" == "" ]] ; thenmodel_preset="monomer"
fiif [[ "$model_preset" != "monomer" && "$model_preset" != "monomer_casp14" && "$model_preset" != "monomer_ptm" && "$model_preset" != "multimer" ]] ; thenecho "Unknown model preset! Using default ('monomer')"model_preset="monomer"
fiif [[ "$db_preset" == "" ]] ; thendb_preset="full_dbs"
fiif [[ "$db_preset" != "full_dbs" && "$db_preset" != "reduced_dbs" ]] ; thenecho "Unknown database preset! Using default ('full_dbs')"db_preset="full_dbs"
fiif [[ "$use_precomputed_msas" == "" ]] ; thenuse_precomputed_msas="false"
fi# This bash script looks for the run_alphafold.py script in its current working directory, if it does not exist then exits
current_working_dir=$(pwd)
alphafold_script="$current_working_dir/run_alphafold.py"if [ ! -f "$alphafold_script" ]; thenecho "Alphafold python script $alphafold_script does not exist."exit 1
fi# Export ENVIRONMENT variables and set CUDA devices for use
# CUDA GPU control
export CUDA_VISIBLE_DEVICES=-1
if [[ "$use_gpu" == true ]] ; thenexport CUDA_VISIBLE_DEVICES=0if [[ "$gpu_devices" ]] ; thenexport CUDA_VISIBLE_DEVICES=$gpu_devicesfi
fi# OpenMM threads control
if [[ "$openmm_threads" ]] ; thenexport OPENMM_CPU_THREADS=$openmm_threads
fi# TensorFlow control
export TF_FORCE_UNIFIED_MEMORY='1'# JAX control
export XLA_PYTHON_CLIENT_MEM_FRACTION='4.0'# Path and user config (change me if required)
uniref90_database_path="$data_dir/uniref90/uniref90.fasta"
uniprot_database_path="$data_dir/uniprot/uniprot.fasta"
mgnify_database_path="$data_dir/mgnify/mgy_clusters_2022_05.fa"
bfd_database_path="$data_dir/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt"
small_bfd_database_path="$data_dir/small_bfd/bfd-first_non_consensus_sequences.fasta"
uniref30_database_path="$data_dir/uniref30/UniRef30_2021_03"
pdb70_database_path="$data_dir/pdb70/pdb70"
pdb_seqres_database_path="$data_dir/pdb_seqres/pdb_seqres.txt"
template_mmcif_dir="$data_dir/pdb_mmcif/mmcif_files"
obsolete_pdbs_path="$data_dir/pdb_mmcif/obsolete.dat"# Binary path (change me if required)
hhblits_binary_path=$(which hhblits)
hhsearch_binary_path=$(which hhsearch)
jackhmmer_binary_path=$(which jackhmmer)
kalign_binary_path=$(which kalign)command_args="--fasta_paths=$fasta_path --output_dir=$output_dir --max_template_date=$max_template_date --db_preset=$db_preset --model_preset=$model_preset --benchmark=$benchmark --use_precomputed_msas=$use_precomputed_msas --num_multimer_predictions_per_model=$num_multimer_predictions_per_model --run_relax=$run_relax --use_gpu_relax=$use_gpu_relax --logtostderr"database_paths="--uniref90_database_path=$uniref90_database_path --mgnify_database_path=$mgnify_database_path --data_dir=$data_dir --template_mmcif_dir=$template_mmcif_dir --obsolete_pdbs_path=$obsolete_pdbs_path"binary_paths="--hhblits_binary_path=$hhblits_binary_path --hhsearch_binary_path=$hhsearch_binary_path --jackhmmer_binary_path=$jackhmmer_binary_path --kalign_binary_path=$kalign_binary_path"if [[ $model_preset == "multimer" ]]; thendatabase_paths="$database_paths --uniprot_database_path=$uniprot_database_path --pdb_seqres_database_path=$pdb_seqres_database_path"
elsedatabase_paths="$database_paths --pdb70_database_path=$pdb70_database_path"
fiif [[ "$db_preset" == "reduced_dbs" ]]; thendatabase_paths="$database_paths --small_bfd_database_path=$small_bfd_database_path"
elsedatabase_paths="$database_paths --uniref30_database_path=$uniref30_database_path --bfd_database_path=$bfd_database_path"
fi# Run AlphaFold with required parameters
$(python $alphafold_script $binary_paths $database_paths $command_args)

这篇关于PSP - 配置 AlphaFold2 的高效 Tensorflow 运行环境的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/852633

相关文章

python常见环境管理工具超全解析

《python常见环境管理工具超全解析》在Python开发中,管理多个项目及其依赖项通常是一个挑战,下面:本文主要介绍python常见环境管理工具的相关资料,文中通过代码介绍的非常详细,需要的朋友... 目录1. conda2. pip3. uvuv 工具自动创建和管理环境的特点4. setup.py5.

C++高效内存池实现减少动态分配开销的解决方案

《C++高效内存池实现减少动态分配开销的解决方案》C++动态内存分配存在系统调用开销、碎片化和锁竞争等性能问题,内存池通过预分配、分块管理和缓存复用解决这些问题,下面就来了解一下... 目录一、C++内存分配的性能挑战二、内存池技术的核心原理三、主流内存池实现:TCMalloc与Jemalloc1. TCM

Redis Cluster模式配置

《RedisCluster模式配置》:本文主要介绍RedisCluster模式配置,本文给大家介绍的非常详细,对大家的学习或工作具有一定的参考借鉴价值,需要的朋友参考下吧... 目录分片 一、分片的本质与核心价值二、分片实现方案对比 ‌三、分片算法详解1. ‌范围分片(顺序分片)‌2. ‌哈希分片3. ‌虚

SpringBoot项目配置logback-spring.xml屏蔽特定路径的日志

《SpringBoot项目配置logback-spring.xml屏蔽特定路径的日志》在SpringBoot项目中,使用logback-spring.xml配置屏蔽特定路径的日志有两种常用方式,文中的... 目录方案一:基础配置(直接关闭目标路径日志)方案二:结合 Spring Profile 按环境屏蔽关

Python中使用uv创建环境及原理举例详解

《Python中使用uv创建环境及原理举例详解》uv是Astral团队开发的高性能Python工具,整合包管理、虚拟环境、Python版本控制等功能,:本文主要介绍Python中使用uv创建环境及... 目录一、uv工具简介核心特点:二、安装uv1. 通过pip安装2. 通过脚本安装验证安装:配置镜像源(可

Maven 配置中的 <mirror>绕过 HTTP 阻断机制的方法

《Maven配置中的<mirror>绕过HTTP阻断机制的方法》:本文主要介绍Maven配置中的<mirror>绕过HTTP阻断机制的方法,本文给大家分享问题原因及解决方案,感兴趣的朋友一... 目录一、问题场景:升级 Maven 后构建失败二、解决方案:通过 <mirror> 配置覆盖默认行为1. 配置示

Springboot3+将ID转为JSON字符串的详细配置方案

《Springboot3+将ID转为JSON字符串的详细配置方案》:本文主要介绍纯后端实现Long/BigIntegerID转为JSON字符串的详细配置方案,s基于SpringBoot3+和Spr... 目录1. 添加依赖2. 全局 Jackson 配置3. 精准控制(可选)4. OpenAPI (Spri

Python基于微信OCR引擎实现高效图片文字识别

《Python基于微信OCR引擎实现高效图片文字识别》这篇文章主要为大家详细介绍了一款基于微信OCR引擎的图片文字识别桌面应用开发全过程,可以实现从图片拖拽识别到文字提取,感兴趣的小伙伴可以跟随小编一... 目录一、项目概述1.1 开发背景1.2 技术选型1.3 核心优势二、功能详解2.1 核心功能模块2.

maven私服配置全过程

《maven私服配置全过程》:本文主要介绍maven私服配置全过程,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录使用Nexus作为 公司maven私服maven 私服setttings配置maven项目 pom配置测试效果总结使用Nexus作为 公司maven私

基于Python构建一个高效词汇表

《基于Python构建一个高效词汇表》在自然语言处理(NLP)领域,构建高效的词汇表是文本预处理的关键步骤,本文将解析一个使用Python实现的n-gram词频统计工具,感兴趣的可以了解下... 目录一、项目背景与目标1.1 技术需求1.2 核心技术栈二、核心代码解析2.1 数据处理函数2.2 数据处理流程