DeepMD Quick Start | Eastsheng's Wiki


This tutorial uses a gas-phase methane molecule as an example to walk through training and applying a Deep Potential (DP) model.

Manual installation of the CPU version


wget https://github.com/deepmodeling/deepmd-kit/releases/download/v3.0.0b3/deepmd-kit-3.0.0b3-cpu-Linux-x86_64.sh
bash deepmd-kit-3.0.0b3-cpu-Linux-x86_64.sh
conda activate /home/xxx/softwares/deepmd-kit

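During installation you can choose a custom install path, for example /home/xxx/softwares/miniconda/miniconda3/envs/deepmd-kit.

Then activate the environment with conda activate, pointing it at the path chosen during installation.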

Data preparation

wget https://bohrium-api.dp.tech/ds-dl/DeePMD-kit-Tutorial-a8z5-v1.zip
unzip DeePMD-kit-Tutorial-a8z5-v1.zip
cd DeePMD-kit_Tutorial
tree -L 1

.
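├── 00.data: stores the training and test data
├── 01.train: example scripts for training a model with DeePMD-kit
├── 01.train.finished: the complete results of a training run
├── 02.lmp: example scripts for molecular dynamics simulation with LAMMPS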
└── 02.lmp.finished

5 directories, 0 files

tree 00.data/ -L 1

00.data/
├── abacus_md: data obtained from ab initio molecular dynamics (AIMD) simulation with ABACUS
├── training_data
└── validation_data

3 directories, 0 files

DeePMD-kit training data comes from first-principles calculations and includes atomic types, simulation cells, atomic coordinates, atomic forces, system energies, and virials.
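To make the list of quantities above more concrete, the sketch below prints the array shapes one would expect inside a set.* directory of the deepmd/npy format. The helper name `deepmd_npy_shapes` and the frame count of 10 are illustrative assumptions; the 5 atoms correspond to one CH4 molecule.

```python
def deepmd_npy_shapes(nframes: int, natoms: int) -> dict:
    """Expected array shapes inside one set.* directory (deepmd/npy format)."""
    return {
        "box.npy": (nframes, 9),             # simulation cell, 3x3 flattened, in Å
        "coord.npy": (nframes, natoms * 3),  # atomic coordinates, in Å
        "energy.npy": (nframes,),            # system energy per frame, in eV
        "force.npy": (nframes, natoms * 3),  # atomic forces, in eV/Å
        "virial.npy": (nframes, 9),          # virial tensor, 3x3 flattened, in eV
    }


# Hypothetical example: 10 frames of a single methane molecule (4 H + 1 C).
shapes = deepmd_npy_shapes(nframes=10, natoms=5)
for name, shape in shapes.items():
    print(f"{name:12s} {shape}")
```

Atomic types are not stored per frame; they live in type.raw and type_map.raw next to the set.* directories.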

Use the dpdata tool to split the data:

pip install dpdata
python split_data.py

# split_data.py
import dpdata
import numpy as np

# load data of abacus/md format
data = dpdata.LabeledSystem("../DeePMD-kit_Tutorial/00.data/abacus_md", fmt="abacus/md")
print("# the data contains %d frames" % len(data))

# randomly choose 40 frame indices for validation_data
n_frames = len(data)
rng = np.random.default_rng()
index_validation = rng.choice(n_frames, size=40, replace=False)

# all remaining indices are training_data
index_training = sorted(set(range(n_frames)) - set(index_validation.tolist()))
data_training = data.sub_system(index_training)
data_validation = data.sub_system(index_validation)

# all training data put into directory:"training_data"
data_training.to_deepmd_npy("../DeePMD-kit_Tutorial/00.data/training_data")

# all validation data put into directory:"validation_data"
data_validation.to_deepmd_npy("../DeePMD-kit_Tutorial/00.data/validation_data")

print("# the training data contains %d frames" % len(data_training))
print("# the validation data contains %d frames" % len(data_validation))

# the data contains 201 frames
# the training data contains 161 frames
# the validation data contains 40 frames
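The split above is random, so every run picks different validation frames. If a reproducible split is wanted, one option is to fix the seed. The following is a minimal sketch using only the standard library; the function name `split_indices` and the seed value are assumptions, not part of split_data.py.

```python
import random


def split_indices(n_frames, n_validation, seed=0):
    """Return (training, validation) frame indices for a reproducible split."""
    rng = random.Random(seed)  # fixed seed -> same split on every run
    validation = sorted(rng.sample(range(n_frames), n_validation))
    chosen = set(validation)
    training = [i for i in range(n_frames) if i not in chosen]
    return training, validation


# 201 frames and 40 validation frames, as in the tutorial data.
index_training, index_validation = split_indices(201, 40, seed=42)
print(len(index_training), len(index_validation))
```

The two lists could then be passed to `data.sub_system(...)` in the same way as in split_data.py. With NumPy, passing a seed to `np.random.default_rng(seed)` has the same effect.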

00.data/training_data/
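├── set.000: a directory holding the data as NumPy binary arrays (.npy files)
├── type_map.raw: a file containing the atom type names
└── type.raw: a file containing the atom types (represented as integers)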

1 directory, 2 files
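The relation between the two type files can be sketched as follows: type.raw holds one integer per atom, and each integer is an index into the list of names in type_map.raw. The file contents below are assumed for a single CH4 molecule and are not read from the tutorial data.

```python
from collections import Counter

# Assumed contents for one methane molecule (hypothetical, not read from disk).
type_map = ["H", "C"]         # type_map.raw: one type name per line
atom_types = [0, 0, 0, 0, 1]  # type.raw: one integer per atom, indexing type_map

# Translate the integer types into element names.
atom_names = [type_map[t] for t in atom_types]
composition = Counter(atom_names)

print(atom_names)
print(dict(composition))
```

The order of type_map has to match the "type_map" entry of the training input, otherwise the integers would point at the wrong elements.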

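Preparing the input script

Once the data is prepared, training can begin.

Now go to the training directory. DeePMD-kit needs a JSON file that specifies the training parameters.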

pip show dargs || pip install --upgrade dargs

# input_pre.py
# Show input.json
from deepmd.utils.argcheck import gen_args
from dargs.notebook import JSON

with open("../DeePMD-kit_Tutorial/01.train/input.json") as f:
    JSON(f.read(), gen_args())

{
    "_comment": " model parameters",
    "model": {
	"type_map":	["H", "C"],
	"descriptor" :{
	    "type":		"se_e2_a", # type of the descriptor
	    "sel":		"auto",
	    "rcut_smth":	0.50, # where the smoothing starts
	    "rcut":		6.00,
	    "neuron":		[25, 50, 100], # layer sizes of the embedding neural network
	    "resnet_dt":	false,
	    "axis_neuron":	16, # size of the submatrix of G (the embedding matrix)
	    "seed":		1,
	    "_comment":		" that's all"
	},
	"fitting_net" : {
	    "neuron":		[240, 240, 240], # layer sizes of the fitting neural network
	    "resnet_dt":	true,
	    "seed":		1,
	    "_comment":		" that's all"
	},
	"_comment":	" that's all"
    },

    "learning_rate" :{
	"type":		"exp",
	"decay_steps":	50,
	"start_lr":	0.001,	
	"stop_lr":	3.51e-8,
	"_comment":	"that's all"
    },

    "loss" :{
	"type":		"ener",
	"start_pref_e":	0.02,
	"limit_pref_e":	1,
	"start_pref_f":	1000,
	"limit_pref_f":	1,
	"start_pref_v":	0,
	"limit_pref_v":	0,
	"_comment":	" that's all"
    },

    "training" : { # training parameters
	"training_data": {
	    "systems":     ["../00.data/training_data"],
	    "batch_size":  "auto",
	    "_comment":	   "that's all"
	},
	"validation_data":{
	    "systems":	   ["../00.data/validation_data"],
	    "batch_size":  "auto",
	    "numb_btch":   1,
	    "_comment":	   "that's all"
	},
	"numb_steps":	10000,
	"seed":		10,
	"disp_file":	"lcurve.out",
	"disp_freq":	200,
	"save_freq":	1000,
	"_comment":	"that's all"
    },    

    "_comment":		"that's all"
}
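The "learning_rate" and "loss" sections above interact: as I understand the DeePMD-kit documentation, the "exp" schedule decays the learning rate in a staircase every decay_steps steps, and each loss prefactor is interpolated between its start and limit value in proportion to the current learning rate. The sketch below reproduces that arithmetic with the values from this input.json; the function names are illustrative, and the formulas are an assumption based on the documentation rather than code taken from the package.

```python
import math

# Values copied from the "learning_rate" and "training" sections above.
START_LR, STOP_LR = 1.0e-3, 3.51e-8
DECAY_STEPS, NUMB_STEPS = 50, 10000

# Decay rate chosen so that the rate reaches STOP_LR after NUMB_STEPS steps.
decay_rate = math.exp(math.log(STOP_LR / START_LR) * DECAY_STEPS / NUMB_STEPS)


def learning_rate(step):
    """Staircase exponential decay: the rate drops once every DECAY_STEPS steps."""
    return START_LR * decay_rate ** (step // DECAY_STEPS)


def prefactor(step, start_pref, limit_pref):
    """Loss prefactor interpolated between start and limit by lr(step) / START_LR."""
    ratio = learning_rate(step) / START_LR
    return limit_pref * (1.0 - ratio) + start_pref * ratio


for step in (0, 200, 400, 600, 800, 1000):
    print(
        f"step {step:5d}  lr = {learning_rate(step):.2e}  "
        f"pref_e = {prefactor(step, 0.02, 1):.3f}  "
        f"pref_f = {prefactor(step, 1000, 1):.1f}"
    )
```

With these settings the force term dominates early in training (pref_f starts at 1000) and the energy term gains weight as the learning rate decays. The virial prefactors are 0 at both ends, so virials do not contribute to the loss here.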

Model training

Run DeePMD-kit to start training:

cd ./01.train
dp train input.json

# screen output
[2024-09-24 12:48:59,897] DEEPMD INFO    batch       0: trn: rmse = 1.86e+01, rmse_e = 1.35e-01, rmse_f = 5.87e-01, lr = 1.00e-03
[2024-09-24 12:48:59,897] DEEPMD INFO    batch       0: val: rmse = 1.91e+01, rmse_e = 1.34e-01, rmse_f = 6.05e-01
[2024-09-24 12:49:11,313] DEEPMD INFO    batch     200: trn: rmse = 4.95e+00, rmse_e = 1.99e+00, rmse_f = 1.59e-01, lr = 8.15e-04
[2024-09-24 12:49:11,314] DEEPMD INFO    batch     200: val: rmse = 5.72e+00, rmse_e = 1.99e+00, rmse_f = 1.88e-01
[2024-09-24 12:49:11,314] DEEPMD INFO    batch     200: total wall time = 11.68 s
[2024-09-24 12:49:23,078] DEEPMD INFO    batch     400: trn: rmse = 4.44e+00, rmse_e = 5.28e-01, rmse_f = 1.70e-01, lr = 6.63e-04
[2024-09-24 12:49:23,078] DEEPMD INFO    batch     400: val: rmse = 2.95e+00, rmse_e = 5.24e-01, rmse_f = 1.11e-01
[2024-09-24 12:49:23,078] DEEPMD INFO    batch     400: total wall time = 11.76 s
[2024-09-24 12:49:36,968] DEEPMD INFO    batch     600: trn: rmse = 2.00e+00, rmse_e = 1.01e-01, rmse_f = 8.59e-02, lr = 5.40e-04
[2024-09-24 12:49:36,968] DEEPMD INFO    batch     600: val: rmse = 2.70e+00, rmse_e = 1.03e-01, rmse_f = 1.16e-01
[2024-09-24 12:49:36,968] DEEPMD INFO    batch     600: total wall time = 13.89 s
[2024-09-24 12:49:48,904] DEEPMD INFO    batch     800: trn: rmse = 1.25e+00, rmse_e = 4.03e-02, rmse_f = 5.93e-02, lr = 4.40e-04
[2024-09-24 12:49:48,904] DEEPMD INFO    batch     800: val: rmse = 1.85e+00, rmse_e = 4.16e-02, rmse_f = 8.83e-02
[2024-09-24 12:49:48,904] DEEPMD INFO    batch     800: total wall time = 11.94 s
[2024-09-24 12:50:01,073] DEEPMD INFO    batch    1000: trn: rmse = 1.62e+00, rmse_e = 1.60e-02, rmse_f = 8.56e-02, lr = 3.59e-04
[2024-09-24 12:50:01,074] DEEPMD INFO    batch    1000: val: rmse = 1.56e+00, rmse_e = 1.57e-02, rmse_f = 8.23e-02
[2024-09-24 12:50:01,074] DEEPMD INFO    batch    1000: total wall time = 12.17 s
[2024-09-24 12:50:01,347] DEEPMD INFO    saved checkpoint model.ckpt
...
[2024-09-24 12:59:24,280] DEEPMD INFO    batch   10000: total wall time = 15.22 s
[2024-09-24 12:59:24,565] DEEPMD INFO    saved checkpoint model.ckpt
[2024-09-24 12:59:24,565] DEEPMD INFO    average training time: 0.0616 s/batch (exclude first 200 batches)
[2024-09-24 12:59:24,565] DEEPMD INFO    finished training
[2024-09-24 12:59:24,565] DEEPMD INFO    wall time: 625.217 s
WARNING:tensorflow:disable_mixed_precision_graph_rewrite() called when mixed precision is already disabled.

# the lcurve.out output file
#  step      rmse_val    rmse_trn    rmse_e_val  rmse_e_trn    rmse_f_val  rmse_f_trn         lr
# If there is no available reference data, rmse_*_{val,trn} will print nan
      0      1.91e+01    1.86e+01      1.34e-01    1.35e-01      6.05e-01    5.87e-01    1.0e-03
    200      5.72e+00    4.95e+00      1.99e+00    1.99e+00      1.88e-01    1.59e-01    8.1e-04
    400      2.95e+00    4.44e+00      5.24e-01    5.28e-01      1.11e-01    1.70e-01    6.6e-04
    600      2.70e+00    2.00e+00      1.03e-01    1.01e-01      1.16e-01    8.59e-02    5.4e-04
    800      1.85e+00    1.25e+00      4.16e-02    4.03e-02      8.83e-02    5.93e-02    4.4e-04

# plotting
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# set the Mathtext font; pick any suitable font
plt.rcParams['mathtext.fontset'] = 'custom'
plt.rcParams['mathtext.rm'] = 'Arial'  # use Arial as the Mathtext font

# plot after the font settings above
path = "./01.train/"
with open(f"{path}lcurve.out") as f:
    headers = f.readline().split()[1:]
lcurve = pd.DataFrame(np.loadtxt(f"{path}lcurve.out"), columns=headers)
legends = ["rmse_e_val", "rmse_e_trn", "rmse_f_val", "rmse_f_trn"]
for legend in legends:
    plt.loglog(lcurve["step"], lcurve[legend], label=legend)
plt.legend()
plt.xlabel("Training steps")
plt.ylabel("RMSE")
plt.show()
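If pandas and matplotlib are not available, the learning curve can also be inspected with the standard library alone. The sketch below parses the lcurve.out excerpt shown earlier from an inline string; the function name `parse_lcurve` is an assumption, and a real run would read the file from 01.train instead.

```python
# Excerpt of lcurve.out copied from the output shown earlier in this tutorial.
SAMPLE = """\
#  step      rmse_val    rmse_trn    rmse_e_val  rmse_e_trn    rmse_f_val  rmse_f_trn         lr
# If there is no available reference data, rmse_*_{val,trn} will print nan
      0      1.91e+01    1.86e+01      1.34e-01    1.35e-01      6.05e-01    5.87e-01    1.0e-03
    200      5.72e+00    4.95e+00      1.99e+00    1.99e+00      1.88e-01    1.59e-01    8.1e-04
    400      2.95e+00    4.44e+00      5.24e-01    5.28e-01      1.11e-01    1.70e-01    6.6e-04
    600      2.70e+00    2.00e+00      1.03e-01    1.01e-01      1.16e-01    8.59e-02    5.4e-04
    800      1.85e+00    1.25e+00      4.16e-02    4.03e-02      8.83e-02    5.93e-02    4.4e-04
"""


def parse_lcurve(text):
    """Parse lcurve.out text into a list of {column name: float} rows."""
    lines = text.splitlines()
    headers = lines[0].lstrip("#").split()  # first line carries the column names
    rows = []
    for line in lines[1:]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and remaining comment lines
        values = [float(x) for x in line.split()]
        rows.append(dict(zip(headers, values)))
    return rows


rows = parse_lcurve(SAMPLE)
last = rows[-1]
print(f"step {int(last['step'])}: rmse_e_val = {last['rmse_e_val']:.2e}, "
      f"rmse_f_val = {last['rmse_f_val']:.2e}")
```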

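Freezing the model

At the end of training, the model parameters saved in the TensorFlow checkpoint files should be frozen into a model file, which usually has the .pb extension. Simply run:

dp freeze -o graph.pb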

Testing the model

Check the quality of the trained model:

dp test -m graph.pb -s ../00.data/validation_data

[2024-09-24 13:02:43,966] DEEPMD INFO    # ---------------output of dp test---------------
[2024-09-24 13:02:43,966] DEEPMD INFO    # testing system : ../00.data/validation_data
[2024-09-24 13:02:44,293] DEEPMD INFO    # number of test data : 40
[2024-09-24 13:02:44,293] DEEPMD INFO    Energy MAE         : 2.501408e-03 eV
[2024-09-24 13:02:44,293] DEEPMD INFO    Energy RMSE        : 3.163454e-03 eV
[2024-09-24 13:02:44,294] DEEPMD INFO    Energy MAE/Natoms  : 5.002815e-04 eV
[2024-09-24 13:02:44,294] DEEPMD INFO    Energy RMSE/Natoms : 6.326908e-04 eV
[2024-09-24 13:02:44,294] DEEPMD INFO    Force  MAE         : 2.953246e-02 eV/A
[2024-09-24 13:02:44,294] DEEPMD INFO    Force  RMSE        : 3.907943e-02 eV/A
[2024-09-24 13:02:44,294] DEEPMD INFO    Virial MAE         : 3.898108e-02 eV
[2024-09-24 13:02:44,294] DEEPMD INFO    Virial RMSE        : 5.222230e-02 eV
[2024-09-24 13:02:44,294] DEEPMD INFO    Virial MAE/Natoms  : 7.796217e-03 eV
[2024-09-24 13:02:44,294] DEEPMD INFO    Virial RMSE/Natoms : 1.044446e-02 eV
[2024-09-24 13:02:44,294] DEEPMD INFO    # -----------------------------------------------
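The MAE and RMSE figures above follow the usual definitions, and the "/Natoms" lines appear to be the same errors divided by the number of atoms (5 for one CH4 molecule). The sketch below shows both definitions on hypothetical energies and then checks the per-atom relation against the values reported above; the sample energies are invented for illustration.

```python
import math


def mae(pred, ref):
    """Mean absolute error between predictions and reference values."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


# Hypothetical per-frame energies in eV (not taken from the tutorial data).
ref_energy = [-219.10, -219.05, -219.12]
pred_energy = [-219.11, -219.05, -219.10]
natoms = 5  # one CH4 molecule: 4 H + 1 C

print(f"Energy MAE         : {mae(pred_energy, ref_energy):.6e} eV")
print(f"Energy RMSE        : {rmse(pred_energy, ref_energy):.6e} eV")
print(f"Energy MAE/Natoms  : {mae(pred_energy, ref_energy) / natoms:.6e} eV")

# Consistency check on the values reported by dp test above.
reported_mae = 2.501408e-03
reported_mae_per_atom = 5.002815e-04
per_atom_matches = abs(reported_mae / natoms - reported_mae_per_atom) < 1e-9
print(per_atom_matches)
```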

Computing the correlation between the predicted and the reference data

import dpdata
import matplotlib.pyplot as plt
import numpy as np

# Uncomment and run this block once to generate train_pre.dat:
# training_systems = dpdata.LabeledSystem("./00.data/training_data", fmt="deepmd/npy")
# predict = training_systems.predict("./01.train/graph.pb")
# data = np.vstack((training_systems["energies"], predict["energies"])).T
# np.savetxt("./train_pre.dat", data)


data = np.loadtxt("./train_pre.dat")
y1 = data[:,0]
y2 = data[:,1]

print(y1.shape,y2.shape)

plt.scatter(y1,y2)

x_range = np.linspace(plt.xlim()[0], plt.xlim()[1])

plt.plot(x_range, x_range, "r--", linewidth=1)
plt.xlabel("Energy of DFT")
plt.ylabel("Energy predicted by deep potential")

plt.show()
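The parity plot shows the agreement visually; a single number such as the Pearson correlation coefficient can complement it. The following is a minimal sketch with the standard library. The function name `pearson_r` and the four energy pairs are hypothetical and do not come from train_pre.dat.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical energies in eV: DFT reference vs. Deep Potential prediction.
energy_dft = [-219.12, -219.10, -219.07, -219.05]
energy_dp = [-219.11, -219.10, -219.08, -219.05]

r = pearson_r(energy_dft, energy_dp)
print(f"Pearson r = {r:.4f}")
```

With the two columns of train_pre.dat loaded as y1 and y2, `pearson_r(list(y1), list(y2))` would give the coefficient for the real data; NumPy's `np.corrcoef(y1, y2)[0, 1]` computes the same quantity.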

Running MD simulations in LAMMPS


cd 02.lmp
cp ../01.train/graph.pb ./
tree -L 1

.
├── ch4.dump
├── conf.lmp: the initial configuration for the gas-phase methane MD simulation
├── graph.pb
├── in.lammps: a standard LAMMPS input file for MD simulation; only the pair_style and pair_coeff lines differ
└── log.lammps

0 directories, 5 files
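The part of in.lammps that is specific to DeePMD-kit is small: the pair style names the frozen model, and the pair coefficients apply it to all atom types. The sketch below only renders these two lines as strings. It is written in Python to stay consistent with the other examples, the helper name `deepmd_pair_lines` is hypothetical, and the rest of the input file is not reproduced here.

```python
def deepmd_pair_lines(model_file="graph.pb"):
    """Return the two LAMMPS input lines that select a Deep Potential model."""
    return [
        f"pair_style deepmd {model_file}",  # load the frozen model file
        "pair_coeff * *",                   # apply it to all atom type pairs
    ]


for line in deepmd_pair_lines():
    print(line)
```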

In a LAMMPS environment with a compatible version, run the Deep Potential molecular dynamics simulation:

mpirun -np 8 lmp_mpi -i in.lammps

Summary of lammps deepmd module ...
  >>> Info of deepmd-kit:
  installed to:       /home/xxx/softwares/deepmd-kit
  source:
  source branch:      HEAD
  source commit:      cbf2de6
  source commit at:   2024-07-27 05:11:58 +0000
  support model ver.: 1.1
  build variant:      cpu

References

[1] https://docs.deepmodeling.com/projects/deepmd/en/r2/getting-started/quick_start.html

[2] https://hyper.ai/tutorials/26001
