Use NNI to optimize systems automatically

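NNI is Microsoft's AutoML toolkit. It can be used for automated feature selection, automated pruning and quantization, neural architecture search, and hyperparameter tuning. This post focuses on using NNI for hyperparameter tuning.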

## Supported algorithms

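NNI ships with many hyperparameter tuning algorithms, which fall into roughly three categories of search strategy.

### Exhaustive search

  1. Grid Search
  2. Random

### Heuristic search

  1. Anneal
  2. Evolution
  3. Hyperband
  4. PBT

### Bayesian optimization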

  1. BOHB
  2. DNGO
  3. GP
  4. Metis
  5. SMAC
  6. TPE
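
To make the simplest strategies concrete, here is a minimal sketch of what Grid Search and Random do in principle. It is plain Python rather than NNI's implementation, and `objective`, `grid_search`, and `random_search` are hypothetical names chosen for illustration.

```python
import random


def objective(x):
    """Hypothetical 1-D objective with its minimum at x = 0.3."""
    return (x - 0.3) ** 2


def grid_search(low, high, steps):
    """Evaluate evenly spaced points (endpoints included) and return the best x."""
    candidates = [low + (high - low) * i / (steps - 1) for i in range(steps)]
    return min(candidates, key=objective)


def random_search(low, high, trials, seed=0):
    """Evaluate uniformly sampled points and return the best x."""
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(trials)]
    return min(candidates, key=objective)


if __name__ == "__main__":
    print("grid  :", grid_search(0.0, 1.0, steps=11))
    print("random:", random_search(0.0, 1.0, trials=11))
```

Both spend the same budget of evaluations; the grid is deterministic, while random sampling depends on the seed. The heuristic and Bayesian tuners differ in that they use earlier scores to decide where to sample next.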

## Usage

Using NNI requires preparing three files:

  1. nni_search.py: computes the score for each set of parameters

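    import nni

    def main():
        # Ask the tuner for the next set of parameters to try
        para = nni.get_next_parameter()
        x = para['x']
        # Target: x * x = 2, so the score is the distance from 2
        score = abs(x * x - 2)
        # Report the score to NNI as this trial's final result
        nni.report_final_result(score)

    if __name__ == '__main__':
        main()
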
  2. search_space.json: defines the parameter search space

    {
        "x": {"_type": "uniform", "_value": [1, 2]}
    }

  3. config.yml: the NNI configuration file

    experimentName: search_sqrt_2

    # Degree of parallelism (number of trials run concurrently)
    trialConcurrency: 1

    # Maximum allowed duration of the experiment
    maxExecDuration: 1h
    # Maximum number of trials; if unset, there is no limit on the trial count
    maxTrialNum: 10

    # choice: local, remote
    trainingServicePlatform: local

    # search_space file
    searchSpacePath: search_space.json

    # choice: true, false
    useAnnotation: false

    tuner:
      # choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner
      # SMAC (SMAC should be installed through nnictl)
      builtinTunerName: TPE
      classArgs:
        # choice: maximize, minimize
        optimize_mode: minimize

    trial:
      command: python3 nni_search.py
      codeDir: .
      gpuNum: 0
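
Before launching a real experiment, it can help to check the scoring logic and the search space on their own. The sketch below is a stand-alone simulation under stated assumptions: it does not call NNI, it uses plain random sampling as a stand-in for the tuner (not TPE), and `sample_parameters`, `score`, and `run_trials` are hypothetical helper names. It parses the search space with a strict JSON parser, which also catches mistakes such as a trailing comma.

```python
import json
import random

# Mirrors search_space.json; strict JSON does not allow a trailing comma.
SEARCH_SPACE_JSON = '{"x": {"_type": "uniform", "_value": [1, 2]}}'


def sample_parameters(search_space, rng):
    """Draw one value per parameter; only the "uniform" type is handled here."""
    params = {}
    for name, spec in search_space.items():
        if spec["_type"] != "uniform":
            raise ValueError(f"unsupported _type in this sketch: {spec['_type']}")
        low, high = spec["_value"]
        params[name] = rng.uniform(low, high)
    return params


def score(params):
    """Same objective as nni_search.py: distance of x * x from 2."""
    x = params["x"]
    return abs(x * x - 2)


def run_trials(max_trials=10, seed=0):
    """Stand-in for the tuner loop: sample, score, keep the minimum."""
    rng = random.Random(seed)
    search_space = json.loads(SEARCH_SPACE_JSON)
    best_params, best_score = None, float("inf")
    for _ in range(max_trials):
        params = sample_parameters(search_space, rng)
        trial_score = score(params)
        if trial_score < best_score:  # corresponds to optimize_mode: minimize
            best_params, best_score = params, trial_score
    return best_params, best_score


if __name__ == "__main__":
    best_params, best_score = run_trials()
    print(f"best x = {best_params['x']:.4f}, score = {best_score:.4f}")
```

In the real setup NNI plays the role of `run_trials`: it draws parameters from the search space, runs the trial command once per draw, and collects the value passed to `nni.report_final_result`.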

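Run `nnictl create --config config.yml` to start the experiment. NNI prints the resolved configuration and starts a web server where you can inspect the trial results. The following is the output after starting the example above: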

    [2023-10-06 21:51:13] WARNING: Config field "authorName" is no longer supported and has been ignored
    [2023-10-06 21:51:13] WARNING: You are using legacy config file, please update it to latest format:
    ================================================================================
    experimentName: search_sqrt_2
    trialConcurrency: 1
    maxExperimentDuration: 1h
    maxTrialNumber: 100
    searchSpaceFile: search_space.json
    useAnnotation: false
    trialCommand: python3 nni_search.py
    trialCodeDirectory: .
    trialGpuNumber: 0
    tuner:
      name: TPE
      classArgs:
        optimize_mode: minimize
    trainingService:
      platform: local
    ================================================================================
    Reference: https://nni.readthedocs.io/en/stable/reference/experiment_config.html
    [2023-10-06 21:51:13] Creating experiment, Experiment ID: vl4y6ge3
    [2023-10-06 21:51:13] Starting web server...
    [2023-10-06 21:51:14] Setting up...
    [2023-10-06 21:51:14] Web portal URLs: http://127.0.0.1:8080 http://192.168.50.103:8080
    [2023-10-06 21:51:14] To stop experiment run "nnictl stop vl4y6ge3" or "nnictl stop --all"
    [2023-10-06 21:51:14] Reference: https://nni.readthedocs.io/en/stable/reference/nnictl.html

Once the run has finished, stop the experiment with `nnictl stop --all`.