NAS-Bench-201

Published: 2023-04-11 16:07:04 | Author: jasonzhangxianrong

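We propose an algorithm-agnostic NAS benchmark (NAS-Bench-201) with a fixed search space, which provides a unified benchmark for almost any up-to-date NAS algorithm. The design of our search space is inspired by the one used in the most popular cell-based search algorithms, where a cell is represented as a directed acyclic graph. Each edge is associated with an operation selected from a predefined operation set. To make it applicable to all NAS algorithms, the search space defined in NAS-Bench-201 includes 4 nodes and 5 associated operation options, which generates 15,625 neural cell candidates in total.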
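The figure of 15,625 follows from the cell layout: with 4 nodes, where every node receives one edge from each earlier node, there are 6 edges, and each edge picks one of 5 operations, so there are 5^6 = 15,625 cells. The sketch below checks this by enumeration. Three of the operation names are taken from the architecture strings used later in this post; `none` and `nor_conv_1x1` are assumed here as the remaining two options.

```python
from itertools import product

# Five candidate operations per edge. 'nor_conv_3x3', 'avg_pool_3x3' and
# 'skip_connect' appear in the architecture strings in this post; 'none' and
# 'nor_conv_1x1' are assumed here to complete the set of five.
OPS = ['none', 'skip_connect', 'nor_conv_1x1', 'nor_conv_3x3', 'avg_pool_3x3']
NUM_NODES = 4

# Every node i > 0 receives one edge from each earlier node j < i.
edges = [(j, i) for i in range(1, NUM_NODES) for j in range(i)]

# One operation per edge; count every possible assignment.
num_cells = sum(1 for _ in product(OPS, repeat=len(edges)))

print(len(edges))  # number of edges in one cell
print(num_cells)   # number of candidate cells
```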

In this Markdown file, we provide:

For the following two things, please use AutoDL-Projects:

Note: please use PyTorch >= 1.2.0 and Python >= 3.6.0.

To install our API, simply run pip install nas-bench-201. The source code of the nas-bench-201 module is available in this repo.

If you have any questions or issues, please post them here or email me.

I. Preparation and Download

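[Deprecated] The old benchmark file of NAS-Bench-201 can be downloaded from Google Drive or Baidu Wangpan (code: 6u5d).

[Recommended] The latest benchmark file of NAS-Bench-201 (NAS-Bench-201-v1_1-096897.pth) can be downloaded from Google Drive. The model weight files are too large (431G), and I need some time to upload them. Please be patient, and thank you for your understanding.

You can move the file anywhere you like and pass its path to our API for initialization.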

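[2020.02.25] APIv1.0/FILEv1.0: NAS-Bench-201-v1_0-e61699.pth (2.2G), where e61699 is the last six digits for this file. It contains all information except the trained weights of each trial.

[2020.02.25] APIv1.0/FILEv1.0: The full data of each architecture can be downloaded from NAS-BENCH-201-4-v1.0-archive.tar (about 226GB). This compressed folder has 15625 files containing the trained weights.

[2020.02.25] APIv1.0/FILEv1.0: Checkpoints for 3 runs of each baseline NAS algorithm are provided in Google Drive.

[2020.03.09] APIv1.2/FILEv1.0: A more robust API with more functions and descriptions.

[2020.03.16] APIv1.3/FILEv1.1: NAS-Bench-201-v1_1-096897.pth (4.7G), where 096897 is the last six digits for this file. Compared with NAS-Bench-201-v1_0-e61699.pth, it contains information about more trials; in particular, all models trained for 12 epochs on all datasets are available.

[2020.06.30] APIv2.0: Use an abstract class (NASBenchMetaAPI) for the APIs of NAS-Bench-x0y.

[2020.06.30] FILEv2.0: coming soon!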

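We recommend using NAS-Bench-201-v1_1-096897.pth.

The training and evaluation data used in NAS-Bench-201 can be downloaded from Google Drive or Baidu Wangpan (code: 4fg7). It is recommended to put these data into $TORCH_HOME (~/.torch/ by default). You need these data if you want to generate NAS-Bench-201 or a similar NAS dataset yourself, or to train models.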

II. How to Use

More usage examples can be found in our test code.

1. Create an API instance:

from nas_201_api import NASBench201API as API
api = API('$path_to_meta_nas_bench_file')
# Create an API without the verbose log
api = API('NAS-Bench-201-v1_1-096897.pth', verbose=False)
# The default path for benchmark file is '{:}/{:}'.format(os.environ['TORCH_HOME'], 'NAS-Bench-201-v1_1-096897.pth')
api = API(None)
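If you prefer to build the benchmark path yourself rather than rely on the default, a small helper can do it. This is only a sketch: `default_benchmark_path` is a hypothetical name that is not part of nas_201_api, and the fallback to ~/.torch when TORCH_HOME is unset is an assumption based on the default location mentioned in this post.

```python
import os

def default_benchmark_path(file_name='NAS-Bench-201-v1_1-096897.pth'):
    """Hypothetical helper (not part of nas_201_api): build the benchmark
    path from $TORCH_HOME, falling back to ~/.torch when it is not set."""
    fallback = os.path.join(os.path.expanduser('~'), '.torch')
    torch_home = os.environ.get('TORCH_HOME', fallback)
    return os.path.join(torch_home, file_name)

path = default_benchmark_path()
print(path)
# The resulting string can then be passed to the API, e.g. api = API(path)
```

Passing an explicit path this way also makes it obvious which benchmark file a script depends on.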

2. Show every architecture:

num = len(api)
for i, arch_str in enumerate(api):
  print('{:5d}/{:5d} : {:}'.format(i, num, arch_str))

3. Show the results of all trials for a single architecture:

# show all information for a specific architecture
api.show(1)
api.show(2)

# show the mean loss and accuracy of an architecture
info = api.query_meta_info_by_index(1)  # This is an instance of `ArchResults`
res_metrics = info.get_metrics('cifar10', 'train') # This is a dict with metric names as keys
cost_metrics = info.get_compute_costs('cifar100') # This is a dict with metric names as keys, e.g., flops, params, latency

# get the detailed information
results = api.query_by_index(1, 'cifar100') # a dict of all trials for 1st net on cifar100, where the key is the seed
print ('There are {:} trials for this architecture [{:}] on cifar100'.format(len(results), api[1]))
for seed, result in results.items():
  print ('Latency : {:}'.format(result.get_latency()))
  print ('Train Info : {:}'.format(result.get_train()))
  print ('Valid Info : {:}'.format(result.get_eval('x-valid')))
  print ('Test  Info : {:}'.format(result.get_eval('x-test')))
  # for the metric after a specific epoch
  print ('Train Info [10-th epoch] : {:}'.format(result.get_train(10)))

4. Query the index of an architecture by its string:

index = api.query_index_by_arch('|nor_conv_3x3~0|+|nor_conv_3x3~0|avg_pool_3x3~1|+|skip_connect~0|nor_conv_3x3~1|skip_connect~2|')
api.show(index)

The string means:

node-0: the input tensor
node-1: conv-3x3( node-0 )
node-2: conv-3x3( node-0 ) + avg-pool-3x3( node-1 )
node-3: skip-connect( node-0 ) + conv-3x3( node-1 ) + skip-connect( node-2 )
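The mapping from string to nodes can be sketched in a few lines of plain Python. `parse_arch_str` is a hypothetical helper written for illustration and is not part of nas_201_api; it only assumes the `|op~src|...|+|...|` layout shown in the string above.

```python
def parse_arch_str(arch_str):
    """Split an architecture string into, for each non-input node,
    a list of (operation, source_node) pairs."""
    nodes = []
    for node_str in arch_str.split('+'):
        # Each node looks like '|op~src|op~src|'; drop the empty pieces.
        edges = [piece for piece in node_str.split('|') if piece]
        node = []
        for edge in edges:
            op, src = edge.split('~')
            node.append((op, int(src)))
        nodes.append(node)
    return nodes

arch = '|nor_conv_3x3~0|+|nor_conv_3x3~0|avg_pool_3x3~1|+|skip_connect~0|nor_conv_3x3~1|skip_connect~2|'
for i, node in enumerate(parse_arch_str(arch), start=1):
    inputs = ' + '.join('{:}( node-{:} )'.format(op, src) for op, src in node)
    print('node-{:}: {:}'.format(i, inputs))
```

Each group between `+` signs describes one node, and the number after `~` is the index of the node whose output feeds that operation.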

5. Create a network from the API:

config = api.get_net_config(123, 'cifar10') # obtain the network configuration for the 123-th architecture on the CIFAR-10 dataset
from models import get_cell_based_tiny_net # this module is in AutoDL-Projects/lib/models
network = get_cell_based_tiny_net(config) # create the network from the configuration
print(network) # show the structure of this architecture

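If you want to load the trained weights of this created network, use api.get_net_param(123, …) to obtain the weights and then load them into the network.

6. api.get_more_info(…) returns the loss / accuracy / time on the training / validation / test sets, which is very helpful. For more details, please see the comments in the get_more_info function.

7. For other usage, please see lib/nas_201_api/api.py. We provide some usage information in the comments of the corresponding functions. If what you need is not provided, feel free to open an issue for discussion; I am happy to answer any questions regarding NAS-Bench-201.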

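III. Detailed Introduction

In nas_201_api, we define three classes: NASBench201API, ArchResults, ResultsCount.

ResultsCount maintains all information of a specific trial. You can instantiate ResultsCount and get the information via the following code (000157-FULL.pth saves all information of all trials of the 157-th architecture):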

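import torch
from nas_201_api import ResultsCount
# The helpers below are assumed to be importable from AutoDL-Projects/lib
from models import CellStructure, get_cell_based_tiny_net
from config_utils import dict2config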
xdata  = torch.load('000157-FULL.pth')
odata  = xdata['full']['all_results'][('cifar10-valid', 777)]
result = ResultsCount.create_from_state_dict( odata )
print(result) # print it
print(result.get_train())   # print the final training loss/accuracy/[optional:time-cost-of-a-training-epoch]
print(result.get_train(11)) # print the training info of the 11-th epoch
print(result.get_eval('x-valid'))     # print the final evaluation info on the validation set
print(result.get_eval('x-valid', 11)) # print the info on the validation set of the 11-th epoch
print(result.get_latency())           # print the evaluation latency [in batch]
result.get_net_param()                # the trained parameters of this trial
arch_config = result.get_config(CellStructure.str2structure) # create the network with params
net_config  = dict2config(arch_config, None)
network    = get_cell_based_tiny_net(net_config)
network.load_state_dict(result.get_net_param())
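The file name 000157-FULL.pth suggests that each per-architecture file is named by its zero-padded index. Under that assumption, which is inferred only from this single example, the path for any architecture could be built as sketched below; `full_result_file` is a hypothetical helper and not part of nas_201_api.

```python
import os

def full_result_file(archive_root, arch_index):
    """Hypothetical helper: path of the per-architecture file, assuming
    every file follows the zero-padded pattern seen in '000157-FULL.pth'."""
    return os.path.join(archive_root, '{:06d}-FULL.pth'.format(arch_index))

print(full_result_file('NAS-BENCH-201-4-v1.0-archive', 157))
```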

ArchResults maintains all information of all trials of an architecture. Please see the following usage:

import torch
from nas_201_api import ArchResults
xdata   = torch.load('000157-FULL.pth')
archRes = ArchResults.create_from_state_dict(xdata['less']) # load trials trained with  12 epochs
archRes = ArchResults.create_from_state_dict(xdata['full']) # load trials trained with 200 epochs

print(archRes.arch_idx_str())      # print the index of this architecture 
print(archRes.get_dataset_names()) # print the supported training data
print(archRes.get_compute_costs('cifar10-valid')) # print all computational info when training on cifar10-valid 
print(archRes.get_metrics('cifar10-valid', 'x-valid', None, False)) # print the average loss/accuracy/time on all trials
print(archRes.get_metrics('cifar10-valid', 'x-valid', None,  True)) # print loss/accuracy/time of a randomly selected trial

NASBench201API is the top-level API. Please see the following usage:

import os
from nas_201_api import NASBench201API as API
api = API('NAS-Bench-201-v1_1-096897.pth') # This will load all the information of NAS-Bench-201 except the trained weights
api = API('{:}/{:}'.format(os.environ['TORCH_HOME'], 'NAS-Bench-201-v1_1-096897.pth')) # Same as the line above; I usually save NAS-Bench-201-v1_1-096897.pth in ~/.torch/.
api.show(-1)  # show info of all architectures
api.reload('{:}/{:}'.format(os.environ['TORCH_HOME'], 'NAS-BENCH-201-4-v1.0-archive'), 3) # This will reload the information of the 3-th architecture together with the trained weights

weights = api.get_net_param(3, 'cifar10', None) # Obtain the weights of all trials for the 3-th architecture on cifar10. It returns a dict, where the key is the seed and the value is the trained weights.

To obtain the training and evaluation information (please see the comments here):

api.get_more_info(112, 'cifar10', None, hp='200', is_random=True)
# Query info of last training epoch for 112-th architecture
# using 200-epoch-hyper-parameter and randomly select a trial.
api.get_more_info(112, 'ImageNet16-120', None, hp='200', is_random=True)