Steps to reproduce
Clicking "Train" produces an error.
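Expected and actual results
Expected: training succeeds. Actual: training aborts during epoch 1 with a ValueError (see the training log).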
Software and hardware version information
K230, nncase version 2.8.3
Error log
Training parameters
Epochs: 700
Batch size: 8
Learning rate: 0.001
Bounding-box limit: 5
Training log
Activating the Python 2.8.3 environment.
gen_dataset
cd /workspace/code/k230_training_code/
python3 -u gen_dataset.py -t classification -i /workspace/datasets/classification/9786 -o /workspace/datasets/classification/9786_d
train task
cd /workspace/code/k230_training_code/
python3 -u run_task.py -c /workspace/datasets/classification/9786/params.json -g 2
shell generate successfully.
Activating the Python 2.8.3 environment.
Subfolders copied successfully.
Creating task...
Parsing config from /workspace/datasets/classification/9786/params.json...
Initializing training module...
Activating the Python 2.8.3 environment.
Training module initialization completed!
Starting training...
Setting split ratio, split ratio is [training: validation: testing]=[0.8:0.1:0.1]
There was a problem when trying to write in your cache folder (/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
epoch:1/700
You have set num_classes=1, but predictions are integers. If you want to convert (multi-dimensional) multi-class data with 2 classes to binary/multi-label, set multiclass=False.
Traceback (most recent call last):
  File "/workspace/code/k230_training_code/run_task.py", line 84, in <module>
    start_training(config_path, gpu_id)
  File "/workspace/code/k230_training_code/run_task.py", line 64, in start_training
    task.start_pipeline()
  File "/workspace/code/k230_training_code/algorithm/task.py", line 148, in start_pipeline
    raise e
  File "/workspace/code/k230_training_code/algorithm/task.py", line 119, in start_pipeline
    self.trainer.train()
  File "/workspace/code/k230_training_code/algorithm/cls_code/classification_engine/trainer.py", line 264, in train
    raise e
  File "/workspace/code/k230_training_code/algorithm/cls_code/classification_engine/trainer.py", line 124, in train
    result_acc = acc(outputs, targets)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/code/k230_training_code/algorithm/cls_code/classification/metric.py", line 42, in forward
    return self.metric(preds, target.to(torch.int))
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torchmetrics/metric.py", line 245, in forward
    self._forward_cache = self._forward_reduce_state_update(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torchmetrics/metric.py", line 309, in _forward_reduce_state_update
    self.update(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torchmetrics/metric.py", line 395, in wrapped_func
    update(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torchmetrics/classification/accuracy.py", line 569, in update
    mode = _mode(preds, target, self.threshold, self.top_k, self.num_classes, self.multiclass, self.ignore_index)
  File "/usr/local/lib/python3.9/dist-packages/torchmetrics/functional/classification/accuracy.py", line 424, in _mode
    mode = _check_classification_inputs(
  File "/usr/local/lib/python3.9/dist-packages/torchmetrics/utilities/checks.py", line 292, in _check_classification_inputs
    _check_num_classes_mc(preds, target, num_classes, multiclass, implied_classes)
  File "/usr/local/lib/python3.9/dist-packages/torchmetrics/utilities/checks.py", line 156, in _check_num_classes_mc
    raise ValueError(
ValueError: You have set num_classes=1, but predictions are integers. If you want to convert (multi-dimensional) multi-class data with 2 classes to binary/multi-label, set multiclass=False.
Attempted solutions
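The num_classes=1 in the error suggests the accuracy metric was set up for a single class. One possible cause, not confirmed by the log, is that the dataset contains only one class. Below is a minimal sketch for checking this. It assumes the dataset directory holds one subfolder per class, which the "Subfolders copied successfully." log line hints at but does not prove. The helper names count_classes and check_dataset are hypothetical and are not part of the K230 training code.

```python
from pathlib import Path


def count_classes(dataset_root):
    """Map each class subfolder name to the number of files directly inside it.

    Assumes (not confirmed by the source) one subfolder per class.
    """
    root = Path(dataset_root)
    counts = {}
    for sub in sorted(root.iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(1 for f in sub.iterdir() if f.is_file())
    return counts


def check_dataset(dataset_root, min_classes=2):
    """Raise ValueError if fewer than min_classes class folders are found."""
    counts = count_classes(dataset_root)
    if len(counts) < min_classes:
        raise ValueError(
            f"found {len(counts)} class folder(s) in {dataset_root}; "
            f"expected at least {min_classes}"
        )
    return counts


# Example call, using the dataset path from the log:
# print(check_dataset("/workspace/datasets/classification/9786"))
```

If this reports a single class folder, adding images for at least one more class before training would be the first thing to try.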
Additional materials
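The training log also warns that /.cache/huggingface/hub is not writable and recommends setting TRANSFORMERS_CACHE. That warning is probably unrelated to the ValueError, but it can be silenced with something like the sketch below. The helper name use_writable_cache and the default hf_cache location are hypothetical. The variable has to be set before transformers is imported, so where to call this inside the training code is left open.

```python
import os
import tempfile
from pathlib import Path


def use_writable_cache(cache_dir=None):
    """Point TRANSFORMERS_CACHE at a writable directory and return its path."""
    if cache_dir is None:
        # Hypothetical default: a folder under the system temp directory.
        cache_dir = Path(tempfile.gettempdir()) / "hf_cache"
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"{path} is not writable")
    os.environ["TRANSFORMERS_CACHE"] = str(path)
    return path


# Example call, to be made before any `import transformers`:
# use_writable_cache()
```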