wellcome_공부일기

Multiple Project_train network Error

DeepLabCut


ma_heroine 2021. 8. 19. 09:30

Train Network error in a project created with Multiple=True and Identity=True

After downgrading TensorFlow from 2.6 to 2.5, only some of the errors shown below went away.

Reference for uninstalling tensorflow with pip:

https://rk1993.tistory.com/entry/Pythonthe-following-packages-are-missing-from-the-target-environment-tensorflow
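
Before and after a downgrade like this, it can help to confirm which TensorFlow version the active environment actually picks up. The following is only a minimal sketch using the standard library; the helper names `installed_version` and `matches_minor` are made up for illustration and are not part of DeepLabCut or TensorFlow.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_minor(version, wanted):
    """True if `version` (e.g. '2.5.0') has the major.minor given in `wanted` (e.g. '2.5')."""
    return version.split(".")[:2] == wanted.split(".")[:2]


if __name__ == "__main__":
    found = installed_version("tensorflow")
    if found is None:
        print("tensorflow is not installed in this environment")
    else:
        print("tensorflow", found, "| is 2.5.x:", matches_minor(found, "2.5"))
```

Run it inside the same conda environment that launches DeepLabCut, since a different environment may report a different version.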

The training dataset is successfully created. Use the function 'train_network' to start training. Happy training!
Selecting multi-animal trainer
Config:
{'all_joints': [[0]],
 'all_joints_names': ['head'],
 'alpha_r': 0.02,
 'apply_prob': 0.5,
 'batch_size': 8,
 'clahe': True,
 'claheratio': 0.1,
 'crop_pad': 0,
 'crop_sampling': 'hybrid',
 'crop_size': [400, 400],
 'cropratio': 0.4,
 'dataset': 'training-datasets\\iteration-0\\UnaugmentedDataSet_flies_projectAug17\\flies_project_J95shuffle1.pickle',
 'dataset_type': 'multi-animal-imgaug',
 'decay_steps': 30000,
 'deterministic': False,
 'display_iters': 500,
 'edge': False,
 'emboss': {'alpha': [0.0, 1.0], 'embossratio': 0.1, 'strength': [0.5, 1.5]},
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'histeq': True,
 'histeqratio': 0.1,
 'init_weights': 'C:\\Users\\KBRI\\anaconda3\\envs\\DEEPLABCUT\\lib\\site-packages\\deeplabcut\\pose_estimation_tensorflow\\models\\pretrained\\resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'lr_init': 0.0005,
 'max_input_size': 1500,
 'max_shift': 0.4,
 'mean_pixel': [123.68, 116.779, 103.939],
 'metadataset': 'training-datasets\\iteration-0\\UnaugmentedDataSet_flies_projectAug17\\Documentation_data-flies_project_95shuffle1.pickle',
 'min_input_size': 64,
 'mirror': False,
 'multi_stage': True,
 'multi_step': [[0.0001, 7500], [5e-05, 12000], [1e-05, 200000]],
 'net_type': 'resnet_50',
 'num_idchannel': 20,
 'num_joints': 1,
 'num_limbs': 0,
 'optimizer': 'adam',
 'pafwidth': 20,
 'pairwise_huber_loss': False,
 'pairwise_loss_weight': 0.1,
 'pairwise_predict': False,
 'partaffinityfield_graph': [],
 'partaffinityfield_predict': True,
 'pos_dist_thresh': 17,
 'pre_resize': [],
 'project_path': 'C:\\juyeon\\DeepLabCut\\flies_project-J-2021-08-17',
 'regularize': False,
 'rotation': 25,
 'rotratio': 0.4,
 'save_iters': 10000,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.25,
 'scoremap_dir': 'test',
 'sharpen': False,
 'sharpenratio': 0.3,
 'shuffle': True,
 'snapshot_prefix': 'C:\\juyeon\\DeepLabCut\\flies_project-J-2021-08-17\\dlc-models\\iteration-0\\flies_projectAug17-trainset95shuffle1\\train\\snapshot',
 'stride': 8.0,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
Activating limb prediction...
Batch Size is 8
Getting specs multi-animal-imgaug 0 1
C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\keras\engine\base_layer_v1.py:1694: UserWarning: `layer.apply` is deprecated and will be removed in a future version. Please use `layer.__call__` method instead.
  warnings.warn('`layer.apply` is deprecated and '
2021-08-18 13:33:36.608709: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-08-18 13:33:36.608851: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-08-18 13:33:36.612049: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: A5F_5-3인턴
2021-08-18 13:33:36.612236: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: A5F_5-3인턴
2021-08-18 13:33:36.613175: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Loading ImageNet-pretrained resnet_50
Traceback (most recent call last):
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\framework\ops.py", line 1880, in _create_c_op
    c_op = pywrap_tf_session.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Depth of filter must not be 0 for '{{node gradients/pose/pairwise_pred/block4/conv2d_transpose_grad/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true](gradients/pose/pairwise_pred/block4/BiasAdd_grad/tuple/control_dependency, pose/pairwise_pred/block4/conv2d_transpose/ReadVariableOp)' with input shapes: [8,?,?,0], [3,3,0,2048].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\deeplabcut\gui\train_network.py", line 319, in train_network
    deeplabcut.train_network(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\deeplabcut\pose_estimation_tensorflow\training.py", line 187, in train_network
    raise e
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\deeplabcut\pose_estimation_tensorflow\training.py", line 163, in train_network
    train(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\deeplabcut\pose_estimation_tensorflow\core\train_multianimal.py", line 114, in train
    learning_rate, train_op, tstep = get_optimizer(total_loss, cfg)
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\deeplabcut\pose_estimation_tensorflow\core\train.py", line 117, in get_optimizer
    train_op = slim.learning.create_train_op(loss_op, optimizer)
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tf_slim\learning.py", line 436, in create_train_op
    return training.create_train_op(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tf_slim\training\training.py", line 447, in create_train_op
    grads = optimizer.compute_gradients(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\training\optimizer.py", line 516, in compute_gradients
    grads = gradients.gradients(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 169, in gradients
    return gradients_util._GradientsHelper(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 681, in _GradientsHelper
    in_grads = _MaybeCompile(grad_scope, op, func_call,
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 338, in _MaybeCompile
    return grad_fn()  # Exit early
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 682, in <lambda>
    lambda: grad_fn(op, *out_grads))
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\ops\nn_grad.py", line 55, in _Conv2DBackpropInputGrad
    gen_nn_ops.conv2d(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 968, in conv2d
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 748, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\framework\ops.py", line 3561, in _create_op_internal
    ret = Operation(
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\framework\ops.py", line 2041, in __init__
    self._c_op = _create_c_op(self._graph, node_def, inputs,
  File "C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\framework\ops.py", line 1883, in _create_c_op
    raise ValueError(str(e))
ValueError: Depth of filter must not be 0 for '{{node gradients/pose/pairwise_pred/block4/conv2d_transpose_grad/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="SAME", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true](gradients/pose/pairwise_pred/block4/BiasAdd_grad/tuple/control_dependency, pose/pairwise_pred/block4/conv2d_transpose/ReadVariableOp)' with input shapes: [8,?,?,0], [3,3,0,2048].

 

Result of running train network after downgrading TensorFlow from 2.6 to 2.5:

The training dataset is successfully created. Use the function 'train_network' to start training. Happy training!
Selecting multi-animal trainer
Config:
{'all_joints': [[0]],
 'all_joints_names': ['head'],
 'alpha_r': 0.02,
 'apply_prob': 0.5,
 'batch_size': 8,
 'clahe': True,
 'claheratio': 0.1,
 'crop_pad': 0,
 'crop_sampling': 'hybrid',
 'crop_size': [400, 400],
 'cropratio': 0.4,
 'dataset': 'training-datasets\\iteration-0\\UnaugmentedDataSet_flies_projectAug17\\flies_project_J95shuffle1.pickle',
 'dataset_type': 'multi-animal-imgaug',
 'decay_steps': 30000,
 'deterministic': False,
 'display_iters': 500,
 'edge': False,
 'emboss': {'alpha': [0.0, 1.0], 'embossratio': 0.1, 'strength': [0.5, 1.5]},
 'fg_fraction': 0.25,
 'global_scale': 0.8,
 'histeq': True,
 'histeqratio': 0.1,
 'init_weights': 'C:\\Users\\KBRI\\anaconda3\\envs\\DEEPLABCUT\\lib\\site-packages\\deeplabcut\\pose_estimation_tensorflow\\models\\pretrained\\resnet_v1_50.ckpt',
 'intermediate_supervision': False,
 'intermediate_supervision_layer': 12,
 'location_refinement': True,
 'locref_huber_loss': True,
 'locref_loss_weight': 0.05,
 'locref_stdev': 7.2801,
 'log_dir': 'log',
 'lr_init': 0.0005,
 'max_input_size': 1500,
 'max_shift': 0.4,
 'mean_pixel': [123.68, 116.779, 103.939],
 'metadataset': 'training-datasets\\iteration-0\\UnaugmentedDataSet_flies_projectAug17\\Documentation_data-flies_project_95shuffle1.pickle',
 'min_input_size': 64,
 'mirror': False,
 'multi_stage': True,
 'multi_step': [[0.0001, 7500], [5e-05, 12000], [1e-05, 200000]],
 'net_type': 'resnet_50',
 'num_idchannel': 20,
 'num_joints': 1,
 'num_limbs': 0,
 'optimizer': 'adam',
 'pafwidth': 20,
 'pairwise_huber_loss': False,
 'pairwise_loss_weight': 0.1,
 'pairwise_predict': False,
 'partaffinityfield_graph': [],
 'partaffinityfield_predict': True,
 'pos_dist_thresh': 17,
 'pre_resize': [],
 'project_path': 'C:\\juyeon\\DeepLabCut\\flies_project-J-2021-08-17',
 'regularize': False,
 'rotation': 25,
 'rotratio': 0.4,
 'save_iters': 10000,
 'scale_jitter_lo': 0.5,
 'scale_jitter_up': 1.25,
 'scoremap_dir': 'test',
 'sharpen': False,
 'sharpenratio': 0.3,
 'shuffle': True,
 'snapshot_prefix': 'C:\\juyeon\\DeepLabCut\\flies_project-J-2021-08-17\\dlc-models\\iteration-0\\flies_projectAug17-trainset95shuffle1\\train\\snapshot',
 'stride': 8.0,
 'weigh_negatives': False,
 'weigh_only_present_joints': False,
 'weigh_part_predictions': False,
 'weight_decay': 0.0001}
Activating limb prediction...
Batch Size is 8
Getting specs multi-animal-imgaug 0 1
C:\Users\KBRI\anaconda3\envs\DEEPLABCUT\lib\site-packages\tensorflow\python\keras\engine\base_layer_v1.py:1692: UserWarning: `layer.apply` is deprecated and will be removed in a future version. Please use `layer.__call__` method instead.
  warnings.warn('`layer.apply` is deprecated and '
2021-08-19 08:05:26.352832: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-08-19 08:05:26.352976: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-08-19 08:05:26.360456: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: A5F_5-3인턴
2021-08-19 08:05:26.360657: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: A5F_5-3인턴
2021-08-19 08:05:26.374065: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Loading ImageNet-pretrained resnet_50

(DEEPLABCUT) C:\Users\KBRI>

While resnet50 was loading, the DeepLabCut GUI closed and I was dropped back to the command prompt.
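
Both runs log that nvcuda.dll could not be loaded, which suggests TensorFlow was running on the CPU on this machine. Whether that is related to the GUI closing is not established here, but a quick check of what TensorFlow can see is cheap. A minimal sketch, where `visible_gpus` is a hypothetical helper name:

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif not gpus:
    print("No GPU visible to TensorFlow; training would run on the CPU")
else:
    print("GPUs visible to TensorFlow:", gpus)
```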

 

In the end, I continued on Colab.

When I ran it on Colab, I got the same error as in the forum post linked below... I will start labeling again,

this time using two points per animal....

https://forum.image.sc/t/trouble-during-training-of-dataset-using-google-colab/56300

 

Press the save button regularly while labeling.. My GUI suddenly closed and nothing was saved. :(

 

 

Result with two labeling points

It runs very well...

My config file originally had only head, so I added back after head.
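
One plausible reading of why the second point helps: the failing config printed 'num_limbs': 0 and 'partaffinityfield_graph': [], and the error reported a filter of shape [3,3,0,2048], i.e. a layer with zero channels. The sketch below assumes (this is not taken from the DeepLabCut source) that the limb graph connects every pair of body parts and that each limb contributes two channels; under that assumption a single body part yields no limbs at all.

```python
from itertools import combinations


def paf_graph(bodyparts):
    """All unordered pairs of body-part indices (assumed fully connected skeleton)."""
    return [list(pair) for pair in combinations(range(len(bodyparts)), 2)]


def paf_channels(bodyparts):
    """Assumed channel count of the limb head: two channels (x, y) per limb."""
    return 2 * len(paf_graph(bodyparts))


for parts in (["head"], ["head", "back"]):
    print(parts, "-> limbs:", paf_graph(parts), "channels:", paf_channels(parts))
```

With only head there are 0 limbs and 0 channels, which matches the zeros in the error message; with head and back there is one limb.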

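Training looked stuck at "starting multi-animal training" and I wondered why it was not working, but it turned out I simply had not set the display iterations, so no progress was being printed.
I set the parameters explicitly, as below.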

deeplabcut.train_network(path_config_file, displayiters=100, saveiters=15000, allow_growth=True)
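
To see why the run can look frozen, here is a small sketch of when progress lines and snapshots would appear. It assumes that an event fires whenever the iteration count is a multiple of the interval, which is a guess about the trainer's behaviour; `report_iterations` is a hypothetical helper, not a DeepLabCut function.

```python
def report_iterations(max_iters, every):
    """Iterations at which an event fires if it happens every `every` steps."""
    return list(range(every, max_iters + 1, every))


# 500 is the display_iters value in the config printed earlier in this post;
# 100 and 15000 are the values passed to train_network above.
print("progress lines, display every 500:", report_iterations(1000, 500))
print("progress lines, display every 100:", report_iterations(1000, 100))
print("snapshots, save every 15000:", report_iterations(30000, 15000))
```

Under that assumption, nothing is printed for the first 499 iterations at the 500 setting, which can take a long time on a slow machine.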

 

 
