Hi, when I run the following command from generate_guacamol_prop.sh:

python generate/generate.py --model_weight gua_tpsa_logp_sas.pt --props tpsa logp sas --data_name guacamol2 --csv_name gua_tpsa_logp_sas_temp1 --gen_size 10000 --batch_size 512 --vocab_size 94 --block_size 100

I get the following error:

RuntimeError: Error(s) in loading state_dict for GPT:
size mismatch for blocks.0.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
size mismatch for blocks.1.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
size mismatch for blocks.2.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
size mismatch for blocks.3.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
size mismatch for blocks.4.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
size mismatch for blocks.5.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
size mismatch for blocks.6.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
size mismatch for blocks.7.attn.mask: copying a param with shape torch.Size([1, 1, 101, 101]) from checkpoint, the shape in current model is torch.Size([1, 1, 201, 201]).
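
To narrow this down, a small sketch like the one below could list which entries differ in shape between the checkpoint and the freshly built model. It is plain Python operating on name-to-shape mappings; the helper name `find_shape_mismatches` and the example dictionaries are hypothetical and not part of the repository, and the shapes are simply copied from the error message.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    mismatches = []
    for name in sorted(ckpt_shapes):
        if name in model_shapes and ckpt_shapes[name] != model_shapes[name]:
            mismatches.append((name, ckpt_shapes[name], model_shapes[name]))
    return mismatches


# Shapes copied from the error message (only two of the eight blocks shown).
ckpt_shapes = {
    "blocks.0.attn.mask": (1, 1, 101, 101),
    "blocks.1.attn.mask": (1, 1, 101, 101),
}
model_shapes = {
    "blocks.0.attn.mask": (1, 1, 201, 201),
    "blocks.1.attn.mask": (1, 1, 201, 201),
}

for name, ckpt_shape, model_shape in find_shape_mismatches(ckpt_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With PyTorch, the real mappings could be built as `{k: tuple(v.shape) for k, v in sd.items()}`, where `sd` is either the result of `torch.load(path, map_location="cpu")` (assuming the file stores a plain state_dict) or `model.state_dict()`. In the error above only the `attn.mask` entries differ (101 vs 201), which suggests the model is being constructed with a different effective sequence length than the one used to create the checkpoint.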