Results with model merging

One

step:0/6200 val_loss:13.2661 train_time:179ms step_avg:nanms
step:125/6200 val_loss:5.4293 train_time:17992ms step_avg:156.45ms
step:250/6200 val_loss:4.6323 train_time:37931ms step_avg:158.05ms
step:375/6200 val_loss:4.2121 train_time:57614ms step_avg:157.85ms
step:500/6200 val_loss:4.0421 train_time:77603ms step_avg:158.37ms
step:625/6200 val_loss:3.9342 train_time:97608ms step_avg:158.71ms
step:750/6200 val_loss:3.8709 train_time:117330ms step_avg:158.55ms
step:875/6200 val_loss:3.8117 train_time:137360ms step_avg:158.80ms
step:1000/6200 val_loss:3.7724 train_time:157401ms step_avg:158.99ms
step:1125/6200 val_loss:3.7341 train_time:177124ms step_avg:158.86ms
step:1250/6200 val_loss:3.7036 train_time:197113ms step_avg:158.96ms
step:1375/6200 val_loss:3.6771 train_time:217107ms step_avg:159.05ms
step:1500/6200 val_loss:3.6535 train_time:236780ms step_avg:158.91ms
step:1625/6200 val_loss:3.6328 train_time:256763ms step_avg:158.99ms
step:1750/6200 val_loss:3.6157 train_time:276728ms step_avg:159.04ms
step:1875/6200 val_loss:3.6014 train_time:296394ms step_avg:158.92ms
step:2000/6200 val_loss:3.5879 train_time:316362ms step_avg:158.98ms
step:2125/6200 val_loss:3.5731 train_time:336335ms step_avg:159.02ms
step:2250/6200 val_loss:3.5615 train_time:355992ms step_avg:158.93ms
step:2375/6200 val_loss:3.5483 train_time:375940ms step_avg:158.96ms
step:2500/6200 val_loss:3.5394 train_time:395912ms step_avg:159.00ms
step:2625/6200 val_loss:3.5284 train_time:415551ms step_avg:158.91ms
step:2750/6200 val_loss:3.5232 train_time:435479ms step_avg:158.93ms
step:2875/6200 val_loss:3.5108 train_time:455424ms step_avg:158.96ms
step:3000/6200 val_loss:3.5037 train_time:475058ms step_avg:158.88ms
step:3125/6200 val_loss:3.4935 train_time:495005ms step_avg:158.91ms
step:3250/6200 val_loss:3.4876 train_time:514944ms step_avg:158.93ms
step:3375/6200 val_loss:3.4807 train_time:534573ms step_avg:158.86ms
step:3500/6200 val_loss:3.4750 train_time:554498ms step_avg:158.88ms
step:3625/6200 val_loss:3.4681 train_time:574389ms step_avg:158.89ms
step:3750/6200 val_loss:3.4636 train_time:594017ms step_avg:158.83ms
step:3875/6200 val_loss:3.4545 train_time:613917ms step_avg:158.84ms
step:4000/6200 val_loss:3.4498 train_time:633828ms step_avg:158.85ms
step:4125/6200 val_loss:3.4452 train_time:653454ms step_avg:158.80ms
step:4250/6200 val_loss:3.4393 train_time:673358ms step_avg:158.81ms
step:4375/6200 val_loss:3.4350 train_time:693281ms step_avg:158.83ms
step:4500/6200 val_loss:3.4258 train_time:712899ms step_avg:158.77ms
step:4625/6200 val_loss:3.4148 train_time:732807ms step_avg:158.79ms
step:4750/6200 val_loss:3.4064 train_time:752749ms step_avg:158.81ms
step:4875/6200 val_loss:3.3932 train_time:772369ms step_avg:158.76ms
step:5000/6200 val_loss:3.3816 train_time:792310ms step_avg:158.78ms
step:5125/6200 val_loss:3.3705 train_time:812051ms step_avg:158.76ms
step:5250/6200 val_loss:3.3595 train_time:831825ms step_avg:158.75ms
step:5375/6200 val_loss:3.3487 train_time:851736ms step_avg:158.76ms
step:5500/6200 val_loss:3.3385 train_time:871483ms step_avg:158.74ms
step:5625/6200 val_loss:3.3304 train_time:891252ms step_avg:158.73ms
step:5750/6200 val_loss:3.3193 train_time:911147ms step_avg:158.74ms
step:5875/6200 val_loss:3.3108 train_time:930896ms step_avg:158.72ms
step:6000/6200 val_loss:3.3032 train_time:950683ms step_avg:158.71ms
step:6125/6200 val_loss:3.2975 train_time:970607ms step_avg:158.73ms
step:6200/6200 val_loss:3.2959 train_time:982387ms step_avg:158.71ms

Two

step:0/6200 val_loss:108.2990 train_time:167ms step_avg:nanms
step:125/6200 val_loss:5.7375 train_time:17895ms step_avg:155.61ms
step:250/6200 val_loss:4.8343 train_time:37747ms step_avg:157.28ms
step:375/6200 val_loss:4.3876 train_time:57384ms step_avg:157.22ms
step:500/6200 val_loss:4.0984 train_time:77312ms step_avg:157.78ms
step:625/6200 val_loss:3.9519 train_time:97307ms step_avg:158.22ms
step:750/6200 val_loss:3.8613 train_time:117014ms step_avg:158.13ms
step:875/6200 val_loss:3.7957 train_time:137008ms step_avg:158.39ms
step:1000/6200 val_loss:3.7491 train_time:156973ms step_avg:158.56ms
step:1125/6200 val_loss:3.7115 train_time:176729ms step_avg:158.50ms
step:1250/6200 val_loss:3.6824 train_time:196733ms step_avg:158.66ms
step:1375/6200 val_loss:3.6549 train_time:216734ms step_avg:158.78ms
step:1500/6200 val_loss:3.6344 train_time:236465ms step_avg:158.70ms
step:1625/6200 val_loss:3.6130 train_time:256425ms step_avg:158.78ms
step:1750/6200 val_loss:3.5994 train_time:276440ms step_avg:158.87ms
step:1875/6200 val_loss:3.5842 train_time:296150ms step_avg:158.79ms
step:2000/6200 val_loss:3.5684 train_time:316125ms step_avg:158.86ms
step:2125/6200 val_loss:3.5562 train_time:336103ms step_avg:158.91ms
step:2250/6200 val_loss:3.5455 train_time:355809ms step_avg:158.84ms
step:2375/6200 val_loss:3.5340 train_time:375757ms step_avg:158.88ms
step:2500/6200 val_loss:3.5247 train_time:395716ms step_avg:158.92ms
step:2625/6200 val_loss:3.5145 train_time:415415ms step_avg:158.86ms
step:2750/6200 val_loss:3.5069 train_time:435371ms step_avg:158.89ms
step:2875/6200 val_loss:3.4995 train_time:455359ms step_avg:158.94ms
step:3000/6200 val_loss:3.4907 train_time:475047ms step_avg:158.88ms
step:3125/6200 val_loss:3.4837 train_time:494995ms step_avg:158.91ms
step:3250/6200 val_loss:3.4820 train_time:514937ms step_avg:158.93ms
step:3375/6200 val_loss:3.4740 train_time:534618ms step_avg:158.88ms
step:3500/6200 val_loss:3.4684 train_time:554548ms step_avg:158.90ms
step:3625/6200 val_loss:3.4634 train_time:574514ms step_avg:158.93ms
step:3750/6200 val_loss:3.4581 train_time:594193ms step_avg:158.88ms
step:3875/6200 val_loss:3.4545 train_time:614126ms step_avg:158.89ms
step:4000/6200 val_loss:3.4480 train_time:634061ms step_avg:158.91ms
step:4125/6200 val_loss:3.4453 train_time:653739ms step_avg:158.87ms
step:4250/6200 val_loss:3.4416 train_time:673680ms step_avg:158.89ms
step:4375/6200 val_loss:3.4358 train_time:693617ms step_avg:158.90ms
step:4500/6200 val_loss:3.4279 train_time:713294ms step_avg:158.86ms
step:4625/6200 val_loss:3.4226 train_time:733261ms step_avg:158.89ms
step:4750/6200 val_loss:3.4117 train_time:753197ms step_avg:158.90ms
step:4875/6200 val_loss:3.4007 train_time:772858ms step_avg:158.86ms
step:5000/6200 val_loss:3.3913 train_time:792782ms step_avg:158.87ms
step:5125/6200 val_loss:3.3819 train_time:812585ms step_avg:158.86ms
step:5250/6200 val_loss:3.3733 train_time:832388ms step_avg:158.85ms
step:5375/6200 val_loss:3.3654 train_time:852292ms step_avg:158.86ms
step:5500/6200 val_loss:3.3582 train_time:872090ms step_avg:158.85ms
step:5625/6200 val_loss:3.3514 train_time:891870ms step_avg:158.84ms
step:5750/6200 val_loss:3.3431 train_time:911760ms step_avg:158.84ms
step:5875/6200 val_loss:3.3367 train_time:931542ms step_avg:158.83ms
step:6000/6200 val_loss:3.3311 train_time:951330ms step_avg:158.82ms
step:6125/6200 val_loss:3.3266 train_time:971225ms step_avg:158.83ms
step:6200/6200 val_loss:3.3252 train_time:983008ms step_avg:158.81ms

Three

step:0/6200 val_loss:13.3082 train_time:183ms step_avg:nanms
step:125/6200 val_loss:5.4940 train_time:19343ms step_avg:168.20ms
step:250/6200 val_loss:4.6869 train_time:40472ms step_avg:168.64ms
step:375/6200 val_loss:4.2255 train_time:61399ms step_avg:168.22ms
step:500/6200 val_loss:4.0390 train_time:82609ms step_avg:168.59ms
step:625/6200 val_loss:3.9499 train_time:104979ms step_avg:170.70ms
step:750/6200 val_loss:3.8803 train_time:125921ms step_avg:170.16ms
step:875/6200 val_loss:3.8245 train_time:147121ms step_avg:170.08ms
step:1000/6200 val_loss:3.7869 train_time:168320ms step_avg:170.02ms
step:1125/6200 val_loss:3.7640 train_time:190421ms step_avg:170.78ms
step:1250/6200 val_loss:3.7346 train_time:211510ms step_avg:170.57ms
step:1375/6200 val_loss:3.7096 train_time:232597ms step_avg:170.40ms
step:1500/6200 val_loss:3.6882 train_time:253496ms step_avg:170.13ms
step:1625/6200 val_loss:3.7002 train_time:275741ms step_avg:170.74ms
step:1750/6200 val_loss:3.6646 train_time:296878ms step_avg:170.62ms
step:1875/6200 val_loss:3.6503 train_time:317825ms step_avg:170.42ms
step:2000/6200 val_loss:3.6398 train_time:338949ms step_avg:170.33ms
step:2125/6200 val_loss:3.6379 train_time:361330ms step_avg:170.84ms
step:2250/6200 val_loss:3.6250 train_time:382246ms step_avg:170.65ms
step:2375/6200 val_loss:3.6111 train_time:403295ms step_avg:170.53ms
step:2500/6200 val_loss:3.5979 train_time:424385ms step_avg:170.44ms
step:2625/6200 val_loss:3.6046 train_time:446467ms step_avg:170.73ms
step:2750/6200 val_loss:3.5934 train_time:467605ms step_avg:170.66ms
step:2875/6200 val_loss:3.5842 train_time:488817ms step_avg:170.62ms
step:3000/6200 val_loss:3.5817 train_time:509694ms step_avg:170.47ms
step:3125/6200 val_loss:3.5850 train_time:531994ms step_avg:170.78ms
step:3250/6200 val_loss:3.5764 train_time:553086ms step_avg:170.71ms
step:3375/6200 val_loss:3.5674 train_time:573993ms step_avg:170.58ms
step:3500/6200 val_loss:3.5581 train_time:594979ms step_avg:170.48ms
step:3625/6200 val_loss:3.5635 train_time:617270ms step_avg:170.75ms
step:3750/6200 val_loss:3.5578 train_time:638370ms step_avg:170.69ms
step:3875/6200 val_loss:3.5519 train_time:659395ms step_avg:170.61ms
step:4000/6200 val_loss:3.5476 train_time:680508ms step_avg:170.55ms
step:4125/6200 val_loss:3.5530 train_time:702753ms step_avg:170.78ms
step:4250/6200 val_loss:3.5473 train_time:723882ms step_avg:170.73ms
step:4375/6200 val_loss:3.5416 train_time:745118ms step_avg:170.70ms
step:4500/6200 val_loss:3.5258 train_time:765955ms step_avg:170.59ms
step:4625/6200 val_loss:3.5206 train_time:788347ms step_avg:170.82ms
step:4750/6200 val_loss:3.5026 train_time:809457ms step_avg:170.77ms
step:4875/6200 val_loss:3.4809 train_time:830291ms step_avg:170.67ms
step:5000/6200 val_loss:3.4653 train_time:851440ms step_avg:170.63ms
step:5125/6200 val_loss:3.4696 train_time:873577ms step_avg:170.79ms
step:5250/6200 val_loss:3.4500 train_time:894526ms step_avg:170.71ms
step:5375/6200 val_loss:3.4336 train_time:915630ms step_avg:170.67ms
step:5500/6200 val_loss:3.4184 train_time:936640ms step_avg:170.61ms
step:5625/6200 val_loss:3.4297 train_time:958825ms step_avg:170.76ms
step:5750/6200 val_loss:3.4046 train_time:979830ms step_avg:170.70ms
step:5875/6200 val_loss:3.3851 train_time:1000797ms step_avg:170.64ms
step:6000/6200 val_loss:3.3698 train_time:1021775ms step_avg:170.58ms
step:6125/6200 val_loss:3.5608 train_time:1044089ms step_avg:170.74ms
step:6200/6200 val_loss:3.4287 train_time:1055868ms step_avg:170.58ms
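
The three logs above share one entry format (`step:N/TOTAL val_loss:X train_time:Yms step_avg:Zms`), so they can be parsed mechanically to compare runs. The following is a minimal sketch, not part of the original training code: the names `ValEntry` and `parse_val_log` are hypothetical, and the regex only assumes the fields visible in the logs. It tolerates entries separated by spaces or line breaks, and maps the `nan` average at step 0 to `None`.

```python
import re
from typing import List, NamedTuple, Optional

# Matches one validation entry; fields may be separated by spaces or newlines.
LOG_ENTRY = re.compile(
    r"step:(\d+)/(\d+)\s+val_loss:([\d.]+)\s+"
    r"train_time:(\d+)ms\s+step_avg:(nan|[\d.]+)ms"
)


class ValEntry(NamedTuple):  # hypothetical container, for illustration only
    step: int
    total_steps: int
    val_loss: float
    train_time_ms: int
    step_avg_ms: Optional[float]  # None where the log prints "nan"


def parse_val_log(text: str) -> List[ValEntry]:
    """Extract every validation entry from a raw log string."""
    entries = []
    for match in LOG_ENTRY.finditer(text):
        step, total, loss, train_time, step_avg = match.groups()
        entries.append(
            ValEntry(
                step=int(step),
                total_steps=int(total),
                val_loss=float(loss),
                train_time_ms=int(train_time),
                step_avg_ms=None if step_avg == "nan" else float(step_avg),
            )
        )
    return entries


# Three entries copied from run "One" above.
sample = (
    "step:0/6200 val_loss:13.2661 train_time:179ms step_avg:nanms "
    "step:125/6200 val_loss:5.4293 train_time:17992ms step_avg:156.45ms\n"
    "step:6200/6200 val_loss:3.2959 train_time:982387ms step_avg:158.71ms"
)
entries = parse_val_log(sample)
final = entries[-1]
print(final.step, final.val_loss)
```

Applied to the full logs, the last entry of each run gives the final validation loss that the tables under "Stacks" round to two decimals.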

Stacks

Results:

val_loss_stack   val_losses         model_names
15.87            3.33, 3.43         'two', 'three'
11.80            3.33, 3.33         'two', 'two'
11.34            3.30, 3.33         'one', 'two'
18.38            3.30, 3.33, 3.43   'one', 'two', 'three'

With nicer naming:

val_loss_stack   val_losses         model_names
15.87            3.33, 3.43         'mixin-model', 'target-model'
11.80            3.33, 3.33         'mixin-model', 'mixin-model'
11.34            3.30, 3.33         'orig-model', 'mixin-model'
18.38            3.30, 3.33, 3.43   'orig-model', 'mixin-model', 'target-model'
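
To make the size of the degradation explicit, the sketch below computes, for each row of the table above, the gap between the stacked validation loss and the best individual loss among the stacked models. It only does arithmetic on the reported numbers and assumes nothing about how the stacks were built; the variable names (`rows`, `gaps`) are illustrative.

```python
# Rows copied from the table above:
# (val_loss_stack, individual val_losses, model_names)
rows = [
    (15.87, (3.33, 3.43), ("mixin-model", "target-model")),
    (11.80, (3.33, 3.33), ("mixin-model", "mixin-model")),
    (11.34, (3.30, 3.33), ("orig-model", "mixin-model")),
    (18.38, (3.30, 3.33, 3.43), ("orig-model", "mixin-model", "target-model")),
]

gaps = []
for stack_loss, losses, names in rows:
    # How much worse the stack is than its best single component.
    gap = round(stack_loss - min(losses), 2)
    gaps.append(gap)
    print(f"{' + '.join(names)}: stack {stack_loss:.2f}, "
          f"best single {min(losses):.2f}, gap {gap:.2f}")
```

In every reported combination the stacked loss is well above the loss of any of its components.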