Subject: Rebel NNUE development diary Tue Jul 05, 2022 9:17 am
A look in the REBEL kitchen just for the fun of it.
I am currently creating a new, even bigger network (3.6 billion positions) using a new architecture from Chris that moves from 3 layers to one, meaning a) a somewhat less knowledgeable evaluation but b) compensated by a bigger network and a much higher NPS.
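For illustration only, a tiny numpy sketch of that trade-off; the layer sizes and names below are made up and not the actual architecture. The 3-layer head does two extra matrix multiplications per evaluation, the 1-layer head a single dot product, which is where the extra NPS comes from.
Code:
import numpy as np

ACC = 512                      # hypothetical accumulator width

def eval_three_layers(acc, w1, w2, w3):
    # older style: accumulator -> two small hidden layers -> score
    h1 = np.maximum(acc @ w1, 0)          # ReLU
    h2 = np.maximum(h1 @ w2, 0)
    return float(h2 @ w3)

def eval_one_layer(acc, w_out):
    # new style: a (bigger) accumulator feeding the output directly;
    # less non-linear "knowledge", but far fewer multiplications per
    # evaluation, hence the much higher NPS
    return float(acc @ w_out)

acc = np.random.randn(ACC)
print(eval_three_layers(acc, np.random.randn(ACC, 32),
                        np.random.randn(32, 32), np.random.randn(32)))
print(eval_one_layer(acc, np.random.randn(ACC)))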
In this thread you can follow the daily progress of how the neural net is built by the learner. In a nutshell: to process the massive data the learner needs 4-5 days to finish. Every 25-30 minutes the learner creates what is called an epoch, which represents the current strength of the network. As the learner progresses it will slowly create ever stronger epochs. For this net it's expected that at least 200 epochs are needed to squeeze the maximum Elo out of the ~150 GB of data.
SSE testing - I have created a 10-engine GRL Elo pool of 3349; each epoch will play 1000 games at 40/60.
Code:
EPOCH GAMES PERC  ELO   [SSE testing] [elo pool 3349]
   36  1000 48.5% 3339 [-10]  |  EPOCH GAMES PERC  ELO
   50  1000 51.2% 3357  [+8]  |    170  2000 57.2% 3400 [+51] * 2000 games from now on
   61  1000 52.4% 3366 [+17]  |    185  2000 56.1% 3392 [+43]
   70  1000 52.3% 3366 [+17]  |    210  2000 57.3% 3400 [+51] * patience is a virtue.
   80  1000 53.4% 3372 [+23]  |    202  2000 56.6% 3395 [+46]
   90  1000 51.0% 3356  [+7]  |    240  2000 57.1% 3399 [+50]
  100  1000 55.9% 3390 [+41]  |    240+ 2000 57.9% 3404 [+55] * plus search change [690]
  123  1000 55.6% 3388 [+39]  |    240+ 2000 58.7% 3410 [+61] * plus search change [890]
  142  1000 56.9% 3397 [+48]  |    240+ 2000 59.0% 3412 [+63] * plus search change [875]
  158  1000 55.8% 3389 [+40]  |    240+ 2000 58.4% 3408 [+59] * 40/120 testing
IMPORTANT - the testing is done on my SSE PC, which is a considerable disadvantage for REBEL, playing mainly against HCE engines.
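For readers wondering how the PERC column turns into the ELO column: it is (roughly) the standard logistic Elo formula applied to the score against the pool. A small sketch below; the pool rating is the one from the table header, and since the table itself comes from the rating tool, individual rows can differ by a few points.
Code:
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

pool = 3349
for perc in (48.5, 51.2, 57.2):
    d = elo_diff(perc / 100.0)
    print(f"{perc}% -> {d:+.0f} Elo -> rating ~{pool + d:.0f}")
# 48.5% -> about -10, 51.2% -> about +8, 57.2% -> about +51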
AVX2 testing - Elo pool of 3379, 30 Elo higher than the SSE pool.
Code:
EPOCH GAMES PERC  ELO   [AVX2 testing] [elo pool 3379]
  100  1000 56.6% 3425 [+46]
  123  1000 57.2% 3429 [+50]
  142  1000 57.8% 3433 [+54]
  158  1000 57.0% 3428 [+49]
  170  1000 57.6% 3432 [+53]
  185  1000 57.9% 3434 [+55]
  210  1000 58.1% 3435 [+56] * patience is a virtue.
  202  1000 58.4% 3437 [+58]
  240  1000 59.3% 3444 [+65] * did Santa finally arrive?
Both tables will be updated when new epochs are tested, probably 3 times a day in the coming 4-5 days.
Self play against Rebel 15x2 on my AVX2 machine, time control also 40/60
Code:
EPOCH GAMES PERC   ELO
   61  1000 61.3%  +79
   80  1000 64.7% +103
  123  1000 63.9%  +96
Last edited by ed on Wed Jul 06, 2022 9:37 pm; edited 10 times in total
TheSelfImprover, adminx, matejst, Damir Desevac, Prometheus, Dio and Ipmanchess like this post
What are the reasons to stop the NNUE training after only 100 epochs, when at least 200 epochs are necessary as estimated by you?
I assume that the Elo increase after 100 epochs will not be very big, while the effort becomes disproportionate, but this is only a guess of mine.
What are your current test results?
Edit: I misunderstood something there, the NNUE training seems to be continuing.
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
Dio wrote:
What are the reasons to stop the NNUE training after only 100 epochs, when at least 200 epochs are necessary as estimated by you?
Note what I said, at least 200 and it all depends.
Dio wrote:
I assume that the Elo increase after already 100 epochs will not be very big, but the effort will be disproportionate, but this is only a guess of mine.
It's a good guess. Expectations from 100 to 200 are at least 20-30 Elo, sometimes even 50. It's totally unclear, the learner is inscrutable.
Thanks for the clarification, I misunderstood something there....
Admin likes this post
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Rebel NNUE development diary Thu Jul 07, 2022 9:20 am
This is a diary, so here goes: yesterday afternoon we got a one-second power failure, everything went out and restarted one second later. 2½ days of work gone. Fortunately the learner has a restart option, but apparently the learner software needs an update first, so there is a delay and a pause.
Meanwhile I took the opportunity to clean up the OP where the results are posted.
Another thing worth mentioning: just 1000 games is not very accurate testing, especially not in self-play. The error bar for 1000 games is about -15/+15 Elo, or roughly -2/+2%, so results can fluctuate, but the general pattern will be a rising one.
The procedure after the learner is finished:
1. Pick the 3 most promising epochs as candidates for the final version and increase the number of games to 5000, using self-play. The error bar for 5000 games is -7/+7 Elo, or about -1/+1%.
2. The best version (and usually there is very little difference between the 3) plays 5000 games against a given Elo pool of other engines.
3. If I am happy, it's time for release.
Ideally one plays 30,000-40,000 games in order to reduce the error bar even more, but I don't have unlimited hardware like the SF folks or OpenBench.
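As a back-of-the-envelope check of those error bars, a small sketch below. It assumes a ~50% score and a ~40% draw rate, both assumptions rather than measured values from these matches; a higher draw rate, typical for self-play, shrinks the bar a bit further.
Code:
import math

def error_bar(games, score=0.5, draw_rate=0.4):
    # per-game variance of the result with 1 / 0.5 / 0 scoring
    win = score - draw_rate / 2.0
    loss = 1.0 - win - draw_rate
    var = win * 1.0 + draw_rate * 0.25 + loss * 0.0 - score ** 2
    margin = 1.96 * math.sqrt(var / games)              # 95% margin on the score
    elo_per_point = 400.0 / (math.log(10) * score * (1 - score))
    return margin, margin * elo_per_point

for n in (1000, 5000):
    m, e = error_bar(n)
    print(f"{n} games: ±{m * 100:.1f}% / ±{e:.0f} Elo")
# roughly ±2.4% / ±17 Elo for 1000 games, ±1.1% / ±7 Elo for 5000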
matejst and Dio like this post
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Dio wrote:
What are the reasons to stop the NNUE training after only 100 epochs, when at least 200 epochs are necessary as estimated by you?
I assume that the Elo increase after 100 epochs will not be very big, while the effort becomes disproportionate, but this is only a guess of mine.
What are your current test results?
Edit: I misunderstood something there, the NNUE training seems to be continuing.
Most of the “Elo increase” comes from the first ten epochs. An epoch, btw, is an arbitrary number of positions; Gary Linscott defines it as 100M positions, so I guess that's now the chess industry standard. CSTal, which gets to about 3600 on the GRL scale, is already at around 3400 Elo after one epoch. It gets another 100 or so by 50-100 epochs, and then the final 100 over the next 200 or so. I've only 6B positions in my current set, so after 60 epochs mine will be recycling through the training set again. Ideal would be enough positions for 400 epochs without recycling. Is that 40B? Something of that order; then presumably we get more Elo. Elo is not a linear scale. And we are discussing not the NNUE strength but the search-NNUE combination, but it's kind of amusing that perhaps half an epoch of training combined with a good search is well beyond world champion level.
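The arithmetic behind those figures, taking the 100M-positions-per-epoch convention as given (the dataset sizes are just the ones mentioned above):
Code:
POSITIONS_PER_EPOCH = 100_000_000   # Gary Linscott's convention

for dataset in (3_600_000_000, 6_000_000_000, 40_000_000_000):
    epochs_before_recycling = dataset // POSITIONS_PER_EPOCH
    print(f"{dataset / 1e9:.1f}B positions -> {epochs_before_recycling} epochs before recycling")
# 3.6B -> 36, 6B -> 60 (the current set), 40B -> 400 (the "ideal" case)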
I am not familiar with the training of an NNUE. It looks somewhat different from the training of an NN (Lc0), where I can at least estimate how much increase in playing strength can still be expected.
I think that the training of an NNUE and an NN have many similarities, but also many differences, since the training of an NN requires considerably more computing effort.
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Dio wrote:
I am not familiar with the training of an NNUE. It looks somewhat different from the training of an NN (Lc0), where I can at least estimate how much increase in playing strength can still be expected.
I think that the training of an NNUE and an NN have many similarities, but also many differences, since the training of an NN requires considerably more computing effort.
Well, my anecdotes are not definitive. Elo ramps up fast while all the training positions are unique (60 epochs in my case) and then only slowly after that. What would happen on adding more training positions is not known (to me) at present. It's also unclear how net complexity (size and structure) relates to the total chess knowledge absorbable. LC0 is another case. It also computes predicted-move outputs. It's not clear, at least to me, that using visual-recognition resnets for chess is going to be better than keeping things relatively simple. Dunno. Not looked at LC0 code stuff for a long time. Anyway, all these net styles can change in a flash if there's a new development in CPU/GPU. The two styles are the way they are because of the hardware.
Dio likes this post
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Rebel NNUE development diary Fri Jul 08, 2022 10:00 am
UPDATE DIARY
I wasn't able to restart the learner and finally gave up and restarted the learner from scratch, meaning a time loss of 2½ days before reaching epoch 123 again.
Instead of waiting and doing nothing, or taking a short holiday, I decided to test a search change in the meantime using epoch 123, the last epoch before the power failure.
First impression looks promising.
Code:
EPOCH GAMES PERC  ELO   [SSE testing] [elo pool 3349]
  123  1000 55.6% 3388 [+39]
  123+ 1000 58.4% 3407 [+58] * epoch 123 + search change
Code:
EPOCH GAMES PERC  ELO   [AVX2 testing] [elo pool 3379]
  123  1000 57.2% 3429 [+50]
  123+ 1000 58.5% 3438 [+59] * epoch 123 + search change
Meanwhile I am now testing this search change more thoroughly, playing 5000 games.
Mclane, adminx, matejst, Damir Desevac, Dio and Ipmanchess like this post
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Admin wrote:
I wasn't able to restart the learner and finally gave up and restarted the learner from scratch, meaning a time loss of 2½ days before reaching epoch 123 again.
Instead of waiting and doing nothing, or taking a short holiday, I decided to test a search change in the meantime using epoch 123, the last epoch before the power failure.
First impression looks promising.
Code:
EPOCH GAMES PERC  ELO   [SSE testing] [elo pool 3349]
  123  1000 55.6% 3388 [+39]
  123+ 1000 58.4% 3407 [+58] * epoch 123 + search change
Code:
EPOCH GAMES PERC  ELO   [AVX2 testing] [elo pool 3379]
  123  1000 57.2% 3429 [+50]
  123+ 1000 58.5% 3438 [+59] * epoch 123 + search change
Meanwhile I am now testing this search change more thoroughly, playing 5000 games.
I’ve been doing search-y things a lot while waiting for net builds and net play testing. It’s a good time for experimenting with ideas.
matejst likes this post
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
Surprising development: I finally got the restart procedure of the learner to work, so you can follow the progress of the learner (epoch 142) again in the OP. The search changes (estimated 5-15 Elo) will have to wait.
matejst and Dio like this post
Dio
Posts : 222 Join date : 2021-08-28
Subject: Re: Rebel NNUE development diary Sun Jul 10, 2022 7:20 pm
Epoch 240 looks very strong..
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Rebel NNUE development diary Sun Jul 10, 2022 7:36 pm
Dio wrote:
Epoch 240 looks very strong..
Yes, it will be a worthy candidate for the final network.
Because of the sudden (long-awaited) increase I let the learner run a bit longer and am now testing epoch 260. First to test stability, and maybe it's in the mood to give more; that happens sometimes.
adminx, matejst, Damir Desevac and Dio like this post
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Rebel NNUE development diary Sun Jul 10, 2022 11:04 pm
UPDATE - Net training (also known as the learner) is in its final stage. In the SSE section of the OP I have already made a (poor) start testing 2 minor search changes. Poor, because Thorsten is currently helping to test these 2 changes in a decent way, 5000 games.
Out of curiosity I will do some unusual experiments afterwards with the best network, I don't expect anything of it but if you never shoot you don't hit anything.
Great news, Ed! I am positively surprised by how much Chris and you have managed to improve the evaluation in such a short time -- experimenting with data, NN structures, about 200 Elo! To put things in perspective, with a modern search Rebel would be among the top three engines (I guess that CSTal is probably there).
Did you have time to work on the weakness with check sequences?
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
matejst wrote:
Great news, Ed! I am positively surprised by how much Chris and you have managed to improve the evaluation in such a short time -- experimenting with data, NN structures, about 200 Elo! To put things in perspective, with a modern search Rebel would be among the top three engines (I guess that CSTal is probably there).
Top 3 is out of the question if you are mainly interested in NNUE development; top 10 would be nice, however.
matejst wrote:
Did you have time to work on the weakness with check sequences?
I had a look at the code and it seems the TOGA guys prune moves that give check; that's a bad idea and could explain your complaint.
Doing a last check, tuning the tuner-value, maybe there is a 0-10 elo gain.
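For illustration only, a sketch of the kind of guard meant here; this is not Rebel's or Toga's actual code, and the helper names and limits are placeholders.
Code:
def may_prune(pos, move, depth, move_count):
    # The point: forward-pruning rules (late-move pruning and friends)
    # should normally exempt moves that give check, otherwise forcing
    # check sequences drop out of the search.
    if pos.gives_check(move):          # hypothetical helper
        return False                   # never prune checking moves
    if pos.in_check():                 # and never prune evasions
        return False
    return depth <= 3 and move_count > 8   # hypothetical late-move rule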
Admin wrote:
Top 3 is out of the question if you are mainly interested in NNUE development; top 10 would be nice, however.
I had a look at the code and it seems the TOGA guys prune moves that give check; that's a bad idea and could explain your complaint.
For me, the progress with the NNUE is the most valuable aspect here, especially since Rebel's evaluation was already excellent (with the exception of some tactical positions). I noticed that many other engines plateau after a certain level with the NN, while Ed and Chris continue to improve, and there's probably something valuable in their efforts for other authors. [No doubt that anything creative will be incorporated sooner or later in SF.]
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Rebel NNUE development diary Wed Jul 13, 2022 7:56 am
The last chapter for every new release is finding the optimal balance between search and evaluation. A bit technical, but here goes.
1. You have created a good HCE engine and the evaluation values are in sync with all the values you use in search, for each search algorithm you use. Finding the optimal values in search is a work of months, if not years.
2. At a certain moment a programmer wants to move from HCE evaluation to NNUE evaluation, and as we have seen the results are fantastic. But wait a moment, the question arises: are the evaluations that now come from the network still in sync with the search values in use? For the neural nets I have created so far it's always the same old song: the evaluation values from the neural net are significantly higher than those of the HCE version. Result: the balance between eval and search is off, and that costs Elo.
3. To restore the balance between eval and search you can do it in 2 ways: a) re-tune all the search parameters, which with good hardware is weeks of testing, otherwise months; b) tune the value that comes from the NNUE evaluation, a couple of days of testing. Instead of a dozen (or so) search tuning parameters you reduce the work to a single parameter (a rough sketch below).
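A minimal sketch of option b), with made-up names and a made-up factor, just to show the idea of one scaling knob instead of re-tuning every search margin.
Code:
def nnue_forward(pos):
    # placeholder for the real network inference; returns centipawns
    return 135

NNUE_SCALE = 80          # percent; hypothetical value, found by testing a few candidates

def evaluate(pos):
    raw = nnue_forward(pos)                  # raw net output, typically "too high"
    return (raw * NNUE_SCALE) // 100         # scale it back onto the old HCE range

print(evaluate(None))    # 135 cp from the net -> 108 cp on the scale the search was tuned for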
The first part is done: I have been testing 6 evaluation parameters with 4000 games at 40/20, and there is reason to believe one of those values gives another 5-10 Elo. I am now double-checking whether that still holds playing 4000 games at 40/60.
matejst, Damir Desevac and Dio like this post
Admin Admin
Posts : 2609 Join date : 2020-11-17 Location : Netherlands
So far, very good impressions. Much faster, while it seems that the evaluation is as good, perhaps even better than it was (I have not done my usual tests yet).
Damir Desevac
Posts : 330 Join date : 2020-11-27 Age : 43 Location : Denmark
Ed, do you plan to release a learn file (exp) in the next version of Rebel, so that Rebel contains a learn file together with the NNUE, like Eman and Sugar do? I think a Rebel learn file would significantly boost Rebel's strength even further.