Subject: Stockfish New Net 45 Mb? Tue May 18, 2021 9:39 pm
Someone posted on my channel that Stockfish will be coming out with a new 45 Mb net. The poster also claimed a nice rating increase with the new net. Does anyone have any information on the new Stockfish net?
Last edited by mwyoung on Tue May 18, 2021 11:13 pm; edited 1 time in total
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish New Net 45 Mb? Tue May 18, 2021 10:01 pm
I am being told it will be released in a few hours.
Damas Clásicas: Hi Mark. In a few hours it will be on Abrok.
TheSelfImprover
Posts : 2876 Join date : 2020-11-18
Subject: Re: Stockfish New Net 45 Mb? Tue May 18, 2021 10:52 pm
mwyoung wrote:
Someone posted on my channel that Stockfish will be coming out with a new 45 Gb net. The poster also claimed a nice rating increase with the new net. Does anyone have any information on the new Stockfish net?
Who trained it?
How did they train it???
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish New Net 45 Mb? Tue May 18, 2021 10:56 pm
TheSelfImprover wrote:
mwyoung wrote:
Someone posted on my channel that Stockfish will be coming out with a new 45 Gb net. The poster also claimed a nice rating increase with the new net. Does anyone have any information on the new Stockfish net?
Who trained it?
How did they train it???
It sounds like it was made by the Stockfish team, but I am not sure whether it was made by someone else. It will be used by Stockfish in the next dev release. I am being told it will be released sometime today, in the next few hours.
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish New Net 45 Mb? Tue May 18, 2021 11:14 pm
Here is more info posted on my channel.
Miro Pavlicko: It will appear on Abrok when Abrok updates the page.
Miro Pavlicko: And it was trained by the SF team, with the trainer the SF team wrote for net training.
TheSelfImprover
Posts : 2876 Join date : 2020-11-18
Subject: Re: Stockfish New Net 45 Mb? Wed May 19, 2021 9:43 am
mwyoung wrote:
...it was trained by SF team with the trainer SF team wrote for net training
I find it difficult to visualise training a 45 GB net with current tools, but I could be wrong.
Also, I would have thought that running a 45 GB net would be slow, but again I could be wrong (you probably wouldn't want to run it from a disc drive).
Damir Desevac
Posts : 316 Join date : 2020-11-27 Age : 42 Location : Denmark
Subject: Re: Stockfish New Net 45 Mb? Wed May 19, 2021 5:32 pm
The net is 45 MB, not 45 GB. A 45 GB net would be impossible to load and would eat half of your hard disc.
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish New Net 45 Mb? Wed May 19, 2021 9:01 pm
Info on the new net and rating.
Author: Tomasz Sobczyk
Date: Tue May 18 18:06:23 2021 +0200
Timestamp: 1621353983
New NNUE architecture and net
Introduces a new NNUE network architecture and associated network parameters, as obtained by a new pytorch trainer.
The network is already very strong at short TC, without regression at longer TC, and has potential for further improvements.
Hello Chris, This is all the information I have at the moment.
Code:
This network also contains a few architectural changes with respect to the current master:
Size changed from 256x2-32-32-1 to 512x2-16-32-1
  - ~15-20% slower
  - ~2x larger
  - adds a special path for 16-valued ClippedReLU
  - fixes affine transform code for 16 inputs/outputs by using InputDimensions instead of PaddedInputDimensions; this is safe now because the inputs are processed in groups of 4 in the current affine transform code
The feature set changed from HalfKP to HalfKAv2
  - includes information about the kings, like HalfKA
  - packs king features better, resulting in an 8% size reduction compared to HalfKA
  - the board is flipped for black's perspective, instead of rotated as in the current master
PSQT values for each feature
  - the feature transformer now outputs a part that is forwarded directly to the output, which allows learning piece values more directly than the previous network architecture; the effect is visible in high-imbalance positions, where the current master network outputs evaluations skewed towards zero
  - 8 PSQT values per feature, chosen based on (popcount(pos.pieces()) - 1) / 4
  - initialized to classical material values at the start of training
8 subnetworks (512x2->16->32->1), chosen based on (popcount(pos.pieces()) - 1) / 4
  - only one subnetwork is evaluated for any position, with no or marginal speed loss
A diagram of the network is available: https://user-images.githubusercontent.com/8037982/118656988-553a1700-b7eb-11eb-82ef-56a11cbebbf2.png A more complete description: https://github.com/glinscott/nnue-pytorch/blob/master/docs/nnue.md
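To make the numbers in the commit message above concrete, here is a NumPy sketch of the forward pass: two 512-wide perspective accumulators, one of 8 bucketed 1024->16->32->1 stacks chosen by piece count, and a per-bucket PSQT value forwarded straight to the output. This is a toy illustration, not the actual Stockfish code; the quantization, sparse feature indexing, and the real HalfKAv2 feature count (45,056) are replaced with made-up small dimensions and random weights.

```python
import numpy as np

def clipped_relu(x):
    # NNUE uses a ReLU clipped to [0, 1] (integer-scaled in the real engine)
    return np.clip(x, 0.0, 1.0)

NUM_FEATURES = 4096   # toy stand-in; real HalfKAv2 has 45,056 features
ACC_WIDTH = 512       # the new, doubled accumulator width per perspective
NUM_BUCKETS = 8       # 8 PSQT values and 8 subnetworks per the commit message

rng = np.random.default_rng(0)
ft_weights = rng.normal(size=(NUM_FEATURES, ACC_WIDTH)) * 0.01
psqt_weights = rng.normal(size=(NUM_FEATURES, NUM_BUCKETS)) * 0.01
# one 1024 -> 16 -> 32 -> 1 stack per bucket
l1 = [rng.normal(size=(2 * ACC_WIDTH, 16)) * 0.01 for _ in range(NUM_BUCKETS)]
l2 = [rng.normal(size=(16, 32)) * 0.01 for _ in range(NUM_BUCKETS)]
out = [rng.normal(size=(32, 1)) * 0.01 for _ in range(NUM_BUCKETS)]

def evaluate(white_features, black_features, piece_count):
    # bucket choice straight from the commit message:
    # (popcount(pos.pieces()) - 1) / 4, so 2 pieces -> 0, 32 pieces -> 7
    bucket = (piece_count - 1) // 4
    acc_w = ft_weights[white_features].sum(axis=0)    # (512,)
    acc_b = ft_weights[black_features].sum(axis=0)    # (512,)
    x = clipped_relu(np.concatenate([acc_w, acc_b]))  # (1024,)
    x = clipped_relu(x @ l1[bucket])                  # (16,)
    x = clipped_relu(x @ l2[bucket])                  # (32,)
    nn_out = (x @ out[bucket])[0]
    # PSQT part: forwarded directly to the output, skipping the layer stack
    psqt = (psqt_weights[white_features, bucket].sum()
            - psqt_weights[black_features, bucket].sum())
    return nn_out + psqt
```

Only one of the eight subnetworks runs for any position, which is why the commit message can claim no or marginal speed loss despite holding eight stacks.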
But the testing results are not great for the 45 MB net, and Ed's results are showing the same at the moment. I do not think this net will overtake Stockfish 13 in real-world testing. The +21 Elo figure is at a 10s+0.1s time control on 1 thread; the drop in performance with more time and threads is very sharp.
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: Re: Stockfish New Net 45 Mb? Sat May 22, 2021 3:26 pm
mwyoung wrote:
Chris Whittington wrote:
mwyoung wrote:
Info on the new net and rating.
Author: Tomasz Sobczyk
Date: Tue May 18 18:06:23 2021 +0200
Timestamp: 1621353983
New NNUE architecture and net
Introduces a new NNUE network architecture and associated network parameters, as obtained by a new pytorch trainer.
The network is already very strong at short TC, without regression at longer TC, and has potential for further improvements.
Hello Chris, This is all the information I have at the moment.
Code:
This network also contains a few architectural changes with respect to the current master:
Size changed from 256x2-32-32-1 to 512x2-16-32-1
  - ~15-20% slower
  - ~2x larger
  - adds a special path for 16-valued ClippedReLU
  - fixes affine transform code for 16 inputs/outputs by using InputDimensions instead of PaddedInputDimensions; this is safe now because the inputs are processed in groups of 4 in the current affine transform code
The feature set changed from HalfKP to HalfKAv2
  - includes information about the kings, like HalfKA
  - packs king features better, resulting in an 8% size reduction compared to HalfKA
  - the board is flipped for black's perspective, instead of rotated as in the current master
PSQT values for each feature
  - the feature transformer now outputs a part that is forwarded directly to the output, which allows learning piece values more directly than the previous network architecture; the effect is visible in high-imbalance positions, where the current master network outputs evaluations skewed towards zero
  - 8 PSQT values per feature, chosen based on (popcount(pos.pieces()) - 1) / 4
  - initialized to classical material values at the start of training
8 subnetworks (512x2->16->32->1), chosen based on (popcount(pos.pieces()) - 1) / 4
  - only one subnetwork is evaluated for any position, with no or marginal speed loss
A diagram of the network is available: https://user-images.githubusercontent.com/8037982/118656988-553a1700-b7eb-11eb-82ef-56a11cbebbf2.png A more complete description: https://github.com/glinscott/nnue-pytorch/blob/master/docs/nnue.md
But the testing results are not great for the 45 MB net, and Ed's results are showing the same at the moment. I do not think this net will overtake Stockfish 13 in real-world testing. The +21 Elo figure is at a 10s+0.1s time control on 1 thread; the drop in performance with more time and threads is very sharp.
The problem with the old nets was that their architecture was by no means optimal, but so much work and time had gone into building and training them that changing the design was inhibited. So now they have taken that step, using Gary's GitHub code, which has been worked on for the last few months.
I always find reading Gary's code difficult, but....
The indexing of the old net using Ksq was always dumb, so they have a new and probably improved method. This changes the way they update the net accumulator during search, but probably won't make much difference to how often full recomputes are needed. I'd guess this gives potential performance without an nps penalty. They doubled the "width" from 256 to 512. This is relatively time-expensive every time the accumulator needs incremental updating (more or less each move in the search tree). They halved the next layer. This gains some time, but probably not as much as is lost by doubling the accumulator width.
Doubling the width allows more, and more interesting, "things" to be extracted from the raw position. How much? Is it worth it? Who knows; only testing tells. Halving the next layer reduces the non-linearity processing capability of the layer stack. By how much? Who knows; only testing. Testing is not easy: it takes a relatively long time to train up a net to find out how these changes affect a well-trained network.
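The accumulator trade-off described above can be sketched in a few lines of NumPy. This is a toy illustration of the incremental-update idea, not Stockfish's actual code; the feature count and weights are made up. The point is that a quiet move only adds and removes a couple of feature columns, so the per-move cost scales with the accumulator width, not with the full feature count:

```python
import numpy as np

NUM_FEATURES, ACC_WIDTH = 1000, 512   # toy feature count; real width is 512
rng = np.random.default_rng(1)
W = rng.normal(size=(NUM_FEATURES, ACC_WIDTH))

def full_refresh(active):
    # full recompute: sum one 512-wide column per active feature
    return W[list(active)].sum(axis=0)

def incremental_update(acc, removed, added):
    # cost is O(len(removed) + len(added)) columns of width 512,
    # so doubling the width from 256 to 512 doubles this per-move cost
    return acc - W[list(removed)].sum(axis=0) + W[list(added)].sum(axis=0)

active = {3, 42, 700}
acc = full_refresh(active)
# a quiet move: one feature disappears, one appears
acc = incremental_update(acc, removed={42}, added={137})
assert np.allclose(acc, full_refresh({3, 700, 137}))
```

The incremental result matches a full recompute exactly, which is why the engine only falls back to a refresh when too many features change at once (e.g. a king move under HalfKA-style feature sets).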
They've also added the capability to feed some pre-computed auxiliary knowledge directly into the upper layer(s); currently SF piece values (I think).
Basically, they have got something a bit more amenable to design changes and the introduction of new ideas. I'd see it more as a playground than a finished design. The problem, as ever, is that it takes time to make and test changes, but you can expect to see people making design alterations and releasing new nets. Probably you can expect incremental improvements, coupled with various wild claims. There's no rocket science in there, as far as I can see. Quite likely some silent people are quietly doing something.