Rodent NNUE development

Subject: Rodent NNUE development Sun Mar 27, 2022 5:31 pm

Months ago I promised Pawel a Rodent net running under Rebel 14. A first try failed. But I now have a good base of 450 million positions (created from Rodent 4.022) running. On this page you can follow the development of the NNUE.

Keep in mind that Rodent 4.022 is rated ~3000 elo, while the Rebel search, based on Toga, is estimated at 2850-2900 max.

The learner is still at an early stage: it's currently at epoch-15 and should run at least until epoch-100.

First results

Code: Rodent-NN vs Rodent 4.022

Epoch      Games  Time   Result
Epoch-3    1000   40/20  67.3%
Epoch-8    1500   40/20  69.4%
Epoch-15    500   40/20  76.4%
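
For readers who want to translate the match percentages above into rating differences, here is a minimal sketch using the standard logistic Elo formula. The function name `elo_diff` is made up for this illustration and is not part of any Rebel or Rodent tooling.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a match score in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Match scores taken from the table above (Rodent-NN vs Rodent 4.022).
results = {"Epoch-3": 0.673, "Epoch-8": 0.694, "Epoch-15": 0.764}

for epoch, score in results.items():
    print(f"{epoch}: {score:.1%} -> {elo_diff(score):+.0f} elo")
```

A 50% score maps to a difference of zero, and the difference grows quickly as the score moves away from 50%.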

The results are already so good that it is better to leave this path and play against other engines. I have chosen a 3218 elo pool, which is no doubt too strong at this early stage of the learner, but I hope Rodent NN will reach this level as the learner makes progress.

http://rebel13.nl/a/grl.htm

Code:
Epoch-15 : 3168 elo
Epoch-30 : 3178 elo
Epoch-40 : 3187 elo
Epoch-50 : 3180 elo
Epoch-60 : 3174 elo
Epoch-75 : 3183 elo
Epoch-85 : 3162 elo
Epoch-95 : 3178 elo

Notes

1. The elo gain is in the range of 278 - 328 (3178 minus 2900 or 2850), depending on how strong you estimate the Toga search to be.

2. For a relatively small dataset of 450 million positions this is a good result.

3. Not much happens after epoch-30, which is not unexpected.

4. Note that the error bar for 1000 games is -15/+15.
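
The error bar in note 4 can be checked with a rough calculation. The sketch below is only an approximation: it assumes a trinomial win/draw/loss model, a 95% confidence level, and a 50% draw rate. The draw rate is a hypothetical figure chosen for illustration and does not come from the thread.

```python
import math


def elo_error_margin(games: int, draw_ratio: float,
                     score: float = 0.5, z: float = 1.96) -> float:
    """Approximate +/- Elo margin for a match result.

    games      -- number of games played
    draw_ratio -- fraction of drawn games (assumed, not measured here)
    score      -- overall match score in (0, 1)
    z          -- normal quantile, 1.96 for roughly 95% confidence
    """
    wins = score - draw_ratio / 2.0
    losses = 1.0 - score - draw_ratio / 2.0
    if wins < 0 or losses < 0:
        raise ValueError("draw_ratio is inconsistent with score")
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2)
    std_error = math.sqrt(variance / games)
    # Slope of the logistic Elo curve at this score (Elo per unit of score).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return z * std_error * slope


print(f"1000 games: +/- {elo_error_margin(1000, 0.5):.1f} elo")
print(f" 500 games: +/- {elo_error_margin(500, 0.5):.1f} elo")
```

Under these assumptions 1000 games come out at roughly 15 elo either way, in line with the note, and halving the number of games widens the margin by a factor of about 1.4.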

Preparing for PART-II
Increase the dataset to ~750 million positions and try again; most of the time this gives more elo. Then restart the learner and repeat the process.

Posts : 1254 Join date : 2020-11-17 Location : France

Three epochs is about one hour of training on Ed’s system. After three epochs, the engine has maybe 3100 Elo. World Chess Champion Elo is maybe 2850.
Who could have imagined this even ten years ago. Less than an hour to become a Super-GM.

Subject: Re: Rodent NNUE development Mon Mar 28, 2022 2:27 am

Chris Whittington wrote:

Three epochs is about one hour of training on Ed’s system. After three epochs, the engine has maybe 3100 Elo. World Chess Champion Elo is maybe 2850.
Who could have imagined this even ten years ago. Less than an hour to become a Super-GM.

And a look in the kitchen at how we bake our engines; once they are cooked, they become yours.

Subject: Re: Rodent NNUE development Tue Mar 29, 2022 7:43 pm

PART TWO

Using 800 million Rodent positions vs previous run of 450 million.

Code:
Epoch      450M   800M   Diff
Epoch-10   ----   3164
Epoch-15   3168   3182   +14
Epoch-25   ----   3179
Epoch-30   3178   ----
Epoch-40   3187   3147   -40 ??
Epoch-50   3180   3207   +27
Epoch-60   3174   3193   +19
Epoch-75   3183   3191   +8
Epoch-85   3162   3194   +32
Epoch-95   3178   3188   +10

This ends the Rodent NNUE development and testing.

Closing remarks:

1. The net created from the Rodent HCE gains ~300 elo over the Rodent HCE evaluation. Pawel is free to import Chris's GPL NNUE inference source into Rodent.

2. The net needs fine-tuning, which will certainly give extra elo.

3. I would like to release Rodent-NN running under my source code; it will be ~3200 elo on CCRL and GRL, but I am not sure what to call it.

4. At the start of this thread I mailed Pawel to make him aware of it, but I haven't seen him reading the forum. I can imagine he has different priorities: 3 million refugees entering your country, together with the tensions of the war, are not small things. Today I mailed him again. We will see.

http://rebel13.nl/a/grl.htm

Subject: Re: Rodent NNUE development Thu Mar 31, 2022 9:21 pm

http://rebel13.nl/dl/rodent-nn.7z

Web site tomorrow.

Posts : 131 Join date : 2020-11-20

Thanks for providing this interesting engine.
I played a test game on two nearly identical notebooks at a slow time control.
I liked the game very much because of the early exchange sac, which came as a surprise and a challenge to my own evaluation, though it has to be said that, to my surprise, Crafty understood it just as well; both engines arrived at similar conclusions.
But Rodent's time management can't be good here, and the same probably applies to Rebel and Tal. I am not sure why (unless you introduced a new bug into Toga); maybe it is because Rodent generally produces bigger swings in evaluation than Rebel or Tal?
Settings were 4 threads and 5GB hash.
Spending 20 minutes on 2. … g6 and another 20 minutes on 3. … d5 has to be unreasonable time management.
[Event "Lang 120min+10sek"]
[Site "Berlin"]
[Date "2022.04.08"]
[Round "?"]
[White "Crafty 25.3"]
[Black "Rodent NN 1.1"]
[Result "0-1"]
[ECO "D85"]
[PlyCount "136"]
[TimeControl "7200+10"]

{5120MB, LAPTOP-NCDN8BTK} 1. d4 {[%emt 0:00:00]} Nf6 {[%eval 8,24] [%emt 0:03:
28]} 2. c4 {[%emt 0:00:07]} g6 {[%eval 26,25] [%emt 0:20:19]} 3. Nc3 {[%emt 0:
00:12]} d5 {[%eval 14,27] [%emt 0:20:34]} 4. cxd5 {[%emt 0:00:08] (Lg5)} Nxd5 {
[%eval 15,21] [%emt 0:02:08]} 5. e4 {[%emt 0:00:07] (Sf3)} Nxc3 {[%eval 19,24]
[%emt 0:02:31]} 6. bxc3 {[%emt 0:00:06]} Bg7 {[%eval 18,25] [%emt 0:02:50]} 7.
Nf3 {[%emt 0:00:07]} c5 {[%eval 0,26] [%emt 0:02:04]} 8. Rb1 {[%emt 0:00:08]
(Lb5+)} O-O {[%eval 20,23] [%emt 0:02:53]} 9. Be2 {[%emt 0:00:06]} b6 {[%eval
28,23] [%emt 0:02:07]} 10. O-O {[%emt 0:00:06]} Qc7 {[%eval 24,23] [%emt 0:01:
40]} 11. Bg5 {[%emt 0:00:07] (a4)} Bb7 {[%eval 36,23] [%emt 0:02:26]} 12. Qd3 {
[%emt 0:00:07]} e6 {[%eval 35,24] [%emt 0:03:04]} 13. Qe3 {[%emt 0:00:07]} Nd7
{[%eval 43,24] [%emt 0:01:22]} 14. e5 {[%emt 0:00:08] (Ld3)} Bd5 {[%eval 14,23]
[%emt 0:01:36]} 15. Rb2 {[%emt 0:04:33] (a4)} f6 {[%eval 2,23] [%emt 0:01:35]}
16. exf6 {[%emt 0:03:11]} Nxf6 {[%eval 0,27] [%emt 0:00:01]} 17. Bf4 {[%emt 0:
02:50]} cxd4 {[%eval 0,28] [%emt 0:00:01]} 18. cxd4 {[%emt 0:02:48]} Qb7 {
[%eval 0,28] [%emt 0:00:01]} 19. Qa3 {[%emt 0:03:39] (Ld6)} Nh5 {[%eval 0,26]
[%emt 0:01:27]} 20. Bd6 {[%emt 0:02:38]} Nf4 {[%eval 0,29] [%emt 0:00:01]} 21.
Bxf8 {[%emt 0:02:33]} Bxf8 {[%eval 0,31] [%emt 0:00:01]} 22. Qa4 {[%emt 0:02:
21] (De3)} a6 {[%eval -2,23] [%emt 0:02:07]} 23. Re1 {[%emt 0:03:19] (Dd1)} b5
{[%eval -3,25] [%emt 0:01:30]} 24. Qa5 {[%emt 0:03:55] (Dd1)} Bd6 {[%eval -21,
23] [%emt 0:02:07]} 25. Qc3 {[%emt 0:03:38] (Dd2)} Rf8 {[%eval -50,26] [%emt 0:
01:40]} 26. a4 {[%emt 0:03:04] (De3)} b4 {[%eval -134,23] [%emt 0:01:24]} 27.
Qe3 {[%emt 0:02:29]} a5 {[%eval -111,24] [%emt 0:00:31]} 28. h3 {[%emt 0:06:03]
(Tc2)} Qe7 {[%eval -159,24] [%emt 0:01:55]} 29. Rbb1 {[%emt 0:05:39] (Ld1)} Kg7
{[%eval -179,23] [%emt 0:03:18]} 30. Rb2 {[%emt 0:02:05] (Tb3)} Qf6 {[%eval
-164,24] [%emt 0:01:17]} 31. Bf1 {[%emt 0:00:26] (Tc2)} Rc8 {[%eval -119,25]
[%emt 0:01:06]} 32. Ne5 {[%emt 0:00:24]} Rc3 {[%eval -111,26] [%emt 0:01:04]}
33. Qd2 {[%emt 0:00:21]} Qg5 {[%eval -166,26] [%emt 0:00:44]} 34. h4 {[%emt 0:
04:57]} Qxh4 {[%eval -162,30] [%emt 0:00:01]} 35. Re3 {[%emt 0:01:41]} Ra3 {
[%eval -169,25] [%emt 0:00:01]} 36. Rxa3 {[%emt 0:01:20]} bxa3 {[%eval -169,23]
[%emt 0:00:01]} 37. Rc2 {[%emt 0:02:14]} Qg5 {[%eval -169,28] [%emt 0:00:01]}
38. g3 {[%emt 0:01:39]} Bb3 {[%eval -168,27] [%emt 0:00:01]} 39. Bc4 {[%emt 0:
01:35]} Bxc2 {[%eval -167,26] [%emt 0:00:01]} 40. Qxc2 {[%emt 0:01:28]} Bxe5 {
[%eval -169,27] [%emt 0:00:19]} 41. dxe5 {[%emt 0:01:05]} Nh3+ {[%eval -169,26]
[%emt 0:00:01]} 42. Kg2 {[%emt 0:00:56]} Qg4 {[%eval -169,22] [%emt 0:00:07]}
43. Qb3 {[%emt 0:01:52]} Ng5 {[%eval -170,22] [%emt 0:00:01]} 44. Qb7+ {
[%emt 0:01:10]} Kh6 {[%eval -170,25] [%emt 0:00:01]} 45. f4 {[%emt 0:00:50]}
Kh5 {[%eval -170,24] [%emt 0:00:48]} 46. fxg5 {[%emt 0:00:29]} Qxc4 {[%eval
-183,10] [%emt 0:00:01]} 47. Qxh7+ {[%emt 0:00:49]} Kxg5 {[%eval -183,10]
[%emt 0:00:01]} 48. Qe7+ {[%emt 0:00:46]} Kf5 {[%eval -183,16] [%emt 0:00:51]}
49. Qxa3 {[%emt 0:00:24] (Df6+)} Qe2+ {[%eval -333,20] [%emt 0:01:36]} 50. Kg1
{[%emt 0:00:12]} Kg4 {[%eval -341,18] [%emt 0:01:02]} 51. Qc1 {[%emt 0:01:09]
(Db3)} Qxe5 {[%eval -342,16] [%emt 0:01:31]} 52. Qd2 {[%emt 0:00:46] (Dc4+)} g5
{[%eval -344,20] [%emt 0:01:20]} 53. Kh2 {[%emt 0:00:49]} Qxg3+ {[%eval -344,
14] [%emt 0:00:20]} 54. Kh1 {[%emt 0:00:07]} Qb3 {[%eval -369,18] [%emt 0:00:
48]} 55. Qd4+ {[%emt 0:01:46]} Kh5 {[%eval -373,20] [%emt 0:00:01]} 56. Kh2 {
[%emt 0:00:47] (Dh8+)} Qf3 {[%eval -380,15] [%emt 0:00:53]} 57. Qa7 {[%emt 0:
00:52]} Qf4+ {[%eval -373,10] [%emt 0:00:01]} 58. Kg2 {[%emt 0:01:31]} Qe4+ {
[%eval -380,13] [%emt 0:00:01]} 59. Kg3 {[%emt 0:01:02] (Kh2)} Qg4+ {[%eval
-397,15] [%emt 0:00:58]} 60. Kh2 {[%emt 0:00:09]} Qxa4 {[%eval -409,14] [%emt
0:00:57]} 61. Qh7+ {[%emt 0:00:08] (Df7+)} Kg4 {[%eval -374,4] [%emt 0:00:01]}
62. Qh3+ {[%emt 0:00:35]} Kf4 {[%eval -413,8] [%emt 0:00:01]} 63. Qg3+ {
[%emt 0:00:29]} Kf5 {[%eval -425,10] [%emt 0:00:01]} 64. Qh3+ {[%emt 0:01:18]}
Ke5 {[%eval -443,19] [%emt 0:00:01]} 65. Qc3+ {[%emt 0:03:41] (De3+)} Ke4 {
[%eval -460,19] [%emt 0:01:01]} 66. Kg2 {[%emt 0:00:10]} Qd4 {[%eval -461,18]
[%emt 0:00:56]} 67. Qf3+ {[%emt 0:00:52]} Ke5 {[%eval -461,12] [%emt 0:00:01]}
68. Qa3 {[%emt 0:00:32] (Dg3+)} a4 {[%eval -490,19] [%emt 0:01:17]} 0-1

Posts : 233 Join date : 2021-10-08

I have something that does not really belong in the Rodent thread, but I don't want to go back to Rebel 14.1 for this: I see Rebel sometimes going for opposite-coloured bishop endgames with very high scores that the opponent recognizes as draws. There are probably more cases where some endgame knowledge is missing.

For instance in the Stockfish - Dragon Superfinal game position that Ernest posted
38...Re4 is probably only a draw but 38...Qf6 is, we think, possibly winning. But if I analyze a bit with Rebel 14.1 against Stockfish, it ends up in these opposite-coloured bishop endgames, which unfortunately are totally drawn.

[pgn][Event "?"]
[Site "?"]
[Date "2022.04.09"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[SetUp "1"]
[FEN "7k/4q1pp/3bP3/3p3P/Bp1P1r2/p2R4/2P3P1/K3Q3 b - -"]

1... Qf6 (1... Re4 2. Qf1 Qxe6 3. Ka2 h6 4. Qf3 Re2 5. Qh3
Qxh3 6. gxh3 g5 7. hxg6 Kg7 8. Bb3 Kxg6 9. Kb1 Kg5 10. Bxd5
Be7 11. Rg3+ Kh4 12. Rg6 Bg5 13. Rb6 Re1+ 14. Ka2 Bd2
15. Ra6 Kxh3 16. Bc4 Re4 17. Be6+ Kg3 18. d5 Bg5 19. Bf7
Rf4 20. Be8 Rf8 21. Re6 Rf6 22. Rxf6 Bxf6 23. d6 Kf4
24. Kb3 Ke5 25. Bf7 Kxd6 26. Kxb4 Bb2 27. Kb3 Ke7
28. Bg6 $11) 2. Ka2 Re4 3. Qd1 Qxe6 4. Bb3 Re1 5. Qf3 Qh6
6. Rd1 Rxd1 7. Qxd1 Qe3 8. c3 g6 9. cxb4 Bxb4 10. hxg6 hxg6
11. g4 Be7 12. Qh1+ Kg7 13. Qg2 Kf6 14. Bxd5 Qxd4 15. Bb3
Kg5 $11 *[/pgn]

8/4b3/5kp1/8/3q2P1/pB6/K5Q1/8 b - -

Engine: Rebel 14.1 MOD (512 MB)
by Fabien Letouzey, Pawel Koziol, Chris Whittington and Ed Schroeder

29/159 21:48 -3.46 15...Kg5 16.Qe2 Bb4 17.Qc2 Qxg4
18.Qc6 Qe2+ 19.Bc2 Bd6 20.Qd5+ Kh6
21.Qxd6 Qxc2+ 22.Kxa3 Qc3+ 23.Ka2 Qc4+
24.Kb2 Kg5 25.Kb1 Qf1+ 26.Kc2 Qf5+
27.Kc3 Kh4 28.Qc7 (1.682.327.183) 1285

best move: Kf6-g5 time: 35:31.406 min n/s: 1.304.773 CPU 100.0% n/s(1CPU): 1.304.773 nodes: 2.781.000.000

8/4b3/5kp1/8/3q2P1/pB6/K5Q1/8 b - -

Engine: Stockfish 20220313 (512 MB)
by the Stockfish developers (see AUTHORS f

50/39 0:00 0.00 15...Kg5 16.Qe2 Bb4 17.Bf7 Qf4
18.Be6 Bc5 19.Qg2 Bd6 20.Qc2 Bf8
21.Qe2 (1.108.529) 1153

51/49 0:05 0.00 15...Kg5 16.Qe2 Bb4 17.Bf7 Qf4
18.Be6 Bc5 19.Qg2 Bd6 20.Qc2 Bf8
21.Qe2 Qd4 22.Qc2 Kf6 23.Qe2 (6.304.203) 1125
.
.
.
73/25 35:31 0.00 15...Kg5 16.Qe2 Bb4 17.Bf7 Qf4
18.Be6 Bc5 19.Qg2 Bd6 20.Qc2 Kf6
21.Qe2 Qd4 22.Bc4 Kg7 23.Bd3 Qf4
24.Bc4 (2.339.331.482) 1097

best move: Kf6-g5 time: 35:31.406 min n/s: 1.097.555 nodes: 2.339.331.482

There must be many games like this, for instance in the gauntlet from Graham Banks. How do we teach Rebel the endgame knowledge? You could replace Rebel's score in high-quality games with something that gradually moves towards the game result, especially for draws where the opponent knows it is a draw much earlier than Rebel does. But that is a gargantuan task, and not doable at all if you have billions of separate positions rather than games. With the complete games it is a bit easier. Well, I don't really know if it can be done that way. The alternative, I think, is to start from total scratch like Alpha Zero and let Rebel learn without Benjamin scores at all. But that needs so much computing power, and the result is not Rebel anymore but Alpha Zero, or Rebel Zero, or whatever name you want to give it.

It is not something done on a rainy afternoon...

Subject: Re: Rodent NNUE development Sat Apr 09, 2022 8:10 am

The answer is that the learner has too few such positions to learn from; the cure is to add 100+ million (or so) opposite-coloured bishop endgames to the data. The most extreme example of too little data is the KBNK ending: try it, it doesn't know how to mate. Here the solution is to copy the ProDeo HCE code, which handles it well, and switch to HCE eval, also for the other special endings. A kind of finishing touch.
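
The idea of switching to HCE eval for special endings can be pictured as a dispatch on material. The following is only a sketch under stated assumptions: `material_key`, `HCE_ENDINGS` and `evaluate` are hypothetical names, the two evaluation callbacks are placeholders, and none of this is the actual Rebel or ProDeo code.

```python
from collections import Counter
from typing import Callable, Tuple

PIECE_ORDER = "KQRBNP"

# Endings that should bypass the net (hypothetical list; KBNK is the
# example named in the post, others could be added the same way).
HCE_ENDINGS = {("KBN", "K")}


def material_key(fen: str) -> Tuple[str, str]:
    """Return (white, black) material strings such as ('KBN', 'K')."""
    placement = fen.split()[0]
    white = Counter(c for c in placement if c.isupper())
    black = Counter(c.upper() for c in placement if c.islower())

    def side(counter: Counter) -> str:
        return "".join(p * counter[p] for p in PIECE_ORDER)

    return side(white), side(black)


def evaluate(fen: str,
             nn_eval: Callable[[str], int],
             hce_eval: Callable[[str], int]) -> int:
    """Use the handcrafted eval for listed endings, the net otherwise."""
    white, black = material_key(fen)
    if (white, black) in HCE_ENDINGS or (black, white) in HCE_ENDINGS:
        return hce_eval(fen)
    return nn_eval(fen)
```

Checking both orders of the key means the same table entry covers the ending with either colour as the stronger side.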

Posts : 612 Join date : 2020-11-26

Ed, I noticed that although trained on the same number of games, CSTal had a better understanding of endings with bishops of different colors, and it also knows the endgame with a bishop and the a/h pawn of the wrong color, etc. I think it was probably more a problem in Benjamin's evaluation than anything else.

Posts : 233 Join date : 2021-10-08

Thanks for the explanation, Ed. It is fascinating. You are right about the KBNK endgame: it is hopeless, because the lone black king always manages to go to the corner where the bishop can't give check, White does not manage to drive him to the other corner, and then the 50 moves are up.

It would sure be nice if Rebel had more endgame knowledge. Is there any way we could help by providing those (endgame) positions? For instance, if you wanted to make something for Pawel again, to stay with this thread?

Posts : 233 Join date : 2021-10-08

matejst wrote: Ed, I noticed that although trained on the same number of games, CSTal had a better understanding of endings with bishops of different colors, and it also knows the endgame with a bishop and the a/h pawn of the wrong color, etc. I think it was probably more a problem in Benjamin's evaluation than anything else.

Programs probably differ in how much bonus they give for an extra pawn. Even though the position is technically drawn (or heuristically; I'm not sure it is always proven for eight-man positions?), you can still win against a human opponent or a weaker program, so programs can give it a small positive score. But almost 3½ pawns becomes counterproductive. I think Ed is saying it is an artifact of endgame positions being underrepresented in the learning process.

Subject: Re: Rodent NNUE development Mon Apr 11, 2022 10:42 am

Both of you are right to a certain extent. Here is something I learned through the years about HCE; it is valid for search as well.

The evaluation (of a position) is the heart of a chess program: in the end the final move is decided by the evaluation, provided the search part is reasonable. The eval consists of many ingredients. The 3 most important (and dominant) are 1) mobility, 2) passed pawns and 3) king safety, followed by 25-50-100 or more others, with pawn evaluation, PST, bishop pair, center control, outposts and endgame stuff as the next most important ones.

For all these ingredients you have to invent ideas (rules), write the code for them, and apply bonus and penalty values. First problem: no matter how hard you try, for complex eval ingredients such as king safety, mobility and passed pawns you can never write perfect code, but..... if your code is right in 90-95% (arguable!) of the cases you have accomplished something good.

Second problem: the ingredients you program interact with each other and sometimes clash. For example, you may have good mobility and good king safety, but in practice, in many positions one is more important than the other, which may lead to bad moves after all. In a typical middle game position all of the above-mentioned eval ingredients interact with each other and influence the final evaluation score (the sum of all ingredients), and as a result the reliability of the total evaluation drops considerably. To deal with that problem we invented the concept of tuning: playing thousands of games to teach all the ingredients to get along with each other, so to say, and find the right balance.
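
To make the "sum of all ingredients" and the idea of tuning concrete, here is a toy sketch. It is not how Rebel or Rodent is tuned: the feature names, the weights, the `scale` constant and the squared-error measure against game results are all assumptions picked for illustration.

```python
import math
from typing import Dict, List, Tuple

Features = Dict[str, float]   # ingredient name -> value (White minus Black)
Sample = Tuple[Features, float]  # (features, game result: 1, 0.5 or 0)


def evaluate(features: Features, weights: Dict[str, float]) -> float:
    """Evaluation as the weighted sum of all ingredients, in centipawns."""
    return sum(weights[name] * value for name, value in features.items())


def tuning_error(samples: List[Sample], weights: Dict[str, float],
                 scale: float = 400.0) -> float:
    """Mean squared gap between predicted and actual game results."""
    total = 0.0
    for features, result in samples:
        score = evaluate(features, weights)
        predicted = 1.0 / (1.0 + math.exp(-score / scale))
        total += (predicted - result) ** 2
    return total / len(samples)


# Two made-up positions: mobility and king safety pull in opposite directions.
samples = [
    ({"mobility": 4, "king_safety": -2}, 1.0),   # White won
    ({"mobility": -3, "king_safety": 1}, 0.0),   # Black won
]
weights_a = {"mobility": 5, "king_safety": 10}
weights_b = {"mobility": 20, "king_safety": 5}

print(tuning_error(samples, weights_a), tuning_error(samples, weights_b))
```

In this toy data the second weight set, which values mobility more, predicts the results better, so a tuner would move in that direction. Real tuning does the same over thousands of games and many more ingredients.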

It's what the NNUE learner is doing as well, only in a different way and a lot better, and much will depend on how good your HCE eval is. If your HCE eval has ingredients that do not function so well, which in my case is certainly true for unequal bishop endings, you may still expect a good improvement from the learner PROVIDED the learner is given enough positions to learn from.

The learner looks not only at the scores of moves but also at the game result, and if too-high scores (of say +3.xx) or too-low scores (of say -3.xx) eventually end in a draw, the learner will take notice.
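
One common way for a trainer to combine the engine score with the game result is to blend the two into a single training target. The sketch below shows that general technique only; the thread does not describe the actual formula used by this learner, and the names `lam` and `scale` and their default values are assumptions.

```python
import math


def score_to_expected(cp: float, scale: float = 400.0) -> float:
    """Map a centipawn score to an expected game result in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-cp / scale))


def training_target(cp: float, result: float,
                    lam: float = 0.7, scale: float = 400.0) -> float:
    """Blend the engine score with the game result (1, 0.5 or 0).

    lam = 1.0 trains on scores only, lam = 0.0 on game results only.
    """
    return lam * score_to_expected(cp, scale) + (1.0 - lam) * result


# A +3.00 score in a game that ended in a draw is pulled towards 0.5.
print(score_to_expected(300), training_target(300, 0.5))
```

With such a blend, a position scored at +3.xx that keeps ending in a draw produces a target below what the score alone suggests, which is one way the learner can "take notice".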

Posts : 1254 Join date : 2020-11-17 Location : France

If you train the NNUE on game results only (as many people are doing), then it will learn all these things no matter (well, almost no matter) what the original HCE says. This was the zero concept, proven by Alpha Zero: it learnt from scratch, with no evaluation, by playing random games. Presumably (I’ve forgotten) Leela started off like that.

Posts : 612 Join date : 2020-11-26

D. Kappe wrote about experimenting with evaluation vs result learning, but I have the impression that evaluation learning helps keep the characteristics, the style of play, of an engine.

Then -- I really don't see a notable qualitative difference between Rebel and, let's say, SF in the opening at the same depths, but the difference starts to be more visible in later phases of the game. The initial moves are, logically, better represented in the training data. I believe there has to be an intelligent, targeted selection of the data for NN training/learning, and a good way to proceed would be to see what the weaknesses of the engine are, then add suitable positions that will help improve that part of its game.

Sorry, I am tired and my English is awful right now.

Subject: Re: Rodent NNUE development Tue Apr 12, 2022 10:45 am

Fixing eval holes, such as unequal bishop endings, can be done in the following way: create a set of at least 5-10 million unequal bishop endings, play games from them, and you might end up with 200-300 million usable positions for the learner. Volume is THE yardstick for weeding out most of the misevaluations.

Probably (emphasis added) it is even better to play those 5-10 million games with an engine that understands unequal bishops better and feed those into the learner. I say probably because mixing engines is a dangerous concept, but the whole neural net business is a matter of trial and error anyway.

The only problem is how to create 5-10 million unequal bishop ending start positions.
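
One possible way to collect such start positions is to filter an existing pool of FEN strings. The sketch below is only an illustration of that idea: the function name and the definition used (one bishop per side on opposite square colours, otherwise only kings and pawns) are assumptions, not something proposed in the thread.

```python
def is_opposite_bishop_ending(fen: str) -> bool:
    """True if each side has exactly one bishop, on opposite square
    colours, and nothing else on the board but kings and pawns."""
    placement = fen.split()[0]
    bishops = {"B": [], "b": []}  # square colour per bishop: 0 light, 1 dark
    for rank_idx, row in enumerate(placement.split("/")):  # rank 8 first
        file_idx = 0
        for ch in row:
            if ch.isdigit():
                file_idx += int(ch)  # run of empty squares
                continue
            if ch in "QRNqrn":
                return False  # heavier material: not a pure bishop ending
            if ch in bishops:
                # a8 (rank_idx 0, file_idx 0) is a light square.
                bishops[ch].append((rank_idx + file_idx) % 2)
            file_idx += 1
    white, black = bishops["B"], bishops["b"]
    return len(white) == 1 and len(black) == 1 and white[0] != black[0]


# The position analysed earlier in the thread still has queens on the board.
print(is_opposite_bishop_ending("8/4b3/5kp1/8/3q2P1/pB6/K5Q1/8 b - -"))
# The same position with the queens removed qualifies.
print(is_opposite_bishop_ending("8/4b3/5kp1/8/6P1/pB6/K7/8 b - -"))
```

Run over the positions of a large game collection, a filter like this would yield candidate start positions; whether that reaches 5-10 million distinct ones depends on the size of the collection.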

Posts : 612 Join date : 2020-11-26

In my previous post I started to write that the filtering of data could be a mess, but I was really tired, and in such cases I simply forget English words.

Would it be a good idea to add, just like Connor McMonigle did, positions from 6 men TBH? [Connor aimed at something else though].

Jonathan Kreuzer also experimented with special endgame nets for SlowChess, and after John Stanback's remark in a CCC thread, I ran a short match between Wasp and SC where, just as John said, SC destroyed Wasp in endgames. What interested me was how Wasp fared in the openings and middlegames -- and it was OK, but the longer the game lasted, the greater SC's advantage became. Perhaps having several smaller nets could be more productive than having one huge net. Of course, I am not the one who knows how to do it.
