SPCC: Rebel 16.2 Evalcorrect-parameter test

Posts : 160 Join date : 2022-03-01 Location : Berlin

Experimental test run of Rebel 16.2 with different values of the Evalcorrect UCI parameter. This option can be used to change the playing style of the engine. The default value is 202. Increasing the value should increase the engine's aggressiveness.
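
For reference, a UCI option like this is set with the standard `setoption` command before a game starts. Below is a minimal Python sketch that builds the commands for the values tested here. The option name `Evalcorrect` is taken from this post; the exact spelling Rebel 16.2 reports in its `uci` output is an assumption (the UCI specification says option names should not be case sensitive).

```python
# Values tested in this run; 202 is the engine default according to the post.
TESTED_VALUES = [202, 256, 300, 400, 500]


def setoption_command(name: str, value: int) -> str:
    """Build a standard UCI 'setoption' line for an integer option."""
    return f"setoption name {name} value {value}"


# One command per tested configuration, to be sent to the engine after 'uci'.
commands = [setoption_command("Evalcorrect", v) for v in TESTED_VALUES]
for line in commands:
    print(line)
```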

A 10000-game round-robin tournament was played: 60 sec + 600 ms thinking time, single thread, no ponder, no bases, using my UHO_2022_8mvs_+120_+129 openings.

https://www.sp-cc.de/experiments.htm

Code::
     Program               Elo    +    -  Games  Score  Av.Op.  Draws

   1 Rebel 16.2 default : 3617    6    6   4000  52.4%    3600  56.1%
   2 Rebel 16.2 Ec=256  : 3601    6    6   4000  49.5%    3604  56.2%
   3 Rebel 16.2 Ec=300  : 3600    6    6   4000  49.4%    3604  57.2%
   4 Rebel 16.2 Ec=500  : 3600    6    6   4000  49.3%    3604  55.9%
   5 Rebel 16.2 Ec=400  : 3599    6    6   4000  49.3%    3604  57.0%

Games : 10000 (finished)
White Wins : 4190 (41.9 %)
Black Wins : 162 (1.6 %)
Draws : 5648 (56.5 %)
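
As a rough consistency check on the rating list above, the score percentage and the average opponent rating can be converted into a performance rating with the standard logistic Elo formula. This is only a sketch: the rating tool used for the list is not named in this post and may fit the ratings differently, so small deviations are to be expected.

```python
import math


def performance_rating(score: float, avg_opponent: float) -> float:
    """Logistic Elo model: rating implied by a score fraction against an average opponent."""
    return avg_opponent + 400.0 * math.log10(score / (1.0 - score))


# (name, listed Elo, score fraction, average opponent), copied from the table.
rows = [
    ("default", 3617, 0.524, 3600),
    ("Ec=256", 3601, 0.495, 3604),
    ("Ec=300", 3600, 0.494, 3604),
    ("Ec=500", 3600, 0.493, 3604),
    ("Ec=400", 3599, 0.493, 3604),
]

estimates = {name: performance_rating(score, opp) for name, _, score, opp in rows}
for name, listed, _, _ in rows:
    print(f"{name:8s} listed {listed}  estimated {estimates[name]:.1f}")
```

With the figures above, the estimates land within about one Elo point of the listed ratings, so the Elo and Score columns tell the same story.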

Below is the Engines Aggressiveness Scoring (EAS), calculated with my EAS-Tool (V5.21):

Code::
                                    bad  avg.win
Rank  EAS-Score    sacs  shorts   draws    moves  Engine/player
-------------------------------------------------------------------
   1      75453  10.06%  20.43%  20.36%       79  Rebel 16.2 default
   2      72790  09.25%  18.51%  20.45%       81  Rebel 16.2 Ec=400
   3      71327  09.36%  16.14%  19.71%       81  Rebel 16.2 Ec=500
   4      70839  10.50%  18.20%  20.60%       80  Rebel 16.2 Ec=256
   5      61260  09.23%  16.55%  21.16%       82  Rebel 16.2 Ec=300
-------------------------------------------------------------------
*** Average length of all won games: 80 moves

Conclusions: The Evalcorrect parameter seems quite meaningless. As you can see, the strength and the aggressiveness of Rebel are nearly identical across all Evalcorrect values, and a higher Evalcorrect value seems to lower the aggressiveness instead of increasing it...

Posts : 1254 Join date : 2020-11-17 Location : France

pohl4711 wrote:

Conclusions: The Evalcorrect parameter seems quite meaningless. As you can see, the strength and the aggressiveness of Rebel are nearly identical across all Evalcorrect values, and a higher Evalcorrect value seems to lower the aggressiveness instead of increasing it...

The purpose of Evalcorrect was to try to bring the NNUE eval into accord with the hand-coded search/pruning/extension parameters. The hand-coded search (HCS) was originally “tuned” to the CSTal HCE evaluation, whereas the range and scale of the NNUE eval are a little indeterminate, especially when the net is partly tuned on game results. Whenever I do a new net, Evalcorrect gets retuned. The current value is a multiplier of 202, I think; whatever delivered the most Elo.
Increasing or decreasing it basically shifts the eval scale relative to what the HC-Search thinks a pawn is worth.
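
To make the "what a pawn is worth" point concrete, here is a minimal sketch of such a multiplier. The divisor of 256, the raw pawn value of 100 and the function name are assumptions for illustration only; the post does not say how Rebel normalises the value.

```python
SCALE_DIVISOR = 256  # hypothetical normalisation, not taken from Rebel's source


def corrected_eval(nnue_eval: int, evalcorrect: int = 202) -> int:
    """Scale a raw NNUE score by the Evalcorrect multiplier, truncating toward zero."""
    return int(nnue_eval * evalcorrect / SCALE_DIVISOR)


raw_pawn = 100  # hypothetical raw NNUE output for a one-pawn advantage

# The same raw one-pawn edge as the search would see it under each tested setting.
pawn_in_search_units = {
    ec: corrected_eval(raw_pawn, ec) for ec in (202, 256, 300, 400, 500)
}
print(pawn_in_search_units)
```

Under these assumptions the same raw one-pawn edge reaches the search as a smaller or larger number depending on the multiplier, while the hand-tuned search margins stay fixed.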

Curiously, I found the overall engine reaction to Evalcorrect to be quite “dull”; big changes didn’t seem to make a lot of difference.
I guess if you increase it a lot, the evals will begin to swamp the search parameter values (with unknown behaviour), and if you decrease it a lot, the evals will themselves disappear in the search prune/extension effects (more unknown behaviour).
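
The swamping effect can be illustrated with a generic fixed-margin pruning test. This is a textbook-style futility check, not Rebel's actual search code; the margin, the divisor of 256 and the sample scores are all made up for the illustration.

```python
FUTILITY_MARGIN = 150  # hypothetical fixed, hand-tuned search margin
SCALE_DIVISOR = 256    # hypothetical normalisation of the multiplier


def scaled(raw_eval: int, evalcorrect: int) -> int:
    """Apply the multiplier to a raw eval, truncating toward zero."""
    return int(raw_eval * evalcorrect / SCALE_DIVISOR)


def is_pruned(raw_eval: int, alpha_raw: int, evalcorrect: int) -> bool:
    """Prune when the static eval plus the fixed margin still cannot reach alpha."""
    return scaled(raw_eval, evalcorrect) + FUTILITY_MARGIN <= scaled(alpha_raw, evalcorrect)


# Raw static evals of sample positions, all tested against the same raw alpha.
sample_evals = [-300, -200, -120, -60, -20, 0]
alpha_raw = 50


def pruned_count(evalcorrect: int) -> int:
    """Number of sample positions the check would prune at this multiplier."""
    return sum(is_pruned(e, alpha_raw, evalcorrect) for e in sample_evals)


for ec in (50, 202, 500, 2000):
    print(f"multiplier {ec:4d}: {pruned_count(ec)} of {len(sample_evals)} pruned")
```

With a very small multiplier the margin dominates and nothing in the sample is pruned; with a very large one the margin becomes negligible and the check fires on every sample. A real engine has many such parameters, so the combined effect is much harder to predict.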

Your finding of a little less aggression with increasing EC is useful and also intuitively in accord.

A more useful adjustment would be the ability to change the relationship between the base material of a position and the NNUE’s idea of the positional component. CSTal dev already has this feature, although my (limited) testing didn’t find anything better than the vanilla setting. Obviously it’s neither simple nor “accurate” to extract a realistic positional (psnl) and a material (mtrl) component out of an NNUE eval.
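
One naive way to sketch such an adjustment is to subtract a classical material count from the NNUE score and treat the remainder as the positional part, which can then be weighted separately. This is an assumption-laden illustration, not the CSTal implementation; the piece values, the function names and the weight parameter are all hypothetical.

```python
# Conventional centipawn piece values, used here only for illustration.
PIECE_VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900}


def material_balance(white: dict, black: dict) -> int:
    """Material difference from White's point of view, in centipawns."""
    return sum(PIECE_VALUES[p] * (white.get(p, 0) - black.get(p, 0)) for p in PIECE_VALUES)


def reweighted_eval(nnue_eval: int, material: int, positional_weight: float = 1.0) -> int:
    """Split the NNUE score into material + remainder and rescale only the remainder."""
    positional = nnue_eval - material
    return round(material + positional_weight * positional)


white = {"P": 7, "N": 2, "B": 2, "R": 2, "Q": 1}
black = {"P": 8, "N": 2, "B": 2, "R": 2, "Q": 1}
material = material_balance(white, black)  # White is a pawn down
nnue_eval = 60  # hypothetical NNUE score: the net still prefers White

vanilla = reweighted_eval(nnue_eval, material, 1.0)  # plain NNUE score
boosted = reweighted_eval(nnue_eval, material, 1.5)  # positional part weighted up
print(material, vanilla, boosted)
```

A weight of 1.0 reproduces the plain NNUE score (the "vanilla" setting). The caveat from the post still applies: an NNUE score is not really a sum of a material term and a positional term, so the split is only an approximation.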
