ProDeo

Computer Chess

 Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS

Chris Whittington
Admin



Posts : 839
Join date : 2020-11-17
Location : France

PostSubject: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Tue Mar 07, 2023 11:50 pm

The EAS test is really very (very) useful to engine developers. I'm using it to keep quantitative track of the dev versions of Chess System Tal, and also because I like (a lot) playing through the sacrifice games that it finds in PGN lists. I have one small request: a reversion to EAS saving PGNs with full comments, timings and evals. These are lost in the recent versions. Like so:

[Date "2023.03.07"]
[Round "105"]
[White "Chess-System-Tal-1.59"]
[Black "Seer-2.6"]
[Result "1-0"]
[ECO "C18"]
[Opening "French"]
[Variation "Winawer, Advance Variation"]
[TimeControl "20+0.05"]
[PlyCount "75"]
[GameDuration "00:00:30"]
[GameEndTime "2023-03-07T20:46:20.470 W. Europe Standard Time"]
[GameStartTime "2023-03-07T20:45:49.819 W. Europe Standard Time"]

1. e4 e6 2. d4 d5 3. Nc3 Bb4 4. e5 c5 5. a3 Bxc3+ 6. bxc3 Nc6 7. a4 Qc7 8.
Nf3 cxd4 9. cxd4 Nge7 10. Bd2 Na5 11. Bd3 Nc4 12. O-O h6 13. Qe2 Nxd2 14.
Qxd2 O-O 15. c3 Bd7 16. h4 Rab8 17. Rfc1 b5 18. axb5 Bxb5 19. Bb1 a5 20. h5
a4 21. Qc2 g6 22. Nh2 Kg7 23. hxg6 fxg6 24. Ng4 Qa5 25. Ra3 Rb7 26. Nf6
Rfb8 27. g4 Bc6 28. Kg2 Rb2 29. Qd1 Qc7 30. Qd3 Bb5 31. Qe3 Rf8 32. c4 dxc4
33. Qxh6+ Kxh6 34. Rh1+ Kg5 35. Nh7+ Kf4 36. Rf3+ Kxg4 37. Rg3+ Kf4 38.
Rh4# 1-0

[Event "?"]
[Site "?"]
[Date "2023.03.07"]
[Round "82"]
[White "Koivisto_9.0"]
[Black "Chess-System-Tal-1.59"]
[Result "0-1"]
[ECO "B60"]
[Opening "Sicilian"]
[Variation "Richter-Rauzer"]
[TimeControl "20+0.05"]
[PlyCount "68"]
[GameDuration "00:00:27"]
[GameEndTime "2023-03-07T20:43:10.371 W. Europe Standard Time"]
[GameStartTime "2023-03-07T20:42:42.914 W. Europe Standard Time"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 Nc6 6. Bg5 Qa5 7. Bxf6
gxf6 8. Qd2 h5 9. Nb3 Qd8 10. Bd3 a6 11. f4 e6 12. Qf2 Be7 13. Na4 b5 14.
Nb6 Rb8 15. Nxc8 Rxc8 16. O-O-O Rb8 17. Qe2 Qb6 18. a3 a5 19. c3 b4 20.
cxb4 a4 21. Na1 Nxb4 22. axb4 Qxb4 23. f5 d5 24. Kb1 Bd6 25. Ka2 Be5 26.
Rb1 Ke7 27. fxe6 Rb6 28. exd5 Rhb8 29. Rhd1 Qb3+ 30. Nxb3 axb3+ 31. Ka3
Bd6+ 32. Ka4 Rb4+ 33. Ka5 Bc7+ 34. Ka6 Ra8# 0-1

It was very useful to see the evals at a glance when reviewing the games, but now they are gone :(


I now have 200,000 PGN test games, organised by maturing epochs (about 15 epochs apart) and several nets (three nets in the current series), so I can probably make some comments about the scoring system. Aggressiveness scoring is obviously made very difficult by the problem of defining what "aggressive" means, by the question of how the various metrics used correlate with it, and not least by the extreme sparsity of high-material sacrifice games.

pohl4711


Posts : 68
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Wed Mar 08, 2023 3:27 pm

Admin wrote:


BTW, speaking of Rebel-16.2 I noticed you did not test it, any particular reason?

Yes. You said there is no change in engine or net compared to Rebel 16.1. And I ran a small head-to-head of Rebel 16.1 vs 16.2 with some thousand games overnight, and the difference was below 3 Elo (that was with UHO openings, so with balanced openings the Elo difference should be below 2 Elo...)


https://www.sp-cc.de
pohl4711

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Wed Mar 08, 2023 3:38 pm

Chris Whittington wrote:
The EAS test is really very (very) useful to engine developers. I'm using it to keep quantitative track of the dev versions of Chess System Tal, and also because I like (a lot) playing through the sacrifice games that it finds in PGN lists. I have one small request: a reversion to EAS saving PGNs with full comments, timings and evals. These are lost in the recent versions. Like so:

[...]

It was very useful to see the evals at a glance when reviewing the games, but now they are gone :(


I now have 200,000 PGN test games, organised by maturing epochs (about 15 epochs apart) and several nets (three nets in the current series), so I can probably make some comments about the scoring system. Aggressiveness scoring is obviously made very difficult by the problem of defining what "aggressive" means, by the question of how the various metrics used correlate with it, and not least by the extreme sparsity of high-material sacrifice games.

The scoring system of the EAS-Tool has become really complex in the meantime. It is fully explained in the ReadMe file in the EAS download. For engine developers, it is perhaps more helpful to look at the full stats in the 2nd EAS-list rather than only at the EAS-score: the EAS-tool makes 2 lists, the second with more percentage stats. Both are in the file statistics_EAS_ratinglist.txt.

The EAS-Tool deletes all comments in the games, because this makes the files much smaller and the computing faster (around +50% faster!). But if you want to keep the comments, just open the .bat file (EAS_Tool_V5.21.bat and/or Gauntlet_EAS_Tool_V5.21.bat) with an editor and search for the string:
"-C -N -V"
These are the pgn-extract options that strip comments (-C), NAGs (-N) and variations (-V).
You will find this string only once in each of the two tools. The line looks like this:
pgn-extract --quiet --fixresulttags -C -N -V --plycount ../%gamebase% --output newsource.pgn > NUL

Just delete this sequence and the games should keep all comments (time, eval, depth, when played with cutechess). The line should then look like this:
pgn-extract --quiet --fixresulttags --plycount ../%gamebase% --output newsource.pgn > NUL

Save the .bat file and use it instead of the original .bat file...
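
For anyone who prefers not to edit the batch files by hand, the same change can be scripted. This is only a sketch: the helper name keep_comments is made up for illustration, and it assumes the flags appear exactly as the sequence "-C -N -V" followed by a space, as in the line quoted above.

```python
FLAGS = "-C -N -V "  # the exact flag sequence quoted in the post above


def keep_comments(bat_text: str) -> str:
    """Return the batch-file text with the comment-stripping flags removed.

    Only the first occurrence is touched, since the string is said to
    appear once per tool. A ValueError is raised if the flags are missing,
    so a changed tool version is noticed instead of silently ignored.
    """
    if FLAGS not in bat_text:
        raise ValueError("flag sequence '-C -N -V' not found")
    return bat_text.replace(FLAGS, "", 1)


original = ("pgn-extract --quiet --fixresulttags -C -N -V --plycount "
            "../%gamebase% --output newsource.pgn > NUL")
print(keep_comments(original))
```

To use it on a real file, read the text of a copy of the .bat file, pass it through keep_comments, and write the result to a new file so the original tool stays untouched.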


pohl4711

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Wed Mar 08, 2023 3:48 pm

Here is the EAS-Score calculation (taken from the Readme file):

EAS-Score is calculated with these rules:
1) Sacrifices: (percent*100) of the percent-values of the sacrifices (1-5+ pawnunits) calculated out
of the won games by the engine, only. So, a weak engine (with a small number of won games) can get
a high EAS-scoring, too, when the percent of sac-games in the won games is high (and the number of
short wins). Higher pawnunits-sacs give bonus-points:
1 pawnsac = 5x points *** 2 pawnsac = 15x points *** 3 pawnsac = 45x points
4 pawnsac = 90x points *** 5+ pawnsac = 180x points *** 5+ Queensac = 350x points

2) Very short won games (percent*100) of won games by the engine give these EAS-points:
60 moves= 8x points *** 55 moves= 12x points *** 50 moves= 18x points
45 moves= 27x points *** 40 moves= 45x points.
Since V5.2, the move-limit is no longer fixed to 40-60 moves. Instead, the average length of all won
games in the source.pgn is calculated, rounded down to a multiple of 5, and reduced by 15. The reason is
that human games or adjudicated engine games, for example, are much shorter than non-adjudicated engine
games, and the EAS-tools now adjust the move-limits for short-win EAS-points to this "reality":
Example 1: Average won game length in the source.pgn is 78 moves: Rounded to 75 and -15 = 60 is the
upper limit, followed by 55, 50, 45, 40
Example 2: Average won game length is 58 moves: Rounded to 55 and -15 = 40 is the upper limit,
followed by 35, 30, 25, 20
Additionally, if the average win game length of the engine is shorter than the average win game
length of all games in the source.pgn, the engine gets 3000 EAS-points for each move by which its won
games are shorter on average. If the average win game length of the engine is higher than the average
win game length of all games in the source.pgn, 1000 EAS-points are subtracted for each move by which its
won games are longer on average. But this subtraction of points is applied only to the EAS-points
the engine has received for its short wins (see above). The other EAS-points (for sacrifices and
bad draws, see 1) and 3)) always stay in the calculation!

3) Bad draws: Bad draws are games, which were drawn before endgame (material check is done, the
number of played moves does not matter) and draws after the engine had a material advantage of
at least 1 pawn during a game, because the engine should win a game, if material was won. All
these bad draws are finally checked for a material disadvantage of at least 1 pawn: Because draws
with material disadvantage prevented a possible loss and so, these games are no bad draws and are
not counted.
The formula for calculating the bad-draw EAS-points is a bit tricky:
a) The percent-value of all good draws (out of all draws, the engine played) by the engine
is calculated (all draws - bad draws) and rounded: Example: Engine had 23.7% bad draws, then
the value here is 76 (100% - 23.7% = 76.3% (good draws), then rounded).
b) This value is cubed: 76*76*76 = 438976, then divided by 2500 = 175 (the fractional part is dropped)
c) This value is squared: 175*175 = 30625
So, the engine gets 30625 EAS-points
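
The two worked examples in rules 2) and 3) can be reproduced with a few lines of Python. This is a sketch of my reading of the Readme text, not the tool's actual code: the function names are invented, "rounded to 5 or 10" is taken to mean rounding down to a multiple of 5 (which is what both examples show), and the division by 2500 is taken to be integer division (438976 / 2500 is 175.59, and the Readme gives 175).

```python
def short_win_limits(avg_won_length: int) -> list[int]:
    """Five move-limits for short-win points, highest first."""
    # Round down to a multiple of 5 (78 -> 75, 58 -> 55), then subtract 15.
    upper = (avg_won_length // 5) * 5 - 15
    return [upper - 5 * step for step in range(5)]


def bad_draw_points(bad_draw_percent: float) -> int:
    """EAS-points for draws, following steps a) to c) of rule 3)."""
    good = round(100 - bad_draw_percent)  # a) 23.7% bad draws -> 76
    cubed = good ** 3 // 2500             # b) 438976 // 2500 = 175
    return cubed ** 2                     # c) 175 * 175 = 30625


print(short_win_limits(78))   # Example 1 from the Readme
print(short_win_limits(58))   # Example 2 from the Readme
print(bad_draw_points(23.7))  # the bad-draw example
```

Because the good-draw value is cubed and then squared, small changes in the bad-draw percentage move this part of the score a lot, which is probably the intention.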

Chris Whittington

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Wed Mar 08, 2023 4:18 pm

pohl4711 wrote:
The scoring system of the EAS-Tool has become really complex in the meantime. It is fully explained in the ReadMe file in the EAS download. For engine developers, it is perhaps more helpful to look at the full stats in the 2nd EAS-list rather than only at the EAS-score: the EAS-tool makes 2 lists, the second with more percentage stats. Both are in the file statistics_EAS_ratinglist.txt.

The EAS-Tool deletes all comments in the games, because this makes the files much smaller and the computing faster (around +50% faster!). But if you want to keep the comments, just open the .bat file (EAS_Tool_V5.21.bat and/or Gauntlet_EAS_Tool_V5.21.bat) with an editor and search for the string:
"-C -N -V"
These are the pgn-extract options that strip comments (-C), NAGs (-N) and variations (-V).
You will find this string only once in each of the two tools. The line looks like this:
pgn-extract --quiet --fixresulttags -C -N -V --plycount ../%gamebase% --output newsource.pgn > NUL

Just delete this sequence and the games should keep all comments (time, eval, depth, when played with cutechess). The line should then look like this:
pgn-extract --quiet --fixresulttags --plycount ../%gamebase% --output newsource.pgn > NUL

Save the .bat file and use it instead of the original .bat file...

OMG, just looked at the batch files. It's all done in batch file language. Help!! I can see how it gets complicated, fast. Convert to Python?

There are two things I would change. Please treat this as being helpful, rather than critical.
First, I'd weight the sacrifice results quite a bit more. Sacrifices are very probably directly correlated with "chess aggressiveness". And I'd weight the game length results less: possibly game length and aggressiveness have some kind of correlation, but so do game length and Elo difference. Some sac games can be "long", for example when instead of liquidating into a short mate, they liquidate into a longer but winning endgame.
Second, I'd introduce the concept of length of sacrifice (how long during the game does the engine "hold" the sacrificial situation before resolving it?). The longer the "hold", the more likely the sac is positional (eval) rather than short-term tactical (search), and I think positional eval sacs are more linked to aggression style than tactical search ones.

I can see that, because sacs are sparse, increasing sac weighting is quite possibly going to result in larger final score ranges, which could also result in an engine getting "lucky" results, but this is always the danger of not enough data. However, introducing a "length of sac" factor will probably ameliorate this. Or better, log(length of sac) or sqrt(length of sac).
I guess sac length is already computed? Because it's the trigger for counting as a sac?

It could be that just multiplying the sac factor by sqrt(sac length) would be enough. It will increase the sac weight at the same time as adjusting for sac length. Or maybe ln(sac length)?
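
To make the suggestion concrete, here is a rough sketch of what such a weighting could look like. Everything in it is hypothetical: the function name, the idea of measuring the "hold" in moves, and the way the factor is applied are assumptions for illustration only, and nothing like this exists in the EAS tool.

```python
import math


def weighted_sac_points(base_multiplier: float, hold_moves: int,
                        mode: str = "sqrt") -> float:
    """Scale a sac multiplier by how long the sacrifice was held.

    base_multiplier: the existing bonus (e.g. 180 for a 5+ pawn sac).
    hold_moves: hypothetical count of moves the material deficit was held.
    """
    if hold_moves < 1:
        raise ValueError("hold_moves must be at least 1")
    if mode == "sqrt":
        factor = math.sqrt(hold_moves)
    elif mode == "ln":
        factor = math.log(hold_moves)  # note: ln(1) = 0 wipes out 1-move sacs
    else:
        raise ValueError("mode must be 'sqrt' or 'ln'")
    return base_multiplier * factor


for hold in (1, 4, 9, 16):
    print(hold,
          weighted_sac_points(180, hold, "sqrt"),
          round(weighted_sac_points(180, hold, "ln"), 1))
```

Both variants grow slowly with hold length, so one very long hold cannot dominate the score. The ln version gives zero weight to a sac resolved after a single move, which may or may not be wanted.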


Chris Whittington

PostSubject: EAS - an example   Wed Mar 08, 2023 4:33 pm

The below results are from a mass of games of the development versions of CSTal. Weak versions, strong versions, trained after a few epochs, and then every 15 epochs thereafter - against release versions of the other engines.

You can see that CSTal has way more %Q sacs than anything else. About 3x as many 5+ sacs as the next engine. 2x as many 4+ sacs. More 3+ sacs than anything else, more 2+ sacs than anything else, and in top 50% on the 1+ sacs.
On sacs alone, I'd place it at "most aggressive", except that SF13 and SF14 come higher in the list (I assume from short game weightings). That is the reason I think sacs are under-weighted.
Also, looking at some of the sac games of the other engines, I found 'quick' sacs, i.e. tactical ones that win back the material fast or lead to a quick mate. These tend to be search/Elo related, so arguably not "aggressive" style.


Code:
                       avg.win                                                                                                                          bad  
Rank  EAS-Score   wins  moves   sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1    all shorts short40  short45  short50  short55  short60   draws    Engine/player
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1    126207     8761   74   15.08% =[00.01% + 00.10% + 00.24% + 01.12% + 03.84% + 09.77%]    26.77% = [01.71% + 03.15% + 06.02% + 07.66% + 08.23%]  17.25%   Stockfish_13  
   2    117783    10626   73   12.58% =[00.00% + 00.08% + 00.18% + 00.91% + 03.29% + 08.12%]    22.51% = [00.91% + 01.96% + 04.01% + 06.35% + 09.28%]  16.52%   Stockfish_14  
   3    109037    45865   78   15.77% =[00.07% + 00.37% + 00.56% + 01.38% + 03.92% + 09.47%]    16.92% = [01.66% + 01.42% + 02.57% + 04.29% + 06.98%]  17.49%   Chess-System-Tal-1.59  
   4     89640     4217   75   12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]    16.62% = [00.50% + 01.38% + 02.68% + 04.65% + 07.42%]  20.12%   CST-1.35-V20-E520  
   5     78105     3025   79   12.76% =[00.00% + 00.10% + 00.26% + 01.22% + 02.84% + 08.33%]    15.27% = [01.55% + 01.32% + 02.41% + 03.67% + 06.31%]  21.52%   SlowChess_2.9  
   6     72370     6595   84   11.68% =[00.02% + 00.09% + 00.21% + 00.91% + 02.88% + 07.57%]    18.15% = [00.99% + 02.05% + 03.55% + 05.50% + 06.07%]  22.40%   Stockfish_12  
   7     60005     4719   76   06.29% =[00.00% + 00.04% + 00.11% + 00.38% + 01.40% + 04.37%]    16.21% = [00.55% + 01.06% + 02.37% + 04.28% + 07.95%]  27.67%   Seer-2.6  
   8     39704     7004   85   06.18% =[00.01% + 00.10% + 00.10% + 00.31% + 01.51% + 04.14%]    08.47% = [00.29% + 00.47% + 00.99% + 02.37% + 04.35%]  25.24%   Koivisto_9.0  
   9     34208     6153   86   07.10% =[00.00% + 00.10% + 00.11% + 00.31% + 01.37% + 05.22%]    09.48% = [00.21% + 00.62% + 01.25% + 02.80% + 04.60%]  28.42%   Berserk_9  
  10     32619     7552   86   08.53% =[00.04% + 00.03% + 00.07% + 00.41% + 01.97% + 06.01%]    08.32% = [00.20% + 00.41% + 01.28% + 02.57% + 03.85%]  29.48%   Berserk_10  
  11     28494     4779   87   07.09% =[00.00% + 00.00% + 00.08% + 00.46% + 01.30% + 05.25%]    09.25% = [00.33% + 00.63% + 01.23% + 02.32% + 04.73%]  30.11%   Koivisto_8.0  
****************************************************
*** EAS-tool (C) 2022 Stefan Pohl (www.sp-cc.de) ***
****************************************************
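
For tracking these numbers across many dev versions, the sac breakdown of a row can be pulled out of the list with a small parser. This is a sketch that assumes the row layout shown above stays the same; the function name and the label list are my own choices, not part of the EAS tool.

```python
import re

SAC_LABELS = ["sacsQ", "sacs5+", "sacs4", "sacs3", "sacs2", "sacs1"]

# One row copied from the list above (Chess-System-Tal-1.59).
ROW = ("   3    109037    45865   78   15.77% =[00.07% + 00.37% + 00.56% + "
       "01.38% + 03.92% + 09.47%]    16.92% = [01.66% + 01.42% + 02.57% + "
       "04.29% + 06.98%]  17.49%   Chess-System-Tal-1.59  ")


def parse_sac_breakdown(row: str) -> tuple[float, dict[str, float]]:
    """Return (total sac percent, per-category percents) for one list row."""
    # The sac block is written "NN.NN% =[...]" (no space before the bracket),
    # the short-win block "NN.NN% = [...]", so this pattern only hits the sacs.
    match = re.search(r"(\d+\.\d+)% =\[([^\]]*)\]", row)
    if match is None:
        raise ValueError("no sac breakdown found in row")
    total = float(match.group(1))
    parts = [float(p) for p in re.findall(r"\d+\.\d+", match.group(2))]
    if len(parts) != len(SAC_LABELS):
        raise ValueError("unexpected number of sac categories")
    return total, dict(zip(SAC_LABELS, parts))


total, parts = parse_sac_breakdown(ROW)
print(total, parts)
# The published components are rounded, so allow a small tolerance.
print(abs(sum(parts.values()) - total) < 0.02)
```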
pohl4711

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Wed Mar 08, 2023 4:42 pm

Chris Whittington wrote:


OMG, just looked at the batch files. It's all done in batch file language. Help!! I can see how it gets complicated, fast. Convert to Python?

There are two things I would change. Please treat this as being helpful, rather than critical.
First, I'd weight the sacrifice results quite a bit more. Sacrifices are very probably directly correlated with "chess aggressiveness". And I'd weight the game length results less: possibly game length and aggressiveness have some kind of correlation, but so do game length and Elo difference. Some sac games can be "long", for example when instead of liquidating into a short mate, they liquidate into a longer but winning endgame.
Second, I'd introduce the concept of length of sacrifice (how long during the game does the engine "hold" the sacrificial situation before resolving it?). The longer the "hold", the more likely the sac is positional (eval) rather than short-term tactical (search), and I think positional eval sacs are more linked to aggression style than tactical search ones.

I can see that, because sacs are sparse, increasing sac weighting is quite possibly going to result in larger final score ranges, which could also result in an engine getting "lucky" results, but this is always the danger of not enough data. However, introducing a "length of sac" factor will probably ameliorate this. Or better, log(length of sac) or sqrt(length of sac).
I guess sac length is already computed? Because it's the trigger for counting as a sac?

It could be that just multiplying the sac factor by sqrt(sac length) would be enough. It will increase the sac weight at the same time as adjusting for sac length. Or maybe ln(sac length)?



1) It must be written in batch language, because I use pgn-extract and some other external tools and will definitely not write my own PGN parser...

2) Length of sacs: sac detection is done only with pgn-extract, using its piece-pattern recognition... There are strict limits there... A sac there is just a material disadvantage in the game for the winning color. Each new capture (even a normal capture with re-capture) sets the "sac-counter" back to zero... No chance to change anything here...

3) Weights: IMO playing very short wins is very aggressive, too. But in general, my advice for engine developers using my EAS-tool:
Do not look too much at the EAS-score, but more at the statistics.
Here is an example from my all-time Top10 EAS-list, which looks like this (from my website):

Code:

                                 bad  avg.win
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player         SPCC-Elo
----------------------------------------------------------------------------
   1    275864  35.22%  42.68%  15.64%   64   Velvet 4.1.0 avx2       3368
   2    261076  36.44%  35.46%  12.71%   68   Revenge 1.0 avx2        3476
   3    241188  24.57%  30.02%  22.45%   70   Pedone 3 avx2           3342
   4    235546  38.26%  32.04%  10.14%   70   Dragon 3 aggressive     3301
   5    228109  34.52%  33.42%  13.57%   68   Uralochka 3.37c avx2    3468
   6    223392  29.38%  36.61%  15.97%   66   Arasan 23.0.1 avx2      3340
   7    222050  31.46%  31.02%  14.74%   70   Danasah 9.0 avx2        3223
   8    218065  30.02%  34.50%  13.40%   69   Rebel 14.1 avx2         3235
   9    182309  30.06%  23.83%  15.64%   72   Slow Chess 2.5 avx2     3432
  10    171629  23.62%  30.56%  17.84%   69   Viridithas 7.0.0 avx2   3348

For a developer, it is much more important to look at the stats. Example: Pedone 3. Why does Pedone have such a high EAS-score when its sacs value is only 24.57% and its number of bad draws is so high (22.45%)?
You can investigate this further in the second EAS-list (it looks like the one in my full-ratinglist EAS):

Code:

                       avg.win                                                                                                                          bad  
Rank  EAS-Score   wins  moves   sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1    all shorts short40  short45  short50  short55  short60   draws    Engine/player
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1    282189     4597   64   35.59% =[00.20% + 01.54% + 02.74% + 04.96% + 10.12% + 16.03%]    42.40% = [07.16% + 05.13% + 07.09% + 10.07% + 12.94%]  14.89%   Velvet 4.1.0 avx2  
   2    261076     3804   68   36.44% =[00.08% + 01.50% + 03.02% + 05.36% + 11.33% + 15.14%]    35.46% = [04.31% + 04.13% + 06.62% + 08.65% + 11.75%]  12.71%   Revenge 1.0 avx2  
   3    257404     2497   64   31.32% =[00.00% + 01.52% + 01.88% + 03.40% + 10.17% + 14.34%]    44.61% = [06.61% + 05.81% + 07.57% + 11.17% + 13.46%]  15.44%   Velvet 4.0.0 avx2  
   4    250257     5353   69   38.86% =[00.02% + 00.88% + 02.11% + 05.03% + 12.16% + 18.66%]    33.23% = [04.18% + 03.57% + 06.48% + 08.82% + 10.18%]  09.69%   Dragon 3 aggressive  
   5    241188      806   70   24.57% =[00.25% + 03.47% + 03.97% + 01.74% + 05.21% + 09.93%]    30.02% = [06.45% + 03.47% + 05.46% + 06.45% + 08.19%]  22.45%   Pedone 3 avx2  


Hard to read, sorry. But the point is the sac numbers of Pedone 3 (first row) and, for comparison, of Velvet 4.0.0 (second row):
Code:

sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1
24.57% =[00.25% + 03.47% + 03.97% + 01.74% + 05.21% + 09.93%]

31.32% =[00.00% + 01.52% + 01.88% + 03.40% + 10.17% + 14.34%]

So we see that the high EAS-score of Pedone 3 comes from its higher number of high-sac games, which give many more EAS-points than low-sac games...
So we can learn a lot here about Pedone 3's playing style:
1) It plays many high sacs (good!)
2) (Looking at the upper EAS-list in this post): it plays not so many short wins and a relatively high number of "bad draws". So, besides its high number of high sacs, Pedone 3 can be improved here: the developer should try to make the engine play shorter and more directly for the win and avoid some early draws...
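
The comparison can be checked with a quick calculation. This is only a sketch under my reading of rule 1) of the Readme: each percentage is multiplied by 100 and then by the bonus for its category, with "5+ Queensac" taken to be the sacsQ column. The function name is made up, and the real tool may differ in details. The result covers the sacrifice part of the score only, not short wins or bad draws.

```python
# Bonus multipliers as listed in the Readme, keyed by the list's column names.
MULTIPLIERS = {"sacsQ": 350, "sacs5+": 180, "sacs4": 90,
               "sacs3": 45, "sacs2": 15, "sacs1": 5}


def sac_points(percentages: dict[str, float]) -> int:
    """Sacrifice part of the EAS-score from the per-category percentages."""
    return sum(round(pct * 100) * MULTIPLIERS[name]
               for name, pct in percentages.items())


pedone_3 = {"sacsQ": 0.25, "sacs5+": 3.47, "sacs4": 3.97,
            "sacs3": 1.74, "sacs2": 5.21, "sacs1": 9.93}
velvet_4_0_0 = {"sacsQ": 0.00, "sacs5+": 1.52, "sacs4": 1.88,
                "sacs3": 3.40, "sacs2": 10.17, "sacs1": 14.34}

print("Pedone 3:    ", sac_points(pedone_3))
print("Velvet 4.0.0:", sac_points(velvet_4_0_0))
```

Under these assumptions Pedone 3 comes out at 127550 sac points against 82005 for Velvet 4.0.0, although Velvet has the higher overall sac percentage (31.32% vs 24.57%).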

pohl4711

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Wed Mar 08, 2023 4:49 pm

Chris Whittington wrote:
The below results are from a mass of games of the development versions of CSTal. Weak versions, strong versions, trained after a few epochs, and then every 15 epochs thereafter - against release versions of the other engines.

You can see that CSTal has way more %Q sacs than anything else.

Oh.... Look at my EAS alltime Top10 list, Pedone 3:

Code:

sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1
24.57% =[00.25% + 03.47% + 03.97% + 01.74% + 05.21% + 09.93%]

And here CSTal:
12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]


So, CSTal is light-years away from Pedone 3 in playing sacs... And with really good sac stats (like Pedone 3's), an engine gets a high EAS-score (241188), even though other engines played many more sacs overall. So, IMO, sacs are definitely not underrated. Sorry.

The points system is exponential:
1 pawnsac = 5x points *** 2 pawnsac = 15x points *** 3 pawnsac = 45x points
4 pawnsac = 90x points *** 5+ pawnsac = 180x points *** 5+ Queensac = 350x points

So, a 5+ sac gets 36x the points of a 1-pawn sac, for example.
Chris Whittington

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Wed Mar 08, 2023 5:40 pm

pohl4711 wrote:
Oh.... Look at my EAS alltime Top10 list, Pedone 3:

Code:

sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1
24.57% =[00.25% + 03.47% + 03.97% + 01.74% + 05.21% + 09.93%]

And here CSTal:
12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]


So, CSTal is light-years away from Pedone 3 in playing sacs... And with really good sac stats (like Pedone 3's), an engine gets a high EAS-score (241188), even though other engines played many more sacs overall. So, IMO, sacs are definitely not underrated. Sorry.

The points system is exponential:
1 pawnsac = 5x points  *** 2 pawnsac  = 15x points *** 3 pawnsac = 45x points
4 pawnsac = 90x points *** 5+ pawnsac = 180x points *** 5+ Queensac = 350x points

So, a 5+ sac gets 36x the points of a 1-pawn sac, for example.

But you don’t have Chess System Tal or anything remotely resembling it. Probably you’ve got something one of the freaks called CSTal, only it isn’t.
It gets confusing when somebody calls/renames something by my brand name when they don’t have the right or permission to do that and then distributes it around and doesn’t even tell me. I didn’t know it was sent to you and you tested it, (obviously in good faith) btw.
The results I referred to were in the last post: about a hundred thousand games by the actual current Chess System Tal against ten top freely available non-commercial engines.

Probably you have got this executable: "Rebel-14.1.02-ChrisW-NNUE-Tal-0-1MinuteBlitzer.exe" which has somehow morphed into "CSTal". It's a Rebel engine with one of my early nets, way back last year, which played "aggressively" against a pool of HCE opponents.
I would guess one of the external freaks, outside of our control, has "renamed" it.

pohl4711

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Thu Mar 09, 2023 7:57 am

Chris Whittington wrote:
pohl4711 wrote:
Chris Whittington wrote:
The below results are from a mass of games of the development versions of CSTal. Weak versions, strong versions, trained after a few epochs, and then every 15 epochs thereafter - against release versions of the other engines.

You can see that CSTal has way more %Q sacs than anything else.

Oh.... Look at my EAS alltime Top10 list, Pedone 3:

Code:

sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1
24.57% =[00.25% + 03.47% + 03.97% + 01.74% + 05.21% + 09.93%]

And here CSTal:
12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]


So, CSTal is light-years away from Pedone 3 in playing sacs... And with really good sac-stats (like Pedone 3), an engine gets a high EAS-score (241188), even though other engines played many more sacs overall. So, IMO, sacs are definitely not underrated. Sorry.

The points system is exponential:
1 pawnsac = 5x points  *** 2 pawnsac  = 15x points *** 3 pawnsac = 45x points
4 pawnsac = 90x points *** 5+ pawnsac = 180x points *** 5+ Queensac = 350x points

So, for example, a 5+ sac gets 36x more points than a 1-pawn sac.

But you don’t have Chess System Tal or anything remotely resembling it. Probably you’ve got something one of the freaks called CSTal, only it isn’t.
It gets confusing when somebody calls/renames something by my brand name when they don’t have the right or permission to do that and then distributes it around and doesn’t even tell me. I didn’t know it was sent to you and you tested it, (obviously in good faith) btw.
The results I referred to were in the last post, about a hundred thousand games by the actual current Chess System Tal against the top ten freely available non-commercial engines.

Probably you have got this executable: "Rebel-14.1.02-ChrisW-NNUE-Tal-0-1MinuteBlitzer.exe" which has somehow morphed into "CSTal". It's a Rebel engine with one of my early nets, way back last year, which played "aggressively" against a pool of HCE opponents.
I would guess one of the external freaks, outside of our control, has "renamed" it.

I have no CSTal. I took the CSTal numbers from your posting above (12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%])!!! The EAS-stats from Velvet 4.0.0 are mine. I just compared the two sets of stats to show that, right now, CSTal does not play very aggressively compared to the most aggressive engines I have tested so far.

Here is my full ratinglist (all engine versions since 2020, except the Stockfish dev versions):
https://www.sp-cc.de/files/spcc_full_list.txt
(The two EAS-lists follow below the ratinglist.)
As you can see there, a really aggressively playing engine should have 200000 EAS-points or more. For example, Rebel 14.1 has 218065 EAS-points here. A very nice value.

My Rebel 14.1 sac-stats:
30.02% =[00.04% + 01.03% + 01.34% + 03.63% + 07.93% + 16.04%]

Your CSTal-stats:
12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]

Right now, CSTal seems light-years away from Rebel 14.1 (12.4% sacs overall is really not a good value for an engine with TAL in its name, when my EAS-list shows that aggressive engines have 30% sacs overall and more). Of course, getting a high EAS-score becomes more and more difficult as the engine gets stronger: all really aggressive engine versions in my EAS-list are below 3500 SPCC-Elo in my classical ratinglist... So an EAS-score of 200000 or more could be impossible to reach with a strong engine (3600+ SPCC-Elo), but at least 150000 EAS-points should be possible (the Stockfish versions have around 130000 points).
KomodoDragon 2.5, for example, is a really strong engine (3732 SPCC-Elo); it has 151484 EAS-points, and its sac-stats look like this:
26.40% =[00.02% + 00.24% + 00.52% + 03.07% + 07.24% + 15.31%]
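
The stats lines quoted in this post all share one shape: an overall sac percentage followed by its breakdown (sacsQ, sacs5+, sacs4, sacs3, sacs2, sacs1, going by the column header quoted earlier). A small sketch, assuming exactly that layout, that parses a line and checks that the parts add up to the total; the function names and the 0.05 rounding tolerance are my own choices.

```python
import re

# The three stats lines quoted in the post above.
STATS = {
    "Rebel 14.1": "30.02% =[00.04% + 01.03% + 01.34% + 03.63% + 07.93% + 16.04%]",
    "CSTal": "12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]",
    "KomodoDragon 2.5": "26.40% =[00.02% + 00.24% + 00.52% + 03.07% + 07.24% + 15.31%]",
}


def parse_sac_stats(line):
    """Split 'total% =[a% + b% + ...]' into (total, [parts]) as floats."""
    numbers = [float(x) for x in re.findall(r"(\d+\.\d+)%", line)]
    return numbers[0], numbers[1:]


def is_consistent(line, tolerance=0.05):
    """True if the breakdown sums to the total, allowing for rounding."""
    total, parts = parse_sac_stats(line)
    return abs(total - sum(parts)) <= tolerance


for name, line in STATS.items():
    total, parts = parse_sac_stats(line)
    print(name, total, is_consistent(line))
```

The tolerance is there because each percentage is rounded to two decimals, so a sum can be off by a hundredth or so.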
https://www.sp-cc.de

Chris Whittington
Admin

Posts : 839
Join date : 2020-11-17
Location : France

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Thu Mar 09, 2023 11:48 am

pohl4711 wrote:
Chris Whittington wrote:
pohl4711 wrote:
Chris Whittington wrote:
The below results are from a mass of games of the development versions of CSTal. Weak versions, strong versions, trained after a few epochs, and then every 15 epochs thereafter - against release versions of the other engines.

You can see that CSTal has way more %Q sacs than anything else.

Oh.... Look at my EAS alltime Top10 list, Pedone 3:

Code:

sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1
24.57% =[00.25% + 03.47% + 03.97% + 01.74% + 05.21% + 09.93%]

And here CSTal:
12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]

You're comparing APPLES with ORANGES here, with the result that this thread disappears into nonsense, so let's try to get it back to normal ....

My EAS results come from playing a pool of very strong NNUE-only engines. My EAS results are a composite of ALL test games, of all epoch versions of the current three training runs, so many bad, many good and many indifferent CSTals all lumped together.
Your EAS results come from Pedone3 playing a very large pool of NNUE and HCE engines.

You can't compare a result from one against the other.

I show you:
You give results for Pedone3 playing within your test setup (opening book, opponents, time control), quoting a sac rate of 24.57%.

Well, I downloaded Pedone3 and put it through my test setup, result: sac rate 13.24%, very different, no?
I took one of my good Chess System Tal versions and ran it through my test setup, result: sac rate 21.43%, very different, no?

Even so, your EAS system still gives Pedone3 a very high aggressiveness level of 195466 against 163283 for CSTal (unfinished version). Which was the point of my original post, where I showed a way better sac performance of CSTal against both SF14 and SF13, yet SF14 and SF13 were given more points by your EAS tool. Leading me to the conclusion that your tool overrates short games and underrates sacs, also underrating (actually discounting) "length of sac".
Again, contrary to your assertion, Chess System Tal is without a doubt, imo, the most aggressive sacrificial non-commercial NNUE program to exist. I doubt it will ever get scores at the level of an aggressive-tuned HCE program simply because it is a lot more difficult to program aggression into an NNUE engine, but most aggressive free and non-commercial NNUE program, it is, or will be, when we release it.


Code:
*****************************************************************************
*** Engine Aggressiveness Tool V5.21 Score points Ratinglist
*****************************************************************************
*** Meanwhile, the scoring-system of the EAS-Tool got really complex, so
*** please check out the ReadMe-file, where you find the explanation...
*****************************************************************************
*** Evaluated file: all-204-130.pgn
*****************************************************************************
                                 bad  avg.win
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player
-------------------------------------------------------------------
   1    163283  21.43%  34.05%  15.63%   77   Chess-System-Tal-1.59  
-------------------------------------------------------------------
*** Average length of all won games:     80 moves
*** Engine gets bonuspoints, if its avg. won games length is shorter
*** Engine gets maluspoints, if its avg. won games length is longer
*****************************************************************************
*****************************************************************************
*****************************************************************************
*** 2nd Ratinglist with more stats in percent-values ************************
*****************************************************************************
*** Average length of all won games:  80 moves
*** Calculated limit for short wins giving EAS-points: 65 moves
*****************************************************************************
                       avg.win                                                                                                                          bad  
Rank  EAS-Score   wins  moves   sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1    all shorts short45  short50  short55  short60  short65   draws    Engine/player
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1    163283      420   77   21.43% =[00.00% + 00.71% + 00.71% + 02.38% + 05.95% + 11.67%]    34.05% = [03.81% + 01.67% + 05.24% + 09.29% + 14.05%]  15.63%   Chess-System-Tal-1.59  
****************************************************
*** EAS-tool (C) 2022 Stefan Pohl (www.sp-cc.de) ***
**************


Code:
*****************************************************************************
*** Engine Aggressiveness Tool V5.21 Score points Ratinglist
*****************************************************************************
*** Meanwhile, the scoring-system of the EAS-Tool got really complex, so
*** please check out the ReadMe-file, where you find the explanation...
*****************************************************************************
*** Evaluated file: all-Pedone3.pgn
*****************************************************************************
                                 bad  avg.win
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player
-------------------------------------------------------------------
   1    195466  13.24%  41.18%  27.51%   68   Pedone3  
-------------------------------------------------------------------
*** Average length of all won games:     77 moves
*** Engine gets bonuspoints, if its avg. won games length is shorter
*** Engine gets maluspoints, if its avg. won games length is longer
*****************************************************************************
*****************************************************************************
*****************************************************************************
*** 2nd Ratinglist with more stats in percent-values ************************
*****************************************************************************
*** Average length of all won games:  77 moves
*** Calculated limit for short wins giving EAS-points: 60 moves
*****************************************************************************
                       avg.win                                                                                                                          bad  
Rank  EAS-Score   wins  moves   sacs    sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1    all shorts short40  short45  short50  short55  short60   draws    Engine/player
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1    195466       68   68   13.24% =[00.00% + 01.47% + 01.47% + 00.00% + 05.88% + 04.41%]    41.18% = [11.76% + 02.94% + 08.82% + 11.76% + 05.88%]  27.51%   Pedone3  
****************************************************
*** EAS-tool (C) 2022 Stefan Pohl (www.sp-cc.de) ***
*****************
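
The percentages in the two outputs above can be turned back into game counts. The sketch below assumes the sacs column is a share of won games, which fits the Pedone3 breakdown (1.47% is 1 game out of 68); `sac_games` and `RUNS` are illustrative names only.

```python
# (wins, sacs %) copied from the two EAS-tool outputs above.
RUNS = {
    "Chess-System-Tal-1.59": (420, 21.43),
    "Pedone3": (68, 13.24),
}


def sac_games(wins, sac_percent):
    """Number of won games containing a sac, recovered from the percentage."""
    return round(wins * sac_percent / 100)


for engine, (wins, percent) in RUNS.items():
    print(engine, sac_games(wins, percent), "sac games out of", wins, "wins")
```

Under that assumption, the Pedone3 rate in this test setup rests on 9 games, against 90 for CSTal, which is worth keeping in mind when comparing the two percentages.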

Mclane likes this post

Mclane

Posts : 2599
Join date : 2020-11-17
Age : 56
Location : United States of Europe, Germany, Ruhr area

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Thu Mar 09, 2023 10:11 pm

I don't see that Pedone has anything special.
http://www.thorstenczub.de

pohl4711

Posts : 68
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Fri Mar 10, 2023 7:23 am

Chris Whittington wrote:

Even so, your EAS system still gives Pedone3 a very high aggressiveness level of 195466 against 163283 for CSTal (unfinished version). Which was the point of my original post, where I showed a way better sac performance of CSTal against both SF14 and SF13, yet SF14 and SF13 were given more points by your EAS tool. Leading me to the conclusion that your tool overrates short games and underrates sacs, also underrating (actually discounting) "length of sac".
Again, contrary to your assertion, Chess System Tal is without a doubt, imo, the most aggressive sacrificial non-commercial NNUE program to exist. I doubt it will ever get scores at the level of an aggressive-tuned HCE program simply because it is a lot more difficult to program aggression into an NNUE engine, but most aggressive free and non-commercial NNUE program, it is, or will be, when we release it.


Of course, different testing environments can lead to different EAS-scores. But aggressively playing engines will have a good EAS-score in every test environment (like Pedone 3 in yours), and non-aggressive engines will have bad EAS-scores in every test environment. When I compare (for example) the EAS-list of my SPCC-ratinglist with the EAS-list of my UHO-TOP10-ratinglist, the EAS-scores of the engines in the two ratinglists are not very different...
When CSTal is released, I will test it, and then we will see its EAS-score in my test environment.

And IMO my tool does not overrate short games and underrate sacs, especially not the length of sacs, because of this:
The tool gives EAS-points for sacs (of course) and for short wins. So, if an engine plays a sac in a game and this game is very short, the engine gets EAS-points for the sac and for the short win - so the engine gets EAS-points for both (sac and shortness) out of the same game! The EAS-scoring rewards sacs and shortness separately - if a game is short and there is a sac in it, the EAS-tool gives points for both and these points are added!!!
I did a lot of experiments with the EAS-scoring and right now it is IMO really well balanced. There are already more points for sacs than for shortness, and the points for higher sacs rise exponentially by a factor of 2. The points for shorter games rise exponentially only by a factor of 1.5 (which leads to huge point differences when comparing higher sacs and shorter wins). And, as I mentioned before, really spectacular games, which contain a sac and which are very short, get EAS-points for both sacs and shortness. So, if an engine plays a lot of these spectacular games (sacs and very short length), it will get a lot of EAS-points! And IMO games containing a high sac will be short most of the time, too. So, most of the time, a won game containing a high sac additionally gets EAS-points for shortness (if the engine sacs a rook or queen, for example, I heavily doubt that the game will go on very long after the sac is played).
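
To put the two growth factors side by side, here is a rough sketch. It only illustrates the stated factors (2 per step for sacs, 1.5 per step for short wins); the real scoring is described in the tool's ReadMe and is more involved. The tier counts (6 sac classes, 5 short-win classes) are taken from the column headers of the tool output quoted earlier in the thread, and `tier_growth` is a made-up helper name.

```python
def tier_growth(factor, tiers):
    """Top-tier weight relative to the lowest tier when every step multiplies by `factor`."""
    return factor ** (tiers - 1)


# Assumed tier counts: sacs1..sacsQ (6 classes), five short-win classes.
sac_spread = tier_growth(2, 6)
short_spread = tier_growth(1.5, 5)

print(sac_spread, short_spread)
```

With these assumptions the highest sac class weighs 32 times the lowest, while the shortest win class weighs about 5 times the longest one that still scores.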
https://www.sp-cc.de

Chris Whittington
Admin

Posts : 839
Join date : 2020-11-17
Location : France

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Fri Mar 10, 2023 8:37 am

Queen-sac bug, not a Queen sac.
Suggested fix: if Q sac detected, test that the Q-owning side did NOT promote a pawn.


Chess-System-Tal-1.59 -- CST-1.35-V20-E520
? (855) 2023.03.10 1-0 D22

1.d4 d5 2.c4 dxc4 3.Nf3 a6 4.a4 Nf6 5.g3 e6 6.Bg2 c5 7.O-O Nc6 8.Ne5 Nxd4 9.e3 Nb3 10.Qxd8+ Kxd8 11.Ra3 Ke8 12.Nxc4 Nxc1 13.Rxc1 Rb8 14.Rd3 Nd5 15.Nc3 Nb4 16.Rd2 b6 17.Rcd1 Be7 18.Ne5 f6 19.Bc6+ Kf8 20.Rd8+ Bxd8 21.Rxd8+ Ke7 22.Rxh8 fxe5 23.Bf3 h6 24.Ne4 Nd3 25.Rg8 g5 26.Rh8 c4 27.h4 gxh4 28.gxh4 Nxb2 29.Rxh6 Nxa4 30.Rh7+ Kf8 31.h5 c3 32.h6 c2 33.Rh8+ Ke7 34.h7 c1=Q+ 35.Kg2 Kd7 36.Rg8 Bb7 37.Rxb8 Bxe4 38.Bxe4 Qc4 39.f3 Qe2+ 40.Kh3 Qf1+ 41.Kg4 Qg2+ 42.Kh5 Nc5 43.h8=Q Qh3+ 44.Kg6 Qxh8 45.Rxh8 Kd6 46.Kf6 Nd7+ 47.Kf7 Nc5 48.Rh1 b5 49.Rd1+ Kc7 50.Kf6 a5 51.Kxe5 a4 52.Rd6 Nb3 53.Rxe6 Nd2 54.Rc6+ Kb7 55.Bd5 Ka7 56.Rc8 Nf1 57.Ra8+ Kb6 58.e4 Kc5 59.Rc8+ Kb4 60.f4 Ne3 61.Ke6 Ng2 62.f5 Ka5 63.f6 Nh4 64.f7 Ng6 65.Rg8 Nf4+ 66.Kd6 Nd3 67.e5 a3 68.e6 Nb2 69.e7 a2 70.Ra8+ Kb4 71.e8=Q a1=Q 72.Rxa1 Nc4+ 73.Bxc4 Kxc4 74.Ra3 Kb4 75.Qe3 Kc4 1-0
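
The suggested fix can be sketched outside the EAS-tool as a plain scan of the movetext for `=Q` promotions. This is only an illustration of the idea, not how the tool works: the function names are hypothetical, and the parser assumes clean SAN movetext without comments or variations. The excerpt is moves 33 to 45 of the game above.

```python
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

# Moves 33-45 of the game above.
EXCERPT = (
    "33.Rh8+ Ke7 34.h7 c1=Q+ 35.Kg2 Kd7 36.Rg8 Bb7 37.Rxb8 Bxe4 38.Bxe4 Qc4 "
    "39.f3 Qe2+ 40.Kh3 Qf1+ 41.Kg4 Qg2+ 42.Kh5 Nc5 43.h8=Q Qh3+ 44.Kg6 Qxh8 "
    "45.Rxh8 Kd6 1-0"
)


def find_queen_promotions(movetext):
    """Return (move_number, side, san) for every promotion to a queen."""
    found = []
    move_no = 0
    white_to_move = True
    for token in movetext.split():
        if token in RESULTS:
            continue
        numbered = re.match(r"^(\d+)(\.+)(.*)$", token)
        if numbered:
            move_no = int(numbered.group(1))
            white_to_move = len(numbered.group(2)) == 1  # "34." White, "34..." Black
            token = numbered.group(3)
            if not token:
                continue  # move number written apart from the move ("34. h7")
        if "=Q" in token:
            found.append((move_no, "white" if white_to_move else "black", token))
        white_to_move = not white_to_move
    return found


def queen_sac_plausible(movetext, queen_owner):
    """False if the side holding the extra queen got one by promotion."""
    return all(side != queen_owner for _, side, _ in find_queen_promotions(movetext))


print(find_queen_promotions(EXCERPT))
print(queen_sac_plausible(EXCERPT, "black"))
```

Here Black, the side holding the extra queen, promoted on move 34, so the check rejects the game as a queen sac. A stricter version would also look at whether the promotion happened before the flagged position.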
pohl4711

Posts : 68
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Fri Mar 10, 2023 9:19 am

Chris Whittington wrote:
Queen-sac bug, not a Queen sac.
Suggested fix: if Q sac detected, test that the Q-owning side did NOT promote a pawn.


Chess-System-Tal-1.59   --   CST-1.35-V20-E520
? (855)  2023.03.10  1-0  D22

1.d4 d5 2.c4 dxc4 3.Nf3 a6 4.a4 Nf6 5.g3 e6 6.Bg2 c5 7.O-O Nc6 8.Ne5 Nxd4 9.e3 Nb3 10.Qxd8+ Kxd8 11.Ra3 Ke8 12.Nxc4 Nxc1 13.Rxc1 Rb8 14.Rd3 Nd5 15.Nc3 Nb4 16.Rd2 b6 17.Rcd1 Be7 18.Ne5 f6 19.Bc6+ Kf8 20.Rd8+ Bxd8 21.Rxd8+ Ke7 22.Rxh8 fxe5 23.Bf3 h6 24.Ne4 Nd3 25.Rg8 g5 26.Rh8 c4 27.h4 gxh4 28.gxh4 Nxb2 29.Rxh6 Nxa4 30.Rh7+ Kf8 31.h5 c3 32.h6 c2 33.Rh8+ Ke7 34.h7 c1=Q+ 35.Kg2 Kd7 36.Rg8 Bb7 37.Rxb8 Bxe4 38.Bxe4 Qc4 39.f3 Qe2+ 40.Kh3 Qf1+ 41.Kg4 Qg2+ 42.Kh5 Nc5 43.h8=Q Qh3+ 44.Kg6 Qxh8 45.Rxh8 Kd6 46.Kf6 Nd7+ 47.Kf7 Nc5 48.Rh1 b5 49.Rd1+ Kc7 50.Kf6 a5 51.Kxe5 a4 52.Rd6 Nb3 53.Rxe6 Nd2 54.Rc6+ Kb7 55.Bd5 Ka7 56.Rc8 Nf1 57.Ra8+ Kb6 58.e4 Kc5 59.Rc8+ Kb4 60.f4 Ne3 61.Ke6 Ng2 62.f5 Ka5 63.f6 Nh4 64.f7 Ng6 65.Rg8 Nf4+ 66.Kd6 Nd3 67.e5 a3 68.e6 Nb2 69.e7 a2 70.Ra8+ Kb4 71.e8=Q a1=Q 72.Rxa1 Nc4+ 73.Bxc4 Kxc4 74.Ra3 Kb4 75.Qe3 Kc4 1-0

I know that in very rare cases the queen-sac detection can be wrong. As I mentioned before, the only way to detect sacs with pgn-extract is the material imbalance. But pgn-extract's piece/material-pattern search has no feature for finding pawn promotions. So I cannot avoid this, sorry. I can only do what I am already doing: looking for a material imbalance of (at least) 5 pawn-units (the losing color must have 5 pawn-units more than the winning color), and the losing color must have one more queen on the board (at the same time) than the winning color. Both must be true for 8 consecutive plies.

The piece patterns for queen sacs look like this (for white, playing the queen sac):
08 r*l*p3+ q1>=r=l=p*
08 r*l*p3+ q1>=r=l>p*
08 r*l*p3+ q1>=r>l*p*
08 r*l*p3+ q1>=r=l1<=p>
08 r*l*p3+ q1>=r=l1<=p=
08 r*l*p3+ q1>=r=l1<=p1<=
08 r*l*p3+ q1>=r=l2<=p2>
08 r*l*p3+ q1>=r=l3<=p5>
08 r*l*p3+ q1>=r1<=l>p>
08 r*l*p3+ q1>=r1<=l>p=
08 r*l*p3+ q1>=r1<=l>p1<=
08 r*l*p3+ q1>=r1<=l>p2<=
08 r*l*p3+ q1>=r1<=l=p>
08 r*l*p3+ q1>=r1<=l1<=p4>

It is clear that this search goes wrong if the losing color promotes a queen which is not captured in the next 8 plies, and the losing color has 5 more pawn-units on the board (than the winning color) for these 8 plies. But this case is very rare: in most games where one side promotes a queen, keeps it and has 5 pawn-units more on the board, that side will win. A (wrong) queen-sac detection can only happen if this side loses the game instead. Of course, you are right: this can lead to some wrong queen-sac detections. But I cannot avoid this. I can only do what pgn-extract allows me to do.
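
The rule described in this post (an imbalance of at least 5 pawn-units plus an extra queen for the losing side, held for 8 consecutive plies) can be written down as a small sketch. It is not the pgn-extract pattern matching itself. The piece values (9/5/3/1) are an assumption, since the post only speaks of "pawn-units"; "one more queen" is read as "at least one more"; and the `Material` class and function name are hypothetical.

```python
from dataclasses import dataclass

# Assumption: conventional piece values in pawn units.
VALUES = {"queens": 9, "rooks": 5, "minors": 3, "pawns": 1}


@dataclass
class Material:
    queens: int = 0
    rooks: int = 0
    minors: int = 0
    pawns: int = 0

    def pawn_units(self):
        return sum(getattr(self, name) * value for name, value in VALUES.items())


def queen_sac_flagged(plies, min_deficit=5, min_plies=8):
    """plies: one (winner_material, loser_material) pair per ply, in game order."""
    run = 0
    for winner, loser in plies:
        deficit = loser.pawn_units() - winner.pawn_units()
        if deficit >= min_deficit and loser.queens > winner.queens:
            run += 1
            if run >= min_plies:
                return True
        else:
            run = 0  # the condition must hold on consecutive plies
    return False


# Hypothetical material balance: the eventual winner is a queen down.
winner = Material(rooks=1, minors=2, pawns=4)
loser = Material(queens=1, rooks=1, minors=2, pawns=4)
print(queen_sac_flagged([(winner, loser)] * 8))
```

The same check fires whether the extra queen comes from a sacrifice or from a promotion by the losing side, which is exactly the false positive discussed above.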
https://www.sp-cc.de

Admin
Admin

Posts : 1851
Join date : 2020-11-17
Location : Netherlands

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Sat Mar 11, 2023 8:39 am

pohl4711 wrote:
Admin wrote:


BTW, speaking of Rebel-16.2 I noticed you did not test it, any particular reason?

Yes. You said there is no change in engine or net compared to Rebel 16.1. And I ran a small head-to-head of Rebel 16.1 vs 16.2, some thousand games overnight, and the difference was below 3 Elo (using UHO-openings; with balanced openings, the Elo-difference should be below 2 Elo...)

That's pretty odd, CEGT reported +23 over 16.1 on SD time control, CCRL +18 on (normal) 40/15 time control. 100% based on improvements in the time control code by Chris.
http://rebel13.nl/

pohl4711

Posts : 68
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Sat Mar 11, 2023 12:24 pm

Admin wrote:
pohl4711 wrote:
Admin wrote:


BTW, speaking of Rebel-16.2 I noticed you did not test it, any particular reason?

Yes. You said there is no change in engine or net compared to Rebel 16.1. And I ran a small head-to-head of Rebel 16.1 vs 16.2, some thousand games overnight, and the difference was below 3 Elo (using UHO-openings; with balanced openings, the Elo-difference should be below 2 Elo...)

That's pretty odd, CEGT reported +23 over 16.1 on SD time control, CCRL +18 on (normal) 40/15 time control. 100% based on improvements in the time control code by Chris.

No, the latest version CEGT had tested was Rebel 16.0, not 16.1. So +23 Elo from 16.0 to 16.2 seems legit.

From their test-results-forum of Rebel 16.2:
Performance = ca. ELO 3480 / 1800 games => +23 to v. 16.0NN (3457)

And if we look at their list, Rebel 16.2 with 3480 Elo is behind Rubichess and a bit ahead of Seer, near Igel, and this is exactly where Rebel 16.1 landed in the SPCC-ratinglist...
As I measured already: no measurable Elo progress from 16.1 to 16.2, but from 16.0 to 16.2 there is.
So, since 16.1 was already tested (SPCC), there is no need for a test run of Rebel 16.2. But CEGT never tested Rebel 16.1.

Same in CCRL 40/15: the latest Rebel they tested before 16.2 was Rebel 16.0...
https://www.sp-cc.de

fsanders

Posts : 14
Join date : 2023-01-10

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Tue Mar 21, 2023 12:49 pm

pohl4711 wrote:

1) It must be written in batch-language, because I use pgn-extract and some other external tools and will definitely not write my own pgn-parser...

Just for my understanding: because pgn-extract is a command-line program, would it not be possible to start it from Python and read the output (just like the MEA tool does with UCI engines)?

pohl4711

Posts : 68
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Tue Mar 21, 2023 12:58 pm

fsanders wrote:

Just for my understanding: because pgn-extract is a command-line program, would it not be possible to start it from Python and read the output (just like the MEA tool does with UCI engines)?


I do not say that this is not possible. Perhaps it is. But note that the batch language is very powerful and simple for reading, manipulating and writing text files. Because of this, batch is perfect for using pgn-extract: pgn-extract operates only on pgn-files, and the pgn-extract output is always a pgn-file (and all pgn-files are just text files). So, IMO, the batch language is the best choice for using pgn-extract...
And there is no need for installing Python or compiling or other things like that: the (batch) source code of my tools can be read and rewritten by everybody who wants to do so, because batch is an interpreted language. Speed is no problem, because all complex operations (searching for piece patterns etc.) are done by pgn-extract, which is very fast since its last update.
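
For comparison, what fsanders describes could look roughly like the sketch below: a Python wrapper that starts a command-line tool, captures its standard output and post-processes the PGN text. `run_and_capture` and `count_games` are hypothetical helper names and `games.pgn` is a placeholder. The only thing assumed about pgn-extract is that, called with a file name and no output option, it writes the resulting PGN to standard output.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command-line tool and return everything it wrote to stdout."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    if done.returncode != 0:
        print(done.stderr, file=sys.stderr)
        raise RuntimeError(f"{cmd[0]} exited with code {done.returncode}")
    return done.stdout


def count_games(pgn_text):
    """Count games by their [Event ...] tag, which opens every game in standard PGN."""
    return sum(1 for line in pgn_text.splitlines() if line.startswith("[Event "))


# Hypothetical usage, assuming pgn-extract is on the PATH:
# pgn_text = run_and_capture(["pgn-extract", "games.pgn"])
# print(count_games(pgn_text), "games")
```

Whether this is nicer than batch is a matter of taste; it mainly trades the "nothing to install" advantage for easier arithmetic and data structures.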
https://www.sp-cc.de

fsanders

Posts : 14
Join date : 2023-01-10

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Tue Mar 21, 2023 1:58 pm

I see. Looks like batch is the right choice for you and this topic.
TheSelfImprover

Posts : 2414
Join date : 2020-11-18

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Tue Mar 21, 2023 9:17 pm

In Windows, you've got the Windows Script Host (WSH), which gives you good programming capabilities, and even simple popup windows. Among the scripting languages it offers are Rexx*, BASIC, Perl, Ruby, Tcl, PHP, JavaScript, Delphi, Python, XSLT. I use it every day because I find it "really useful". A WSH file can be used like a batch file.

*I used to love Rexx because it was my first scripting language, and I was amazed that it was on offer under WSH. However, I recognise that JavaScript is the nearest thing the world has to a script language standard, so I use that. It's only fair to warn you that there are a few differences between WSH JavaScript and standard JavaScript.
Chris Whittington
Admin

Posts : 839
Join date : 2020-11-17
Location : France

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Fri Mar 31, 2023 1:30 pm

For Stefan Pohl:
I’ve included a call to your EAS bat file as part of the SPSA tuner for CSTal. This means, at every few epochs, the tuner will output both an Elo performance and an EAS score.
May I have your permission to include the EAS batch files etc as part of the SPSA tuner when it (the tuner) gets released on GitHub? Thanks.

Brendan likes this post

pohl4711

Posts : 68
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Sat Apr 01, 2023 8:07 am

Chris Whittington wrote:
For Stefan Pohl:
I’ve included a call to your EAS bat file as part of the SPSA tuner for CSTal. This means, at every few epochs, the tuner will output both an Elo performance and an EAS score.
May I have your permission to include the EAS batch files etc as part of the  SPSA tuner when it (the tuner) gets released on GitHub? Thanks.

Of course! For me, it is an honor that engine developers use my little EAS-tool. Feel free to do whatever you want. And, by the way, I like the idea! Looking forward to your GitHub release!
https://www.sp-cc.de

Dann Corbit

Posts : 110
Join date : 2020-11-26

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Thu Apr 13, 2023 12:40 pm

Don't forget fun.
I remember a Komodo queen sac in TCEC, and the crowd went wild.
A pawn sac won't cause a stir like that.

Morphy moves and Tal moves make us swoon.  Mikhail Botvinnik not so much.

Mclane and Brendan like this post

Dann Corbit

Posts : 110
Join date : 2020-11-26

PostSubject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS   Thu Apr 13, 2023 12:42 pm

Speaking of Mikhail Botvinnik, sure he wins.  But who cares?