Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Tue Mar 07, 2023 11:50 pm
The EAS test is really very (very) useful to engine developers. I'm using it to keep a quantitative track on the dev versions of Chess System Tal, and also because I like (a lot) playing through the sacrifice games that it finds from PGN lists. Here I've one small request for a reversion to EAS saving PGN's with full comments, timings and evals, these are lost in the recent versions. Like so:
[Date "2023.03.07"]
[Round "105"]
[White "Chess-System-Tal-1.59"]
[Black "Seer-2.6"]
[Result "1-0"]
[ECO "C18"]
[Opening "French"]
[Variation "Winawer, Advance Variation"]
[TimeControl "20+0.05"]
[PlyCount "75"]
[GameDuration "00:00:30"]
[GameEndTime "2023-03-07T20:46:20.470 W. Europe Standard Time"]
[GameStartTime "2023-03-07T20:45:49.819 W. Europe Standard Time"]
it was very useful to know the evals, at a glance, when reviewing the games, but now not
I have now 200,000 PGN test games, organised by maturing epochs (about 15 epochs apart) and several nets (three nets in the current series), so can probably make some comments about the scoring system. Aggressiveness scoring is obviously made very difficult by defining what "aggressive" means and how the various metrics used correlate to it and not least by the extreme sparsity of high material sacrifice games.
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Wed Mar 08, 2023 3:27 pm
Admin wrote:
BTW, speaking of Rebel-16.2 I noticed you did not test it, any particular reason?
Yes. You said there is no change in engine or net compared to Rebel 16.1. I also ran a small head-to-head of Rebel 16.1 vs 16.2 with several thousand games overnight, and the difference was below 3 Elo (that was with UHO openings; with balanced openings, the Elo difference should be below 2 Elo...)
Last edited by pohl4711 on Wed Mar 08, 2023 3:53 pm; edited 1 time in total
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Wed Mar 08, 2023 3:38 pm
Chris Whittington wrote:
The EAS test is really very (very) useful to engine developers. I'm using it to keep a quantitative track on the dev versions of Chess System Tal, and also because I like (a lot) playing through the sacrifice games that it finds from PGN lists. Here I've one small request for a reversion to EAS saving PGN's with full comments, timings and evals, these are lost in the recent versions. Like so:
[Date "2023.03.07"]
[Round "105"]
[White "Chess-System-Tal-1.59"]
[Black "Seer-2.6"]
[Result "1-0"]
[ECO "C18"]
[Opening "French"]
[Variation "Winawer, Advance Variation"]
[TimeControl "20+0.05"]
[PlyCount "75"]
[GameDuration "00:00:30"]
[GameEndTime "2023-03-07T20:46:20.470 W. Europe Standard Time"]
[GameStartTime "2023-03-07T20:45:49.819 W. Europe Standard Time"]
it was very useful to know the evals, at a glance, when reviewing the games, but now not
I have now 200,000 PGN test games, organised by maturing epochs (about 15 epochs apart) and several nets (three nets in the current series), so can probably make some comments about the scoring system. Aggressiveness scoring is obviously made very difficult by defining what "aggressive" means and how the various metrics used correlate to it and not least by the extreme sparsity of high material sacrifice games.
The scoring system of the EAS-Tool has become really complex in the meantime. It is fully explained in the ReadMe file in the EAS download. For engine developers, it is perhaps more helpful to look at the full stats in the 2nd EAS-list rather than only at the EAS-score: the EAS-tool produces 2 lists, the second with more percentage stats. Both are in the file statistics_EAS_ratinglist.txt.
The EAS-Tool deletes all comments in the games, because this makes the files much smaller and the computing faster (around +50% faster!). But if you want to keep the comments, just open the .bat file (EAS_Tool_V5.21.bat and/or Gauntlet_EAS_Tool_V5.21.bat) with an editor and search for the string "-C -N -V". These are the pgn-extract options that remove all comments. The string appears only once in each of the two tools. The line looks like this:
pgn-extract --quiet --fixresulttags -C -N -V --plycount ../%gamebase% --output newsource.pgn > NUL
Just delete this sequence and the games should keep all comments (time, eval, depth, when played with cutechess). The line should then look like this:
pgn-extract --quiet --fixresulttags --plycount ../%gamebase% --output newsource.pgn > NUL
Save the .bat file and use this instead of the original .bat-files...
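For anyone who prefers not to edit the batch files by hand, the same change can be scripted. This is only a sketch, not part of the EAS download: the function names and the output file name in the usage note are made up, and it assumes the flag sequence appears in the file exactly as quoted above.

```python
from pathlib import Path

# The pgn-extract option sequence quoted in the post, with its trailing space.
COMMENT_FLAGS = "-C -N -V "


def keep_comments(bat_text: str) -> str:
    """Return the batch text with the comment-stripping options removed."""
    return bat_text.replace(COMMENT_FLAGS, "")


def patch_bat_file(src: str, dst: str) -> None:
    """Write a patched copy of src to dst, leaving the original file untouched."""
    # latin-1 maps every byte to one character, so all other bytes
    # (including the Windows line endings) survive the round trip unchanged.
    text = Path(src).read_bytes().decode("latin-1")
    Path(dst).write_bytes(keep_comments(text).encode("latin-1"))


line = ("pgn-extract --quiet --fixresulttags -C -N -V --plycount "
        "../%gamebase% --output newsource.pgn > NUL")
print(keep_comments(line))
```

Usage would be something like patch_bat_file("EAS_Tool_V5.21.bat", "EAS_Tool_V5.21_comments.bat"), where the second name is an arbitrary choice.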
Last edited by pohl4711 on Wed Mar 08, 2023 3:51 pm; edited 1 time in total
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Wed Mar 08, 2023 3:48 pm
Here (taken from the Readme-file) the EAS-Score calculation:
EAS-Score is calculated with these rules:
1) Sacrifices: (percent*100) of the percent-values of the sacrifices (1-5+ pawnunits), calculated out of the games won by the engine only. So a weak engine (with a small number of won games) can get a high EAS-score too, when the percentage of sac-games among its won games is high (and the number of short wins). Higher pawnunit-sacs give bonus-points:
1 pawnsac = 5x points
2 pawnsac = 15x points
3 pawnsac = 45x points
4 pawnsac = 90x points
5+ pawnsac = 180x points
5+ Queensac = 350x points
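Rule 1 can be sketched in a few lines of Python. This is not code from the EAS-Tool (which is a batch file): the function name and dictionary keys are invented for illustration, and reading "(percent*100) ... Nx points" as round(percent*100) times N is an assumption. The sample percentages are the sac breakdown that Chris posts later in this thread for Chess-System-Tal-1.59.

```python
# Multipliers from rule 1 of the ReadMe, keyed by sac size in pawn-units.
SAC_MULTIPLIERS = {"Q": 350, "5+": 180, "4": 90, "3": 45, "2": 15, "1": 5}


def sac_points(sac_percent):
    """Sum round(percent*100) * multiplier over all sac classes (assumed reading)."""
    return sum(round(pct * 100) * SAC_MULTIPLIERS[size]
               for size, pct in sac_percent.items())


# Sac percentages (share of won games) from the Chess-System-Tal-1.59 list below.
cstal_sacs = {"Q": 0.00, "5+": 0.71, "4": 0.71, "3": 2.38, "2": 5.95, "1": 11.67}
print(sac_points(cstal_sacs))  # 44640
```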
2) Very short won games: (percent*100) of the games won by the engine give these EAS-points:
60 moves = 8x points
55 moves = 12x points
50 moves = 18x points
45 moves = 27x points
40 moves = 45x points
Since V5.2, the move-limit is no longer fixed to 40-60 moves. Instead, the average length of all won games in the source.pgn is calculated, rounded to 5 or 10, and reduced by 15. The reason is that human games or adjudicated engine games, for example, are much shorter than non-adjudicated engine games, and the EAS-tools now adjust the move-limits for short-win EAS-points to this "reality":
Example 1: Average won game length in the source.pgn is 78 moves: rounded to 75 and -15 = 60 is the upper limit, followed by 55, 50, 45, 40.
Example 2: Average won game length is 58 moves: rounded to 55 and -15 = 40 is the upper limit, followed by 35, 30, 25, 20.
Additionally, if the average win game length of the engine is shorter than the average win game length of all games in the source.pgn, the engine gets 3000 EAS-points for each move by which its won games are shorter on average. If the average win game length of the engine is higher than the average win game length of all games in the source.pgn, 1000 EAS-points are subtracted for each move by which its won games are longer on average. But this subtraction is applied only to the EAS-points the engine has received for its short wins (see above). The other EAS-points (for sacrifices and bad draws, see 1) and 3)) always stay in the calculation!
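A rough Python sketch of rule 2 follows. Again this is not the tool's own code and the names are invented. Three points are assumptions: "rounded to 5 or 10" is taken to mean rounding down to a multiple of 5 (which matches both ReadMe examples), average lengths are whole moves, and the malus is capped so that the short-win points never drop below zero.

```python
# Multipliers from rule 2, upper limit first and shortest bucket last.
SHORT_MULTIPLIERS = [8, 12, 18, 27, 45]


def short_win_limits(avg_len_all_wins):
    """Move limits for the five short-win buckets, upper limit first."""
    upper = (avg_len_all_wins // 5) * 5 - 15  # round down to a multiple of 5, then -15
    return [upper - 5 * i for i in range(5)]


def short_win_points(bucket_percent, avg_len_all_wins, avg_len_engine_wins):
    """bucket_percent maps each move limit to the percent of won games in that bucket."""
    limits = short_win_limits(avg_len_all_wins)
    points = sum(round(bucket_percent[limit] * 100) * mult
                 for limit, mult in zip(limits, SHORT_MULTIPLIERS))
    diff = avg_len_all_wins - avg_len_engine_wins
    if diff >= 0:
        return points + 3000 * diff          # bonus: engine wins faster than average
    return max(0, points - 1000 * (-diff))   # malus, assumed to be capped at zero


print(short_win_limits(78))  # [60, 55, 50, 45, 40]
print(short_win_limits(58))  # [40, 35, 30, 25, 20]
```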
3) Bad draws: Bad draws are games which were drawn before the endgame (a material check is done; the number of played moves does not matter) and draws after the engine had a material advantage of at least 1 pawn during the game, because the engine should win a game if material was won. All these bad draws are finally checked for a material disadvantage of at least 1 pawn: draws with a material disadvantage prevented a possible loss, so these games are not bad draws and are not counted. The formula for calculating the bad-draw EAS-points is a bit tricky:
a) The percent-value of all good draws (out of all draws the engine played) is calculated (all draws - bad draws) and rounded. Example: the engine had 23.7% bad draws, so the value here is 76 (100% - 23.7% = 76.3% good draws, then rounded).
b) This value is cubed: 76*76*76 = 438976, then divided by 2500 = 175.
c) This value is squared: 175*175 = 30625.
So the engine gets 30625 EAS-points.
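The bad-draw formula in steps a) to c) is short enough to write out. A minimal Python sketch, with an invented function name; the only assumption is that the division by 2500 drops the fraction, which is what the ReadMe example implies (438976 / 2500 = 175.59, given as 175).

```python
def bad_draw_points(bad_draw_percent):
    """EAS-points for draws, following steps a) to c) of rule 3."""
    good = round(100 - bad_draw_percent)  # a) percent of good draws, rounded
    cubed = good ** 3 // 2500             # b) cube, then integer-divide by 2500
    return cubed ** 2                     # c) square


print(bad_draw_points(23.7))  # 30625, the ReadMe example
```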
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Wed Mar 08, 2023 4:18 pm
OMG, just looked at the batch files. It's all done in batch file language. Help!! I can see how it gets complicated, fast. Convert to Python?
There are two things I would change. Please treat this as being helpful, rather than critical. First, I'ld weight the sacrifice results quite a bit more. Sacrifices are very probably directly correlatable to "chess aggressiveness". And the game length results less, possibly game length and aggressiveness have some kind of correlation, but so does gamelength and Elo difference. Some sac games can be "long", for example instead of liquidating into a short mate, they liquidate into a longer but winning endgame. Second, I'ld introduce the concept of length of sacrifice(how long during the game does the engine "hold" the sacrificial situation before resolving it?) The longer the "hold", the more likely the sac is positional (eval), rather than short term tactical (search), and I think positional eval sacs are more linked to aggression style than tactical search ones.
I can see that, because sacs are sparse, increasing sac weighting is quite possibly going to result in larger final score ranges, which could also result in an engine getting "lucky" results, but this is always the danger of not enough data. However, introducing a "length of sac" factor will probably ameliorate this. Or better log(length of sac) or sqrt(length of sac). I guess sac length is already computed? Because it's the trigger for counting as a sac?
it could be that just multiplying the sac factor by sqrt(sac length) could be enough. It will increase the sac weight at the same time as adjusting for sac length. Or maybe ln(sac length)?
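To make the proposal concrete, here is a small Python sketch of the weighting. Everything in it is hypothetical: the EAS-Tool has no such factor, and the function name, the "hold_plies" parameter and the choice of counting the hold in plies are assumptions made for illustration.

```python
import math


def weighted_sac_points(base_points, hold_plies, scale="sqrt"):
    """Scale a sac's points by how long the material deficit was held."""
    if hold_plies < 1:
        raise ValueError("hold_plies must be at least 1")
    if scale == "sqrt":
        factor = math.sqrt(hold_plies)
    elif scale == "ln":
        factor = math.log(hold_plies)  # note: ln(1) == 0, so a 1-ply hold scores nothing
    else:
        raise ValueError("scale must be 'sqrt' or 'ln'")
    return base_points * factor


# A 1-pawn sac (5x) held for 16 plies versus 4 plies under the sqrt variant.
print(weighted_sac_points(5, 16), weighted_sac_points(5, 4))  # 20.0 10.0
```

With sqrt, holding a sac four times as long doubles its weight; ln grows more slowly and discounts very short holds much harder.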
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: EAS - an example Wed Mar 08, 2023 4:33 pm
The below results are from a mass of games of the development versions of CSTal. Weak versions, strong versions, trained after a few epochs, and then every 15 epochs thereafter - against release versions of the other engines.
You can see that CSTal has way more %Q sacs than anything else. About 3x as many 5+ sacs as the next engine. 2x as many 4+ sacs. More 3+ sacs than anything else, more 2+ sacs than anything else, and in top 50% on the 1+ sacs. On sacs alone, I'ld place it at "most aggressive", except that SF13 and SF14 come higher in the list (I assume from short game weightings). Thus the reason I think sacs are under-weighted. Also looking at some of the games of other engine sacs, I found 'quick' sacs, ie, tactical ones that win back the material fast, or lead to quick mate - these tend to be search/elo related, so arguably not "aggressive" style.
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Wed Mar 08, 2023 4:42 pm
Chris Whittington wrote:
OMG, just looked at the batch files. It's all done in batch file language. Help!! I can see how it gets complicated, fast. Convert to Python?
There are two things I would change. Please treat this as being helpful, rather than critical. First, I'ld weight the sacrifice results quite a bit more. Sacrifices are very probably directly correlatable to "chess aggressiveness". And the game length results less, possibly game length and aggressiveness have some kind of correlation, but so does gamelength and Elo difference. Some sac games can be "long", for example instead of liquidating into a short mate, they liquidate into a longer but winning endgame. Second, I'ld introduce the concept of length of sacrifice(how long during the game does the engine "hold" the sacrificial situation before resolving it?) The longer the "hold", the more likely the sac is positional (eval), rather than short term tactical (search), and I think positional eval sacs are more linked to aggression style than tactical search ones.
I can see that, because sacs are sparse, increasing sac weighting is quite possibly going to result in larger final score ranges, which could also result in an engine getting "lucky" results, but this is always the danger of not enough data. However, introducing a "length of sac" factor will probably ameliorate this. Or better log(length of sac) or sqrt(length of sac). I guess sac length is already computed? Because it's the trigger for counting as a sac?
it could be that just multiplying the sac factor by sqrt(sac length) could be enough. It will increase the sac weight at the same time as adjusting for sac length. Or maybe ln(sac length)?
1) It must be written in batch language, because I use pgn-extract and some other external tools and will definitely not write my own PGN parser...
2) Length of sacs: Sac-detection is done only with pgn-extract, using its piece-pattern recognition... There are strict limits there... A sac there is just a material disadvantage in the game for the winning color. Each new capture (even a normal capture with re-capture) sets the "sac-counter" back to zero... No chance to change something here...
3) Weights: IMO playing very short wins is very aggressive, too. But in general, my advice for engine developers using my EAS-tool: do not look too much at the EAS-score, but more into the statistics. Here is an example out of my EAS all-time Top10 list, which looks like this (from my website):
For a developer, it is much more important to look at the stats. Example: Pedone 3. Why does Pedone 3 have such a high EAS-score, when the sacs value is only 24.57% and the number of bad draws is so high (22.45%)? You can investigate this further in the second EAS-list (it looks like this in my full-ratinglist EAS):
So we see that the high EAS-score of Pedone 3 comes from the higher number of high-sac games, which give many more EAS-points than low-sac games... So we can learn a lot here about Pedone 3's playing style: 1) It plays many high sacs (good!). 2) (Looking at the upper EAS-list in this post): it plays not so many short wins and a relatively high number of "bad draws". So besides the high number of high sacs, Pedone 3 can be improved here: the developer should try to make the engine play shorter and more directly for the win and avoid some early draws...
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Wed Mar 08, 2023 4:49 pm
Chris Whittington wrote:
The below results are from a mass of games of the development versions of CSTal. Weak versions, strong versions, trained after a few epochs, and then every 15 epochs thereafter - against release versions of the other engines.
You can see that CSTal has way more %Q sacs than anything else.
Oh.... Look at my EAS alltime Top10 list, Pedone 3:
And here CSTal: 12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]
So, CSTal is light-years away from Pedone 3 in playing sacs... And with really good sac-stats (like Pedone 3), an engine gets a high EAS-score (241188), even though other engines played many more sacs overall. So, IMO, sacs are definitely not underrated. Sorry.
So, a 5+sac gets 36x more points, than a 1pawn sac. For example.
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Wed Mar 08, 2023 5:40 pm
But you don’t have Chess System Tal or anything remotely resembling it. Probably you’ve got something one of the freaks called CSTal, only it isn’t. It gets confusing when somebody calls/renames something by my brand name when they don’t have the right or permission to do that and then distributes it around and doesn’t even tell me. I didn’t know it was sent to you and you tested it, (obviously in good faith) btw. The results I referred to were in the last post, about a hundred thousand games by the actual current Chess System Tal against top ten free available non-commercials.
Probably you have got this executable: "Rebel-14.1.02-ChrisW-NNUE-Tal-0-1MinuteBlitzer.exe" which has somehow morphed into "CSTal". It's a Rebel engine with one of my early nets, way back last year, which played "aggressively" against a pool of HCE opponents. I would guess one of the external freaks, outside of our control, has "renamed" it.
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Thu Mar 09, 2023 7:57 am
I have no CSTal. I took the numbers of CSTal from your posting above (12.40% =[00.00% + 00.14% + 00.05% + 00.97% + 03.56% + 07.68%]) !!! The EAS-stats from Velvet 4.0.0 are mine. I just compared the 2 stats, in order to show, that right now, CSTal plays not very aggressive, compared to the most aggressive engines, I tested so far.
Here my full-Ratinglist (all engine-versions since 2020, except the Stockfish Dev-versions): https://www.sp-cc.de/files/spcc_full_list.txt (Below the ratinglist, the 2 EAS-lists are following) As you can see there, a really aggressive playing engine should have 200000 EAS-points or more. For example: Rebel 14.1 has 218065 EAS-points here. Very nice value.
CSTal seems light-years away from Rebel 14.1 right now (12.4% sacs overall is really not a good value for an engine with the name TAL in it, when we see in my EAS-list that aggressive engines have 30% overall sacs and more). Of course, it becomes more and more difficult to get a high EAS-score as the engine gets stronger: all really aggressive engine versions in my EAS-list are below 3500 SPCC-Elo in my classical ratinglist... So an EAS-score of 200000 or more could be impossible to reach with a strong engine (3600+ SPCC-Elo), but at least 150000 EAS-points should be possible (Stockfishes have around 130000 points). KomodoDragon 2.5, for example, is a really strong engine (3732 SPCC-Elo) and it has 151484 EAS-points, and its sac-stats look like this: 26.40% =[00.02% + 00.24% + 00.52% + 03.07% + 07.24% + 15.31%]
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Thu Mar 09, 2023 11:48 am
You're comparing APPLES with ORANGES here, with the result this thread disappears into nonsense, so let's try to get it back to normal ....
My EAS results come from playing a pool of very strong NNUE only engines. My EAS results are a composite of ALL test games, of all epoch versions of the current three training runs, so many bad, many good and many indifferent CSTals all lumped together. Your EAS results come from Pedone3 playing a very large pool of NNUE and HCE engines.
You can't compare a result from one against the other.
I show you: You give results for Pedone3 playing within your test setup (opening book, opponents, time control), quoting a sac rate of 24.5%
Well, I downloaded Pedone3 and put it through my test setup, result: sac rate 13.4%, very different, no? I took one of my good Chess system Tal versions and ran it through my test setup, result: sac rate 21.4%, very different, no?
Even so, your EAS system still gives Pedone3 a very high aggressiveness level 195466 against CSTal (unfinished version) of 163283. Which was the point of my original post, where I showed a way better sac performance of CSTal against both SF14 and SF13, yet, SF14 and SF13 were given more points by your EAS tool. Leading me to the conclusion that your tool overrates short games and underrate sacs, also underrating (actually discounting) "length of sac". Again, contrary to your assertion, Chess System Tal is without a doubt, imo, the most aggressive sacrificial non-commercial NNUE program to exist. I doubt it will ever get scores at the level of an aggressive-tuned HCE program simply because it is a lot more difficult to program aggression into a NNUE engine, but most aggressive free and non-commercial NNUE program, it is, or will be, when we release it.
Code:
*****************************************************************************
*** Engine Aggressiveness Tool V5.21 Score points Ratinglist
*****************************************************************************
*** Meanwhile, the scoring-system of the EAS-Tool got really complex, so
*** please check out the ReadMe-file, where you find the explanation...
*****************************************************************************
*** Evaluated file: all-204-130.pgn
*****************************************************************************
                                 bad     avg.win
Rank  EAS-Score  sacs    shorts  draws   moves    Engine/player
-------------------------------------------------------------------
   1     163283  21.43%  34.05%  15.63%  77       Chess-System-Tal-1.59
-------------------------------------------------------------------
*** Average length of all won games: 80 moves
*** Engine gets bonuspoints, if its avg. won games length is shorter
*** Engine gets maluspoints, if its avg. won games length is longer
*****************************************************************************
*****************************************************************************
*****************************************************************************
*** 2nd Ratinglist with more stats in percent-values ************************
*****************************************************************************
*** Average length of all won games: 80 moves
*** Calculated limit for short wins giving EAS-points: 65 moves
*****************************************************************************
                  avg.win                                                                                                                        bad
Rank EAS-Score wins moves sacs     sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1  all shorts    short45  short50  short55  short60  short65 draws   Engine/player
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1    163283  420    77 21.43% =[00.00% + 00.71% + 00.71% + 02.38% + 05.95% + 11.67%]    34.05% = [03.81% + 01.67% + 05.24% + 09.29% + 14.05%] 15.63%  Chess-System-Tal-1.59
****************************************************
*** EAS-tool (C) 2022 Stefan Pohl (www.sp-cc.de) ***
****************************************************
Code:
*****************************************************************************
*** Engine Aggressiveness Tool V5.21 Score points Ratinglist
*****************************************************************************
*** Meanwhile, the scoring-system of the EAS-Tool got really complex, so
*** please check out the ReadMe-file, where you find the explanation...
*****************************************************************************
*** Evaluated file: all-Pedone3.pgn
*****************************************************************************
                                 bad     avg.win
Rank  EAS-Score  sacs    shorts  draws   moves    Engine/player
-------------------------------------------------------------------
   1     195466  13.24%  41.18%  27.51%  68       Pedone3
-------------------------------------------------------------------
*** Average length of all won games: 77 moves
*** Engine gets bonuspoints, if its avg. won games length is shorter
*** Engine gets maluspoints, if its avg. won games length is longer
*****************************************************************************
*****************************************************************************
*****************************************************************************
*** 2nd Ratinglist with more stats in percent-values ************************
*****************************************************************************
*** Average length of all won games: 77 moves
*** Calculated limit for short wins giving EAS-points: 60 moves
*****************************************************************************
                  avg.win                                                                                                                        bad
Rank EAS-Score wins moves sacs     sacsQ    sacs5+   sacs4    sacs3    sacs2    sacs1  all shorts    short40  short45  short50  short55  short60 draws   Engine/player
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1    195466   68    68 13.24% =[00.00% + 01.47% + 01.47% + 00.00% + 05.88% + 04.41%]    41.18% = [11.76% + 02.94% + 08.82% + 11.76% + 05.88%] 27.51%  Pedone3
****************************************************
*** EAS-tool (C) 2022 Stefan Pohl (www.sp-cc.de) ***
****************************************************
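As a cross-check, both EAS-Scores above can be recomputed from the percentages in the 2nd list using the ReadMe rules quoted earlier in this thread. The Python sketch below is not part of the EAS-Tool; the function name and argument layout are invented, and it assumes that each percentage is scored as round(percent*100) times its multiplier, that the division by 2500 drops the fraction, and that a length malus cannot push the short-win points below zero.

```python
SAC_X = [350, 180, 90, 45, 15, 5]   # sacsQ, sacs5+, sacs4, sacs3, sacs2, sacs1
SHORT_X = [45, 27, 18, 12, 8]       # shortest bucket first, as printed in the 2nd list


def eas_score(sacs, shorts, bad_draws, avg_all, avg_engine):
    """Recompute an EAS-Score from the values printed in the 2nd ratinglist."""
    sac_pts = sum(round(p * 100) * x for p, x in zip(sacs, SAC_X))
    short_pts = sum(round(p * 100) * x for p, x in zip(shorts, SHORT_X))
    diff = avg_all - avg_engine      # positive: engine wins faster than average
    if diff >= 0:
        short_pts += 3000 * diff
    else:
        short_pts = max(0, short_pts + 1000 * diff)  # diff is negative here
    draw_pts = (round(100 - bad_draws) ** 3 // 2500) ** 2
    return sac_pts + short_pts + draw_pts


cstal = eas_score([0.00, 0.71, 0.71, 2.38, 5.95, 11.67],
                  [3.81, 1.67, 5.24, 9.29, 14.05], 15.63, 80, 77)
pedone = eas_score([0.00, 1.47, 1.47, 0.00, 5.88, 4.41],
                   [11.76, 2.94, 8.82, 11.76, 5.88], 27.51, 77, 68)
print(cstal, pedone)  # 163283 195466
```

For Chess-System-Tal-1.59 this breaks down as 44640 (sacs) + 53474 (short wins) + 9000 (length bonus) + 56169 (draws); for Pedone3 it is 50715 + 95550 + 27000 + 22201. So in this run Pedone3's lead comes almost entirely from the short-win and length terms.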
Mclane
Posts : 3022 Join date : 2020-11-17 Age : 57 Location : United States of Europe, Germany, Ruhr area
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Thu Mar 09, 2023 10:11 pm
I don't see that Pedone has anything special.
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Fri Mar 10, 2023 7:23 am
Chris Whittington wrote:
Even so, your EAS system still gives Pedone3 a very high aggressiveness level 195466 against CSTal (unfinished version) of 163283. Which was the point of my original post, where I showed a way better sac performance of CSTal against both SF14 and SF13, yet, SF14 and SF13 were given more points by your EAS tool. Leading me to the conclusion that your tool overrates short games and underrate sacs, also underrating (actually discounting) "length of sac". Again, contrary to your assertion, Chess System Tal is without a doubt, imo, the most aggressive sacrificial non-commercial NNUE program to exist. I doubt it will ever get scores at the level of an aggressive-tuned HCE program simply because it is a lot more difficult to program aggression into a NNUE engine, but most aggressive free and non-commercial NNUE program, it is, or will be, when we release it.
Of course, different testing-environments can lead to different EAS-scores. But aggressive playing engines will have a good EAS-score in each test-environment (like Pedone 3 in yours). And non-aggressive engines will have bad EAS-scores in each test-environment. When I look (for example) on the EAS-list of my SPCC-ratinglist and compare it with the EAS-list of my UHO-TOP10-ratinglist, the EAS-scores of the engines in both ratinglists are not very different... When CSTal is released, I will test it and the we see it's EAS-score in my test-environment.
And IMO my tool does not overrate short games and underrate sacs, especially not the length of sacs, because of this: The tool gives EAS-points for sacs (of course) and for short wins. So, if an engine plays a sac in a game and this game is very short, the engine gets EAS-points for the sac and for the short win - so the engine gets EAS-points for both (sac and shortness) out of the same played game! The EAS-scoring rewards sacs and shortness separately - if a game is short and there is a sac in it, the EAS-tool gives points for both and these points are added!!! I did a lot of experiments with the EAS-scoring and right now, it is IMO really well balanced. There are already more points for sacs than for shortness, and the points for higher sacs rise exponentially by factor 2. The points for shorter games rise exponentially only by factor 1.5 (which leads to huge points-differences when comparing higher sacs and shorter wins). And, as I mentioned before, really spectacular games, which contain a sac and which are very short, get EAS-points both for sacs and for shortness. So, if an engine plays a lot of these spectacular games (sacs and very short length), it will get a lot of EAS-points! And IMO games containing a high sac will be short most of the time, too. So, most of the time, a won game containing a high sac additionally gets EAS-points for shortness (if the engine sacs a rook or queen, for example, I heavily doubt that the game will go on very long after the sac is played).
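The additive scheme described in this post can be illustrated with a small Python sketch. The base values (`SAC_BASE`, `SHORT_BASE`) and the tier numbering are hypothetical and chosen only to show the factor-2 and factor-1.5 growth; they are not the real point values of the EAS-tool.

```python
# Hypothetical illustration of additive scoring with exponential tiers.
# SAC_BASE, SHORT_BASE and the tier numbering are assumptions for this
# sketch, not the actual values used by the EAS-tool.
SAC_BASE = 100.0    # assumed points for the lowest sac tier
SHORT_BASE = 50.0   # assumed points for the lowest shortness tier


def sac_points(tier: int) -> float:
    """Points for a sac; each higher tier doubles (factor 2). Tier 0 = no sac."""
    return 0.0 if tier <= 0 else SAC_BASE * 2 ** (tier - 1)


def short_points(tier: int) -> float:
    """Points for a short win; each higher tier grows by factor 1.5."""
    return 0.0 if tier <= 0 else SHORT_BASE * 1.5 ** (tier - 1)


def game_points(sac_tier: int, short_tier: int) -> float:
    """A won game collects sac points and shortness points, added together."""
    return sac_points(sac_tier) + short_points(short_tier)


if __name__ == "__main__":
    # Short win without sac, sac in a long game, sac in a short game.
    for sac_tier, short_tier in [(0, 4), (4, 0), (4, 4)]:
        print(sac_tier, short_tier, game_points(sac_tier, short_tier))
```

With these assumed base values, a tier-4 sac alone is worth far more than a tier-4 short win alone, and a game that has both collects the sum of the two.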
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Fri Mar 10, 2023 8:37 am
Queen-sac bug, not a Queen sac. Suggested fix: if Q sac detected, test that the Q-owning side did NOT promote a pawn.
I know, in very rare cases, the Queen-Sac detection can be wrong. As I mentioned before, the only way to detect sacs with pgn-extract is the material imbalance. But pgn-extract's piece/material-pattern search feature cannot search for pawn-promotions. So, I can not avoid this, sorry. I can only do what I am already doing: Looking for a material imbalance of (at least) 5 pawn-units (the losing color must have 5 pawn-units more than the winning color), and the losing color must have one more queen on the board (at the same time) than the winning color. Both must be true for 8 consecutive plies.
The piece patterns for queen sacs look like this (for white, playing the queen sac):
08 r*l*p3+ q1>=r=l=p*
08 r*l*p3+ q1>=r=l>p*
08 r*l*p3+ q1>=r>l*p*
08 r*l*p3+ q1>=r=l1<=p>
08 r*l*p3+ q1>=r=l1<=p=
08 r*l*p3+ q1>=r=l1<=p1<=
08 r*l*p3+ q1>=r=l2<=p2>
08 r*l*p3+ q1>=r=l3<=p5>
08 r*l*p3+ q1>=r1<=l>p>
08 r*l*p3+ q1>=r1<=l>p=
08 r*l*p3+ q1>=r1<=l>p1<=
08 r*l*p3+ q1>=r1<=l>p2<=
08 r*l*p3+ q1>=r1<=l=p>
08 r*l*p3+ q1>=r1<=l1<=p4>
It is clear that this search goes wrong if the losing color promotes a queen which is not captured in the next 8 plies, and the losing color has 5 more pawn-units on the board (than the winning color) for these 8 plies. But this case is very rare, because in most games where one side promotes a queen, keeps it, and has 5 pawn-units more on the board, this side will win the game. A (wrong) queen-sac detection can only happen if this side loses the game instead. But, of course, you are right: This can lead to some wrong queen-sac detections. I can not avoid this. I can only do what pgn-extract allows me to do.
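The rule described above, together with Chris Whittington's suggested promotion check, can be sketched in Python. This is only an illustration of the logic, not how the EAS-tool works (it uses pgn-extract patterns in batch). The `Snapshot` structure and the `loser_promoted` flag are hypothetical and assume that per-ply material counts and promotion information come from some other PGN parser.

```python
from dataclasses import dataclass
from typing import Sequence

MIN_IMBALANCE = 5   # pawn-units, as stated in the post
MIN_PLIES = 8       # consecutive plies, as stated in the post


@dataclass(frozen=True)
class Snapshot:
    """Material on the board after one ply (hypothetical structure)."""
    winner_units: int    # winner's material in pawn-units
    loser_units: int     # loser's material in pawn-units
    winner_queens: int
    loser_queens: int


def looks_like_queen_sac(plies: Sequence[Snapshot]) -> bool:
    """True if the imbalance condition holds for MIN_PLIES consecutive plies."""
    run = 0
    for s in plies:
        imbalance_ok = s.loser_units - s.winner_units >= MIN_IMBALANCE
        # The post says "one more queen"; this sketch accepts one or more.
        queen_ok = s.loser_queens - s.winner_queens >= 1
        run = run + 1 if (imbalance_ok and queen_ok) else 0
        if run >= MIN_PLIES:
            return True
    return False


def is_queen_sac(plies: Sequence[Snapshot], loser_promoted: bool) -> bool:
    """Suggested fix: reject the detection if the queen-owning side promoted."""
    return looks_like_queen_sac(plies) and not loser_promoted
```

A game where the losing side's extra queen came from a promotion passes `looks_like_queen_sac` but is rejected by `is_queen_sac`, which is the false positive discussed in this exchange.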
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Sat Mar 11, 2023 8:39 am
pohl4711 wrote:
Admin wrote:
BTW, speaking of Rebel-16.2 I noticed you did not test it, any particular reason?
Yes. You said, there is no change in engine or net compared to Rebel 16.1. And I made a small head_to_head of Rebel 16.1 vs 16.2 with some thousand games over night and the difference was below 3 Elo (using UHO-openings, so using balanced openings, the Elo-difference should be below 2 Elo...)
That's pretty odd, CEGT reported +23 over 16.1 on SD time control, CCRL +18 on (normal) 40/15 time control. 100% based on improvements in the time control code by Chris.
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Sat Mar 11, 2023 12:24 pm
Admin wrote:
pohl4711 wrote:
Admin wrote:
BTW, speaking of Rebel-16.2 I noticed you did not test it, any particular reason?
Yes. You said, there is no change in engine or net compared to Rebel 16.1. And I made a small head_to_head of Rebel 16.1 vs 16.2 with some thousand games over night and the difference was below 3 Elo (using UHO-openings, so using balanced openings, the Elo-difference should be below 2 Elo...)
That's pretty odd, CEGT reported +23 over 16.1 on SD time control, CCRL +18 on (normal) 40/15 time control. 100% based on improvements in the time control code by Chris.
No, the latest version CEGT had tested was Rebel 16.0, not 16.1. So, +23 Elo from 16.0 to 16.2 seems legit.
From their test-results-forum of Rebel 16.2: Performance = ca. ELO 3480 / 1800 games => +23 to v. 16.0NN (3457)
And, if we look at their list, with 3480 Elo, Rebel 16.2 is behind Rubichess and a bit ahead of Seer, near Igel, and this is exactly what Rebel 16.1 got in the SPCC-ratinglist... As I measured already: No measurable Elo-progress from 16.1 to 16.2, only from 16.0 to 16.2. So, if 16.1 was already tested (SPCC), there is no need for a testrun of Rebel 16.2. But CEGT never tested Rebel 16.1.
Same in CCRL 40/15: The latest Rebel they tested before 16.2 was Rebel 16.0...
fsanders
Posts : 15 Join date : 2023-01-10
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Tue Mar 21, 2023 12:49 pm
pohl4711 wrote:
1) It must be written in batch-language, because I use pgn-extract and some other external tools and will definitely not write my own pgn-parser...
Just for my understanding: because pgn-extract is a command line program, would it not be possible to start it from Python and read the output (just like the MEA tool does with UCI engines)?
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Tue Mar 21, 2023 12:58 pm
fsanders wrote:
Just for my understanding: because pgn-extract is a command line program, would it not be possible to start it from Python and read the output (just like the MEA tool does with UCI engines)?
I do not say that this is not possible. Perhaps it is. But keep in mind that the batch-language is very powerful and simple for reading, manipulating and writing text-files. And because of this, batch is perfect for using pgn-extract: pgn-extract operates only on pgn-files, and the pgn-extract output is always a pgn-file (and all pgn-files are just text-files). So, IMO, batch-language is the best choice for using pgn-extract... And there is no need for installing Python or compiling or other things like that: The (batch)-source-code of my tools can be read and rewritten by everybody who wants to do so, because batch is an interpreter-language. Speed is no problem, because all complex operations (searching for piece-patterns etc.) are done by pgn-extract, which has been very fast since its last update.
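For comparison, what fsanders asks about could look roughly like the following Python sketch. It is not part of the EAS-tool. It assumes that a `pgn-extract` executable is on the PATH, that `-s` (silent) and `-z<file>` (material-pattern file) are spelled this way in the installed version (check its help output), and that matching games are written to standard output. The file names are placeholders.

```python
import subprocess
from typing import List


def build_command(pgn_path: str, pattern_file: str,
                  exe: str = "pgn-extract") -> List[str]:
    """Build the pgn-extract call. Flag spellings are assumptions:
    -s = silent mode, -z<file> = file with material-balance patterns."""
    return [exe, "-s", f"-z{pattern_file}", pgn_path]


def count_games(pgn_text: str) -> int:
    """Count games in PGN text, assuming each game starts with an [Event tag."""
    return sum(1 for line in pgn_text.splitlines()
               if line.startswith("[Event "))


def find_matching_games(pgn_path: str, pattern_file: str) -> str:
    """Run pgn-extract and return the matching games as PGN text (stdout)."""
    result = subprocess.run(build_command(pgn_path, pattern_file),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder file names; requires pgn-extract to be installed.
    pgn = find_matching_games("games.pgn", "queen_sacs.txt")
    print(count_games(pgn), "games matched")
```

Because pgn-extract reads and writes plain PGN text, the Python side only has to start the process and read the text back, so the heavy pattern search stays inside pgn-extract either way.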
fsanders
Posts : 15 Join date : 2023-01-10
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Tue Mar 21, 2023 1:58 pm
I see. Looks like batch is the right choice for you and this topic.
TheSelfImprover
Posts : 3112 Join date : 2020-11-18
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Tue Mar 21, 2023 9:17 pm
In Windows, you've got the script host (WSH) which gives you good programming capabilities, and even simple popup windows. Among the scripting languages it offers are Rexx*, BASIC, Perl, Ruby, Tcl, PHP, JavaScript, Delphi, Python, XSLT. I use it every day because I find it "really useful". A WSH file can be used like a batch file.
*I used to love Rexx because it was my first scripting language, and I was amazed that it was on offer under WSH. However, I recognise that JavaScript is the nearest thing the world has to a script language standard, so I use that. It's only fair to warn you that there are a few differences between WSH JavaScript and standard JavaScript.
Chris Whittington
Posts : 1254 Join date : 2020-11-17 Location : France
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Fri Mar 31, 2023 1:30 pm
For Stefan Pohl: I’ve included a call to your EAS bat file as part of the SPSA tuner for CSTal. This means, at every few epochs, the tuner will output both an Elo performance and an EAS score. May I have your permission to include the EAS batch files etc as part of the SPSA tuner when it (the tuner) gets released on GitHub? Thanks.
Brendan likes this post
pohl4711
Posts : 160 Join date : 2022-03-01 Location : Berlin
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Sat Apr 01, 2023 8:07 am
Chris Whittington wrote:
For Stefan Pohl: I’ve included a call to your EAS bat file as part of the SPSA tuner for CSTal. This means, at every few epochs, the tuner will output both an Elo performance and an EAS score. May I have your permission to include the EAS batch files etc as part of the SPSA tuner when it (the tuner) gets released on GitHub? Thanks.
Of course! For me, it is an honor, that engine-developers use my little EAS-tool. Feel free to do whatever you want. And, by the way, I like the idea! Looking forward to your github-release!
Dann Corbit
Posts : 189 Join date : 2020-11-26
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Thu Apr 13, 2023 12:40 pm
Don't forget fun. I remember a Komodo queen sac in TCEC, and the crowd went wild. A pawn sac won't cause a stir like that.
Morphy moves and Tal moves make us swoon. Mikhail Botvinnik not so much.
Mclane and Brendan like this post
Dann Corbit
Posts : 189 Join date : 2020-11-26
Subject: Re: Engines Aggressiveness Score Ratinglist - Stefan Pohl's EAS Thu Apr 13, 2023 12:42 pm
Speaking of Mikhail Botvinnik, sure he wins. But who cares?