ProDeo
Computer Chess
 

 

 Enough is Enough

Peter Berger
Mclane
TheSelfImprover
Admin
Krisnatoonn
9 posters
texium




Posts : 119
Join date : 2022-07-19

PostSubject: Re: Enough is Enough, Sat Aug 24, 2024 10:57 am

The Patricia dev, noting criticism from some (frankly suspicious) people who wanted to use Patricia for sparring, added Skill Level and UCI_Elo features, and made its opening play slightly randomized and more aggressive, but only at reduced strength. Their complaint was that 3.0 wouldn't gambit in openings like 2.0 did; 2.0 was 3150 and 3.0 is 3270, so that's not surprising. But I digress: it's already possible.

Ghppn likes this post

pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Sat Aug 24, 2024 12:15 pm

Patricia 3.1 has working skill levels down to nearly beginner level. And this works with limited nodes, not with artificial errors. So, everybody can play vs Patricia 3.1

Mclane and Ghppn like this post

https://www.sp-cc.de
Mclane

Mclane


Posts : 3011
Join date : 2020-11-17
Age : 57
Location : United States of Europe, Germany, Ruhr area

PostSubject: Re: Enough is Enough, Sat Aug 24, 2024 12:19 pm

Then the question is: how do we get it into

DGT Pi
Pewatronic Grandmaster
Mephisto Phoenix

?
http://www.thorstenczub.de
texium




Posts : 119
Join date : 2022-07-19

PostSubject: Re: Enough is Enough, Sat Aug 24, 2024 5:16 pm

pohl4711 wrote:
Patricia 3.1 has working skill levels down to nearly beginner level. And this works with limited nodes, not with artificial errors. So, everybody can play vs Patricia 3.1
from the release notes for Patricia 3.1: "Completely overhauled the strength limiting system and implemented more style when limited in strength. Patricia 3.1's skill levels are much better calibrated than Patricia 3's, meaning people can actually beat "1500 Elo" Patricia without too much difficulty unlike before. Additionally, Patricia 3.1 plays slightly randomized openings and sacrifices much more in the opening when her strength is limited for play against a human."

Mclane, TheSelfImprover and Ghppn like this post

Admin
Admin
Admin


Posts : 2571
Join date : 2020-11-17
Location : Netherlands

PostSubject: Re: Enough is Enough, Sat Aug 24, 2024 6:30 pm

Patricia is brilliant.

Mclane likes this post

http://rebel13.nl/
texium




Posts : 119
Join date : 2022-07-19

PostSubject: Re: Enough is Enough, Sun Aug 25, 2024 2:49 am

i think it's cool, I just prefer little Goliath, it's really sacrificial too

Ghppn likes this post

pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Sun Aug 25, 2024 6:26 am

Mclane wrote:
Then the question is: how do we get it into

DGT Pi
Pewatronic Grandmaster
Mephisto Phoenix

?

Doesn't the DGT Pi use Linux? Patricia 3.1 offers Linux binaries on its GitHub site. The Pewatronic Grandmaster uses Linux, too. The Phoenix uses a Raspberry Pi, so it should use Linux as its OS as well.
So shouldn't Patricia run on the DGT Pi, the Pewatronic Grandmaster and the Phoenix?
I do not own these devices, so I cannot try it myself.

But Patricia definitely runs fine on Android. So Patricia should definitely work on all electronic chessboards that use an Android device.


Last edited by pohl4711 on Sun Aug 25, 2024 7:17 am; edited 1 time in total

Ghppn likes this post

https://www.sp-cc.de
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Sun Aug 25, 2024 6:39 am

Admin wrote:
Patricia is brilliant.

And for some weeks now, the author of Obsidian has been helping A. Kulju (Patricia's author) improve Patricia. On talkchess, A. Kulju mentioned a huge Elo gain for the Patricia dev version:
"The dev version of Patricia is now 100 ELO stronger than Patricia 3.1. Collaborating with the Obsidian dev has been busy but very fruitful, he just threw himself at working on it lol. I really appreciate his help!"

Ghppn likes this post

https://www.sp-cc.de
Mclane

Mclane


Posts : 3011
Join date : 2020-11-17
Age : 57
Location : United States of Europe, Germany, Ruhr area

PostSubject: Re: Enough is Enough, Sun Aug 25, 2024 9:08 am

I cannot get the reduced strength to work. I tried 2000 Elo;
that would be limiting strength to 12, I think.
But it played too strong on my Android devices.

I also do not understand why there are 2 methods. One is UCI limit strength, where you can enter the Elo directly.

And when I entered the Elo directly, it looked like I got multiple variations instead of 1 main line.

I tried it out under Droidfish and now with Aart Bik's Chess.
But it computed far too many NPS, and the search depth was also too deep for 2000 Elo.

Maybe the feature is broken in the Android version?

Ghppn likes this post

http://www.thorstenczub.de
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Sun Aug 25, 2024 9:16 am

Mclane wrote:
I cannot get the reduced strength to work. I tried 2000 Elo;
that would be limiting strength to 12, I think.
But it played too strong on my Android devices.

I also do not understand why there are 2 methods. One is UCI limit strength, where you can enter the Elo directly.

And when I entered the Elo directly, it looked like I got multiple variations instead of 1 main line.

I tried it out under Droidfish and now with Aart Bik's Chess.
But it computed far too many NPS, and the search depth was also too deep for 2000 Elo.

Maybe the feature is broken in the Android version?

Use the Skill Level parameter, not UCI_LimitStrength. But nobody said that the Elo values the author gave for the skill levels really match "reality". You have to try out which skill level fits...

The Skill Level option was added because some GUIs (the Fritz GUI, for example) do not show UCI parameters like UCI_Elo. And if UCI_LimitStrength is not set to true, any UCI_Elo setting has no effect. None of these problems appear when using Skill Level instead.
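
To make the two methods concrete, here is a minimal Python sketch of the UCI `setoption` lines a GUI would send for each of them. `UCI_LimitStrength` and `UCI_Elo` are standard UCI option names; the spelling `Skill Level` is taken from this thread and is an assumption about how Patricia names the option. The function name `limit_strength_commands` is purely illustrative.

```python
def limit_strength_commands(elo=None, skill_level=None):
    """Return the UCI 'setoption' lines for one of the two strength-limiting methods.

    Exactly one of elo / skill_level must be given.
    The option name "Skill Level" is assumed from the forum discussion.
    """
    if (elo is None) == (skill_level is None):
        raise ValueError("pass exactly one of elo or skill_level")
    if skill_level is not None:
        # Method 1: a single option, no second switch needed.
        return [f"setoption name Skill Level value {skill_level}"]
    # Method 2: UCI_Elo only takes effect once UCI_LimitStrength is true.
    return [
        "setoption name UCI_LimitStrength value true",
        f"setoption name UCI_Elo value {elo}",
    ]


if __name__ == "__main__":
    for line in limit_strength_commands(elo=1800):
        print(line)
```

The Elo method needs two commands, which is exactly the pitfall described above: sending only `UCI_Elo` without enabling `UCI_LimitStrength` changes nothing.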

Ghppn likes this post

https://www.sp-cc.de
Mclane

Mclane


Posts : 3011
Join date : 2020-11-17
Age : 57
Location : United States of Europe, Germany, Ruhr area

PostSubject: Re: Enough is Enough, Sun Aug 25, 2024 9:39 am

Ok I will experiment with skill level

Ghppn likes this post

http://www.thorstenczub.de
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Sun Aug 25, 2024 10:40 am

Mclane wrote:
Ok I will experiment with skill level

In Droidfish it is really strange. Skill Level does not seem to work, but when I activate LimitStrength and set the Elo down to 500, Patricia definitely plays very weakly. You can see that the Elo limit is working when Droidfish switches to MultiPV while thinking. That is normal: the author of Patricia used MultiPV to reduce strength.

In cutechess on Windows, Skill Level is working fine.

Really strange.

Mclane and Ghppn like this post

https://www.sp-cc.de
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Mon Aug 26, 2024 7:12 am

pohl4711 wrote:
Mclane wrote:
Ok I will experiment with skill level

In Droidfish it is really strange. Skill Level does not seem to work, but when I activate LimitStrength and set the Elo down to 500, Patricia definitely plays very weakly. You can see that the Elo limit is working when Droidfish switches to MultiPV while thinking. That is normal: the author of Patricia used MultiPV to reduce strength.

In cutechess on Windows, Skill Level is working fine.

Really strange.


I played some quick games of Patricia 3.1 in Droidfish (short thinking time: 15sec+5sec = 5sec per move), single-threaded, vs. Mephisto Milano (15 sec/move). With Elo set to 1800 Patricia lost, and with Elo 1950 Patricia won. So the Elo limit seems quite "realistic" when using short thinking times for Patricia.

Mclane and Ghppn like this post

https://www.sp-cc.de
Admin
Admin
Admin


Posts : 2571
Join date : 2020-11-17
Location : Netherlands

PostSubject: Re: Enough is Enough, Tue Sep 10, 2024 7:43 pm

Admin wrote:
I have an EAS 220.xxx version, but it loses 50 elo.

Made some progress, EAS fluctuating between 180.xxx - 200.xxx at zero elo loss.

Stopped using Leela data since it has no EAS future; as with the first Rebel NNUE, I started to generate my own data. Currently I have 850M positions; first result: EAS 260.xxx. But I need at least 10B to be a bit competitive, an estimated 3-4 (boring) months.

C'est la vie.

There is no elo alternative left since the new trend in the pressure cooker of discord at the cost of originality.

Eelco likes this post

http://rebel13.nl/
Admin
Admin
Admin


Posts : 2571
Join date : 2020-11-17
Location : Netherlands

PostSubject: Re: Enough is Enough, Tue Sep 10, 2024 8:04 pm

https://drive.google.com/file/d/17xJLciI3fE6K_2fyCQU--45mi72M0lKE/view?usp=drive_link

68 sac games (at least 5 pawns) from the latest training session, generated with Stefan's EAS tool.
http://rebel13.nl/
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Wed Sep 11, 2024 6:04 am

Admin wrote:

Made some progress, EAS fluctuating between 180.xxx - 200.xxx at zero elo loss.

Sounds very promising. An EAS score of around 200k already means a very nice playing style (you can look at my Super 3 tournament, in which Revenge 1 participates; Revenge 1 has an EAS score of around 200k, like the new Velvet 8 with its Risky net, by the way).
https://www.sp-cc.de/super3_tournament.htm

Here the EAS-scoring of the Super 3 tournament:
Code:

                                 bad  avg.win
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player
-------------------------------------------------------------------
   1    160509  26.27%  17.43%  12.94%   72   Revenge 1.0 avx2  
   2     76863  11.35%  16.18%  27.76%   71   Komodo 14.1 HCE  
   3     53654  08.29%  12.49%  22.97%   73   Lc0 791921 CPU  


I suggest looking at the sac stats, too. As you can see in my full UHO/EAS ratinglist, Velvet 8 Risky and Revenge 1 are at the same EAS score level as Stockfish, but their sac stats are much better than Stockfish's. Stockfish gains EAS points through its higher number of short wins (compared to Velvet 8 Risky) and its lower number of bad draws (compared to Revenge 1):
Code:

                                bad  avg.win
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player
-------------------------------------------------------------------
   1    429481  51.61%  38.03%  05.49%   66   Patricia 3.1 avx2  
   2    384942  52.55%  35.77%  05.19%   69   Patricia 3.0 avx2  
   3    201351  34.75%  22.22%  10.27%   73   Velvet 8.0.0 risky  
   4    197919  31.18%  29.46%  17.09%   71   Revenge 1.0 avx2  
   5    192881  21.25%  26.18%  09.40%   70   Stockfish 16 230630  
   6    191003  22.01%  24.79%  08.64%   70   Stockfish 16.1 240224  


Admin wrote:

Stopped using Leela data since it has no EAS future; as with the first Rebel NNUE, I started to generate my own data. Currently I have 850M positions; first result: EAS 260.xxx. But I need at least 10B to be a bit competitive, an estimated 3-4 (boring) months.

Lc0 does not play many sacs, which is not really surprising IMHO, considering that Lc0 runs very slowly (compared to the other engines) and uses MCTS search. Every time I filter the engine games from chesscom, most of the sac games found have Lc0 only as the losing opponent... (or look at the EAS score of Lc0 CPU in my Super 3 tournament, at the start of this posting: 8.29% sacs, really a bad value)

Admin wrote:

There is no elo alternative left since the new trend in the pressure cooker of discord at the cost of originality.

Isn't it much more fascinating to make an engine gain EAS points instead of gaining more and more Elo, in these days of so many 3500+ Elo engines? IMHO it is.
My hope is that my EAS-Tool can and will lead to a paradigm shift in engine development, as I already mentioned on my Patricia subsite:

"Since the 1950s, the only goal of computerchess was gaining Elo. But in these days of superstrong engines beyond 3700+ Elo, IMHO it makes a lot of sense, to make engines playing more spectacular, aggressive and interesting, instead of just gaining more and more Elo... Since my EAS-Tool was made, it is possible for the first time, to measure the aggressiveness of engines. And so, using the EAS-Tool, to make an engine playing more aggressive, is the next logical step of development in computerchess."
https://www.sp-cc.de
Admin
Admin
Admin


Posts : 2571
Join date : 2020-11-17
Location : Netherlands

PostSubject: Re: Enough is Enough, Thu Sep 12, 2024 8:56 am

A few notes about EAS comparison

1. I play cutechess from PGN. I suppose that if I play from EPD, games are automatically shorter and get a higher EAS score. Is that true?

2. I use tc=40/10 for training games, I am pretty sure with your TC my EAS numbers will be lower.

3. I use "-resign movecount=3 score=999", I believe (not sure) you play till mate, meaning my training games are shorter and get a higher EAS than yours.

4. Most important, recently someone said "why bad draws, it has nothing to do with aggressiveness", I think that someone was right.

5. There are alternatives for a third EAS component but these can't be done by pgn-extract. I can think of 1) a bonus for attacking (chasing) enemy pieces (not pawns) and 2) bonus for king attacks. Unfortunately that needs a C++ or Python application. Easy for me to write.
http://rebel13.nl/
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Fri Sep 13, 2024 7:43 am

Admin wrote:
A few notes about EAS comparison

1. I play cutechess from PGN. I suppose that if I play from EPD, games are automatically shorter and get a higher EAS score. Is that true?

2. I use tc=40/10 for training games, I am pretty sure with your TC my EAS numbers will be lower.

3. I use "-resign movecount=3 score=999", I believe (not sure) you play till mate, meaning my training games are shorter and get a higher EAS than yours.

4. Most important, recently someone said "why bad draws, it has nothing to do with aggressiveness", I think that someone was right.

5. There are alternatives for a third EAS component but these can't be done by pgn-extract. I can think of 1) a bonus for attacking (chasing) enemy pieces (not pawns) and 2) bonus for king attacks. Unfortunately that needs a C++ or Python application. Easy for me to write.

1 and 3: The EAS-Tool gives points for short wins relative to the average length of all won games in the PGN database. So when starting from EPD, all won games should be shorter, and the short-win EAS points should therefore not be higher than for a gamebase starting from move 1 instead of from a FEN code (EPD). The same holds for resign rules, if they are used in all games in a PGN.

2: Yes, games with a longer TC (and/or better hardware) should lower the EAS scores a bit.

4: I cannot agree here. Every bad draw is a missed opportunity to win a game. That is the opposite of aggressive play (aggressive play means playing for a win in the first place, not a draw). So bad draws (draws in the middlegame or draws after gaining material) have a lot to do with aggressiveness, IMHO.

5: If it cannot be done by pgn-extract, then I am too stupid to do it. And I am afraid your ideas (which are great) could lead to a huge slowdown of the EAS-Tool (its high speed is very important: calculating my full UHO ratinglist with the EAS-Tool already takes nearly 3 hours right now, for more than 1 million games and 85 engines!). But you can do whatever you want with my EAS-Tool, that is all fine with me. Keep in mind, though, that it took me around 6 months to find a good balance between the different EAS point categories for the final EAS score (it was a nightmare!). If you add new components to the EAS-Tool, this nightmare would start all over again...
https://www.sp-cc.de
Admin
Admin
Admin


Posts : 2571
Join date : 2020-11-17
Location : Netherlands

PostSubject: Re: Enough is Enough, Sat Sep 14, 2024 5:14 pm

I understand.

Having said that, looking with amazement at what you have established with batch files I can assure you you would be a great programmer.

Having said that (2), I started to set up a simple and quick frame that could serve as EAS in C++, to begin with the Engine Shorties List function.

Code:
   Engine Shorties List                            
                                                  
   PGN database      : pgn\uho_ratinglist_games.pgn
   Get won games     : 61.800                    
                                                  
   Engine                  Won Loss   Perc  Score  
   Stockfish 231107 av    7439  614  92.4%    224  
   Stockfish 16 230630    7199  701  91.1%    217  
   Uralochka 3.40a avx    2122 5770  26.9%    186  
   Torch 1 popavx2        6527 1368  82.7%    182  
   KomodoDragon 3.3 av    5846 1924  75.2%    172  
   Revenge 3.0 avx2       1941 5767  25.2%    158  
   Rebel EAS avx2         2263 5371  29.6%    156  
   Caissa 1.14 avx2       3153 4442  41.5%    143  
   Seer 2.7.0 avx2        2245 5616  28.6%    142  
   CSTal 2.0 avx2         2939 4564  39.2%    141  
   Clover 6 avx2          2239 5336  29.6%    137  
   Koivisto 9.2 avx2      2599 5006  34.2%    129  
   RubiChess 230918 av    3514 4009  46.7%    123  
   Ethereal 14.25 nnue    4112 3367  55.0%    118  
   RofChade 3.1 avx2      2291 5583  29.1%    110  
   Berserk 12 avx2        5377 2368  69.4%    109

Currently the points are a brew of my own. Maybe you can help and show me the 3 places in your code where I can disable each function (sacs, shorties, bad draws), so I can easily check whether my C++ code reproduces your results 100%.

Speed is not an issue, the pgn I used is an old rating list of yours of 120.000 games, it took about 30 seconds to produce the above.

Mclane likes this post

http://rebel13.nl/
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Sun Sep 15, 2024 6:04 am

Admin wrote:
I understand.

Having said that, looking with amazement at what you have established with batch files I can assure you you would be a great programmer.

Having said that (2), I started to set up a simple and quick frame that could serve as EAS in C++, to begin with the Engine Shorties List function.

Code:
   Engine Shorties List                            
                                                  
   PGN database      : pgn\uho_ratinglist_games.pgn
   Get won games     : 61.800                    
                                                  
   Engine                  Won Loss   Perc  Score  
   Stockfish 231107 av    7439  614  92.4%    224  
   Stockfish 16 230630    7199  701  91.1%    217  
   Uralochka 3.40a avx    2122 5770  26.9%    186  
   Torch 1 popavx2        6527 1368  82.7%    182  
   KomodoDragon 3.3 av    5846 1924  75.2%    172  
   Revenge 3.0 avx2       1941 5767  25.2%    158  
   Rebel EAS avx2         2263 5371  29.6%    156  
   Caissa 1.14 avx2       3153 4442  41.5%    143  
   Seer 2.7.0 avx2        2245 5616  28.6%    142  
   CSTal 2.0 avx2         2939 4564  39.2%    141  
   Clover 6 avx2          2239 5336  29.6%    137  
   Koivisto 9.2 avx2      2599 5006  34.2%    129  
   RubiChess 230918 av    3514 4009  46.7%    123  
   Ethereal 14.25 nnue    4112 3367  55.0%    118  
   RofChade 3.1 avx2      2291 5583  29.1%    110  
   Berserk 12 avx2        5377 2368  69.4%    109

Currently the points are a brew of my own. Maybe you can help and show me the 3 places in your code where I can disable each function (sacs, shorties, bad draws), so I can easily check whether my C++ code reproduces your results 100%.

Speed is not an issue, the pgn I used is an old rating list of yours of 120.000 games, it took about 30 seconds to produce the above.

My pleasure.
Lines in the EAS-Tool (latest version 5.7):

177-218: bad draws (find them and calculate the EAS points)

220-300: short wins. Note: in the code, some of the short-win variables are named after move counts from 40 up to 60, but the short-win categories are no longer hardcoded in the later EAS-Tool versions. There are now 5 variables, sh_level1 up to sh_level5. Where sh_level1 was 60 moves when this limit was hardcoded, it now depends on the overall length of all won games. These variables are set early in the code, on line 71:
REM *** Calculate movelimit for short wins, depending on the average won game length of
REM *** all games of the source.pgn file
set /A shortwin_movelimit=(%avg_length_all_wins%/5)*5
set /A shortwin_movelimit-=15
if %shortwin_movelimit% LSS 30 set /A shortwin_movelimit=30
if %shortwin_movelimit% GTR 95 set /A shortwin_movelimit=95
set /A sh_level1=%shortwin_movelimit%
set /A sh_level2=%shortwin_movelimit%-5
set /A sh_level3=%shortwin_movelimit%-10
set /A sh_level4=%shortwin_movelimit%-15
set /A sh_level5=%shortwin_movelimit%-20

303-428: finding the sacs and calculating the EAS-points
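
For anyone porting the tool, the short-win limit calculation from the batch lines quoted above (line 71 of the tool) can be mirrored in a few lines of Python. This is only a sketch of those quoted lines, not of the whole short-win scoring; the function name `shortwin_levels` is hypothetical, and the input is assumed to be the integer average length (in moves) of all won games.

```python
def shortwin_levels(avg_length_all_wins):
    """Mirror the quoted batch arithmetic for sh_level1 .. sh_level5.

    avg_length_all_wins: integer average move count of all won games.
    Returns [sh_level1, sh_level2, sh_level3, sh_level4, sh_level5].
    """
    # set /A shortwin_movelimit=(avg/5)*5 : round down to a multiple of 5
    limit = (avg_length_all_wins // 5) * 5
    # set /A shortwin_movelimit-=15
    limit -= 15
    # clamp to the range 30..95, like the two "if ... LSS/GTR" lines
    limit = max(30, min(95, limit))
    # each further level is 5 moves shorter than the previous one
    return [limit - 5 * i for i in range(5)]


if __name__ == "__main__":
    print(shortwin_levels(72))   # [55, 50, 45, 40, 35]
    print(shortwin_levels(40))   # [30, 25, 20, 15, 10]
    print(shortwin_levels(150))  # [95, 90, 85, 80, 75]
```

With an average won-game length of 75 to 79 moves, `sh_level1` comes out as 60, which matches the value that was hardcoded in earlier versions.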

Note that in the bad-draw search and the sac search, pgn-extract uses the piece-pattern files in the bin folder (1_pawnsac_white up to 5_pawnsac_black); this is where the "magic happens". All other code in my tool is very simple.
But doing this in a real language like C++ would make it much easier: you would not need so many piece patterns. You just have to count the material in pawn units after each ply and check whether the winning color has fewer pawn units on the board for 8 consecutive plies. This would work even better than my solution, because in pgn-extract there are several different piece patterns with the same pawn-unit deficit (or surplus) for one color, and switching between them resets the counter of consecutive plies, which can lead to some sacs being overlooked... I have no way to avoid this, because of pgn-extract. A pure "pawn-unit counting" solution in C++ would avoid this problem. The exception is the search for queen sacs, of course: there you have to look specifically for a missing queen (possible source of bugs: promotion of a pawn to a new queen, or 2 queens of one color (rare, but possible)).
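
The pawn-unit counting idea described above could look roughly like this in Python. It is a sketch under stated assumptions, not the EAS-Tool's method: the piece values (P=1, N=B=3, R=5, Q=9) are the conventional ones and not taken from the tool, the names `pawn_units` and `has_sacrifice` are hypothetical, and producing the position after each ply (which needs a move generator or PGN library) is left out. The special handling of queen sacs is not covered.

```python
# Conventional piece values in pawn units (an assumption, not taken from the EAS-Tool).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def pawn_units(fen_placement):
    """Return (white, black) material in pawn units for a FEN piece-placement field."""
    white = black = 0
    for ch in fen_placement:
        value = PIECE_VALUES.get(ch.lower(), 0)  # digits, '/' and kings count as 0
        if ch.isupper():
            white += value
        else:
            black += value
    return white, black


def has_sacrifice(balance_per_ply, min_deficit=1, min_plies=8):
    """Detect a sacrifice from the eventual winner's point of view.

    balance_per_ply: winner's material minus loser's material (pawn units),
    one entry per ply. Returns True if the winner is at least min_deficit
    pawn units down for min_plies consecutive plies. The run is NOT reset
    when the size of the deficit changes, only when it drops below
    min_deficit.
    """
    run = 0
    for balance in balance_per_ply:
        if balance <= -min_deficit:
            run += 1
            if run >= min_plies:
                return True
        else:
            run = 0
    return False


if __name__ == "__main__":
    start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
    print(pawn_units(start))  # (39, 39)
    # Deficit changes from 1 to 2 and back, but stays >= 1 for 8 plies.
    print(has_sacrifice([0, 0, -1, -1, -2, -2, -1, -1, -1, -1, 0]))  # True
```

Calling `has_sacrifice` with increasing `min_deficit` only on games that already passed the lower threshold keeps the number of games to re-check small.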

Note: for a high-speed tool it is important to filter for higher sacs only within the games already found with lower sacs, rather than filtering all games over and over again to find higher and higher sacs. That would make the tool much slower...

In a C++ program, you could try to check whether a game ended with a crash, disconnect, time loss or illegal move (problem here: different GUIs write different strings into the PGN for these abnormal game endings). If a game ends with a time loss or similar, the sac search in this game will fail, because my basic idea for finding sacs without any eval is that engines do not blunder, do not lose on time, etc.; a sac is then simply less material (for 8 consecutive plies) for the winning color of a game. I cannot do this (fast) with pgn-extract. But if you can, these games should simply be ignored for EAS scoring.
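
A minimal Python sketch of such a filter is shown below. It assumes the GUI records the reason in the PGN `Termination` tag; as noted above, GUIs differ in what they write (and some may only add a comment after the last move), so the keyword list `ABNORMAL_KEYWORDS` is an illustrative guess and would have to be adapted to the actual gamebase. The helper names are hypothetical.

```python
import re

# Matches one PGN tag pair, e.g. [Termination "time forfeit"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

# Illustrative substrings only; real GUIs use different wordings.
ABNORMAL_KEYWORDS = ("time", "illegal", "disconnect", "stall", "crash", "abandon")


def parse_tags(game_text):
    """Collect the tag pairs of a single PGN game into a dict."""
    tags = {}
    for line in game_text.splitlines():
        match = TAG_RE.match(line.strip())
        if match:
            tags[match.group(1)] = match.group(2)
    return tags


def ended_abnormally(game_text):
    """True if the Termination tag hints at a time loss, crash, etc."""
    termination = parse_tags(game_text).get("Termination", "").lower()
    return any(keyword in termination for keyword in ABNORMAL_KEYWORDS)


if __name__ == "__main__":
    game = '[White "A"]\n[Black "B"]\n[Result "1-0"]\n[Termination "time forfeit"]\n\n1. e4 e5 1-0\n'
    print(ended_abnormally(game))  # True
```

Games for which `ended_abnormally` returns True would simply be skipped before any sac or short-win scoring.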

You see, an EAS tool rewritten in C++ has a lot of room for improvement compared to my stupid batch-file/pgn-extract solution... Because I am not able to write a brutally fast PGN parser myself, I had to use pgn-extract.

Speed (for comparison): my EAS-Tool takes around 10 minutes to process my 120000-game UHO-Top15 ratinglist gamebase with 16 engines (the more engines in the PGN gamebase, the longer the EAS-Tool needs for evaluation) on a fast CPU core running at around 4 GHz. This includes building the statistics output files and the PGN files containing the sacs and the short wins.

Mclane likes this post

https://www.sp-cc.de
Peter Berger




Posts : 130
Join date : 2020-11-20

PostSubject: Re: Enough is Enough, Mon Sep 16, 2024 1:49 pm

pohl4711 wrote:
Admin wrote:
A few notes about EAS comparison
4: I cannot agree here. Every bad draw is a missed opportunity to win a game. That is the opposite of aggressive play (aggressive play means playing for a win in the first place, not a draw). So bad draws (draws in the middlegame or draws after gaining material) have a lot to do with aggressiveness, IMHO.
I disagree heavily. Bad draws is just nonsense.
Also all these features are not independent from each other. I basically agree with "very short wins" and "most short wins overall", maybe even "average length of all won games".
"Most high-value sacrifices" is basically pure tactically ability, so useless but not too harmful.
"Smallest number of bad draws" helps the engines that are strong but not aggressive at all. Think about how you even reach these positions. It is obvious, that there are several positions with someone being a pawn up, that are just boring draws - I'd expect an aggressive player to avoid them.
pohl4711

pohl4711


Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough, Tue Sep 17, 2024 7:32 am

Peter Berger wrote:

"Most high-value sacrifices" is basically pure tactically ability, so useless but not too harmful.
"Smallest number of bad draws" helps the engines that are strong but not aggressive at all... Bad draws is just nonsense.

It is very easy to prove both points wrong.

1) "Most high-value sacrifices is basically pure tactically ability, so useless"

First we take a look at Komodo 14.1 (an HCE engine without any net) and Komodo 14.1 aggressive (just set the UCI option for playing style to aggressive; all other UCI parameters stay at default) in the EAS ratinglist of my abandoned SPCC ratinglist (https://www.sp-cc.de/files/spcc_full_list.txt):

Code:

                                 bad  avg.win
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player
-------------------------------------------------------------------
2        262203  37.03%  37.57%  08.50%   68   Komodo 14.1 aggress. (high sacs = 6.78%)
181       65015  15.80%  10.10%  21.30%   79   Komodo 14.1 x64  (high sacs = 1.50%)

Note that Komodo 14.1 aggressive has a rating of 3334 and Komodo 14.1 has 3472 (a difference of 138 Celo!).

So if high-value sacs are purely tactical ability, why does the 138 Celo weaker engine version have 4.5x more high sacs? Especially given that both are the same engine, with just one parameter changed?

Secondly, we look at Patricia and Revenge 1 in my full EAS ratinglist (https://www.sp-cc.de/files/uho_full_list.txt) and at a super strong engine (Stockfish 16.1).

Here are the Celo ratings:
Code:

   3 Stockfish 16.1 240224     : 3811    2    2 56000    72.2%   3636   47.2%
  81 Stockfish final HCE       : 3476    4    4 16000    40.3%   3547   37.6%
  83 Revenge 1.0 avx2          : 3362    5    5 15000    18.3%   3634   30.3%
  84 Patricia 3.1 avx2         : 3215    6    6 14000    14.7%   3526   21.1%
  85 Patricia 3.0 avx2         : 3212    6    6 14000    14.5%   3526   21.2%

and we compare this with the EAS-ratinglist:

Code:

                                 bad  avg.win
Rank  EAS-Score  sacs   shorts  draws  moves  Engine/player
-------------------------------------------------------------------
   1    429481  51.61%  38.03%  05.49%   66   Patricia 3.1 avx2  (high sacs = 18.00%)
   2    384942  52.55%  35.77%  05.19%   69   Patricia 3.0 avx2  (high sacs = 19.16%)
   4    197919  31.18%  29.46%  17.09%   71   Revenge 1.0 avx2 (high sacs = 5.38%) 
   6    191003  22.01%  24.79%  08.64%   70   Stockfish 16.1 240224  (high sacs = 3.74%)
  15    138735  11.40%  40.08%  26.11%   67   Stockfish final HCE (high sacs = 1.40%)

So if you were correct and high sacs were just tactical ability, why do these 3 weak engines, which are 450 Celo or more weaker than Stockfish 16.1 (Patricia is nearly 600 Celo weaker!!!), have a much better high-sac value? And a higher value for all sacs, too? And why does the tactically super strong Stockfish final HCE (no net) have a high-sac value of just 1.4% and overall sacs of just 11.4%??? (Nobody can be so stupid as to doubt the tactical strength of this node-crunching engine, Stockfish final HCE.)

So point 1 is obviously false. QED


2) "Smallest number of bad draws helps the engines that are strong but not aggressive at all."..."Bad draws is just nonsense"

To prove this wrong, we just look above and see that Patricia has 5.49% and 5.19% bad draws, while Stockfish 16.1, which is 596 Celo stronger, has 8.64% bad draws.

Another interesting point here is Stockfish final HCE, the only non-neural-net engine in my full UHO ratinglist: it has an enormously high bad-draw value (26.11%), which is 3x higher than Stockfish 16.1's and around 5x higher than that of the (much weaker!) Patricia engine. Why is that? This is exactly the point where it gets interesting (and what even Ed sadly does not understand): a non-neural-net engine has much less positional understanding. The neural-net engines understand that in many endgames just one extra pawn is not enough for a win. So they try to avoid these endgames (one possible way is to avoid captures, i.e. to avoid heading towards the endgame, until a second pawn has been won). Stockfish final HCE does not have this understanding, so it believes it is winning when it is one pawn up, no matter whether it is an endgame or not. So it has a much higher number of bad draws, even though Stockfish final HCE is a tactical monster and +114 Celo stronger than Revenge 1 and +261 Celo stronger than Patricia...

So point 2 is obviously false. QED


And finally, if my EAS-Tool uses "useless" and "nonsense" parameters to calculate the EAS scores, why is Patricia such an incredibly aggressive engine? On my website, you can replay some nice sac wins by Patricia:
https://www.sp-cc.de/patricia_eas_engine.htm

The Patricia author used my EAS-Tool to make Patricia play this way:
"The metric that Patricia's aggression claims are based off of is Stefan Pohl's EAS-Tool, which is the most well known and well regarded tool for determining the aggressiveness of chess engines. It looks at a combination of factors, such as sacrifice rate, short win rate, and unnecessary draw rate, and outputs a score that captures how "exciting" an engine tends to play. A huge shoutout to Stefan Pohl. His EAS Tool works wonderfully, makes properly and objectively testing for increase aggression possible, and is the measure by which Patricia development progressed."

So, if my EAS-Tool uses "useless" and "nonsense" parameters, how could Patricia play such beautiful and aggressive chess??? (The UHO opening ended with move 6; from move 7 on, the engines were thinking for themselves.)

[pgn]
[Event "UHO Ratinglist"]
[White "Patricia 3.1 avx2"]
[Black "Seer 2.6.0 avx2"]
[Site ""]
[Round "310"]
[Result "1-0"]
[Date "2024.08.09"]
[PlyCount "63"]
[TimeControl "180+1"]

1. d4 Nf6 2. c4 c5 3. d5 e6 4. Nf3 exd5 5. cxd5 Bd6 6. Bg5 h6 7. Bh4 O-O 8. Nc3 Re8 9. e3 Bf8 10. d6 Re6 11. Bg3 Nh5 12. Bc4 Nxg3 13. Bxe6 Nxh1 14. Bxf7+ Kxf7 15. Qd5+ Kg6 16. Qg8 Nxf2 17. Ne5+ Kf6 18. Nf7 Qe8 19. Nd5+ Kg6 20. Nh8+ Kg5 21. Kxf2 Qe6 22. Nf7+ Kf5 23. Qh7+ g6 24. Ke2 Qe5 25. Nxe5 Nc6 26. Nxg6 Ke6 27. Nc7+ Kxd6 28. Ne8+ Ke6 29. Qg8+ Kf5 30. Qf7+ Kg5 31. h4+ Kg4 32. Qf3# 1-0
[/pgn]

Not so bad for an engine, developed by using a tool which evaluates useless, nonsense parameters, isn't it?


And the good news is: my EAS-Tool is completely free and open source. Because it is a batch tool, not even a compiler is needed. Just open the code in a text editor and make a better EAS-Tool. Can't wait to see it!
Calling the work of others useless and nonsense is way easier than doing it better, I am afraid...
https://www.sp-cc.de

pohl4711

Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough   Tue Sep 17, 2024 8:48 am

pohl4711 wrote:

And the good news is: My EAS-Tool is completely free and open source. Because it is a batch-tool, not even a compiler or so is needed. Just use a text-editor, open the code and make a better EAS-tool.

Most of the code of the EAS-Tool is very simple. But the percent calculations are not, because the batch language only supports integer variables, no floating point. So building percent values with 2 decimal places (12.34%, for example) is not easy. And (second problem) such a value cannot be used for any further calculations. So the percent subroutine gives back 2 values (global variables):
percent, which is only for printing the percent values in the output statistics. It is a string (!) ("12.34%").
percentx100, which is the percent value x100 (1234 in this example) as an integer. This is used for the EAS calculations.

Here is the subroutine:

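:percent
REM *** Arguments: 1 = base value (total), 2 = part
REM *** Results: global variables percent (display string) and percentx100 (integer)
REM *** Guard against division by zero
if %~1 LEQ 0 (
 set percent=00.00%%
 set /A percentx100=0
 EXIT /B 0
)
REM *** Divide first, then multiply, so the value stays inside the 32-bit integer range of set /A
set /A l_base=((1000000000/%~1)*%~2)
REM *** l_percent = whole percent, l_rest1 to l_rest3 = 1st to 3rd decimal digit
set /A l_percent=%l_base%/1000000
set /A l_rest1=%l_percent%%%10
set /A l_percent=%l_percent%/10
set /A l_secdig_pc=%l_base%/100000
set /A l_rest2=%l_secdig_pc%%%10
set /A l_thirddig_pc=%l_base%/10000
set /A l_rest3=%l_thirddig_pc%%%10

REM *** Round 2nd digit up, if 3rd digit is GEQ 5 and round 1st digit and percent-value up, if necessary
if %l_rest3% GEQ 5 set /A l_rest2+=1
if %l_rest2% GEQ 10 (
 set /A l_rest2-=10
 set /A l_rest1+=1
)
if %l_rest1% GEQ 10 (
 set /A l_rest1-=10
 set /A l_percent+=1
)

REM *** Build the display string with a leading zero below 10, so it always has the same width
if %l_percent% LSS 10 (
 set percent=0%l_percent%.%l_rest1%%l_rest2%%%
) else (
 set percent=%l_percent%.%l_rest1%%l_rest2%%%
)
set /A percentx100=(%l_percent%*100)+(%l_rest1%*10)+%l_rest2%

REM *** Cap at 100 percent, printed with one decimal to keep the string width
if %l_percent% GTR 99 (
 set percent=100.0%%
 set /A percentx100=10000
)
EXIT /B 0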
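
For readers who want to cross-check the routine outside of cmd, here is a minimal Python sketch of the same integer-only steps. It is a port for illustration only, not part of the EAS-Tool. The function signature and the tuple return value are assumptions (the batch version returns its results in the global variables percent and percentx100), and it assumes non-negative inputs, so that Python's floor division behaves like the truncating division of set /A.

```python
def percent(total: int, part: int) -> tuple[str, int]:
    """Integer-only port of the :percent batch subroutine (illustrative).

    Returns (display string, percent value times 100).
    Assumes total and part are non-negative integers.
    """
    # Guard against division by zero, as the batch code does.
    if total <= 0:
        return "00.00%", 0

    # Divide first, then multiply, mirroring the batch code, where this
    # keeps the value inside the 32-bit signed range of set /A.
    base = (1_000_000_000 // total) * part

    tenths = base // 1_000_000        # percent value times 10
    d1 = tenths % 10                  # 1st decimal digit
    whole = tenths // 10              # whole percent
    d2 = (base // 100_000) % 10       # 2nd decimal digit
    d3 = (base // 10_000) % 10        # 3rd decimal digit, used for rounding

    # Round the 2nd digit up and carry into the 1st digit and whole percent.
    if d3 >= 5:
        d2 += 1
    if d2 >= 10:
        d2 -= 10
        d1 += 1
    if d1 >= 10:
        d1 -= 10
        whole += 1

    # Cap at 100 percent, printed with one decimal to keep the width.
    if whole > 99:
        return "100.0%", 10000

    return f"{whole:02d}.{d1}{d2}%", whole * 100 + d1 * 10 + d2


if __name__ == "__main__":
    for total, part in [(10000, 1234), (3, 2), (3, 3), (0, 5)]:
        print(total, part, percent(total, part))
```

For example, percent(10000, 1234) gives ("12.34%", 1234), which are the two values from the explanation above.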

Ghppn likes this post

https://www.sp-cc.de

Peter Berger

Posts : 130
Join date : 2020-11-20

PostSubject: Re: Enough is Enough   Wed Sep 18, 2024 10:17 am

pohl4711 wrote:


Not so bad for an engine, developed by using a tool which evaluates useless, nonsense parameters, isn't it?

I worded my message too carelessly; it was not my intention to offend you at all, so please accept my apology.
It is obvious that the EAS tool works and produces results that, all in all, fit very well with what people feel is aggressive play in general. You mentioned yourself that you spent a lot of time tuning parameters to get good results, so it is quite possible that some parameters are more relevant than others, while some could even be left out or replaced by others.
To your points:
1. High-value sacrifices
As for Komodo 14.1 aggressive, I suspect it gets some bonus for high-value sacrifices. And Patricia makes a lot of these sacrifices by design (as it is supposed to score well with the EAS tool).
How many of these sacrifices are relevant? Relevant would mean that they change the outcome of the game compared to other moves. Anyway, I don't want to argue this right now as I don't know. (I noticed some games where Patricia "sacs" in the endgame just to reach another, simpler endgame.)
2. Bad draws
I feel you make some good points here in general, especially:
"The neural-net engines understand, that just one pawn more in many endgames is not enough for a win. So, they try to avoid these endgames (a possible way is, to avoid captures (= avoid going towards endgame) until a second pawn was won). Stockfish final HCE doesnt have this understanding, so it believes in winning, when having one pawn more, no matter if it is an endgame or not)."
As for Patricia, it is clear that it avoids being a pawn up in general :D . Is this aggressiveness?
Maybe - this is complex. Let's think about a position where you can reach an endgame a pawn up that is tough or maybe impossible to win. Is it really more "aggressive" to not take the pawn?
I'll have to think about this some more.
Peter

pohl4711

Posts : 132
Join date : 2022-03-01
Location : Berlin

PostSubject: Re: Enough is Enough   Thu Sep 19, 2024 6:37 am


Thanks man, much appreciated!
I agree, the EAS-Tool is definitely not perfect, and a lot of things in this direction (aggressiveness) are still unclear. No surprise, of course, because my EAS-Tool is the first tool ever to measure "aggressiveness". It is even hard to be precise here: what is aggressiveness?!? But, as far as I can see now, the EAS-Tool works reasonably well and recognizes aggressively playing engines (like Patricia and Revenge 1) correctly, even when these engines played against much stronger engines and (because of this) scored badly. That's good enough for me right now. Additionally, I am limited by pgn-extract: I can only do things in the EAS-Tool that pgn-extract can do...
But if Ed writes a new EAS-Tool in C or C++, the tool could do more. Let's wait and see.
https://www.sp-cc.de