Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Wed Aug 23, 2023 5:39 pm
I have my moments
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Wed Aug 23, 2023 7:59 pm
I have a collection of 22K human games between 2700+ players. If I have a PC free I will give it a try and make a polyglot book from it using the SOMU tool. Uploaded elo2700.pgn on the SOMU page; maybe it's a good testbed for the real 15M.
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Wed Aug 23, 2023 9:21 pm
Over 30% of those games are either rapid or blitz, just so you know. You may want to check that.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 8:25 am
Ozymandias wrote:
Over 30% of those games are either rapid or blitz, just so you know. You may want to check that.
And 1305 TCEC games ......
Do you have a tool that strips those games?
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 9:39 am
Anyway, SOMU has such a function under the [F8] key; the executable SOMU-F8.EXE is on github.
Pressing [F8] you can :
1. Strip draws
2. Make an elo selection.
3. Strip words
On [3], when you type "blitz" all games with "blitz" in the Event-tag or Site-tag will be stripped.
OUTPUT.PGN is the newly created stripped file.
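For readers without the tool at hand, the word-stripping step can be sketched roughly as below. This is not SOMU's code (its implementation is not shown in this thread); the function names `split_games`, `tags_of` and `strip_words` are hypothetical, and the sketch assumes every game starts with an [Event ...] tag line.

```python
import re

# Matches a PGN tag pair line such as [Event "World Blitz 2019"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def split_games(pgn_text):
    """Split PGN text into games, assuming each game starts with an [Event ...] line."""
    games, current = [], []
    for line in pgn_text.splitlines():
        if line.startswith("[Event ") and current:
            games.append("\n".join(current).strip() + "\n")
            current = []
        current.append(line)
    if any(l.strip() for l in current):
        games.append("\n".join(current).strip() + "\n")
    return games

def tags_of(game):
    """Return the tag pairs of one game as a dict."""
    tags = {}
    for line in game.splitlines():
        m = TAG_RE.match(line.strip())
        if m:
            tags[m.group(1)] = m.group(2)
    return tags

def strip_words(pgn_text, words):
    """Keep only games whose Event and Site tags contain none of the words (case insensitive)."""
    lowered = [w.lower() for w in words]
    kept = []
    for game in split_games(pgn_text):
        tags = tags_of(game)
        haystack = (tags.get("Event", "") + " " + tags.get("Site", "")).lower()
        if not any(w in haystack for w in lowered):
            kept.append(game)
    return kept
```

Writing `"\n".join(strip_words(text, ["blitz"]))` to a file would give something comparable to OUTPUT.PGN. Reading the whole file into memory is a simplification; a very large database would need a streaming version.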
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 9:45 am
BTW, what is the rapid time control as used in human games, maybe they should be stripped as well.
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 1:25 pm
Admin wrote:
Ozymandias wrote:
Over 30% of those games are either rapid or blitz, just so you know. You may want to check that.
And 1305 TCEC games ......
Do you have a tool that strips those games?
Engine games too? I didn't check, I was talking from experience. Once you've deleted all those games, you'll probably have less than 15K (which is what I have).
For those purposes, I use tagExtract and keep excludeT.pgn.
Admin wrote:
BTW, what is the rapid time control as used in human games, maybe they should be stripped as well.
15m to 25m, with increment of 3 to 5 seconds. And yes, I throw them out.
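If a PGN carries a TimeControl tag in the "base+increment" form (seconds), that range can be tested directly. A minimal sketch, assuming that tag is present (many human game files do not have it) and using the hypothetical name `is_rapid`:

```python
def is_rapid(time_control):
    """True if a 'base+increment' TimeControl (in seconds) is 15-25 min with 3-5 s increment."""
    base, sep, inc = time_control.partition("+")
    if not sep or not base.isdigit() or not inc.isdigit():
        # Other forms ("40/7200:3600", "?", "-", plain "900") are not classified here.
        return False
    return 15 * 60 <= int(base) <= 25 * 60 and 3 <= int(inc) <= 5
```

Games without a usable TimeControl tag still need the Event/Site word approach.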
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 4:12 pm
Years ago TWIC started to include TCEC games.
I found games of chess.com as well, 3350 (15%) to be precise, yikes....
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 4:57 pm
Code:
PGN pruning
PGN database : pgn\elo2700.pgn
Read PGN game : 22.315
Skip Online games y
To OUTPUT.PGN : 12.456
Done...
This is what I get after stripping a pgn with the words :
"blitz" , "rapid" , "tcec" , "chess.com" , "comp"
In one run and case insensitive of course.
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 7:52 pm
Other terms that might bring down the count: blind, simul, internet, active, knockout, tiebreak, K.O.
I've noticed that, quite often, play-offs between players are faster rounds for qualification purposes, even though there's usually no indication other than several games being played on the same day between the same two players, which is quite telling.
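That same-day heuristic could be checked mechanically. A rough sketch, assuming the tags of each game are already available as a dict with White, Black and Date; the name `same_day_pairings` and the threshold of two games are my own choices, not something any tool in this thread does:

```python
from collections import Counter

def same_day_pairings(games, min_games=2):
    """Return (date, {player, player}) keys that occur at least min_games times.

    games: iterable of tag dicts with 'White', 'Black' and 'Date' entries.
    Colours are ignored, so A-B and B-A on the same day count as the same pairing.
    """
    counts = Counter()
    for tags in games:
        date = tags.get("Date", "")
        if not date or "?" in date:
            continue  # unknown or partial dates cannot be compared
        pair = frozenset((tags.get("White", ""), tags.get("Black", "")))
        counts[(date, pair)] += 1
    return {key for key, n in counts.items() if n >= min_games}
```

A flagged pairing is only a hint that the games were fast play-off rounds; it still needs a look before throwing the games out.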
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Thu Aug 24, 2023 11:51 pm
Ozymandias wrote:
Other terms that might bring down the count: blind, simul, internet, active, knockout, tiebreak, K.O.
Can "tagExtract" handle all these terms in one run?
I can do it for SOMU by creating a simple text file with all the terms you want to exclude in one run
blind simul internet active knockout tiebreak K.O. blitz rapid tcec chess.com comp
-------
or whatever.
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Fri Aug 25, 2023 7:14 am
Nope, tagExtract works one by one. If your tool doesn't face problems with big DBs, it holds a huge advantage.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Fri Aug 25, 2023 7:29 am
Ozymandias wrote:
Nope, tagExtract works one by one. If your tool doesn't face problems with big DBs, it holds a huge advantage.
SOMU can handle any PGN size and is a lot faster than Norm's (great) utils.
I will do the text file idea, small change.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Fri Aug 25, 2023 8:48 am
Code:
PGN pruning
PGN database : pgn\elo2700.pgn
Read PGN game : 22.315
Skip words (y/n) y
To OUTPUT.PGN : 11.051
Done...
Using "words.txt" another 1400 or so less (12.456 down to 11.051).
SOMU.EXE and words.txt on github.
Run took 4 seconds.
----------
Added "lichess" and "chess24.com" to words.txt
Code:
PGN pruning
PGN database : pgn\elo2700.pgn
Read PGN game : 22.315
Skip words (y/n) y
To OUTPUT.PGN : 10.849
Done...
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Fri Aug 25, 2023 10:44 am
Well, that looks too good to overlook. I wasn't planning on another update, but since LSS games for the first half of the year are already online, cubail posted some PC and IC games for the past year, and an update of OTB games looks so streamlined with this tool, I guess I'll bite.
Only book testing games (via Sedat) won't be updated, since he's on vacation.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Fri Aug 25, 2023 10:58 am
Wait till you get the invoice....
with 100% discount of course
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Fri Aug 25, 2023 11:30 am
You joke, but this weekend I was trying to explain to my nephew how some programmer was doing something for free, and he just couldn't understand it.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Sat Aug 26, 2023 9:45 am
WARNING
In case you are making books with "polyglot make-book", get the fix for SOMU-F5 on github.
I tried to make a polyglot opening book from the remaining 10,000 games from elo2700.pgn and polyglot (with its extreme pgn checks) refused the pgn coming from SOMU. What gives? Well, when a PGN is truncated it misses the game result and Polyglot goes berserk.
I thought it was nonsense to do, Fabien decided otherwise.
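The idea behind such a fix can be sketched as follows: after truncating a game, make sure the movetext still ends with a result token. This is only an illustration and not the actual SOMU change; the name `ensure_result` is hypothetical, and Polyglot's exact checks are not described in this thread.

```python
import re

# The four game termination markers defined by the PGN standard
RESULTS = ("1-0", "0-1", "1/2-1/2", "*")

def ensure_result(game):
    """Append the game result to the movetext if a truncated game lost it.

    Uses the value of the Result tag when it is valid, otherwise '*' (unknown).
    """
    m = re.search(r'^\[Result\s+"([^"]*)"\]', game, flags=re.MULTILINE)
    result = m.group(1) if m and m.group(1) in RESULTS else "*"
    body = game.rstrip()
    if body.endswith(RESULTS):
        return body + "\n"  # already terminated, leave as is
    return body + " " + result + "\n"
```

So a game cut off after "2. Nf3" with [Result "1-0"] in its header would come out ending in "2. Nf3 1-0".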
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Sat Aug 26, 2023 1:57 pm
As a PGN integrity check, polyglot has no equal.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Sat Aug 26, 2023 3:45 pm
Indeed.....
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Sat Aug 26, 2023 4:48 pm
Book made, system seems to work, on github.
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Sat Aug 26, 2023 6:21 pm
I forgot to ask. The last move saved is the one before the blunder, but when you establish an overall limit, is the position, where that value has been reached, saved to output? Or is the game truncated on the previous one? The latter would be the consistent behavior.
PS: the latest exe doesn't work. "This app can't run on your PC".
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Sun Aug 27, 2023 11:04 am
Ozymandias wrote:
I forgot to ask. The last move saved is the one before the blunder, but when you establish an overall limit, is the position, where that value has been reached, saved to output? Or is the game truncated on the previous one? The latter would be the consistent behavior.
Yes, consistent. The moves that fail the blunder margin or the max margin are stripped.
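To make that rule concrete, here is a minimal sketch of the truncation logic. It is an assumption about how the two margins interact, not SOMU's code: the function name `truncate_moves`, the input layout, and the reading of "blunder margin" as a score drop and "max margin" as an absolute score limit (both in centipawns, from the mover's point of view) are all mine.

```python
def truncate_moves(plies, blunder_margin=50, max_margin=None):
    """Keep moves up to, but not including, the first one that fails a margin.

    plies: list of (move, score_before, score_after), scores in centipawns
           from the point of view of the side making the move (hypothetical layout).
    blunder_margin: a move fails if it drops the score by more than this.
    max_margin: if set, a move fails if the absolute score after it exceeds this.
    """
    kept = []
    for move, before, after in plies:
        if before - after > blunder_margin:
            break  # the blunder itself is stripped, together with everything after it
        if max_margin is not None and abs(after) > max_margin:
            break  # same treatment when the overall limit is exceeded
        kept.append(move)
    return kept
```

In both cases the failing move is not written, so the last saved move is always the one before the margin was exceeded.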
Quote :
PS: the latest exe doesn't work. "This app can't run on your PC".
Before I do another recompile that generates different code can you rename somu.exe to something strange like umos.exe? I think Windows remembers the false warning on somu.exe
Meanwhile we should discuss the book, there are issues, more in a next post.
Ozymandias
Posts : 622 Join date : 2020-11-23
Subject: Re: EAS - for Stefan Sun Aug 27, 2023 11:30 am
I renamed it, changed it to another location: same message.
Sure, what issues?
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: EAS - for Stefan Sun Aug 27, 2023 12:59 pm
Ozymandias wrote:
I renamed it, changed it to another location: same message.
Uploaded somu-new.exe
Ozymandias wrote:
Sure, what issues?
1. The use of analyzed books. Should be an advantage, but....
- Book made with blunder margin 50. Playing 1.e4 c5 2.Nf3 d6 and the only playable book move is Bb5, ridiculous.
- Book made with blunder margin 75. Playing 1.e4 c5 2.Nf3 d6 and now 3.d4 (97%) 3.Bb5 (3%), good. Disadvantage, the analysis takes ~3 x more time.
-------
2. Using normal polyglot books without analyzed data.
1. Using the small book "varied.bin".
2. Since book score is always 0 all moves in the book are accepted without analysis.
3. First move out of book (for instance using blunder margin 50) that gives a score of -51 is stripped which can be meaningless certainly with NNUE evaluation.
Both are undesirable, but that's just me.
ozzy-50.bin ozzy-75.bin ozzy-varied on github
--------
Suggestions
1. Skip the book option, analyze every position. Not so nice but accurate.
2. Take a small book (for instance varied.bin) and analyze it with SF16 NNUE eval, depth=20 or so.