GUI: Banksia
Book: Perfect Book 2021, to 6 moves
Tablebases: All up to 7 man of the top 10
Threads = 16 CPU
Hash = 1024
Time Control = 3m+2s
Round Robin Tournament, 40 Rounds
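For anyone who wants to try to reproduce this outside the GUI, here is a minimal sketch of the UCI commands that correspond to the engine settings above. It assumes Hash = 1024 means 1024 MB and that 3m+2s is 180 s base time plus a 2 s increment; the function name `uci_setup_commands` is made up for this illustration. The book and tablebase settings are handled by the GUI and are not covered here.

```python
def uci_setup_commands(threads=16, hash_mb=1024, base_s=180, inc_s=2):
    """Build UCI commands for the settings listed in the post.

    Assumptions: hash is in MB, the time control is base + increment in
    seconds, and the search starts from the standard opening position.
    """
    base_ms = base_s * 1000  # UCI clock values are given in milliseconds
    inc_ms = inc_s * 1000
    return [
        "uci",
        f"setoption name Threads value {threads}",
        f"setoption name Hash value {hash_mb}",
        "isready",
        "ucinewgame",
        "position startpos",
        f"go wtime {base_ms} btime {base_ms} winc {inc_ms} binc {inc_ms}",
    ]


commands = uci_setup_commands()
print("\n".join(commands))
```

Pasting these lines into the engine's console one at a time should show whether a given build searches normally under the same thread and hash settings.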
I first detected the problem with Stockfish Dev 20230314, but I did not want to report the issue until we had another development version of Stockfish to test. The new version of Stockfish was released today and is still broken.
I am not sure how this error slipped by Fishtest.
Last edited by mwyoung on Sun Mar 19, 2023 6:09 pm; edited 1 time in total
Mclane
Posts : 3022 Join date : 2020-11-17 Age : 57 Location : United States of Europe, Germany, Ruhr area
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 3:39 pm
Shit happens
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 3:42 pm
Mclane wrote:
Shit happens
You would think that, with the testing they do before releasing new versions of Stockfish, this would be detected.
Mclane
Posts : 3022 Join date : 2020-11-17 Age : 57 Location : United States of Europe, Germany, Ruhr area
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 4:33 pm
If they ever look into the games, maybe. If they only look into statistics about score and elo, NO.
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 6:20 pm
I have been looking at what happens during the games to find the cause of Stockfish development's weak play.
It seems that on some moves, or on all moves, Stockfish development is ignoring the search and just playing the base move generated by the neural net.
That is my best guess as to what the bug is doing to Stockfish development's play.
Regardless of the cause, Stockfish development is playing chess at about a 1350 to 1400 Elo level.
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 7:32 pm
Three days ago I watched a somewhat strange Stockfish game snippet that I intended to post about, but I haven't found the time yet. I don't know if it is related to the bug this thread is about, but I will leave it here.
I stopped the game here out of pure boredom (it is most obviously a draw), but several of Stockfish's move choices looked extremely strange to me. Of course, a draw is a draw, but I couldn't work out why it went for this setup, where you have to fight for the draw a pawn down.
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 7:44 pm
Peter Berger wrote:
Three days ago I watched a somewhat strange Stockfish game snippet that I intended to post about, but I haven't found the time yet. I don't know if it is related to the bug this thread is about, but I will leave it here.
I stopped the game here out of pure boredom (it is most obviously a draw), but several of Stockfish's move choices looked extremely strange to me. Of course, a draw is a draw, but I couldn't work out why it went for this setup, where you have to fight for the draw a pawn down.
It is hard to say what is going on. Is the cause related to my computer, the compile, the time control, or the number of threads being used?
All I know for sure is that I can demonstrate a huge regression in the last 2 development versions of Stockfish vs Stockfish 15.1 when playing Lc0, with Stockfish using 16 threads, 1 GB hash, at a time control of 3m+2s.
We need other testers to confirm the results. I can only demonstrate what my computer is doing running Stockfish.
Dio
Posts : 222 Join date : 2021-08-28
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 7:59 pm
It is, of course, easy to predict a "regression" based on individual games.
Stefan Pohl has tested the version 14.03.2023 with 7000 games. I personally think that significant progress in Stockfish is not to be expected in the future; the NNUE training seems exhausted to me unless there are new ideas. The search should be practically optimal at the moment. However, I may be wrong.
Posts : 3022 Join date : 2020-11-17 Age : 57 Location : United States of Europe, Germany, Ruhr area
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 8:07 pm
Ideal situation for others to attack.
Dio
Posts : 222 Join date : 2021-08-28
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 8:15 pm
You are right, Thorsten, but I stand by my opinion. The Komodo team has the same problem, as can be seen from Larry Kaufman's statements. At some point they will exhaust the strength improvements with NNUE, much faster than with a NN like Lc0, where much more information is stored. I admit, however, that I don't see any real increase in playing strength with Lc0 either, and the training effort is enormous. More powerful graphics cards will show whether this will be the new "old" trend.
Mclane
Posts : 3022 Join date : 2020-11-17 Age : 57 Location : United States of Europe, Germany, Ruhr area
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 8:21 pm
Maybe we need more software support for the net.
Dio
Posts : 222 Join date : 2021-08-28
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 8:29 pm
For SF or Lc0? Both have massive support. For the OpenBench programs (Ethereal, Berserk...) this is present too, and it provides very useful results that will allow these programs to compete with Stockfish or Dragon in the future. For me, Ed and Chris are people who really master NNUE development and will take it further. The development of Rebel is unique; what is possible with CSTal can only be judged when it actually appears.
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Mar 19, 2023 9:11 pm
Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
 #  name                     nodes/m       NPS  depth/m  time/m  moves   time
 1. Stockfish 15.1            65076K  13906748     51.7     4.7   60.8  284.7
 2. Lc0 v0.29.0                 135K     26654      9.8     5.1   34.2  173.5
 3. Stockfish dev-20230319   149214K  14796918     48.8    10.1   21.2  213.4
 4. Stockfish dev-20230314   157713K  14975458     50.4    10.5   19.9  210.1
    all ---                   49259K   8357740     30.1     6.0   33.9  204.3
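As a rough sanity check on these numbers, the sketch below recomputes the time per move from nodes per move and NPS. It assumes that "K" means thousands of nodes and that time/m is in seconds; both are inferred from the figures, not stated by the GUI output. The helper name `implied_time_per_move` is made up for this illustration.

```python
# Rows copied from the table in the post:
# (name, nodes per move in thousands, NPS, reported time per move, moves per game)
rows = [
    ("Stockfish 15.1", 65076, 13906748, 4.7, 60.8),
    ("Lc0 v0.29.0", 135, 26654, 5.1, 34.2),
    ("Stockfish dev-20230319", 149214, 14796918, 10.1, 21.2),
    ("Stockfish dev-20230314", 157713, 14975458, 10.5, 19.9),
]


def implied_time_per_move(nodes_k, nps):
    """Seconds per move implied by nodes per move (in thousands) and NPS."""
    return nodes_k * 1000 / nps


for name, nodes_k, nps, time_m, moves in rows:
    implied = implied_time_per_move(nodes_k, nps)
    print(f"{name:24s} reported {time_m:5.1f}  implied {implied:5.2f}  moves/game {moves:5.1f}")
```

If the units are read correctly, the columns agree with each other, and the two dev builds average roughly 10 s per computed move over about 20 computed moves per game, against 4.7 s over 60.8 moves for Stockfish 15.1.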
Uri Blass
Posts : 207 Join date : 2020-11-28
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Tue Mar 21, 2023 11:09 am
Mclane wrote:
If they ever look into the games, maybe. If they only look into statistics about score and elo, NO.
Statistics are enough to detect a big bug. I see that Stockfish simply loses every game in Mark Young's games.
I guess it does not happen in their testing. I guess that the bug is related to testing conditions that are different, so watching games cannot help to detect the problem.
Considering the way the Stockfish team tests, I can understand no improvement in playing strength, or even a weaker ability to beat weaker opponents, but not losing every game against Lc0 and playing obviously bad moves, unless the problem is simply conditions that they did not test (for example, if they test only with ponder off and somebody plays with ponder on).
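To illustrate why score statistics alone would flag a regression of this size, here is a minimal sketch that converts a win/draw/loss record into an Elo difference with an approximate 95% margin, using the standard logistic Elo formula. The 5/10/25 record in the example is hypothetical and is not taken from any match in this thread; the function names are made up for this illustration, and this is not a description of how Fishtest itself decides.

```python
import math


def score_to_elo(score):
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a W/D/L record, using a normal approximation."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    stderr = math.sqrt(var / n)
    eps = 1e-6  # keep scores away from 0 and 1, where Elo is unbounded

    def clamp(s):
        return min(max(s, eps), 1.0 - eps)

    return (score_to_elo(clamp(score)),
            score_to_elo(clamp(score - z * stderr)),
            score_to_elo(clamp(score + z * stderr)))


# Hypothetical 40-game record: 5 wins, 10 draws, 25 losses.
elo, low, high = elo_with_margin(5, 10, 25)
print(f"Elo {elo:.0f} (approx. 95% interval {low:.0f} to {high:.0f})")
```

When the whole interval sits below zero after a few dozen games, the result sheet shows the problem without anyone looking at a single move, provided the test conditions actually trigger the bug.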
Peter Berger
Posts : 131 Join date : 2020-11-20
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Fri Mar 31, 2023 8:58 pm
There is something seriously broken in current Stockfish when it comes to games against Crafty:
Again a draw out of a position of maximal weakness (this time two pawns down). I have played this kind of game with ancient versions of Stockfish against this exact same version of Crafty, and Crafty got a draw about once every 100 games, and it usually didn't even come close. Against the current versions it is 1.0-2.0 now. OK, I am happy not to be a Stockfish developer or tester, but I am competent enough in this business to be sure there is some problem here.
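For a sense of scale, here is a small sketch of how unlikely that result would be under the old draw rate. It rests on two assumptions that are not stated in the post: that "1.0-2.0" means Crafty took two draws in three games, and that "once every 100 games" corresponds to an independent 1% draw chance per game. The function name `prob_at_least` is made up for this illustration.

```python
import math


def prob_at_least(k, n, p):
    """Binomial probability of at least k successes in n independent trials."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Assumed reading: 2 draws in 3 games, with a historical draw rate of 1%.
p_two_draws = prob_at_least(2, 3, 0.01)
print(f"P(at least 2 draws in 3 games) = {p_two_draws:.6f}")
```

Three games is a tiny sample, so this says nothing about the cause; it only shows why such a score looks out of line with the old results.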
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Fri Mar 31, 2023 9:03 pm
Peter Berger wrote:
There is something seriously broken in current Stockfish when it comes to games against Crafty:
Again a draw out of a position of maximal weakness (this time two pawns down). I have played this kind of game with ancient versions of Stockfish against this exact same version of Crafty, and Crafty got a draw about once every 100 games, and it usually didn't even come close. Against the current versions it is 1.0-2.0 now. OK, I am happy not to be a Stockfish developer or tester, but I am competent enough in this business to be sure there is some problem here.
The issue I had, as shown in my games, is now gone with the latest Stockfish DEV versions.
Peter Berger
Posts : 131 Join date : 2020-11-20
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Fri Mar 31, 2023 9:07 pm
The issue I had, as shown in my games, is now gone with the latest Stockfish DEV versions.
I wanted to post this but didn't. No sign of the bug you pointed out, but the draw problem seems to be pretty relevant (it is not as if Stockfish could afford that many draws against the likes of Crafty).
Theresa May
Posts : 12 Join date : 2020-11-27
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Apr 30, 2023 5:25 pm
Dragon development is also broken. There is no improvement in Elo from Dragon 3 to Dragon 3.2 if we use Mark Young's methodology to test between Dragon 3 and Dragon 3.2. No wonder Larry Kaufman and Mark Lefler gave up on Dragon.
Damir Desevac
Posts : 330 Join date : 2020-11-27 Age : 43 Location : Denmark
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sun Apr 30, 2023 6:11 pm
GUI: Banksia
Book: Perfect Book 2021, to 6 moves
Tablebases: All up to 7 man of the top 10
Threads = 16 CPU
Hash = 1024
Time Control = 3m+2s
Round Robin Tournament, 40 Rounds
I first detected the problem with Stockfish Dev 20230314, but I did not want to report the issue until we had another development version of Stockfish to test. The new version of Stockfish was released today and is still broken.
I am not sure how this error slipped by Fishtest.
Fake News. SF just won TCEC, where it beat Lc0 in the final... You have to keep on dreaming, if you wish Lc0 to beat SF, lolll
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Fri Jun 16, 2023 10:34 am
GUI: Banksia
Book: Perfect Book 2021, to 6 moves
Tablebases: All up to 7 man of the top 10
Threads = 16 CPU
Hash = 1024
Time Control = 3m+2s
Round Robin Tournament, 40 Rounds
I first detected the problem with Stockfish Dev 20230314, but I did not want to report the issue until we had another development version of Stockfish to test. The new version of Stockfish was released today and is still broken.
I am not sure how this error slipped by Fishtest.
Fake News. SF just won TCEC, where it beat Lc0 in the final... You have to keep on dreaming, if you wish Lc0 to beat SF, lolll
mwyoung
Posts : 880 Join date : 2020-11-25 Location : USA
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Fri Jun 16, 2023 8:08 pm
So much for the theory that the reason Stockfish is not improving is because Stockfish already plays perfect chess.
Stockfish is 40 Elo below Lc0 on SSDF.
Nezhman
Posts : 74 Join date : 2020-11-27
Subject: Re: Stockfish Development has been broken . Gauntlet Demonstration Lc0 vs Stockfish Dev, and SF 15.1 Sat Jun 17, 2023 3:59 am
Peter Berger wrote:
The issue I had, as shown in my games, is now gone with the latest Stockfish DEV versions.
I wanted to post this but didn't. No sign of the bug you pointed out, but the draw problem seems to be pretty relevant (it is not as if Stockfish could afford that many draws against the likes of Crafty).
This is why proper testing should include more lopsided matchups. If Stockfish can't beat Crafty and other much weaker engines enough times, it should pay a price in Elo.
The biggest problem with SF-NNUE is that it has a certain unhealthy preference for a safety-first approach. As if it were playing not to lose, instead of going for the win and taking reasonable risks to do so.