ProDeo

Computer Chess
 
Stockfish Development has been broken. Gauntlet Demonstration: Lc0 vs Stockfish Dev and SF 15.1

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Sun Mar 19, 2023 3:23 pm



Hardware: Threadripper 2950X, RTX 2080 Ti, 64 GB RAM.

GUI: Banksia
Book: Perfect Book 2021, to 6 moves.
Tablebases: All up to 7 man of the top 10.
Threads = 16 CPU.
Hash = 1024 MB
Time Control = 3m+2s
Round Robin Tournament, 40 Rounds.

I first detected the problem with Stockfish Dev 20230314, but I did not want to report the issue until we had another development version of Stockfish to test. The new version of Stockfish was released today and is still broken.

I am not sure how this error slipped past Fishtest.


Last edited by mwyoung on Sun Mar 19, 2023 6:09 pm; edited 1 time in total

Ghppn likes this post

Mclane
Posts : 2921
Join date : 2020-11-17
Age : 57
Location : United States of Europe, Germany, Ruhr area
Sun Mar 19, 2023 3:39 pm

Shit happens

Ghppn likes this post

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Sun Mar 19, 2023 3:42 pm

Mclane wrote:
Shit happens

You would think that, with the testing they do before releasing new versions of Stockfish, this would be detected. lol!

Mclane and Ghppn like this post

Mclane
Posts : 2921
Join date : 2020-11-17
Age : 57
Location : United States of Europe, Germany, Ruhr area
Sun Mar 19, 2023 4:33 pm

If they ever look into the games, maybe.
If they only look into statistics about score and elo, NO.

Brendan and Ghppn like this post

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Sun Mar 19, 2023 6:20 pm

Looking at what is happening during the games, here is what seems to be causing Stockfish development's weak play.

It seems that on some moves, or possibly all moves, Stockfish development is ignoring the search and is just playing the base move generated by the neural net.

That is my best guess as to what the bug is doing to Stockfish development's play.

Regardless of the cause, Stockfish development is playing chess at about a 1350 to 1400 Elo level.


[pgn][Event "*"]
[Site "*"]
[Date "2023.03.19"]
[Time "12:10:29"]
[Round "2"]
[Board "16"]
[White "Lc0 v0.29.0"]
[Black "Stockfish dev-20230314"]
[Result "1-0"]
[Termination "mate"]
[ECO "A50"]
[Opening "Queen's pawn game"]
[TimeControl "180+2"]
[PlyCount "51"]

1.c4 Nf6 2.d4 c6 {A50: Queen's pawn game}
3.Nf3 e6 4.g3 d5
5.Bg2 Be7 6.O-O O-O {End of opening}
7.Qc2 {+0.2/11 21090 295484 275/573/152} h6 {+0.3/32 40699 596648554 1/977/22} 8.Rd1 {+0.2/12 5313 62616 309/552/139} a5 {+0.3/32 29372 431034128 1/967/32}
9.Nc3 {+0.3/16 5892 56357 293/588/119} Nbd7 {+0.4/26 7610 116452217 0/958/42} 10.b3 {+0.3/19 2389 56713 294/598/108} Re8 {+0.4/29 10164 149666369 0/962/38}
11.Bf4 {+0.4/13 6615 53008 393/495/112} Bf8 {+0.6/29 22662 325023175 0/908/92} 12.e4 {+0.4/14 10218 91907 396/503/101} Nxe4 {+0.5/25 6670 99768532 0/920/80}
13.Nxe4 {+0.4/19 285 48020 388/514/98} e5 {+0.7/27 11931 173508738 0/845/155} 14.Nxe5 {+1.0/15 5709 43265 592/352/56} g5 {+1.1/26 11683 186862875 0/294/706}
15.cxd5 {+12.0/11 6631 49459 967/23/10} cxd5 {+4.5/24 11241 181479620 0/0/1000} 16.Nxf7 {+17.9/11 4764 47125 982/9/9} Kxf7 {+5.5/21 10286 251538020 0/0/1000}
17.Bc7 {+21.8/11 3273 64367 986/6/8} Rxe4 {+5.5/25 30587 586065147 0/0/1000} 18.Bxd8 {+91.3/6 6540 56676 999/0/1} Kg8 {+8.5/19 8853 239130143 0/0/1000}
19.Bxe4 {+118.7/5 10074 94318 1000/0/0} Kg7 {+11.5/16 2370 68150728 0/0/1000} 20.Bxd5 {+111.5/5 7490 70005 999/1/0} Nb6 {+30.3/29 3073 88785796 0/0/1000}
21.Bxb6 {+125.1/6 5992 51001 1000/0/0} Bd6 {M-10/30 470 12367583 0/0/1000} 22.Re1 {+125.4/5 6117 56692 1000/0/0} Be5 {M-8/55 697 17464011 0/0/1000}
23.Rxe5 {+120.5/5 4707 40534 1000/0/0} Bf5 {M-4/122 702 34346651 0/0/1000} 24.Qxf5 {M+4/1 10 249 1000/0/0} h5 {M-3/245 144 9818314 0/0/1000}
25.Re7+ {M+2/1 10 6 1000/0/0} Kh6 {M-1/245 20 19974 0/0/1000} 26.Qh7# {M+1/1 10 2 1000/0/0} 1-0

[/pgn]

[Image: Sfa10]

Ghppn likes this post

Peter Berger
Posts : 120
Join date : 2020-11-20
Sun Mar 19, 2023 7:32 pm

Three days ago I watched a somewhat strange Stockfish game snippet that I intended to post about, but I haven't found the time yet. I don't know whether this is related to the bug this thread is about, but I will leave it here.

Intended title was "Not every 0.00 is the same".

[Event "Lang 120min+10sek"]
[Site "Berlin"]
[Date "2023.03.16"]
[Round "?"]
[White "Crafty 25.6"]
[Black "Stockfish dev-20230314-f0556dc"]
[Result "*"]
[ECO "C67"]
[PlyCount "75"]
[TimeControl "7200+10"]

{4096MB, LAPTOP-NCDN8BTK} 1. e4 {[%emt 0:00:00]} e5 {[%eval 17,44] [%emt 0:02:
54]} 2. Nf3 {[%emt 0:00:06]} Nc6 {[%eval 21,43] [%emt 0:01:42]} 3. Bb5 {
[%emt 0:00:08]} Nf6 {[%eval 18,44] [%emt 0:01:35]} 4. O-O {[%emt 0:00:07]} Nxe4
{[%eval 12,45] [%emt 0:02:01]} 5. d4 {[%emt 0:00:06] (Te1)} Nd6 {[%eval 13,44]
[%emt 0:02:07]} 6. Bxc6 {[%emt 0:00:06]} dxc6 {[%eval 9,41] [%emt 0:01:46]} 7.
dxe5 {[%emt 0:00:07]} Nf5 {[%eval 15,42] [%emt 0:02:04]} 8. Qxd8+ {[%emt 0:00:
07]} Kxd8 {[%eval 7,29] [%emt 0:00:01]} 9. Nc3 {[%emt 0:00:07]} Be7 {[%eval 12,
43] [%emt 0:02:40]} 10. Rd1+ {[%emt 0:00:07] (Te1)} Ke8 {[%eval 7,40] [%emt 0:
02:00]} 11. h3 {[%emt 0:00:09]} Nh4 {[%eval 7,46] [%emt 0:04:33]} 12. Nxh4 {
[%emt 0:00:07]} Bxh4 {[%eval 11,45] [%emt 0:02:03]} 13. g4 {[%emt 0:00:08]
(Se2)} h5 {[%eval 6,46] [%emt 0:01:58]} 14. f3 {[%emt 0:03:07] (g5)} f5 {
[%eval 3,47] [%emt 0:09:47]} 15. exf6 {[%emt 0:03:01]} gxf6 {[%eval 0,46]
[%emt 0:00:01]} 16. Kg2 {[%emt 0:11:55] (Le3)} Kf7 {[%eval 0,47] [%emt 0:02:55]
} 17. Bf4 {[%emt 0:03:13] (Ld2)} f5 {[%eval 0,57] [%emt 0:01:51]} 18. Bxc7 {
[%emt 0:02:36]} hxg4 {[%eval 0,53] [%emt 0:00:01]} 19. hxg4 {[%emt 0:03:07]}
fxg4 {[%eval 0,53] [%emt 0:00:14]} 20. Rd4 {[%emt 0:02:08]} gxf3+ {[%eval 0,55]
[%emt 0:00:36]} 21. Kxf3 {[%emt 0:01:32]} Be6 {[%eval 0,51] [%emt 0:00:16]} 22.
Rf4+ {[%emt 0:02:16]} Ke7 {[%eval 0,56] [%emt 0:00:01]} 23. Ne4 {[%emt 0:02:12]
} Bd5 {[%eval 0,59] [%emt 0:00:01]} 24. Rg1 {[%emt 0:01:59] (c4)} Kd7 {[%eval
0,64] [%emt 0:01:57]} 25. Rg7+ {[%emt 0:02:06]} Be7 {[%eval 0,59] [%emt 0:00:
13]} 26. c4 {[%emt 0:01:39]} Bxe4+ {[%eval 0,64] [%emt 0:00:35]} 27. Rxe4 {
[%emt 0:01:14]} Kxc7 {[%eval 0,70] [%emt 0:00:53]} 28. Rexe7+ {[%emt 0:01:56]}
Kb6 {[%eval 0,69] [%emt 0:00:07]} 29. Rxb7+ {[%emt 0:01:38]} Kc5 {[%eval 0,75]
[%emt 0:00:44]} 30. b3 {[%emt 0:00:58] (Txa7)} a5 {[%eval 0,80] [%emt 0:02:18]}
31. Rg5+ {[%emt 0:03:48] (Ke3)} Kd4 {[%eval 0,70] [%emt 0:02:01]} 32. Rg4+ {
[%emt 0:00:08] (Td7+)} Kc3 {[%eval 0,69] [%emt 0:01:53]} 33. Kf4 {[%emt 0:03:
31] (Tg3)} Rae8 {[%eval 0,75] [%emt 0:02:00]} 34. Rg3+ {[%emt 0:02:12]} Kb2 {
[%eval 0,69] [%emt 0:00:01]} 35. Rg2+ {[%emt 0:01:53]} Kc3 {[%eval 0,66] [%emt
0:00:06]} 36. Rf7 {[%emt 0:01:29] (Tg6)} Rh1 {[%eval 0,48] [%emt 0:02:05]} 37.
Rf2 {[%emt 0:01:42] (Kf3)} Kb4 {[%eval 0,59] [%emt 0:02:23]} 38. Kg3 {[%emt 0:
01:59] (Tf6)} *

I stopped the game here out of pure boredom (it is most obviously a draw), but several of Stockfish's move choices looked extremely strange to me. Of course, a draw is a draw, but I couldn't work out why it went for this setup, where it has to fight for the draw a pawn down.
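
For anyone who wants to compare the evaluations in a game like this one, here is a minimal Python sketch that pulls the [%eval ...] annotations out of the PGN text. It assumes the ChessBase-style convention that the first number is centipawns and the second is search depth; that is an assumption, not something stated in the post. The helper names extract_evals and zero_eval_depths are hypothetical, and mate scores are not handled.

```python
import re

# Assumption: "[%eval 17,44]" means +0.17 pawns at depth 44 (ChessBase-style).
# \s allows for the line breaks that the PGN export puts inside the tags.
EVAL_TAG = re.compile(r"\[%eval\s+(-?\d+)\s*,\s*(\d+)\s*\]")


def extract_evals(pgn_text):
    """Return a list of (centipawns, depth) pairs in the order they appear."""
    return [(int(cp), int(depth)) for cp, depth in EVAL_TAG.findall(pgn_text)]


def zero_eval_depths(pgn_text):
    """Return the search depths of all moves annotated with an eval of exactly 0."""
    return [depth for cp, depth in extract_evals(pgn_text) if cp == 0]


# A few moves copied from the game above, including tags split across lines.
sample = """9. Nc3 {[%emt 0:00:07]} Be7 {[%eval 12,
43] [%emt 0:02:40]} 23. Ne4 {[%emt 0:02:12]
} Bd5 {[%eval 0,59] [%emt 0:00:01]} 24. Rg1 {[%emt 0:01:59] (c4)} Kd7 {[%eval
0,64] [%emt 0:01:57]}"""

print(extract_evals(sample))
print(zero_eval_depths(sample))
```

Listing the depths next to the zero scores makes it easier to see whether a run of 0.00 evaluations comes from deep searches or from near-instant moves.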


Ghppn likes this post

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Sun Mar 19, 2023 7:44 pm

Peter Berger wrote:
Three days ago I watched a somewhat strange Stockfish game snippet that I intended to post about, but I haven't found the time yet. I don't know whether this is related to the bug this thread is about, but I will leave it here.

Intended title was "Not every 0.00 is the same".

[...]

I stopped the game here out of pure boredom (it is most obviously a draw), but several of Stockfish's move choices looked extremely strange to me. Of course, a draw is a draw, but I couldn't work out why it went for this setup, where it has to fight for the draw a pawn down.



It is hard to say what is going on. Is the cause related to my computer, the compile, the time control, or the number of threads being used?

All I know for sure is that I can demonstrate a huge regression in the last 2 development versions of Stockfish compared with Stockfish 15.1 when playing Lc0, with Stockfish using 16 threads and 1 GB hash at a time control of 3m+2s.

We need other testers to confirm the results. I can only demonstrate what my computer is doing when running Stockfish.

Ghppn likes this post

Dio
Posts : 214
Join date : 2021-08-28
Sun Mar 19, 2023 7:59 pm

It is, of course, easy to predict a "regression" based on individual games.

Stefan Pohl has tested the 14.03.2023 version with 7000 games. I personally think that significant progress in Stockfish is not to be expected in the future; the NNUE training seems exhausted to me unless there are new ideas. The search should be practically optimal at the moment. However, I may be wrong.

https://www.sp-cc.de/

Ghppn likes this post

Mclane
Posts : 2921
Join date : 2020-11-17
Age : 57
Location : United States of Europe, Germany, Ruhr area
Sun Mar 19, 2023 8:07 pm

Ideal situation for others to attack.

Dio likes this post

Dio
Posts : 214
Join date : 2021-08-28
Sun Mar 19, 2023 8:15 pm

You are right, Thorsten, but I stand by my opinion. The Komodo team has the same problem, as can be seen from Larry Kaufman's statements. At some point they will exhaust the strength improvements with NNUE, much faster than with a NN like Lc0, where much more information is stored. I admit, however, that I don't see any real increase in playing strength with Lc0 either; the training effort is enormous. More powerful graphics cards will show whether this will be the new "old" trend.

Mclane and Ghppn like this post

Mclane
Posts : 2921
Join date : 2020-11-17
Age : 57
Location : United States of Europe, Germany, Ruhr area
Sun Mar 19, 2023 8:21 pm

Maybe we need more software support for the net.

Dio
Posts : 214
Join date : 2021-08-28
Sun Mar 19, 2023 8:29 pm

For SF or Lc0? Both have massive support. For the OpenBench programs (Ethereal, Berserk, ...) this support is also present and provides very useful results that will allow these programs to compete with Stockfish or Dragon in the future. For me, Ed and Chris are people who have really mastered NNUE development and will take it further. The development of Rebel is unique; what is possible with CSTal can only be judged when it actually appears.

Ghppn likes this post

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Sun Mar 19, 2023 9:11 pm

End of Demonstration.

Code:
Result:
------------------------------------------------------------------------------------
  #  name                    games            wins           draws          losses           score    elo    +    -
  1. Stockfish 15.1             18          0 0.0%       18 100.0%          0 0.0%         9 50.0%    268   80   80
  2. Lc0 v0.29.0                56       38 100.0%       18 100.0%          0 0.0%        47 83.9%    239   72   68
  3. Stockfish dev-20230319     19          0 0.0%          0 0.0%       19 100.0%          0 0.0%   -254  155  348
  4. Stockfish dev-20230314     19          0 0.0%          0 0.0%       19 100.0%          0 0.0%   -254  155  348

Cross table:
------------------------------------------------------------------------------------
  #  name                               score   games         1         2         3         4
  1. Stockfish 15.1                   9 50.0%      18         x       9.0                    
  2. Lc0 v0.29.0                     47 83.9%      56       9.0         x      19.0      19.0
  3. Stockfish dev-20230319            0 0.0%      19                 0.0         x          
  4. Stockfish dev-20230314            0 0.0%      19                 0.0                   x

Tech:
------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                      nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 15.1             65076K    13906748     51.7      4.7     60.8    284.7
  2. Lc0 v0.29.0                  135K       26654      9.8      5.1     34.2    173.5
  3. Stockfish dev-20230319    149214K    14796918     48.8     10.1     21.2    213.4
  4. Stockfish dev-20230314    157713K    14975458     50.4     10.5     19.9    210.1
     all ---                    49259K     8357740     30.1      6.0     33.9    204.3
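
The score column in the result table can be rechecked from the win, draw and loss counts. The following Python sketch does that and also converts a score into an Elo difference with the plain logistic formula. This is only an illustration: it is not necessarily how Banksia computes its elo column, and score_fraction and elo_difference are hypothetical helper names. From the counts, Lc0's 38 wins and 18 draws in 56 games work out to about 67.9% wins and 32.1% draws, which does not match the percentages printed in those two columns of the Lc0 row.

```python
import math


def score_fraction(wins, draws, losses):
    """Score as a fraction of the maximum, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Elo difference implied by a score fraction under the logistic model.

    Only defined for scores strictly between 0 and 1; a 0% or 100% score
    (like the 0 out of 19 results above) has no finite Elo difference.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Lc0 v0.29.0 row: 38 wins, 18 draws, 0 losses in 56 games.
lc0_score = score_fraction(38, 18, 0)
print(f"score: {lc0_score:.1%}")
print(f"wins: {38 / 56:.1%}, draws: {18 / 56:.1%}")
print(f"implied Elo difference: {elo_difference(lc0_score):.0f}")
```

The ValueError for a 0% score is deliberate: with zero points out of 19 games the logistic formula gives no finite rating, so any Elo figure attached to such a result depends on how the rating tool handles that edge case.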
 

Ghppn likes this post

Uri Blass
Posts : 207
Join date : 2020-11-28
Tue Mar 21, 2023 11:09 am

Mclane wrote:
If they ever look into the games, maybe.
If they only look into statistics about score and elo, NO.

Statistics are enough to detect a big bug.
I see that Stockfish simply loses every game in Mark Young's games.

I guess this does not happen in their testing.
I guess the bug is related to testing conditions that are different, so watching games cannot help to detect the problem.

Considering the way the Stockfish team tests, I can understand no improvement in playing strength, or even getting weaker at beating weaker opponents. But I cannot understand losing every game against Lc0 and playing obviously bad moves, unless the problem is simply conditions that they did not test (for example, if they test only with ponder off and somebody plays with ponder on).
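
To put a number on "statistics are enough", here is a small Python sketch. It assumes the games are independent and that the engine has one fixed per-game probability of avoiding a loss, which is a simplification. Under that assumption, after 0 non-losses in n games the exact one-sided upper confidence bound on that probability is 1 - alpha^(1/n). The function name zero_success_upper_bound is hypothetical.

```python
def zero_success_upper_bound(games, confidence=0.95):
    """Upper confidence bound on a per-game probability after 0 successes.

    Assumes independent games with a constant success probability p.
    Seeing 0 successes in `games` trials has probability (1 - p) ** games,
    so the bound is the p at which that probability drops to 1 - confidence.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / games)


# One development version: 19 games against Lc0, 19 losses.
bound = zero_success_upper_bound(19)
print(f"95% upper bound on P(not losing a game): {bound:.3f}")
```

For 19 straight losses the bound comes out at roughly 0.15, so even this short run is very hard to square with an engine that would normally draw or win most of these games.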

Ghppn likes this post

Peter Berger
Posts : 120
Join date : 2020-11-20
Fri Mar 31, 2023 8:58 pm

There is something seriously broken in current Stockfish when it comes to games against Crafty:

[Event "Lang 120min+10sek"]
[Site "Berlin"]
[Date "2023.03.31"]
[Round "?"]
[White "Crafty 25.6"]
[Black "Stockfish dev-20230329-3f01e3f"]
[Result "1/2-1/2"]
[ECO "C67"]
[PlyCount "55"]
[TimeControl "7200+10"]

{4096MB, LAPTOP-NCDN8BTK} 1. e4 {[%emt 0:00:00]} e5 {[%eval 20,44] [%emt 0:02:
54]} 2. Nf3 {[%emt 0:00:08]} Nc6 {[%eval 19,41] [%emt 0:01:36]} 3. Bb5 {
[%emt 0:00:07] (Lc4)} Nf6 {[%eval 17,45] [%emt 0:02:03]} 4. O-O {[%emt 0:00:08]
} Nxe4 {[%eval 15,44] [%emt 0:01:44]} 5. d4 {[%emt 0:00:08] (Te1)} Nd6 {
[%eval 12,45] [%emt 0:02:02]} 6. Bxc6 {[%emt 0:00:08]} dxc6 {[%eval 15,41]
[%emt 0:02:38]} 7. dxe5 {[%emt 0:00:07]} Nf5 {[%eval 9,43] [%emt 0:02:42]} 8.
Qxd8+ {[%emt 0:00:10]} Kxd8 {[%eval 11,33] [%emt 0:00:01]} 9. h3 {[%emt 0:00:
11]} Be7 {[%eval 5,45] [%emt 0:01:57]} 10. Nc3 {[%emt 0:05:34]} Nh4 {[%eval 12,
46] [%emt 0:00:01]} 11. Rd1+ {[%emt 0:06:37] (Sxh4)} Ke8 {[%eval 3,46] [%emt 0:
02:04]} 12. Nxh4 {[%emt 0:00:10]} Bxh4 {[%eval 2,48] [%emt 0:02:19]} 13. g4 {
[%emt 0:00:10]} h5 {[%eval 2,47] [%emt 0:02:24]} 14. f3 {[%emt 0:02:49]} f5 {
[%eval 3,50] [%emt 0:00:51]} 15. exf6 {[%emt 0:05:33]} gxf6 {[%eval 2,53]
[%emt 0:00:00]} 16. Bf4 {[%emt 0:06:07]} hxg4 {[%eval 2,53] [%emt 0:00:01]} 17.
hxg4 {[%emt 0:05:01]} Be6 {[%eval 1,54] [%emt 0:00:01]} 18. Ne2 {[%emt 0:08:31]
} Kf7 {[%eval 1,58] [%emt 0:00:01]} 19. Nd4 {[%emt 0:02:43]} Bd7 {[%eval 1,57]
[%emt 0:00:57]} 20. Bxc7 {[%emt 0:00:58]} Rag8 {[%eval 1,58] [%emt 0:01:29]}
21. Kf1 {[%emt 0:01:13]} Re8 {[%eval 1,58] [%emt 0:01:00]} 22. c3 {[%emt 0:04:
06] (Se2)} f5 {[%eval 0,53] [%emt 0:02:18]} 23. Nxf5 {[%emt 0:00:09]} Bxf5 {
[%eval 0,66] [%emt 0:03:06]} 24. gxf5 {[%emt 0:00:10]} Kf6 {[%eval 0,71] [%emt
0:02:09]} 25. Bh2 {[%emt 0:00:55]} Bf2 {[%eval 0,65] [%emt 0:01:06]} 26. Kxf2 {
[%emt 0:00:27]} Rxh2+ {[%eval 0,62] [%emt 0:01:37]} 27. Kg3 {[%emt 0:00:13]}
Reh8 {[%eval 0,65] [%emt 0:01:54]} 28. Rd7 {[%emt 0:00:00]} 1/2-1/2

Again a draw out of a position of maximal weakness (this time two pawns down). I have played this kind of game with ancient versions of Stockfish against this exact same version of Crafty, and Crafty got a draw about once every 100 games, and it usually didn't even come close. Against the current versions it is 1.0-2.0 now. OK, I am happy not to be a Stockfish developer or tester, but I am competent enough in this business to be sure that there is some problem here.

Ghppn likes this post

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Fri Mar 31, 2023 9:03 pm

Peter Berger wrote:
There is something seriously broken in current Stockfish when it comes to games against Crafty:

[...]

Again a draw out of a position of maximal weakness (this time two pawns down). I have played this kind of game with ancient versions of Stockfish against this exact same version of Crafty, and Crafty got a draw about once every 100 games, and it usually didn't even come close. Against the current versions it is 1.0-2.0 now. OK, I am happy not to be a Stockfish developer or tester, but I am competent enough in this business to be sure that there is some problem here.

The issue I had, as shown in my games, is now gone with the latest Stockfish DEV versions.

Nezhman likes this post

Peter Berger
Posts : 120
Join date : 2020-11-20
Fri Mar 31, 2023 9:07 pm

mwyoung wrote:
The issue I had, as shown in my games, is now gone with the latest Stockfish DEV versions.

I wanted to post this but didn't. There is no sign of the bug you pointed out, but the draw problem seems pretty relevant (it is not as if Stockfish could afford that many draws against the likes of Crafty :D).

matejst, Nezhman and Ghppn like this post

Theresa May
Posts : 12
Join date : 2020-11-27
Sun Apr 30, 2023 5:25 pm

Dragon development is also broken. There is no improvement in Elo from Dragon 3 to Dragon 3.2 if we use Mark Young's methodology to test them against each other. No wonder Larry Kaufman and Mark Lefler gave up on Dragon.

Damir Desevac
Posts : 316
Join date : 2020-11-27
Age : 43
Location : Denmark
Sun Apr 30, 2023 6:11 pm

mwyoung wrote:
[...]

I first detected the problem with Stockfish Dev 20230314, but I did not want to report the issue until we had another development version of Stockfish to test. The new version of Stockfish was released today and is still broken.

I am not sure how this error slipped past Fishtest.

Fake News. SF just won TCEC, where it beat Lc0 in the final... You have to keep on dreaming if you wish Lc0 to beat SF, lolll

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Fri Jun 16, 2023 10:34 am

Damir Desevac wrote:
Fake News. SF just won TCEC, where it beat Lc0 in the final... You have to keep on dreaming if you wish Lc0 to beat SF, lolll

lol!
[Image: Ssdf10]

mwyoung
Posts : 880
Join date : 2020-11-25
Location : USA
Fri Jun 16, 2023 8:08 pm

So much for the theory that Stockfish is not improving because it already plays perfect chess.

Stockfish is 40 Elo below Lc0 on SSDF.

:shock:

Mclane likes this post

Nezhman
Posts : 74
Join date : 2020-11-27
Sat Jun 17, 2023 3:59 am

Peter Berger wrote:
mwyoung wrote:
The issue I had, as shown in my games, is now gone with the latest Stockfish DEV versions.

I wanted to post this but didn't. There is no sign of the bug you pointed out, but the draw problem seems pretty relevant (it is not as if Stockfish could afford that many draws against the likes of Crafty :D).

This is why proper testing should include more lopsided matchups. If Stockfish can't beat Crafty and other much weaker engines often enough, it should pay a price in Elo.

The biggest problem with SF-NNUE is that it has a certain unhealthy preference for a safety-first approach, as if it were playing not to lose instead of going for the win and taking reasonable risks to do so.
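
The "price in Elo" for a draw in a lopsided matchup can be sketched with the standard logistic expected-score formula. This is a rough Python illustration only: the 800 Elo gap is a made-up example value, not a measured rating difference between Stockfish and Crafty, and expected_score and draw_shortfall are hypothetical helper names.

```python
def expected_score(elo_gap):
    """Expected score of the stronger side for a given Elo gap (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


def draw_shortfall(elo_gap):
    """How far a draw (0.5 points) falls below the stronger side's expected score."""
    return expected_score(elo_gap) - 0.5


# Hypothetical gap of 800 Elo, chosen only for illustration.
gap = 800
print(f"expected score: {expected_score(gap):.3f}")
print(f"shortfall per draw: {draw_shortfall(gap):.3f}")
```

The larger the rating gap, the closer the expected score gets to 1, so each draw costs the favourite nearly half a point against expectation. That is why lopsided pairings would penalize a safety-first style more than matches between near-equal engines.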