Subject: Re: Rating List Experiment Thu Jan 04, 2024 8:31 pm
Personally, I don't think very much of this new statistic, which only serves to measure any existing Elo progress in current Stockfish versions. This is just my personal opinion.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Rating List Experiment Thu Jan 04, 2024 11:33 pm
(classical Armageddon-Chess):
Win for white = 1 point for white
Draw = 1 point for black
Win for black = 1 point for black
(advanced Armageddon-Chess):
Win for white = 1 point for white
Draw = 1 point for black
Win for black = 2 points for black (!!!)
(especially useful/interesting, of course, when using white-biased unbalanced openings like my UHO-openings)
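The two scoring schemes above can be sketched as small lookup tables. This is only an illustration: the names `CLASSICAL`, `ADVANCED` and `armageddon_totals` are made up for this sketch, and the sample results are hypothetical, not taken from any rating list.

```python
# Points per game result under the two schemes listed above.
# Keys are results from White's point of view; values are (white_points, black_points).
CLASSICAL = {"1-0": (1, 0), "1/2-1/2": (0, 1), "0-1": (0, 1)}
ADVANCED = {"1-0": (1, 0), "1/2-1/2": (0, 1), "0-1": (0, 2)}


def armageddon_totals(results, scheme):
    """Sum (white_points, black_points) over a list of result strings."""
    white = sum(scheme[r][0] for r in results)
    black = sum(scheme[r][1] for r in results)
    return white, black


# Hypothetical sample of four games, for illustration only.
sample = ["1-0", "1-0", "1/2-1/2", "0-1"]
print(armageddon_totals(sample, CLASSICAL))  # (2, 2)
print(armageddon_totals(sample, ADVANCED))   # (2, 3)
```

The only difference between the two tables is the reward for a black win, so the same set of games can end level under the classical scheme and in black's favour under the advanced one.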
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: Rating List Experiment Mon Jan 08, 2024 2:10 pm
Interesting.....
Uri Blass
Posts : 207 Join date : 2020-11-28
Subject: Re: Rating List Experiment Tue Jan 23, 2024 11:08 pm
A difference of only 6 or 12 Elo cannot explain SF winning TCEC the last 7 times in a row. The difference must be much higher.
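For a sense of scale, here is a rough sketch of what 6 or 12 Elo means per game, assuming the standard logistic Elo formula. That model is an assumption here: individual rating lists may compute things differently, and results from biased openings do not map onto it directly. The function name `elo_expected_score` is made up for the sketch.

```python
def elo_expected_score(elo_diff):
    """Expected score per game for the side rated elo_diff points higher,
    using the standard logistic Elo formula (an assumed model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


for diff in (6, 12):
    print(diff, round(elo_expected_score(diff), 4))
# 6 0.5086
# 12 0.5173
```

Under this model a 6 to 12 Elo edge is worth less than two extra points per hundred games, which is the kind of margin the post is questioning.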
I know that playing from balanced positions plays a role in this matter.
A rating list with a balanced book is different from a rating list with an unbalanced book.
It is even possible that a different engine is best under each option, so comparing CEGT with TCEC is comparing apples with oranges.
Stockfish team does not care about being best with a balanced book, so hopefully we will see some engine with a higher rating in CEGT or CCRL because it is better at beating weaker chess engines while drawing 100% of its games against Stockfish.
pohl4711
Posts : 159 Join date : 2022-03-01 Location : Berlin
Subject: Re: Rating List Experiment Wed Jan 24, 2024 6:27 am
Uri Blass wrote:
Stockfish team does not care about being best with a balanced book
Of course not. Why should they? All engine tournaments with top engines are played with my UHO-openings (chesscom engine-tournament site) or similar biased openings (TCEC). Balanced openings are useless for high-end computer chess these days. And in the future, balanced openings will be useless for computer chess altogether. Just a question of time.
Thankfully, in contrast to you, the Stockfish-team looks forward into the future, not back into the past.
Uri Blass
Posts : 207 Join date : 2020-11-28
Subject: Re: Rating List Experiment Wed Jan 24, 2024 11:00 am
pohl4711 wrote:
Uri Blass wrote:
Stockfish team does not care about being best with a balanced book
Of course not. Why should they? All engine tournaments with top engines are played with my UHO-openings (chesscom engine-tournament site) or similar biased openings (TCEC). Balanced openings are useless for high-end computer chess these days. And in the future, balanced openings will be useless for computer chess altogether. Just a question of time.
Thankfully, in contrast to you, the Stockfish-team looks forward into the future, not back into the past.
When I use an engine to analyze my chess games, I want it to help me understand the best move in the equal positions that occur in my games too, meaning the move that gives me the best chance to win against weaker opponents (otherwise I can play games with no tactical mistakes and learn almost nothing from them).
You can argue that the move that best increases winning chances against humans is not the move that best beats weaker engines, but optimizing for beating weaker engines is at least better than ignoring weaker opponents altogether.
Note that my last OTB tournament game had no tactical blunders: Stockfish's evaluation never left the draw zone. There were some inaccuracies, but the main reason for the draw was not that we are very strong chess players but that both players were more afraid of losing than eager to win.
I started with 1.e3 to surprise my opponent. It seems my opponent was also afraid of opening preparation, so the game started 1.e3 Nf6 2.Nf3 e6 (in any case, 1.e3 is not a bad move, and it is not the reason I failed to win this game).
During the game I regretted not playing 11.Ne4, and I understood that 36.Nc4 was bad after 36...Nxc4 37.Rxc4 a5, but these moves do not take the score out of the draw zone.
I guess that engine analysis will claim both sides played at GM level based on average blunder, but I think average blunder is a bad way to evaluate the level of players.
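A minimal sketch of why an averaged metric can flatter cautious play, assuming "average blunder" means the mean evaluation loss per move in centipawns. All numbers below are invented for illustration and are not taken from the game described above; `average_loss` is a made-up helper name.

```python
def average_loss(losses_cp):
    """Mean evaluation loss per move, in centipawns."""
    return sum(losses_cp) / len(losses_cp)


# Hypothetical per-move losses (centipawns), for illustration only.
quiet_game = [5, 10, 0, 15, 10, 5, 0, 10]  # cautious play, nothing risked
sharp_game = [0, 0, 120, 0, 5, 0, 90, 0]   # complications, a few real errors

print(average_loss(quiet_game))  # 6.875
print(average_loss(sharp_game))  # 26.875
```

The quiet game scores far better on the average even though nothing in it shows that the players could handle a difficult position, so the number says as much about the kind of game as about the players' level.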
Peter Berger
Posts : 131 Join date : 2020-11-20
Subject: Re: Rating List Experiment Thu Jan 25, 2024 8:24 pm
We probably have to look at the game of chess a bit differently IMHO. I am not a programmer, so please read with a friendly eye. We are used to thinking of chess as a zero-sum game with perfect information. But in reality, a game like NLH Poker (where I certainly reached a +way+ higher playing level than in chess) might be a more interesting general model. And even there, computers rule, so this is clearly doable for a computer program. I don't want to bore anyone with poker theory, but translated to chess and greatly simplified, you may be interested to know the expected value of a chess move against a given opponent, including the concept of bluffs. It is my impression that this is the way top-level human chess is currently heading. A player like Alireza Firouzja is taking huge risks (currently in bad form, but still feeling special). We don't want to play a move that risks something like losing a piece or being mated, but we are ready to take a risk. A program that worked somehow like this would be useful for you, Uri, as long as the move is good enough not to be punished more often than is bearable at a given level.
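The "expected value of a move against a given opponent" idea can be sketched roughly as follows. This is a toy model, not how any existing engine works: the function names and every probability are hypothetical, and the chance that the opponent finds the refutation is simply assumed as an input.

```python
def expected_points(p_win, p_draw):
    """Expected game score with win = 1, draw = 0.5, loss = 0."""
    return p_win * 1.0 + p_draw * 0.5


def risky_move_value(p_refute, punished, unpunished):
    """Expected score of a risky move, weighted by the chance that this
    particular opponent finds the refutation.
    punished / unpunished are (p_win, p_draw) pairs for the two cases."""
    return (p_refute * expected_points(*punished)
            + (1.0 - p_refute) * expected_points(*unpunished))


# Hypothetical numbers, for illustration only.
safe = expected_points(0.10, 0.85)   # solid move: mostly draws
punished = (0.05, 0.35)              # outcome odds if the risk is refuted
unpunished = (0.60, 0.35)            # outcome odds if it is not

print(round(safe, 3))                                         # 0.525
print(round(risky_move_value(0.2, punished, unpunished), 3))  # 0.665
print(round(risky_move_value(0.8, punished, unpunished), 3))  # 0.335
```

With these made-up inputs the risky move is the better choice against an opponent who finds the refutation 20% of the time and the worse choice against one who finds it 80% of the time, which is the "not punished more often than bearable at a given level" condition in numbers.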