Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Sun Dec 19, 2021 12:57 am
matejst wrote:
Admin wrote:
That's a 20Mb net (better knowledge); it doesn't work in Marvin.
I believe it would not be a problem. Several engines use 20Mb nets. Of course, if you think that it is not ready for testing yet, feel free to decline.
Well, there is *.nnue format and *.nnue format; the Benjamin 20Mb nnue edition currently doesn't work with any public engine. This will be fixed eventually.
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Sun Dec 19, 2021 2:14 am
No problems, of course. Just take your time. I am glad you found energy to revive the Rebel project and that it goes so well.
Then, this net is already much better than I expected and I enjoy using it a lot. BTW, I tested it today in several fixed-depth matches against Marvin. While the results were inconclusive from depth 12 (the matches were too short), at depth 10 the difference was huge in Benjamin's favour, and I was thinking about running a much longer match during the night to confirm it (it does not take long anyway). For me, if the results stand (7/3 ratio so far in Benjamin's favour) it would confirm that its eval is basically sound -- but could it be that its evaluation is better suited to shallower depths? Obviously, I don't know much about the relationship between evaluation and depth.
Edit: And... no. At ply 10 the results are just like poker. Nothing relevant there. From depth 12 the results make sense, usually small margin wins by Benjamin in matches.
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Wed Dec 29, 2021 10:28 pm
After several days of daily use of the new Benjamin net for all my needs (often alongside another engine), I can confirm my first impressions. Its evaluation in the middlegame is excellent -- both in dynamic and simple positions. Sometimes it shows a preference for sacrifices, but for analysis that is not bad, quite the opposite. I guess overall that its comparatively weaker results can be easily improved by well-thought-out supervised training -- more simple positions and endings are necessary. It is simply often too optimistic in endings, and probably lacks knowledge.
The evaluation should be scaled down. It is sometimes way unrealistic. The proposed moves were good -- SF chose the same -- but way too often the eval was +4, even +5 when Benjamin had only a positional advantage and the initiative.
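One simple reading of "scaled down" is a linear damping of the net's centipawn output, with a cap on extreme scores. The sketch below is only an illustration: the function name, the factor of 0.5 and the cap of 1000 cp are assumptions, not values taken from Benjamin or any engine mentioned in this thread.

```python
def scale_eval(cp: int, factor: float = 0.5, cap: int = 1000) -> int:
    """Damp a centipawn score linearly and clamp it to [-cap, cap].

    `factor` and `cap` are hypothetical values chosen for illustration only.
    """
    scaled = round(cp * factor)
    return max(-cap, min(cap, scaled))


# A "+4" (400 cp) score for a purely positional advantage would be shown as +2.
print(scale_eval(400))
```

A linear scale keeps the ordering of scores, so the displayed numbers become more realistic without changing which position the evaluation prefers. Inside a real search, pruning margins tuned to the old scale would probably need retuning as well.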
It seems that the positional knowledge in ProDeo was even better than I thought. In retrospect, I believe that it was simply not adapted to the improvements in search, which seems to be a problem encountered by many (John Stanback had this problem with Wasp) and makes the engine look silly at times. I had to reinstall ProDeo 1.0 to remind myself how coherent its evaluation was.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Thu Dec 30, 2021 2:04 am
I am almost ready to create a new net, now at depth=7; the first net was based on depth=6 evaluation. Then the endgame. It's a weak point, but with NNUE it can probably be given a booster injection.
matejst likes this post
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Thu Dec 30, 2021 9:01 am
I remember when you tested Pro Deo against SF at lower depths -- it achieved very good results until depth 8. It could well be that older engines give a better evaluation at lower depths than modern ones, and that it is not a handicap for net training.
In general, Benjamin overestimates initiative and underestimates positional advantages (bishop pair, bishop vs knight in open positions, etc.), and sometimes even material advantages. In endings, I saw Benjamin make unsound pawn sacrifices for a waning initiative, and spoil clearly better, even winning positions achieved in the opening and middlegame. Wasp is also weak in endings, so I plan to compare Benjamin -- now that I am analyzing whole games -- with Slow Chess, which is much better in simple positions. [In the opening, otoh, Benjamin's eval is outstanding. At the same depths, probably as good as SF's.]
I don't know how you can train the net. In general, I think that several small nets are better than a huge one, like Jonathan Kreuzer did in SCB, and that a kind of supervised learning is better. The problem is how to select positions (number of pieces?) and how to balance learning. I have no suggestions, no ideas. When I see that in Frank's FCP tournament the difference between SF and SlowChessBlitz is only 100 Elos while SF outsearches SCB easily by 10 plies in any position and at any time control, one can only wonder if huge nets are needed. I would not be surprised if Dragon's evaluation was in general better than SF's (I don't have Dragon, so I can't say), and if Dietrich Kappe alone had done a better job than the whole SF team.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Thu Dec 30, 2021 12:56 pm
It's true, as already stated earlier, Benjamin is a bit too wild; perhaps I should have used the more balanced ProDeo 3.1.
Or even better, a much earlier version, for instance ProDeo 1.2 (2004/5), which doesn't do all the search pruning (good for Elo but bad for the quality of the evaluation). The problem with that is that the run time to create a net would increase by a factor of 3-5, meaning from 3 weeks to ~3 months.
Probably I will do something in between: throw out some search stuff of ProDeo 3.1 to keep the current evaluation instead of using the evaluation of ProDeo 1.2.
Time will tell, there is a whole new area to explore.
matejst likes this post
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Mon Jan 03, 2022 6:10 pm
Ed,
Since you have much more computing power than I have, please do test Benjamin at a bit longer TCs. There seems to be a wide difference in evaluation somewhere from depth 24/5, when Benjamin's game becomes markedly less speculative. It still values initiative, its centipawn evaluation is still wild, but it is much safer.
I compared its evaluation with that of Wasp 5.20, of a hybrid -- Ethereal 12.75 NNUE with a good SF net -- and of Komodo. While Benjamin is overoptimistic at depths 14-20, it switches to an "active positional" type of play beyond these depths. At depth 25, e.g., I find its eval as good as Komodo's.
I can only run short tests on my laptop; nonetheless, I think that Benjamin's risky game costs it a lot of Elos at lower depths/shorter time controls. Anyway, if you develop an engine for this and similar nets, be sure to have an adequate search, since I feel that Marvin's is not suited to Benjamin.
Admin likes this post
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Mon Jan 03, 2022 6:34 pm
Boban, you noticed well. When I test a net I play 5000 games at 40/10 (0.25s average) and 5000 games at 40/40 (one second average), and it's a fixed pattern that the 40/40 results are considerably better.
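For readers unfamiliar with the notation, the averages quoted above follow from reading "40/10" as 40 moves in 10 seconds and "40/40" as 40 moves in 40 seconds. A minimal sketch of that arithmetic (the function name is just illustrative):

```python
def seconds_per_move(moves: int, seconds: float) -> float:
    """Average thinking time per move for a 'moves/seconds' time control."""
    return seconds / moves


print(seconds_per_move(40, 10))  # 40/10 time control
print(seconds_per_move(40, 40))  # 40/40 time control
```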
matejst likes this post
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Mon Jan 03, 2022 9:19 pm
I also had a look at Rebel Century, Pro Deo 1.0 and Pro Deo 3.1. After a period when Pro Deo's evaluation was all over the place, you did a good job with this last version, and imho using Pro Deo for developing the net could be a good option. [In general, in positions which I have already analyzed and where both Pro Deo versions find the best move (or an equally good move), Pro Deo 1.0 does it at depth 10-11 and Pro Deo 3.1 at depth 14-15 -- but in about the same time; then, in others, the greater depth achieved by Pro Deo 3.1 is a huge advantage. Generally speaking, the continuity is obvious, to the point that I sometimes think that the evaluation function is absolutely the same.]
It could help scale down the evaluation, which is overoptimistic and overemphasizes initiative and activity. I saw that SF developers made a selection of the Leela data they got by choosing some positions with more pieces and others with fewer pieces to improve play in simple positions -- your proposal to judiciously mix Benjamin's and Pro Deo's data could be a clever choice. [TBH, personally I would use a mixture of Benjamin data for complex, and Leela data for simple positions.]
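To make the idea of mixing data by position complexity concrete, here is a minimal sketch that splits training positions by piece count. Everything in it is an assumption for illustration: the function names, the use of FEN strings as the data format, and the threshold of 16 pieces. It does not describe how the SF developers or Ed actually select positions.

```python
def piece_count(fen: str) -> int:
    """Count all pieces (kings and pawns included) in the board field of a FEN."""
    board = fen.split()[0]
    return sum(ch.isalpha() for ch in board)


def mix_by_complexity(complex_src, simple_src, threshold=16):
    """Keep positions with more than `threshold` pieces from `complex_src`
    and positions with `threshold` pieces or fewer from `simple_src`.

    `threshold` is a hypothetical cut-off, not a tested value.
    """
    mixed = [fen for fen in complex_src if piece_count(fen) > threshold]
    mixed += [fen for fen in simple_src if piece_count(fen) <= threshold]
    return mixed


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
pawn_ending = "8/8/8/4k3/8/4P3/4K3/8 w - - 0 1"
print(piece_count(start), piece_count(pawn_ending))
```

In practice the two sources would also carry evaluations on different scales, so some normalization would probably be needed before training on the mix.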
While not a programmer, I am also aware that engines used a lot of search extensions with HCE [I don't know if it is possible with NNs, though, and I believe that many engines simply prune too much]. I feel it would be imperative for Benjamin/Pro Deo NN to check its tactical choices more deeply. I don't know how you plan to develop Benjamin further (it would be a pity if you stopped now), but what is certain is that Pro Deo's search is too slow. You mentioned adapting the search of another open source engine, and I was thinking that OliThink could be a good choice -- fast, simple, WB, and Oliver Brausch could gladly give his blessing. It could be a good starting point.
But let's not forget that chess programming is now just your hobby -- and while it's true that I impatiently await a new version of Pro Deo (or Benjamin) -- I don't want you to feel any pressure from my side. Anyway, this version of the Benjamin net is very usable and is already my go-to engine.
TheSelfImprover likes this post
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Mon Jan 03, 2022 10:05 pm
matejst wrote:
I also had a look at Rebel Century, Pro Deo 1.0 and Pro Deo 3.1. After a period when Pro Deo's evaluation was all over the place, you did a good job with this last version, and imho using Pro Deo for developing the net could be a good option. [In general, in positions which I have already analyzed and where both Pro Deo versions find the best move (or an equally good move), Pro Deo 1.0 does it at depth 10-11 and Pro Deo 3.1 at depth 14-15 -- but in about the same time; then, in others, the greater depth achieved by Pro Deo 3.1 is a huge advantage. Generally speaking, the continuity is obvious, to the point that I sometimes think that the evaluation function is absolutely the same.]
Funny you come to that conclusion, I don't like the playing style of ProDeo 3.1 at all (in comparison with the last DOS Rebel or ProDeo 1.2) although the PESTO tables helped somewhat.
matejst wrote:
It could help scale down the evaluation, which is overoptimistic and overemphasizes initiative and activity. I saw that SF developers made a selection of the Leela data they got by choosing some positions with more pieces and others with fewer pieces to improve play in simple positions -- your proposal to judiciously mix Benjamin's and Pro Deo's data could be a clever choice. [TBH, personally I would use a mixture of Benjamin data for complex, and Leela data for simple positions.]
My idea is to use the evaluation of ProDeo 3.1 (with less pruning) because it is more balanced than Benjamin, so I expect a less wild net. It's my -- perhaps premature -- conclusion that the created NNUEs magnify the evaluation; otherwise I cannot explain the score differences between Benjamin 1.1 and the current net you have.
matejst wrote:
While not a programmer, I am also aware that engines used a lot of search extensions with HCE [I don't know if it is possible with NNs, though, and I believe that many engines simply prune too much]. I feel it would be imperative for Benjamin/Pro Deo NN to check its tactical choices more deeply. I don't know how you plan to develop Benjamin further (it would be a pity if you stopped now), but what is certain is that Pro Deo's search is too slow. You mentioned adapting the search of another open source engine, and I was thinking that OliThink could be a good choice -- fast, simple, WB, and Oliver Brausch could gladly give his blessing. It could be a good starting point.
I will have a look at the source code of Olithink, but in principle I want to use an engine that is no longer developed. One I notably had in mind was Rodent, since Pavel said Rodent IV was the last one. Until about 10 days ago.
matejst wrote:
But let's not forget that chess programming is now just your hobby -- and while it's true that I impatiently await a new version of Pro Deo (or Benjamin) -- I don't want you to feel any pressure from my side. Anyway, this version of the Benjamin net is very usable and is already my go-to engine.
Thanks for all compliments.
Another compliment is in order: I could not have done this without my friend Chris. He provided me with the right tools and answered all my questions; he deserves huge credit.
Mclane, TheSelfImprover and matejst like this post
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Mon Jan 03, 2022 11:14 pm
Admin wrote:
Funny you come to that conclusion, I don't like the playing style of ProDeo 3.1 at all (in comparison with the last DOS Rebel or ProDeo 1.2) although the PESTO tables helped somewhat.
Sorry I could not answer right away, Ed.
I compared Pro Deo 1.0 and Pro Deo 3.1 side by side, analyzing the same positions. I too did not like Pro Deo's evaluation since it started pruning more, and I was surprised to find that after, let's say, 1 or 2 min, the moves proposed by Pro Deo 1.0 and Pro Deo 3.1 were identical (there was only one different move, in a more tactical position). The centipawn evaluation is different though. I will do a more thorough comparison after a short match between Benjamin and Wasp at longer TC.
TheSelfImprover likes this post
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Mon Jan 03, 2022 11:39 pm
Admin wrote:
I will have a look at the source code of Olithink, but in principle I want to use an engine that is no longer developed. One I notably had in mind was Rodent, since Pavel said Rodent IV was the last one. Until about 10 days ago.
I thought about OliThink because it has practically no evaluation at all, and it is a WB engine.
There are several that are "unofficially" no longer developed: Xiphos, Fizbo, Hakkapeliitta are the first that come to mind. Recently, Amanj stated that he will not develop Zahak any more.
Anyway, there are plenty of engines that are open source, under the GPL. The code can be reused freely under that same license.
Fizbo and GreKo are unlicensed.
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Mon Jan 03, 2022 11:51 pm
Eventually... We wrote a lot about the search, and Benjamin's sacrificial moves. But...
Benjamin had a completely winning position (SF gives +17 after 39...Qa7) and simply missed a mate in 7.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Tue Jan 04, 2022 8:51 pm
matejst wrote:
Admin wrote:
I will have a look at the source code of Olithink, but in principle I want to use an engine that is no longer developed. One I notably had in mind was Rodent, since Pavel said Rodent IV was the last one. Until about 10 days ago.
I thought about OliThink because it has practically no evaluation at all, and it is a WB engine.
There are several that are "unofficially" no longer developed: Xiphos, Fizbo, Hakkapeliitta are the first that come to mind. Recently, Amanj stated that he will not develop Zahak any more.
Anyway, there are plenty of engines that are open source, under the GPL. The code can be reused freely under that same license.
Fizbo and GreKo are unlicensed.
Fizbo is indeed interesting but too strong; for the moment I have chosen "growing fruit", which is the classic Fruit 2.1 from 2004 with search improvements by Pavel. Growing_Fruit is estimated at ~2800 Elo, about the same strength as ProDeo. First test results show a 350 Elo improvement with a net generated from SF data. Next steps: switch to the Benjamin net, then switch to a larger (20Mb) net; the latter will give another Elo boost.
matejst likes this post
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Wed Jan 05, 2022 3:04 am
After comparing the first Benjamin net with smaller nets in other engines, I am sure that it is stronger than most of them in the middlegame. There seems to be a minimal NN size for a good evaluation, and with Orion 0.8 in mind, I think it is about 10MB. I am not sure a bigger net will give a big Elo boost: it could be more judicious to create a bigger net on Benjamin NN's data, but I guess that's what you meant.
But more important: I compared Benjamin today with Slowchess. The search _seems_ to be about the same speed, but I thought that Benjamin's choice of moves in the middlegame was generally better -- it was certainly more in sync with opening theory and with Fischer's choices when I analyzed Fischer's games, or with SF's choices when I rechecked with SF. Nonetheless, Slowchess is 200+ Elos stronger than Marvin and probably than Benjamin too, and from what I have seen this is related to the level of its play in the late middlegame and endgame. Imho a form of supervised training might be imperative to achieve a high quality of evaluation in all phases of the game.
Anyway, if there is a way I can help within the limits of my abilities, it will be my pleasure.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Thu Jan 06, 2022 1:31 pm
What's a good name for the thing?
So far I have -
Fruity
JBF (Juicy Benjamin Fruit)
Fruit-TNG (The Next Generation)
Frebel
None of them I really like.
Suggestions?
TheSelfImprover
Posts : 3112 Join date : 2020-11-18
Subject: Re: My first NNUE Thu Jan 06, 2022 2:58 pm
Admin wrote:
What's a good name for the thing?
I'm the best ideas man here! Here goes...
Fearless Square Zero Bossy Thoughtless All Seeing Eye Water Quick Thought Chess Calculator Panda Cat Well Stocked Kitchen Gang Master (play on "grandmaster") Deus Ex Machina Solstice Float Down The Stream Chocolate Fountain Secret Weapon Secret Ingredient Soup (from "Kung Fu Panda") The Spanker Anti Gravity Moon Moon Child Upper Chess Chess Up! Fifth Dimension (name of an old pop group) Ignition Plan Force Terror Bird KDread Scarlet Smirk Slicer FireStorm Forest Fire Final Chess (the word "ultimate" is overused) Insight Motor Chess Zygalski Sheets (link) Overlord (the code name for D-Day in WWII) Magician Sorcerer Witchcraft Luxury Coin ("gold" overused) Kingy McKingface (after "Parsey McParseface") Capture Cool Riot Urbane
You are welcome!
Admin likes this post
TheSelfImprover
Posts : 3112 Join date : 2020-11-18
Subject: Re: My first NNUE Thu Jan 06, 2022 3:16 pm
I had forgotten - K Dread is the name of a Reggae artist. Easily done!
I'll offer "Dreadnought" (a type of battleship), or even just "Battleship" in its place.
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Thu Jan 06, 2022 4:17 pm
Rebel NG. It is more Rebel than Fruit. And I like the Rebel brand.
Then, Corsair perhaps.
Admin likes this post
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Thu Jan 06, 2022 5:28 pm
Or just Rebel 14.
Not out yet.
matejst and Damir Desevac like this post
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Fri Jan 07, 2022 2:23 pm
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Fri Jan 07, 2022 2:50 pm
Solid results so far. I presume that the search is a bit slower than Marvin's, since, to be honest, I expected about 30 Elos more, based on the experimental version I have.
Admin Admin
Posts : 2608 Join date : 2020-11-17 Location : Netherlands
Subject: Re: My first NNUE Fri Jan 07, 2022 3:13 pm
matejst wrote:
Solid results so far. I presume that the search is a bit slower than Marvin's, since, to be honest, I expected about 30 Elos more, based on the experimental version I have.
Marvin (of course) has a better search than Fruit.
And from the result (so far) you can see that a Benjamin net is almost as strong as a net generated from SF data. SF net got 58.6% after 200 games.
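For reference, a match score can be turned into an approximate Elo difference with the standard logistic formula. The sketch below assumes that formula and nothing else; the function name is illustrative. With it, the 58.6% quoted above comes out at roughly 60 Elo, though the error bars on a 200-game sample are wide.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a match score between 0 and 1 (exclusive),
    using the standard logistic model: diff = 400 * log10(s / (1 - s))."""
    return 400 * math.log10(score / (1 - score))


print(round(elo_diff(0.586), 1))
```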
matejst likes this post
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Fri Jan 07, 2022 3:56 pm
The net I have is very strong, despite an optimistic evaluation. I had no doubt that the new one would be even stronger.
matejst
Posts : 612 Join date : 2020-11-26
Subject: Re: My first NNUE Fri Jan 07, 2022 5:09 pm
To have a reference: Dietrich Kappe experimented with Toga and published Dark Toga -- an improved Toga bundled with the White Rose NNUE. It is rated 3152 on the CCRL pure single CPU 40/15 rating list. I think that the Benjamin net could well be stronger than the WR. Also: on that same list, Winter is 3130, Counter 3085, Clover 3130, Combusken 3120, Marvin 3105, which gives an average of about 3115. From my experience, the Benjamin NN scales well and is a bit stronger at longer TC. I presume it should already be somewhere between 3150 and 3200 Elos at CCRL TCs.
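The average quoted above can be checked directly from the five ratings listed in this post; the dictionary below simply restates those numbers.

```python
# Ratings as quoted in this post (CCRL single-CPU 40/15 list).
ratings = {
    "Winter": 3130,
    "Counter": 3085,
    "Clover": 3130,
    "Combusken": 3120,
    "Marvin": 3105,
}

average = sum(ratings.values()) / len(ratings)
print(average)
```

The exact mean is 3114, so the rounded figure in the text holds.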
Ed, I don't know your plans, but a bit of work on the search could give another 100 Elos in a short time. I hope you will not stop here.