ProDeo
Computer Chess
 need a tool to generate the most common openings out of lichess or pgn

Uri Blass




Posts : 207
Join date : 2020-11-28

PostSubject: need a tool to generate the most common openings out of lichess or pgn   Mon Oct 30, 2023 9:42 am

Below is an example of an Excel file of the most common opening positions on Lichess. I generated it manually from the Lichess data, and it lists the positions that occurred at least 49,000,000 times.

I would like a tool that does this automatically for any PGN and also for Lichess, which, as far as I know, does not offer a way to download a PGN of all games or of part of them.

For lichess I did it manually by going to the following link

https://lichess.org/analysis

I selected all time controls and all average ratings, with games until September 2023 (so hopefully the data does not change if I look at it later), and read off the number for every move to fill in the data.

"Common move out" means the move that leads to the most common position not yet in the list; in some cases it may not be the most common move.
It is possible that I made a few mistakes, but the general idea is to find the most common positions based on the data.

I do not think that Lichess has a PGN of more than 10^9 games that I can download, so I guess that to generate the data some program needs to go to the Lichess site and read the numbers there, as I did manually.

I would like to do the same for games in a given rating range and time control, in order to prepare better for games on Lichess by guessing the moves I should expect from opponents and need to learn.
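
A minimal sketch of how a program might read those numbers instead of clicking through the analysis board. It assumes that the Lichess opening explorer is reachable over HTTP at `explorer.lichess.ovh/lichess`, that it accepts the query parameters `variant`, `play`, `speeds`, `ratings` and `until`, and that its JSON reply lists each move with `san`, `white`, `draws` and `black` counts. All of that should be checked against the Lichess API documentation before relying on it. The helper names `build_query`, `move_counts` and `fetch` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Lichess opening explorer; verify in the API docs.
EXPLORER_URL = "https://explorer.lichess.ovh/lichess"


def build_query(uci_moves, speeds, ratings, until=None):
    """Build the explorer URL for the position reached after uci_moves."""
    params = {
        "variant": "standard",
        "play": ",".join(uci_moves),                   # e.g. "e2e4,e7e5"
        "speeds": ",".join(speeds),                    # e.g. "blitz,rapid"
        "ratings": ",".join(str(r) for r in ratings),  # rating buckets
    }
    if until is not None:
        params["until"] = until                        # "YYYY-MM"
    return EXPLORER_URL + "?" + urlencode(params)


def move_counts(payload):
    """Return {san: number of games} for each move in an explorer reply."""
    return {
        m["san"]: m["white"] + m["draws"] + m["black"]
        for m in payload.get("moves", [])
    }


def fetch(url):
    """Download and decode one explorer reply (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)
```

Walking the tree would then mean calling `fetch(build_query(...))` for a position, keeping the moves whose count is above the chosen threshold, and repeating for each of them. A real script should pause between requests, since the service is likely to be rate limited.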

For PGN I would like the option to treat different transpositions as different lines, meaning that I get separate counts for how many times 1.e4 e6 2.d4 appeared and how many times 1.d4 e6 2.e4 appeared.
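
Counting by move order rather than by position can be sketched with the standard library alone, because no board is needed: each line is just a prefix of the game's move list. The functions `san_moves` and `count_lines` are hypothetical names, and the regular expression is a simplification that assumes movetext without nested variations. A full PGN parser such as python-chess would be the more robust choice.

```python
import re
from collections import Counter

# Matches what is NOT a move in PGN movetext: {comments}, move numbers
# such as "1." or "1...", NAGs such as "$1", "!"/"?" suffixes, results.
_NOISE = re.compile(r"\{[^}]*\}|\d+\.+|\$\d+|[!?]+|1-0|0-1|1/2-1/2|\*")


def san_moves(movetext):
    """Split one game's movetext into SAN moves (no variation support)."""
    return _NOISE.sub(" ", movetext).split()


def count_lines(games, max_ply):
    """Count every move-order prefix of up to max_ply plies.

    Transpositions are NOT merged: ("e4", "e6", "d4") and
    ("d4", "e6", "e4") are separate keys. The empty tuple is the
    starting position, so its count is the number of games.
    """
    counts = Counter()
    for movetext in games:
        moves = san_moves(movetext)
        for ply in range(min(max_ply, len(moves)) + 1):
            counts[tuple(moves[:ply])] += 1
    return counts
```

For example, `count_lines(games, 6).most_common(100)` would give the 100 most frequent lines of up to 6 plies. Merging transpositions instead would require replaying the moves on a board and keying the counter by position.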

Note that probably the most common losing line on Lichess is 1.d4 e5, which is in the file that I give (at a higher level and in master games, 1.d4 e5 is probably not among the 10,000 most popular lines, and I wonder what the most popular losing lines are in master games on Lichess).

1.d4 e5 appeared in only 109 master games, while more than 1000 positions appeared more than 4000 times (I listed most of them in another file that I did not attach; I wrote the moves in that file in Hebrew).

Edit: I tried to attach a file, but I see "max size per file 0 Mb", so it seems that I cannot attach an xlsx file. Instead I copy and paste the important columns (the first 5 in the file).
a_root means no moves.
Every line includes:
1) the moves (a_root when there are no moves)
2) the number of plies in the line
3) the moves of the parent line (written as a for the opening position)
4) the frequency with which the last move appeared (missing for the opening position)
5) the frequency with which the position appeared (always more than 49M games)

The reason for the different columns is that I wanted to sort the file in different ways: by the number of moves, but also by lexicographic order of the lines, so that I do not jump from one line to an unconnected line when I look at the table.

moves ply parent last_move_frequency position_frequency

a_root 0 a 4881333462
e4 1 a_root 2860395163 2854845723
d4 1 a_root 1221713123 1219559988
e4 e5 2 e4 1169514886 1167606262
e4 e5 Nf3 3 e4 e5 712523011 712808347
e4 c5 2 e4 521312864 520607556
d4 d5 2 d4 506764315 506043882
e4 e5 Nf3 Nc6 4 e4 e5 Nf3 439064694 447085898
e4 d5 2 e4 296742446 296255996
e4 e6 2 e4 290625721 290191451
e4 c5 Nf3 3 e4 c5 277868022 279670782
d4 Nf6 2 d4 237290777 237041496
d4 d5 c4 3 d4 d5 198411098 198925770
e4 e5 Nf3 Nc6 Bc4 5 e4 e5 Nf3 Nc6 185828227 197862865
e4 d5 exd5 3 e4 d5 189778469 189569631
e4 c6 2 e4 188072561 187817369
Nf3 1 a_root 172734683 172355562
c4 1 a_root 161188410 160881278
e4 e6 d4 3 e4 e6 121462468 136248574
e4 d5 exd5 Qxd5 4 e4 d5 exd5 136165341 136106078
d4 e6 2 d4 122339813 122184591
e4 e5 Bc4 3 e4 e5 121120428 120968287
e4 c5 Nf3 Nc6 4 e4 c5 Nf3 115602526 115549860
e3 1 a_root 108902793 108646791
e4 e5 Nf3 d6 4 e4 e5 Nf3 104178167 108311935
d4 d5 Nf3 3 d4 d5 86434224 104602213
Nf3 d5 d4 3 Nf3 d5 18274595 104602213
e4 e5 Nf3 Nc6 Bb5 5 e4 e5 Nf3 Nc6 103344661 103404849
d4 Nf6 c4 3 d4 Nf6 100607852 103046757
e4 d6 2 e4 102970331 102794395
e4 d5 exd5 Qxd5 Nc3 5 e4 d5 exd5 Qxd5 100513609 100550282
e4 g6 2 e4 95783613 95658645
e4 e6 d4 d5 4 e4 e6 d4 91665119 93518271
e4 c5 Nf3 d6 4 e4 c5 Nf3 92171535 92981814
e4 e5 Nf3 Nf6 4 e4 e5 Nf3 90468924 90931228
d4 d5 Bf4 3 d4 d5 89329958 89225689
e4 e6 Nf3 3 e4 e6 86107176 87973463
g3 1 a_root 84231344 84090357
e4 c6 d4 3 e4 c6 76991543 81675635
e4 e5 Nf3 Nc6 d4 5 e4 e5 Nf3 Nc6 76003468 78406153
e4 e5 Nc3 3 e4 e5 74240170 76675636
e4 e5 d4 3 e4 e5 75000654 76489405
e4 c6 d4 d5 4 e4 c6 d4 73681139 74996133
b3 1 a_root 74382230 74274865
d4 e5 2 d4 73638193 73502304
d4 d5 c4 e6 4 d4 d5 c4 52765112 73327238
e4 e5 Nf3 Nc6 Bc4 Nf6 6 e4 e5 Nf3 Nc6 Bc4 64178359 72539035
e4 e5 f4 3 e4 e5 70737680 70900815
e4 e5 Nf3 Nc6 Bc4 Bc5 6 e4 e5 Nf3 Nc6 Bc4 63731137 66175260
e4 e5 Nf3 Nc6 Nc3 5 e4 e5 Nf3 Nc6 50442804 64115475
e4 e5 Nf3 Nc6 d4 exd4 6 e4 e5 Nf3 Nc6 d4 57316705 60898409
d4 g6 2 d4 59405176 59340654
d4 d5 e3 3 d4 d5 52407330 58369767
e4 Nf6 2 e4 58219166 58135933
e4 b6 2 e4 57825258 57731528
d4 c5 2 d4 57328732 57259997
e4 c6 Nf3 3 e4 c6 55733091 56413634
Nf3 d5 2 Nf3 54439752 54374742
e4 c5 Nf3 Nc6 d4 5 e4 c5 Nf3 Nc6 53801999 54345313
e4 c5 Nf3 e6 4 e4 c5 Nf3 46109562 52963852
e4 e6 Nf3 c5 4 e4 e6 Nf3 6878492 52963852
d4 Nf6 Nf3 3 d4 Nf6 45487617 52566079
d4 d5 c4 c6 4 d4 d5 c4 39135396 52553973
e4 e6 Nf3 d5 4 e4 e6 Nf3 51748850 52408493
e4 c5 Nf3 Nc6 d4 cxd4 6 e4 c5 Nf3 Nc6 d4 48981865 51567514
e4 c5 Nc3 3 e4 c5 50555953 51273358
e4 c6 Nf3 d5 4 e4 c6 Nf3 50534447 51180061
e4 e5 Nf3 Nc6 Nc3 Nf6 6 e4 e5 Nf3 Nc6 Nc3 31581331 51062223
e4 d6 d4 3 e4 d6 45258921 50867279
e4 c5 Nf3 d6 d4 5 e4 c5 Nf3 d6 49695505 50262143
e4 c5 Bc4 3 e4 c5 50128902 50087239
e4 e5 Nf3 d6 Bc4 5 e4 e5 Nf3 d6 45755533 49049595
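
Because the pasted columns are separated only by spaces, a program has to recover the five fields itself. The following is a sketch under the assumption that every row follows the layout described above; `parse_row` and `sort_by_line` are hypothetical names. It uses the fact that the ply is the first token made only of digits.

```python
def parse_row(line):
    """Split one pasted row into its five fields.

    The ply is the first all-digit token, so everything before it is the
    move list and the non-numeric tokens after it are the parent line.
    """
    tokens = line.split()
    ply_at = next(i for i, tok in enumerate(tokens) if tok.isdigit())
    rest = tokens[ply_at + 1:]
    freqs = [int(tok) for tok in rest if tok.isdigit()]
    return {
        "moves": tokens[:ply_at],
        "ply": int(tokens[ply_at]),
        "parent": [tok for tok in rest if not tok.isdigit()],
        # The root row has no last-move frequency, only a position frequency.
        "last_move_freq": freqs[0] if len(freqs) == 2 else None,
        "position_freq": freqs[-1],
    }


def sort_by_line(rows):
    """Sort lexicographically by move list, so a line and its
    continuations stay next to each other."""
    return sorted(rows, key=lambda row: row["moves"])
```

Sorting by `(row["ply"], -row["position_freq"])` instead would give the other ordering mentioned in the post, shortest lines first and most frequent positions first within each length.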

Chris Whittington




Posts : 1254
Join date : 2020-11-17
Location : France

PostSubject: Re: need a tool to generate the most common openings out of lichess or pgn   Mon Oct 30, 2023 6:02 pm

Use python-chess.

Dann Corbit




Posts : 188
Join date : 2020-11-26

PostSubject: Re: need a tool to generate the most common openings out of lichess or pgn   Thu Nov 02, 2023 1:07 am

The Lichess pgn is a wonderful source of data. I download all of the pgn, and filter for both players above 2500. There is so much data you still end up with millions of games. I download one month at a time so it's not hard to process. Because of the filtering I do, my statistics will be very different. You can get excellent statistics with scid's opening report for any position.
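
The filtering step described here can be sketched as a streaming pass over the PGN text, so that a whole month never has to sit in memory. This is an illustration, not the exact procedure used above. It assumes each game carries `WhiteElo` and `BlackElo` tags and that movetext lines never look like tag lines. The names `split_games`, `both_at_least` and `filter_pgn` are hypothetical, and treating the cut-off as inclusive is a choice made for the example.

```python
import re

# A PGN tag line looks like: [WhiteElo "2600"]
_TAG = re.compile(r'^\[(\w+) "(.*)"\]\s*$')


def split_games(lines):
    """Yield (headers, text) for each game in an iterable of PGN lines."""
    headers, chunk, in_moves = {}, [], False
    for line in lines:
        tag = _TAG.match(line)
        if tag and in_moves:
            # A tag line after movetext starts the next game.
            yield headers, "".join(chunk)
            headers, chunk, in_moves = {}, [], False
        if tag:
            headers[tag.group(1)] = tag.group(2)
        elif line.strip():
            in_moves = True
        chunk.append(line)
    if chunk and headers:
        yield headers, "".join(chunk)


def both_at_least(headers, min_elo):
    """True when WhiteElo and BlackElo are both numbers >= min_elo."""
    try:
        return min(int(headers["WhiteElo"]), int(headers["BlackElo"])) >= min_elo
    except (KeyError, ValueError):  # missing tag or a rating such as "?"
        return False


def filter_pgn(lines, min_elo=2500):
    """Yield the text of games in which both players meet min_elo."""
    for headers, text in split_games(lines):
        if both_at_least(headers, min_elo):
            yield text
```

The `lines` argument can be any iterable of text lines, for instance an open file or `sys.stdin` fed by a zstd decompressor, so the monthly archive does not need to be unpacked to disk first.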
Uri Blass




Posts : 207
Join date : 2020-11-28

PostSubject: Re: need a tool to generate the most common openings out of lichess or pgn   Fri Nov 10, 2023 4:24 pm

Dann Corbit wrote:
The Lichess pgn is a wonderful source of data. I download all of the pgn, and filter for both players above 2500. There is so much data you still end up with millions of games. I download one month at a time so it's not hard to process. Because of the filtering I do, my statistics will be very different. You can get excellent statistics with scid's opening report for any position.

I can see that in the following link they claim to have 4,996,127,933 standard rated games.

Some questions:
1) I have statistics at the following link
https://lichess.org/analysis

For some reason I see only 4,959,734,423 games for the opening position after I selected all possible time controls and all possible average ratings, with games until November 2023,
and clicked on "ALL SET!". What is the reason for the difference?

2) What is the size of the data, and can I practically download it?
I see more than 30GB for less than 100M games, so I guess I need about 1500GB of free disk space to save all the games in PGN format.

I am afraid that I do not have enough disk space on my computer: drive C has 446 GB in total with 275 GB free,
and drive D has 931 GB in total with 806 GB free.

3) I do not know much about using Scid.
What information about positions from a PGN can I get with Scid?
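
The storage estimate in question 2 can be checked with a few lines of arithmetic. The figures are the ones quoted in this post; the variable names are only for illustration. Since the sample is "more than 30GB" for "less than 100M games", the result is a lower bound.

```python
GAMES_TOTAL = 4_996_127_933   # games claimed for the whole database
SAMPLE_GB = 30                # "more than 30GB" ...
SAMPLE_GAMES = 100_000_000    # ... for "less than 100M games"

gb_per_game = SAMPLE_GB / SAMPLE_GAMES
estimate_gb = GAMES_TOTAL * gb_per_game

free_gb = 275 + 806           # free space on C plus free space on D

print(f"estimated size: {estimate_gb:.0f} GB")   # estimated size: 1499 GB
print(f"free space:     {free_gb} GB")           # free space:     1081 GB
print("fits" if estimate_gb <= free_gb else "does not fit")  # does not fit
```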
Dann Corbit




Posts : 188
Join date : 2020-11-26

PostSubject: Re: need a tool to generate the most common openings out of lichess or pgn   Fri Nov 10, 2023 6:25 pm

1. I don't know why the number is different. To download the games, I go here and get the ZST files: https://database.lichess.org/
I collect the files one at a time. Then I filter out the games below 2500 Elo, keeping those where both players are above 2500. I already have all the games up until April 2023 on my Google Drive.
This is the link:
https://drive.google.com/file/d/1Hwd_7pPgD6XNPuL3O06J6_OUAeFoBm-E/view?usp=drive_link
Those games are already filtered. The other games you can download and filter one at a time.

2. The size of the filtered PGN in the archive is 2GB in compressed form, and the PGN file when decompressed is 7.7GB.

3. I use Scid-Vs-Pc most of the time. This is the project: https://scidvspc.sourceforge.net/
The download page for Scid is on that same page. Just click the downloads link.
I like Scid a lot and it is free. But any tool that manipulates PGN can be used.
