ProDeo

Computer Chess
 

 

 how to add floats?

2 posters
nescitus




Posts : 46
Join date : 2020-12-01

Subject: how to add floats? (Fri May 17, 2024 9:49 pm)

I'm seeing problematic behaviour in my small neural network engine (https://github.com/nescitus/LizardBrain). After reaching depth 21 or so, the scores drift badly out of sync: the engine shows something like -100, then -300, all the while proposing reasonable moves and a reasonable principal variation. Resetting the accumulator every million nodes or so stops it. I suspect accumulated rounding error from adding floats. Is there a known technique to avoid this problem?
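
One way to see why an incrementally updated float accumulator can drift: float addition rounds every intermediate sum, so adding a weight and later subtracting the same weight does not always restore the previous value. Integer arithmetic on quantized weights is exactly reversible as long as nothing overflows. The sketch below is illustrative only. The helper names, the sample values and the scale of 64 are assumptions, not taken from LizardBrain, and the magnitudes are exaggerated to make the effect visible in a single step.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: add w to a float accumulator, then remove it again.
// volatile forces each intermediate to be rounded to float precision,
// even on targets that would otherwise keep wider temporaries.
float AddThenRemoveFloat(float acc, float w) {
    volatile float sum = acc + w;   // rounding happens here
    volatile float back = sum - w;  // the low bits lost above do not come back
    return back;
}

// The same round trip on 32-bit integers is exact (assuming no overflow).
int32_t AddThenRemoveInt(int32_t acc, int32_t w) {
    acc += w;
    acc -= w;
    return acc;
}

// Hypothetical quantization: scale a float weight and round to nearest int.
// The scale of 64 is an arbitrary choice for this sketch.
int32_t Quantize(float w, float scale = 64.0f) {
    assert(scale > 0.0f);
    return static_cast<int32_t>(std::lround(w * scale));
}
```

With these helpers, `AddThenRemoveFloat(0.1f, 1000.0f)` comes back as roughly 0.09998 instead of 0.1, while `AddThenRemoveInt(Quantize(0.1f), Quantize(1000.0f))` returns exactly `Quantize(0.1f)`. Over millions of add/remove pairs in a search, small float errors of this kind can pile up, which would be consistent with the periodic reset making the symptom go away.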

Admin


Posts : 2541
Join date : 2020-11-17
Location : Netherlands

Subject: Re: how to add floats? (Fri May 17, 2024 11:32 pm)

It's best to use SIMD instead of float.

https://en.algorithmica.org/hpc/simd/intrinsics/

nescitus





Subject: Re: how to add floats? (Thu May 23, 2024 1:16 pm)

OK, I now have something like this. The speed gain is negligible, about half a percent, and I suspect the way quantized_vals is assembled is at fault. Any hints on how to improve it?


Code:
#include <immintrin.h> // AVX2 intrinsics

const int hiddenLayerSize = 32;
alignas(32) int quantized[hiddenLayerSize][768]; // in the Network object that holds all the weights
alignas(32) int hidden[hiddenLayerSize]; // in the cAccumulator class

void cAccumulator::Add(int cl, int pc, int sq) {

    // get piece-on-square index
    int idx = Idx(cl, pc, sq);

    // process 8 elements at a time using AVX2 (256-bit)
    for (int i = 0; i < hiddenLayerSize; i += 8) {

        // load 8 integers from the hidden array (aligned load)
        __m256i hidden_vals = _mm256_load_si256((const __m256i*)&hidden[i]);

        // collect 8 weights for this idx; consecutive rows are 768 ints apart,
        // so each element is fetched with its own scalar load
        __m256i quantized_vals = _mm256_set_epi32(
            Network.quantized[i + 7][idx], Network.quantized[i + 6][idx], Network.quantized[i + 5][idx], Network.quantized[i + 4][idx],
            Network.quantized[i + 3][idx], Network.quantized[i + 2][idx], Network.quantized[i + 1][idx], Network.quantized[i][idx]
        );

        // add the hidden and quantized values together
        __m256i result = _mm256_add_epi32(hidden_vals, quantized_vals);

        // store the result back into the hidden array
        _mm256_store_si256((__m256i*)&hidden[i], result);
    }
}

nescitus





Subject: Re: how to add floats? (Thu May 23, 2024 1:28 pm)

Come to think of it, I could flatten Network.quantized so that the indices are contiguous.
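
The flattening idea amounts to transposing the weight table so that all 32 hidden-layer weights for one piece-on-square index sit next to each other in memory. A minimal sketch, assuming the dimensions from the earlier post (32 hidden units, 768 inputs); `FlattenWeights`, `FlatIndex`, `kHidden` and `kInputs` are hypothetical names, not from the engine:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kHidden = 32;   // hidden layer size, as in the post
constexpr int kInputs = 768;  // number of piece-on-square inputs, as in the post

// Position of the weight for (input idx, hidden unit h) in the flat array.
// All kHidden weights that share an idx are adjacent.
constexpr std::size_t FlatIndex(int idx, int h) {
    return static_cast<std::size_t>(idx) * kHidden + h;
}

// Transpose quantized[h][idx] into flat[idx * kHidden + h].
std::vector<int> FlattenWeights(const int (&quantized)[kHidden][kInputs]) {
    std::vector<int> flat(static_cast<std::size_t>(kHidden) * kInputs);
    for (int h = 0; h < kHidden; ++h) {
        for (int idx = 0; idx < kInputs; ++idx) {
            assert(FlatIndex(idx, h) < flat.size());
            flat[FlatIndex(idx, h)] = quantized[h][idx];
        }
    }
    return flat;
}
```

The transposition only needs to run once, after the weights are loaded. Note that `std::vector` does not guarantee 32-byte alignment, so an engine that uses aligned 256-bit loads would need to copy the result into storage declared with `alignas(32)` or use unaligned loads.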

nescitus





Subject: Re: how to add floats? (Fri May 24, 2024 2:59 am)

Flattening does indeed work: the benchmark went from 12.5 seconds (naive implementation) to 10.1 seconds (Add, Del and Move functions using the flattened array). Any other ideas for optimization?

Code:
void cAccumulator::Add(int cl, int pc, int sq) {

    // get piece-on-square index
    int idx = Idx(cl, pc, sq);

    // process 8 elements at a time using AVX2 (256-bit)
    for (int i = 0; i < hiddenLayerSize; i += 8) {

        // load 8 integers from the hidden and flattened weight arrays;
        // the aligned loads require both arrays to be 32-byte aligned
        __m256i hidden_vals = _mm256_load_si256((const __m256i*)&hidden[i]);
        __m256i quantized_vals = _mm256_load_si256((const __m256i*)&Network.flat_quantized[idx * hiddenLayerSize + i]);

        // add the hidden and quantized values together
        __m256i result = _mm256_add_epi32(hidden_vals, quantized_vals);

        // store the result back into the hidden array
        _mm256_store_si256((__m256i*)&hidden[i], result);
    }
}
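
The post mentions Del and Move but shows only Add. As a hedged sketch of how the three could relate, here is a plain scalar reference over the flattened layout, which could also serve as a cross-check for an intrinsics version. `RefAccumulator`, its members and the raw weight pointer are assumptions for illustration; the engine's actual Del and Move are not shown in the thread.

```cpp
#include <cassert>

constexpr int kHiddenSize = 32;  // hidden layer size, as in the post

// Hypothetical scalar reference accumulator over flattened weights,
// where the weights for input idx start at flat[idx * kHiddenSize].
struct RefAccumulator {
    int hidden[kHiddenSize] = {};
    const int* flat;  // flattened weight table (not owned)

    explicit RefAccumulator(const int* weights) : flat(weights) {
        assert(weights != nullptr);
    }

    // A piece appears on a square: add its weight column.
    void Add(int idx) {
        const int* w = flat + idx * kHiddenSize;
        for (int i = 0; i < kHiddenSize; ++i) hidden[i] += w[i];
    }

    // A piece disappears from a square: subtract its weight column.
    void Del(int idx) {
        const int* w = flat + idx * kHiddenSize;
        for (int i = 0; i < kHiddenSize; ++i) hidden[i] -= w[i];
    }

    // A piece moves: apply Del(fromIdx) and Add(toIdx) in a single pass,
    // so the hidden array is read and written only once.
    void Move(int fromIdx, int toIdx) {
        const int* from = flat + fromIdx * kHiddenSize;
        const int* to = flat + toIdx * kHiddenSize;
        for (int i = 0; i < kHiddenSize; ++i) hidden[i] += to[i] - from[i];
    }
};
```

Because the arithmetic is integer, Add followed by Move gives exactly the same hidden values as adding the piece directly on the destination square, and a matching Del returns the accumulator to its previous state.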
