parallel algorithm course 01

Posted on March 9, 2022 by Shiguang Wu
Tags: PA

gcc -fopenmp

omp_set_num_threads(int): request a number of threads (the runtime may give fewer)

multi-data (SPMD: every thread runs the same code on its own part of the data)

omp_get_thread_num(): get the id of the calling thread
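
A minimal sketch of the calls above, compiled with gcc -fopenmp. The function name run_team is made up for illustration, and the #else branch is only an assumed serial fallback so the file still builds without OpenMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed serial fallbacks, used only when built without -fopenmp. */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: starts a parallel region and returns the
   number of threads that actually took part. */
int run_team(int requested)
{
    int team = 1;
    omp_set_num_threads(requested);      /* a request, not a guarantee */
    #pragma omp parallel
    {
        int id = omp_get_thread_num();   /* 0 .. team size - 1 */
        printf("hello from thread %d\n", id);
        if (id == 0)                     /* only one thread writes team */
            team = omp_get_num_threads();
    }
    return team;
}
```

Calling run_team(4) prints one line per thread, in no fixed order, and returns a team size between 1 and 4.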


SMP: every processor has the same memory access cost, in theory

NUMA: access cost differs depending on which memory is touched, in practice


False Sharing

cache line

two processors may write to different variables that sit on the same cache line, so the line keeps getting invalidated and written back for no useful reason
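
One common way to sidestep this is to pad each thread's slot so that no two slots share a line. The sketch below assumes a 64-byte cache line and a team of at most 4 threads; sum_padded, padded_long and both constants are names chosen here for illustration.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed serial fallback, used only when built without -fopenmp. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define TEAM 4          /* assumed upper bound on the team size */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* Each slot is a full cache line wide, so two threads never update
   values that live on the same line. */
struct padded_long {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

long sum_padded(const int *a, int n)
{
    struct padded_long partial[TEAM] = {0};

    #pragma omp parallel num_threads(TEAM)
    {
        int id = omp_get_thread_num();
        #pragma omp for
        for (int i = 0; i < n; i++)
            partial[id].value += a[i];   /* private slot, no sharing */
    }

    long total = 0;
    for (int t = 0; t < TEAM; t++)       /* combine after the region */
        total += partial[t].value;
    return total;
}
```

Replacing the padded struct with a plain long array would give the same result, but the slots would then sit next to each other on one line, which is the pattern described above.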


Synchronization: to avoid data races and false sharing (avoid the global array)

barrier

each thread waits here until all threads have arrived

#pragma omp barrier
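
A small two-phase sketch of a barrier: every thread writes its own slot, waits, then reads a neighbour's slot. The function name neighbours_ok and the team size NT are assumptions made for this example.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed serial fallbacks, used only when built without -fopenmp. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define NT 4   /* assumed upper bound on the team size */

/* Returns 1 if every thread saw its neighbour's phase-1 value. */
int neighbours_ok(void)
{
    int slot[NT] = {0};
    int seen[NT] = {0};
    int team = 1;

    #pragma omp parallel num_threads(NT)
    {
        int id = omp_get_thread_num();
        int n = omp_get_num_threads();

        slot[id] = id + 1;               /* phase 1: write own slot */

        #pragma omp barrier              /* wait until all slots are written */

        seen[id] = slot[(id + 1) % n];   /* phase 2: read neighbour's slot */
        if (id == 0)
            team = n;
    }

    for (int i = 0; i < team; i++)
        if (seen[i] != (i + 1) % team + 1)
            return 0;
    return 1;
}
```

Without the barrier, a fast thread could read a neighbour's slot before it has been written.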

critical

only one thread can enter at a time (often cheap): mutual exclusion, avoids data races

(software support)

#pragma omp critical
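
As an illustration, a critical section can protect an update that takes several steps, here a compare followed by a write. This is only a sketch; argmax_critical is a made-up name and the input is assumed to have at least one element.

```c
/* Returns the index of the largest element (lowest index on ties).
   Assumes n >= 1. */
int argmax_critical(const double *a, int n)
{
    int best = 0;

    #pragma omp parallel for
    for (int i = 1; i < n; i++) {
        /* Compare and update must happen as one unit, so only one
           thread at a time may run this block. */
        #pragma omp critical
        {
            if (a[i] > a[best] || (a[i] == a[best] && i < best))
                best = i;
        }
    }
    return best;
}
```

Putting the whole loop body inside the critical section serializes the work, so this shows the mechanism rather than a fast way to find a maximum.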

atomic

only supports simple updates of a single memory location, such as x += 1 (hardware support)

#pragma omp atomic
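
A minimal sketch of atomic guarding a shared counter; count_multiples_of_three is a name invented for this example.

```c
/* Counts the integers in [0, n) that are divisible by 3. */
long count_multiples_of_three(int n)
{
    long hits = 0;

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (i % 3 == 0) {
            /* Protects only the single update on the next line. */
            #pragma omp atomic
            hits += 1;
        }
    }
    return hits;
}
```

For a pure sum like this a reduction clause is the more usual choice; atomic is shown only because it is the construct noted above.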