From Chessprogramming wiki
Revision as of 23:52, 8 December 2020

Home * Hardware * Memory * NUMA

Possible NUMA system [1]

NUMA (Non-Uniform Memory Access)
is a multiprocessing memory design in which main memory is partitioned between processors. As opposed to SMP, where all processors compete for access to one centralized shared memory bus, which makes it difficult to scale well beyond 8 to 12 CPUs [2], NUMA splits main memory into so-called nodes with separate memory buses for subsets of processors, connected by a high-speed interconnect between nodes, reachable either directly in so-called 1-hop distance, or indirectly in 2-hop distance. Despite the high-speed interconnect, NUMA memory access time varies considerably between faster local memory and the remote memory of other nodes. Maintaining cache coherence across processor caches adds significant overhead to NUMA systems, addressed by ccNUMA, a term mostly used synonymously with current NUMA implementations [3].

x86

AMD implemented NUMA with its Opteron processor in 2003, using HyperTransport. Intel announced NUMA compatibility for their x86 servers in late 2007 with Nehalem CPUs using QuickPath Interconnect [4].

Considerations

Scheduling threads across the nodes and cores of a system is a complicated topic, due to the differing access patterns of independent and shared data. There are several considerations for ccNUMA-aware operating systems and software, such as keeping data local by virtue of first touch [5] [6]. NUMA and processor affinity APIs help application programmers bind threads or processes to NUMA nodes, or allocate memory from a certain node.

See also

Selected Publications

1998 ...

2000 ...

Memory part 1
Memory part 2: CPU caches
Memory part 3: Virtual Memory
Memory part 4: NUMA support
Memory part 5: What programmers can do

2010 ...

Georg Hager, Jan Treibig, Gerhard Wellein (2013). The Practitioner's Cookbook for Good Parallel Performance on Multi- and Many-Core Systems. RRZE, SC13
Rik van Riel, Vinod Chegu (2014). Automatic NUMA Balancing. Red Hat Summit 2014
Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, Marcos K. Aguilera (2017). Black-box Concurrent Data Structures for NUMA Architectures. ACM SIGPLAN Notices, Vol. 52, No. 4
Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, Marcos K. Aguilera (2017). How to implement any concurrent data structure for modern servers. ACM SIGOPS, Vol. 51, No. 1

Forum Posts

2000 ...

2010 ...

2015 ...

Re: thread affinity by Robert Hyatt, CCC, July 03, 2015

External Links

Linux

Windows

x86

Misc

References

Up one Level