/**
 * Returns number of CPUs on system that are useful for math.
 */
int32_t cpu_get_num_math() {
#if defined(__x86_64__) && defined(__linux__) && !defined(__ANDROID__)
    // _SC_NPROCESSORS_ONLN returns the number of processors currently online
    int n_cpu = sysconf(_SC_NPROCESSORS_ONLN);
    if (n_cpu < 1) {
        return cpu_get_num_physical_cores();
    }
    if (is_hybrid_cpu()) {
        cpu_set_t affinity;
        // save the current affinity mask so it can be restored after probing
        if (!pthread_getaffinity_np(pthread_self(), sizeof(affinity), &affinity)) {
            int result = cpu_count_math_cpus(n_cpu);
            pthread_setaffinity_np(pthread_self(), sizeof(affinity), &affinity);
            if (result > 0) {
                return result;
            }
        }
    }
#endif
    return cpu_get_num_physical_cores();
}

static int cpu_count_math_cpus(int n_cpu) {
    int result = 0;
    for (int cpu = 0; cpu < n_cpu; ++cpu) {
        if (pin_cpu(cpu)) {
            return -1;
        }
        if (is_running_on_efficiency_core()) {
            continue; // efficiency cores harm lockstep threading
        }
        // hyperthreading isn't useful for linear algebra, so skip the sibling
        // thread: performance cores end up counted as logical CPUs / 2
        ++cpu;
        ++result;
    }
    return result;
}
int32_t cpu_get_num_physical_cores() {
    std::cout << "call cpu_get_num_physical_cores" << std::endl;
#ifdef __linux__
    // enumerate the set of thread siblings, num entries is num cores
    std::unordered_set<std::string> siblings;
    for (uint32_t cpu = 0; cpu < UINT32_MAX; ++cpu) {
        std::ifstream thread_siblings("/sys/devices/system/cpu/cpu"
            + std::to_string(cpu) + "/topology/thread_siblings");
        if (!thread_siblings.is_open()) {
            break; // no more cpus
        }
        std::string line;
        if (std::getline(thread_siblings, line)) {
            std::cout << "line :" << line << std::endl;
            siblings.insert(line);
        }
    }
    if (!siblings.empty()) {
        return static_cast<int32_t>(siblings.size());
    }
#endif
    unsigned int n_threads = std::thread::hardware_concurrency();
    return n_threads > 0 ? (n_threads <= 4 ? n_threads : n_threads / 2) : 4;
}
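A hybrid CPU, in the sense the code above cares about, is a processor that combines two different types of CPU cores on a single chip: performance cores and efficiency cores.
Examples: Intel’s Lakefield and Alder Lake processors.
The term is also used loosely for chips that pair a conventional CPU (Central Processing Unit) with a specialized accelerator, such as a GPU (Graphics Processing Unit) or an FPGA (Field Programmable Gate Array).
AMD’s Ryzen APUs are an example of that second kind. It is a different meaning, and it is not the performance/efficiency split that is_running_on_efficiency_core() checks for.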
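For the code above, the kind of hybrid CPU that matters is one mixing performance and efficiency cores, and x86 reports that through CPUID. Below is a minimal sketch of how such a check could look. It is not necessarily how is_hybrid_cpu() is implemented, since its body is not shown here. It assumes GCC or Clang on x86 and relies on two documented CPUID fields: leaf 7 (subleaf 0) EDX bit 15 is the hybrid flag, and leaf 0x1A EAX bits 31:24 give the type of the core the thread is currently running on (0x20 for efficiency, 0x40 for performance). The names edx_has_hybrid_flag, core_type_from_eax and cpu_is_hybrid are hypothetical.

```cpp
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   // GCC/Clang wrapper around the CPUID instruction
#endif

// Core type values reported in CPUID leaf 0x1A, EAX[31:24].
constexpr std::uint32_t CORE_TYPE_EFFICIENCY  = 0x20;
constexpr std::uint32_t CORE_TYPE_PERFORMANCE = 0x40;

// CPUID leaf 7, subleaf 0: EDX bit 15 is set on hybrid parts.
inline bool edx_has_hybrid_flag(std::uint32_t edx) {
    return ((edx >> 15) & 1u) != 0;
}

// CPUID leaf 0x1A: the top byte of EAX is the type of the current core.
inline std::uint32_t core_type_from_eax(std::uint32_t eax) {
    return eax >> 24;
}

// Hypothetical helper: true if the CPU reports the hybrid flag.
// Returns false on non-x86 targets or when leaf 7 is unavailable.
inline bool cpu_is_hybrid() {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return edx_has_hybrid_flag(edx);
    }
#endif
    return false;
}
```

The core type is a property of the core a thread happens to be running on, which is why the counting loop above pins the thread to each CPU before asking whether it is an efficiency core.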
On Linux I think you can read /proc/cpuinfo, but after that you have to do a bit of thinking to work out whether you have a multicore CPU, an HT-enabled CPU, and so on.
First, flags lists the supported features, and ht there indicates hyperthreading support (support only; it does not tell you whether hyperthreading is actually enabled).
Then you have to check whether the siblings count matches the cpu cores count on each physical package, so group the entries by physical id and deduce from there. (If siblings equals cpu cores, there is no HT; if siblings is larger, typically twice cpu cores, HT is on.)
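The /proc/cpuinfo check described above can be sketched roughly as follows. This is an illustration under assumptions, not a robust parser: it reads only the first processor entry (so it assumes all packages are identical), it expects the x86 field names flags, siblings and cpu cores, and the names CpuinfoSummary and summarize_cpuinfo are made up for this example.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

struct CpuinfoSummary {
    bool ht_flag   = false; // "ht" is listed in flags (support, not state)
    int  siblings  = 0;     // logical CPUs per physical package
    int  cpu_cores = 0;     // physical cores per physical package

    // Hyperthreading is active when a package exposes more logical CPUs
    // than it has physical cores.
    bool smt_active() const { return cpu_cores > 0 && siblings > cpu_cores; }
};

// Strip leading and trailing whitespace.
inline std::string trim(const std::string & s) {
    const char * ws = " \t\r\n";
    const auto b = s.find_first_not_of(ws);
    if (b == std::string::npos) {
        return "";
    }
    const auto e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parse the first processor entry of a /proc/cpuinfo-style stream.
inline CpuinfoSummary summarize_cpuinfo(std::istream & in) {
    CpuinfoSummary s;
    std::string line;
    while (std::getline(in, line)) {
        if (trim(line).empty()) {
            break; // a blank line ends the first processor entry
        }
        const auto colon = line.find(':');
        if (colon == std::string::npos) {
            continue;
        }
        const std::string key   = trim(line.substr(0, colon));
        const std::string value = trim(line.substr(colon + 1));
        if (key == "siblings") {
            s.siblings = std::atoi(value.c_str());
        } else if (key == "cpu cores") {
            s.cpu_cores = std::atoi(value.c_str());
        } else if (key == "flags") {
            std::istringstream flags(value);
            std::string flag;
            while (flags >> flag) {
                if (flag == "ht") {
                    s.ht_flag = true;
                }
            }
        }
    }
    return s;
}
```

On a real system you would pass it a std::ifstream opened on /proc/cpuinfo and then look at smt_active() rather than at ht_flag alone.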
Because BLAS (and LAPACK, the Linear Algebra PACKage, for other linear algebra routines) is so heavily optimized, people say you should always make sure that it knows exactly how many “real” processors it has to work with. So in my case, with a Core i7 with 4 physical cores and 4 more logical ones from hyperthreading, forget the hyperthreading: there are 4. With the FX-8350, the 8 cores share 4 floating-point units, so there are only 4 processors for doing math, hence 4 threads. Benchmark to make sure this is actually the best setting on your machine.
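The rule of thumb in this paragraph amounts to dividing the logical CPU count by the number of logical CPUs that share one floating-point unit. A minimal sketch, assuming you already know both numbers; math_thread_count is a hypothetical helper name, not part of any BLAS library:

```cpp
#include <algorithm>

// logical_cpus:     CPUs reported by the operating system
// threads_per_core: logical CPUs sharing one floating-point unit
//                   (2 with hyperthreading, 1 without)
// Returns the number of threads worth giving to a math library,
// never less than 1.
inline int math_thread_count(int logical_cpus, int threads_per_core) {
    if (logical_cpus < 1) {
        return 1;
    }
    if (threads_per_core < 1) {
        threads_per_core = 1;
    }
    return std::max(1, logical_cpus / threads_per_core);
}
```

For the Core i7 in the text, math_thread_count(8, 2) gives 4. Treat the result as a starting point and time the workload with neighbouring thread counts as well.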