v6.14.0

Merge branch 'dev'
v6.14.0-dev
2025-12-07 07:55:04 -05:00 · 2021-08-09 16:09:15 +07:00 · 2021-08-09 16:08:20 +07:00 · 2021-08-08 19:36:54 +07:00 · 2021-08-08 00:52:06 +07:00 · 2021-08-07 19:38:31 +02:00
526 changed files with 69714 additions and 21115 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,4 +1,6 @@
 /build
+scripts/build
+scripts/deps
 /CMakeLists.txt.user
 /.idea
 /src/backend/opencl/cl/cn/cryptonight_gen.cl
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,4 +1,229 @@
+# v6.14.0
+- [#2484](https://github.com/xmrig/xmrig/pull/2484) Added ZeroMQ support for solo mining.
+- [#2476](https://github.com/xmrig/xmrig/issues/2476) Fixed crash in DMI memory reader.
+- [#2492](https://github.com/xmrig/xmrig/issues/2492) Added missing `--huge-pages-jit` command line option.
+- [#2512](https://github.com/xmrig/xmrig/pull/2512) Added display of the number of transactions in a pool job.
+
+# v6.13.1
+- [#2468](https://github.com/xmrig/xmrig/pull/2468) Fixed regression in previous version: don't send miner signature during regular mining.
+
+# v6.13.0
+- [#2445](https://github.com/xmrig/xmrig/pull/2445) Added support for solo mining with miner signatures for the upcoming Wownero fork.
+
+# v6.12.2
+- [#2280](https://github.com/xmrig/xmrig/issues/2280) GPU backends are now disabled in benchmark mode.
+- [#2322](https://github.com/xmrig/xmrig/pull/2322) Improved MSR compatibility with recent Linux kernels and updated `randomx_boost.sh`.
+- [#2340](https://github.com/xmrig/xmrig/pull/2340) Fixed AES detection on FreeBSD on ARM.
+- [#2341](https://github.com/xmrig/xmrig/pull/2341) `sse2neon` updated to the latest version.
+- [#2351](https://github.com/xmrig/xmrig/issues/2351) Fixed help output for `--cpu-priority` and `--cpu-affinity` options.
+- [#2375](https://github.com/xmrig/xmrig/pull/2375) Fixed macOS CUDA backend default loader name.
+- [#2378](https://github.com/xmrig/xmrig/pull/2378) Fixed broken light mode mining on x86.
+- [#2379](https://github.com/xmrig/xmrig/pull/2379) Fixed CL code for KawPow where it assumes everything is AMD.
+- [#2386](https://github.com/xmrig/xmrig/pull/2386) RandomX: enabled `IMUL_RCP` optimization for light mode mining.
+- [#2393](https://github.com/xmrig/xmrig/pull/2393) RandomX: added BMI2 version for scratchpad prefetch.
+- [#2395](https://github.com/xmrig/xmrig/pull/2395) RandomX: rewrote dataset read code.
+- [#2398](https://github.com/xmrig/xmrig/pull/2398) RandomX: optimized ARMv8 dataset read.
+- Added `argon2/ninja` alias for `argon2/wrkz` algorithm.
+
+# v6.12.1
+- [#2296](https://github.com/xmrig/xmrig/pull/2296) Fixed Zen3 assembly code for `cn/upx2` algorithm.
+
+# v6.12.0
+- [#2276](https://github.com/xmrig/xmrig/pull/2276) Added support for Uplexa (`cn/upx2` algorithm).
+- [#2261](https://github.com/xmrig/xmrig/pull/2261) Show total hashrate if compiled without OpenCL.
+- [#2289](https://github.com/xmrig/xmrig/pull/2289) RandomX: optimized `IMUL_RCP` instruction.
+- Added support for `--user` command line option for online benchmark.
+
+# v6.11.2
+- [#2207](https://github.com/xmrig/xmrig/issues/2207) Fixed regression in HTTP parser and llhttp updated to v5.1.0.
+
+# v6.11.1
+- [#2239](https://github.com/xmrig/xmrig/pull/2239) Fixed broken `coin` setting functionality.
+
+# v6.11.0
+- [#2196](https://github.com/xmrig/xmrig/pull/2196) Improved DNS subsystem and added new DNS specific options.
+- [#2172](https://github.com/xmrig/xmrig/pull/2172) Fixed build on Alpine 3.13.
+- [#2177](https://github.com/xmrig/xmrig/pull/2177) Fixed ARM specific compilation error with GCC 10.2.
+- [#2214](https://github.com/xmrig/xmrig/pull/2214) [#2216](https://github.com/xmrig/xmrig/pull/2216) [#2235](https://github.com/xmrig/xmrig/pull/2235) Optimized `cn-heavy` algorithm.
+- [#2217](https://github.com/xmrig/xmrig/pull/2217) Fixed mining job creation sequence.
+- [#2225](https://github.com/xmrig/xmrig/pull/2225) Fixed build without OpenCL support on some systems.
+- [#2229](https://github.com/xmrig/xmrig/pull/2229) Don't use RandomX JIT if `WITH_ASM=OFF`.
+- [#2228](https://github.com/xmrig/xmrig/pull/2228) Removed useless code for cryptonight algorithms.
+- [#2234](https://github.com/xmrig/xmrig/pull/2234) Fixed build error on gcc 4.8.
+
+# v6.10.0
+- [#2122](https://github.com/xmrig/xmrig/pull/2122) Fixed pause logic when both pause on battery and user activity are enabled.
+- [#2123](https://github.com/xmrig/xmrig/issues/2123) Fixed compatibility with gcc 4.8.
+- [#2147](https://github.com/xmrig/xmrig/pull/2147) Fixed many `new job` messages when solo mining.
+- [#2150](https://github.com/xmrig/xmrig/pull/2150) Updated `sse2neon.h` to the latest master, fixes build on ARMv7.
+- [#2157](https://github.com/xmrig/xmrig/pull/2157) Fixed crash in `cn-heavy` on Zen3 with manual thread count.
+- Fixed possible out of order write to log file.
+- [http-parser](https://github.com/nodejs/http-parser) replaced with [llhttp](https://github.com/nodejs/llhttp).
+- For official builds: libuv, hwloc and OpenSSL updated to latest versions.
+
+# v6.9.0
+- [#2104](https://github.com/xmrig/xmrig/pull/2104) Added [pause-on-active](https://xmrig.com/docs/miner/config/misc#pause-on-active) config option and `--pause-on-active=N` command line option.
+- [#2112](https://github.com/xmrig/xmrig/pull/2112) Added support for [Tari merge mining](https://github.com/tari-project/tari/blob/development/README.md#tari-merge-mining).
+- [#2117](https://github.com/xmrig/xmrig/pull/2117) Fixed crash when GPU mining `cn-heavy` on Zen3 system.
+
+# v6.8.2
+- [#2080](https://github.com/xmrig/xmrig/pull/2080) Fixed compile error in Termux.
+- [#2089](https://github.com/xmrig/xmrig/pull/2089) Optimized CryptoNight-Heavy for Zen3, 7-8% speedup.
+
+# v6.8.1
+- [#2064](https://github.com/xmrig/xmrig/pull/2064) Added documentation for config.json CPU options.
+- [#2066](https://github.com/xmrig/xmrig/issues/2066) Fixed AMD GPUs health data readings on Linux.
+- [#2067](https://github.com/xmrig/xmrig/pull/2067) Fixed compilation error when RandomX and Argon2 are disabled.
+- [#2076](https://github.com/xmrig/xmrig/pull/2076) Added support for flexible huge page sizes on Linux.
+- [#2077](https://github.com/xmrig/xmrig/pull/2077) Fixed `illegal instruction` crash on ARM.
+
+# v6.8.0
+- [#2052](https://github.com/xmrig/xmrig/pull/2052) Added DMI/SMBIOS reader.
+  - Added information about memory modules on the miner startup and for online benchmark.
+  - Added new HTTP API endpoint: `GET /2/dmi`.
+  - Added new command line option `--no-dmi` or config option `"dmi"`.
+  - Added new CMake option `-DWITH_DMI=OFF`.
+- [#2057](https://github.com/xmrig/xmrig/pull/2057) Improved MSR subsystem code quality.
+- [#2058](https://github.com/xmrig/xmrig/pull/2058) RandomX JIT x86: removed unnecessary instructions.
+
+# v6.7.2
+- [#2039](https://github.com/xmrig/xmrig/pull/2039) Fixed solo mining.
+
+# v6.7.1
+- [#1995](https://github.com/xmrig/xmrig/issues/1995) Fixed log initialization.
+- [#1998](https://github.com/xmrig/xmrig/pull/1998) Added hashrate in the benchmark finished message.
+- [#2009](https://github.com/xmrig/xmrig/pull/2009) AstroBWT OpenCL fixes.
+- [#2028](https://github.com/xmrig/xmrig/pull/2028) RandomX x86 JIT: removed redundant `CFROUND`.
+
+# v6.7.0
+- **[#1991](https://github.com/xmrig/xmrig/issues/1991) Added Apple M1 processor support.**
+- **[#1986](https://github.com/xmrig/xmrig/pull/1986) Up to 20-30% faster RandomX dataset initialization with AVX2 on some CPUs.**
+- [#1964](https://github.com/xmrig/xmrig/pull/1964) Cleanup and refactoring.
+- [#1966](https://github.com/xmrig/xmrig/pull/1966) Removed libcpuid support.
+- [#1968](https://github.com/xmrig/xmrig/pull/1968) Added virtual machine detection.
+- [#1969](https://github.com/xmrig/xmrig/pull/1969) [#1970](https://github.com/xmrig/xmrig/pull/1970) Fixed errors found by static analysis.
+- [#1977](https://github.com/xmrig/xmrig/pull/1977) Fixed: secure JIT and huge pages are incompatible on Windows.
+- [#1979](https://github.com/xmrig/xmrig/pull/1979) Term `x64` replaced with `64-bit`.
+- [#1980](https://github.com/xmrig/xmrig/pull/1980) Fixed build on gcc 11.
+- [#1989](https://github.com/xmrig/xmrig/pull/1989) Fixed broken Dero solo mining.
+
+# v6.6.2
+- [#1958](https://github.com/xmrig/xmrig/pull/1958) Added example mining scripts to help new miners.
+- [#1959](https://github.com/xmrig/xmrig/pull/1959) Optimized JIT compiler.
+- [#1960](https://github.com/xmrig/xmrig/pull/1960) Fixed RandomX init when switching to other algo and back.
+
+# v6.6.1
+- Fixed, benchmark validation on NUMA hardware produced incorrect results in some conditions.
+
+# v6.6.0
+- Online benchmark protocol upgraded to v2, validation not compatible with previous versions.
+  - Single thread benchmark is now cheat-resistant; it is not possible to speed it up with multiple threads.
+  - RandomX dataset is now always initialized with a static seed, to prevent time cheating by reporting slow dataset initialization.
+  - Zero delay online submission, to make time validation much more precise and strict.
+  - DNS cache for online benchmark to prevent unexpected delays.
+
+# v6.5.3
+- [#1946](https://github.com/xmrig/xmrig/pull/1946) Fixed MSR mod names in JSON API (v6.5.2 affected).
+
+# v6.5.2
+- [#1935](https://github.com/xmrig/xmrig/pull/1935) Separate MSR mod for Zen/Zen2 and Zen3.
+- [#1937](https://github.com/xmrig/xmrig/issues/1937) Print path to existing WinRing0 service without verbose option.
+- [#1939](https://github.com/xmrig/xmrig/pull/1939) Fixed build with gcc 4.8.
+- [#1941](https://github.com/xmrig/xmrig/pull/1941) Added CPUID info to JSON report.
+- [#1942](https://github.com/xmrig/xmrig/pull/1942) Fixed alignment modification in memory pool.
+- [#1944](https://github.com/xmrig/xmrig/pull/1944) Updated `randomx_boost.sh` with new MSR mod.
+- Added `250K` and `500K` offline benchmarks.
+
+# v6.5.1
+- [#1932](https://github.com/xmrig/xmrig/pull/1932) New MSR mod for Ryzen, up to +3.5% on Zen2 and +1-2% on Zen3.
+- [#1918](https://github.com/xmrig/xmrig/issues/1918) Fixed 1GB huge pages support on ARMv8.
+- [#1926](https://github.com/xmrig/xmrig/pull/1926) Fixed compilation on ARMv8 with GCC 9.3.0.
+- [#1929](https://github.com/xmrig/xmrig/issues/1929) Fixed build without HTTP.
+
+# v6.5.0
+- **Added [online benchmark](https://xmrig.com/benchmark) mode for sharing results.**
+  - Added new command line options: `--submit`, `--verify=ID`, `--seed=SEED`, `--hash=HASH`.
+- [#1912](https://github.com/xmrig/xmrig/pull/1912) Fixed MSR kernel module warning with new Linux kernels.
+- [#1925](https://github.com/xmrig/xmrig/pull/1925) Add checking for config files in user home directory.
+- Added vendor to ARM CPUs name and added `"arch"` field to API.
+- Removed legacy CUDA plugin API.
+
+# v6.4.0
+- [#1862](https://github.com/xmrig/xmrig/pull/1862) **RandomX: removed `rx/loki` algorithm.**
+- [#1890](https://github.com/xmrig/xmrig/pull/1890) **Added `argon2/chukwav2` algorithm.**
+- [#1895](https://github.com/xmrig/xmrig/pull/1895) [#1897](https://github.com/xmrig/xmrig/pull/1897) **Added [benchmark and stress test](https://github.com/xmrig/xmrig/blob/dev/doc/BENCHMARK.md).**
+- [#1864](https://github.com/xmrig/xmrig/pull/1864) RandomX: improved software AES performance.
+- [#1870](https://github.com/xmrig/xmrig/pull/1870) RandomX: fixed unexpected resume due to disconnect during dataset init.
+- [#1872](https://github.com/xmrig/xmrig/pull/1872) RandomX: fixed `randomx_create_vm` call.
+- [#1875](https://github.com/xmrig/xmrig/pull/1875) RandomX: fixed crash on x86.
+- [#1876](https://github.com/xmrig/xmrig/pull/1876) RandomX: added `huge-pages-jit` config parameter.
+- [#1881](https://github.com/xmrig/xmrig/pull/1881) Fixed possible race condition in hashrate counting code.
+- [#1882](https://github.com/xmrig/xmrig/pull/1882) [#1886](https://github.com/xmrig/xmrig/pull/1886) [#1887](https://github.com/xmrig/xmrig/pull/1887) [#1893](https://github.com/xmrig/xmrig/pull/1893) General code improvements.
+- [#1885](https://github.com/xmrig/xmrig/pull/1885) Added more precise hashrate calculation.
+- [#1889](https://github.com/xmrig/xmrig/pull/1889) Fixed libuv performance issue on Linux.
+
+# v6.3.5
+- [#1845](https://github.com/xmrig/xmrig/pull/1845) [#1861](https://github.com/xmrig/xmrig/pull/1861) Fixed ARM build and added CMake option `WITH_SSE4_1`.
+- [#1846](https://github.com/xmrig/xmrig/pull/1846) KawPow: fixed OpenCL memory leak.
+- [#1849](https://github.com/xmrig/xmrig/pull/1849) [#1859](https://github.com/xmrig/xmrig/pull/1859) RandomX: optimized soft AES code.
+- [#1850](https://github.com/xmrig/xmrig/pull/1850) [#1852](https://github.com/xmrig/xmrig/pull/1852) General code improvements.
+- [#1853](https://github.com/xmrig/xmrig/issues/1853) [#1856](https://github.com/xmrig/xmrig/pull/1856) [#1857](https://github.com/xmrig/xmrig/pull/1857) Fixed crash on old CPUs.
+
+# v6.3.4
+- [#1823](https://github.com/xmrig/xmrig/pull/1823) RandomX: added new option `scratchpad_prefetch_mode`.
+- [#1827](https://github.com/xmrig/xmrig/pull/1827) [#1831](https://github.com/xmrig/xmrig/pull/1831) Improved nonce iteration performance.
+- [#1828](https://github.com/xmrig/xmrig/pull/1828) RandomX: added SSE4.1-optimized Blake2b.
+- [#1830](https://github.com/xmrig/xmrig/pull/1830) RandomX: added performance profiler (for developers).
+- [#1835](https://github.com/xmrig/xmrig/pull/1835) RandomX: returned old soft AES implementation and added auto-select between the two.
+- [#1840](https://github.com/xmrig/xmrig/pull/1840) RandomX: moved more stuff to compile time, small x86 JIT compiler speedup.
+- [#1841](https://github.com/xmrig/xmrig/pull/1841) Fixed Cryptonight OpenCL for AMD 20.7.2 drivers.
+- [#1842](https://github.com/xmrig/xmrig/pull/1842) RandomX: AES improvements, a bit faster hardware AES code when compiled with MSVC.
+- [#1843](https://github.com/xmrig/xmrig/pull/1843) RandomX: improved performance of GCC compiled binaries.
+
+# v6.3.3
+- [#1817](https://github.com/xmrig/xmrig/pull/1817) Fixed self-select login sequence.
+- Added brand new [build from source](https://xmrig.com/docs/miner/build) documentation.
+- New binary downloads for macOS (`macos-x64`), FreeBSD (`freebsd-static-x64`), Linux (`linux-static-x64`), Ubuntu 18.04 (`bionic-x64`), Ubuntu 20.04 (`focal-x64`).
+- Generic Linux download `xenial-x64` renamed to `linux-x64`.
+- Builds without SSL/TLS support are no longer provided.
+- Improved CUDA loader error reporting and fixed plugin load on Linux.
+- Fixed build warnings with Clang compiler.
+- Fixed colors on macOS.
+
+# v6.3.2
+- [#1794](https://github.com/xmrig/xmrig/pull/1794) More robust 1 GB pages handling.
+  - Don't allocate 1 GB per thread if 1 GB is the default huge page size.
+  - Try to allocate scratchpad from dataset's 1 GB huge pages, if normal huge pages are not available.
+  - Correctly initialize RandomX cache if 1 GB pages fail to allocate on a first NUMA node.
+- [#1806](https://github.com/xmrig/xmrig/pull/1806) Fixed macOS battery detection.
+- [#1809](https://github.com/xmrig/xmrig/issues/1809) Improved auto configuration on ARM CPUs.
+  - Added retrieving ARM CPU names, based on lscpu code and database.
+
+# v6.3.1
+- [#1786](https://github.com/xmrig/xmrig/pull/1786) Added `pause-on-battery` option, supported on Windows and Linux.
+- Added command line options `--randomx-cache-qos` and `--argon2-impl`.
+
+# v6.3.0
+- [#1771](https://github.com/xmrig/xmrig/pull/1771) Adopted new SSE2NEON and reduced ARM-specific changes.
+- [#1774](https://github.com/xmrig/xmrig/pull/1774) RandomX: Added new option `cache_qos` in `randomx` object for cache QoS support.
+- [#1777](https://github.com/xmrig/xmrig/pull/1777) Added support for upcoming Haven offshore fork.
+  - [#1780](https://github.com/xmrig/xmrig/pull/1780) CryptoNight OpenCL: fix for long input data.
+
+# v6.2.3
+- [#1745](https://github.com/xmrig/xmrig/pull/1745) AstroBWT: fixed OpenCL compilation on some systems.
+- [#1749](https://github.com/xmrig/xmrig/pull/1749) KawPow: optimized CPU share verification.
+- [#1752](https://github.com/xmrig/xmrig/pull/1752) RandomX: added error message when MSR mod fails.
+- [#1754](https://github.com/xmrig/xmrig/issues/1754) Fixed GPU health readings for pre Vega GPUs on Linux.
+- [#1756](https://github.com/xmrig/xmrig/issues/1756) Added results and connection reports.
+- [#1759](https://github.com/xmrig/xmrig/pull/1759) KawPow: fixed DAG initialization on slower AMD GPUs.
+- [#1763](https://github.com/xmrig/xmrig/pull/1763) KawPow: fixed rare duplicate share errors.
+- [#1766](https://github.com/xmrig/xmrig/pull/1766) RandomX: small speedup on Ryzen CPUs.
+
+# v6.2.2
+- [#1742](https://github.com/xmrig/xmrig/issues/1742) Fixed crash when using HTTP API.
+
 # v6.2.1
+- [#1726](https://github.com/xmrig/xmrig/issues/1726) Fixed detection of AVX2/AVX512.
 - [#1728](https://github.com/xmrig/xmrig/issues/1728) Fixed, 32 bit Windows builds was crash on start.
 - [#1729](https://github.com/xmrig/xmrig/pull/1729) Fixed KawPow crash on old CPUs.
 - [#1730](https://github.com/xmrig/xmrig/pull/1730) Improved displaying information for compute errors on GPUs.
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -1,11 +1,11 @@
-cmake_minimum_required(VERSION 2.8)
+cmake_minimum_required(VERSION 2.8.12)
 project(xmrig)

-option(WITH_LIBCPUID        "Enable libcpuid support" ON)
 option(WITH_HWLOC           "Enable hwloc support" ON)
 option(WITH_CN_LITE         "Enable CryptoNight-Lite algorithms family" ON)
 option(WITH_CN_HEAVY        "Enable CryptoNight-Heavy algorithms family" ON)
 option(WITH_CN_PICO         "Enable CryptoNight-Pico algorithm" ON)
+option(WITH_CN_FEMTO        "Enable CryptoNight-UPX2 algorithm" ON)
 option(WITH_RANDOMX         "Enable RandomX algorithms family" ON)
 option(WITH_ARGON2          "Enable Argon2 algorithms family" ON)
 option(WITH_ASTROBWT        "Enable AstroBWT algorithms family" ON)
@@ -23,6 +23,11 @@ option(WITH_NVML            "Enable NVML (NVIDIA Management Library) support (on
 option(WITH_ADL             "Enable ADL (AMD Display Library) or sysfs support (only if OpenCL backend enabled)" ON)
 option(WITH_STRICT_CACHE    "Enable strict checks for OpenCL cache" ON)
 option(WITH_INTERLEAVE_DEBUG_LOG "Enable debug log for threads interleave" OFF)
+option(WITH_PROFILING       "Enable profiling for developers" OFF)
+option(WITH_SSE4_1          "Enable SSE 4.1 for Blake2" ON)
+option(WITH_BENCHMARK       "Enable builtin RandomX benchmark and stress test" ON)
+option(WITH_SECURE_JIT      "Enable secure access to JIT memory" OFF)
+option(WITH_DMI             "Enable DMI/SMBIOS reader" ON)

 option(BUILD_STATIC         "Build static binary" OFF)
 option(ARM_TARGET           "Force use specific ARM target 8 or 7" 0)
@@ -143,6 +148,10 @@ elseif (XMRIG_OS_APPLE)
        src/App_unix.cpp
        src/crypto/common/VirtualMemory_unix.cpp
        )
+
+    find_library(IOKIT_LIBRARY IOKit)
+    find_library(CORESERVICES_LIBRARY CoreServices)
+    set(EXTRA_LIBS ${IOKIT_LIBRARY} ${CORESERVICES_LIBRARY})
 else()
    list(APPEND SOURCES_OS
        src/App_unix.cpp
@@ -163,8 +172,8 @@ else()
    endif()
 endif()

-add_definitions(-DXMRIG_MINER_PROJECT)
-add_definitions(-D__STDC_FORMAT_MACROS -DUNICODE)
+add_definitions(-DXMRIG_MINER_PROJECT -DXMRIG_JSON_SINGLE_LINE_ARRAY)
+add_definitions(-D__STDC_FORMAT_MACROS -DUNICODE -D_FILE_OFFSET_BITS=64)

 find_package(UV REQUIRED)

@@ -188,26 +197,36 @@ if (WITH_CN_PICO)
    add_definitions(/DXMRIG_ALGO_CN_PICO)
 endif()

+if (WITH_CN_FEMTO)
+    add_definitions(/DXMRIG_ALGO_CN_FEMTO)
+endif()
+
 if (WITH_EMBEDDED_CONFIG)
    add_definitions(/DXMRIG_FEATURE_EMBEDDED_CONFIG)
 endif()

+include(src/hw/api/api.cmake)
+include(src/hw/dmi/dmi.cmake)
+
 include_directories(src)
 include_directories(src/3rdparty)
 include_directories(${UV_INCLUDE_DIR})

-if (BUILD_STATIC)
-    set(CMAKE_EXE_LINKER_FLAGS " -static")
-endif()
-
 if (WITH_DEBUG_LOG)
    add_definitions(/DAPP_DEBUG)
 endif()

-add_executable(${CMAKE_PROJECT_NAME} ${HEADERS} ${SOURCES} ${SOURCES_OS} ${SOURCES_CPUID} ${HEADERS_CRYPTO} ${SOURCES_CRYPTO} ${SOURCES_SYSLOG} ${TLS_SOURCES} ${XMRIG_ASM_SOURCES})
+add_executable(${CMAKE_PROJECT_NAME} ${HEADERS} ${SOURCES} ${SOURCES_OS} ${HEADERS_CRYPTO} ${SOURCES_CRYPTO} ${SOURCES_SYSLOG} ${TLS_SOURCES} ${XMRIG_ASM_SOURCES})
 target_link_libraries(${CMAKE_PROJECT_NAME} ${XMRIG_ASM_LIBRARY} ${OPENSSL_LIBRARIES} ${UV_LIBRARIES} ${EXTRA_LIBS} ${CPUID_LIB} ${ARGON2_LIBRARY} ${ETHASH_LIBRARY})

 if (WIN32)
-    add_custom_command(TARGET ${CMAKE_PROJECT_NAME} POST_BUILD
-        COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_SOURCE_DIR}/bin/WinRing0/WinRing0x64.sys" $<TARGET_FILE_DIR:${CMAKE_PROJECT_NAME}>)
+    add_custom_command(TARGET ${CMAKE_PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_SOURCE_DIR}/bin/WinRing0/WinRing0x64.sys" $<TARGET_FILE_DIR:${CMAKE_PROJECT_NAME}>)
+    add_custom_command(TARGET ${CMAKE_PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_SOURCE_DIR}/scripts/benchmark_1M.cmd" $<TARGET_FILE_DIR:${CMAKE_PROJECT_NAME}>)
+    add_custom_command(TARGET ${CMAKE_PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_SOURCE_DIR}/scripts/benchmark_10M.cmd" $<TARGET_FILE_DIR:${CMAKE_PROJECT_NAME}>)
+    add_custom_command(TARGET ${CMAKE_PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_SOURCE_DIR}/scripts/pool_mine_example.cmd" $<TARGET_FILE_DIR:${CMAKE_PROJECT_NAME}>)
+    add_custom_command(TARGET ${CMAKE_PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_SOURCE_DIR}/scripts/solo_mine_example.cmd" $<TARGET_FILE_DIR:${CMAKE_PROJECT_NAME}>)
+endif()
+
+if (CMAKE_CXX_COMPILER_ID MATCHES Clang AND CMAKE_BUILD_TYPE STREQUAL Release AND NOT CMAKE_GENERATOR STREQUAL Xcode)
+    add_custom_command(TARGET ${PROJECT_NAME} POST_BUILD COMMAND ${CMAKE_STRIP} ${CMAKE_PROJECT_NAME})
 endif()
--- a/README.md
+++ b/README.md
@@ -2,130 +2,35 @@

 [![Github All Releases](https://img.shields.io/github/downloads/xmrig/xmrig/total.svg)](https://github.com/xmrig/xmrig/releases)
 [![GitHub release](https://img.shields.io/github/release/xmrig/xmrig/all.svg)](https://github.com/xmrig/xmrig/releases)
-[![GitHub Release Date](https://img.shields.io/github/release-date-pre/xmrig/xmrig.svg)](https://github.com/xmrig/xmrig/releases)
+[![GitHub Release Date](https://img.shields.io/github/release-date/xmrig/xmrig.svg)](https://github.com/xmrig/xmrig/releases)
 [![GitHub license](https://img.shields.io/github/license/xmrig/xmrig.svg)](https://github.com/xmrig/xmrig/blob/master/LICENSE)
 [![GitHub stars](https://img.shields.io/github/stars/xmrig/xmrig.svg)](https://github.com/xmrig/xmrig/stargazers)
 [![GitHub forks](https://img.shields.io/github/forks/xmrig/xmrig.svg)](https://github.com/xmrig/xmrig/network)

-XMRig High performance, open source, cross platform RandomX, KawPow, CryptoNight, AstroBWT and Argon2 CPU/GPU miner, with official support for Windows.
+XMRig is a high performance, open source, cross platform RandomX, KawPow, CryptoNight and AstroBWT unified CPU/GPU miner and [RandomX benchmark](https://xmrig.com/benchmark). Official binaries are available for Windows, Linux, macOS and FreeBSD.

 ## Mining backends
-- **CPU** (x64/x86/ARM)
+- **CPU** (x64/ARMv8)
 - **OpenCL** for AMD GPUs.
 - **CUDA** for NVIDIA GPUs via external [CUDA plugin](https://github.com/xmrig/xmrig-cuda).

-<img src="doc/screenshot_v5_2_0.png" width="833" >
-
 ## Download
-* Binary releases: https://github.com/xmrig/xmrig/releases
-* Git tree: https://github.com/xmrig/xmrig.git
-  * Clone with `git clone https://github.com/xmrig/xmrig.git` :hammer: [Build instructions](https://github.com/xmrig/xmrig/wiki/Build).
+* **[Binary releases](https://github.com/xmrig/xmrig/releases)**
+* **[Build from source](https://xmrig.com/docs/miner/build)**

 ## Usage
-The preferred way to configure the miner is the [JSON config file](src/config.json) as it is more flexible and human friendly. The command line interface does not cover all features, such as mining profiles for different algorithms. Important options can be changed during runtime without miner restart by editing the config file or executing API calls.
+The preferred way to configure the miner is the [JSON config file](https://xmrig.com/docs/miner/config) as it is more flexible and human friendly. The [command line interface](https://xmrig.com/docs/miner/command-line-options) does not cover all features, such as mining profiles for different algorithms. Important options can be changed during runtime without miner restart by editing the config file or executing [API](https://xmrig.com/docs/miner/api) calls.

-* **[xmrig.com/wizard](https://xmrig.com/wizard)** helps you create initial configuration for the miner.
-* **[workers.xmrig.info](http://workers.xmrig.info)** helps manage your miners via HTTP API.
-
-### Command line options
-```
-Network:
-  -o, --url=URL                 URL of mining server
-  -a, --algo=ALGO               mining algorithm https://xmrig.com/docs/algorithms
-      --coin=COIN               specify coin instead of algorithm
-  -u, --user=USERNAME           username for mining server
-  -p, --pass=PASSWORD           password for mining server
-  -O, --userpass=U:P            username:password pair for mining server
-  -x, --proxy=HOST:PORT         connect through a SOCKS5 proxy
-  -k, --keepalive               send keepalive packet for prevent timeout (needs pool support)
-      --nicehash                enable nicehash.com support
-      --rig-id=ID               rig identifier for pool-side statistics (needs pool support)
-      --tls                     enable SSL/TLS support (needs pool support)
-      --tls-fingerprint=HEX     pool TLS certificate fingerprint for strict certificate pinning
-      --daemon                  use daemon RPC instead of pool for solo mining
-      --daemon-poll-interval=N  daemon poll interval in milliseconds (default: 1000)
-  -r, --retries=N               number of times to retry before switch to backup server (default: 5)
-  -R, --retry-pause=N           time to pause between retries (default: 5)
-      --user-agent              set custom user-agent string for pool
-      --donate-level=N          donate level, default 5%% (5 minutes in 100 minutes)
-      --donate-over-proxy=N     control donate over xmrig-proxy feature
-
-CPU backend:
-      --no-cpu                  disable CPU mining backend
-  -t, --threads=N               number of CPU threads
-  -v, --av=N                    algorithm variation, 0 auto select
-      --cpu-affinity            set process affinity to CPU core(s), mask 0x3 for cores 0 and 1
-      --cpu-priority            set process priority (0 idle, 2 normal to 5 highest)
-      --cpu-max-threads-hint=N  maximum CPU threads count (in percentage) hint for autoconfig
-      --cpu-memory-pool=N       number of 2 MB pages for persistent memory pool, -1 (auto), 0 (disable)
-      --cpu-no-yield            prefer maximum hashrate rather than system response/stability
-      --no-huge-pages           disable huge pages support
-      --asm=ASM                 ASM optimizations, possible values: auto, none, intel, ryzen, bulldozer
-      --randomx-init=N          thread count to initialize RandomX dataset
-      --randomx-no-numa         disable NUMA support for RandomX
-      --randomx-mode=MODE       RandomX mode: auto, fast, light
-      --randomx-1gb-pages       use 1GB hugepages for dataset (Linux only)
-      --randomx-wrmsr=N         write custom value (0-15) to Intel MSR register 0x1a4 or disable MSR mod (-1)
-      --randomx-no-rdmsr        disable reverting initial MSR values on exit
-      --astrobwt-max-size=N     skip hashes with large stage 2 size, default: 550, min: 400, max: 1200
-      --astrobwt-avx2           enable AVX2 optimizations for AstroBWT algorithm
-
-API:
-      --api-worker-id=ID        custom worker-id for API
-      --api-id=ID               custom instance ID for API
-      --http-host=HOST          bind host for HTTP API (default: 127.0.0.1)
-      --http-port=N             bind port for HTTP API
-      --http-access-token=T     access token for HTTP API
-      --http-no-restricted      enable full remote access to HTTP API (only if access token set)
-
-OpenCL backend:
-      --opencl                  enable OpenCL mining backend
-      --opencl-devices=N        comma separated list of OpenCL devices to use
-      --opencl-platform=N       OpenCL platform index or name
-      --opencl-loader=PATH      path to OpenCL-ICD-Loader (OpenCL.dll or libOpenCL.so)
-      --opencl-no-cache         disable OpenCL cache
-      --print-platforms         print available OpenCL platforms and exit
-
-CUDA backend:
-      --cuda                    enable CUDA mining backend
-      --cuda-loader=PATH        path to CUDA plugin (xmrig-cuda.dll or libxmrig-cuda.so)
-      --cuda-devices=N          comma separated list of CUDA devices to use
-      --cuda-bfactor-hint=N     bfactor hint for autoconfig (0-12)
-      --cuda-bsleep-hint=N      bsleep hint for autoconfig
-      --no-nvml                 disable NVML (NVIDIA Management Library) support
-
-TLS:
-      --tls-gen=HOSTNAME        generate TLS certificate for specific hostname
-      --tls-cert=FILE           load TLS certificate chain from a file in the PEM format
-      --tls-cert-key=FILE       load TLS certificate private key from a file in the PEM format
-      --tls-dhparam=FILE        load DH parameters for DHE ciphers from a file in the PEM format
-      --tls-protocols=N         enable specified TLS protocols, example: "TLSv1 TLSv1.1 TLSv1.2 TLSv1.3"
-      --tls-ciphers=S           set list of available ciphers (TLSv1.2 and below)
-      --tls-ciphersuites=S      set list of available TLSv1.3 ciphersuites
-
-Logging:
-  -S, --syslog                  use system log for output messages
-  -l, --log-file=FILE           log all output to a file
-      --print-time=N            print hashrate report every N seconds
-      --health-print-time=N     print health report every N seconds
-      --no-color                disable colored output
-      --verbose                 verbose output
-
-Misc:
-  -c, --config=FILE             load a JSON-format configuration file
-  -B, --background              run the miner in the background
-  -V, --version                 output version information and exit
-  -h, --help                    display this help and exit
-      --dry-run                 test configuration and exit
-      --export-topology         export hwloc topology to a XML file and exit
-      --title                   set custom console window title
-      --no-title                disable setting console window title      
-```
+* **[Wizard](https://xmrig.com/wizard)** helps you create initial configuration for the miner.
+* **[Workers](http://workers.xmrig.info)** helps manage your miners via HTTP API.

 ## Donations
-* Default donation 5% (5 minutes in 100 minutes) can be reduced to 1% via option `donate-level` or disabled in source code.
+* Default donation 1% (1 minute in 100 minutes) can be increased via option `donate-level` or disabled in source code.
 * XMR: `48edfHu7V9Z84YzzMa6fUueoELZ9ZRXq9VetWzYGzKt52XU5xvqgzYnDK9URnRoJMk1j8nLwEVsaSWJ4fhdUyZijBGUicoD`
-* BTC: `1P7ujsXeX7GxQwHNnJsRMgAdNkFZmNVqJT`
+
+## Developers
+* **[xmrig](https://github.com/xmrig)**
+* **[sech1](https://github.com/SChernykh)**

 ## Contacts
 * support@xmrig.com
--- a/cmake/OpenSSL.cmake
+++ b/cmake/OpenSSL.cmake
@@ -10,6 +10,11 @@ if (WITH_TLS)
        set(OPENSSL_USE_STATIC_LIBS TRUE)
    endif()

+    if (BUILD_STATIC)
+        set(OPENSSL_USE_STATIC_LIBS TRUE)
+    endif()
+
+
    find_package(OpenSSL)

    if (OPENSSL_FOUND)
--- a/cmake/cpu.cmake
+++ b/cmake/cpu.cmake
@@ -2,9 +2,10 @@ if (NOT CMAKE_SYSTEM_PROCESSOR)
    message(WARNING "CMAKE_SYSTEM_PROCESSOR not defined")
 endif()

-
-if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64)$")
+if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64)$" AND CMAKE_SIZEOF_VOID_P EQUAL 8)
    add_definitions(/DRAPIDJSON_SSE2)
+else()
+    set(WITH_SSE4_1 OFF)
 endif()

 if (NOT ARM_TARGET)
@@ -17,7 +18,6 @@ endif()

 if (ARM_TARGET AND ARM_TARGET GREATER 6)
    set(XMRIG_ARM     ON)
-    set(WITH_LIBCPUID OFF)
    add_definitions(/DXMRIG_ARM)

    message(STATUS "Use ARM_TARGET=${ARM_TARGET} (${CMAKE_SYSTEM_PROCESSOR})")
@@ -41,3 +41,7 @@ if (ARM_TARGET AND ARM_TARGET GREATER 6)
        add_definitions(/DXMRIG_ARMv7)
    endif()
 endif()
+
+if (WITH_SSE4_1)
+    add_definitions(/DXMRIG_FEATURE_SSE4_1)
+endif()
--- a/cmake/flags.cmake
+++ b/cmake/flags.cmake
@@ -45,6 +45,10 @@ if (CMAKE_CXX_COMPILER_ID MATCHES GNU)
        set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static-libgcc -static-libstdc++")
    endif()

+    if (BUILD_STATIC)
+        set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static")
+    endif()
+
    add_definitions(/D_GNU_SOURCE)

    if (${CMAKE_VERSION} VERSION_LESS "3.1.0")
@@ -60,6 +64,9 @@ elseif (CMAKE_CXX_COMPILER_ID MATCHES MSVC)
    set(CMAKE_C_FLAGS_RELEASE "/MT /O2 /Oi /DNDEBUG /GL")
    set(CMAKE_CXX_FLAGS_RELEASE "/MT /O2 /Oi /DNDEBUG /GL")

+    set(CMAKE_C_FLAGS_RELWITHDEBINFO "/Ob1 /Zi /DRELWITHDEBINFO")
+    set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "/Ob1 /Zi /DRELWITHDEBINFO")
+
    add_definitions(/D_CRT_SECURE_NO_WARNINGS)
    add_definitions(/D_CRT_NONSTDC_NO_WARNINGS)
    add_definitions(/DNOMINMAX)
@@ -89,6 +96,10 @@ elseif (CMAKE_CXX_COMPILER_ID MATCHES Clang)
        endif()
    endif()

+    if (BUILD_STATIC)
+        set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static")
+    endif()
+
 endif()

 if (NOT WIN32)
--- a/cmake/os.cmake
+++ b/cmake/os.cmake
@@ -1,3 +1,7 @@
+if (CMAKE_SIZEOF_VOID_P EQUAL 8)
+    add_definitions(/DXMRIG_64_BIT)
+endif()
+
 if (WIN32)
    set(XMRIG_OS_WIN ON)
 elseif (APPLE)
@@ -32,6 +36,10 @@ elseif(XMRIG_OS_APPLE)
    else()
        add_definitions(/DXMRIG_OS_MACOS)
    endif()
+
+    if (XMRIG_ARM)
+        set(WITH_SECURE_JIT ON)
+    endif()
 elseif(XMRIG_OS_UNIX)
    add_definitions(/DXMRIG_OS_UNIX)

@@ -43,3 +51,7 @@ elseif(XMRIG_OS_UNIX)
        add_definitions(/DXMRIG_OS_FREEBSD)
    endif()
 endif()
+
+if (WITH_SECURE_JIT)
+    add_definitions(/DXMRIG_SECURE_JIT)
+endif()
--- a/cmake/randomx.cmake
+++ b/cmake/randomx.cmake
@@ -42,13 +42,13 @@ if (WITH_RANDOMX)
        src/crypto/rx/RxVm.cpp
    )

-    if (CMAKE_C_COMPILER_ID MATCHES MSVC)
+    if (WITH_ASM AND CMAKE_C_COMPILER_ID MATCHES MSVC)
        enable_language(ASM_MASM)
        list(APPEND SOURCES_CRYPTO
             src/crypto/randomx/jit_compiler_x86_static.asm
             src/crypto/randomx/jit_compiler_x86.cpp
            )
-    elseif (NOT XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8)
+    elseif (WITH_ASM AND NOT XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8)
        list(APPEND SOURCES_CRYPTO
             src/crypto/randomx/jit_compiler_x86_static.S
             src/crypto/randomx/jit_compiler_x86.cpp
@@ -61,8 +61,24 @@ if (WITH_RANDOMX)
             src/crypto/randomx/jit_compiler_a64.cpp
            )
        # cheat because cmake and ccache hate each other
+        if (CMAKE_GENERATOR STREQUAL Xcode)
+            set_property(SOURCE src/crypto/randomx/jit_compiler_a64_static.S PROPERTY LANGUAGE ASM)
+        else()
            set_property(SOURCE src/crypto/randomx/jit_compiler_a64_static.S PROPERTY LANGUAGE C)
        endif()
+    else()
+        list(APPEND SOURCES_CRYPTO
+             src/crypto/randomx/jit_compiler_fallback.cpp
+            )
+    endif()
+
+    if (WITH_SSE4_1)
+        list(APPEND SOURCES_CRYPTO src/crypto/randomx/blake2/blake2b_sse41.c)
+
+        if (CMAKE_C_COMPILER_ID MATCHES GNU OR CMAKE_C_COMPILER_ID MATCHES Clang)
+            set_source_files_properties(src/crypto/randomx/blake2/blake2b_sse41.c PROPERTIES COMPILE_FLAGS -msse4.1)
+        endif()
+    endif()

    if (CMAKE_CXX_COMPILER_ID MATCHES Clang)
        set_source_files_properties(src/crypto/randomx/jit_compiler_x86.cpp PROPERTIES COMPILE_FLAGS -Wno-unused-const-variable)
@@ -84,18 +100,41 @@ if (WITH_RANDOMX)
        message("-- WITH_MSR=ON")

        if (XMRIG_OS_WIN)
-            list(APPEND SOURCES_CRYPTO src/crypto/rx/Rx_win.cpp)
+            list(APPEND SOURCES_CRYPTO
+                src/crypto/rx/RxFix_win.cpp
+                src/hw/msr/Msr_win.cpp
+                )
        elseif (XMRIG_OS_LINUX)
-            list(APPEND SOURCES_CRYPTO src/crypto/rx/Rx_linux.cpp)
+            list(APPEND SOURCES_CRYPTO
+                src/crypto/rx/RxFix_linux.cpp
+                src/hw/msr/Msr_linux.cpp
+                )
        endif()

-        list(APPEND HEADERS_CRYPTO src/crypto/rx/msr/MsrItem.h)
-        list(APPEND SOURCES_CRYPTO src/crypto/rx/msr/MsrItem.cpp)
+        list(APPEND HEADERS_CRYPTO
+            src/crypto/rx/RxFix.h
+            src/crypto/rx/RxMsr.h
+            src/hw/msr/Msr.h
+            src/hw/msr/MsrItem.h
+            )
+
+        list(APPEND SOURCES_CRYPTO
+            src/crypto/rx/RxMsr.cpp
+            src/hw/msr/Msr.cpp
+            src/hw/msr/MsrItem.cpp
+            )
    else()
        remove_definitions(/DXMRIG_FEATURE_MSR)
        remove_definitions(/DXMRIG_FIX_RYZEN)
        message("-- WITH_MSR=OFF")
    endif()
+
+    if (WITH_PROFILING)
+        add_definitions(/DXMRIG_FEATURE_PROFILING)
+
+        list(APPEND HEADERS_CRYPTO src/crypto/rx/Profiler.h)
+        list(APPEND SOURCES_CRYPTO src/crypto/rx/Profiler.cpp)
+    endif()
 else()
    remove_definitions(/DXMRIG_ALGO_RANDOMX)
 endif()
--- a/doc/BENCHMARK.md
+++ b/doc/BENCHMARK.md
@@ -0,0 +1,29 @@
+# Embedded benchmark
+
+You can run the benchmark with the following commands:
+```
+xmrig --bench=1M
+xmrig --bench=10M
+xmrig --bench=1M -a rx/wow
+xmrig --bench=10M -a rx/wow
+```
+This will run between 1 and 10 million RandomX hashes, depending on the `bench` parameter, and print the time it took. The first two commands use the Monero variant (2 MB per thread, best for Zen2/Zen3 CPUs); the last two commands use the Wownero variant (1 MB per thread, useful for Intel and 1st gen Zen/Zen+ CPUs).
+
+A checksum of all the hashes is also printed so you can check the stability of your hardware: if it is green, the result is correct; if it is red, a hardware error occurred during computation. No Internet connection is required for the benchmark.
+
+Double check that you see `Huge pages 100%` both for the dataset and for all threads, and also check for `msr register values ... has been set successfully`. Without these, the result will be far from the best. Running as administrator is required for MSR and huge pages to be set up properly.
+
+![Benchmark example](https://i.imgur.com/PST3BYc.png)
+
+### Benchmark with custom config
+
+You can run the benchmark with any configuration you want. Just start without command line parameters, use a regular config.json, and add `"benchmark":"1M",` on the next line after the pool url.
+
+# Stress test
+
+You can also run a continuous stress test that is as close to real RandomX mining as possible and doesn't require any configuration:
+```
+xmrig --stress
+xmrig --stress -a rx/wow
+```
+This requires an Internet connection and runs indefinitely.
--- a/doc/CPU.md
+++ b/doc/CPU.md
@@ -1,3 +1,5 @@
+**:warning: A more recent version of this page is available at https://xmrig.com/docs/miner/config/cpu.**
+
 # CPU backend

 All CPU related settings contains in one `cpu` object in config file, CPU backend allow specify multiple profiles and allow switch between them without restrictions by pool request or config change. Default auto-configuration create reasonable minimum of profiles which cover all supported algorithms.
@@ -75,6 +77,35 @@ Each number represent one thread and means CPU affinity, this is default format
 ```
 Internal format, but can be user defined.

+## RandomX options
+
+#### `init`
+Thread count to initialize RandomX dataset. Auto-detect (`-1`) or any number greater than 0 to use that many threads.
+
+#### `init-avx2`
+Use AVX2 for dataset initialization. Faster on some CPUs. Auto-detect (`-1`), disabled (`0`), always enabled on CPUs that support AVX2 (`1`).
+
+#### `mode`
+RandomX mining mode: `auto`, `fast` (2 GB memory), `light` (256 MB memory).
+
+#### `1gb-pages`
+Use 1GB hugepages for RandomX dataset (Linux only). Enabled (`true`) or disabled (`false`). It gives 1-3% speedup.
+
+#### `wrmsr`
+[MSR mod](https://xmrig.com/docs/miner/randomx-optimization-guide/msr). Enabled (`true`) or disabled (`false`). It gives up to 15% speedup depending on your system. _(**Note**: Userspace MSR writes are no longer enabled by default; the kernel parameter `msr.allow_writes=on` must be set for Linux kernels 5.9 and later.)_
+
+#### `rdmsr`
+Restore MSR register values to their original values on exit. Used together with `wrmsr`. Enabled (`true`) or disabled (`false`).
+
+#### `cache_qos`
+[Cache QoS](https://xmrig.com/docs/miner/randomx-optimization-guide/qos). Enabled (`true`) or disabled (`false`). It's useful when you can't or don't want to mine on all CPU cores to make mining hashrate more stable.
+
+#### `numa`
+NUMA support (better hashrate on multi-CPU servers and Ryzen Threadripper 1xxx/2xxx). Enabled (`true`) or disabled (`false`).
+
+#### `scratchpad_prefetch_mode`
+Which instruction to use in RandomX loop to prefetch data from scratchpad. `1` is default and fastest in most cases. Can be off (`0`), `prefetcht0` instruction (`1`), `prefetchnta` instruction (`2`, a bit faster on Coffee Lake and a few other CPUs), `mov` instruction (`3`).
+
 ## Shared options

 #### `enabled`
@@ -83,23 +114,32 @@ Enable (`true`) or disable (`false`) CPU backend, by default `true`.
 #### `huge-pages`
 Enable (`true`) or disable (`false`) huge pages support, by default `true`.

+#### `huge-pages-jit`
+Enable (`true`) or disable (`false`) huge pages support for RandomX JIT code, by default `false`. It gives a very small boost on Ryzen CPUs, but hashrate is unstable between launches. Use with caution.
+
 #### `hw-aes`
 Force enable (`true`) or disable (`false`) hardware AES support. Default value `null` means miner autodetect this feature. Usually don't need change this option, this option useful for some rare cases when miner can't detect hardware AES, but it available. If you force enable this option, but your hardware not support it, miner will crash.

 #### `priority`
-Mining threads priority, value from `1` (lowest priority) to `5` (highest possible priority). Default value `null` means miner don't change threads priority at all.
+Mining threads priority, value from `1` (lowest priority) to `5` (highest possible priority). Default value `null` means the miner doesn't change thread priority at all. Setting a priority higher than 2 can make your PC unresponsive.
+
+#### `memory-pool` (since v4.3.0)
+Use a continuous, persistent memory block for mining threads, useful for preserving the huge pages allocation while switching algorithms. Possible values: `false` (feature disabled, by default), `true`, or a specific count of 2 MB huge pages. It helps to avoid losing huge pages for scratchpads when the RandomX dataset is updated and mining threads restart after 2-3 days of mining.
+
+#### `yield` (since v5.1.1)
+Prefer better system response/stability (`true`, default value) or maximum hashrate (`false`).

 #### `asm`
 Enable/configure or disable ASM optimizations. Possible values: `true`, `false`, `"intel"`, `"ryzen"`, `"bulldozer"`.

 #### `argon2-impl` (since v3.1.0)
-Allow override automatically detected Argon2 implementation, this option added mostly for debug purposes, default value `null` means autodetect. Other possible values: `"x86_64"`, `"SSE2"`, `"SSSE3"`, `"XOP"`, `"AVX2"`, `"AVX-512F"`. Manual selection has no safe guards, if you CPU not support required instuctions, miner will crash.
+Allows overriding the automatically detected Argon2 implementation. This option was added mostly for debug purposes; the default value `null` means autodetect. Argon2 is used in RandomX dataset initialization and also in some other mining algorithms. Other possible values: `"x86_64"`, `"SSE2"`, `"SSSE3"`, `"XOP"`, `"AVX2"`, `"AVX-512F"`. Manual selection has no safeguards: if your CPU doesn't support the required instructions, the miner will crash.
+
+#### `astrobwt-max-size`
+AstroBWT algorithm: skip hashes with large stage 2 size, default: `550`, min: `400`, max: `1200`. Optimal value depends on your CPU/GPU.
+
+#### `astrobwt-avx2`
+AstroBWT algorithm: use AVX2 code. It's faster on some CPUs and slower on others.

 #### `max-threads-hint` (since v4.2.0)
 Maximum CPU threads count (in percentage) hint for autoconfig. [CPU_MAX_USAGE.md](CPU_MAX_USAGE.md)
-
-#### `memory-pool` (since v4.3.0)
-Use continuous, persistent memory block for mining threads, useful for preserve huge pages allocation while algorithm swithing. Possible values `false` (feature disabled, by default) or `true` or specific count of 2 MB huge pages.
-
-#### `yield` (since v5.1.1)
-Prefer system better system response/stability `true` (default value) or maximum hashrate `false`.
--- a/doc/build/CMAKE_OPTIONS.md
+++ b/doc/build/CMAKE_OPTIONS.md
@@ -22,6 +22,7 @@ This feature add external dependency to libhwloc (1.10.0+) (except MSVC builds).
 * **`-DWITH_EMBEDDED_CONFIG=ON`** Enable [embedded](https://github.com/xmrig/xmrig/issues/957) config support.
 * **`-DWITH_OPENCL=OFF`** Disable OpenCL backend.
 * **`-DWITH_CUDA=OFF`** Disable CUDA backend.
+* **`-DWITH_SSE4_1=OFF`** Disable SSE 4.1 for Blake2 (useful for ARM builds).

 ## Debug options

--- a/doc/releases/5_0_1/SHA256SUMS
+++ b/doc/releases/5_0_1/SHA256SUMS
@@ -1,5 +0,0 @@
-6bb1a2e3a0fbca5195be6022f2a9fbff8a353c37c7542e7ab89420cb45b64505  xmrig-5.0.1-gcc-win32.zip
-24dba9ec281acfb2ea2c401ebd0e4e2d1f1ee5fd557da5ff3c7049020c1f78b6  xmrig-5.0.1-gcc-win64.zip
-86d65c6693ec9e35cd7547329580638b85c9eb0cf8383892a1c15199de5b556f  xmrig-5.0.1-msvc-cuda10_1-win64.zip
-0fbfe518b1c4b6993b0f66ff01302626375b15620ccf8f64d6fb97845068ffca  xmrig-5.0.1-msvc-win64.zip
-aa34890738a3494de2fa0e44db346937fea7339852f5f10b5d4655f95e2d8f1f  xmrig-5.0.1-xenial-x64.tar.gz
--- a/doc/releases/5_0_1/SHA256SUMS.sig
+++ b/doc/releases/5_0_1/SHA256SUMS.sig
@@ -1,11 +0,0 @@
-----BEGIN PGP SIGNATURE-----
-
-iQEzBAABCgAdFiEEmsTOqOZuNaXHzdwbRGpTY4vpRAkFAl3VcsoACgkQRGpTY4vp
-RAm9vQgA1MyTUU2jley2TCYLUzQy2Fffc8fbXYv64r44jbWOjC/6qo2iIlRgPhIc
-oVyPKr5TYS3QjDzCEm8IvozS0YudS6soESbPzqDonboK8pd0K4bsML9TQY2feV7A
-NL5vln0rfVHp1wxLLrQpfBqAgvJUXEyaHece6gFQN79JOGhEo2bHL2NyrOl+FViS
-b2BaMtXq410Fh+XT6ShnOaG/2EuO8ZqSGdCO6A/2LHQw1UY+mZiCvue6P6B06HmB
-WD/urOv38V389v+V+Sp4UlEW6VpBOOjvtChoVWtLt+tKzydrnt2EmoWWWg475pka
-4G6whHuMWS8CTt5/PDhJpvVXNQTIOw==
-=C764
-----END PGP SIGNATURE-----
--- a/scripts/benchmark_10M.cmd
+++ b/scripts/benchmark_10M.cmd
@@ -0,0 +1,4 @@
+@echo off
+cd %~dp0
+xmrig.exe --bench=10M --submit
+pause
--- a/scripts/benchmark_1M.cmd
+++ b/scripts/benchmark_1M.cmd
@@ -0,0 +1,4 @@
+@echo off
+cd %~dp0
+xmrig.exe --bench=1M --submit
+pause
--- a/scripts/build.hwloc.sh
+++ b/scripts/build.hwloc.sh
@@ -1,6 +1,6 @@
 #!/bin/bash -e

-HWLOC_VERSION="2.2.0"
+HWLOC_VERSION="2.4.1"

 mkdir -p deps
 mkdir -p deps/include
@@ -8,12 +8,12 @@ mkdir -p deps/lib

 mkdir -p build && cd build

-wget https://download.open-mpi.org/release/hwloc/v2.2/hwloc-${HWLOC_VERSION}.tar.bz2 -O hwloc-${HWLOC_VERSION}.tar.bz2
-tar -xjf hwloc-${HWLOC_VERSION}.tar.bz2
+wget https://download.open-mpi.org/release/hwloc/v2.4/hwloc-${HWLOC_VERSION}.tar.gz -O hwloc-${HWLOC_VERSION}.tar.gz
+tar -xzf hwloc-${HWLOC_VERSION}.tar.gz

 cd hwloc-${HWLOC_VERSION}
 ./configure --disable-shared --enable-static --disable-io --disable-libudev --disable-libxml2
-make -j$(nproc)
-cp -fr include/ ../../deps
+make -j$(nproc || sysctl -n hw.ncpu || sysctl -n hw.logicalcpu)
+cp -fr include ../../deps
 cp hwloc/.libs/libhwloc.a ../../deps/lib
 cd ..
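The `nproc || sysctl -n hw.ncpu || sysctl -n hw.logicalcpu` chain above lets the same build script work on Linux and on macOS/BSD. A minimal sketch of the same idea, wrapped in a function with a last-resort default, might look like the following. The function name `detect_jobs` and the fallback to `1` are assumptions for illustration and are not part of the scripts in this diff.

```shell
#!/bin/sh
# Hypothetical helper (not part of the scripts in this diff):
# print a parallel job count suitable for `make -j`.
detect_jobs() {
    # Try nproc first, then the two sysctl keys used in the diff.
    # stderr is silenced so a missing tool does not clutter the output;
    # the trailing `|| true` keeps the assignment from failing under `sh -e`.
    n=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || true)

    # Fall back to a single job if nothing answered or the answer is not a positive integer.
    case "$n" in
        ''|*[!0-9]*|0) n=1 ;;
    esac

    echo "$n"
}

echo "make -j$(detect_jobs)"
```

The explicit default matters because an empty substitution would turn the command into a bare `make -j`, which GNU make interprets as "no limit on parallel jobs".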
--- a/scripts/build.hwloc1.sh
+++ b/scripts/build.hwloc1.sh
@@ -0,0 +1,19 @@
+#!/bin/bash -e
+
+HWLOC_VERSION="1.11.13"
+
+mkdir -p deps
+mkdir -p deps/include
+mkdir -p deps/lib
+
+mkdir -p build && cd build
+
+wget https://download.open-mpi.org/release/hwloc/v1.11/hwloc-${HWLOC_VERSION}.tar.gz -O hwloc-${HWLOC_VERSION}.tar.gz
+tar -xzf hwloc-${HWLOC_VERSION}.tar.gz
+
+cd hwloc-${HWLOC_VERSION}
+./configure --disable-shared --enable-static --disable-io --disable-libudev --disable-libxml2
+make -j$(nproc || sysctl -n hw.ncpu || sysctl -n hw.logicalcpu)
+cp -fr include ../../deps
+cp src/.libs/libhwloc.a ../../deps/lib
+cd ..
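Both hwloc scripts hardcode the release series directory (`v2.4`, `v1.11`) next to a `HWLOC_VERSION` variable, so the two have to be updated together on every version bump. As a sketch only, the series could be derived from the version string with POSIX parameter expansion. The function name `hwloc_url` is hypothetical, and the sketch assumes the download server keeps the `v<major>.<minor>` directory layout visible in the URLs above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the scripts in this diff):
# build the hwloc download URL from a version string.
hwloc_url() {
    version=$1
    # Strip the last dot-separated component: 2.4.1 -> 2.4, 1.11.13 -> 1.11
    series=${version%.*}
    echo "https://download.open-mpi.org/release/hwloc/v${series}/hwloc-${version}.tar.gz"
}

hwloc_url "2.4.1"
hwloc_url "1.11.13"
```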
--- a/scripts/build.libressl.sh
+++ b/scripts/build.libressl.sh
@@ -13,8 +13,8 @@ tar -xzf libressl-${LIBRESSL_VERSION}.tar.gz

 cd libressl-${LIBRESSL_VERSION}
 ./configure --disable-shared
-make -j$(nproc)
-cp -fr include/ ../../deps
+make -j$(nproc || sysctl -n hw.ncpu || sysctl -n hw.logicalcpu)
+cp -fr include ../../deps
 cp crypto/.libs/libcrypto.a ../../deps/lib
 cp ssl/.libs/libssl.a ../../deps/lib
 cd ..
--- a/scripts/build.openssl.sh
+++ b/scripts/build.openssl.sh
@@ -1,6 +1,6 @@
 #!/bin/bash -e

-OPENSSL_VERSION="1.1.1g"
+OPENSSL_VERSION="1.1.1k"

 mkdir -p deps
 mkdir -p deps/include
@@ -13,8 +13,8 @@ tar -xzf openssl-${OPENSSL_VERSION}.tar.gz

 cd openssl-${OPENSSL_VERSION}
 ./config -no-shared -no-asm -no-zlib -no-comp -no-dgram -no-filenames -no-cms
-make -j$(nproc)
-cp -fr include/ ../../deps
+make -j$(nproc || sysctl -n hw.ncpu || sysctl -n hw.logicalcpu)
+cp -fr include ../../deps
 cp libcrypto.a ../../deps/lib
 cp libssl.a ../../deps/lib
 cd ..
--- a/scripts/build.uv.sh
+++ b/scripts/build.uv.sh
@@ -1,6 +1,6 @@
 #!/bin/bash -e

-UV_VERSION="1.38.0"
+UV_VERSION="1.41.0"

 mkdir -p deps
 mkdir -p deps/include
@@ -14,7 +14,7 @@ tar -xzf v${UV_VERSION}.tar.gz
 cd libuv-${UV_VERSION}
 sh autogen.sh
 ./configure --disable-shared
-make -j$(nproc)
-cp -fr include/ ../../deps
+make -j$(nproc || sysctl -n hw.ncpu || sysctl -n hw.logicalcpu)
+cp -fr include ../../deps
 cp .libs/libuv.a ../../deps/lib
 cd ..
--- a/scripts/generate_cl.js
+++ b/scripts/generate_cl.js
@@ -49,7 +49,6 @@ function rx()
        '../cn/algorithm.cl',
        'randomx_constants_monero.h',
        'randomx_constants_wow.h',
-        'randomx_constants_loki.h',
        'randomx_constants_arqma.h',
        'randomx_constants_keva.h',
        'aes.cl',
--- a/scripts/pool_mine_example.cmd
+++ b/scripts/pool_mine_example.cmd
@@ -0,0 +1,20 @@
+:: Example batch file for mining Monero at a pool
+::
+:: Format:
+::	xmrig.exe -o <pool address>:<pool port> -u <pool username/wallet> -p <pool password>
+::
+:: Fields:
+::	pool address		The host name of the pool stratum or its IP address, for example pool.hashvault.pro
+::	pool port 		The port of the pool's stratum to connect to, for example 3333. Check your pool's getting started page.
+::	pool username/wallet 	For most pools, this is the wallet address you want to mine to. Some pools require a username
+::	pool password 		For most pools this can be just 'x'. For pools using usernames, you may need to provide a password as configured on the pool.
+::
+:: List of Monero mining pools:
+::	https://miningpoolstats.stream/monero
+::
+:: Choose pools outside of top 5 to help Monero network be more decentralized!
+:: Smaller pools also often have smaller fees/payout limits.
+
+cd %~dp0
+xmrig.exe -o pool.hashvault.pro:3333 -u 48edfHu7V9Z84YzzMa6fUueoELZ9ZRXq9VetWzYGzKt52XU5xvqgzYnDK9URnRoJMk1j8nLwEVsaSWJ4fhdUyZijBGUicoD -p x
+pause
--- a/scripts/randomx_boost.sh
+++ b/scripts/randomx_boost.sh
@@ -1,18 +1,34 @@
-#!/bin/bash
+#!/bin/sh -e

-modprobe msr
+MSR_FILE=/sys/module/msr/parameters/allow_writes

-if cat /proc/cpuinfo | grep "AMD Ryzen" > /dev/null;
+if test -e "$MSR_FILE"; then
+	echo on > $MSR_FILE
+else
+	modprobe msr allow_writes=on
+fi
+
+if grep -E 'AMD Ryzen|AMD EPYC' /proc/cpuinfo > /dev/null;
 	then
-		echo "Detected Ryzen"
-		wrmsr -a 0xc0011022 0x510000
-		wrmsr -a 0xc001102b 0x1808cc16
+	if grep "cpu family[[:space:]]:[[:space:]]25" /proc/cpuinfo > /dev/null;
+		then
+			echo "Detected Zen3 CPU"
+			wrmsr -a 0xc0011020 0x4480000000000
+			wrmsr -a 0xc0011021 0x1c000200000040
+			wrmsr -a 0xc0011022 0xc000000401500000
+			wrmsr -a 0xc001102b 0x2000cc14
+			echo "MSR register values for Zen3 applied"
+		else
+			echo "Detected Zen1/Zen2 CPU"
 			wrmsr -a 0xc0011020 0
 			wrmsr -a 0xc0011021 0x40
-		echo "MSR register values for Ryzen applied"
-elif cat /proc/cpuinfo | grep "Intel" > /dev/null;
+			wrmsr -a 0xc0011022 0x1510000
+			wrmsr -a 0xc001102b 0x2000cc16
+			echo "MSR register values for Zen1/Zen2 applied"
+		fi
+elif grep "Intel" /proc/cpuinfo > /dev/null;
 	then
-		echo "Detected Intel"
+		echo "Detected Intel CPU"
 		wrmsr -a 0x1a4 0xf
 		echo "MSR register values for Intel applied"
 else
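The detection logic in this script can be exercised without root access or `wrmsr` by separating the classification from the register writes. The sketch below reuses the same `grep` patterns as the diff but only prints a label. The function name `classify_cpu`, the label strings, and the optional file argument are assumptions added for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of randomx_boost.sh): classify a cpuinfo-style
# file with the same patterns the script uses. It only prints a label and
# never writes to any MSR.
classify_cpu() {
    cpuinfo=${1:-/proc/cpuinfo}

    if grep -E 'AMD Ryzen|AMD EPYC' "$cpuinfo" > /dev/null 2>&1; then
        # CPU family 25 is what the script in the diff treats as Zen3.
        if grep "cpu family[[:space:]]:[[:space:]]25" "$cpuinfo" > /dev/null 2>&1; then
            echo "zen3"
        else
            echo "zen1-zen2"
        fi
    elif grep "Intel" "$cpuinfo" > /dev/null 2>&1; then
        echo "intel"
    else
        echo "unknown"
    fi
}

classify_cpu
```

Passing a saved copy of another machine's `/proc/cpuinfo` as the first argument shows which branch, and therefore which set of MSR values, the script would pick on that machine.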
--- a/scripts/solo_mine_example.cmd
+++ b/scripts/solo_mine_example.cmd
@@ -0,0 +1,16 @@
+:: Example batch file for mining Monero solo
+::
+:: Format:
+::	xmrig.exe -o <node address>:<node port> -a rx/0 -u <wallet address> --daemon
+::
+:: Fields:
+::	node address		The host name of your monerod node or its IP address. It can also be a public node with RPC enabled, for example node.xmr.to
+::	node port 		The RPC port of your monerod node to connect to, usually 18081.
+::	wallet address		Check your Monero CLI or GUI wallet to see your wallet's address.
+::
+:: Mining solo is the best way to help Monero network be more decentralized!
+:: But you will only get a payout when you find a block, which can take more than a year for a single low-end PC.
+
+cd %~dp0
+xmrig.exe -o node.xmr.to:18081 -a rx/0 -u 48edfHu7V9Z84YzzMa6fUueoELZ9ZRXq9VetWzYGzKt52XU5xvqgzYnDK9URnRoJMk1j8nLwEVsaSWJ4fhdUyZijBGUicoD --daemon
+pause
--- a/src/3rdparty/argon2/CMakeLists.txt
+++ b/src/3rdparty/argon2/CMakeLists.txt
@@ -1,4 +1,4 @@
-cmake_minimum_required(VERSION 2.8)
+cmake_minimum_required(VERSION 2.8.12)

 project(argon2 C)
 set(CMAKE_C_STANDARD 99)
--- a/src/3rdparty/fmt/LICENSE.rst
+++ b/src/3rdparty/fmt/LICENSE.rst
@@ -0,0 +1,27 @@
+Copyright (c) 2012 - present, Victor Zverovich
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+--- Optional exception to the license ---
+
+As an exception, if, as a result of your compiling your source code, portions
+of this Software are embedded into a machine-executable object form of such
+source code, you may redistribute such embedded portions in such object form
+without including the above copyright and permission notices.
--- a/src/3rdparty/fmt/README.rst
+++ b/src/3rdparty/fmt/README.rst
@@ -0,0 +1,505 @@
+{fmt}
+=====
+
+.. image:: https://travis-ci.org/fmtlib/fmt.png?branch=master
+   :target: https://travis-ci.org/fmtlib/fmt
+
+.. image:: https://ci.appveyor.com/api/projects/status/ehjkiefde6gucy1v
+   :target: https://ci.appveyor.com/project/vitaut/fmt
+
+.. image:: https://oss-fuzz-build-logs.storage.googleapis.com/badges/libfmt.svg
+   :alt: fmt is continuously fuzzed at oss-fuzz
+   :target: https://bugs.chromium.org/p/oss-fuzz/issues/list?\
+            colspec=ID%20Type%20Component%20Status%20Proj%20Reported%20Owner%20\
+            Summary&q=proj%3Dlibfmt&can=1
+
+.. image:: https://img.shields.io/badge/stackoverflow-fmt-blue.svg
+   :alt: Ask questions at StackOverflow with the tag fmt
+   :target: https://stackoverflow.com/questions/tagged/fmt
+
+**{fmt}** is an open-source formatting library providing a fast and safe
+alternative to C stdio and C++ iostreams.
+
+If you like this project, please consider donating to BYSOL,
+an initiative to help victims of political repressions in Belarus:
+https://www.facebook.com/donate/759400044849707/108388587646909/.
+
+`Documentation <https://fmt.dev>`__
+
+Q&A: ask questions on `StackOverflow with the tag fmt
+<https://stackoverflow.com/questions/tagged/fmt>`_.
+
+Try {fmt} in `Compiler Explorer <https://godbolt.org/z/Eq5763>`_.
+
+Features
+--------
+
+* Simple `format API <https://fmt.dev/latest/api.html>`_ with positional arguments
+  for localization
+* Implementation of `C++20 std::format
+  <https://en.cppreference.com/w/cpp/utility/format>`__
+* `Format string syntax <https://fmt.dev/latest/syntax.html>`_ similar to Python's
+  `format <https://docs.python.org/3/library/stdtypes.html#str.format>`_
+* Fast IEEE 754 floating-point formatter with correct rounding, shortness and
+  round-trip guarantees.
+* Safe `printf implementation
+  <https://fmt.dev/latest/api.html#printf-formatting>`_ including the POSIX
+  extension for positional arguments
+* Extensibility: `support for user-defined types
+  <https://fmt.dev/latest/api.html#formatting-user-defined-types>`_
+* High performance: faster than common standard library implementations of
+  ``(s)printf``, iostreams, ``to_string`` and ``to_chars``, see `Speed tests`_
+  and `Converting a hundred million integers to strings per second
+  <http://www.zverovich.net/2020/06/13/fast-int-to-string-revisited.html>`_
+* Small code size both in terms of source code with the minimum configuration
+  consisting of just three files, ``core.h``, ``format.h`` and ``format-inl.h``,
+  and compiled code; see `Compile time and code bloat`_
+* Reliability: the library has an extensive set of `tests
+  <https://github.com/fmtlib/fmt/tree/master/test>`_ and is `continuously fuzzed
+  <https://bugs.chromium.org/p/oss-fuzz/issues/list?colspec=ID%20Type%20
+  Component%20Status%20Proj%20Reported%20Owner%20Summary&q=proj%3Dlibfmt&can=1>`_
+* Safety: the library is fully type safe, errors in format strings can be
+  reported at compile time, automatic memory management prevents buffer overflow
+  errors
+* Ease of use: small self-contained code base, no external dependencies,
+  permissive MIT `license
+  <https://github.com/fmtlib/fmt/blob/master/LICENSE.rst>`_
+* `Portability <https://fmt.dev/latest/index.html#portability>`_ with
+  consistent output across platforms and support for older compilers
+* Clean warning-free codebase even on high warning levels such as
+  ``-Wall -Wextra -pedantic``
+* Locale-independence by default
+* Optional header-only configuration enabled with the ``FMT_HEADER_ONLY`` macro
+
+See the `documentation <https://fmt.dev>`_ for more details.
+
+Examples
+--------
+
+**Print to stdout** (`run <https://godbolt.org/z/Tevcjh>`_)
+
+.. code:: c++
+
+    #include <fmt/core.h>
+    
+    int main() {
+      fmt::print("Hello, world!\n");
+    }
+
+**Format a string** (`run <https://godbolt.org/z/oK8h33>`_)
+
+.. code:: c++
+
+    std::string s = fmt::format("The answer is {}.", 42);
+    // s == "The answer is 42."
+
+**Format a string using positional arguments** (`run <https://godbolt.org/z/Yn7Txe>`_)
+
+.. code:: c++
+
+    std::string s = fmt::format("I'd rather be {1} than {0}.", "right", "happy");
+    // s == "I'd rather be happy than right."
+
+**Print chrono durations** (`run <https://godbolt.org/z/K8s4Mc>`_)
+
+.. code:: c++
+
+    #include <fmt/chrono.h>
+
+    int main() {
+      using namespace std::literals::chrono_literals;
+      fmt::print("Default format: {} {}\n", 42s, 100ms);
+      fmt::print("strftime-like format: {:%H:%M:%S}\n", 3h + 15min + 30s);
+    }
+
+Output::
+
+    Default format: 42s 100ms
+    strftime-like format: 03:15:30
+
+**Print a container** (`run <https://godbolt.org/z/MjsY7c>`_)
+
+.. code:: c++
+
+    #include <vector>
+    #include <fmt/ranges.h>
+
+    int main() {
+      std::vector<int> v = {1, 2, 3};
+      fmt::print("{}\n", v);
+    }
+
+Output::
+
+    {1, 2, 3}
+
+**Check a format string at compile time**
+
+.. code:: c++
+
+    std::string s = fmt::format(FMT_STRING("{:d}"), "don't panic");
+
+This gives a compile-time error because ``d`` is an invalid format specifier for
+a string.
+
+**Write a file from a single thread**
+
+.. code:: c++
+
+    #include <fmt/os.h>
+
+    int main() {
+      auto out = fmt::output_file("guide.txt");
+      out.print("Don't {}", "Panic");
+    }
+
+This can be `5 to 9 times faster than fprintf
+<http://www.zverovich.net/2020/08/04/optimal-file-buffer-size.html>`_.
+
+**Print with colors and text styles**
+
+.. code:: c++
+
+    #include <fmt/color.h>
+
+    int main() {
+      fmt::print(fg(fmt::color::crimson) | fmt::emphasis::bold,
+                 "Hello, {}!\n", "world");
+      fmt::print(fg(fmt::color::floral_white) | bg(fmt::color::slate_gray) |
+                 fmt::emphasis::underline, "Hello, {}!\n", "мир");
+      fmt::print(fg(fmt::color::steel_blue) | fmt::emphasis::italic,
+                 "Hello, {}!\n", "世界");
+    }
+
+Output on a modern terminal:
+
+.. image:: https://user-images.githubusercontent.com/
+           576385/88485597-d312f600-cf2b-11ea-9cbe-61f535a86e28.png
+
+Benchmarks
+----------
+
+Speed tests
+~~~~~~~~~~~
+
+================= ============= ===========
+Library           Method        Run Time, s
+================= ============= ===========
+libc              printf          1.04
+libc++            std::ostream    3.05
+{fmt} 6.1.1       fmt::print      0.75
+Boost Format 1.67 boost::format   7.24
+Folly Format      folly::format   2.23
+================= ============= ===========
+
+{fmt} is the fastest of the benchmarked methods, ~35% faster than ``printf``.
+
+The above results were generated by building ``tinyformat_test.cpp`` on macOS
+10.14.6 with ``clang++ -O3 -DNDEBUG -DSPEED_TEST -DHAVE_FORMAT``, and taking the
+best of three runs. In the test, the format string ``"%0.10f:%04d:%+g:%s:%p:%c:%%\n"``
+or equivalent is filled 2,000,000 times with output sent to ``/dev/null``; for
+further details refer to the `source
+<https://github.com/fmtlib/format-benchmark/blob/master/tinyformat_test.cpp>`_.
+
+{fmt} is up to 10x faster than ``std::ostringstream`` and ``sprintf`` on
+floating-point formatting (`dtoa-benchmark <https://github.com/fmtlib/dtoa-benchmark>`_)
+and faster than `double-conversion <https://github.com/google/double-conversion>`_:
+
+.. image:: https://user-images.githubusercontent.com/576385/
+           69767160-cdaca400-112f-11ea-9fc5-347c9f83caad.png
+   :target: https://fmt.dev/unknown_mac64_clang10.0.html
+
+Compile time and code bloat
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The script `bloat-test.py
+<https://github.com/fmtlib/format-benchmark/blob/master/bloat-test.py>`_
+from `format-benchmark <https://github.com/fmtlib/format-benchmark>`_
+tests compile time and code bloat for nontrivial projects.
+It generates 100 translation units and uses ``printf()`` or its alternative
+five times in each to simulate a medium sized project.  The resulting
+executable size and compile time (Apple LLVM version 8.1.0 (clang-802.0.42),
+macOS Sierra, best of three) is shown in the following tables.
+
+**Optimized build (-O3)**
+
+============= =============== ==================== ==================
+Method        Compile Time, s Executable size, KiB Stripped size, KiB
+============= =============== ==================== ==================
+printf                    2.6                   29                 26
+printf+string            16.4                   29                 26
+iostreams                31.1                   59                 55
+{fmt}                    19.0                   37                 34
+Boost Format             91.9                  226                203
+Folly Format            115.7                  101                 88
+============= =============== ==================== ==================
+
+As you can see, {fmt} has 60% less overhead in terms of resulting binary code
+size compared to iostreams and comes pretty close to ``printf``. Boost Format
+and Folly Format have the largest overheads.
+
+``printf+string`` is the same as ``printf`` but with extra ``<string>``
+include to measure the overhead of the latter.
+
+**Non-optimized build**
+
+============= =============== ==================== ==================
+Method        Compile Time, s Executable size, KiB Stripped size, KiB
+============= =============== ==================== ==================
+printf                    2.2                   33                 30
+printf+string            16.0                   33                 30
+iostreams                28.3                   56                 52
+{fmt}                    18.2                   59                 50
+Boost Format             54.1                  365                303
+Folly Format             79.9                  445                430
+============= =============== ==================== ==================
+
+``libc``, ``lib(std)c++`` and ``libfmt`` are all linked as shared libraries to
+compare formatting function overhead only. Boost Format is a
+header-only library so it doesn't provide any linkage options.
+
+Running the tests
+~~~~~~~~~~~~~~~~~
+
+Please refer to `Building the library`__ for the instructions on how to build
+the library and run the unit tests.
+
+__ https://fmt.dev/latest/usage.html#building-the-library
+
+Benchmarks reside in a separate repository,
+`format-benchmark <https://github.com/fmtlib/format-benchmark>`_,
+so to run the benchmarks you first need to clone that repository and
+generate Makefiles with CMake::
+
+    $ git clone --recursive https://github.com/fmtlib/format-benchmark.git
+    $ cd format-benchmark
+    $ cmake .
+
+Then you can run the speed test::
+
+    $ make speed-test
+
+or the bloat test::
+
+    $ make bloat-test
+
+Projects using this library
+---------------------------
+
+* `0 A.D. <https://play0ad.com/>`_: A free, open-source, cross-platform
+  real-time strategy game
+
+* `AMPL/MP <https://github.com/ampl/mp>`_:
+  An open-source library for mathematical programming
+
+* `Aseprite <https://github.com/aseprite/aseprite>`_:
+  Animated sprite editor & pixel art tool 
+
+* `AvioBook <https://www.aviobook.aero/en>`_: A comprehensive aircraft
+  operations suite
+  
+* `Celestia <https://celestia.space/>`_: Real-time 3D visualization of space
+
+* `Ceph <https://ceph.com/>`_: A scalable distributed storage system
+
+* `ccache <https://ccache.dev/>`_: A compiler cache
+
+* `ClickHouse <https://github.com/ClickHouse/ClickHouse>`_: An analytical
+  database management system
+
+* `CUAUV <http://cuauv.org/>`_: Cornell University's autonomous underwater
+  vehicle
+
+* `Drake <https://drake.mit.edu/>`_: A planning, control, and analysis toolbox
+  for nonlinear dynamical systems (MIT)
+
+* `Envoy <https://lyft.github.io/envoy/>`_: C++ L7 proxy and communication bus
+  (Lyft)
+
+* `FiveM <https://fivem.net/>`_: A modification framework for GTA V
+
+* `Folly <https://github.com/facebook/folly>`_: Facebook open-source library
+
+* `HarpyWar/pvpgn <https://github.com/pvpgn/pvpgn-server>`_:
+  Player vs Player Gaming Network with tweaks
+
+* `KBEngine <https://github.com/kbengine/kbengine>`_: An open-source MMOG server
+  engine
+
+* `Keypirinha <https://keypirinha.com/>`_: A semantic launcher for Windows
+
+* `Kodi <https://kodi.tv/>`_ (formerly xbmc): Home theater software
+
+* `Knuth <https://kth.cash/>`_: High-performance Bitcoin full-node
+
+* `Microsoft Verona <https://github.com/microsoft/verona>`_:
+  Research programming language for concurrent ownership
+
+* `MongoDB <https://mongodb.com/>`_: Distributed document database
+
+* `MongoDB Smasher <https://github.com/duckie/mongo_smasher>`_: A small tool to
+  generate randomized datasets
+
+* `OpenSpace <https://openspaceproject.com/>`_: An open-source
+  astrovisualization framework
+
+* `PenUltima Online (POL) <https://www.polserver.com/>`_:
+  An MMO server, compatible with most Ultima Online clients
+
+* `PyTorch <https://github.com/pytorch/pytorch>`_: An open-source machine
+  learning library
+
+* `quasardb <https://www.quasardb.net/>`_: A distributed, high-performance,
+  associative database
+  
+* `Quill <https://github.com/odygrd/quill>`_: Asynchronous low-latency logging library
+
+* `QKW <https://github.com/ravijanjam/qkw>`_: Generalizing aliasing to simplify
+  navigation, and executing complex multi-line terminal command sequences
+
+* `readpe <https://bitbucket.org/sys_dev/readpe>`_: Read Portable Executable
+
+* `redis-cerberus <https://github.com/HunanTV/redis-cerberus>`_: A Redis cluster
+  proxy
+
+* `redpanda <https://vectorized.io/redpanda>`_: A 10x faster Kafka® replacement
+  for mission critical systems written in C++
+
+* `rpclib <http://rpclib.net/>`_: A modern C++ msgpack-RPC server and client
+  library
+
+* `Salesforce Analytics Cloud
+  <https://www.salesforce.com/analytics-cloud/overview/>`_:
+  Business intelligence software
+
+* `Scylla <https://www.scylladb.com/>`_: A Cassandra-compatible NoSQL data store
+  that can handle 1 million transactions per second on a single server
+
+* `Seastar <http://www.seastar-project.org/>`_: An advanced, open-source C++
+  framework for high-performance server applications on modern hardware
+
+* `spdlog <https://github.com/gabime/spdlog>`_: Super fast C++ logging library
+
+* `Stellar <https://www.stellar.org/>`_: Financial platform
+
+* `Touch Surgery <https://www.touchsurgery.com/>`_: Surgery simulator
+
+* `TrinityCore <https://github.com/TrinityCore/TrinityCore>`_: Open-source
+  MMORPG framework
+
+* `Windows Terminal <https://github.com/microsoft/terminal>`_: The new Windows
+  Terminal
+
+`More... <https://github.com/search?q=fmtlib&type=Code>`_
+
+If you are aware of other projects using this library, please let me know
+by `email <mailto:victor.zverovich@gmail.com>`_ or by submitting an
+`issue <https://github.com/fmtlib/fmt/issues>`_.
+
+Motivation
+----------
+
+So why yet another formatting library?
+
+There are plenty of methods for doing this task, from standard ones like
+the printf family of functions and iostreams to the Boost Format and FastFormat
+libraries. The reason for creating a new library is that every existing
+solution that I found either had serious issues or didn't provide
+all the features I needed.
+
+printf
+~~~~~~
+
+The good thing about ``printf`` is that it is pretty fast and readily available,
+being a part of the C standard library. The main drawback is that it
+doesn't support user-defined types. ``printf`` also has safety issues, although
+they are somewhat mitigated with `__attribute__ ((format (printf, ...)))
+<https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html>`_ in GCC.
+There is a POSIX extension that adds positional arguments required for
+`i18n <https://en.wikipedia.org/wiki/Internationalization_and_localization>`_
+to ``printf``, but it is not a part of C99 and may not be available on some
+platforms.
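
As a rough illustration of the POSIX positional-argument extension mentioned above, the sketch below lets the format string decide the order in which two arguments appear, which is what translated messages need. ``join_words`` is a hypothetical helper, and the code assumes a C library that implements the ``%n$`` syntax (glibc does); ISO C does not guarantee it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Joins two words, optionally in swapped order, using POSIX positional
// conversions. Assumption: the C library supports "%1$s"-style operands.
std::string join_words(bool swapped, const char* first, const char* second) {
  assert(first != nullptr && second != nullptr);
  char buf[64];
  // The arguments are always passed in the same order; only the format
  // string changes, as it would between two translations of one message.
  int n = swapped
              ? std::snprintf(buf, sizeof(buf), "%2$s %1$s", first, second)
              : std::snprintf(buf, sizeof(buf), "%1$s %2$s", first, second);
  if (n < 0) return std::string();
  std::size_t len = static_cast<std::size_t>(n);
  if (len >= sizeof(buf)) len = sizeof(buf) - 1;  // output was truncated
  return std::string(buf, len);
}
```

Some compilers warn about the ``$`` operand numbers in pedantic mode, which reflects the portability caveat in the text.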
+
+iostreams
+~~~~~~~~~
+
+The main issue with iostreams is best illustrated with an example:
+
+.. code:: c++
+
+    std::cout << std::setprecision(2) << std::fixed << 1.23456 << "\n";
+
+which is a lot of typing compared to printf:
+
+.. code:: c++
+
+    printf("%.2f\n", 1.23456);
+
+Matthew Wilson, the author of FastFormat, called this "chevron hell". iostreams
+don't support positional arguments by design.
+
+The good part is that iostreams support user-defined types and are safe although
+error handling is awkward.
+
+Boost Format
+~~~~~~~~~~~~
+
+This is a very powerful library which supports both ``printf``-like format
+strings and positional arguments. Its main drawback is performance. According to
+various benchmarks, it is much slower than other methods considered here. Boost
+Format also has excessive build times and severe code bloat issues (see
+`Benchmarks`_).
+
+FastFormat
+~~~~~~~~~~
+
+This is an interesting library which is fast, safe and has positional arguments.
+However, it has significant limitations. To quote its author:
+
+    Three features that have no hope of being accommodated within the
+    current design are:
+
+    * Leading zeros (or any other non-space padding)
+    * Octal/hexadecimal encoding
+    * Runtime width/alignment specification
+
+It is also quite big and has a heavy dependency, STLSoft, which might be too
+restrictive for using it in some projects.
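
To make the three features in the quoted list concrete, here is a small sketch that uses plain ``printf`` conversions rather than FastFormat or {fmt}. ``pad_hex`` is a hypothetical helper written only for this illustration: the ``0`` flag requests zero padding, ``*`` takes the width from an argument at run time, and ``x`` selects hexadecimal output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats `value` in hexadecimal, zero-padded to a width chosen at run time.
// Hypothetical helper for illustration only.
std::string pad_hex(unsigned value, int width) {
  assert(width >= 0 && width < 32);  // keeps the output well within buf
  char buf[64];
  // "%0*x": '0' = leading zeros, '*' = runtime width, 'x' = hexadecimal.
  int n = std::snprintf(buf, sizeof(buf), "%0*x", width, value);
  if (n < 0) return std::string();
  return std::string(buf, static_cast<std::size_t>(n));
}
```

For example, ``pad_hex(255, 6)`` produces ``0000ff``; a width smaller than the number of digits leaves the value unpadded.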
+
+Boost Spirit.Karma
+~~~~~~~~~~~~~~~~~~
+
+This is not really a formatting library but I decided to include it here for
+completeness. Like iostreams, it suffers from the problem of mixing verbatim text
+with arguments. The library is pretty fast, but slower on integer formatting
+than ``fmt::format_to`` with format string compilation on Karma's own benchmark,
+see `Converting a hundred million integers to strings per second
+<http://www.zverovich.net/2020/06/13/fast-int-to-string-revisited.html>`_.
+
+License
+-------
+
+{fmt} is distributed under the MIT `license
+<https://github.com/fmtlib/fmt/blob/master/LICENSE.rst>`_.
+
+Documentation License
+---------------------
+
+The `Format String Syntax <https://fmt.dev/latest/syntax.html>`_
+section in the documentation is based on the one from Python `string module
+documentation <https://docs.python.org/3/library/string.html#module-string>`_.
+For this reason the documentation is distributed under the Python Software
+Foundation license available in `doc/python-license.txt
+<https://raw.github.com/fmtlib/fmt/master/doc/python-license.txt>`_.
+It only applies if you distribute the documentation of {fmt}.
+
+Maintainers
+-----------
+
+The {fmt} library is maintained by Victor Zverovich (`vitaut
+<https://github.com/vitaut>`_) and Jonathan Müller (`foonathan
+<https://github.com/foonathan>`_) with contributions from many other people.
+See `Contributors <https://github.com/fmtlib/fmt/graphs/contributors>`_ and
+`Releases <https://github.com/fmtlib/fmt/releases>`_ for some of the names.
+Let us know if your contribution is not listed or is mentioned incorrectly and
+we'll make it right.
--- a/src/3rdparty/fmt/chrono.h
+++ b/src/3rdparty/fmt/chrono.h
--- a/src/3rdparty/fmt/color.h
+++ b/src/3rdparty/fmt/color.h
@@ -0,0 +1,602 @@
+// Formatting library for C++ - color support
+//
+// Copyright (c) 2018 - present, Victor Zverovich and fmt contributors
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+#ifndef FMT_COLOR_H_
+#define FMT_COLOR_H_
+
+#include "format.h"
+
+FMT_BEGIN_NAMESPACE
+
+enum class color : uint32_t {
+  alice_blue = 0xF0F8FF,               // rgb(240,248,255)
+  antique_white = 0xFAEBD7,            // rgb(250,235,215)
+  aqua = 0x00FFFF,                     // rgb(0,255,255)
+  aquamarine = 0x7FFFD4,               // rgb(127,255,212)
+  azure = 0xF0FFFF,                    // rgb(240,255,255)
+  beige = 0xF5F5DC,                    // rgb(245,245,220)
+  bisque = 0xFFE4C4,                   // rgb(255,228,196)
+  black = 0x000000,                    // rgb(0,0,0)
+  blanched_almond = 0xFFEBCD,          // rgb(255,235,205)
+  blue = 0x0000FF,                     // rgb(0,0,255)
+  blue_violet = 0x8A2BE2,              // rgb(138,43,226)
+  brown = 0xA52A2A,                    // rgb(165,42,42)
+  burly_wood = 0xDEB887,               // rgb(222,184,135)
+  cadet_blue = 0x5F9EA0,               // rgb(95,158,160)
+  chartreuse = 0x7FFF00,               // rgb(127,255,0)
+  chocolate = 0xD2691E,                // rgb(210,105,30)
+  coral = 0xFF7F50,                    // rgb(255,127,80)
+  cornflower_blue = 0x6495ED,          // rgb(100,149,237)
+  cornsilk = 0xFFF8DC,                 // rgb(255,248,220)
+  crimson = 0xDC143C,                  // rgb(220,20,60)
+  cyan = 0x00FFFF,                     // rgb(0,255,255)
+  dark_blue = 0x00008B,                // rgb(0,0,139)
+  dark_cyan = 0x008B8B,                // rgb(0,139,139)
+  dark_golden_rod = 0xB8860B,          // rgb(184,134,11)
+  dark_gray = 0xA9A9A9,                // rgb(169,169,169)
+  dark_green = 0x006400,               // rgb(0,100,0)
+  dark_khaki = 0xBDB76B,               // rgb(189,183,107)
+  dark_magenta = 0x8B008B,             // rgb(139,0,139)
+  dark_olive_green = 0x556B2F,         // rgb(85,107,47)
+  dark_orange = 0xFF8C00,              // rgb(255,140,0)
+  dark_orchid = 0x9932CC,              // rgb(153,50,204)
+  dark_red = 0x8B0000,                 // rgb(139,0,0)
+  dark_salmon = 0xE9967A,              // rgb(233,150,122)
+  dark_sea_green = 0x8FBC8F,           // rgb(143,188,143)
+  dark_slate_blue = 0x483D8B,          // rgb(72,61,139)
+  dark_slate_gray = 0x2F4F4F,          // rgb(47,79,79)
+  dark_turquoise = 0x00CED1,           // rgb(0,206,209)
+  dark_violet = 0x9400D3,              // rgb(148,0,211)
+  deep_pink = 0xFF1493,                // rgb(255,20,147)
+  deep_sky_blue = 0x00BFFF,            // rgb(0,191,255)
+  dim_gray = 0x696969,                 // rgb(105,105,105)
+  dodger_blue = 0x1E90FF,              // rgb(30,144,255)
+  fire_brick = 0xB22222,               // rgb(178,34,34)
+  floral_white = 0xFFFAF0,             // rgb(255,250,240)
+  forest_green = 0x228B22,             // rgb(34,139,34)
+  fuchsia = 0xFF00FF,                  // rgb(255,0,255)
+  gainsboro = 0xDCDCDC,                // rgb(220,220,220)
+  ghost_white = 0xF8F8FF,              // rgb(248,248,255)
+  gold = 0xFFD700,                     // rgb(255,215,0)
+  golden_rod = 0xDAA520,               // rgb(218,165,32)
+  gray = 0x808080,                     // rgb(128,128,128)
+  green = 0x008000,                    // rgb(0,128,0)
+  green_yellow = 0xADFF2F,             // rgb(173,255,47)
+  honey_dew = 0xF0FFF0,                // rgb(240,255,240)
+  hot_pink = 0xFF69B4,                 // rgb(255,105,180)
+  indian_red = 0xCD5C5C,               // rgb(205,92,92)
+  indigo = 0x4B0082,                   // rgb(75,0,130)
+  ivory = 0xFFFFF0,                    // rgb(255,255,240)
+  khaki = 0xF0E68C,                    // rgb(240,230,140)
+  lavender = 0xE6E6FA,                 // rgb(230,230,250)
+  lavender_blush = 0xFFF0F5,           // rgb(255,240,245)
+  lawn_green = 0x7CFC00,               // rgb(124,252,0)
+  lemon_chiffon = 0xFFFACD,            // rgb(255,250,205)
+  light_blue = 0xADD8E6,               // rgb(173,216,230)
+  light_coral = 0xF08080,              // rgb(240,128,128)
+  light_cyan = 0xE0FFFF,               // rgb(224,255,255)
+  light_golden_rod_yellow = 0xFAFAD2,  // rgb(250,250,210)
+  light_gray = 0xD3D3D3,               // rgb(211,211,211)
+  light_green = 0x90EE90,              // rgb(144,238,144)
+  light_pink = 0xFFB6C1,               // rgb(255,182,193)
+  light_salmon = 0xFFA07A,             // rgb(255,160,122)
+  light_sea_green = 0x20B2AA,          // rgb(32,178,170)
+  light_sky_blue = 0x87CEFA,           // rgb(135,206,250)
+  light_slate_gray = 0x778899,         // rgb(119,136,153)
+  light_steel_blue = 0xB0C4DE,         // rgb(176,196,222)
+  light_yellow = 0xFFFFE0,             // rgb(255,255,224)
+  lime = 0x00FF00,                     // rgb(0,255,0)
+  lime_green = 0x32CD32,               // rgb(50,205,50)
+  linen = 0xFAF0E6,                    // rgb(250,240,230)
+  magenta = 0xFF00FF,                  // rgb(255,0,255)
+  maroon = 0x800000,                   // rgb(128,0,0)
+  medium_aquamarine = 0x66CDAA,        // rgb(102,205,170)
+  medium_blue = 0x0000CD,              // rgb(0,0,205)
+  medium_orchid = 0xBA55D3,            // rgb(186,85,211)
+  medium_purple = 0x9370DB,            // rgb(147,112,219)
+  medium_sea_green = 0x3CB371,         // rgb(60,179,113)
+  medium_slate_blue = 0x7B68EE,        // rgb(123,104,238)
+  medium_spring_green = 0x00FA9A,      // rgb(0,250,154)
+  medium_turquoise = 0x48D1CC,         // rgb(72,209,204)
+  medium_violet_red = 0xC71585,        // rgb(199,21,133)
+  midnight_blue = 0x191970,            // rgb(25,25,112)
+  mint_cream = 0xF5FFFA,               // rgb(245,255,250)
+  misty_rose = 0xFFE4E1,               // rgb(255,228,225)
+  moccasin = 0xFFE4B5,                 // rgb(255,228,181)
+  navajo_white = 0xFFDEAD,             // rgb(255,222,173)
+  navy = 0x000080,                     // rgb(0,0,128)
+  old_lace = 0xFDF5E6,                 // rgb(253,245,230)
+  olive = 0x808000,                    // rgb(128,128,0)
+  olive_drab = 0x6B8E23,               // rgb(107,142,35)
+  orange = 0xFFA500,                   // rgb(255,165,0)
+  orange_red = 0xFF4500,               // rgb(255,69,0)
+  orchid = 0xDA70D6,                   // rgb(218,112,214)
+  pale_golden_rod = 0xEEE8AA,          // rgb(238,232,170)
+  pale_green = 0x98FB98,               // rgb(152,251,152)
+  pale_turquoise = 0xAFEEEE,           // rgb(175,238,238)
+  pale_violet_red = 0xDB7093,          // rgb(219,112,147)
+  papaya_whip = 0xFFEFD5,              // rgb(255,239,213)
+  peach_puff = 0xFFDAB9,               // rgb(255,218,185)
+  peru = 0xCD853F,                     // rgb(205,133,63)
+  pink = 0xFFC0CB,                     // rgb(255,192,203)
+  plum = 0xDDA0DD,                     // rgb(221,160,221)
+  powder_blue = 0xB0E0E6,              // rgb(176,224,230)
+  purple = 0x800080,                   // rgb(128,0,128)
+  rebecca_purple = 0x663399,           // rgb(102,51,153)
+  red = 0xFF0000,                      // rgb(255,0,0)
+  rosy_brown = 0xBC8F8F,               // rgb(188,143,143)
+  royal_blue = 0x4169E1,               // rgb(65,105,225)
+  saddle_brown = 0x8B4513,             // rgb(139,69,19)
+  salmon = 0xFA8072,                   // rgb(250,128,114)
+  sandy_brown = 0xF4A460,              // rgb(244,164,96)
+  sea_green = 0x2E8B57,                // rgb(46,139,87)
+  sea_shell = 0xFFF5EE,                // rgb(255,245,238)
+  sienna = 0xA0522D,                   // rgb(160,82,45)
+  silver = 0xC0C0C0,                   // rgb(192,192,192)
+  sky_blue = 0x87CEEB,                 // rgb(135,206,235)
+  slate_blue = 0x6A5ACD,               // rgb(106,90,205)
+  slate_gray = 0x708090,               // rgb(112,128,144)
+  snow = 0xFFFAFA,                     // rgb(255,250,250)
+  spring_green = 0x00FF7F,             // rgb(0,255,127)
+  steel_blue = 0x4682B4,               // rgb(70,130,180)
+  tan = 0xD2B48C,                      // rgb(210,180,140)
+  teal = 0x008080,                     // rgb(0,128,128)
+  thistle = 0xD8BFD8,                  // rgb(216,191,216)
+  tomato = 0xFF6347,                   // rgb(255,99,71)
+  turquoise = 0x40E0D0,                // rgb(64,224,208)
+  violet = 0xEE82EE,                   // rgb(238,130,238)
+  wheat = 0xF5DEB3,                    // rgb(245,222,179)
+  white = 0xFFFFFF,                    // rgb(255,255,255)
+  white_smoke = 0xF5F5F5,              // rgb(245,245,245)
+  yellow = 0xFFFF00,                   // rgb(255,255,0)
+  yellow_green = 0x9ACD32              // rgb(154,205,50)
+};                                     // enum class color
+
+enum class terminal_color : uint8_t {
+  black = 30,
+  red,
+  green,
+  yellow,
+  blue,
+  magenta,
+  cyan,
+  white,
+  bright_black = 90,
+  bright_red,
+  bright_green,
+  bright_yellow,
+  bright_blue,
+  bright_magenta,
+  bright_cyan,
+  bright_white
+};
+
+enum class emphasis : uint8_t {
+  bold = 1,
+  italic = 1 << 1,
+  underline = 1 << 2,
+  strikethrough = 1 << 3
+};
+
+// rgb is a struct for red, green and blue colors.
+// Using the name "rgb" makes some editors show the color in a tooltip.
+struct rgb {
+  FMT_CONSTEXPR rgb() : r(0), g(0), b(0) {}
+  FMT_CONSTEXPR rgb(uint8_t r_, uint8_t g_, uint8_t b_) : r(r_), g(g_), b(b_) {}
+  FMT_CONSTEXPR rgb(uint32_t hex)
+      : r((hex >> 16) & 0xFF), g((hex >> 8) & 0xFF), b(hex & 0xFF) {}
+  FMT_CONSTEXPR rgb(color hex)
+      : r((uint32_t(hex) >> 16) & 0xFF),
+        g((uint32_t(hex) >> 8) & 0xFF),
+        b(uint32_t(hex) & 0xFF) {}
+  uint8_t r;
+  uint8_t g;
+  uint8_t b;
+};
+
+namespace detail {
+
+// color_type holds either an RGB color or a terminal color.
+struct color_type {
+  FMT_CONSTEXPR color_type() FMT_NOEXCEPT : is_rgb(), value{} {}
+  FMT_CONSTEXPR color_type(color rgb_color) FMT_NOEXCEPT : is_rgb(true),
+                                                           value{} {
+    value.rgb_color = static_cast<uint32_t>(rgb_color);
+  }
+  FMT_CONSTEXPR color_type(rgb rgb_color) FMT_NOEXCEPT : is_rgb(true), value{} {
+    value.rgb_color = (static_cast<uint32_t>(rgb_color.r) << 16) |
+                      (static_cast<uint32_t>(rgb_color.g) << 8) | rgb_color.b;
+  }
+  FMT_CONSTEXPR color_type(terminal_color term_color) FMT_NOEXCEPT : is_rgb(),
+                                                                     value{} {
+    value.term_color = static_cast<uint8_t>(term_color);
+  }
+  bool is_rgb;
+  union color_union {
+    uint8_t term_color;
+    uint32_t rgb_color;
+  } value;
+};
+}  // namespace detail
+
+// Experimental text formatting support.
+class text_style {
+ public:
+  FMT_CONSTEXPR text_style(emphasis em = emphasis()) FMT_NOEXCEPT
+      : set_foreground_color(),
+        set_background_color(),
+        ems(em) {}
+
+  FMT_CONSTEXPR text_style& operator|=(const text_style& rhs) {
+    if (!set_foreground_color) {
+      set_foreground_color = rhs.set_foreground_color;
+      foreground_color = rhs.foreground_color;
+    } else if (rhs.set_foreground_color) {
+      if (!foreground_color.is_rgb || !rhs.foreground_color.is_rgb)
+        FMT_THROW(format_error("can't OR a terminal color"));
+      foreground_color.value.rgb_color |= rhs.foreground_color.value.rgb_color;
+    }
+
+    if (!set_background_color) {
+      set_background_color = rhs.set_background_color;
+      background_color = rhs.background_color;
+    } else if (rhs.set_background_color) {
+      if (!background_color.is_rgb || !rhs.background_color.is_rgb)
+        FMT_THROW(format_error("can't OR a terminal color"));
+      background_color.value.rgb_color |= rhs.background_color.value.rgb_color;
+    }
+
+    ems = static_cast<emphasis>(static_cast<uint8_t>(ems) |
+                                static_cast<uint8_t>(rhs.ems));
+    return *this;
+  }
+
+  friend FMT_CONSTEXPR text_style operator|(text_style lhs,
+                                            const text_style& rhs) {
+    return lhs |= rhs;
+  }
+
+  FMT_CONSTEXPR text_style& operator&=(const text_style& rhs) {
+    if (!set_foreground_color) {
+      set_foreground_color = rhs.set_foreground_color;
+      foreground_color = rhs.foreground_color;
+    } else if (rhs.set_foreground_color) {
+      if (!foreground_color.is_rgb || !rhs.foreground_color.is_rgb)
+        FMT_THROW(format_error("can't AND a terminal color"));
+      foreground_color.value.rgb_color &= rhs.foreground_color.value.rgb_color;
+    }
+
+    if (!set_background_color) {
+      set_background_color = rhs.set_background_color;
+      background_color = rhs.background_color;
+    } else if (rhs.set_background_color) {
+      if (!background_color.is_rgb || !rhs.background_color.is_rgb)
+        FMT_THROW(format_error("can't AND a terminal color"));
+      background_color.value.rgb_color &= rhs.background_color.value.rgb_color;
+    }
+
+    ems = static_cast<emphasis>(static_cast<uint8_t>(ems) &
+                                static_cast<uint8_t>(rhs.ems));
+    return *this;
+  }
+
+  friend FMT_CONSTEXPR text_style operator&(text_style lhs,
+                                            const text_style& rhs) {
+    return lhs &= rhs;
+  }
+
+  FMT_CONSTEXPR bool has_foreground() const FMT_NOEXCEPT {
+    return set_foreground_color;
+  }
+  FMT_CONSTEXPR bool has_background() const FMT_NOEXCEPT {
+    return set_background_color;
+  }
+  FMT_CONSTEXPR bool has_emphasis() const FMT_NOEXCEPT {
+    return static_cast<uint8_t>(ems) != 0;
+  }
+  FMT_CONSTEXPR detail::color_type get_foreground() const FMT_NOEXCEPT {
+    FMT_ASSERT(has_foreground(), "no foreground specified for this style");
+    return foreground_color;
+  }
+  FMT_CONSTEXPR detail::color_type get_background() const FMT_NOEXCEPT {
+    FMT_ASSERT(has_background(), "no background specified for this style");
+    return background_color;
+  }
+  FMT_CONSTEXPR emphasis get_emphasis() const FMT_NOEXCEPT {
+    FMT_ASSERT(has_emphasis(), "no emphasis specified for this style");
+    return ems;
+  }
+
+ private:
+  FMT_CONSTEXPR text_style(bool is_foreground,
+                           detail::color_type text_color) FMT_NOEXCEPT
+      : set_foreground_color(),
+        set_background_color(),
+        ems() {
+    if (is_foreground) {
+      foreground_color = text_color;
+      set_foreground_color = true;
+    } else {
+      background_color = text_color;
+      set_background_color = true;
+    }
+  }
+
+  friend FMT_CONSTEXPR_DECL text_style fg(detail::color_type foreground)
+      FMT_NOEXCEPT;
+  friend FMT_CONSTEXPR_DECL text_style bg(detail::color_type background)
+      FMT_NOEXCEPT;
+
+  detail::color_type foreground_color;
+  detail::color_type background_color;
+  bool set_foreground_color;
+  bool set_background_color;
+  emphasis ems;
+};
+
+FMT_CONSTEXPR text_style fg(detail::color_type foreground) FMT_NOEXCEPT {
+  return text_style(/*is_foreground=*/true, foreground);
+}
+
+FMT_CONSTEXPR text_style bg(detail::color_type background) FMT_NOEXCEPT {
+  return text_style(/*is_foreground=*/false, background);
+}
+
+FMT_CONSTEXPR text_style operator|(emphasis lhs, emphasis rhs) FMT_NOEXCEPT {
+  return text_style(lhs) | rhs;
+}
+
+namespace detail {
+
+template <typename Char> struct ansi_color_escape {
+  FMT_CONSTEXPR ansi_color_escape(detail::color_type text_color,
+                                  const char* esc) FMT_NOEXCEPT {
+    // If we have a terminal color, we need to output another escape code
+    // sequence.
+    if (!text_color.is_rgb) {
+      bool is_background = esc == detail::data::background_color;
+      uint32_t value = text_color.value.term_color;
+      // Background ANSI color codes are the foreground codes plus 10.
+      if (is_background) value += 10u;
+
+      size_t index = 0;
+      buffer[index++] = static_cast<Char>('\x1b');
+      buffer[index++] = static_cast<Char>('[');
+
+      if (value >= 100u) {
+        buffer[index++] = static_cast<Char>('1');
+        value %= 100u;
+      }
+      buffer[index++] = static_cast<Char>('0' + value / 10u);
+      buffer[index++] = static_cast<Char>('0' + value % 10u);
+
+      buffer[index++] = static_cast<Char>('m');
+      buffer[index++] = static_cast<Char>('\0');
+      return;
+    }
+
+    for (int i = 0; i < 7; i++) {
+      buffer[i] = static_cast<Char>(esc[i]);
+    }
+    rgb color(text_color.value.rgb_color);
+    to_esc(color.r, buffer + 7, ';');
+    to_esc(color.g, buffer + 11, ';');
+    to_esc(color.b, buffer + 15, 'm');
+    buffer[19] = static_cast<Char>(0);
+  }
+  FMT_CONSTEXPR ansi_color_escape(emphasis em) FMT_NOEXCEPT {
+    uint8_t em_codes[4] = {};
+    uint8_t em_bits = static_cast<uint8_t>(em);
+    if (em_bits & static_cast<uint8_t>(emphasis::bold)) em_codes[0] = 1;
+    if (em_bits & static_cast<uint8_t>(emphasis::italic)) em_codes[1] = 3;
+    if (em_bits & static_cast<uint8_t>(emphasis::underline)) em_codes[2] = 4;
+    if (em_bits & static_cast<uint8_t>(emphasis::strikethrough))
+      em_codes[3] = 9;
+
+    size_t index = 0;
+    for (int i = 0; i < 4; ++i) {
+      if (!em_codes[i]) continue;
+      buffer[index++] = static_cast<Char>('\x1b');
+      buffer[index++] = static_cast<Char>('[');
+      buffer[index++] = static_cast<Char>('0' + em_codes[i]);
+      buffer[index++] = static_cast<Char>('m');
+    }
+    buffer[index++] = static_cast<Char>(0);
+  }
+  FMT_CONSTEXPR operator const Char*() const FMT_NOEXCEPT { return buffer; }
+
+  FMT_CONSTEXPR const Char* begin() const FMT_NOEXCEPT { return buffer; }
+  FMT_CONSTEXPR const Char* end() const FMT_NOEXCEPT {
+    return buffer + std::char_traits<Char>::length(buffer);
+  }
+
+ private:
+  Char buffer[7u + 3u * 4u + 1u];
+
+  static FMT_CONSTEXPR void to_esc(uint8_t c, Char* out,
+                                   char delimiter) FMT_NOEXCEPT {
+    out[0] = static_cast<Char>('0' + c / 100);
+    out[1] = static_cast<Char>('0' + c / 10 % 10);
+    out[2] = static_cast<Char>('0' + c % 10);
+    out[3] = static_cast<Char>(delimiter);
+  }
+};
+
+template <typename Char>
+FMT_CONSTEXPR ansi_color_escape<Char> make_foreground_color(
+    detail::color_type foreground) FMT_NOEXCEPT {
+  return ansi_color_escape<Char>(foreground, detail::data::foreground_color);
+}
+
+template <typename Char>
+FMT_CONSTEXPR ansi_color_escape<Char> make_background_color(
+    detail::color_type background) FMT_NOEXCEPT {
+  return ansi_color_escape<Char>(background, detail::data::background_color);
+}
+
+template <typename Char>
+FMT_CONSTEXPR ansi_color_escape<Char> make_emphasis(emphasis em) FMT_NOEXCEPT {
+  return ansi_color_escape<Char>(em);
+}
+
+template <typename Char>
+inline void fputs(const Char* chars, FILE* stream) FMT_NOEXCEPT {
+  std::fputs(chars, stream);
+}
+
+template <>
+inline void fputs<wchar_t>(const wchar_t* chars, FILE* stream) FMT_NOEXCEPT {
+  std::fputws(chars, stream);
+}
+
+template <typename Char> inline void reset_color(FILE* stream) FMT_NOEXCEPT {
+  fputs(detail::data::reset_color, stream);
+}
+
+template <> inline void reset_color<wchar_t>(FILE* stream) FMT_NOEXCEPT {
+  fputs(detail::data::wreset_color, stream);
+}
+
+template <typename Char>
+inline void reset_color(buffer<Char>& buffer) FMT_NOEXCEPT {
+  const char* begin = data::reset_color;
+  const char* end = begin + sizeof(data::reset_color) - 1;
+  buffer.append(begin, end);
+}
+
+template <typename Char>
+void vformat_to(buffer<Char>& buf, const text_style& ts,
+                basic_string_view<Char> format_str,
+                basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  bool has_style = false;
+  if (ts.has_emphasis()) {
+    has_style = true;
+    auto emphasis = detail::make_emphasis<Char>(ts.get_emphasis());
+    buf.append(emphasis.begin(), emphasis.end());
+  }
+  if (ts.has_foreground()) {
+    has_style = true;
+    auto foreground = detail::make_foreground_color<Char>(ts.get_foreground());
+    buf.append(foreground.begin(), foreground.end());
+  }
+  if (ts.has_background()) {
+    has_style = true;
+    auto background = detail::make_background_color<Char>(ts.get_background());
+    buf.append(background.begin(), background.end());
+  }
+  detail::vformat_to(buf, format_str, args);
+  if (has_style) detail::reset_color<Char>(buf);
+}
+}  // namespace detail
+
+template <typename S, typename Char = char_t<S>>
+void vprint(std::FILE* f, const text_style& ts, const S& format,
+            basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  basic_memory_buffer<Char> buf;
+  detail::vformat_to(buf, ts, to_string_view(format), args);
+  buf.push_back(Char(0));
+  detail::fputs(buf.data(), f);
+}
+
+/**
+  \rst
+  Formats a string and prints it to the specified file stream using ANSI
+  escape sequences to specify text formatting.
+
+  **Example**::
+
+    fmt::print(fmt::emphasis::bold | fg(fmt::color::red),
+               "Elapsed time: {0:.2f} seconds", 1.23);
+  \endrst
+ */
+template <typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_string<S>::value)>
+void print(std::FILE* f, const text_style& ts, const S& format_str,
+           const Args&... args) {
+  vprint(f, ts, format_str,
+         fmt::make_args_checked<Args...>(format_str, args...));
+}
+
+/**
+  \rst
+  Formats a string and prints it to stdout using ANSI escape sequences to
+  specify text formatting.
+
+  **Example**::
+
+    fmt::print(fmt::emphasis::bold | fg(fmt::color::red),
+               "Elapsed time: {0:.2f} seconds", 1.23);
+  \endrst
+ */
+template <typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_string<S>::value)>
+void print(const text_style& ts, const S& format_str, const Args&... args) {
+  return print(stdout, ts, format_str, args...);
+}
+
+template <typename S, typename Char = char_t<S>>
+inline std::basic_string<Char> vformat(
+    const text_style& ts, const S& format_str,
+    basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  basic_memory_buffer<Char> buf;
+  detail::vformat_to(buf, ts, to_string_view(format_str), args);
+  return fmt::to_string(buf);
+}
+
+/**
+  \rst
+  Formats arguments and returns the result as a string using ANSI
+  escape sequences to specify text formatting.
+
+  **Example**::
+
+    #include <fmt/color.h>
+    std::string message = fmt::format(fmt::emphasis::bold | fg(fmt::color::red),
+                                      "The answer is {}", 42);
+  \endrst
+*/
+template <typename S, typename... Args, typename Char = char_t<S>>
+inline std::basic_string<Char> format(const text_style& ts, const S& format_str,
+                                      const Args&... args) {
+  return vformat(ts, to_string_view(format_str),
+                 fmt::make_args_checked<Args...>(format_str, args...));
+}
+
+/**
+  Formats a string with the given text_style and writes the output to ``out``.
+ */
+template <typename OutputIt, typename Char,
+          FMT_ENABLE_IF(detail::is_output_iterator<OutputIt>::value)>
+OutputIt vformat_to(
+    OutputIt out, const text_style& ts, basic_string_view<Char> format_str,
+    basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  decltype(detail::get_buffer<Char>(out)) buf(detail::get_buffer_init(out));
+  detail::vformat_to(buf, ts, format_str, args);
+  return detail::get_iterator(buf);
+}
+
+/**
+  \rst
+  Formats arguments with the given text_style, writes the result to the output
+  iterator ``out`` and returns the iterator past the end of the output range.
+
+  **Example**::
+
+    std::vector<char> out;
+    fmt::format_to(std::back_inserter(out),
+                   fmt::emphasis::bold | fg(fmt::color::red), "{}", 42);
+  \endrst
+*/
+template <typename OutputIt, typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_output_iterator<OutputIt>::value&&
+                            detail::is_string<S>::value)>
+inline OutputIt format_to(OutputIt out, const text_style& ts,
+                          const S& format_str, Args&&... args) {
+  return vformat_to(out, ts, to_string_view(format_str),
+                    fmt::make_args_checked<Args...>(format_str, args...));
+}
+
+FMT_END_NAMESPACE
+
+#endif  // FMT_COLOR_H_
--- a/src/3rdparty/fmt/compile.h
+++ b/src/3rdparty/fmt/compile.h
@@ -0,0 +1,699 @@
+// Formatting library for C++ - experimental format string compilation
+//
+// Copyright (c) 2012 - present, Victor Zverovich and fmt contributors
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+#ifndef FMT_COMPILE_H_
+#define FMT_COMPILE_H_
+
+#include <vector>
+
+#include "format.h"
+
+FMT_BEGIN_NAMESPACE
+namespace detail {
+
+// A compile-time string which is compiled into fast formatting code.
+class compiled_string {};
+
+template <typename S>
+struct is_compiled_string : std::is_base_of<compiled_string, S> {};
+
+/**
+  \rst
+  Converts a string literal *s* into a format string that will be parsed at
+  compile time and converted into efficient formatting code. Requires C++17
+  ``constexpr if`` compiler support.
+
+  **Example**::
+
+    // Converts 42 into std::string using the most efficient method and no
+    // runtime format string processing.
+    std::string s = fmt::format(FMT_COMPILE("{}"), 42);
+  \endrst
+ */
+#define FMT_COMPILE(s) FMT_STRING_IMPL(s, fmt::detail::compiled_string)
+
+template <typename T, typename... Tail>
+const T& first(const T& value, const Tail&...) {
+  return value;
+}
+
+// Part of a compiled format string. It can be either literal text or a
+// replacement field.
+template <typename Char> struct format_part {
+  enum class kind { arg_index, arg_name, text, replacement };
+
+  struct replacement {
+    arg_ref<Char> arg_id;
+    dynamic_format_specs<Char> specs;
+  };
+
+  kind part_kind;
+  union value {
+    int arg_index;
+    basic_string_view<Char> str;
+    replacement repl;
+
+    FMT_CONSTEXPR value(int index = 0) : arg_index(index) {}
+    FMT_CONSTEXPR value(basic_string_view<Char> s) : str(s) {}
+    FMT_CONSTEXPR value(replacement r) : repl(r) {}
+  } val;
+  // Position past the end of the argument id.
+  const Char* arg_id_end = nullptr;
+
+  FMT_CONSTEXPR format_part(kind k = kind::arg_index, value v = {})
+      : part_kind(k), val(v) {}
+
+  static FMT_CONSTEXPR format_part make_arg_index(int index) {
+    return format_part(kind::arg_index, index);
+  }
+  static FMT_CONSTEXPR format_part make_arg_name(basic_string_view<Char> name) {
+    return format_part(kind::arg_name, name);
+  }
+  static FMT_CONSTEXPR format_part make_text(basic_string_view<Char> text) {
+    return format_part(kind::text, text);
+  }
+  static FMT_CONSTEXPR format_part make_replacement(replacement repl) {
+    return format_part(kind::replacement, repl);
+  }
+};
+
+template <typename Char> struct part_counter {
+  unsigned num_parts = 0;
+
+  FMT_CONSTEXPR void on_text(const Char* begin, const Char* end) {
+    if (begin != end) ++num_parts;
+  }
+
+  FMT_CONSTEXPR int on_arg_id() { return ++num_parts, 0; }
+  FMT_CONSTEXPR int on_arg_id(int) { return ++num_parts, 0; }
+  FMT_CONSTEXPR int on_arg_id(basic_string_view<Char>) {
+    return ++num_parts, 0;
+  }
+
+  FMT_CONSTEXPR void on_replacement_field(int, const Char*) {}
+
+  FMT_CONSTEXPR const Char* on_format_specs(int, const Char* begin,
+                                            const Char* end) {
+    // Find the matching brace.
+    unsigned brace_counter = 0;
+    for (; begin != end; ++begin) {
+      if (*begin == '{') {
+        ++brace_counter;
+      } else if (*begin == '}') {
+        if (brace_counter == 0u) break;
+        --brace_counter;
+      }
+    }
+    return begin;
+  }
+
+  FMT_CONSTEXPR void on_error(const char*) {}
+};
+
+// Counts the number of parts in a format string.
+template <typename Char>
+FMT_CONSTEXPR unsigned count_parts(basic_string_view<Char> format_str) {
+  part_counter<Char> counter;
+  parse_format_string<true>(format_str, counter);
+  return counter.num_parts;
+}
+
+template <typename Char, typename PartHandler>
+class format_string_compiler : public error_handler {
+ private:
+  using part = format_part<Char>;
+
+  PartHandler handler_;
+  part part_;
+  basic_string_view<Char> format_str_;
+  basic_format_parse_context<Char> parse_context_;
+
+ public:
+  FMT_CONSTEXPR format_string_compiler(basic_string_view<Char> format_str,
+                                       PartHandler handler)
+      : handler_(handler),
+        format_str_(format_str),
+        parse_context_(format_str) {}
+
+  FMT_CONSTEXPR void on_text(const Char* begin, const Char* end) {
+    if (begin != end)
+      handler_(part::make_text({begin, to_unsigned(end - begin)}));
+  }
+
+  FMT_CONSTEXPR int on_arg_id() {
+    part_ = part::make_arg_index(parse_context_.next_arg_id());
+    return 0;
+  }
+
+  FMT_CONSTEXPR int on_arg_id(int id) {
+    parse_context_.check_arg_id(id);
+    part_ = part::make_arg_index(id);
+    return 0;
+  }
+
+  FMT_CONSTEXPR int on_arg_id(basic_string_view<Char> id) {
+    part_ = part::make_arg_name(id);
+    return 0;
+  }
+
+  FMT_CONSTEXPR void on_replacement_field(int, const Char* ptr) {
+    part_.arg_id_end = ptr;
+    handler_(part_);
+  }
+
+  FMT_CONSTEXPR const Char* on_format_specs(int, const Char* begin,
+                                            const Char* end) {
+    auto repl = typename part::replacement();
+    dynamic_specs_handler<basic_format_parse_context<Char>> handler(
+        repl.specs, parse_context_);
+    auto it = parse_format_specs(begin, end, handler);
+    if (*it != '}') on_error("missing '}' in format string");
+    repl.arg_id = part_.part_kind == part::kind::arg_index
+                      ? arg_ref<Char>(part_.val.arg_index)
+                      : arg_ref<Char>(part_.val.str);
+    auto part = part::make_replacement(repl);
+    part.arg_id_end = begin;
+    handler_(part);
+    return it;
+  }
+};
+
+// Compiles a format string and invokes handler(part) for each parsed part.
+template <bool IS_CONSTEXPR, typename Char, typename PartHandler>
+FMT_CONSTEXPR void compile_format_string(basic_string_view<Char> format_str,
+                                         PartHandler handler) {
+  parse_format_string<IS_CONSTEXPR>(
+      format_str,
+      format_string_compiler<Char, PartHandler>(format_str, handler));
+}
+
+template <typename OutputIt, typename Context, typename Id>
+void format_arg(
+    basic_format_parse_context<typename Context::char_type>& parse_ctx,
+    Context& ctx, Id arg_id) {
+  ctx.advance_to(visit_format_arg(
+      arg_formatter<OutputIt, typename Context::char_type>(ctx, &parse_ctx),
+      ctx.arg(arg_id)));
+}
+
+// vformat_to is defined in a subnamespace to prevent ADL.
+namespace cf {
+template <typename Context, typename OutputIt, typename CompiledFormat>
+auto vformat_to(OutputIt out, CompiledFormat& cf,
+                basic_format_args<Context> args) -> typename Context::iterator {
+  using char_type = typename Context::char_type;
+  basic_format_parse_context<char_type> parse_ctx(
+      to_string_view(cf.format_str_));
+  Context ctx(out, args);
+
+  const auto& parts = cf.parts();
+  for (auto part_it = std::begin(parts); part_it != std::end(parts);
+       ++part_it) {
+    const auto& part = *part_it;
+    const auto& value = part.val;
+
+    using format_part_t = format_part<char_type>;
+    switch (part.part_kind) {
+    case format_part_t::kind::text: {
+      const auto text = value.str;
+      auto output = ctx.out();
+      auto&& it = reserve(output, text.size());
+      it = std::copy_n(text.begin(), text.size(), it);
+      ctx.advance_to(output);
+      break;
+    }
+
+    case format_part_t::kind::arg_index:
+      advance_to(parse_ctx, part.arg_id_end);
+      detail::format_arg<OutputIt>(parse_ctx, ctx, value.arg_index);
+      break;
+
+    case format_part_t::kind::arg_name:
+      advance_to(parse_ctx, part.arg_id_end);
+      detail::format_arg<OutputIt>(parse_ctx, ctx, value.str);
+      break;
+
+    case format_part_t::kind::replacement: {
+      const auto& arg_id_value = value.repl.arg_id.val;
+      const auto arg = value.repl.arg_id.kind == arg_id_kind::index
+                           ? ctx.arg(arg_id_value.index)
+                           : ctx.arg(arg_id_value.name);
+
+      auto specs = value.repl.specs;
+
+      handle_dynamic_spec<width_checker>(specs.width, specs.width_ref, ctx);
+      handle_dynamic_spec<precision_checker>(specs.precision,
+                                             specs.precision_ref, ctx);
+
+      error_handler h;
+      numeric_specs_checker<error_handler> checker(h, arg.type());
+      if (specs.align == align::numeric) checker.require_numeric_argument();
+      if (specs.sign != sign::none) checker.check_sign();
+      if (specs.alt) checker.require_numeric_argument();
+      if (specs.precision >= 0) checker.check_precision();
+
+      advance_to(parse_ctx, part.arg_id_end);
+      ctx.advance_to(
+          visit_format_arg(arg_formatter<OutputIt, typename Context::char_type>(
+                               ctx, nullptr, &specs),
+                           arg));
+      break;
+    }
+    }
+  }
+  return ctx.out();
+}
+}  // namespace cf
+
+struct basic_compiled_format {};
+
+template <typename S, typename = void>
+struct compiled_format_base : basic_compiled_format {
+  using char_type = char_t<S>;
+  using parts_container = std::vector<detail::format_part<char_type>>;
+
+  parts_container compiled_parts;
+
+  explicit compiled_format_base(basic_string_view<char_type> format_str) {
+    compile_format_string<false>(format_str,
+                                 [this](const format_part<char_type>& part) {
+                                   compiled_parts.push_back(part);
+                                 });
+  }
+
+  const parts_container& parts() const { return compiled_parts; }
+};
+
+template <typename Char, unsigned N> struct format_part_array {
+  format_part<Char> data[N] = {};
+  FMT_CONSTEXPR format_part_array() = default;
+};
+
+template <typename Char, unsigned N>
+FMT_CONSTEXPR format_part_array<Char, N> compile_to_parts(
+    basic_string_view<Char> format_str) {
+  format_part_array<Char, N> parts;
+  unsigned counter = 0;
+  // This is not a lambda for compatibility with older compilers.
+  struct {
+    format_part<Char>* parts;
+    unsigned* counter;
+    FMT_CONSTEXPR void operator()(const format_part<Char>& part) {
+      parts[(*counter)++] = part;
+    }
+  } collector{parts.data, &counter};
+  compile_format_string<true>(format_str, collector);
+  if (counter < N) {
+    parts.data[counter] =
+        format_part<Char>::make_text(basic_string_view<Char>());
+  }
+  return parts;
+}
+
+template <typename T> constexpr const T& constexpr_max(const T& a, const T& b) {
+  return (a < b) ? b : a;
+}
+
+template <typename S>
+struct compiled_format_base<S, enable_if_t<is_compile_string<S>::value>>
+    : basic_compiled_format {
+  using char_type = char_t<S>;
+
+  FMT_CONSTEXPR explicit compiled_format_base(basic_string_view<char_type>) {}
+
+// Workaround for old compilers. Format string compilation will not be
+// performed there anyway.
+#if FMT_USE_CONSTEXPR
+  static FMT_CONSTEXPR_DECL const unsigned num_format_parts =
+      constexpr_max(count_parts(to_string_view(S())), 1u);
+#else
+  static const unsigned num_format_parts = 1;
+#endif
+
+  using parts_container = format_part<char_type>[num_format_parts];
+
+  const parts_container& parts() const {
+    static FMT_CONSTEXPR_DECL const auto compiled_parts =
+        compile_to_parts<char_type, num_format_parts>(
+            detail::to_string_view(S()));
+    return compiled_parts.data;
+  }
+};
+
+template <typename S, typename... Args>
+class compiled_format : private compiled_format_base<S> {
+ public:
+  using typename compiled_format_base<S>::char_type;
+
+ private:
+  basic_string_view<char_type> format_str_;
+
+  template <typename Context, typename OutputIt, typename CompiledFormat>
+  friend auto cf::vformat_to(OutputIt out, CompiledFormat& cf,
+                             basic_format_args<Context> args) ->
+      typename Context::iterator;
+
+ public:
+  compiled_format() = delete;
+  explicit constexpr compiled_format(basic_string_view<char_type> format_str)
+      : compiled_format_base<S>(format_str), format_str_(format_str) {}
+};
+
+#ifdef __cpp_if_constexpr
+template <typename... Args> struct type_list {};
+
+// Returns a reference to the argument at index N from [first, rest...].
+template <int N, typename T, typename... Args>
+constexpr const auto& get([[maybe_unused]] const T& first,
+                          [[maybe_unused]] const Args&... rest) {
+  static_assert(N < 1 + sizeof...(Args), "index is out of bounds");
+  if constexpr (N == 0)
+    return first;
+  else
+    return get<N - 1>(rest...);
+}
+
+template <int N, typename> struct get_type_impl;
+
+template <int N, typename... Args> struct get_type_impl<N, type_list<Args...>> {
+  using type = remove_cvref_t<decltype(get<N>(std::declval<Args>()...))>;
+};
+
+template <int N, typename T>
+using get_type = typename get_type_impl<N, T>::type;
+
+template <typename T> struct is_compiled_format : std::false_type {};
+
+template <typename Char> struct text {
+  basic_string_view<Char> data;
+  using char_type = Char;
+
+  template <typename OutputIt, typename... Args>
+  OutputIt format(OutputIt out, const Args&...) const {
+    return write<Char>(out, data);
+  }
+};
+
+template <typename Char>
+struct is_compiled_format<text<Char>> : std::true_type {};
+
+template <typename Char>
+constexpr text<Char> make_text(basic_string_view<Char> s, size_t pos,
+                               size_t size) {
+  return {{&s[pos], size}};
+}
+
+template <typename Char> struct code_unit {
+  Char value;
+  using char_type = Char;
+
+  template <typename OutputIt, typename... Args>
+  OutputIt format(OutputIt out, const Args&...) const {
+    return write<Char>(out, value);
+  }
+};
+
+template <typename Char>
+struct is_compiled_format<code_unit<Char>> : std::true_type {};
+
+// A replacement field that refers to argument N.
+template <typename Char, typename T, int N> struct field {
+  using char_type = Char;
+
+  template <typename OutputIt, typename... Args>
+  OutputIt format(OutputIt out, const Args&... args) const {
+    // This ensures that the argument type is convertible to `const T&`.
+    const T& arg = get<N>(args...);
+    return write<Char>(out, arg);
+  }
+};
+
+template <typename Char, typename T, int N>
+struct is_compiled_format<field<Char, T, N>> : std::true_type {};
+
+// A replacement field that refers to argument N and has format specifiers.
+template <typename Char, typename T, int N> struct spec_field {
+  using char_type = Char;
+  mutable formatter<T, Char> fmt;
+
+  template <typename OutputIt, typename... Args>
+  OutputIt format(OutputIt out, const Args&... args) const {
+    // This ensures that the argument type is convertible to `const T&`.
+    const T& arg = get<N>(args...);
+    const auto& vargs =
+        make_format_args<basic_format_context<OutputIt, Char>>(args...);
+    basic_format_context<OutputIt, Char> ctx(out, vargs);
+    return fmt.format(arg, ctx);
+  }
+};
+
+template <typename Char, typename T, int N>
+struct is_compiled_format<spec_field<Char, T, N>> : std::true_type {};
+
+template <typename L, typename R> struct concat {
+  L lhs;
+  R rhs;
+  using char_type = typename L::char_type;
+
+  template <typename OutputIt, typename... Args>
+  OutputIt format(OutputIt out, const Args&... args) const {
+    out = lhs.format(out, args...);
+    return rhs.format(out, args...);
+  }
+};
+
+template <typename L, typename R>
+struct is_compiled_format<concat<L, R>> : std::true_type {};
+
+template <typename L, typename R>
+constexpr concat<L, R> make_concat(L lhs, R rhs) {
+  return {lhs, rhs};
+}
+
+struct unknown_format {};
+
+template <typename Char>
+constexpr size_t parse_text(basic_string_view<Char> str, size_t pos) {
+  for (size_t size = str.size(); pos != size; ++pos) {
+    if (str[pos] == '{' || str[pos] == '}') break;
+  }
+  return pos;
+}
+
+template <typename Args, size_t POS, int ID, typename S>
+constexpr auto compile_format_string(S format_str);
+
+template <typename Args, size_t POS, int ID, typename T, typename S>
+constexpr auto parse_tail(T head, S format_str) {
+  if constexpr (POS !=
+                basic_string_view<typename S::char_type>(format_str).size()) {
+    constexpr auto tail = compile_format_string<Args, POS, ID>(format_str);
+    if constexpr (std::is_same<remove_cvref_t<decltype(tail)>,
+                               unknown_format>())
+      return tail;
+    else
+      return make_concat(head, tail);
+  } else {
+    return head;
+  }
+}
+
+template <typename T, typename Char> struct parse_specs_result {
+  formatter<T, Char> fmt;
+  size_t end;
+  int next_arg_id;
+};
+
+template <typename T, typename Char>
+constexpr parse_specs_result<T, Char> parse_specs(basic_string_view<Char> str,
+                                                  size_t pos, int arg_id) {
+  str.remove_prefix(pos);
+  auto ctx = basic_format_parse_context<Char>(str, {}, arg_id + 1);
+  auto f = formatter<T, Char>();
+  auto end = f.parse(ctx);
+  return {f, pos + (end - str.data()) + 1, ctx.next_arg_id()};
+}
+
+// Compiles a non-empty format string and returns the compiled representation
+// or unknown_format() on unrecognized input.
+template <typename Args, size_t POS, int ID, typename S>
+constexpr auto compile_format_string(S format_str) {
+  using char_type = typename S::char_type;
+  constexpr basic_string_view<char_type> str = format_str;
+  if constexpr (str[POS] == '{') {
+    if (POS + 1 == str.size())
+      throw format_error("unmatched '{' in format string");
+    if constexpr (str[POS + 1] == '{') {
+      return parse_tail<Args, POS + 2, ID>(make_text(str, POS, 1), format_str);
+    } else if constexpr (str[POS + 1] == '}') {
+      using type = get_type<ID, Args>;
+      return parse_tail<Args, POS + 2, ID + 1>(field<char_type, type, ID>(),
+                                               format_str);
+    } else if constexpr (str[POS + 1] == ':') {
+      using type = get_type<ID, Args>;
+      constexpr auto result = parse_specs<type>(str, POS + 2, ID);
+      return parse_tail<Args, result.end, result.next_arg_id>(
+          spec_field<char_type, type, ID>{result.fmt}, format_str);
+    } else {
+      return unknown_format();
+    }
+  } else if constexpr (str[POS] == '}') {
+    if (POS + 1 == str.size())
+      throw format_error("unmatched '}' in format string");
+    return parse_tail<Args, POS + 2, ID>(make_text(str, POS, 1), format_str);
+  } else {
+    constexpr auto end = parse_text(str, POS + 1);
+    if constexpr (end - POS > 1) {
+      return parse_tail<Args, end, ID>(make_text(str, POS, end - POS),
+                                       format_str);
+    } else {
+      return parse_tail<Args, end, ID>(code_unit<char_type>{str[POS]},
+                                       format_str);
+    }
+  }
+}
+
+template <typename... Args, typename S,
+          FMT_ENABLE_IF(is_compile_string<S>::value ||
+                        detail::is_compiled_string<S>::value)>
+constexpr auto compile(S format_str) {
+  constexpr basic_string_view<typename S::char_type> str = format_str;
+  if constexpr (str.size() == 0) {
+    return detail::make_text(str, 0, 0);
+  } else {
+    constexpr auto result =
+        detail::compile_format_string<detail::type_list<Args...>, 0, 0>(
+            format_str);
+    if constexpr (std::is_same<remove_cvref_t<decltype(result)>,
+                               detail::unknown_format>()) {
+      return detail::compiled_format<S, Args...>(to_string_view(format_str));
+    } else {
+      return result;
+    }
+  }
+}
+#else
+template <typename... Args, typename S,
+          FMT_ENABLE_IF(is_compile_string<S>::value)>
+constexpr auto compile(S format_str) -> detail::compiled_format<S, Args...> {
+  return detail::compiled_format<S, Args...>(to_string_view(format_str));
+}
+#endif  // __cpp_if_constexpr
+
+// Compiles the format string which must be a string literal.
+template <typename... Args, typename Char, size_t N>
+auto compile(const Char (&format_str)[N])
+    -> detail::compiled_format<const Char*, Args...> {
+  return detail::compiled_format<const Char*, Args...>(
+      basic_string_view<Char>(format_str, N - 1));
+}
+}  // namespace detail
+
+// DEPRECATED! Use FMT_COMPILE instead.
+template <typename... Args>
+FMT_DEPRECATED auto compile(const Args&... args)
+    -> decltype(detail::compile(args...)) {
+  return detail::compile(args...);
+}
+
+#if FMT_USE_CONSTEXPR
+#  ifdef __cpp_if_constexpr
+
+template <typename CompiledFormat, typename... Args,
+          typename Char = typename CompiledFormat::char_type,
+          FMT_ENABLE_IF(detail::is_compiled_format<CompiledFormat>::value)>
+FMT_INLINE std::basic_string<Char> format(const CompiledFormat& cf,
+                                          const Args&... args) {
+  basic_memory_buffer<Char> buffer;
+  cf.format(detail::buffer_appender<Char>(buffer), args...);
+  return to_string(buffer);
+}
+
+template <typename OutputIt, typename CompiledFormat, typename... Args,
+          FMT_ENABLE_IF(detail::is_compiled_format<CompiledFormat>::value)>
+OutputIt format_to(OutputIt out, const CompiledFormat& cf,
+                   const Args&... args) {
+  return cf.format(out, args...);
+}
+#  endif  // __cpp_if_constexpr
+#endif    // FMT_USE_CONSTEXPR
+
+template <typename CompiledFormat, typename... Args,
+          typename Char = typename CompiledFormat::char_type,
+          FMT_ENABLE_IF(std::is_base_of<detail::basic_compiled_format,
+                                        CompiledFormat>::value)>
+std::basic_string<Char> format(const CompiledFormat& cf, const Args&... args) {
+  basic_memory_buffer<Char> buffer;
+  using context = buffer_context<Char>;
+  detail::cf::vformat_to<context>(detail::buffer_appender<Char>(buffer), cf,
+                                  make_format_args<context>(args...));
+  return to_string(buffer);
+}
+
+template <typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_compiled_string<S>::value)>
+FMT_INLINE std::basic_string<typename S::char_type> format(const S&,
+                                                           Args&&... args) {
+#ifdef __cpp_if_constexpr
+  if constexpr (std::is_same<typename S::char_type, char>::value) {
+    constexpr basic_string_view<typename S::char_type> str = S();
+    if (str.size() == 2 && str[0] == '{' && str[1] == '}')
+      return fmt::to_string(detail::first(args...));
+  }
+#endif
+  constexpr auto compiled = detail::compile<Args...>(S());
+  return format(compiled, std::forward<Args>(args)...);
+}
+
+template <typename OutputIt, typename CompiledFormat, typename... Args,
+          FMT_ENABLE_IF(std::is_base_of<detail::basic_compiled_format,
+                                        CompiledFormat>::value)>
+OutputIt format_to(OutputIt out, const CompiledFormat& cf,
+                   const Args&... args) {
+  using char_type = typename CompiledFormat::char_type;
+  using context = format_context_t<OutputIt, char_type>;
+  return detail::cf::vformat_to<context>(out, cf,
+                                         make_format_args<context>(args...));
+}
+
+template <typename OutputIt, typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_compiled_string<S>::value)>
+OutputIt format_to(OutputIt out, const S&, const Args&... args) {
+  constexpr auto compiled = detail::compile<Args...>(S());
+  return format_to(out, compiled, args...);
+}
+
+template <
+    typename OutputIt, typename CompiledFormat, typename... Args,
+    FMT_ENABLE_IF(detail::is_output_iterator<OutputIt>::value&& std::is_base_of<
+                  detail::basic_compiled_format, CompiledFormat>::value)>
+format_to_n_result<OutputIt> format_to_n(OutputIt out, size_t n,
+                                         const CompiledFormat& cf,
+                                         const Args&... args) {
+  auto it =
+      format_to(detail::truncating_iterator<OutputIt>(out, n), cf, args...);
+  return {it.base(), it.count()};
+}
+
+template <typename OutputIt, typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_compiled_string<S>::value)>
+format_to_n_result<OutputIt> format_to_n(OutputIt out, size_t n, const S&,
+                                         const Args&... args) {
+  constexpr auto compiled = detail::compile<Args...>(S());
+  auto it = format_to(detail::truncating_iterator<OutputIt>(out, n), compiled,
+                      args...);
+  return {it.base(), it.count()};
+}
+
+template <typename CompiledFormat, typename... Args>
+size_t formatted_size(const CompiledFormat& cf, const Args&... args) {
+  return format_to(detail::counting_iterator(), cf, args...).count();
+}
+
+FMT_END_NAMESPACE
+
+#endif  // FMT_COMPILE_H_
--- a/src/3rdparty/fmt/core.h
+++ b/src/3rdparty/fmt/core.h
--- a/src/3rdparty/fmt/format-inl.h
+++ b/src/3rdparty/fmt/format-inl.h
--- a/src/3rdparty/fmt/format.cc
+++ b/src/3rdparty/fmt/format.cc
@@ -0,0 +1,69 @@
+// Formatting library for C++
+//
+// Copyright (c) 2012 - 2016, Victor Zverovich
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+#include "3rdparty/fmt/format-inl.h"
+
+FMT_BEGIN_NAMESPACE
+namespace detail {
+
+template <typename T>
+int format_float(char* buf, std::size_t size, const char* format, int precision,
+                 T value) {
+#ifdef FMT_FUZZ
+  if (precision > 100000)
+    throw std::runtime_error(
+        "fuzz mode - avoid large allocation inside snprintf");
+#endif
+  // Suppress the warning about nonliteral format string.
+  int (*snprintf_ptr)(char*, size_t, const char*, ...) = FMT_SNPRINTF;
+  return precision < 0 ? snprintf_ptr(buf, size, format, value)
+                       : snprintf_ptr(buf, size, format, precision, value);
+}
+}  // namespace detail
+
+template struct FMT_INSTANTIATION_DEF_API detail::basic_data<void>;
+
+// Work around a bug in MSVC 2013 that prevents instantiation of format_float.
+int (*instantiate_format_float)(double, int, detail::float_specs,
+                                detail::buffer<char>&) = detail::format_float;
+
+#ifndef FMT_STATIC_THOUSANDS_SEPARATOR
+template FMT_API detail::locale_ref::locale_ref(const std::locale& loc);
+template FMT_API std::locale detail::locale_ref::get<std::locale>() const;
+#endif
+
+// Explicit instantiations for char.
+
+template FMT_API std::string detail::grouping_impl<char>(locale_ref);
+template FMT_API char detail::thousands_sep_impl(locale_ref);
+template FMT_API char detail::decimal_point_impl(locale_ref);
+
+template FMT_API void detail::buffer<char>::append(const char*, const char*);
+
+template FMT_API FMT_BUFFER_CONTEXT(char)::iterator detail::vformat_to(
+    detail::buffer<char>&, string_view,
+    basic_format_args<FMT_BUFFER_CONTEXT(char)>);
+
+template FMT_API int detail::snprintf_float(double, int, detail::float_specs,
+                                            detail::buffer<char>&);
+template FMT_API int detail::snprintf_float(long double, int,
+                                            detail::float_specs,
+                                            detail::buffer<char>&);
+template FMT_API int detail::format_float(double, int, detail::float_specs,
+                                          detail::buffer<char>&);
+template FMT_API int detail::format_float(long double, int, detail::float_specs,
+                                          detail::buffer<char>&);
+
+// Explicit instantiations for wchar_t.
+
+template FMT_API std::string detail::grouping_impl<wchar_t>(locale_ref);
+template FMT_API wchar_t detail::thousands_sep_impl(locale_ref);
+template FMT_API wchar_t detail::decimal_point_impl(locale_ref);
+
+template FMT_API void detail::buffer<wchar_t>::append(const wchar_t*,
+                                                      const wchar_t*);
+FMT_END_NAMESPACE
--- a/src/3rdparty/fmt/format.h
+++ b/src/3rdparty/fmt/format.h
--- a/src/3rdparty/fmt/locale.h
+++ b/src/3rdparty/fmt/locale.h
@@ -0,0 +1,78 @@
+// Formatting library for C++ - std::locale support
+//
+// Copyright (c) 2012 - present, Victor Zverovich
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+#ifndef FMT_LOCALE_H_
+#define FMT_LOCALE_H_
+
+#include <locale>
+
+#include "format.h"
+
+FMT_BEGIN_NAMESPACE
+
+namespace detail {
+template <typename Char>
+typename buffer_context<Char>::iterator vformat_to(
+    const std::locale& loc, buffer<Char>& buf,
+    basic_string_view<Char> format_str,
+    basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  using af = arg_formatter<typename buffer_context<Char>::iterator, Char>;
+  return vformat_to<af>(buffer_appender<Char>(buf), to_string_view(format_str),
+                        args, detail::locale_ref(loc));
+}
+
+template <typename Char>
+std::basic_string<Char> vformat(
+    const std::locale& loc, basic_string_view<Char> format_str,
+    basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  basic_memory_buffer<Char> buffer;
+  detail::vformat_to(loc, buffer, format_str, args);
+  return fmt::to_string(buffer);
+}
+}  // namespace detail
+
+template <typename S, typename Char = char_t<S>>
+inline std::basic_string<Char> vformat(
+    const std::locale& loc, const S& format_str,
+    basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  return detail::vformat(loc, to_string_view(format_str), args);
+}
+
+template <typename S, typename... Args, typename Char = char_t<S>>
+inline std::basic_string<Char> format(const std::locale& loc,
+                                      const S& format_str, Args&&... args) {
+  return detail::vformat(
+      loc, to_string_view(format_str),
+      fmt::make_args_checked<Args...>(format_str, args...));
+}
+
+template <typename S, typename OutputIt, typename... Args,
+          typename Char = enable_if_t<
+              detail::is_output_iterator<OutputIt>::value, char_t<S>>>
+inline OutputIt vformat_to(
+    OutputIt out, const std::locale& loc, const S& format_str,
+    basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  decltype(detail::get_buffer<Char>(out)) buf(detail::get_buffer_init(out));
+  using af =
+    detail::arg_formatter<typename buffer_context<Char>::iterator, Char>;
+  vformat_to<af>(detail::buffer_appender<Char>(buf), to_string_view(format_str),
+                 args, detail::locale_ref(loc));
+  return detail::get_iterator(buf);
+}
+
+template <typename OutputIt, typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_output_iterator<OutputIt>::value&&
+                            detail::is_string<S>::value)>
+inline OutputIt format_to(OutputIt out, const std::locale& loc,
+                          const S& format_str, Args&&... args) {
+  const auto& vargs = fmt::make_args_checked<Args...>(format_str, args...);
+  return vformat_to(out, loc, to_string_view(format_str), vargs);
+}
+
+FMT_END_NAMESPACE
+
+#endif  // FMT_LOCALE_H_
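Note on `locale.h`: the overloads above forward a `std::locale` into the formatter as a `detail::locale_ref`. The separator and grouping lookups instantiated in `format.cc` (`thousands_sep_impl`, `grouping_impl`) appear to rely on the standard `std::numpunct` facet. The sketch below uses only the standard library, not fmt, to show that facet mechanism. The names `comma_grouping`, `make_grouped_locale`, `thousands_sep_of` and `group_digits` are hypothetical and exist only for this illustration.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: separate digits with ',' in groups of three.
struct comma_grouping : std::numpunct<char> {
  char do_thousands_sep() const override { return ','; }
  std::string do_grouping() const override { return "\3"; }
};

// Builds a locale that is the classic "C" locale plus the facet above.
// The locale takes ownership of the facet and deletes it when done.
inline std::locale make_grouped_locale() {
  return std::locale(std::locale::classic(), new comma_grouping);
}

// Queries the separator the way a formatting library could: via use_facet.
inline char thousands_sep_of(const std::locale& loc) {
  return std::use_facet<std::numpunct<char>>(loc).thousands_sep();
}

// Formats an integer through a stream imbued with the grouped locale.
inline std::string group_digits(long value) {
  std::ostringstream os;
  os.imbue(make_grouped_locale());
  os << value;
  return os.str();
}
```

Under these assumptions, `group_digits(1234567)` yields `1,234,567`, while the same value formatted under the classic locale carries no separators.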
--- a/src/3rdparty/fmt/os.cc
+++ b/src/3rdparty/fmt/os.cc
@@ -0,0 +1,322 @@
+// Formatting library for C++ - optional OS-specific functionality
+//
+// Copyright (c) 2012 - 2016, Victor Zverovich
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+// Disable bogus MSVC warnings.
+#if !defined(_CRT_SECURE_NO_WARNINGS) && defined(_MSC_VER)
+#  define _CRT_SECURE_NO_WARNINGS
+#endif
+
+#include "fmt/os.h"
+
+#include <climits>
+
+#if FMT_USE_FCNTL
+#  include <sys/stat.h>
+#  include <sys/types.h>
+
+#  ifndef _WIN32
+#    include <unistd.h>
+#  else
+#    ifndef WIN32_LEAN_AND_MEAN
+#      define WIN32_LEAN_AND_MEAN
+#    endif
+#    include <io.h>
+#    include <windows.h>
+
+#    define O_CREAT _O_CREAT
+#    define O_TRUNC _O_TRUNC
+
+#    ifndef S_IRUSR
+#      define S_IRUSR _S_IREAD
+#    endif
+
+#    ifndef S_IWUSR
+#      define S_IWUSR _S_IWRITE
+#    endif
+
+#    ifdef __MINGW32__
+#      define _SH_DENYNO 0x40
+#    endif
+#  endif  // _WIN32
+#endif    // FMT_USE_FCNTL
+
+#ifdef _WIN32
+#  include <windows.h>
+#endif
+
+#ifdef fileno
+#  undef fileno
+#endif
+
+namespace {
+#ifdef _WIN32
+// Return type of read and write functions.
+using RWResult = int;
+
+// On Windows the count argument to read and write is unsigned, so convert
+// it from size_t, clamping to UINT_MAX to prevent integer overflow.
+inline unsigned convert_rwcount(std::size_t count) {
+  return count <= UINT_MAX ? static_cast<unsigned>(count) : UINT_MAX;
+}
+#else
+// Return type of read and write functions.
+using RWResult = ssize_t;
+
+inline std::size_t convert_rwcount(std::size_t count) { return count; }
+#endif
+}  // namespace
+
+FMT_BEGIN_NAMESPACE
+
+#ifdef _WIN32
+detail::utf16_to_utf8::utf16_to_utf8(wstring_view s) {
+  if (int error_code = convert(s)) {
+    FMT_THROW(windows_error(error_code,
+                            "cannot convert string from UTF-16 to UTF-8"));
+  }
+}
+
+int detail::utf16_to_utf8::convert(wstring_view s) {
+  if (s.size() > INT_MAX) return ERROR_INVALID_PARAMETER;
+  int s_size = static_cast<int>(s.size());
+  if (s_size == 0) {
+    // WideCharToMultiByte does not support zero length, handle separately.
+    buffer_.resize(1);
+    buffer_[0] = 0;
+    return 0;
+  }
+
+  int length = WideCharToMultiByte(CP_UTF8, 0, s.data(), s_size, nullptr, 0,
+                                   nullptr, nullptr);
+  if (length == 0) return GetLastError();
+  buffer_.resize(length + 1);
+  length = WideCharToMultiByte(CP_UTF8, 0, s.data(), s_size, &buffer_[0],
+                               length, nullptr, nullptr);
+  if (length == 0) return GetLastError();
+  buffer_[length] = 0;
+  return 0;
+}
+
+void windows_error::init(int err_code, string_view format_str,
+                         format_args args) {
+  error_code_ = err_code;
+  memory_buffer buffer;
+  detail::format_windows_error(buffer, err_code, vformat(format_str, args));
+  std::runtime_error& base = *this;
+  base = std::runtime_error(to_string(buffer));
+}
+
+void detail::format_windows_error(detail::buffer<char>& out, int error_code,
+                                  string_view message) FMT_NOEXCEPT {
+  FMT_TRY {
+    wmemory_buffer buf;
+    buf.resize(inline_buffer_size);
+    for (;;) {
+      wchar_t* system_message = &buf[0];
+      int result = FormatMessageW(
+          FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS, nullptr,
+          error_code, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), system_message,
+          static_cast<uint32_t>(buf.size()), nullptr);
+      if (result != 0) {
+        utf16_to_utf8 utf8_message;
+        if (utf8_message.convert(system_message) == ERROR_SUCCESS) {
+          format_to(buffer_appender<char>(out), "{}: {}", message,
+                    utf8_message);
+          return;
+        }
+        break;
+      }
+      if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
+        break;  // Can't get error message, report error code instead.
+      buf.resize(buf.size() * 2);
+    }
+  }
+  FMT_CATCH(...) {}
+  format_error_code(out, error_code, message);
+}
+
+void report_windows_error(int error_code,
+                          fmt::string_view message) FMT_NOEXCEPT {
+  report_error(detail::format_windows_error, error_code, message);
+}
+#endif  // _WIN32
+
+buffered_file::~buffered_file() FMT_NOEXCEPT {
+  if (file_ && FMT_SYSTEM(fclose(file_)) != 0)
+    report_system_error(errno, "cannot close file");
+}
+
+buffered_file::buffered_file(cstring_view filename, cstring_view mode) {
+  FMT_RETRY_VAL(file_, FMT_SYSTEM(fopen(filename.c_str(), mode.c_str())),
+                nullptr);
+  if (!file_)
+    FMT_THROW(system_error(errno, "cannot open file {}", filename.c_str()));
+}
+
+void buffered_file::close() {
+  if (!file_) return;
+  int result = FMT_SYSTEM(fclose(file_));
+  file_ = nullptr;
+  if (result != 0) FMT_THROW(system_error(errno, "cannot close file"));
+}
+
+// A macro used to prevent expansion of fileno on broken versions of MinGW.
+#define FMT_ARGS
+
+int buffered_file::fileno() const {
+  int fd = FMT_POSIX_CALL(fileno FMT_ARGS(file_));
+  if (fd == -1) FMT_THROW(system_error(errno, "cannot get file descriptor"));
+  return fd;
+}
+
+#if FMT_USE_FCNTL
+file::file(cstring_view path, int oflag) {
+  int mode = S_IRUSR | S_IWUSR;
+#  if defined(_WIN32) && !defined(__MINGW32__)
+  fd_ = -1;
+  FMT_POSIX_CALL(sopen_s(&fd_, path.c_str(), oflag, _SH_DENYNO, mode));
+#  else
+  FMT_RETRY(fd_, FMT_POSIX_CALL(open(path.c_str(), oflag, mode)));
+#  endif
+  if (fd_ == -1)
+    FMT_THROW(system_error(errno, "cannot open file {}", path.c_str()));
+}
+
+file::~file() FMT_NOEXCEPT {
+  // Don't retry close in case of EINTR!
+  // See http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-09/3000.html
+  if (fd_ != -1 && FMT_POSIX_CALL(close(fd_)) != 0)
+    report_system_error(errno, "cannot close file");
+}
+
+void file::close() {
+  if (fd_ == -1) return;
+  // Don't retry close in case of EINTR!
+  // See http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-09/3000.html
+  int result = FMT_POSIX_CALL(close(fd_));
+  fd_ = -1;
+  if (result != 0) FMT_THROW(system_error(errno, "cannot close file"));
+}
+
+long long file::size() const {
+#  ifdef _WIN32
+  // Use GetFileSize instead of GetFileSizeEx for the case when _WIN32_WINNT
+  // is less than 0x0500 as is the case with some default MinGW builds.
+  // Both functions support large file sizes.
+  DWORD size_upper = 0;
+  HANDLE handle = reinterpret_cast<HANDLE>(_get_osfhandle(fd_));
+  DWORD size_lower = FMT_SYSTEM(GetFileSize(handle, &size_upper));
+  if (size_lower == INVALID_FILE_SIZE) {
+    DWORD error = GetLastError();
+    if (error != NO_ERROR)
+      FMT_THROW(windows_error(error, "cannot get file size"));
+  }
+  unsigned long long long_size = size_upper;
+  return (long_size << sizeof(DWORD) * CHAR_BIT) | size_lower;
+#  else
+  using Stat = struct stat;
+  Stat file_stat = Stat();
+  if (FMT_POSIX_CALL(fstat(fd_, &file_stat)) == -1)
+    FMT_THROW(system_error(errno, "cannot get file attributes"));
+  static_assert(sizeof(long long) >= sizeof(file_stat.st_size),
+                "return type of file::size is not large enough");
+  return file_stat.st_size;
+#  endif
+}
+
+std::size_t file::read(void* buffer, std::size_t count) {
+  RWResult result = 0;
+  FMT_RETRY(result, FMT_POSIX_CALL(read(fd_, buffer, convert_rwcount(count))));
+  if (result < 0) FMT_THROW(system_error(errno, "cannot read from file"));
+  return detail::to_unsigned(result);
+}
+
+std::size_t file::write(const void* buffer, std::size_t count) {
+  RWResult result = 0;
+  FMT_RETRY(result, FMT_POSIX_CALL(write(fd_, buffer, convert_rwcount(count))));
+  if (result < 0) FMT_THROW(system_error(errno, "cannot write to file"));
+  return detail::to_unsigned(result);
+}
+
+file file::dup(int fd) {
+  // Don't retry as dup doesn't return EINTR.
+  // http://pubs.opengroup.org/onlinepubs/009695399/functions/dup.html
+  int new_fd = FMT_POSIX_CALL(dup(fd));
+  if (new_fd == -1)
+    FMT_THROW(system_error(errno, "cannot duplicate file descriptor {}", fd));
+  return file(new_fd);
+}
+
+void file::dup2(int fd) {
+  int result = 0;
+  FMT_RETRY(result, FMT_POSIX_CALL(dup2(fd_, fd)));
+  if (result == -1) {
+    FMT_THROW(system_error(errno, "cannot duplicate file descriptor {} to {}",
+                           fd_, fd));
+  }
+}
+
+void file::dup2(int fd, error_code& ec) FMT_NOEXCEPT {
+  int result = 0;
+  FMT_RETRY(result, FMT_POSIX_CALL(dup2(fd_, fd)));
+  if (result == -1) ec = error_code(errno);
+}
+
+void file::pipe(file& read_end, file& write_end) {
+  // Close the descriptors first to make sure that assignments don't throw
+  // and there are no leaks.
+  read_end.close();
+  write_end.close();
+  int fds[2] = {};
+#  ifdef _WIN32
+  // Make the default pipe capacity same as on Linux 2.6.11+.
+  enum { DEFAULT_CAPACITY = 65536 };
+  int result = FMT_POSIX_CALL(pipe(fds, DEFAULT_CAPACITY, _O_BINARY));
+#  else
+  // Don't retry as the pipe function doesn't return EINTR.
+  // http://pubs.opengroup.org/onlinepubs/009696799/functions/pipe.html
+  int result = FMT_POSIX_CALL(pipe(fds));
+#  endif
+  if (result != 0) FMT_THROW(system_error(errno, "cannot create pipe"));
+  // The following assignments don't throw because read_end and write_end
+  // are closed.
+  read_end = file(fds[0]);
+  write_end = file(fds[1]);
+}
+
+buffered_file file::fdopen(const char* mode) {
+// Don't retry as fdopen doesn't return EINTR.
+#  if defined(__MINGW32__) && defined(_POSIX_)
+  FILE* f = ::fdopen(fd_, mode);
+#  else
+  FILE* f = FMT_POSIX_CALL(fdopen(fd_, mode));
+#  endif
+  if (!f)
+    FMT_THROW(
+        system_error(errno, "cannot associate stream with file descriptor"));
+  buffered_file bf(f);
+  fd_ = -1;
+  return bf;
+}
+
+long getpagesize() {
+#  ifdef _WIN32
+  SYSTEM_INFO si;
+  GetSystemInfo(&si);
+  return si.dwPageSize;
+#  else
+  long size = FMT_POSIX_CALL(sysconf(_SC_PAGESIZE));
+  if (size < 0) FMT_THROW(system_error(errno, "cannot get memory page size"));
+  return size;
+#  endif
+}
+
+void ostream::grow(size_t) {
+  if (this->size() == this->capacity()) flush();
+}
+#endif  // FMT_USE_FCNTL
+FMT_END_NAMESPACE
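Note on `os.cc`: two small pieces of integer arithmetic in this file are easy to misread. The Windows branch of `convert_rwcount` clamps a `size_t` count to `UINT_MAX`, and the Windows branch of `file::size` joins the two 32-bit halves reported by `GetFileSize` into one 64-bit value. The sketch below restates both as standalone `constexpr` functions. The names `clamp_rwcount` and `join_size` are hypothetical, and the shift width is fixed at 32 on the assumption that a `DWORD` is 32 bits.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the Windows convert_rwcount: a count larger
// than UINT_MAX is clamped rather than truncated.
constexpr unsigned clamp_rwcount(std::size_t count) {
  return count <= UINT_MAX ? static_cast<unsigned>(count) : UINT_MAX;
}

// Hypothetical helper mirroring the last step of file::size on Windows:
// the upper half is widened before shifting so no bits are lost.
constexpr long long join_size(std::uint32_t upper, std::uint32_t lower) {
  return static_cast<long long>(
      (static_cast<unsigned long long>(upper) << 32) | lower);
}
```

Because of the clamp, a single `read` or `write` call may transfer fewer bytes than requested, which is why callers are expected to check the returned count.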
--- a/src/3rdparty/fmt/os.h
+++ b/src/3rdparty/fmt/os.h
@@ -0,0 +1,480 @@
+// Formatting library for C++ - optional OS-specific functionality
+//
+// Copyright (c) 2012 - present, Victor Zverovich
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+#ifndef FMT_OS_H_
+#define FMT_OS_H_
+
+#if defined(__MINGW32__) || defined(__CYGWIN__)
+// Work around MinGW bug https://sourceforge.net/p/mingw/bugs/2024/.
+#  undef __STRICT_ANSI__
+#endif
+
+#include <cerrno>
+#include <clocale>  // for locale_t
+#include <cstddef>
+#include <cstdio>
+#include <cstdlib>  // for strtod_l
+
+#if defined __APPLE__ || defined(__FreeBSD__)
+#  include <xlocale.h>  // for LC_NUMERIC_MASK on OS X
+#endif
+
+#include "format.h"
+
+// UWP doesn't provide _pipe.
+#if FMT_HAS_INCLUDE("winapifamily.h")
+#  include <winapifamily.h>
+#endif
+#if (FMT_HAS_INCLUDE(<fcntl.h>) || defined(__APPLE__) || \
+     defined(__linux__)) &&                              \
+    (!defined(WINAPI_FAMILY) || (WINAPI_FAMILY == WINAPI_FAMILY_DESKTOP_APP))
+#  include <fcntl.h>  // for O_RDONLY
+#  define FMT_USE_FCNTL 1
+#else
+#  define FMT_USE_FCNTL 0
+#endif
+
+#ifndef FMT_POSIX
+#  if defined(_WIN32) && !defined(__MINGW32__)
+// Fix warnings about deprecated symbols.
+#    define FMT_POSIX(call) _##call
+#  else
+#    define FMT_POSIX(call) call
+#  endif
+#endif
+
+// Calls to system functions are wrapped in FMT_SYSTEM for testability.
+#ifdef FMT_SYSTEM
+#  define FMT_POSIX_CALL(call) FMT_SYSTEM(call)
+#else
+#  define FMT_SYSTEM(call) ::call
+#  ifdef _WIN32
+// Fix warnings about deprecated symbols.
+#    define FMT_POSIX_CALL(call) ::_##call
+#  else
+#    define FMT_POSIX_CALL(call) ::call
+#  endif
+#endif
+
+// Retries the expression while it evaluates to error_result and errno
+// equals EINTR.
+#ifndef _WIN32
+#  define FMT_RETRY_VAL(result, expression, error_result) \
+    do {                                                  \
+      (result) = (expression);                            \
+    } while ((result) == (error_result) && errno == EINTR)
+#else
+#  define FMT_RETRY_VAL(result, expression, error_result) result = (expression)
+#endif
+
+#define FMT_RETRY(result, expression) FMT_RETRY_VAL(result, expression, -1)
+
+FMT_BEGIN_NAMESPACE
+
+/**
+  \rst
+  A reference to a null-terminated string. It can be constructed from a C
+  string or ``std::string``.
+
+  You can use one of the following type aliases for common character types:
+
+  +---------------+-----------------------------+
+  | Type          | Definition                  |
+  +===============+=============================+
+  | cstring_view  | basic_cstring_view<char>    |
+  +---------------+-----------------------------+
+  | wcstring_view | basic_cstring_view<wchar_t> |
+  +---------------+-----------------------------+
+
+  This class is most useful as a parameter type to allow passing
+  different types of strings to a function, for example::
+
+    template <typename... Args>
+    std::string format(cstring_view format_str, const Args & ... args);
+
+    format("{}", 42);
+    format(std::string("{}"), 42);
+  \endrst
+ */
+template <typename Char> class basic_cstring_view {
+ private:
+  const Char* data_;
+
+ public:
+  /** Constructs a string reference object from a C string. */
+  basic_cstring_view(const Char* s) : data_(s) {}
+
+  /**
+    \rst
+    Constructs a string reference from an ``std::string`` object.
+    \endrst
+   */
+  basic_cstring_view(const std::basic_string<Char>& s) : data_(s.c_str()) {}
+
+  /** Returns the pointer to a C string. */
+  const Char* c_str() const { return data_; }
+};
+
+using cstring_view = basic_cstring_view<char>;
+using wcstring_view = basic_cstring_view<wchar_t>;
+
+// An error code.
+class error_code {
+ private:
+  int value_;
+
+ public:
+  explicit error_code(int value = 0) FMT_NOEXCEPT : value_(value) {}
+
+  int get() const FMT_NOEXCEPT { return value_; }
+};
+
+#ifdef _WIN32
+namespace detail {
+// A converter from UTF-16 to UTF-8.
+// It is only provided for Windows since other systems support UTF-8 natively.
+class utf16_to_utf8 {
+ private:
+  memory_buffer buffer_;
+
+ public:
+  utf16_to_utf8() {}
+  FMT_API explicit utf16_to_utf8(wstring_view s);
+  operator string_view() const { return string_view(&buffer_[0], size()); }
+  size_t size() const { return buffer_.size() - 1; }
+  const char* c_str() const { return &buffer_[0]; }
+  std::string str() const { return std::string(&buffer_[0], size()); }
+
+  // Performs conversion returning a system error code instead of
+  // throwing an exception on conversion error. This method may still throw
+  // in case of memory allocation error.
+  FMT_API int convert(wstring_view s);
+};
+
+FMT_API void format_windows_error(buffer<char>& out, int error_code,
+                                  string_view message) FMT_NOEXCEPT;
+}  // namespace detail
+
+/** A Windows error. */
+class windows_error : public system_error {
+ private:
+  FMT_API void init(int error_code, string_view format_str, format_args args);
+
+ public:
+  /**
+   \rst
+   Constructs a :class:`fmt::windows_error` object with the description
+   of the form
+
+   .. parsed-literal::
+     *<message>*: *<system-message>*
+
+   where *<message>* is the formatted message and *<system-message>* is the
+   system message corresponding to the error code.
+   *error_code* is a Windows error code as given by ``GetLastError``.
+   If *error_code* is not a valid error code such as -1, the system message
+   will look like "error -1".
+
+   **Example**::
+
+     // This throws a windows_error with the description
+     //   cannot open file 'madeup': The system cannot find the file specified.
+     // or similar (system message may vary).
+     const char *filename = "madeup";
+     LPOFSTRUCT of = LPOFSTRUCT();
+     HFILE file = OpenFile(filename, &of, OF_READ);
+     if (file == HFILE_ERROR) {
+       throw fmt::windows_error(GetLastError(),
+                                "cannot open file '{}'", filename);
+     }
+   \endrst
+  */
+  template <typename... Args>
+  windows_error(int error_code, string_view message, const Args&... args) {
+    init(error_code, message, make_format_args(args...));
+  }
+};
+
+// Reports a Windows error without throwing an exception.
+// Can be used to report errors from destructors.
+FMT_API void report_windows_error(int error_code,
+                                  string_view message) FMT_NOEXCEPT;
+#endif  // _WIN32
+
+// A buffered file.
+class buffered_file {
+ private:
+  FILE* file_;
+
+  friend class file;
+
+  explicit buffered_file(FILE* f) : file_(f) {}
+
+ public:
+  buffered_file(const buffered_file&) = delete;
+  void operator=(const buffered_file&) = delete;
+
+  // Constructs a buffered_file object which doesn't represent any file.
+  buffered_file() FMT_NOEXCEPT : file_(nullptr) {}
+
+  // Destroys the object closing the file it represents if any.
+  FMT_API ~buffered_file() FMT_NOEXCEPT;
+
+ public:
+  buffered_file(buffered_file&& other) FMT_NOEXCEPT : file_(other.file_) {
+    other.file_ = nullptr;
+  }
+
+  buffered_file& operator=(buffered_file&& other) {
+    close();
+    file_ = other.file_;
+    other.file_ = nullptr;
+    return *this;
+  }
+
+  // Opens a file.
+  FMT_API buffered_file(cstring_view filename, cstring_view mode);
+
+  // Closes the file.
+  FMT_API void close();
+
+  // Returns the pointer to a FILE object representing this file.
+  FILE* get() const FMT_NOEXCEPT { return file_; }
+
+  // We place parentheses around fileno to work around a bug in some versions
+  // of MinGW that define fileno as a macro.
+  FMT_API int(fileno)() const;
+
+  void vprint(string_view format_str, format_args args) {
+    fmt::vprint(file_, format_str, args);
+  }
+
+  template <typename... Args>
+  inline void print(string_view format_str, const Args&... args) {
+    vprint(format_str, make_format_args(args...));
+  }
+};
+
+#if FMT_USE_FCNTL
+// A file. A closed file is represented by a file object with descriptor -1.
+// Methods that are not declared with FMT_NOEXCEPT may throw
+// fmt::system_error in case of failure. Note that some errors such as
+// closing the file multiple times will cause a crash on Windows rather
+// than an exception. You can get standard behavior by overriding the
+// invalid parameter handler with _set_invalid_parameter_handler.
+class file {
+ private:
+  int fd_;  // File descriptor.
+
+  // Constructs a file object with a given descriptor.
+  explicit file(int fd) : fd_(fd) {}
+
+ public:
+  // Possible values for the oflag argument to the constructor.
+  enum {
+    RDONLY = FMT_POSIX(O_RDONLY),  // Open for reading only.
+    WRONLY = FMT_POSIX(O_WRONLY),  // Open for writing only.
+    RDWR = FMT_POSIX(O_RDWR),      // Open for reading and writing.
+    CREATE = FMT_POSIX(O_CREAT),   // Create if the file doesn't exist.
+    APPEND = FMT_POSIX(O_APPEND)   // Open in append mode.
+  };
+
+  // Constructs a file object which doesn't represent any file.
+  file() FMT_NOEXCEPT : fd_(-1) {}
+
+  // Opens a file and constructs a file object representing this file.
+  FMT_API file(cstring_view path, int oflag);
+
+ public:
+  file(const file&) = delete;
+  void operator=(const file&) = delete;
+
+  file(file&& other) FMT_NOEXCEPT : fd_(other.fd_) { other.fd_ = -1; }
+
+  file& operator=(file&& other) FMT_NOEXCEPT {
+    close();
+    fd_ = other.fd_;
+    other.fd_ = -1;
+    return *this;
+  }
+
+  // Destroys the object closing the file it represents if any.
+  FMT_API ~file() FMT_NOEXCEPT;
+
+  // Returns the file descriptor.
+  int descriptor() const FMT_NOEXCEPT { return fd_; }
+
+  // Closes the file.
+  FMT_API void close();
+
+  // Returns the file size. The size has signed type for consistency with
+  // stat::st_size.
+  FMT_API long long size() const;
+
+  // Attempts to read count bytes from the file into the specified buffer.
+  FMT_API size_t read(void* buffer, size_t count);
+
+  // Attempts to write count bytes from the specified buffer to the file.
+  FMT_API size_t write(const void* buffer, size_t count);
+
+  // Duplicates a file descriptor with the dup function and returns
+  // the duplicate as a file object.
+  FMT_API static file dup(int fd);
+
+  // Makes fd be the copy of this file descriptor, closing fd first if
+  // necessary.
+  FMT_API void dup2(int fd);
+
+  // Makes fd be the copy of this file descriptor, closing fd first if
+  // necessary.
+  FMT_API void dup2(int fd, error_code& ec) FMT_NOEXCEPT;
+
+  // Creates a pipe setting up read_end and write_end file objects for reading
+  // and writing respectively.
+  FMT_API static void pipe(file& read_end, file& write_end);
+
+  // Creates a buffered_file object associated with this file and detaches
+  // this file object from the file.
+  FMT_API buffered_file fdopen(const char* mode);
+};
+
+// Returns the memory page size.
+long getpagesize();
+
+namespace detail {
+
+struct buffer_size {
+  size_t value = 0;
+  buffer_size operator=(size_t val) const {
+    auto bs = buffer_size();
+    bs.value = val;
+    return bs;
+  }
+};
+
+struct ostream_params {
+  int oflag = file::WRONLY | file::CREATE;
+  size_t buffer_size = BUFSIZ > 32768 ? BUFSIZ : 32768;
+
+  ostream_params() {}
+
+  template <typename... T>
+  ostream_params(T... params, int oflag) : ostream_params(params...) {
+    this->oflag = oflag;
+  }
+
+  template <typename... T>
+  ostream_params(T... params, detail::buffer_size bs)
+      : ostream_params(params...) {
+    this->buffer_size = bs.value;
+  }
+};
+}  // namespace detail
+
+static constexpr detail::buffer_size buffer_size;
+
+// A fast output stream which is not thread-safe.
+class ostream : private detail::buffer<char> {
+ private:
+  file file_;
+
+  void flush() {
+    if (size() == 0) return;
+    file_.write(data(), size());
+    clear();
+  }
+
+  void grow(size_t) final;
+
+  ostream(cstring_view path, const detail::ostream_params& params)
+      : file_(path, params.oflag) {
+    set(new char[params.buffer_size], params.buffer_size);
+  }
+
+ public:
+  ostream(ostream&& other)
+      : detail::buffer<char>(other.data(), other.size(), other.capacity()),
+        file_(std::move(other.file_)) {
+    other.set(nullptr, 0);
+  }
+  ~ostream() {
+    flush();
+    delete[] data();
+  }
+
+  template <typename... T>
+  friend ostream output_file(cstring_view path, T... params);
+
+  void close() {
+    flush();
+    file_.close();
+  }
+
+  template <typename S, typename... Args>
+  void print(const S& format_str, const Args&... args) {
+    format_to(detail::buffer_appender<char>(*this), format_str, args...);
+  }
+};
+
+/**
+  Opens a file for writing. Supported parameters passed in `params`:
+  * ``<integer>``: Output flags (``file::WRONLY | file::CREATE`` by default)
+  * ``buffer_size=<integer>``: Output buffer size
+ */
+template <typename... T>
+inline ostream output_file(cstring_view path, T... params) {
+  return {path, detail::ostream_params(params...)};
+}
+#endif  // FMT_USE_FCNTL
+
+#ifdef FMT_LOCALE
+// A "C" numeric locale.
+class locale {
+ private:
+#  ifdef _WIN32
+  using locale_t = _locale_t;
+
+  static void freelocale(locale_t loc) { _free_locale(loc); }
+
+  static double strtod_l(const char* nptr, char** endptr, _locale_t loc) {
+    return _strtod_l(nptr, endptr, loc);
+  }
+#  endif
+
+  locale_t locale_;
+
+ public:
+  using type = locale_t;
+  locale(const locale&) = delete;
+  void operator=(const locale&) = delete;
+
+  locale() {
+#  ifndef _WIN32
+    locale_ = FMT_SYSTEM(newlocale(LC_NUMERIC_MASK, "C", nullptr));
+#  else
+    locale_ = _create_locale(LC_NUMERIC, "C");
+#  endif
+    if (!locale_) FMT_THROW(system_error(errno, "cannot create locale"));
+  }
+  ~locale() { freelocale(locale_); }
+
+  type get() const { return locale_; }
+
+  // Converts string to floating-point number and advances str past the end
+  // of the parsed input.
+  double strtod(const char*& str) const {
+    char* end = nullptr;
+    double result = strtod_l(str, &end, locale_);
+    str = end;
+    return result;
+  }
+};
+using Locale FMT_DEPRECATED_ALIAS = locale;
+#endif  // FMT_LOCALE
+FMT_END_NAMESPACE
+
+#endif  // FMT_OS_H_
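Note on `os.h`: on non-Windows builds `FMT_RETRY_VAL` repeats a call while it returns the error value and `errno` is `EINTR`. On Windows it evaluates the expression once. The sketch below expresses the same loop as a function template so it can be exercised without real system calls. `retry_on_eintr` and `flaky_call` are hypothetical names, and `flaky_call` is a fake that only simulates an interrupted call.

```cpp
#include <cerrno>

// Hypothetical function-template version of the FMT_RETRY_VAL loop:
// call again only when the result is the error value AND errno is EINTR.
template <typename Call, typename Result>
Result retry_on_eintr(Call call, Result error_result) {
  Result result;
  do {
    result = call();
  } while (result == error_result && errno == EINTR);
  return result;
}

// Fake "system call" for demonstration: reports EINTR on the first two
// attempts, then succeeds and returns 7.
struct flaky_call {
  int* attempts;
  int operator()() const {
    if (++*attempts < 3) {
      errno = EINTR;
      return -1;
    }
    return 7;
  }
};
```

An error other than `EINTR` ends the loop at once and is returned to the caller. The comments in `os.cc` also explain why `close`, `dup` and `pipe` are deliberately not retried.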
--- a/src/3rdparty/fmt/ostream.h
+++ b/src/3rdparty/fmt/ostream.h
@@ -0,0 +1,177 @@
+// Formatting library for C++ - std::ostream support
+//
+// Copyright (c) 2012 - present, Victor Zverovich
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+#ifndef FMT_OSTREAM_H_
+#define FMT_OSTREAM_H_
+
+#include <ostream>
+
+#include "format.h"
+
+FMT_BEGIN_NAMESPACE
+
+template <typename Char> class basic_printf_parse_context;
+template <typename OutputIt, typename Char> class basic_printf_context;
+
+namespace detail {
+
+template <class Char> class formatbuf : public std::basic_streambuf<Char> {
+ private:
+  using int_type = typename std::basic_streambuf<Char>::int_type;
+  using traits_type = typename std::basic_streambuf<Char>::traits_type;
+
+  buffer<Char>& buffer_;
+
+ public:
+  formatbuf(buffer<Char>& buf) : buffer_(buf) {}
+
+ protected:
+  // The put-area is actually always empty. This makes the implementation
+  // simpler and has the advantage that the streambuf and the buffer are always
+  // in sync and sputc never writes into uninitialized memory. The obvious
+  // disadvantage is that each call to sputc always results in a (virtual) call
+  // to overflow. There is no disadvantage here for sputn since this always
+  // results in a call to xsputn.
+
+  int_type overflow(int_type ch = traits_type::eof()) FMT_OVERRIDE {
+    if (!traits_type::eq_int_type(ch, traits_type::eof()))
+      buffer_.push_back(static_cast<Char>(ch));
+    return ch;
+  }
+
+  std::streamsize xsputn(const Char* s, std::streamsize count) FMT_OVERRIDE {
+    buffer_.append(s, s + count);
+    return count;
+  }
+};
+
+struct converter {
+  template <typename T, FMT_ENABLE_IF(is_integral<T>::value)> converter(T);
+};
+
+template <typename Char> struct test_stream : std::basic_ostream<Char> {
+ private:
+  void_t<> operator<<(converter);
+};
+
+// Hide insertion operators for built-in types.
+template <typename Char, typename Traits>
+void_t<> operator<<(std::basic_ostream<Char, Traits>&, Char);
+template <typename Char, typename Traits>
+void_t<> operator<<(std::basic_ostream<Char, Traits>&, char);
+template <typename Traits>
+void_t<> operator<<(std::basic_ostream<char, Traits>&, char);
+template <typename Traits>
+void_t<> operator<<(std::basic_ostream<char, Traits>&, signed char);
+template <typename Traits>
+void_t<> operator<<(std::basic_ostream<char, Traits>&, unsigned char);
+
+// Checks if T has a user-defined operator<< (i.e. not a member of
+// std::ostream).
+template <typename T, typename Char> class is_streamable {
+ private:
+  template <typename U>
+  static bool_constant<!std::is_same<decltype(std::declval<test_stream<Char>&>()
+                                              << std::declval<U>()),
+                                     void_t<>>::value>
+  test(int);
+
+  template <typename> static std::false_type test(...);
+
+  using result = decltype(test<T>(0));
+
+ public:
+  static const bool value = result::value;
+};
+
+// Write the content of buf to os.
+template <typename Char>
+void write_buffer(std::basic_ostream<Char>& os, buffer<Char>& buf) {
+  const Char* buf_data = buf.data();
+  using unsigned_streamsize = std::make_unsigned<std::streamsize>::type;
+  unsigned_streamsize size = buf.size();
+  unsigned_streamsize max_size = to_unsigned(max_value<std::streamsize>());
+  do {
+    unsigned_streamsize n = size <= max_size ? size : max_size;
+    os.write(buf_data, static_cast<std::streamsize>(n));
+    buf_data += n;
+    size -= n;
+  } while (size != 0);
+}
+
+template <typename Char, typename T>
+void format_value(buffer<Char>& buf, const T& value,
+                  locale_ref loc = locale_ref()) {
+  formatbuf<Char> format_buf(buf);
+  std::basic_ostream<Char> output(&format_buf);
+#if !defined(FMT_STATIC_THOUSANDS_SEPARATOR)
+  if (loc) output.imbue(loc.get<std::locale>());
+#endif
+  output << value;
+  output.exceptions(std::ios_base::failbit | std::ios_base::badbit);
+  buf.try_resize(buf.size());
+}
+
+// Formats an object of type T that has an overloaded ostream operator<<.
+template <typename T, typename Char>
+struct fallback_formatter<T, Char, enable_if_t<is_streamable<T, Char>::value>>
+    : private formatter<basic_string_view<Char>, Char> {
+  FMT_CONSTEXPR auto parse(basic_format_parse_context<Char>& ctx)
+      -> decltype(ctx.begin()) {
+    return formatter<basic_string_view<Char>, Char>::parse(ctx);
+  }
+  template <typename ParseCtx,
+            FMT_ENABLE_IF(std::is_same<
+                          ParseCtx, basic_printf_parse_context<Char>>::value)>
+  auto parse(ParseCtx& ctx) -> decltype(ctx.begin()) {
+    return ctx.begin();
+  }
+
+  template <typename OutputIt>
+  auto format(const T& value, basic_format_context<OutputIt, Char>& ctx)
+      -> OutputIt {
+    basic_memory_buffer<Char> buffer;
+    format_value(buffer, value, ctx.locale());
+    basic_string_view<Char> str(buffer.data(), buffer.size());
+    return formatter<basic_string_view<Char>, Char>::format(str, ctx);
+  }
+  template <typename OutputIt>
+  auto format(const T& value, basic_printf_context<OutputIt, Char>& ctx)
+      -> OutputIt {
+    basic_memory_buffer<Char> buffer;
+    format_value(buffer, value, ctx.locale());
+    return std::copy(buffer.begin(), buffer.end(), ctx.out());
+  }
+};
+}  // namespace detail
+
+template <typename Char>
+void vprint(std::basic_ostream<Char>& os, basic_string_view<Char> format_str,
+            basic_format_args<buffer_context<type_identity_t<Char>>> args) {
+  basic_memory_buffer<Char> buffer;
+  detail::vformat_to(buffer, format_str, args);
+  detail::write_buffer(os, buffer);
+}
+
+/**
+  \rst
+  Prints formatted data to the stream *os*.
+
+  **Example**::
+
+    fmt::print(cerr, "Don't {}!", "panic");
+  \endrst
+ */
+template <typename S, typename... Args,
+          typename Char = enable_if_t<detail::is_string<S>::value, char_t<S>>>
+void print(std::basic_ostream<Char>& os, const S& format_str, Args&&... args) {
+  vprint(os, to_string_view(format_str),
+         fmt::make_args_checked<Args...>(format_str, args...));
+}
+FMT_END_NAMESPACE
+
+#endif  // FMT_OSTREAM_H_
--- a/src/3rdparty/fmt/posix.h
+++ b/src/3rdparty/fmt/posix.h
@@ -0,0 +1,2 @@
+#include "os.h"
+#warning "fmt/posix.h is deprecated; use fmt/os.h instead"
--- a/src/3rdparty/fmt/printf.h
+++ b/src/3rdparty/fmt/printf.h
@@ -0,0 +1,751 @@
+// Formatting library for C++ - legacy printf implementation
+//
+// Copyright (c) 2012 - 2016, Victor Zverovich
+// All rights reserved.
+//
+// For the license information refer to format.h.
+
+#ifndef FMT_PRINTF_H_
+#define FMT_PRINTF_H_
+
+#include <algorithm>  // std::max
+#include <limits>     // std::numeric_limits
+
+#include "ostream.h"
+
+FMT_BEGIN_NAMESPACE
+namespace detail {
+
+// Checks if a value fits in int - used to avoid warnings about comparing
+// signed and unsigned integers.
+template <bool IsSigned> struct int_checker {
+  template <typename T> static bool fits_in_int(T value) {
+    unsigned max = max_value<int>();
+    return value <= max;
+  }
+  static bool fits_in_int(bool) { return true; }
+};
+
+template <> struct int_checker<true> {
+  template <typename T> static bool fits_in_int(T value) {
+    return value >= (std::numeric_limits<int>::min)() &&
+           value <= max_value<int>();
+  }
+  static bool fits_in_int(int) { return true; }
+};
+
+class printf_precision_handler {
+ public:
+  template <typename T, FMT_ENABLE_IF(std::is_integral<T>::value)>
+  int operator()(T value) {
+    if (!int_checker<std::numeric_limits<T>::is_signed>::fits_in_int(value))
+      FMT_THROW(format_error("number is too big"));
+    return (std::max)(static_cast<int>(value), 0);
+  }
+
+  template <typename T, FMT_ENABLE_IF(!std::is_integral<T>::value)>
+  int operator()(T) {
+    FMT_THROW(format_error("precision is not integer"));
+    return 0;
+  }
+};
+
+// An argument visitor that returns true iff arg is a zero integer.
+class is_zero_int {
+ public:
+  template <typename T, FMT_ENABLE_IF(std::is_integral<T>::value)>
+  bool operator()(T value) {
+    return value == 0;
+  }
+
+  template <typename T, FMT_ENABLE_IF(!std::is_integral<T>::value)>
+  bool operator()(T) {
+    return false;
+  }
+};
+
+template <typename T> struct make_unsigned_or_bool : std::make_unsigned<T> {};
+
+template <> struct make_unsigned_or_bool<bool> { using type = bool; };
+
+template <typename T, typename Context> class arg_converter {
+ private:
+  using char_type = typename Context::char_type;
+
+  basic_format_arg<Context>& arg_;
+  char_type type_;
+
+ public:
+  arg_converter(basic_format_arg<Context>& arg, char_type type)
+      : arg_(arg), type_(type) {}
+
+  void operator()(bool value) {
+    if (type_ != 's') operator()<bool>(value);
+  }
+
+  template <typename U, FMT_ENABLE_IF(std::is_integral<U>::value)>
+  void operator()(U value) {
+    bool is_signed = type_ == 'd' || type_ == 'i';
+    using target_type = conditional_t<std::is_same<T, void>::value, U, T>;
+    if (const_check(sizeof(target_type) <= sizeof(int))) {
+      // Extra casts are used to silence warnings.
+      if (is_signed) {
+        arg_ = detail::make_arg<Context>(
+            static_cast<int>(static_cast<target_type>(value)));
+      } else {
+        using unsigned_type = typename make_unsigned_or_bool<target_type>::type;
+        arg_ = detail::make_arg<Context>(
+            static_cast<unsigned>(static_cast<unsigned_type>(value)));
+      }
+    } else {
+      if (is_signed) {
+        // glibc's printf doesn't sign extend arguments of smaller types:
+        //   std::printf("%lld", -42);  // prints "4294967254"
+        // but we don't have to do the same because it's UB.
+        arg_ = detail::make_arg<Context>(static_cast<long long>(value));
+      } else {
+        arg_ = detail::make_arg<Context>(
+            static_cast<typename make_unsigned_or_bool<U>::type>(value));
+      }
+    }
+  }
+
+  template <typename U, FMT_ENABLE_IF(!std::is_integral<U>::value)>
+  void operator()(U) {}  // No conversion needed for non-integral types.
+};
+
+// Converts an integer argument to T for printf, if T is an integral type.
+// If T is void, the argument is converted to the corresponding signed or
+// unsigned type depending on the type specifier ('d' and 'i' are signed, all
+// others unsigned).
+template <typename T, typename Context, typename Char>
+void convert_arg(basic_format_arg<Context>& arg, Char type) {
+  visit_format_arg(arg_converter<T, Context>(arg, type), arg);
+}
+
+// Converts an integer argument to char for printf.
+template <typename Context> class char_converter {
+ private:
+  basic_format_arg<Context>& arg_;
+
+ public:
+  explicit char_converter(basic_format_arg<Context>& arg) : arg_(arg) {}
+
+  template <typename T, FMT_ENABLE_IF(std::is_integral<T>::value)>
+  void operator()(T value) {
+    arg_ = detail::make_arg<Context>(
+        static_cast<typename Context::char_type>(value));
+  }
+
+  template <typename T, FMT_ENABLE_IF(!std::is_integral<T>::value)>
+  void operator()(T) {}  // No conversion needed for non-integral types.
+};
+
+// An argument visitor that returns a pointer to a C string if the argument is
+// a string, or null otherwise.
+template <typename Char> struct get_cstring {
+  template <typename T> const Char* operator()(T) { return nullptr; }
+  const Char* operator()(const Char* s) { return s; }
+};
+
+// Checks if an argument is a valid printf width specifier and sets
+// left alignment if it is negative.
+template <typename Char> class printf_width_handler {
+ private:
+  using format_specs = basic_format_specs<Char>;
+
+  format_specs& specs_;
+
+ public:
+  explicit printf_width_handler(format_specs& specs) : specs_(specs) {}
+
+  template <typename T, FMT_ENABLE_IF(std::is_integral<T>::value)>
+  unsigned operator()(T value) {
+    auto width = static_cast<uint32_or_64_or_128_t<T>>(value);
+    if (detail::is_negative(value)) {
+      specs_.align = align::left;
+      width = 0 - width;
+    }
+    unsigned int_max = max_value<int>();
+    if (width > int_max) FMT_THROW(format_error("number is too big"));
+    return static_cast<unsigned>(width);
+  }
+
+  template <typename T, FMT_ENABLE_IF(!std::is_integral<T>::value)>
+  unsigned operator()(T) {
+    FMT_THROW(format_error("width is not integer"));
+    return 0;
+  }
+};
+
+template <typename Char, typename Context>
+void vprintf(buffer<Char>& buf, basic_string_view<Char> format,
+             basic_format_args<Context> args) {
+  Context(buffer_appender<Char>(buf), format, args).format();
+}
+}  // namespace detail
+
+// For printing into memory_buffer.
+template <typename Char, typename Context>
+FMT_DEPRECATED void printf(detail::buffer<Char>& buf,
+                           basic_string_view<Char> format,
+                           basic_format_args<Context> args) {
+  return detail::vprintf(buf, format, args);
+}
+using detail::vprintf;
+
+template <typename Char>
+class basic_printf_parse_context : public basic_format_parse_context<Char> {
+  using basic_format_parse_context<Char>::basic_format_parse_context;
+};
+template <typename OutputIt, typename Char> class basic_printf_context;
+
+/**
+  \rst
+  The ``printf`` argument formatter.
+  \endrst
+ */
+template <typename OutputIt, typename Char>
+class printf_arg_formatter : public detail::arg_formatter_base<OutputIt, Char> {
+ public:
+  using iterator = OutputIt;
+
+ private:
+  using char_type = Char;
+  using base = detail::arg_formatter_base<OutputIt, Char>;
+  using context_type = basic_printf_context<OutputIt, Char>;
+
+  context_type& context_;
+
+  void write_null_pointer(char) {
+    this->specs()->type = 0;
+    this->write("(nil)");
+  }
+
+  void write_null_pointer(wchar_t) {
+    this->specs()->type = 0;
+    this->write(L"(nil)");
+  }
+
+ public:
+  using format_specs = typename base::format_specs;
+
+  /**
+    \rst
+    Constructs an argument formatter object.
+    *iter* is the output iterator and *specs* contains format specifier
+    information for standard argument types.
+    \endrst
+   */
+  printf_arg_formatter(iterator iter, format_specs& specs, context_type& ctx)
+      : base(iter, &specs, detail::locale_ref()), context_(ctx) {}
+
+  template <typename T, FMT_ENABLE_IF(fmt::detail::is_integral<T>::value)>
+  iterator operator()(T value) {
+    // MSVC2013 fails to compile separate overloads for bool and char_type so
+    // use std::is_same instead.
+    if (std::is_same<T, bool>::value) {
+      format_specs& fmt_specs = *this->specs();
+      if (fmt_specs.type != 's') return base::operator()(value ? 1 : 0);
+      fmt_specs.type = 0;
+      this->write(value != 0);
+    } else if (std::is_same<T, char_type>::value) {
+      format_specs& fmt_specs = *this->specs();
+      if (fmt_specs.type && fmt_specs.type != 'c')
+        return (*this)(static_cast<int>(value));
+      fmt_specs.sign = sign::none;
+      fmt_specs.alt = false;
+      fmt_specs.fill[0] = ' ';  // Ignore '0' flag for char types.
+      // align::numeric needs to be overwritten here since the '0' flag is
+      // ignored for non-numeric types
+      if (fmt_specs.align == align::none || fmt_specs.align == align::numeric)
+        fmt_specs.align = align::right;
+      return base::operator()(value);
+    } else {
+      return base::operator()(value);
+    }
+    return this->out();
+  }
+
+  template <typename T, FMT_ENABLE_IF(std::is_floating_point<T>::value)>
+  iterator operator()(T value) {
+    return base::operator()(value);
+  }
+
+  /** Formats a null-terminated C string. */
+  iterator operator()(const char* value) {
+    if (value)
+      base::operator()(value);
+    else if (this->specs()->type == 'p')
+      write_null_pointer(char_type());
+    else
+      this->write("(null)");
+    return this->out();
+  }
+
+  /** Formats a null-terminated wide C string. */
+  iterator operator()(const wchar_t* value) {
+    if (value)
+      base::operator()(value);
+    else if (this->specs()->type == 'p')
+      write_null_pointer(char_type());
+    else
+      this->write(L"(null)");
+    return this->out();
+  }
+
+  iterator operator()(basic_string_view<char_type> value) {
+    return base::operator()(value);
+  }
+
+  iterator operator()(monostate value) { return base::operator()(value); }
+
+  /** Formats a pointer. */
+  iterator operator()(const void* value) {
+    if (value) return base::operator()(value);
+    this->specs()->type = 0;
+    write_null_pointer(char_type());
+    return this->out();
+  }
+
+  /** Formats an argument of a custom (user-defined) type. */
+  iterator operator()(typename basic_format_arg<context_type>::handle handle) {
+    handle.format(context_.parse_context(), context_);
+    return this->out();
+  }
+};
+
+template <typename T> struct printf_formatter {
+  printf_formatter() = delete;
+
+  template <typename ParseContext>
+  auto parse(ParseContext& ctx) -> decltype(ctx.begin()) {
+    return ctx.begin();
+  }
+
+  template <typename FormatContext>
+  auto format(const T& value, FormatContext& ctx) -> decltype(ctx.out()) {
+    detail::format_value(detail::get_container(ctx.out()), value);
+    return ctx.out();
+  }
+};
+
+/**
+ This template formats data and writes the output through an output iterator.
+ */
+template <typename OutputIt, typename Char> class basic_printf_context {
+ public:
+  /** The character type for the output. */
+  using char_type = Char;
+  using iterator = OutputIt;
+  using format_arg = basic_format_arg<basic_printf_context>;
+  using parse_context_type = basic_printf_parse_context<Char>;
+  template <typename T> using formatter_type = printf_formatter<T>;
+
+ private:
+  using format_specs = basic_format_specs<char_type>;
+
+  OutputIt out_;
+  basic_format_args<basic_printf_context> args_;
+  parse_context_type parse_ctx_;
+
+  static void parse_flags(format_specs& specs, const Char*& it,
+                          const Char* end);
+
+  // Returns the argument with specified index or, if arg_index is -1, the next
+  // argument.
+  format_arg get_arg(int arg_index = -1);
+
+  // Parses argument index, flags and width and returns the argument index.
+  int parse_header(const Char*& it, const Char* end, format_specs& specs);
+
+ public:
+  /**
+   \rst
+   Constructs a ``printf_context`` object. References to the arguments are
+   stored in the context object so make sure they have appropriate lifetimes.
+   \endrst
+   */
+  basic_printf_context(OutputIt out, basic_string_view<char_type> format_str,
+                       basic_format_args<basic_printf_context> args)
+      : out_(out), args_(args), parse_ctx_(format_str) {}
+
+  OutputIt out() { return out_; }
+  void advance_to(OutputIt it) { out_ = it; }
+
+  detail::locale_ref locale() { return {}; }
+
+  format_arg arg(int id) const { return args_.get(id); }
+
+  parse_context_type& parse_context() { return parse_ctx_; }
+
+  FMT_CONSTEXPR void on_error(const char* message) {
+    parse_ctx_.on_error(message);
+  }
+
+  /** Formats stored arguments and writes the output to the range. */
+  template <typename ArgFormatter = printf_arg_formatter<OutputIt, Char>>
+  OutputIt format();
+};
+
+template <typename OutputIt, typename Char>
+void basic_printf_context<OutputIt, Char>::parse_flags(format_specs& specs,
+                                                       const Char*& it,
+                                                       const Char* end) {
+  for (; it != end; ++it) {
+    switch (*it) {
+    case '-':
+      specs.align = align::left;
+      break;
+    case '+':
+      specs.sign = sign::plus;
+      break;
+    case '0':
+      specs.fill[0] = '0';
+      break;
+    case ' ':
+      if (specs.sign != sign::plus) {
+        specs.sign = sign::space;
+      }
+      break;
+    case '#':
+      specs.alt = true;
+      break;
+    default:
+      return;
+    }
+  }
+}
+
+template <typename OutputIt, typename Char>
+typename basic_printf_context<OutputIt, Char>::format_arg
+basic_printf_context<OutputIt, Char>::get_arg(int arg_index) {
+  if (arg_index < 0)
+    arg_index = parse_ctx_.next_arg_id();
+  else
+    parse_ctx_.check_arg_id(--arg_index);
+  return detail::get_arg(*this, arg_index);
+}
+
+template <typename OutputIt, typename Char>
+int basic_printf_context<OutputIt, Char>::parse_header(const Char*& it,
+                                                       const Char* end,
+                                                       format_specs& specs) {
+  int arg_index = -1;
+  char_type c = *it;
+  if (c >= '0' && c <= '9') {
+    // Parse an argument index (if followed by '$') or a width possibly
+    // preceded with '0' flag(s).
+    detail::error_handler eh;
+    int value = parse_nonnegative_int(it, end, eh);
+    if (it != end && *it == '$') {  // value is an argument index
+      ++it;
+      arg_index = value;
+    } else {
+      if (c == '0') specs.fill[0] = '0';
+      if (value != 0) {
+        // Nonzero value means that we parsed width and don't need to
+        // parse it or flags again, so return now.
+        specs.width = value;
+        return arg_index;
+      }
+    }
+  }
+  parse_flags(specs, it, end);
+  // Parse width.
+  if (it != end) {
+    if (*it >= '0' && *it <= '9') {
+      detail::error_handler eh;
+      specs.width = parse_nonnegative_int(it, end, eh);
+    } else if (*it == '*') {
+      ++it;
+      specs.width = static_cast<int>(visit_format_arg(
+          detail::printf_width_handler<char_type>(specs), get_arg()));
+    }
+  }
+  return arg_index;
+}
+
+template <typename OutputIt, typename Char>
+template <typename ArgFormatter>
+OutputIt basic_printf_context<OutputIt, Char>::format() {
+  auto out = this->out();
+  const Char* start = parse_ctx_.begin();
+  const Char* end = parse_ctx_.end();
+  auto it = start;
+  while (it != end) {
+    char_type c = *it++;
+    if (c != '%') continue;
+    if (it != end && *it == c) {
+      out = std::copy(start, it, out);
+      start = ++it;
+      continue;
+    }
+    out = std::copy(start, it - 1, out);
+
+    format_specs specs;
+    specs.align = align::right;
+
+    // Parse argument index, flags and width.
+    int arg_index = parse_header(it, end, specs);
+    if (arg_index == 0) on_error("argument not found");
+
+    // Parse precision.
+    if (it != end && *it == '.') {
+      ++it;
+      c = it != end ? *it : 0;
+      if ('0' <= c && c <= '9') {
+        detail::error_handler eh;
+        specs.precision = parse_nonnegative_int(it, end, eh);
+      } else if (c == '*') {
+        ++it;
+        specs.precision = static_cast<int>(
+            visit_format_arg(detail::printf_precision_handler(), get_arg()));
+      } else {
+        specs.precision = 0;
+      }
+    }
+
+    format_arg arg = get_arg(arg_index);
+    // For d, i, o, u, x, and X conversion specifiers, if a precision is
+    // specified, the '0' flag is ignored.
+    if (specs.precision >= 0 && arg.is_integral())
+      specs.fill[0] =
+          ' ';  // The '0' flag is ignored for integers with a precision.
+    if (specs.precision >= 0 && arg.type() == detail::type::cstring_type) {
+      auto str = visit_format_arg(detail::get_cstring<Char>(), arg);
+      auto str_end = str + specs.precision;
+      auto nul = std::find(str, str_end, Char());
+      arg = detail::make_arg<basic_printf_context>(basic_string_view<Char>(
+          str,
+          detail::to_unsigned(nul != str_end ? nul - str : specs.precision)));
+    }
+    if (specs.alt && visit_format_arg(detail::is_zero_int(), arg))
+      specs.alt = false;
+    if (specs.fill[0] == '0') {
+      if (arg.is_arithmetic() && specs.align != align::left)
+        specs.align = align::numeric;
+      else
+        specs.fill[0] = ' ';  // Ignore '0' flag for non-numeric types or if '-'
+                              // flag is also present.
+    }
+
+    // Parse length and convert the argument to the required type.
+    c = it != end ? *it++ : 0;
+    char_type t = it != end ? *it : 0;
+    using detail::convert_arg;
+    switch (c) {
+    case 'h':
+      if (t == 'h') {
+        ++it;
+        t = it != end ? *it : 0;
+        convert_arg<signed char>(arg, t);
+      } else {
+        convert_arg<short>(arg, t);
+      }
+      break;
+    case 'l':
+      if (t == 'l') {
+        ++it;
+        t = it != end ? *it : 0;
+        convert_arg<long long>(arg, t);
+      } else {
+        convert_arg<long>(arg, t);
+      }
+      break;
+    case 'j':
+      convert_arg<intmax_t>(arg, t);
+      break;
+    case 'z':
+      convert_arg<size_t>(arg, t);
+      break;
+    case 't':
+      convert_arg<std::ptrdiff_t>(arg, t);
+      break;
+    case 'L':
+      // printf produces garbage when 'L' is omitted for long double, no
+      // need to do the same.
+      break;
+    default:
+      --it;
+      convert_arg<void>(arg, c);
+    }
+
+    // Parse type.
+    if (it == end) FMT_THROW(format_error("invalid format string"));
+    specs.type = static_cast<char>(*it++);
+    if (arg.is_integral()) {
+      // Normalize type.
+      switch (specs.type) {
+      case 'i':
+      case 'u':
+        specs.type = 'd';
+        break;
+      case 'c':
+        visit_format_arg(detail::char_converter<basic_printf_context>(arg),
+                         arg);
+        break;
+      }
+    }
+
+    start = it;
+
+    // Format argument.
+    out = visit_format_arg(ArgFormatter(out, specs, *this), arg);
+  }
+  return std::copy(start, it, out);
+}
+
+template <typename Char>
+using basic_printf_context_t =
+    basic_printf_context<detail::buffer_appender<Char>, Char>;
+
+using printf_context = basic_printf_context_t<char>;
+using wprintf_context = basic_printf_context_t<wchar_t>;
+
+using printf_args = basic_format_args<printf_context>;
+using wprintf_args = basic_format_args<wprintf_context>;
+
+/**
+  \rst
+  Constructs an `~fmt::format_arg_store` object that contains references to
+  arguments and can be implicitly converted to `~fmt::printf_args`.
+  \endrst
+ */
+template <typename... Args>
+inline format_arg_store<printf_context, Args...> make_printf_args(
+    const Args&... args) {
+  return {args...};
+}
+
+/**
+  \rst
+  Constructs an `~fmt::format_arg_store` object that contains references to
+  arguments and can be implicitly converted to `~fmt::wprintf_args`.
+  \endrst
+ */
+template <typename... Args>
+inline format_arg_store<wprintf_context, Args...> make_wprintf_args(
+    const Args&... args) {
+  return {args...};
+}
+
+template <typename S, typename Char = char_t<S>>
+inline std::basic_string<Char> vsprintf(
+    const S& format,
+    basic_format_args<basic_printf_context_t<type_identity_t<Char>>> args) {
+  basic_memory_buffer<Char> buffer;
+  vprintf(buffer, to_string_view(format), args);
+  return to_string(buffer);
+}
+
+/**
+  \rst
+  Formats arguments and returns the result as a string.
+
+  **Example**::
+
+    std::string message = fmt::sprintf("The answer is %d", 42);
+  \endrst
+*/
+template <typename S, typename... Args,
+          typename Char = enable_if_t<detail::is_string<S>::value, char_t<S>>>
+inline std::basic_string<Char> sprintf(const S& format, const Args&... args) {
+  using context = basic_printf_context_t<Char>;
+  return vsprintf(to_string_view(format), make_format_args<context>(args...));
+}
+
+template <typename S, typename Char = char_t<S>>
+inline int vfprintf(
+    std::FILE* f, const S& format,
+    basic_format_args<basic_printf_context_t<type_identity_t<Char>>> args) {
+  basic_memory_buffer<Char> buffer;
+  vprintf(buffer, to_string_view(format), args);
+  size_t size = buffer.size();
+  return std::fwrite(buffer.data(), sizeof(Char), size, f) < size
+             ? -1
+             : static_cast<int>(size);
+}
+
+/**
+  \rst
+  Prints formatted data to the file *f*.
+
+  **Example**::
+
+    fmt::fprintf(stderr, "Don't %s!", "panic");
+  \endrst
+ */
+template <typename S, typename... Args,
+          typename Char = enable_if_t<detail::is_string<S>::value, char_t<S>>>
+inline int fprintf(std::FILE* f, const S& format, const Args&... args) {
+  using context = basic_printf_context_t<Char>;
+  return vfprintf(f, to_string_view(format),
+                  make_format_args<context>(args...));
+}
+
+template <typename S, typename Char = char_t<S>>
+inline int vprintf(
+    const S& format,
+    basic_format_args<basic_printf_context_t<type_identity_t<Char>>> args) {
+  return vfprintf(stdout, to_string_view(format), args);
+}
+
+/**
+  \rst
+  Prints formatted data to ``stdout``.
+
+  **Example**::
+
+    fmt::printf("Elapsed time: %.2f seconds", 1.23);
+  \endrst
+ */
+template <typename S, typename... Args,
+          FMT_ENABLE_IF(detail::is_string<S>::value)>
+inline int printf(const S& format_str, const Args&... args) {
+  using context = basic_printf_context_t<char_t<S>>;
+  return vprintf(to_string_view(format_str),
+                 make_format_args<context>(args...));
+}
+
+template <typename S, typename Char = char_t<S>>
+inline int vfprintf(
+    std::basic_ostream<Char>& os, const S& format,
+    basic_format_args<basic_printf_context_t<type_identity_t<Char>>> args) {
+  basic_memory_buffer<Char> buffer;
+  vprintf(buffer, to_string_view(format), args);
+  detail::write_buffer(os, buffer);
+  return static_cast<int>(buffer.size());
+}
+
+/** Formats arguments and writes the output to the range. */
+template <typename ArgFormatter, typename Char,
+          typename Context =
+              basic_printf_context<typename ArgFormatter::iterator, Char>>
+typename ArgFormatter::iterator vprintf(
+    detail::buffer<Char>& out, basic_string_view<Char> format_str,
+    basic_format_args<type_identity_t<Context>> args) {
+  typename ArgFormatter::iterator iter(out);
+  Context(iter, format_str, args).template format<ArgFormatter>();
+  return iter;
+}
+
+/**
+  \rst
+  Prints formatted data to the stream *os*.
+
+  **Example**::
+
+    fmt::fprintf(std::cerr, "Don't %s!", "panic");
+  \endrst
+ */
+template <typename S, typename... Args, typename Char = char_t<S>>
+inline int fprintf(std::basic_ostream<Char>& os, const S& format_str,
+                   const Args&... args) {
+  using context = basic_printf_context_t<Char>;
+  return vfprintf(os, to_string_view(format_str),
+                  make_format_args<context>(args...));
+}
+FMT_END_NAMESPACE
+
+#endif  // FMT_PRINTF_H_
--- a/src/3rdparty/fmt/ranges.h
+++ b/src/3rdparty/fmt/ranges.h
@@ -0,0 +1,393 @@
+// Formatting library for C++ - experimental range support
+//
+// Copyright (c) 2012 - present, Victor Zverovich
+// All rights reserved.
+//
+// For the license information refer to format.h.
+//
+// Copyright (c) 2018 - present, Remotion (Igor Schulz)
+// All Rights Reserved
+// {fmt} support for ranges, containers and types with a tuple interface.
+
+#ifndef FMT_RANGES_H_
+#define FMT_RANGES_H_
+
+#include <initializer_list>
+#include <type_traits>
+
+#include "format.h"
+
+// Output at most N items from the range.
+#ifndef FMT_RANGE_OUTPUT_LENGTH_LIMIT
+#  define FMT_RANGE_OUTPUT_LENGTH_LIMIT 256
+#endif
+
+FMT_BEGIN_NAMESPACE
+
+template <typename Char> struct formatting_base {
+  template <typename ParseContext>
+  FMT_CONSTEXPR auto parse(ParseContext& ctx) -> decltype(ctx.begin()) {
+    return ctx.begin();
+  }
+};
+
+template <typename Char, typename Enable = void>
+struct formatting_range : formatting_base<Char> {
+  static FMT_CONSTEXPR_DECL const size_t range_length_limit =
+      FMT_RANGE_OUTPUT_LENGTH_LIMIT;  // output only up to N items from the
+                                      // range.
+  Char prefix;
+  Char delimiter;
+  Char postfix;
+  formatting_range() : prefix('{'), delimiter(','), postfix('}') {}
+  static FMT_CONSTEXPR_DECL const bool add_delimiter_spaces = true;
+  static FMT_CONSTEXPR_DECL const bool add_prepostfix_space = false;
+};
+
+template <typename Char, typename Enable = void>
+struct formatting_tuple : formatting_base<Char> {
+  Char prefix;
+  Char delimiter;
+  Char postfix;
+  formatting_tuple() : prefix('('), delimiter(','), postfix(')') {}
+  static FMT_CONSTEXPR_DECL const bool add_delimiter_spaces = true;
+  static FMT_CONSTEXPR_DECL const bool add_prepostfix_space = false;
+};
+
+namespace detail {
+
+template <typename RangeT, typename OutputIterator>
+OutputIterator copy(const RangeT& range, OutputIterator out) {
+  for (auto it = range.begin(), end = range.end(); it != end; ++it)
+    *out++ = *it;
+  return out;
+}
+
+template <typename OutputIterator>
+OutputIterator copy(const char* str, OutputIterator out) {
+  while (*str) *out++ = *str++;
+  return out;
+}
+
+template <typename OutputIterator>
+OutputIterator copy(char ch, OutputIterator out) {
+  *out++ = ch;
+  return out;
+}
+
+/// Returns true if T has a std::string-like interface, e.g. std::string_view.
+template <typename T> class is_like_std_string {
+  template <typename U>
+  static auto check(U* p)
+      -> decltype((void)p->find('a'), p->length(), (void)p->data(), int());
+  template <typename> static void check(...);
+
+ public:
+  static FMT_CONSTEXPR_DECL const bool value =
+      is_string<T>::value || !std::is_void<decltype(check<T>(nullptr))>::value;
+};
+
+template <typename Char>
+struct is_like_std_string<fmt::basic_string_view<Char>> : std::true_type {};
+
+template <typename... Ts> struct conditional_helper {};
+
+template <typename T, typename _ = void> struct is_range_ : std::false_type {};
+
+#if !FMT_MSC_VER || FMT_MSC_VER > 1800
+template <typename T>
+struct is_range_<
+    T, conditional_t<false,
+                     conditional_helper<decltype(std::declval<T>().begin()),
+                                        decltype(std::declval<T>().end())>,
+                     void>> : std::true_type {};
+#endif
+
+/// tuple_size and tuple_element check.
+template <typename T> class is_tuple_like_ {
+  template <typename U>
+  static auto check(U* p) -> decltype(std::tuple_size<U>::value, int());
+  template <typename> static void check(...);
+
+ public:
+  static FMT_CONSTEXPR_DECL const bool value =
+      !std::is_void<decltype(check<T>(nullptr))>::value;
+};
+
+// Check for integer_sequence
+#if defined(__cpp_lib_integer_sequence) || FMT_MSC_VER >= 1900
+template <typename T, T... N>
+using integer_sequence = std::integer_sequence<T, N...>;
+template <size_t... N> using index_sequence = std::index_sequence<N...>;
+template <size_t N> using make_index_sequence = std::make_index_sequence<N>;
+#else
+template <typename T, T... N> struct integer_sequence {
+  using value_type = T;
+
+  static FMT_CONSTEXPR size_t size() { return sizeof...(N); }
+};
+
+template <size_t... N> using index_sequence = integer_sequence<size_t, N...>;
+
+template <typename T, size_t N, T... Ns>
+struct make_integer_sequence : make_integer_sequence<T, N - 1, N - 1, Ns...> {};
+template <typename T, T... Ns>
+struct make_integer_sequence<T, 0, Ns...> : integer_sequence<T, Ns...> {};
+
+template <size_t N>
+using make_index_sequence = make_integer_sequence<size_t, N>;
+#endif
+
+template <class Tuple, class F, size_t... Is>
+void for_each(index_sequence<Is...>, Tuple&& tup, F&& f) FMT_NOEXCEPT {
+  using std::get;
+  // using free function get<I>(T) now.
+  const int _[] = {0, ((void)f(get<Is>(tup)), 0)...};
+  (void)_;  // blocks warnings
+}
+
+template <class T>
+FMT_CONSTEXPR make_index_sequence<std::tuple_size<T>::value> get_indexes(
+    T const&) {
+  return {};
+}
+
+template <class Tuple, class F> void for_each(Tuple&& tup, F&& f) {
+  const auto indexes = get_indexes(tup);
+  for_each(indexes, std::forward<Tuple>(tup), std::forward<F>(f));
+}
+
+template <typename Range>
+using value_type = remove_cvref_t<decltype(*std::declval<Range>().begin())>;
+
+template <typename Arg, FMT_ENABLE_IF(!is_like_std_string<
+                                      typename std::decay<Arg>::type>::value)>
+FMT_CONSTEXPR const char* format_str_quoted(bool add_space, const Arg&) {
+  return add_space ? " {}" : "{}";
+}
+
+template <typename Arg, FMT_ENABLE_IF(is_like_std_string<
+                                      typename std::decay<Arg>::type>::value)>
+FMT_CONSTEXPR const char* format_str_quoted(bool add_space, const Arg&) {
+  return add_space ? " \"{}\"" : "\"{}\"";
+}
+
+FMT_CONSTEXPR const char* format_str_quoted(bool add_space, const char*) {
+  return add_space ? " \"{}\"" : "\"{}\"";
+}
+FMT_CONSTEXPR const wchar_t* format_str_quoted(bool add_space, const wchar_t*) {
+  return add_space ? L" \"{}\"" : L"\"{}\"";
+}
+
+FMT_CONSTEXPR const char* format_str_quoted(bool add_space, const char) {
+  return add_space ? " '{}'" : "'{}'";
+}
+FMT_CONSTEXPR const wchar_t* format_str_quoted(bool add_space, const wchar_t) {
+  return add_space ? L" '{}'" : L"'{}'";
+}
+}  // namespace detail
+
+template <typename T> struct is_tuple_like {
+  static FMT_CONSTEXPR_DECL const bool value =
+      detail::is_tuple_like_<T>::value && !detail::is_range_<T>::value;
+};
+
+template <typename TupleT, typename Char>
+struct formatter<TupleT, Char, enable_if_t<fmt::is_tuple_like<TupleT>::value>> {
+ private:
+  // C++11 substitute for a generic lambda, used by format().
+  template <typename FormatContext> struct format_each {
+    template <typename T> void operator()(const T& v) {
+      if (i > 0) {
+        if (formatting.add_prepostfix_space) {
+          *out++ = ' ';
+        }
+        out = detail::copy(formatting.delimiter, out);
+      }
+      out = format_to(out,
+                      detail::format_str_quoted(
+                          (formatting.add_delimiter_spaces && i > 0), v),
+                      v);
+      ++i;
+    }
+
+    formatting_tuple<Char>& formatting;
+    size_t& i;
+    typename std::add_lvalue_reference<decltype(
+        std::declval<FormatContext>().out())>::type out;
+  };
+
+ public:
+  formatting_tuple<Char> formatting;
+
+  template <typename ParseContext>
+  FMT_CONSTEXPR auto parse(ParseContext& ctx) -> decltype(ctx.begin()) {
+    return formatting.parse(ctx);
+  }
+
+  template <typename FormatContext = format_context>
+  auto format(const TupleT& values, FormatContext& ctx) -> decltype(ctx.out()) {
+    auto out = ctx.out();
+    size_t i = 0;
+    detail::copy(formatting.prefix, out);
+
+    detail::for_each(values, format_each<FormatContext>{formatting, i, out});
+    if (formatting.add_prepostfix_space) {
+      *out++ = ' ';
+    }
+    detail::copy(formatting.postfix, out);
+
+    return ctx.out();
+  }
+};
+
+template <typename T, typename Char> struct is_range {
+  static FMT_CONSTEXPR_DECL const bool value =
+      detail::is_range_<T>::value && !detail::is_like_std_string<T>::value &&
+      !std::is_convertible<T, std::basic_string<Char>>::value &&
+      !std::is_constructible<detail::std_string_view<Char>, T>::value;
+};
+
+template <typename T, typename Char>
+struct formatter<
+    T, Char,
+    enable_if_t<fmt::is_range<T, Char>::value
+// Workaround a bug in MSVC 2017 and earlier.
+#if !FMT_MSC_VER || FMT_MSC_VER >= 1927
+                && has_formatter<detail::value_type<T>, format_context>::value
+#endif
+                >> {
+  formatting_range<Char> formatting;
+
+  template <typename ParseContext>
+  FMT_CONSTEXPR auto parse(ParseContext& ctx) -> decltype(ctx.begin()) {
+    return formatting.parse(ctx);
+  }
+
+  template <typename FormatContext>
+  typename FormatContext::iterator format(const T& values, FormatContext& ctx) {
+    auto out = detail::copy(formatting.prefix, ctx.out());
+    size_t i = 0;
+    auto it = values.begin();
+    auto end = values.end();
+    for (; it != end; ++it) {
+      if (i > 0) {
+        if (formatting.add_prepostfix_space) *out++ = ' ';
+        out = detail::copy(formatting.delimiter, out);
+      }
+      out = format_to(out,
+                      detail::format_str_quoted(
+                          (formatting.add_delimiter_spaces && i > 0), *it),
+                      *it);
+      if (++i > formatting.range_length_limit) {
+        out = format_to(out, " ... <other elements>");
+        break;
+      }
+    }
+    if (formatting.add_prepostfix_space) *out++ = ' ';
+    return detail::copy(formatting.postfix, out);
+  }
+};
+
+template <typename Char, typename... T> struct tuple_arg_join : detail::view {
+  const std::tuple<T...>& tuple;
+  basic_string_view<Char> sep;
+
+  tuple_arg_join(const std::tuple<T...>& t, basic_string_view<Char> s)
+      : tuple{t}, sep{s} {}
+};
+
+template <typename Char, typename... T>
+struct formatter<tuple_arg_join<Char, T...>, Char> {
+  template <typename ParseContext>
+  FMT_CONSTEXPR auto parse(ParseContext& ctx) -> decltype(ctx.begin()) {
+    return ctx.begin();
+  }
+
+  template <typename FormatContext>
+  typename FormatContext::iterator format(
+      const tuple_arg_join<Char, T...>& value, FormatContext& ctx) {
+    return format(value, ctx, detail::make_index_sequence<sizeof...(T)>{});
+  }
+
+ private:
+  template <typename FormatContext, size_t... N>
+  typename FormatContext::iterator format(
+      const tuple_arg_join<Char, T...>& value, FormatContext& ctx,
+      detail::index_sequence<N...>) {
+    return format_args(value, ctx, std::get<N>(value.tuple)...);
+  }
+
+  template <typename FormatContext>
+  typename FormatContext::iterator format_args(
+      const tuple_arg_join<Char, T...>&, FormatContext& ctx) {
+    // NOTE: for compilers that support C++17, this empty function instantiation
+    // can be replaced with a constexpr branch in the variadic overload.
+    return ctx.out();
+  }
+
+  template <typename FormatContext, typename Arg, typename... Args>
+  typename FormatContext::iterator format_args(
+      const tuple_arg_join<Char, T...>& value, FormatContext& ctx,
+      const Arg& arg, const Args&... args) {
+    using base = formatter<typename std::decay<Arg>::type, Char>;
+    auto out = ctx.out();
+    out = base{}.format(arg, ctx);
+    if (sizeof...(Args) > 0) {
+      out = std::copy(value.sep.begin(), value.sep.end(), out);
+      ctx.advance_to(out);
+      return format_args(value, ctx, args...);
+    }
+    return out;
+  }
+};
+
+/**
+  \rst
+  Returns an object that formats `tuple` with elements separated by `sep`.
+
+  **Example**::
+
+    std::tuple<int, char> t = {1, 'a'};
+    fmt::print("{}", fmt::join(t, ", "));
+    // Output: "1, a"
+  \endrst
+ */
+template <typename... T>
+FMT_CONSTEXPR tuple_arg_join<char, T...> join(const std::tuple<T...>& tuple,
+                                              string_view sep) {
+  return {tuple, sep};
+}
+
+template <typename... T>
+FMT_CONSTEXPR tuple_arg_join<wchar_t, T...> join(const std::tuple<T...>& tuple,
+                                                 wstring_view sep) {
+  return {tuple, sep};
+}
+
+/**
+  \rst
+  Returns an object that formats `initializer_list` with elements separated by
+  `sep`.
+
+  **Example**::
+
+    fmt::print("{}", fmt::join({1, 2, 3}, ", "));
+    // Output: "1, 2, 3"
+  \endrst
+ */
+template <typename T>
+arg_join<const T*, const T*, char> join(std::initializer_list<T> list,
+                                        string_view sep) {
+  return join(std::begin(list), std::end(list), sep);
+}
+
+template <typename T>
+arg_join<const T*, const T*, wchar_t> join(std::initializer_list<T> list,
+                                           wstring_view sep) {
+  return join(std::begin(list), std::end(list), sep);
+}
+
+FMT_END_NAMESPACE
+
+#endif  // FMT_RANGES_H_
--- a/src/3rdparty/http-parser/AUTHORS
+++ b/src/3rdparty/http-parser/AUTHORS
@@ -1,68 +0,0 @@
-# Authors ordered by first contribution.
-Ryan Dahl <ry@tinyclouds.org>
-Jeremy Hinegardner <jeremy@hinegardner.org>
-Sergey Shepelev <temotor@gmail.com>
-Joe Damato <ice799@gmail.com>
-tomika <tomika_nospam@freemail.hu>
-Phoenix Sol <phoenix@burninglabs.com>
-Cliff Frey <cliff@meraki.com>
-Ewen Cheslack-Postava <ewencp@cs.stanford.edu>
-Santiago Gala <sgala@apache.org>
-Tim Becker <tim.becker@syngenio.de>
-Jeff Terrace <jterrace@gmail.com>
-Ben Noordhuis <info@bnoordhuis.nl>
-Nathan Rajlich <nathan@tootallnate.net>
-Mark Nottingham <mnot@mnot.net>
-Aman Gupta <aman@tmm1.net>
-Tim Becker <tim.becker@kuriositaet.de>
-Sean Cunningham <sean.cunningham@mandiant.com>
-Peter Griess <pg@std.in>
-Salman Haq <salman.haq@asti-usa.com>
-Cliff Frey <clifffrey@gmail.com>
-Jon Kolb <jon@b0g.us>
-Fouad Mardini <f.mardini@gmail.com>
-Paul Querna <pquerna@apache.org>
-Felix Geisendörfer <felix@debuggable.com>
-koichik <koichik@improvement.jp>
-Andre Caron <andre.l.caron@gmail.com>
-Ivo Raisr <ivosh@ivosh.net>
-James McLaughlin <jamie@lacewing-project.org>
-David Gwynne <loki@animata.net>
-Thomas LE ROUX <thomas@november-eleven.fr>
-Randy Rizun <rrizun@ortivawireless.com>
-Andre Louis Caron <andre.louis.caron@usherbrooke.ca>
-Simon Zimmermann <simonz05@gmail.com>
-Erik Dubbelboer <erik@dubbelboer.com>
-Martell Malone <martellmalone@gmail.com>
-Bertrand Paquet <bpaquet@octo.com>
-BogDan Vatra <bogdan@kde.org>
-Peter Faiman <peter@thepicard.org>
-Corey Richardson <corey@octayn.net>
-Tóth Tamás <tomika_nospam@freemail.hu>
-Cam Swords <cam.swords@gmail.com>
-Chris Dickinson <christopher.s.dickinson@gmail.com>
-Uli Köhler <ukoehler@btronik.de>
-Charlie Somerville <charlie@charliesomerville.com>
-Patrik Stutz <patrik.stutz@gmail.com>
-Fedor Indutny <fedor.indutny@gmail.com>
-runner <runner.mei@gmail.com>
-Alexis Campailla <alexis@janeasystems.com>
-David Wragg <david@wragg.org>
-Vinnie Falco <vinnie.falco@gmail.com>
-Alex Butum <alexbutum@linux.com>
-Rex Feng <rexfeng@gmail.com>
-Alex Kocharin <alex@kocharin.ru>
-Mark Koopman <markmontymark@yahoo.com>
-Helge Heß <me@helgehess.eu>
-Alexis La Goutte <alexis.lagoutte@gmail.com>
-George Miroshnykov <george.miroshnykov@gmail.com>
-Maciej Małecki <me@mmalecki.com>
-Marc O'Morain <github.com@marcomorain.com>
-Jeff Pinner <jpinner@twitter.com>
-Timothy J Fontaine <tjfontaine@gmail.com>
-Akagi201 <akagi201@gmail.com>
-Romain Giraud <giraud.romain@gmail.com>
-Jay Satiro <raysatiro@yahoo.com>
-Arne Steen <Arne.Steen@gmx.de>
-Kjell Schubert <kjell.schubert@gmail.com>
-Olivier Mengué <dolmen@cpan.org>
--- a/src/3rdparty/http-parser/LICENSE-MIT
+++ b/src/3rdparty/http-parser/LICENSE-MIT
@@ -1,19 +0,0 @@
-Copyright Joyent, Inc. and other Node contributors.
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to
-deal in the Software without restriction, including without limitation the
-rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
-sell copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in
-all copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
-FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
-IN THE SOFTWARE. 
--- a/src/3rdparty/http-parser/README.md
+++ b/src/3rdparty/http-parser/README.md
@@ -1,246 +0,0 @@
-HTTP Parser
-===========
-
-[![Build Status](https://api.travis-ci.org/nodejs/http-parser.svg?branch=master)](https://travis-ci.org/nodejs/http-parser)
-
-This is a parser for HTTP messages written in C. It parses both requests and
-responses. The parser is designed to be used in performance HTTP
-applications. It does not make any syscalls nor allocations, it does not
-buffer data, it can be interrupted at anytime. Depending on your
-architecture, it only requires about 40 bytes of data per message
-stream (in a web server that is per connection).
-
-Features:
-
-  * No dependencies
-  * Handles persistent streams (keep-alive).
-  * Decodes chunked encoding.
-  * Upgrade support
-  * Defends against buffer overflow attacks.
-
-The parser extracts the following information from HTTP messages:
-
-  * Header fields and values
-  * Content-Length
-  * Request method
-  * Response status code
-  * Transfer-Encoding
-  * HTTP version
-  * Request URL
-  * Message body
-
-
-Usage
-----
-
-One `http_parser` object is used per TCP connection. Initialize the struct
-using `http_parser_init()` and set the callbacks. That might look something
-like this for a request parser:
-```c
-http_parser_settings settings;
-settings.on_url = my_url_callback;
-settings.on_header_field = my_header_field_callback;
-/* ... */
-
-http_parser *parser = malloc(sizeof(http_parser));
-http_parser_init(parser, HTTP_REQUEST);
-parser->data = my_socket;
-```
-
-When data is received on the socket execute the parser and check for errors.
-
-```c
-size_t len = 80*1024, nparsed;
-char buf[len];
-ssize_t recved;
-
-recved = recv(fd, buf, len, 0);
-
-if (recved < 0) {
-  /* Handle error. */
-}
-
-/* Start up / continue the parser.
- * Note we pass recved==0 to signal that EOF has been received.
- */
-nparsed = http_parser_execute(parser, &settings, buf, recved);
-
-if (parser->upgrade) {
-  /* handle new protocol */
-} else if (nparsed != recved) {
-  /* Handle error. Usually just close the connection. */
-}
-```
-
-`http_parser` needs to know where the end of the stream is. For example, sometimes
-servers send responses without Content-Length and expect the client to
-consume input (for the body) until EOF. To tell `http_parser` about EOF, give
-`0` as the fourth parameter to `http_parser_execute()`. Callbacks and errors
-can still be encountered during an EOF, so one must still be prepared
-to receive them.
-
-Scalar valued message information such as `status_code`, `method`, and the
-HTTP version are stored in the parser structure. This data is only
-temporally stored in `http_parser` and gets reset on each new message. If
-this information is needed later, copy it out of the structure during the
-`headers_complete` callback.
-
-The parser decodes the transfer-encoding for both requests and responses
-transparently. That is, a chunked encoding is decoded before being sent to
-the on_body callback.
-
-
-The Special Problem of Upgrade
------------------------------
-
-`http_parser` supports upgrading the connection to a different protocol. An
-increasingly common example of this is the WebSocket protocol which sends
-a request like
-
-        GET /demo HTTP/1.1
-        Upgrade: WebSocket
-        Connection: Upgrade
-        Host: example.com
-        Origin: http://example.com
-        WebSocket-Protocol: sample
-
-followed by non-HTTP data.
-
-(See [RFC6455](https://tools.ietf.org/html/rfc6455) for more information the
-WebSocket protocol.)
-
-To support this, the parser will treat this as a normal HTTP message without a
-body, issuing both on_headers_complete and on_message_complete callbacks. However
-http_parser_execute() will stop parsing at the end of the headers and return.
-
-The user is expected to check if `parser->upgrade` has been set to 1 after
-`http_parser_execute()` returns. Non-HTTP data begins at the buffer supplied
-offset by the return value of `http_parser_execute()`.
-
-
-Callbacks
---------
-
-During the `http_parser_execute()` call, the callbacks set in
-`http_parser_settings` will be executed. The parser maintains state and
-never looks behind, so buffering the data is not necessary. If you need to
-save certain data for later usage, you can do that from the callbacks.
-
-There are two types of callbacks:
-
-* notification `typedef int (*http_cb) (http_parser*);`
-    Callbacks: on_message_begin, on_headers_complete, on_message_complete.
-* data `typedef int (*http_data_cb) (http_parser*, const char *at, size_t length);`
-    Callbacks: (requests only) on_url,
-               (common) on_header_field, on_header_value, on_body;
-
-Callbacks must return 0 on success. Returning a non-zero value indicates
-error to the parser, making it exit immediately.
-
-For cases where it is necessary to pass local information to/from a callback,
-the `http_parser` object's `data` field can be used.
-An example of such a case is when using threads to handle a socket connection,
-parse a request, and then give a response over that socket. By instantiation
-of a thread-local struct containing relevant data (e.g. accepted socket,
-allocated memory for callbacks to write into, etc), a parser's callbacks are
-able to communicate data between the scope of the thread and the scope of the
-callback in a threadsafe manner. This allows `http_parser` to be used in
-multi-threaded contexts.
-
-Example:
-```c
- typedef struct {
-  socket_t sock;
-  void* buffer;
-  int buf_len;
- } custom_data_t;
-
-
-int my_url_callback(http_parser* parser, const char *at, size_t length) {
-  /* access to thread local custom_data_t struct.
-  Use this access save parsed data for later use into thread local
-  buffer, or communicate over socket
-  */
-  parser->data;
-  ...
-  return 0;
-}
-
-...
-
-void http_parser_thread(socket_t sock) {
- int nparsed = 0;
- /* allocate memory for user data */
- custom_data_t *my_data = malloc(sizeof(custom_data_t));
-
- /* some information for use by callbacks.
- * achieves thread -> callback information flow */
- my_data->sock = sock;
-
- /* instantiate a thread-local parser */
- http_parser *parser = malloc(sizeof(http_parser));
- http_parser_init(parser, HTTP_REQUEST); /* initialise parser */
- /* this custom data reference is accessible through the reference to the
- parser supplied to callback functions */
- parser->data = my_data;
-
- http_parser_settings settings; /* set up callbacks */
- settings.on_url = my_url_callback;
-
- /* execute parser */
- nparsed = http_parser_execute(parser, &settings, buf, recved);
-
- ...
- /* parsed information copied from callback.
- can now perform action on data copied into thread-local memory from callbacks.
- achieves callback -> thread information flow */
- my_data->buffer;
- ...
-}
-
-```
-
-In case you parse HTTP message in chunks (i.e. `read()` request line
-from socket, parse, read half headers, parse, etc) your data callbacks
-may be called more than once. `http_parser` guarantees that data pointer is only
-valid for the lifetime of callback. You can also `read()` into a heap allocated
-buffer to avoid copying memory around if this fits your application.
-
-Reading headers may be a tricky task if you read/parse headers partially.
-Basically, you need to remember whether last header callback was field or value
-and apply the following logic:
-
-    (on_header_field and on_header_value shortened to on_h_*)
-     ------------------------ ------------ --------------------------------------------
-    | State (prev. callback) | Callback   | Description/action                         |
-     ------------------------ ------------ --------------------------------------------
-    | nothing (first call)   | on_h_field | Allocate new buffer and copy callback data |
-    |                        |            | into it                                    |
-     ------------------------ ------------ --------------------------------------------
-    | value                  | on_h_field | New header started.                        |
-    |                        |            | Copy current name,value buffers to headers |
-    |                        |            | list and allocate new buffer for new name  |
-     ------------------------ ------------ --------------------------------------------
-    | field                  | on_h_field | Previous name continues. Reallocate name   |
-    |                        |            | buffer and append callback data to it      |
-     ------------------------ ------------ --------------------------------------------
-    | field                  | on_h_value | Value for current header started. Allocate |
-    |                        |            | new buffer and copy callback data to it    |
-     ------------------------ ------------ --------------------------------------------
-    | value                  | on_h_value | Value continues. Reallocate value buffer   |
-    |                        |            | and append callback data to it             |
-     ------------------------ ------------ --------------------------------------------
-
-
-Parsing URLs
------------
-
-A simplistic zero-copy URL parser is provided as `http_parser_parse_url()`.
-Users of this library may wish to use it to parse URLs constructed from
-consecutive `on_url` callbacks.
-
-See examples of reading in headers:
-
-* [partial example](http://gist.github.com/155877) in C
-* [from http-parser tests](http://github.com/joyent/http-parser/blob/37a0ff8/test.c#L403) in C
-* [from Node library](http://github.com/joyent/node/blob/842eaf4/src/http.js#L284) in Javascript
--- a/src/3rdparty/http-parser/http_parser.c
+++ b/src/3rdparty/http-parser/http_parser.c
--- a/src/3rdparty/http-parser/http_parser.h
+++ b/src/3rdparty/http-parser/http_parser.h
@@ -1,442 +0,0 @@
-/* Copyright Joyent, Inc. and other Node contributors. All rights reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-#ifndef http_parser_h
-#define http_parser_h
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/* Also update SONAME in the Makefile whenever you change these. */
-#define HTTP_PARSER_VERSION_MAJOR 2
-#define HTTP_PARSER_VERSION_MINOR 9
-#define HTTP_PARSER_VERSION_PATCH 3
-
-#include <stddef.h>
-#if defined(_WIN32) && !defined(__MINGW32__) && \
-  (!defined(_MSC_VER) || _MSC_VER<1600) && !defined(__WINE__)
-#include <BaseTsd.h>
-typedef __int8 int8_t;
-typedef unsigned __int8 uint8_t;
-typedef __int16 int16_t;
-typedef unsigned __int16 uint16_t;
-typedef __int32 int32_t;
-typedef unsigned __int32 uint32_t;
-typedef __int64 int64_t;
-typedef unsigned __int64 uint64_t;
-#else
-#include <stdint.h>
-#endif
-
-/* Compile with -DHTTP_PARSER_STRICT=0 to make less checks, but run
- * faster
- */
-#ifndef HTTP_PARSER_STRICT
-# define HTTP_PARSER_STRICT 1
-#endif
-
-/* Maximium header size allowed. If the macro is not defined
- * before including this header then the default is used. To
- * change the maximum header size, define the macro in the build
- * environment (e.g. -DHTTP_MAX_HEADER_SIZE=<value>). To remove
- * the effective limit on the size of the header, define the macro
- * to a very large number (e.g. -DHTTP_MAX_HEADER_SIZE=0x7fffffff)
- */
-#ifndef HTTP_MAX_HEADER_SIZE
-# define HTTP_MAX_HEADER_SIZE (80*1024)
-#endif
-
-typedef struct http_parser http_parser;
-typedef struct http_parser_settings http_parser_settings;
-
-
-/* Callbacks should return non-zero to indicate an error. The parser will
- * then halt execution.
- *
- * The one exception is on_headers_complete. In a HTTP_RESPONSE parser
- * returning '1' from on_headers_complete will tell the parser that it
- * should not expect a body. This is used when receiving a response to a
- * HEAD request which may contain 'Content-Length' or 'Transfer-Encoding:
- * chunked' headers that indicate the presence of a body.
- *
- * Returning `2` from on_headers_complete will tell parser that it should not
- * expect neither a body nor any futher responses on this connection. This is
- * useful for handling responses to a CONNECT request which may not contain
- * `Upgrade` or `Connection: upgrade` headers.
- *
- * http_data_cb does not return data chunks. It will be called arbitrarily
- * many times for each string. E.G. you might get 10 callbacks for "on_url"
- * each providing just a few characters more data.
- */
-typedef int (*http_data_cb) (http_parser*, const char *at, size_t length);
-typedef int (*http_cb) (http_parser*);
-
-
-/* Status Codes */
-#define HTTP_STATUS_MAP(XX)                                                 \
-  XX(100, CONTINUE,                        Continue)                        \
-  XX(101, SWITCHING_PROTOCOLS,             Switching Protocols)             \
-  XX(102, PROCESSING,                      Processing)                      \
-  XX(200, OK,                              OK)                              \
-  XX(201, CREATED,                         Created)                         \
-  XX(202, ACCEPTED,                        Accepted)                        \
-  XX(203, NON_AUTHORITATIVE_INFORMATION,   Non-Authoritative Information)   \
-  XX(204, NO_CONTENT,                      No Content)                      \
-  XX(205, RESET_CONTENT,                   Reset Content)                   \
-  XX(206, PARTIAL_CONTENT,                 Partial Content)                 \
-  XX(207, MULTI_STATUS,                    Multi-Status)                    \
-  XX(208, ALREADY_REPORTED,                Already Reported)                \
-  XX(226, IM_USED,                         IM Used)                         \
-  XX(300, MULTIPLE_CHOICES,                Multiple Choices)                \
-  XX(301, MOVED_PERMANENTLY,               Moved Permanently)               \
-  XX(302, FOUND,                           Found)                           \
-  XX(303, SEE_OTHER,                       See Other)                       \
-  XX(304, NOT_MODIFIED,                    Not Modified)                    \
-  XX(305, USE_PROXY,                       Use Proxy)                       \
-  XX(307, TEMPORARY_REDIRECT,              Temporary Redirect)              \
-  XX(308, PERMANENT_REDIRECT,              Permanent Redirect)              \
-  XX(400, BAD_REQUEST,                     Bad Request)                     \
-  XX(401, UNAUTHORIZED,                    Unauthorized)                    \
-  XX(402, PAYMENT_REQUIRED,                Payment Required)                \
-  XX(403, FORBIDDEN,                       Forbidden)                       \
-  XX(404, NOT_FOUND,                       Not Found)                       \
-  XX(405, METHOD_NOT_ALLOWED,              Method Not Allowed)              \
-  XX(406, NOT_ACCEPTABLE,                  Not Acceptable)                  \
-  XX(407, PROXY_AUTHENTICATION_REQUIRED,   Proxy Authentication Required)   \
-  XX(408, REQUEST_TIMEOUT,                 Request Timeout)                 \
-  XX(409, CONFLICT,                        Conflict)                        \
-  XX(410, GONE,                            Gone)                            \
-  XX(411, LENGTH_REQUIRED,                 Length Required)                 \
-  XX(412, PRECONDITION_FAILED,             Precondition Failed)             \
-  XX(413, PAYLOAD_TOO_LARGE,               Payload Too Large)               \
-  XX(414, URI_TOO_LONG,                    URI Too Long)                    \
-  XX(415, UNSUPPORTED_MEDIA_TYPE,          Unsupported Media Type)          \
-  XX(416, RANGE_NOT_SATISFIABLE,           Range Not Satisfiable)           \
-  XX(417, EXPECTATION_FAILED,              Expectation Failed)              \
-  XX(421, MISDIRECTED_REQUEST,             Misdirected Request)             \
-  XX(422, UNPROCESSABLE_ENTITY,            Unprocessable Entity)            \
-  XX(423, LOCKED,                          Locked)                          \
-  XX(424, FAILED_DEPENDENCY,               Failed Dependency)               \
-  XX(426, UPGRADE_REQUIRED,                Upgrade Required)                \
-  XX(428, PRECONDITION_REQUIRED,           Precondition Required)           \
-  XX(429, TOO_MANY_REQUESTS,               Too Many Requests)               \
-  XX(431, REQUEST_HEADER_FIELDS_TOO_LARGE, Request Header Fields Too Large) \
-  XX(451, UNAVAILABLE_FOR_LEGAL_REASONS,   Unavailable For Legal Reasons)   \
-  XX(500, INTERNAL_SERVER_ERROR,           Internal Server Error)           \
-  XX(501, NOT_IMPLEMENTED,                 Not Implemented)                 \
-  XX(502, BAD_GATEWAY,                     Bad Gateway)                     \
-  XX(503, SERVICE_UNAVAILABLE,             Service Unavailable)             \
-  XX(504, GATEWAY_TIMEOUT,                 Gateway Timeout)                 \
-  XX(505, HTTP_VERSION_NOT_SUPPORTED,      HTTP Version Not Supported)      \
-  XX(506, VARIANT_ALSO_NEGOTIATES,         Variant Also Negotiates)         \
-  XX(507, INSUFFICIENT_STORAGE,            Insufficient Storage)            \
-  XX(508, LOOP_DETECTED,                   Loop Detected)                   \
-  XX(510, NOT_EXTENDED,                    Not Extended)                    \
-  XX(511, NETWORK_AUTHENTICATION_REQUIRED, Network Authentication Required) \
-
-enum http_status
-  {
-#define XX(num, name, string) HTTP_STATUS_##name = num,
-  HTTP_STATUS_MAP(XX)
-#undef XX
-  };
-
-
-/* Request Methods */
-#define HTTP_METHOD_MAP(XX)         \
-  XX(0,  DELETE,      DELETE)       \
-  XX(1,  GET,         GET)          \
-  XX(2,  HEAD,        HEAD)         \
-  XX(3,  POST,        POST)         \
-  XX(4,  PUT,         PUT)          \
-  /* pathological */                \
-  XX(5,  CONNECT,     CONNECT)      \
-  XX(6,  OPTIONS,     OPTIONS)      \
-  XX(7,  TRACE,       TRACE)        \
-  /* WebDAV */                      \
-  XX(8,  COPY,        COPY)         \
-  XX(9,  LOCK,        LOCK)         \
-  XX(10, MKCOL,       MKCOL)        \
-  XX(11, MOVE,        MOVE)         \
-  XX(12, PROPFIND,    PROPFIND)     \
-  XX(13, PROPPATCH,   PROPPATCH)    \
-  XX(14, SEARCH,      SEARCH)       \
-  XX(15, UNLOCK,      UNLOCK)       \
-  XX(16, BIND,        BIND)         \
-  XX(17, REBIND,      REBIND)       \
-  XX(18, UNBIND,      UNBIND)       \
-  XX(19, ACL,         ACL)          \
-  /* subversion */                  \
-  XX(20, REPORT,      REPORT)       \
-  XX(21, MKACTIVITY,  MKACTIVITY)   \
-  XX(22, CHECKOUT,    CHECKOUT)     \
-  XX(23, MERGE,       MERGE)        \
-  /* upnp */                        \
-  XX(24, MSEARCH,     M-SEARCH)     \
-  XX(25, NOTIFY,      NOTIFY)       \
-  XX(26, SUBSCRIBE,   SUBSCRIBE)    \
-  XX(27, UNSUBSCRIBE, UNSUBSCRIBE)  \
-  /* RFC-5789 */                    \
-  XX(28, PATCH,       PATCH)        \
-  XX(29, PURGE,       PURGE)        \
-  /* CalDAV */                      \
-  XX(30, MKCALENDAR,  MKCALENDAR)   \
-  /* RFC-2068, section 19.6.1.2 */  \
-  XX(31, LINK,        LINK)         \
-  XX(32, UNLINK,      UNLINK)       \
-  /* icecast */                     \
-  XX(33, SOURCE,      SOURCE)       \
-
-enum http_method
-  {
-#define XX(num, name, string) HTTP_##name = num,
-  HTTP_METHOD_MAP(XX)
-#undef XX
-  };
-
-
-enum http_parser_type { HTTP_REQUEST, HTTP_RESPONSE, HTTP_BOTH };
-
-
-/* Flag values for http_parser.flags field */
-enum flags
-  { F_CHUNKED               = 1 << 0
-  , F_CONNECTION_KEEP_ALIVE = 1 << 1
-  , F_CONNECTION_CLOSE      = 1 << 2
-  , F_CONNECTION_UPGRADE    = 1 << 3
-  , F_TRAILING              = 1 << 4
-  , F_UPGRADE               = 1 << 5
-  , F_SKIPBODY              = 1 << 6
-  , F_CONTENTLENGTH         = 1 << 7
-  , F_TRANSFER_ENCODING     = 1 << 8
-  };
-
-
-/* Map for errno-related constants
- *
- * The provided argument should be a macro that takes 2 arguments.
- */
-#define HTTP_ERRNO_MAP(XX)                                           \
-  /* No error */                                                     \
-  XX(OK, "success")                                                  \
-                                                                     \
-  /* Callback-related errors */                                      \
-  XX(CB_message_begin, "the on_message_begin callback failed")       \
-  XX(CB_url, "the on_url callback failed")                           \
-  XX(CB_header_field, "the on_header_field callback failed")         \
-  XX(CB_header_value, "the on_header_value callback failed")         \
-  XX(CB_headers_complete, "the on_headers_complete callback failed") \
-  XX(CB_body, "the on_body callback failed")                         \
-  XX(CB_message_complete, "the on_message_complete callback failed") \
-  XX(CB_status, "the on_status callback failed")                     \
-  XX(CB_chunk_header, "the on_chunk_header callback failed")         \
-  XX(CB_chunk_complete, "the on_chunk_complete callback failed")     \
-                                                                     \
-  /* Parsing-related errors */                                       \
-  XX(INVALID_EOF_STATE, "stream ended at an unexpected time")        \
-  XX(HEADER_OVERFLOW,                                                \
-     "too many header bytes seen; overflow detected")                \
-  XX(CLOSED_CONNECTION,                                              \
-     "data received after completed connection: close message")      \
-  XX(INVALID_VERSION, "invalid HTTP version")                        \
-  XX(INVALID_STATUS, "invalid HTTP status code")                     \
-  XX(INVALID_METHOD, "invalid HTTP method")                          \
-  XX(INVALID_URL, "invalid URL")                                     \
-  XX(INVALID_HOST, "invalid host")                                   \
-  XX(INVALID_PORT, "invalid port")                                   \
-  XX(INVALID_PATH, "invalid path")                                   \
-  XX(INVALID_QUERY_STRING, "invalid query string")                   \
-  XX(INVALID_FRAGMENT, "invalid fragment")                           \
-  XX(LF_EXPECTED, "LF character expected")                           \
-  XX(INVALID_HEADER_TOKEN, "invalid character in header")            \
-  XX(INVALID_CONTENT_LENGTH,                                         \
-     "invalid character in content-length header")                   \
-  XX(UNEXPECTED_CONTENT_LENGTH,                                      \
-     "unexpected content-length header")                             \
-  XX(INVALID_CHUNK_SIZE,                                             \
-     "invalid character in chunk size header")                       \
-  XX(INVALID_TRANSFER_ENCODING,                                      \
-     "request has invalid transfer-encoding")                        \
-  XX(INVALID_CONSTANT, "invalid constant string")                    \
-  XX(INVALID_INTERNAL_STATE, "encountered unexpected internal state")\
-  XX(STRICT, "strict mode assertion failed")                         \
-  XX(PAUSED, "parser is paused")                                     \
-  XX(UNKNOWN, "an unknown error occurred")
-
-
-/* Define HPE_* values for each errno value above */
-#define HTTP_ERRNO_GEN(n, s) HPE_##n,
-enum http_errno {
-  HTTP_ERRNO_MAP(HTTP_ERRNO_GEN)
-};
-#undef HTTP_ERRNO_GEN
-
-
-/* Get an http_errno value from an http_parser */
-#define HTTP_PARSER_ERRNO(p)            ((enum http_errno) (p)->http_errno)
-
-
-struct http_parser {
-  /** PRIVATE **/
-  unsigned int type : 2;         /* enum http_parser_type */
-  unsigned int state : 7;        /* enum state from http_parser.c */
-  unsigned int header_state : 7; /* enum header_state from http_parser.c */
-  unsigned int index : 7;        /* index into current matcher */
-  unsigned int lenient_http_headers : 1;
-  unsigned int flags : 16;       /* F_* values from 'flags' enum; semi-public */
-
-  uint32_t nread;          /* # bytes read in various scenarios */
-  uint64_t content_length; /* # bytes in body (0 if no Content-Length header) */
-
-  /** READ-ONLY **/
-  unsigned short http_major;
-  unsigned short http_minor;
-  unsigned int status_code : 16; /* responses only */
-  unsigned int method : 8;       /* requests only */
-  unsigned int http_errno : 7;
-
-  /* 1 = Upgrade header was present and the parser has exited because of that.
-   * 0 = No upgrade header present.
-   * Should be checked when http_parser_execute() returns in addition to
-   * error checking.
-   */
-  unsigned int upgrade : 1;
-
-  /** PUBLIC **/
-  void *data; /* A pointer to get hook to the "connection" or "socket" object */
-};
-
-
-struct http_parser_settings {
-  http_cb      on_message_begin;
-  http_data_cb on_url;
-  http_data_cb on_status;
-  http_data_cb on_header_field;
-  http_data_cb on_header_value;
-  http_cb      on_headers_complete;
-  http_data_cb on_body;
-  http_cb      on_message_complete;
-  /* When on_chunk_header is called, the current chunk length is stored
-   * in parser->content_length.
-   */
-  http_cb      on_chunk_header;
-  http_cb      on_chunk_complete;
-};
-
-
-enum http_parser_url_fields
-  { UF_SCHEMA           = 0
-  , UF_HOST             = 1
-  , UF_PORT             = 2
-  , UF_PATH             = 3
-  , UF_QUERY            = 4
-  , UF_FRAGMENT         = 5
-  , UF_USERINFO         = 6
-  , UF_MAX              = 7
-  };
-
-
-/* Result structure for http_parser_parse_url().
- *
- * Callers should index into field_data[] with UF_* values iff field_set
- * has the relevant (1 << UF_*) bit set. As a courtesy to clients (and
- * because we probably have padding left over), we convert any port to
- * a uint16_t.
- */
-struct http_parser_url {
-  uint16_t field_set;           /* Bitmask of (1 << UF_*) values */
-  uint16_t port;                /* Converted UF_PORT string */
-
-  struct {
-    uint16_t off;               /* Offset into buffer in which field starts */
-    uint16_t len;               /* Length of run in buffer */
-  } field_data[UF_MAX];
-};
-
-
-/* Returns the library version. Bits 16-23 contain the major version number,
- * bits 8-15 the minor version number and bits 0-7 the patch level.
- * Usage example:
- *
- *   unsigned long version = http_parser_version();
- *   unsigned major = (version >> 16) & 255;
- *   unsigned minor = (version >> 8) & 255;
- *   unsigned patch = version & 255;
- *   printf("http_parser v%u.%u.%u\n", major, minor, patch);
- */
-unsigned long http_parser_version(void);
-
-void http_parser_init(http_parser *parser, enum http_parser_type type);
-
-
-/* Initialize http_parser_settings members to 0
- */
-void http_parser_settings_init(http_parser_settings *settings);
-
-
-/* Executes the parser. Returns number of parsed bytes. Sets
- * `parser->http_errno` on error. */
-size_t http_parser_execute(http_parser *parser,
-                           const http_parser_settings *settings,
-                           const char *data,
-                           size_t len);
-
-
-/* If http_should_keep_alive() in the on_headers_complete or
- * on_message_complete callback returns 0, then this should be
- * the last message on the connection.
- * If you are the server, respond with the "Connection: close" header.
- * If you are the client, close the connection.
- */
-int http_should_keep_alive(const http_parser *parser);
-
-/* Returns a string version of the HTTP method. */
-const char *http_method_str(enum http_method m);
-
-/* Returns a string version of the HTTP status code. */
-const char *http_status_str(enum http_status s);
-
-/* Return a string name of the given error */
-const char *http_errno_name(enum http_errno err);
-
-/* Return a string description of the given error */
-const char *http_errno_description(enum http_errno err);
-
-/* Initialize all http_parser_url members to 0 */
-void http_parser_url_init(struct http_parser_url *u);
-
-/* Parse a URL; return nonzero on failure */
-int http_parser_parse_url(const char *buf, size_t buflen,
-                          int is_connect,
-                          struct http_parser_url *u);
-
-/* Pause or un-pause the parser; a nonzero value pauses */
-void http_parser_pause(http_parser *parser, int paused);
-
-/* Checks if this is the final chunk of the body. */
-int http_body_is_final(const http_parser *parser);
-
-/* Change the maximum header size provided at compile time. */
-void http_parser_set_max_header_size(uint32_t size);
-
-#ifdef __cplusplus
-}
-#endif
-#endif
--- a/src/3rdparty/hwloc/CMakeLists.txt
+++ b/src/3rdparty/hwloc/CMakeLists.txt
@@ -1,4 +1,4 @@
-cmake_minimum_required (VERSION 2.8)
+cmake_minimum_required (VERSION  2.8.12)
 project (hwloc C)

 include_directories(include)
@@ -30,6 +30,8 @@ set(SOURCES
    src/topology-xml.c
    src/topology-xml-nolibxml.c
    src/traversal.c
+    src/memattrs.c
+    src/cpukinds.c
   )

 add_library(hwloc STATIC
--- a/src/3rdparty/hwloc/NEWS
+++ b/src/3rdparty/hwloc/NEWS
@@ -2,6 +2,7 @@ Copyright © 2009 CNRS
 Copyright © 2009-2020 Inria.  All rights reserved.
 Copyright © 2009-2013 Université Bordeaux
 Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
+Copyright © 2020 Hewlett Packard Enterprise.  All rights reserved.

 $COPYRIGHT$

@@ -16,6 +17,76 @@ bug fixes (and other actions) for each version of hwloc since version
 0.9.


+Version 2.4.0
+-------------
+* API
+  + Add hwloc/cpukinds.h for reporting information about hybrid CPUs.
+    - Use Linux cpufreq frequencies to rank cores by efficiency.
+    - Use x86 CPUID hybrid leaf and future Linux kernels sysfs CPU type
+      files to identify Intel Atom and Core cores.
+    - Use the Windows native EfficiencyClass to separate kinds.
+* Backends
+  + Properly handle Linux kernel 5.10+ exposing ACPI HMAT information
+    with knowledge of Generic Initiators.
+* Tools
+  + lstopo has new --cpukinds and --no-cpukinds options for showing
+    CPU kinds or not in textual and graphical modes respectively.
+  + hwloc-calc has a new --cpukind option for filtering PUs by kind.
+  + hwloc-annotate has a new cpukind command for modifying CPU kinds.
+* Misc
+  + Fix hwloc_bitmap_nr_ulongs(), thanks to Norbert Eicker.
+  + Add a documentation section about
+    "Topology Attributes: Distances, Memory Attributes and CPU Kinds".
+  + Silence some spurious warnings in the OpenCL backend and when showing
+    process binding with lstopo --ps.
+
+
+Version 2.3.0
+-------------
+* API
+  + Add hwloc/memattrs.h for exposing latency/bandwidth information
+    between initiators (CPU sets for now) and target NUMA nodes,
+    typically on heterogeneous platforms.
+    - When available, bandwidths and latencies are read from the ACPI HMAT
+      table exposed by Linux kernel 5.2+.
+    - Attributes may also be customized to expose user-defined performance
+      information.
+  + Add hwloc_get_local_numanode_objs() for listing NUMA nodes that are
+    local to some locality.
+  + The new topology flag HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT causes
+    support arrays to be loaded from XML exported with hwloc 2.3+.
+    - hwloc_topology_get_support() now returns an additional "misc"
+      array with feature "imported_support" set when support was imported.
+  + Add hwloc_topology_refresh() to refresh internal caches after modifying
+    the topology and before consulting the topology in a multithread context.
+* Backends
+  + Add a ROCm SMI backend and a hwloc/rsmi.h helper file for getting
+    the locality of AMD GPUs, now exposed as "rsmi" OS devices.
+    Thanks to Mike Li.
+  + Remove POWER device-tree-based topology on Linux
+    (it had been disabled by default since 2.1).
+* Tools
+  + Command-line options for specifying flags now understand comma-separated
+    lists of flag names (substrings).
+  + hwloc-info and hwloc-calc have new --local-memory, --local-memory-flags
+    and --best-memattr options for reporting local memory nodes and filtering
+    by memory attributes.
+  + hwloc-bind has a new --best-memattr option for filtering by memory attributes
+    among the memory binding set.
+  + Tools that have a --restrict option may now receive a nodeset or
+    some custom flags for restricting the topology.
+  + lstopo now has a --thickness option for changing line thickness in the
+    graphical output.
+  + Fix lstopo drawing when autoresizing on Windows 10.
+  + Pressing the F5 key in lstopo X11 and Windows graphical/interactive outputs
+    now refreshes the display according to the current topology and binding.
+  + Add a tikz lstopo graphical backend to generate pictures easily included in
+    LaTeX documents. Thanks to Clement Foyer.
+* Misc
+  + The default installation path of the Bash completion file has changed to
+    ${datadir}/bash-completion/completions/hwloc. Thanks to Tomasz Kłoczko.
+
+
 Version 2.2.0
 -------------
 * API
--- a/src/3rdparty/hwloc/README
+++ b/src/3rdparty/hwloc/README
@@ -23,9 +23,9 @@ APIs are documented after these sections.

 Installation

-hwloc (http://www.open-mpi.org/projects/hwloc/) is available under the BSD
-license. It is hosted as a sub-project of the overall Open MPI project (http://
-www.open-mpi.org/). Note that hwloc does not require any functionality from
+hwloc (https://www.open-mpi.org/projects/hwloc/) is available under the BSD
+license. It is hosted as a sub-project of the overall Open MPI project (https:/
+/www.open-mpi.org/). Note that hwloc does not require any functionality from
 Open MPI -- it is a wholly separate (and much smaller!) project and code base.
 It just happens to be hosted as part of the overall Open MPI project.

@@ -75,7 +75,7 @@ Bugs should be reported in the tracker (https://github.com/open-mpi/hwloc/
 issues). Opening a new issue automatically displays lots of hints about how to
 debug and report issues.

-Questions may be sent to the users or developers mailing lists (http://
+Questions may be sent to the users or developers mailing lists (https://
 www.open-mpi.org/community/lists/hwloc.php).

 There is also a #hwloc IRC channel on Freenode (irc.freenode.net).
--- a/src/3rdparty/hwloc/VERSION
+++ b/src/3rdparty/hwloc/VERSION
@@ -8,7 +8,7 @@
 # Please update HWLOC_VERSION* in contrib/windows/hwloc_config.h too.

 major=2
-minor=2
+minor=4
 release=0

 # greek is used for alpha or beta release tags.  If it is non-empty,
@@ -22,7 +22,7 @@ greek=

 # The date when this release was created

-date="Mar 30, 2020"
+date="Nov 26, 2020"

 # If snapshot=1, then use the value from snapshot_version as the
 # entire hwloc version (i.e., ignore major, minor, release, and
@@ -41,7 +41,7 @@ snapshot_version=${major}.${minor}.${release}${greek}-git
 # 2. Version numbers are described in the Libtool current:revision:age
 # format.

-libhwloc_so_version=17:0:2
+libhwloc_so_version=19:0:4
 libnetloc_so_version=0:0:0

 # Please also update the <TargetName> lines in contrib/windows/libhwloc.vcxproj
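
Review note on the `libhwloc_so_version` bump above. Under the usual libtool convention on Linux/ELF, the soname major number is `current - age`, so `17:0:2` and `19:0:4` both give major 15: the bump adds interfaces without changing the library soname. A minimal sketch of that arithmetic follows. The `lt_version` struct and the `lt_*` helpers are hypothetical names used only for illustration, and the `current - age` rule is an assumption that holds for Linux but not for every platform libtool supports.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for a libtool current:revision:age triple. */
struct lt_version {
    unsigned current;
    unsigned revision;
    unsigned age;
};

/* Parse "C:R:A"; returns 0 on success, -1 on malformed input. */
static int lt_parse(const char *s, struct lt_version *v)
{
    return sscanf(s, "%u:%u:%u", &v->current, &v->revision, &v->age) == 3 ? 0 : -1;
}

/* Assumed Linux/ELF convention: soname major = current - age. */
static unsigned lt_soname_major(const struct lt_version *v)
{
    assert(v->current >= v->age); /* libtool requires age <= current */
    return v->current - v->age;
}

#ifdef LT_DEMO_MAIN
int main(void)
{
    const char *versions[] = { "17:0:2", "19:0:4" };
    for (int i = 0; i < 2; i++) {
        struct lt_version v;
        if (lt_parse(versions[i], &v) != 0)
            return 1;
        printf("%s -> soname major %u\n", versions[i], lt_soname_major(&v));
    }
    return 0;
}
#endif
```

Compile with `-DLT_DEMO_MAIN` to run the small driver; both triples should report the same major number.
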
--- a/src/3rdparty/hwloc/include/hwloc.h
+++ b/src/3rdparty/hwloc/include/hwloc.h
@@ -1,8 +1,8 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2020 Inria.  All rights reserved.
+ * Copyright © 2009-2021 Inria.  All rights reserved.
 * Copyright © 2009-2012 Université Bordeaux
- * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
+ * Copyright © 2009-2020 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
 */

@@ -11,7 +11,7 @@
 *         ------------------------------------------------
 *               $tarball_directory/doc/doxygen-doc/
 *                                or
- *           http://www.open-mpi.org/projects/hwloc/doc/
+ *           https://www.open-mpi.org/projects/hwloc/doc/
 *=====================================================================
 *
 * FAIR WARNING: Do NOT expect to be able to figure out all the
@@ -93,7 +93,7 @@ extern "C" {
 * Two stable releases of the same series usually have the same ::HWLOC_API_VERSION
 * even if their HWLOC_VERSION are different.
 */
-#define HWLOC_API_VERSION 0x00020100
+#define HWLOC_API_VERSION 0x00020400

 /** \brief Indicate at runtime which hwloc API version was used at build time.
 *
@@ -102,7 +102,7 @@ extern "C" {
 HWLOC_DECLSPEC unsigned hwloc_get_api_version(void);

 /** \brief Current component and plugin ABI version (see hwloc/plugins.h) */
-#define HWLOC_COMPONENT_ABI 6
+#define HWLOC_COMPONENT_ABI 7

 /** @} */

@@ -196,7 +196,7 @@ typedef enum {
 			  */
  HWLOC_OBJ_CORE,	/**< \brief Core.
 			  * A computation unit (may be shared by several
-			  * logical processors).
+			  * PUs, aka logical processors).
 			  */
  HWLOC_OBJ_PU,		/**< \brief Processing Unit, or (Logical) Processor.
 			  * An execution unit (may share a core with some
@@ -257,22 +257,31 @@ typedef enum {
  HWLOC_OBJ_BRIDGE,	/**< \brief Bridge (filtered out by default).
 			  * Any bridge (or PCI switch) that connects the host or an I/O bus,
 			  * to another I/O bus.
-			  * They are not added to the topology unless I/O discovery
-			  * is enabled with hwloc_topology_set_flags().
+			  *
+			  * Bridges are not added to the topology unless their
+			  * filtering is changed (see hwloc_topology_set_type_filter()
+			  * and hwloc_topology_set_io_types_filter()).
+			  *
 			  * I/O objects are not listed in the main children list,
 			  * but rather in the dedicated io children list.
 			  * I/O objects have NULL CPU and node sets.
 			  */
  HWLOC_OBJ_PCI_DEVICE,	/**< \brief PCI device (filtered out by default).
-			  * They are not added to the topology unless I/O discovery
-			  * is enabled with hwloc_topology_set_flags().
+			  *
+			  * PCI devices are not added to the topology unless their
+			  * filtering is changed (see hwloc_topology_set_type_filter()
+			  * and hwloc_topology_set_io_types_filter()).
+			  *
 			  * I/O objects are not listed in the main children list,
 			  * but rather in the dedicated io children list.
 			  * I/O objects have NULL CPU and node sets.
 			  */
  HWLOC_OBJ_OS_DEVICE,	/**< \brief Operating system device (filtered out by default).
-			  * They are not added to the topology unless I/O discovery
-			  * is enabled with hwloc_topology_set_flags().
+			  *
+			  * OS devices are not added to the topology unless their
+			  * filtering is changed (see hwloc_topology_set_type_filter()
+			  * and hwloc_topology_set_io_types_filter()).
+			  *
 			  * I/O objects are not listed in the main children list,
 			  * but rather in the dedicated io children list.
 			  * I/O objects have NULL CPU and node sets.
@@ -282,6 +291,10 @@ typedef enum {
 			  * Objects without particular meaning, that can e.g. be
 			  * added by the application for its own use, or by hwloc
 			  * for miscellaneous objects such as MemoryModule (DIMMs).
+			  *
+			  * They are not added to the topology unless their filtering
+			  * is changed (see hwloc_topology_set_type_filter()).
+			  *
 			  * These objects are not listed in the main children list,
 			  * but rather in the dedicated misc children list.
 			  * Misc objects may only have Misc objects as children,
@@ -304,7 +317,6 @@ typedef enum {

  HWLOC_OBJ_DIE,	/**< \brief Die within a physical package.
 			 * A subpart of the physical package, that contains multiple cores.
-			 * \hideinitializer
 			 */

  HWLOC_OBJ_TYPE_MAX    /**< \private Sentinel value */
@@ -338,8 +350,7 @@ typedef enum hwloc_obj_osdev_type_e {
  HWLOC_OBJ_OSDEV_DMA,		/**< \brief Operating system dma engine device.
 				  * For instance the "dma0chan0" DMA channel on Linux. */
  HWLOC_OBJ_OSDEV_COPROC	/**< \brief Operating system co-processor device.
-				  * For instance "mic0" for a Xeon Phi (MIC) on Linux,
-				  * "opencl0d0" for a OpenCL device,
+				  * For instance "opencl0d0" for an OpenCL device,
 				  * "cuda0" for a CUDA device. */
 } hwloc_obj_osdev_type_t;

@@ -512,7 +523,7 @@ struct hwloc_obj {
 					  *
                                          * \note Its value must not be changed, hwloc_bitmap_dup() must be used instead.
                                          */
-  hwloc_cpuset_t complete_cpuset;       /**< \brief The complete CPU set of logical processors of this object,
+  hwloc_cpuset_t complete_cpuset;       /**< \brief The complete CPU set of processors of this object,
                                          *
                                          * This may include not only the same as the cpuset field, but also some CPUs for
                                          * which topology information is unknown or incomplete, some offlines CPUs, and
@@ -533,6 +544,8 @@ struct hwloc_obj {
                                          * between this object and the NUMA node objects).
                                          *
                                          * In the end, these nodes are those that are close to the current object.
+                                          * Function hwloc_get_local_numanode_objs() may be used to list those NUMA
+                                          * nodes more precisely.
                                          *
                                          * If the ::HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED configuration flag is set,
                                          * some of these nodes may not be allowed for allocation,
@@ -1929,7 +1942,31 @@ enum hwloc_topology_flags_e {
   * would result in the same behavior.
   * \hideinitializer
   */
-  HWLOC_TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES = (1UL<<2)
+  HWLOC_TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES = (1UL<<2),
+
+  /** \brief Import support from the imported topology.
+   *
+   * When importing an XML topology from a remote machine, binding is
+   * disabled by default (see ::HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM).
+   * This disabling is also marked by putting zeroes in the corresponding
+   * supported feature bits reported by hwloc_topology_get_support().
+   *
+   * The flag ::HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT actually imports
+   * support bits from the remote machine. It also sets the flag
+   * \p imported_support in the struct hwloc_topology_misc_support array.
+   * If the imported XML did not contain any support information
+   * (exporter hwloc is too old), this flag is not set.
+   *
+   * Note that these supported features are only relevant for the hwloc
+   * installation that actually exported the XML topology
+   * (it may vary with the operating system, or with how hwloc was compiled).
+   *
+   * Note, however, that setting this flag does not enable binding for the
+   * locally imported hwloc topology; it only reports what the remote
+   * hwloc and machine support.
+   *
+   */
+  HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT = (1UL<<3)
 };

 /** \brief Set OR'ed flags to non-yet-loaded topology.
@@ -1972,6 +2009,8 @@ struct hwloc_topology_discovery_support {
  unsigned char disallowed_pu;
  /** \brief Detecting and identifying NUMA nodes that are not available to the current process is supported. */
  unsigned char disallowed_numa;
+  /** \brief Detecting the efficiency of CPU kinds is supported, see \ref hwlocality_cpukinds. */
+  unsigned char cpukind_efficiency;
 };

 /** \brief Flags describing actual PU binding support for this topology.
@@ -2042,6 +2081,13 @@ struct hwloc_topology_membind_support {
  unsigned char get_area_memlocation;
 };

+/** \brief Flags describing miscellaneous features.
+ */
+struct hwloc_topology_misc_support {
+  /** Support was imported when importing another topology, see ::HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT. */
+  unsigned char imported_support;
+};
+
 /** \brief Set of flags describing actual support for this topology.
 *
 * This is retrieved with hwloc_topology_get_support() and will be valid until
@@ -2052,6 +2098,7 @@ struct hwloc_topology_support {
  struct hwloc_topology_discovery_support *discovery;
  struct hwloc_topology_cpubind_support *cpubind;
  struct hwloc_topology_membind_support *membind;
+  struct hwloc_topology_misc_support *misc;
 };

 /** \brief Retrieve the topology support.
@@ -2062,6 +2109,18 @@ struct hwloc_topology_support {
 * call may still fail in some corner cases.
 *
 * These features are also listed by hwloc-info \--support
+ *
+ * The reported features are what the current topology supports
+ * on the current machine. If the topology was exported to XML
+ * from another machine and later imported here, support still
+ * describes what is supported for this imported topology after
+ * import. By default, binding will be reported as unsupported
+ * in this case (see ::HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM).
+ *
+ * Topology flag ::HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT may be used
+ * to report the supported features of the original remote machine
+ * instead. If it was successfully imported, \p imported_support
+ * will be set in the struct hwloc_topology_misc_support array.
 */
 HWLOC_DECLSPEC const struct hwloc_topology_support *hwloc_topology_get_support(hwloc_topology_t __hwloc_restrict topology);

@@ -2108,8 +2167,8 @@ enum hwloc_type_filter_e {
   *
   * It is only useful for I/O object types.
   * For ::HWLOC_OBJ_PCI_DEVICE and ::HWLOC_OBJ_OS_DEVICE, it means that only objects
-   * of major/common kinds are kept (storage, network, OpenFabrics, Intel MICs, CUDA,
-   * OpenCL, NVML, and displays).
+   * of major/common kinds are kept (storage, network, OpenFabrics, CUDA,
+   * OpenCL, RSMI, NVML, and displays).
   * Also, only OS devices directly attached on PCI (e.g. no USB) are reported.
   * For ::HWLOC_OBJ_BRIDGE, it means that bridges are kept only if they have children.
   *
@@ -2303,22 +2362,9 @@ HWLOC_DECLSPEC hwloc_obj_t hwloc_topology_insert_misc_object(hwloc_topology_t to
 /** \brief Allocate a Group object to insert later with hwloc_topology_insert_group_object().
 *
 * This function returns a new Group object.
- * The caller should (at least) initialize its sets before inserting the object.
- * See hwloc_topology_insert_group_object().
 *
- * The \p subtype object attribute may be set to display something else
- * than "Group" as the type name for this object in lstopo.
- * Custom name/value info pairs may be added with hwloc_obj_add_info() after
- * insertion.
- *
- * The \p kind group attribute should be 0. The \p subkind group attribute may
- * be set to identify multiple Groups of the same level.
- *
- * It is recommended not to set any other object attribute before insertion,
- * since the Group may get discarded during insertion.
- *
- * The object will be destroyed if passed to hwloc_topology_insert_group_object()
- * without any set defined.
+ * The caller should (at least) initialize its sets before inserting
+ * the object in the topology. See hwloc_topology_insert_group_object().
 */
 HWLOC_DECLSPEC hwloc_obj_t hwloc_topology_alloc_group_object(hwloc_topology_t topology);

@@ -2329,34 +2375,44 @@ HWLOC_DECLSPEC hwloc_obj_t hwloc_topology_alloc_group_object(hwloc_topology_t to
 * the final location of the Group in the topology.
 * Then the object can be passed to this function for actual insertion in the topology.
 *
- * The group \p dont_merge attribute may be set to prevent the core from
- * ever merging this object with another object hierarchically-identical.
- *
 * Either the cpuset or nodeset field (or both, if compatible) must be set
 * to a non-empty bitmap. The complete_cpuset or complete_nodeset may be set
 * instead if inserting with respect to the complete topology
 * (including disallowed, offline or unknown objects).
- *
- * It grouping several objects, hwloc_obj_add_other_obj_sets() is an easy way
+ * If grouping several objects, hwloc_obj_add_other_obj_sets() is an easy way
 * to build the Group sets iteratively.
- *
 * These sets cannot be larger than the current topology, or they would get
 * restricted silently.
- *
 * The core will setup the other sets after actual insertion.
 *
+ * The \p subtype object attribute may be defined (to a dynamically
+ * allocated string) to display something other than "Group" as the
+ * type name for this object in lstopo.
+ * Custom name/value info pairs may be added with hwloc_obj_add_info() after
+ * insertion.
+ *
+ * The group \p dont_merge attribute may be set to \c 1 to prevent
+ * the hwloc core from ever merging this object with another
+ * hierarchically-identical object.
+ * This is useful when the Group itself describes an important feature
+ * that cannot be exposed anywhere else in the hierarchy.
+ *
+ * The group \p kind attribute may be set to a high value such
+ * as \c 0xffffffff to tell hwloc that this new Group should always
+ * be discarded in favor of any existing Group with the same locality.
+ *
 * \return The inserted object if it was properly inserted.
 *
- * \return An existing object if the Group was discarded because the topology already
- * contained an object at the same location (the Group did not add any locality information).
- * Any name/info key pair set before inserting is appended to the existing object.
+ * \return An existing object if the Group was merged or discarded
+ * because the topology already contained an object at the same
+ * location (the Group did not add any hierarchy information).
 *
 * \return \c NULL if the insertion failed because of conflicting sets in topology tree.
 *
 * \return \c NULL if Group objects are filtered-out of the topology (::HWLOC_TYPE_FILTER_KEEP_NONE).
 *
- * \return \c NULL if the object was discarded because no set was initialized in the Group
- * before insert, or all of them were empty.
+ * \return \c NULL if the object was discarded because no set was
+ * initialized in the Group before insert, or all of them were empty.
 */
 HWLOC_DECLSPEC hwloc_obj_t hwloc_topology_insert_group_object(hwloc_topology_t topology, hwloc_obj_t group);

@@ -2371,6 +2427,22 @@ HWLOC_DECLSPEC hwloc_obj_t hwloc_topology_insert_group_object(hwloc_topology_t t
 */
 HWLOC_DECLSPEC int hwloc_obj_add_other_obj_sets(hwloc_obj_t dst, hwloc_obj_t src);

+/** \brief Refresh internal structures after topology modification.
+ *
+ * Modifying the topology (by restricting, adding objects, modifying structures
+ * such as distances or memory attributes, etc.) may cause some internal caches
+ * to become invalid. These caches are automatically refreshed when accessed
+ * but this refreshing is not thread-safe.
+ *
+ * This function is not thread-safe either, but it is a good way to end a
+ * non-thread-safe phase of topology modification. Once this refresh is done,
+ * multiple threads may concurrently consult the topology, objects, distances,
+ * attributes, etc.
+ *
+ * See also \ref threadsafety
+ */
+HWLOC_DECLSPEC int hwloc_topology_refresh(hwloc_topology_t topology);
+
 /** @} */


@@ -2386,6 +2458,12 @@ HWLOC_DECLSPEC int hwloc_obj_add_other_obj_sets(hwloc_obj_t dst, hwloc_obj_t src
 /* inline code of some functions above */
 #include "hwloc/inlines.h"

+/* memory attributes */
+#include "hwloc/memattrs.h"
+
+/* kinds of CPU cores */
+#include "hwloc/cpukinds.h"
+
 /* exporting to XML or synthetic */
 #include "hwloc/export.h"

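
Review note on the `HWLOC_API_VERSION` change in this file. Callers that link hwloc dynamically sometimes compare the build-time constant with the value returned by `hwloc_get_api_version()` to catch a header/library mismatch. The sketch below shows one coarse way to do that, assuming the major version sits in bits 16-23 and the minor in bits 8-15 (consistent with 0x00020400 for the 2.4 series). The `api_*` helpers and the `*_API_VERSION` macros are hypothetical stand-ins so the example compiles without hwloc; real code would include `hwloc.h` and pass `HWLOC_API_VERSION` and `hwloc_get_api_version()`.

```c
#include <assert.h>
#include <stdio.h>

/* Values copied from the hunk above; stand-ins for the real hwloc macros. */
#define OLD_API_VERSION 0x00020100u
#define NEW_API_VERSION 0x00020400u

/* Assumed layout: major in bits 16-23, minor in bits 8-15. */
static unsigned api_major(unsigned v) { return (v >> 16) & 0xffu; }
static unsigned api_minor(unsigned v) { return (v >> 8) & 0xffu; }

/* Coarse check: returns 1 when both values belong to the same major series. */
static int api_compatible(unsigned build_time, unsigned run_time)
{
    return api_major(build_time) == api_major(run_time);
}

#ifdef API_DEMO_MAIN
int main(void)
{
    printf("old API %u.%u, new API %u.%u\n",
           api_major(OLD_API_VERSION), api_minor(OLD_API_VERSION),
           api_major(NEW_API_VERSION), api_minor(NEW_API_VERSION));
    /* In real code the second argument would be hwloc_get_api_version(). */
    assert(api_compatible(NEW_API_VERSION, OLD_API_VERSION));
    return 0;
}
#endif
```

Comparing only the major number is a deliberate simplification: it flags a 1.x library loaded under 2.x headers but says nothing about features added within a series, such as the `misc` support array introduced here.
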
--- a/src/3rdparty/hwloc/include/hwloc/autogen/config.h
+++ b/src/3rdparty/hwloc/include/hwloc/autogen/config.h
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2019 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2012 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -11,10 +11,10 @@
 #ifndef HWLOC_CONFIG_H
 #define HWLOC_CONFIG_H

-#define HWLOC_VERSION "2.2.0"
+#define HWLOC_VERSION "2.4.1"
 #define HWLOC_VERSION_MAJOR 2
-#define HWLOC_VERSION_MINOR 2
-#define HWLOC_VERSION_RELEASE 0
+#define HWLOC_VERSION_MINOR 4
+#define HWLOC_VERSION_RELEASE 1
 #define HWLOC_VERSION_GREEK ""

 #define __hwloc_restrict
--- a/src/3rdparty/hwloc/include/hwloc/bitmap.h
+++ b/src/3rdparty/hwloc/include/hwloc/bitmap.h
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2018 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2012 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -231,7 +231,7 @@ HWLOC_DECLSPEC int hwloc_bitmap_clr_range(hwloc_bitmap_t bitmap, unsigned begin,
 /** \brief Keep a single index among those set in bitmap \p bitmap
 *
 * May be useful before binding so that the process does not
- * have a chance of migrating between multiple logical CPUs
+ * have a chance of migrating between multiple processors
 * in the original mask.
 * Instead of running the task on any PU inside the given CPU set,
 * the operating system scheduler will be forced to run it on a single
--- a/src/3rdparty/hwloc/include/hwloc/cpukinds.h
+++ b/src/3rdparty/hwloc/include/hwloc/cpukinds.h
@@ -0,0 +1,188 @@
+/*
+ * Copyright © 2020 Inria.  All rights reserved.
+ * See COPYING in top-level directory.
+ */
+
+/** \file
+ * \brief Kinds of CPU cores.
+ */
+
+#ifndef HWLOC_CPUKINDS_H
+#define HWLOC_CPUKINDS_H
+
+#include "hwloc.h"
+
+#ifdef __cplusplus
+extern "C" {
+#elif 0
+}
+#endif
+
+/** \defgroup hwlocality_cpukinds Kinds of CPU cores
+ *
+ * Platforms with heterogeneous CPUs may have some cores with
+ * different features or frequencies.
+ * This API exposes identical PUs in sets called CPU kinds.
+ * Each PU of the topology may only be in a single kind.
+ *
+ * The number of kinds may be obtained with hwloc_cpukinds_get_nr().
+ * If the platform is homogeneous, there may be a single kind
+ * with all PUs.
+ * If the platform or operating system does not expose any
+ * information about CPU cores, there may be no kind at all.
+ *
+ * The index of the kind that describes a given CPU set
+ * (if any, and not partially)
+ * may be obtained with hwloc_cpukinds_get_by_cpuset().
+ *
+ * From the index of a kind, it is possible to retrieve information
+ * with hwloc_cpukinds_get_info():
+ * an abstracted efficiency value,
+ * and an array of info attributes
+ * (for instance the "CoreType" and "FrequencyMaxMHz",
+ *  see \ref topoattrs_cpukinds).
+ *
+ * A higher efficiency value means greater intrinsic performance
+ * (and possibly less performance/power efficiency).
+ * Kinds with lower efficiency are ranked first:
+ * passing 0 as \p kind_index to hwloc_cpukinds_get_info() will
+ * return information about the least efficient CPU kind.
+ *
+ * When available, efficiency values are gathered from the operating
+ * system (when \p cpukind_efficiency is set in the
+ * struct hwloc_topology_discovery_support array, only on Windows 10 for now).
+ * Otherwise hwloc tries to compute efficiencies
+ * by comparing CPU kinds using frequencies (on ARM),
+ * or core types and frequencies (on other architectures).
+ * The environment variable HWLOC_CPUKINDS_RANKING may be used
+ * to change these heuristics, see \ref envvar.
+ *
+ * If hwloc fails to rank any kind, for instance because the operating
+ * system does not expose efficiencies and core frequencies,
+ * all kinds will have an unknown efficiency (\c -1),
+ * and they are not indexed/ordered in any specific way.
+ *
+ * @{
+ */
+
+/** \brief Get the number of different kinds of CPU cores in the topology.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * \return The number of CPU kinds (positive integer) on success.
+ * \return \c 0 if no information about kinds was found.
+ * \return \c -1 with \p errno set to \c EINVAL if \p flags is invalid.
+ */
+HWLOC_DECLSPEC int
+hwloc_cpukinds_get_nr(hwloc_topology_t topology,
+                      unsigned long flags);
+
+/** \brief Get the index of the CPU kind that contains CPUs listed in \p cpuset.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * \return The index of the CPU kind (positive integer or 0) on success.
+ * \return \c -1 with \p errno set to \c EXDEV if \p cpuset is
+ * only partially included in some kind.
+ * \return \c -1 with \p errno set to \c ENOENT if \p cpuset is
+ * not included in any kind, even partially.
+ * \return \c -1 with \p errno set to \c EINVAL if parameters are invalid.
+ */
+HWLOC_DECLSPEC int
+hwloc_cpukinds_get_by_cpuset(hwloc_topology_t topology,
+                             hwloc_const_bitmap_t cpuset,
+                             unsigned long flags);
+
+/** \brief Get the CPU set and infos about a CPU kind in the topology.
+ *
+ * \p kind_index identifies one kind of CPU between 0 and the number
+ * of kinds returned by hwloc_cpukinds_get_nr() minus 1.
+ *
+ * If not \c NULL, the bitmap \p cpuset will be filled with
+ * the set of PUs of this kind.
+ *
+ * The integer pointed to by \p efficiency, if not \c NULL, will be filled
+ * with the ranking of this kind of CPU in terms of efficiency (see above).
+ * It ranges from \c 0 to the number of kinds
+ * (as reported by hwloc_cpukinds_get_nr()) minus 1.
+ *
+ * Kinds with lower efficiency are reported first.
+ *
+ * If there is a single kind in the topology, its efficiency is \c 0.
+ * If the efficiency of some kinds of cores is unknown,
+ * the efficiency of all kinds is set to \c -1,
+ * and kinds are reported in no specific order.
+ *
+ * The array of info attributes (for instance the "CoreType",
+ * "FrequencyMaxMHz" or "FrequencyBaseMHz", see \ref topoattrs_cpukinds)
+ * and its length are returned in \p infos and \p nr_infos.
+ * The array belongs to the topology, it should not be freed or modified.
+ *
+ * If \p nr_infos or \p infos is \c NULL, no info is returned.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * \return \c 0 on success.
+ * \return \c -1 with \p errno set to \c ENOENT if \p kind_index does not match any CPU kind.
+ * \return \c -1 with \p errno set to \c EINVAL if parameters are invalid.
+ */
+HWLOC_DECLSPEC int
+hwloc_cpukinds_get_info(hwloc_topology_t topology,
+                        unsigned kind_index,
+                        hwloc_bitmap_t cpuset,
+                        int *efficiency,
+                        unsigned *nr_infos, struct hwloc_info_s **infos,
+                        unsigned long flags);
+
+/** \brief Register a kind of CPU in the topology.
+ *
+ * Mark the PUs listed in \p cpuset as being of the same kind
+ * with respect to the given attributes.
+ *
+ * \p forced_efficiency should be \c -1 if unknown.
+ * Otherwise it is an abstracted efficiency value to enforce
+ * the ranking of all kinds if all of them have valid (and
+ * different) efficiencies.
+ *
+ * The array \p infos of size \p nr_infos may be used to provide
+ * info names and values describing this kind of PUs.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * Parameters \p cpuset and \p infos will be duplicated internally,
+ * the caller is responsible for freeing them.
+ *
+ * If \p cpuset overlaps with some existing kinds, those might get
+ * modified or split. For instance if existing kind A contains
+ * PUs 0 and 1, and one registers another kind for PU 1 and 2,
+ * there will be 3 resulting kinds:
+ * existing kind A is restricted to only PU 0;
+ * new kind B contains only PU 1 and combines information from A
+ * and from the newly-registered kind;
+ * new kind C contains only PU 2 and only gets information from
+ * the newly-registered kind.
+ *
+ * \note The efficiency \p forced_efficiency provided to this function
+ * may be different from the one reported later by hwloc_cpukinds_get_info()
+ * because hwloc will scale efficiency values down to
+ * between 0 and the number of kinds minus 1.
+ *
+ * \return \c 0 on success.
+ * \return \c -1 with \p errno set to \c EINVAL if some parameters are invalid,
+ * for instance if \p cpuset is \c NULL or empty.
+ */
+HWLOC_DECLSPEC int
+hwloc_cpukinds_register(hwloc_topology_t topology,
+                        hwloc_bitmap_t cpuset,
+                        int forced_efficiency,
+                        unsigned nr_infos, struct hwloc_info_s *infos,
+                        unsigned long flags);
+
+/** @} */
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+
+#endif /* HWLOC_CPUKINDS_H */
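
The overlap rule documented for hwloc_cpukinds_register() (existing kind A with PUs 0 and 1, then a registration for PUs 1 and 2, giving three kinds) can be modelled with plain bitmasks. This is only a sketch of the documented outcome, not hwloc's implementation: `struct kind_model` and `kind_model_register()` are hypothetical names, PUs are bits in an `unsigned long`, and "information" is reduced to a bitmask of which registrations contributed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a CPU kind (not an hwloc type). */
struct kind_model {
    unsigned long pus;      /* bit i set => PU i belongs to this kind */
    unsigned long sources;  /* bit j set => registration j contributed info */
};

/* Register the PUs in `pus` as coming from registration number `source`.
 * Kinds that partially overlap are split so that every PU stays in exactly
 * one kind. Returns the new number of kinds, or (size_t)-1 if `pus` is empty
 * or `kinds` has no room left (the array may then be partially updated). */
size_t kind_model_register(struct kind_model *kinds, size_t nr, size_t max,
                           unsigned long pus, unsigned source)
{
    size_t i, old_nr = nr;

    if (!pus)
        return (size_t)-1;

    for (i = 0; i < old_nr && pus; i++) {
        unsigned long common = kinds[i].pus & pus;
        if (!common)
            continue;
        if (common != kinds[i].pus) {
            /* Partial overlap: the shared PUs move to a new kind that
             * combines the old information with the new registration. */
            if (nr == max)
                return (size_t)-1;
            kinds[i].pus &= ~common;
            kinds[nr].pus = common;
            kinds[nr].sources = kinds[i].sources | (1UL << source);
            nr++;
        } else {
            /* The existing kind is fully covered: just add the new info. */
            kinds[i].sources |= 1UL << source;
        }
        pus &= ~common;
    }

    if (pus) {
        /* PUs not covered by any existing kind form a brand new kind. */
        if (nr == max)
            return (size_t)-1;
        kinds[nr].pus = pus;
        kinds[nr].sources = 1UL << source;
        nr++;
    }
    return nr;
}
```

Registering mask `0x3` as source 0 and then mask `0x6` as source 1 leaves PU 0 alone in the first kind, PU 1 in a kind carrying both sources, and PU 2 in a kind carrying only the second source, which mirrors kinds A, B and C in the comment above.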
--- a/src/3rdparty/hwloc/include/hwloc/cuda.h
+++ b/src/3rdparty/hwloc/include/hwloc/cuda.h
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2010-2017 Inria.  All rights reserved.
+ * Copyright © 2010-2020 Inria.  All rights reserved.
 * Copyright © 2010-2011 Université Bordeaux
 * Copyright © 2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -72,7 +72,7 @@ hwloc_cuda_get_device_pci_ids(hwloc_topology_t topology __hwloc_attribute_unused
  return 0;
 }

-/** \brief Get the CPU set of logical processors that are physically
+/** \brief Get the CPU set of processors that are physically
 * close to device \p cudevice.
 *
 * Return the CPU set describing the locality of the CUDA device \p cudevice.
--- a/src/3rdparty/hwloc/include/hwloc/cudart.h
+++ b/src/3rdparty/hwloc/include/hwloc/cudart.h
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2010-2017 Inria.  All rights reserved.
+ * Copyright © 2010-2020 Inria.  All rights reserved.
 * Copyright © 2010-2011 Université Bordeaux
 * Copyright © 2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -69,7 +69,7 @@ hwloc_cudart_get_device_pci_ids(hwloc_topology_t topology __hwloc_attribute_unus
  return 0;
 }

-/** \brief Get the CPU set of logical processors that are physically
+/** \brief Get the CPU set of processors that are physically
 * close to device \p idx.
 *
 * Return the CPU set describing the locality of the CUDA device
--- a/src/3rdparty/hwloc/include/hwloc/diff.h
+++ b/src/3rdparty/hwloc/include/hwloc/diff.h
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2013-2018 Inria.  All rights reserved.
+ * Copyright © 2013-2020 Inria.  All rights reserved.
 * See COPYING in top-level directory.
 */

@@ -110,7 +110,7 @@ union hwloc_topology_diff_obj_attr_u {
 */
 typedef enum hwloc_topology_diff_type_e {
  /** \brief An object attribute was changed.
-   * The union is a hwloc_topology_diff_obj_attr_u::hwloc_topology_diff_obj_attr_s.
+   * The union is a hwloc_topology_diff_u::hwloc_topology_diff_obj_attr_s.
   */
  HWLOC_TOPOLOGY_DIFF_OBJ_ATTR,

@@ -119,7 +119,7 @@ typedef enum hwloc_topology_diff_type_e {
   * this object has not been checked.
   * hwloc_topology_diff_build() will return 1.
   *
-   * The union is a hwloc_topology_diff_obj_attr_u::hwloc_topology_diff_too_complex_s.
+   * The union is a hwloc_topology_diff_u::hwloc_topology_diff_too_complex_s.
   */
  HWLOC_TOPOLOGY_DIFF_TOO_COMPLEX
 } hwloc_topology_diff_type_t;
--- a/src/3rdparty/hwloc/include/hwloc/distances.h
+++ b/src/3rdparty/hwloc/include/hwloc/distances.h
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2010-2019 Inria.  All rights reserved.
+ * Copyright © 2010-2020 Inria.  All rights reserved.
 * See COPYING in top-level directory.
 */

@@ -34,6 +34,7 @@ extern "C" {
 * It corresponds to the latency for accessing the memory of one node
 * from a core in another node.
 * The corresponding kind is ::HWLOC_DISTANCES_KIND_FROM_OS | ::HWLOC_DISTANCES_KIND_FROM_USER.
+ * The name of this distances structure is "NUMALatency".
 *
 * The matrix may also contain bandwidths between random sets of objects,
 * possibly provided by the user, as specified in the \p kind attribute.
@@ -144,6 +145,8 @@ hwloc_distances_get_by_type(hwloc_topology_t topology, hwloc_obj_type_t type,
 /** \brief Retrieve a distance matrix with the given name.
 *
 * Usually only one distances structure may match a given name.
+ *
+ * The name of the most common structure is "NUMALatency".
 */
 HWLOC_DECLSPEC int
 hwloc_distances_get_by_name(hwloc_topology_t topology, const char *name,
--- a/src/3rdparty/hwloc/include/hwloc/glibc-sched.h
+++ b/src/3rdparty/hwloc/include/hwloc/glibc-sched.h
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2013 inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2011 Université Bordeaux
 * Copyright © 2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -22,7 +22,7 @@

 #include <assert.h>

-#if !defined _GNU_SOURCE || !defined _SCHED_H || (!defined CPU_SETSIZE && !defined sched_priority)
+#if !defined _GNU_SOURCE || (!defined _SCHED_H && !defined _SCHED_H_) || (!defined CPU_SETSIZE && !defined sched_priority)
 #error Please make sure to include sched.h before including glibc-sched.h, and define _GNU_SOURCE before any inclusion of sched.h
 #endif

--- a/src/3rdparty/hwloc/include/hwloc/helper.h
+++ b/src/3rdparty/hwloc/include/hwloc/helper.h
@@ -872,8 +872,8 @@ hwloc_distrib(hwloc_topology_t topology,
    unsigned chunk, weight;
    hwloc_obj_t root = roots[flags & HWLOC_DISTRIB_FLAG_REVERSE ? n_roots-1-i : i];
    hwloc_cpuset_t cpuset = root->cpuset;
-    if (root->type == HWLOC_OBJ_NUMANODE)
-      /* NUMANodes have same cpuset as their parent, but we need normal objects below */
+    while (!hwloc_obj_type_is_normal(root->type))
+      /* If memory/io/misc, walk up to normal parent */
      root = root->parent;
    weight = (unsigned) hwloc_bitmap_weight(cpuset);
    if (!weight)
@@ -919,7 +919,7 @@ hwloc_distrib(hwloc_topology_t topology,

 /** \brief Get complete CPU set
 *
- * \return the complete CPU set of logical processors of the system.
+ * \return the complete CPU set of processors of the system.
 *
 * \note The returned cpuset is not newly allocated and should thus not be
 * changed or freed; hwloc_bitmap_dup() must be used to obtain a local copy.
@@ -931,7 +931,7 @@ hwloc_topology_get_complete_cpuset(hwloc_topology_t topology) __hwloc_attribute_

 /** \brief Get topology CPU set
 *
- * \return the CPU set of logical processors of the system for which hwloc
+ * \return the CPU set of processors of the system for which hwloc
 * provides topology information. This is equivalent to the cpuset of the
 * system object.
 *
@@ -945,7 +945,7 @@ hwloc_topology_get_topology_cpuset(hwloc_topology_t topology) __hwloc_attribute_

 /** \brief Get allowed CPU set
 *
- * \return the CPU set of allowed logical processors of the system.
+ * \return the CPU set of allowed processors of the system.
 *
 * \note If the topology flag ::HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED was not set,
 * this is identical to hwloc_topology_get_topology_cpuset(), which means
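
The hwloc_distrib() hunk above replaces a single step up from a NUMA node with a loop that climbs until a normal object is reached. A minimal model of that walk, assuming a simplified object with only a class and a parent pointer (`struct obj_model`, `enum obj_class` and `first_normal_ancestor()` are hypothetical, not hwloc types or calls):

```c
#include <stddef.h>

/* Simplified object classes, loosely following the normal/memory/io/misc
 * split mentioned in the patched comment. */
enum obj_class { OBJ_NORMAL, OBJ_MEMORY, OBJ_IO, OBJ_MISC };

struct obj_model {
    enum obj_class cls;
    struct obj_model *parent;
};

/* Climb the parent chain until a normal object is found.
 * Returns NULL if the chain ends without one. */
struct obj_model *first_normal_ancestor(struct obj_model *obj)
{
    while (obj && obj->cls != OBJ_NORMAL)
        obj = obj->parent;
    return obj;
}
```

With a chain such as NUMA node, then another memory object, then a package, one step up would stop on the intermediate memory object, while the loop reaches the package. The model adds a NULL check that the patched code does not have.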
--- a/src/3rdparty/hwloc/include/hwloc/memattrs.h
+++ b/src/3rdparty/hwloc/include/hwloc/memattrs.h
@@ -0,0 +1,455 @@
+/*
+ * Copyright © 2019-2020 Inria.  All rights reserved.
+ * See COPYING in top-level directory.
+ */
+
+/** \file
+ * \brief Memory node attributes.
+ */
+
+#ifndef HWLOC_MEMATTR_H
+#define HWLOC_MEMATTR_H
+
+#include "hwloc.h"
+
+#ifdef __cplusplus
+extern "C" {
+#elif 0
+}
+#endif
+
+/** \defgroup hwlocality_memattrs Comparing memory node attributes for finding where to allocate on
+ *
+ * Platforms with heterogeneous memory require ways to decide whether
+ * a buffer should be allocated on "fast" memory (such as HBM),
+ * "normal" memory (DDR) or even "slow" but large-capacity memory
+ * (non-volatile memory).
+ * These memory nodes are called "Targets" while the CPU accessing them
+ * is called the "Initiator". Access performance depends on their
+ * locality (NUMA platforms) as well as the intrinsic performance
+ * of the targets (heterogeneous platforms).
+ *
+ * The following attributes describe the performance of memory accesses
+ * from an Initiator to a memory Target, for instance their latency
+ * or bandwidth.
+ * Initiators performing these memory accesses are usually some PUs or Cores
+ * (described as a CPU set).
+ * Hence a Core may choose where to allocate a memory buffer by comparing
+ * the attributes of different target memory nodes nearby.
+ *
+ * There are also some attributes that are system-wide.
+ * Their value does not depend on a specific initiator performing
+ * an access.
+ * The memory node Capacity is an example of such attribute without
+ * initiator.
+ *
+ * One way to use this API is to start with a cpuset describing the Cores where
+ * a program is bound. The best target NUMA node for allocating memory in this
+ * program on these Cores may be obtained by passing this cpuset as an initiator
+ * to hwloc_memattr_get_best_target() with the relevant memory attribute.
+ * For instance, if the code is latency limited, use the Latency attribute.
+ *
+ * A more flexible approach consists in getting the list of local NUMA nodes
+ * by passing this cpuset to hwloc_get_local_numanode_objs().
+ * Attribute values for these nodes, if any, may then be obtained with
+ * hwloc_memattr_get_value() and manually compared with the desired criteria.
+ *
+ * \note The API also supports specific objects as initiator,
+ * but it is currently not used internally by hwloc.
+ * Users may for instance use it to provide custom performance
+ * values for host memory accesses performed by GPUs.
+ *
+ * \note The interface actually also accepts targets that are not NUMA nodes.
+ * @{
+ */
+
+/** \brief Memory node attributes. */
+enum hwloc_memattr_id_e {
+  /** \brief "Capacity".
+   * The capacity is returned in bytes
+   * (local_memory attribute in objects).
+   *
+   * Best capacity nodes are nodes with <b>higher capacity</b>.
+   *
+   * No initiator is involved when looking at this attribute.
+   * The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST.
+   */
+  HWLOC_MEMATTR_ID_CAPACITY = 0,
+
+  /** \brief "Locality".
+   * The locality is returned as the number of PUs in that locality
+   * (e.g. the weight of its cpuset).
+   *
+   * Best locality nodes are nodes with <b>smaller locality</b>
+   * (nodes that are local to very few PUs).
+   * Poor locality nodes are nodes with larger locality
+   * (nodes that are local to the entire machine).
+   *
+   * No initiator is involved when looking at this attribute.
+   * The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_LOWER_FIRST.
+   */
+  HWLOC_MEMATTR_ID_LOCALITY = 1,
+
+  /** \brief "Bandwidth".
+   * The bandwidth is returned in MiB/s, as seen from the given initiator location.
+   * Best bandwidth nodes are nodes with <b>higher bandwidth</b>.
+   * The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST
+   * and ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
+   */
+  HWLOC_MEMATTR_ID_BANDWIDTH = 2,
+
+  /** \brief "Latency".
+   * The latency is returned as nanoseconds, as seen from the given initiator location.
+   * Best latency nodes are nodes with <b>smaller latency</b>.
+   * The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_LOWER_FIRST
+   * and ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR.
+   */
+  HWLOC_MEMATTR_ID_LATENCY = 3
+
+  /* TODO read vs write, persistence? */
+};
+
+/** \brief A memory attribute identifier.
+ * May be either one of ::hwloc_memattr_id_e or a new id returned by hwloc_memattr_register().
+ */
+typedef unsigned hwloc_memattr_id_t;
+
+/** \brief Return the identifier of the memory attribute with the given name.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_by_name(hwloc_topology_t topology,
+                          const char *name,
+                          hwloc_memattr_id_t *id);
+
+
+/** \brief Type of location. */
+enum hwloc_location_type_e {
+  /** \brief Location is given as a cpuset, in the location cpuset union field. \hideinitializer */
+  HWLOC_LOCATION_TYPE_CPUSET = 1,
+  /** \brief Location is given as an object, in the location object union field. \hideinitializer */
+  HWLOC_LOCATION_TYPE_OBJECT = 0
+};
+
+/** \brief Where to measure attributes from. */
+struct hwloc_location {
+  /** \brief Type of location. */
+  enum hwloc_location_type_e type;
+  /** \brief Actual location. */
+  union hwloc_location_u {
+    /** \brief Location as a cpuset, when the location type is ::HWLOC_LOCATION_TYPE_CPUSET. */
+    hwloc_cpuset_t cpuset;
+    /** \brief Location as an object, when the location type is ::HWLOC_LOCATION_TYPE_OBJECT. */
+    hwloc_obj_t object;
+  } location;
+};
+
+
+/** \brief Flags for selecting target NUMA nodes. */
+enum hwloc_local_numanode_flag_e {
+  /** \brief Select NUMA nodes whose locality is larger than the given cpuset.
+   * For instance, if a single PU (or its cpuset) is given in \p initiator,
+   * select all nodes close to the package that contains this PU.
+   * \hideinitializer
+   */
+  HWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY = (1UL<<0),
+
+  /** \brief Select NUMA nodes whose locality is smaller than the given cpuset.
+   * For instance, if a package (or its cpuset) is given in \p initiator,
+   * also select nodes that are attached to only a half of that package.
+   * \hideinitializer
+   */
+  HWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY = (1UL<<1),
+
+  /** \brief Select all NUMA nodes in the topology.
+   * The initiator \p initiator is ignored.
+   * \hideinitializer
+   */
+  HWLOC_LOCAL_NUMANODE_FLAG_ALL = (1UL<<2)
+};
+
+/** \brief Return an array of local NUMA nodes.
+ *
+ * By default only select the NUMA nodes whose locality is exactly
+ * the given \p location. More nodes may be selected if additional flags
+ * are given as an OR'ed set of ::hwloc_local_numanode_flag_e.
+ *
+ * If \p location is given as an explicit object, its CPU set is used
+ * to find NUMA nodes with the corresponding locality.
+ * If the object does not have a CPU set (e.g. I/O object), the CPU
+ * parent (where the I/O object is attached) is used.
+ *
+ * On input, \p nr points to the number of nodes that may be stored
+ * in the \p nodes array.
+ * On output, \p nr will be changed to the number of stored nodes,
+ * or the number of nodes that would have been stored if there were
+ * enough room.
+ *
+ * \note Some of these NUMA nodes may not have any memory attribute
+ * values and hence not be reported as actual targets in other functions.
+ *
+ * \note The number of NUMA nodes in the topology (obtained by
+ * hwloc_bitmap_weight() on the root object nodeset) may be used
+ * to allocate the \p nodes array.
+ *
+ * \note When an object CPU set is given as locality, for instance a Package,
+ * and when flags contain both ::HWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY
+ * and ::HWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY,
+ * the returned array corresponds to the nodeset of that object.
+ */
+HWLOC_DECLSPEC int
+hwloc_get_local_numanode_objs(hwloc_topology_t topology,
+                              struct hwloc_location *location,
+                              unsigned *nr,
+                              hwloc_obj_t *nodes,
+                              unsigned long flags);
+
+
+
+/** \brief Return an attribute value for a specific target NUMA node.
+ *
+ * If the attribute does not relate to a specific initiator
+ * (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
+ * location \p initiator is ignored and may be \c NULL.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * \note The initiator \p initiator should be of type ::HWLOC_LOCATION_TYPE_CPUSET
+ * when referring to accesses performed by CPU cores.
+ * ::HWLOC_LOCATION_TYPE_OBJECT is currently unused internally by hwloc,
+ * but users may for instance use it to provide custom information about
+ * host memory accesses performed by GPUs.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_value(hwloc_topology_t topology,
+                        hwloc_memattr_id_t attribute,
+                        hwloc_obj_t target_node,
+                        struct hwloc_location *initiator,
+                        unsigned long flags,
+                        hwloc_uint64_t *value);
+
+/** \brief Return the best target NUMA node for the given attribute and initiator.
+ *
+ * If the attribute does not relate to a specific initiator
+ * (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
+ * location \p initiator is ignored and may be \c NULL.
+ *
+ * If \p value is non \c NULL, the corresponding value is returned there.
+ *
+ * If multiple targets have the same attribute values, only one is
+ * returned (and there is no way to clarify how that one is chosen).
+ * Applications that want to detect targets with identical/similar
+ * values, or that want to look at values for multiple attributes,
+ * should rather get all values using hwloc_memattr_get_value()
+ * and manually select the target they consider the best.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * If there are no matching targets, \c -1 is returned with \p errno set to \c ENOENT.
+ *
+ * \note The initiator \p initiator should be of type ::HWLOC_LOCATION_TYPE_CPUSET
+ * when referring to accesses performed by CPU cores.
+ * ::HWLOC_LOCATION_TYPE_OBJECT is currently unused internally by hwloc,
+ * but users may for instance use it to provide custom information about
+ * host memory accesses performed by GPUs.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_best_target(hwloc_topology_t topology,
+                              hwloc_memattr_id_t attribute,
+                              struct hwloc_location *initiator,
+                              unsigned long flags,
+                              hwloc_obj_t *best_target, hwloc_uint64_t *value);
+
+/** \brief Return the best initiator for the given attribute and target NUMA node.
+ *
+ * If the attribute does not relate to a specific initiator
+ * (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
+ * \c -1 is returned and \p errno is set to \c EINVAL.
+ *
+ * If \p value is non \c NULL, the corresponding value is returned there.
+ *
+ * If multiple initiators have the same attribute values, only one is
+ * returned (and there is no way to clarify how that one is chosen).
+ * Applications that want to detect initiators with identical/similar
+ * values, or that want to look at values for multiple attributes,
+ * should rather get all values using hwloc_memattr_get_value()
+ * and manually select the initiator they consider the best.
+ *
+ * The returned initiator should not be modified or freed,
+ * it belongs to the topology.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * If there are no matching initiators, \c -1 is returned with \p errno set to \c ENOENT.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_best_initiator(hwloc_topology_t topology,
+                                 hwloc_memattr_id_t attribute,
+                                 hwloc_obj_t target,
+                                 unsigned long flags,
+                                 struct hwloc_location *best_initiator, hwloc_uint64_t *value);
+
+/** @} */
+
+
+/** \defgroup hwlocality_memattrs_manage Managing memory attributes
+ * @{
+ */
+
+/** \brief Return the name of a memory attribute.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_name(hwloc_topology_t topology,
+                       hwloc_memattr_id_t attribute,
+                       const char **name);
+
+/** \brief Return the flags of the given attribute.
+ *
+ * Flags are an OR'ed set of ::hwloc_memattr_flag_e.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_flags(hwloc_topology_t topology,
+                        hwloc_memattr_id_t attribute,
+                        unsigned long *flags);
+
+/** \brief Memory attribute flags.
+ * Given to hwloc_memattr_register() and returned by hwloc_memattr_get_flags().
+ */
+enum hwloc_memattr_flag_e {
+  /** \brief The best nodes for this memory attribute are those with the higher values.
+   * For instance Bandwidth.
+   */
+  HWLOC_MEMATTR_FLAG_HIGHER_FIRST = (1UL<<0),
+  /** \brief The best nodes for this memory attribute are those with the lower values.
+   * For instance Latency.
+   */
+  HWLOC_MEMATTR_FLAG_LOWER_FIRST = (1UL<<1),
+  /** \brief The value returned for this memory attribute depends on the given initiator.
+   * For instance Bandwidth and Latency, but not Capacity.
+   */
+  HWLOC_MEMATTR_FLAG_NEED_INITIATOR = (1UL<<2)
+};
+
+/** \brief Register a new memory attribute.
+ *
+ * Add a specific memory attribute that is not defined in ::hwloc_memattr_id_e.
+ * Flags are an OR'ed set of ::hwloc_memattr_flag_e. They must contain at least
+ * one of ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST or ::HWLOC_MEMATTR_FLAG_LOWER_FIRST.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_register(hwloc_topology_t topology,
+                       const char *name,
+                       unsigned long flags,
+                       hwloc_memattr_id_t *id);
+
+/** \brief Set an attribute value for a specific target NUMA node.
+ *
+ * If the attribute does not relate to a specific initiator
+ * (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
+ * location \p initiator is ignored and may be \c NULL.
+ *
+ * The initiator will be copied into the topology,
+ * the caller should free anything allocated to store the initiator,
+ * for instance the cpuset.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * \note The initiator \p initiator should be of type ::HWLOC_LOCATION_TYPE_CPUSET
+ * when referring to accesses performed by CPU cores.
+ * ::HWLOC_LOCATION_TYPE_OBJECT is currently unused internally by hwloc,
+ * but users may for instance use it to provide custom information about
+ * host memory accesses performed by GPUs.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_set_value(hwloc_topology_t topology,
+                        hwloc_memattr_id_t attribute,
+                        hwloc_obj_t target_node,
+                        struct hwloc_location *initiator,
+                        unsigned long flags,
+                        hwloc_uint64_t value);
+
+/** \brief Return the target NUMA nodes that have some values for a given attribute.
+ *
+ * Return targets for the given attribute in the \p targets array
+ * (for the given initiator if any).
+ * If \p values is not \c NULL, the corresponding attribute values
+ * are stored in the array it points to.
+ *
+ * On input, \p nrp points to the number of targets that may be stored
+ * in the array \p targets (and \p values).
+ * On output, \p nrp points to the number of targets (and values) that
+ * were actually found, even if some of them couldn't be stored in the array.
+ * Targets that couldn't be stored are ignored, but the function still
+ * returns success (\c 0). The caller may find out by comparing the value pointed
+ * to by \p nrp before and after the function call.
+ *
+ * The returned targets should not be modified or freed,
+ * they belong to the topology.
+ *
+ * Argument \p initiator is ignored if the attribute does not relate to a specific
+ * initiator (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR).
+ * Otherwise \p initiator may be non \c NULL to report only targets
+ * that have a value for that initiator.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * \note This function is meant for tools and debugging (listing internal information)
+ * rather than for application queries. Applications should rather select useful
+ * NUMA nodes with hwloc_get_local_numanode_objs() and then look at their attribute
+ * values.
+ *
+ * \note The initiator \p initiator should be of type ::HWLOC_LOCATION_TYPE_CPUSET
+ * when referring to accesses performed by CPU cores.
+ * ::HWLOC_LOCATION_TYPE_OBJECT is currently unused internally by hwloc,
+ * but users may for instance use it to provide custom information about
+ * host memory accesses performed by GPUs.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_targets(hwloc_topology_t topology,
+                          hwloc_memattr_id_t attribute,
+                          struct hwloc_location *initiator,
+                          unsigned long flags,
+                          unsigned *nrp, hwloc_obj_t *targets, hwloc_uint64_t *values);
+
+/** \brief Return the initiators that have values for a given attribute for a specific target NUMA node.
+ *
+ * Return initiators for the given attribute and target node in the
+ * \p initiators array.
+ * If \p values is not \c NULL, the corresponding attribute values
+ * are stored in the array it points to.
+ *
+ * On input, \p nr points to the number of initiators that may be stored
+ * in the array \p initiators (and \p values).
+ * On output, \p nr points to the number of initiators (and values) that
+ * were actually found, even if some of them couldn't be stored in the array.
+ * Initiators that couldn't be stored are ignored, but the function still
+ * returns success (\c 0). The caller may find out by comparing the value pointed
+ * by \p nr before and after the function call.
+ *
+ * The returned initiators should not be modified or freed,
+ * they belong to the topology.
+ *
+ * \p flags must be \c 0 for now.
+ *
+ * If the attribute does not relate to a specific initiator
+ * (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
+ * no initiator is returned.
+ *
+ * \note This function is meant for tools and debugging (listing internal information)
+ * rather than for application queries. Applications should rather select useful
+ * NUMA nodes with hwloc_get_local_numanode_objs() and then look at their attribute
+ * values for some relevant initiators.
+ */
+HWLOC_DECLSPEC int
+hwloc_memattr_get_initiators(hwloc_topology_t topology,
+                             hwloc_memattr_id_t attribute,
+                             hwloc_obj_t target_node,
+                             unsigned long flags,
+                             unsigned *nr, struct hwloc_location *initiators, hwloc_uint64_t *values);
+/** @} */
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+
+#endif /* HWLOC_MEMATTR_H */
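
Several functions in memattrs.h share the same in/out count convention: the caller passes the array capacity, the function stores as many entries as fit, writes back the number actually found, and still returns 0 when entries were dropped. The sketch below models that convention only; `fill_model()` and `was_truncated()` are hypothetical helpers operating on plain `int` arrays, not hwloc calls. Treating a zero capacity with a NULL array as a pure size query is an assumption of this model and should be checked against hwloc before relying on it.

```c
#include <stddef.h>

/* Model of the documented convention: copy at most *nr of the `found`
 * entries from src into dst, then set *nr to `found`.
 * Returns 0 on success (even when truncated), -1 on invalid arguments. */
int fill_model(const int *src, unsigned found, unsigned *nr, int *dst)
{
    unsigned i, room;

    if (!nr || (*nr && !dst))
        return -1;

    room = *nr;
    for (i = 0; i < found && i < room; i++)
        dst[i] = src[i];

    *nr = found;  /* reported even if some entries did not fit */
    return 0;
}

/* The caller detects truncation by comparing the count before and after. */
int was_truncated(unsigned room_before, unsigned nr_after)
{
    return nr_after > room_before;
}
```

A caller would save the capacity, call the function, and if `was_truncated()` reports a larger count, allocate an array of that size and call again.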
--- a/src/3rdparty/hwloc/include/hwloc/nvml.h
+++ b/src/3rdparty/hwloc/include/hwloc/nvml.h
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2012-2016 Inria.  All rights reserved.
+ * Copyright © 2012-2020 Inria.  All rights reserved.
 * See COPYING in top-level directory.
 */

@@ -36,7 +36,7 @@ extern "C" {
 * @{
 */

-/** \brief Get the CPU set of logical processors that are physically
+/** \brief Get the CPU set of processors that are physically
 * close to NVML device \p device.
 *
 * Return the CPU set describing the locality of the NVML device \p device.
--- a/src/3rdparty/hwloc/include/hwloc/opencl.h
+++ b/src/3rdparty/hwloc/include/hwloc/opencl.h
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2012-2019 Inria.  All rights reserved.
+ * Copyright © 2012-2021 Inria.  All rights reserved.
 * Copyright © 2013, 2018 Université Bordeaux.  All right reserved.
 * See COPYING in top-level directory.
 */
@@ -82,9 +82,10 @@ hwloc_opencl_get_device_pci_busid(cl_device_id device,
 	if (CL_SUCCESS == clret
 	    && HWLOC_CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD == amdtopo.raw.type) {
 		*domain = 0; /* can't do anything better */
-		*bus = (unsigned) amdtopo.pcie.bus;
-		*dev = (unsigned) amdtopo.pcie.device;
-		*func = (unsigned) amdtopo.pcie.function;
+		/* cl_device_topology_amd stores the bus ID in cl_char, don't convert those signed chars directly to unsigned int */
+		*bus = (unsigned) (unsigned char) amdtopo.pcie.bus;
+		*dev = (unsigned) (unsigned char) amdtopo.pcie.device;
+		*func = (unsigned) (unsigned char) amdtopo.pcie.function;
 		return 0;
 	}

@@ -109,7 +110,7 @@ hwloc_opencl_get_device_pci_busid(cl_device_id device,
 	return -1;
 }

-/** \brief Get the CPU set of logical processors that are physically
+/** \brief Get the CPU set of processors that are physically
 * close to OpenCL device \p device.
 *
 * Return the CPU set describing the locality of the OpenCL device \p device.
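The double cast in the `hwloc_opencl_get_device_pci_busid()` hunk above matters because `cl_char` is a signed 8-bit type: converting a negative value straight to `unsigned` sign-extends it, so a bus number of 0x80 or higher would turn into a huge value. A minimal sketch of the difference follows. The helper names are hypothetical (they are not part of hwloc or OpenCL), and plain `signed char` stands in for `cl_char`.

```c
/* Hypothetical helpers for illustration only; not part of hwloc. */

/* Patched behaviour: go through unsigned char first, so the result
 * always stays in the 0..255 range. */
static unsigned widen_pci_field(signed char raw)
{
  return (unsigned) (unsigned char) raw;
}

/* Pre-patch behaviour: a negative input is converted modulo UINT_MAX+1
 * and ends up far above 255. */
static unsigned widen_pci_field_naive(signed char raw)
{
  return (unsigned) raw;
}
```

For example, a raw value of -125 (bit pattern 0x83) becomes 0x83 with the patched cast, while the naive cast yields a value far outside the valid bus range.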
--- a/src/3rdparty/hwloc/include/hwloc/openfabrics-verbs.h
+++ b/src/3rdparty/hwloc/include/hwloc/openfabrics-verbs.h
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2016 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2010 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -41,7 +41,7 @@ extern "C" {
 * @{
 */

-/** \brief Get the CPU set of logical processors that are physically
+/** \brief Get the CPU set of processors that are physically
 * close to device \p ibdev.
 *
 * Return the CPU set describing the locality of the OpenFabrics
--- a/src/3rdparty/hwloc/include/hwloc/plugins.h
+++ b/src/3rdparty/hwloc/include/hwloc/plugins.h
@@ -313,7 +313,13 @@ struct hwloc_component {
 * @{
 */

+/** \brief Check whether insertion errors are hidden */
+HWLOC_DECLSPEC int hwloc_hide_errors(void);
+
 /** \brief Add an object to the topology.
+ *
+ * Insert new object \p obj in the topology starting under existing object \p root
+ * (if \c NULL, the topology root object is used).
 *
 * It is sorted along the tree of other objects according to the inclusion of
 * cpusets, to eventually be added as a child of the smallest object including
@@ -327,32 +333,20 @@ struct hwloc_component {
 *
 * This shall only be called before levels are built.
 *
- * In case of error, hwloc_report_os_error() is called.
- *
 * The caller should check whether the object type is filtered-out before calling this function.
 *
 * The topology cpuset/nodesets will be enlarged to include the object sets.
 *
+ * \p reason is a unique string identifying where and why this insertion call was performed
+ * (it will be displayed in case of internal insertion error).
+ *
 * Returns the object on success.
 * Returns NULL and frees obj on error.
 * Returns another object and frees obj if it was merged with an identical pre-existing object.
 */
-HWLOC_DECLSPEC struct hwloc_obj *hwloc_insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t obj);
-
-/** \brief Type of error callbacks during object insertion */
-typedef void (*hwloc_report_error_t)(const char * msg, int line);
-/** \brief Report an insertion error from a backend */
-HWLOC_DECLSPEC void hwloc_report_os_error(const char * msg, int line);
-/** \brief Check whether insertion errors are hidden */
-HWLOC_DECLSPEC int hwloc_hide_errors(void);
-
-/** \brief Add an object to the topology and specify which error callback to use.
- *
- * This function is similar to hwloc_insert_object_by_cpuset() but it allows specifying
- * where to start insertion from (if \p root is NULL, the topology root object is used),
- * and specifying the error callback.
- */
-HWLOC_DECLSPEC struct hwloc_obj *hwloc__insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t root, hwloc_obj_t obj, hwloc_report_error_t report_error);
+HWLOC_DECLSPEC hwloc_obj_t
+hwloc__insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t root,
+                               hwloc_obj_t obj, const char *reason);

 /** \brief Insert an object somewhere in the topology.
 *
--- a/src/3rdparty/hwloc/include/hwloc/rename.h
+++ b/src/3rdparty/hwloc/include/hwloc/rename.h
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
- * Copyright © 2010-2019 Inria.  All rights reserved.
+ * Copyright © 2010-2020 Inria.  All rights reserved.
 * See COPYING in top-level directory.
 */

@@ -119,6 +119,7 @@ extern "C" {
 #define HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED HWLOC_NAME_CAPS(TOPOLOGY_FLAG_WITH_DISALLOWED)
 #define HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM HWLOC_NAME_CAPS(TOPOLOGY_FLAG_IS_THISSYSTEM)
 #define HWLOC_TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES HWLOC_NAME_CAPS(TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES)
+#define HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT HWLOC_NAME_CAPS(TOPOLOGY_FLAG_IMPORT_SUPPORT)

 #define hwloc_topology_set_pid HWLOC_NAME(topology_set_pid)
 #define hwloc_topology_set_synthetic HWLOC_NAME(topology_set_synthetic)
@@ -134,6 +135,7 @@ extern "C" {
 #define hwloc_topology_discovery_support HWLOC_NAME(topology_discovery_support)
 #define hwloc_topology_cpubind_support HWLOC_NAME(topology_cpubind_support)
 #define hwloc_topology_membind_support HWLOC_NAME(topology_membind_support)
+#define hwloc_topology_misc_support HWLOC_NAME(topology_misc_support)
 #define hwloc_topology_support HWLOC_NAME(topology_support)
 #define hwloc_topology_get_support HWLOC_NAME(topology_get_support)

@@ -170,6 +172,7 @@ extern "C" {
 #define hwloc_topology_alloc_group_object HWLOC_NAME(topology_alloc_group_object)
 #define hwloc_topology_insert_group_object HWLOC_NAME(topology_insert_group_object)
 #define hwloc_obj_add_other_obj_sets HWLOC_NAME(obj_add_other_obj_sets)
+#define hwloc_topology_refresh HWLOC_NAME(topology_refresh)

 #define hwloc_topology_get_depth HWLOC_NAME(topology_get_depth)
 #define hwloc_get_type_depth HWLOC_NAME(get_type_depth)
@@ -367,6 +370,51 @@ extern "C" {
 #define hwloc_cpuset_to_nodeset HWLOC_NAME(cpuset_to_nodeset)
 #define hwloc_cpuset_from_nodeset HWLOC_NAME(cpuset_from_nodeset)

+/* memattrs.h */
+
+#define hwloc_memattr_id_e HWLOC_NAME(memattr_id_e)
+#define HWLOC_MEMATTR_ID_CAPACITY HWLOC_NAME_CAPS(MEMATTR_ID_CAPACITY)
+#define HWLOC_MEMATTR_ID_LOCALITY HWLOC_NAME_CAPS(MEMATTR_ID_LOCALITY)
+#define HWLOC_MEMATTR_ID_BANDWIDTH HWLOC_NAME_CAPS(MEMATTR_ID_BANDWIDTH)
+#define HWLOC_MEMATTR_ID_LATENCY HWLOC_NAME_CAPS(MEMATTR_ID_LATENCY)
+
+#define hwloc_memattr_id_t HWLOC_NAME(memattr_id_t)
+#define hwloc_memattr_get_by_name HWLOC_NAME(memattr_get_by_name)
+
+#define hwloc_location HWLOC_NAME(location)
+#define hwloc_location_type_e HWLOC_NAME(location_type_e)
+#define HWLOC_LOCATION_TYPE_OBJECT HWLOC_NAME_CAPS(LOCATION_TYPE_OBJECT)
+#define HWLOC_LOCATION_TYPE_CPUSET HWLOC_NAME_CAPS(LOCATION_TYPE_CPUSET)
+#define hwloc_location_u HWLOC_NAME(location_u)
+
+#define hwloc_memattr_get_value HWLOC_NAME(memattr_get_value)
+#define hwloc_memattr_get_best_target HWLOC_NAME(memattr_get_best_target)
+#define hwloc_memattr_get_best_initiator HWLOC_NAME(memattr_get_best_initiator)
+
+#define hwloc_local_numanode_flag_e HWLOC_NAME(local_numanode_flag_e)
+#define HWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY HWLOC_NAME_CAPS(LOCAL_NUMANODE_FLAG_LARGER_LOCALITY)
+#define HWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY HWLOC_NAME_CAPS(LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY)
+#define HWLOC_LOCAL_NUMANODE_FLAG_ALL HWLOC_NAME_CAPS(LOCAL_NUMANODE_FLAG_ALL)
+#define hwloc_get_local_numanode_objs HWLOC_NAME(get_local_numanode_objs)
+
+#define hwloc_memattr_get_name HWLOC_NAME(memattr_get_name)
+#define hwloc_memattr_get_flags HWLOC_NAME(memattr_get_flags)
+#define hwloc_memattr_flag_e HWLOC_NAME(memattr_flag_e)
+#define HWLOC_MEMATTR_FLAG_HIGHER_FIRST HWLOC_NAME_CAPS(MEMATTR_FLAG_HIGHER_FIRST)
+#define HWLOC_MEMATTR_FLAG_LOWER_FIRST HWLOC_NAME_CAPS(MEMATTR_FLAG_LOWER_FIRST)
+#define HWLOC_MEMATTR_FLAG_NEED_INITIATOR HWLOC_NAME_CAPS(MEMATTR_FLAG_NEED_INITIATOR)
+#define hwloc_memattr_register HWLOC_NAME(memattr_register)
+#define hwloc_memattr_set_value HWLOC_NAME(memattr_set_value)
+#define hwloc_memattr_get_targets HWLOC_NAME(memattr_get_targets)
+#define hwloc_memattr_get_initiators HWLOC_NAME(memattr_get_initiators)
+
+/* cpukinds.h */
+
+#define hwloc_cpukinds_get_nr HWLOC_NAME(cpukinds_get_nr)
+#define hwloc_cpukinds_get_by_cpuset HWLOC_NAME(cpukinds_get_by_cpuset)
+#define hwloc_cpukinds_get_info HWLOC_NAME(cpukinds_get_info)
+#define hwloc_cpukinds_register HWLOC_NAME(cpukinds_register)
+
 /* export.h */

 #define hwloc_topology_export_xml_flags_e HWLOC_NAME(topology_export_xml_flags_e)
@@ -510,6 +558,12 @@ extern "C" {
 #define hwloc_nvml_get_device_osdev HWLOC_NAME(nvml_get_device_osdev)
 #define hwloc_nvml_get_device_osdev_by_index HWLOC_NAME(nvml_get_device_osdev_by_index)

+/* rsmi.h */
+
+#define hwloc_rsmi_get_device_cpuset HWLOC_NAME(rsmi_get_device_cpuset)
+#define hwloc_rsmi_get_device_osdev HWLOC_NAME(rsmi_get_device_osdev)
+#define hwloc_rsmi_get_device_osdev_by_index HWLOC_NAME(rsmi_get_device_osdev_by_index)
+
 /* gl.h */

 #define hwloc_gl_get_display_osdev_by_port_device HWLOC_NAME(gl_get_display_osdev_by_port_device)
@@ -547,9 +601,6 @@ extern "C" {

 #define hwloc_plugin_check_namespace HWLOC_NAME(plugin_check_namespace)

-#define hwloc_insert_object_by_cpuset HWLOC_NAME(insert_object_by_cpuset)
-#define hwloc_report_error_t HWLOC_NAME(report_error_t)
-#define hwloc_report_os_error HWLOC_NAME(report_os_error)
 #define hwloc_hide_errors HWLOC_NAME(hide_errors)
 #define hwloc__insert_object_by_cpuset HWLOC_NAME(_insert_object_by_cpuset)
 #define hwloc_insert_object_by_parent HWLOC_NAME(insert_object_by_parent)
@@ -683,6 +734,7 @@ extern "C" {
 #define hwloc_cuda_component HWLOC_NAME(cuda_component)
 #define hwloc_gl_component HWLOC_NAME(gl_component)
 #define hwloc_nvml_component HWLOC_NAME(nvml_component)
+#define hwloc_rsmi_component HWLOC_NAME(rsmi_component)
 #define hwloc_opencl_component HWLOC_NAME(opencl_component)
 #define hwloc_pci_component HWLOC_NAME(pci_component)

@@ -691,6 +743,8 @@ extern "C" {

 /* private/private.h */

+#define hwloc_internal_location_s HWLOC_NAME(internal_location_s)
+
 #define hwloc_special_level_s HWLOC_NAME(special_level_s)

 #define hwloc_pci_forced_locality_s HWLOC_NAME(pci_forced_locality_s)
@@ -713,6 +767,8 @@ extern "C" {

 #define hwloc__attach_memory_object HWLOC_NAME(insert_memory_object)

+#define hwloc_get_obj_by_type_and_gp_index HWLOC_NAME(get_obj_by_type_and_gp_index)
+
 #define hwloc_pci_discovery_init HWLOC_NAME(pci_discovery_init)
 #define hwloc_pci_discovery_prepare HWLOC_NAME(pci_discovery_prepare)
 #define hwloc_pci_discovery_exit HWLOC_NAME(pci_discovery_exit)
@@ -723,6 +779,7 @@ extern "C" {
 #define hwloc__add_info_nodup HWLOC_NAME(_add_info_nodup)
 #define hwloc__move_infos HWLOC_NAME(_move_infos)
 #define hwloc__free_infos HWLOC_NAME(_free_infos)
+#define hwloc__tma_dup_infos HWLOC_NAME(_tma_dup_infos)

 #define hwloc_binding_hooks HWLOC_NAME(binding_hooks)
 #define hwloc_set_native_binding_hooks HWLOC_NAME(set_native_binding_hooks)
@@ -764,6 +821,24 @@ extern "C" {
 #define hwloc_internal_distances_add_by_index HWLOC_NAME(internal_distances_add_by_index)
 #define hwloc_internal_distances_invalidate_cached_objs HWLOC_NAME(hwloc_internal_distances_invalidate_cached_objs)

+#define hwloc_internal_memattr_s HWLOC_NAME(internal_memattr_s)
+#define hwloc_internal_memattr_target_s HWLOC_NAME(internal_memattr_target_s)
+#define hwloc_internal_memattr_initiator_s HWLOC_NAME(internal_memattr_initiator_s)
+#define hwloc_internal_memattrs_init HWLOC_NAME(internal_memattrs_init)
+#define hwloc_internal_memattrs_prepare HWLOC_NAME(internal_memattrs_prepare)
+#define hwloc_internal_memattrs_dup HWLOC_NAME(internal_memattrs_dup)
+#define hwloc_internal_memattrs_destroy HWLOC_NAME(internal_memattrs_destroy)
+#define hwloc_internal_memattrs_need_refresh HWLOC_NAME(internal_memattrs_need_refresh)
+#define hwloc_internal_memattrs_refresh HWLOC_NAME(internal_memattrs_refresh)
+
+#define hwloc_internal_cpukind_s HWLOC_NAME(internal_cpukind_s)
+#define hwloc_internal_cpukinds_init HWLOC_NAME(internal_cpukinds_init)
+#define hwloc_internal_cpukinds_destroy HWLOC_NAME(internal_cpukinds_destroy)
+#define hwloc_internal_cpukinds_dup HWLOC_NAME(internal_cpukinds_dup)
+#define hwloc_internal_cpukinds_register HWLOC_NAME(internal_cpukinds_register)
+#define hwloc_internal_cpukinds_rank HWLOC_NAME(internal_cpukinds_rank)
+#define hwloc_internal_cpukinds_restrict HWLOC_NAME(internal_cpukinds_restrict)
+
 #define hwloc_encode_to_base64 HWLOC_NAME(encode_to_base64)
 #define hwloc_decode_from_base64 HWLOC_NAME(decode_from_base64)

--- a/src/3rdparty/hwloc/include/hwloc/rsmi.h
+++ b/src/3rdparty/hwloc/include/hwloc/rsmi.h
@@ -0,0 +1,201 @@
+/*
+ * Copyright © 2012-2020 Inria.  All rights reserved.
+ * Copyright (c) 2020, Advanced Micro Devices, Inc. All rights reserved.
+ * Written by Advanced Micro Devices,
+ * See COPYING in top-level directory.
+ */
+
+/** \file
+ * \brief Macros to help interaction between hwloc and the ROCm SMI Management Library.
+ *
+ * Applications that use both hwloc and the ROCm SMI Management Library may want to
+ * include this file so as to get topology information for AMD GPU devices.
+ */
+
+#ifndef HWLOC_RSMI_H
+#define HWLOC_RSMI_H
+
+#include "hwloc.h"
+#include "hwloc/autogen/config.h"
+#include "hwloc/helper.h"
+#ifdef HWLOC_LINUX_SYS
+#include "hwloc/linux.h"
+#endif
+
+#include <rocm_smi/rocm_smi.h>
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+
+/** \defgroup hwlocality_rsmi Interoperability with the ROCm SMI Management Library
+ *
+ * This interface offers ways to retrieve topology information about
+ * devices managed by the ROCm SMI Management Library.
+ *
+ * @{
+ */
+
+/** \brief Get the CPU set of logical processors that are physically
+ * close to the AMD GPU device whose index is \p dv_ind.
+ *
+ * Return the CPU set describing the locality of the AMD GPU device
+ * whose index is \p dv_ind.
+ *
+ * Topology \p topology and device \p dv_ind must match the local machine.
+ * I/O devices detection and the ROCm SMI component are not needed in the
+ * topology.
+ *
+ * The function only returns the locality of the device.
+ * If more information about the device is needed, OS objects should
+ * be used instead, see hwloc_rsmi_get_device_osdev()
+ * and hwloc_rsmi_get_device_osdev_by_index().
+ *
+ * This function is currently only implemented in a meaningful way for
+ * Linux; other systems will simply get a full cpuset.
+ */
+static __hwloc_inline int
+hwloc_rsmi_get_device_cpuset(hwloc_topology_t topology __hwloc_attribute_unused,
+                             uint32_t dv_ind, hwloc_cpuset_t set)
+{
+#ifdef HWLOC_LINUX_SYS
+  /* If we're on Linux, use the sysfs mechanism to get the local cpus */
+#define HWLOC_RSMI_DEVICE_SYSFS_PATH_MAX 128
+  char path[HWLOC_RSMI_DEVICE_SYSFS_PATH_MAX];
+  rsmi_status_t ret;
+  uint64_t bdfid = 0;
+  unsigned domain, device, bus;
+
+  if (!hwloc_topology_is_thissystem(topology)) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  ret = rsmi_dev_pci_id_get(dv_ind, &bdfid);
+  if (RSMI_STATUS_SUCCESS != ret) {
+    errno = EINVAL;
+    return -1;
+  }
+  domain = (bdfid>>32) & 0xffffffff;
+  bus = ((bdfid & 0xffff)>>8) & 0xff;
+  device = ((bdfid & 0xff)>>3) & 0x1f;
+
+  sprintf(path, "/sys/bus/pci/devices/%04x:%02x:%02x.0/local_cpus", domain, bus, device);
+  if (hwloc_linux_read_path_as_cpumask(path, set) < 0
+      || hwloc_bitmap_iszero(set))
+    hwloc_bitmap_copy(set, hwloc_topology_get_complete_cpuset(topology));
+#else
+  /* Non-Linux systems simply get a full cpuset */
+  hwloc_bitmap_copy(set, hwloc_topology_get_complete_cpuset(topology));
+#endif
+  return 0;
+}
+
+/** \brief Get the hwloc OS device object corresponding to the
+ * AMD GPU device whose index is \p dv_ind.
+ *
+ * Return the OS device object describing the AMD GPU device whose
+ * index is \p dv_ind. Returns NULL if there is none.
+ *
+ * The topology \p topology does not necessarily have to match the current
+ * machine. For instance the topology may be an XML import of a remote host.
+ * I/O devices detection and the ROCm SMI component must be enabled in the
+ * topology.
+ *
+ * \note The corresponding PCI device object can be obtained by looking
+ * at the OS device parent object (unless PCI devices are filtered out).
+ */
+static __hwloc_inline hwloc_obj_t
+hwloc_rsmi_get_device_osdev_by_index(hwloc_topology_t topology, uint32_t dv_ind)
+{
+  hwloc_obj_t osdev = NULL;
+  while ((osdev = hwloc_get_next_osdev(topology, osdev)) != NULL) {
+    if (HWLOC_OBJ_OSDEV_GPU == osdev->attr->osdev.type
+      && osdev->name
+      && !strncmp("rsmi", osdev->name, 4)
+      && atoi(osdev->name + 4) == (int) dv_ind)
+      return osdev;
+  }
+  return NULL;
+}
+
+/** \brief Get the hwloc OS device object corresponding to the AMD GPU device
+ * whose index is \p dv_ind.
+ *
+ * Return the hwloc OS device object that describes the given
+ * AMD GPU whose index is \p dv_ind. Return NULL if there is none.
+ *
+ * Topology \p topology and device \p dv_ind must match the local machine.
+ * I/O devices detection and the ROCm SMI component must be enabled in the
+ * topology. If not, the locality of the object may still be found using
+ * hwloc_rsmi_get_device_cpuset().
+ *
+ * \note The corresponding hwloc PCI device may be found by looking
+ * at the result parent pointer (unless PCI devices are filtered out).
+ */
+static __hwloc_inline hwloc_obj_t
+hwloc_rsmi_get_device_osdev(hwloc_topology_t topology, uint32_t dv_ind)
+{
+  hwloc_obj_t osdev;
+  rsmi_status_t ret;
+  uint64_t bdfid = 0;
+  unsigned domain, device, bus, func;
+  uint64_t id;
+  char uuid[64];
+
+  if (!hwloc_topology_is_thissystem(topology)) {
+    errno = EINVAL;
+    return NULL;
+  }
+
+  ret = rsmi_dev_pci_id_get(dv_ind, &bdfid);
+  if (RSMI_STATUS_SUCCESS != ret) {
+    errno = EINVAL;
+    return NULL;
+  }
+  domain = (bdfid>>32) & 0xffffffff;
+  bus = ((bdfid & 0xffff)>>8) & 0xff;
+  device = ((bdfid & 0xff)>>3) & 0x1f;
+  func = bdfid & 0x7;
+
+  ret = rsmi_dev_unique_id_get(dv_ind, &id);
+  if (RSMI_STATUS_SUCCESS != ret)
+    uuid[0] = '\0';
+  else
+    sprintf(uuid, "%llx", (unsigned long long) id);
+
+  osdev = NULL;
+  while ((osdev = hwloc_get_next_osdev(topology, osdev)) != NULL) {
+    hwloc_obj_t pcidev = osdev->parent;
+    const char *info;
+
+    if (strncmp(osdev->name, "rsmi", 4))
+      continue;
+
+    if (pcidev
+      && pcidev->type == HWLOC_OBJ_PCI_DEVICE
+      && pcidev->attr->pcidev.domain == domain
+      && pcidev->attr->pcidev.bus == bus
+      && pcidev->attr->pcidev.dev == device
+      && pcidev->attr->pcidev.func == func)
+      return osdev;
+
+    info = hwloc_obj_get_info_by_name(osdev, "AMDUUID");
+    if (info && !strcmp(info, uuid))
+      return osdev;
+  }
+
+  return NULL;
+}
+
+/** @} */
+
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+
+#endif /* HWLOC_RSMI_H */
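Both `hwloc_rsmi_get_device_cpuset()` and `hwloc_rsmi_get_device_osdev()` above unpack the 64-bit value returned by `rsmi_dev_pci_id_get()` with the same shifts and masks. A standalone sketch of that decoding is shown below. The struct and function names are hypothetical and only the bit layout is taken from the header above; it does not call into ROCm SMI.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields; not an hwloc type. */
struct pci_bdf {
  unsigned domain;
  unsigned bus;
  unsigned device;
  unsigned func;
};

/* Decode a BDF identifier using the layout assumed by rsmi.h above:
 * bits 32..63 domain, bits 8..15 bus, bits 3..7 device, bits 0..2 function. */
static struct pci_bdf decode_bdfid(uint64_t bdfid)
{
  struct pci_bdf out;
  out.domain = (unsigned) ((bdfid >> 32) & 0xffffffff);
  out.bus    = (unsigned) ((bdfid >> 8) & 0xff);
  out.device = (unsigned) ((bdfid >> 3) & 0x1f);
  out.func   = (unsigned) (bdfid & 0x7);
  return out;
}
```

Under that layout, the value `0x100004311` decodes to domain 1, bus 0x43, device 2, function 1. Note that the sysfs path built in `hwloc_rsmi_get_device_cpuset()` always uses function 0, which is why that function does not extract `func`.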
--- a/src/3rdparty/hwloc/include/private/autogen/config.h
+++ b/src/3rdparty/hwloc/include/private/autogen/config.h
@@ -1,8 +1,8 @@
 /*
 * Copyright © 2009, 2011, 2012 CNRS.  All rights reserved.
- * Copyright © 2009-2018 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009, 2011, 2012, 2015 Université Bordeaux.  All rights reserved.
- * Copyright © 2009 Cisco Systems, Inc.  All rights reserved.
+ * Copyright © 2009-2020 Cisco Systems, Inc.  All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
@@ -575,7 +575,7 @@
 #define PACKAGE "hwloc"

 /* Define to the address where bug reports for this package should be sent. */
-#define PACKAGE_BUGREPORT "http://www.open-mpi.org/projects/hwloc/"
+#define PACKAGE_BUGREPORT "https://www.open-mpi.org/projects/hwloc/"

 /* Define to the full name of this package. */
 #define PACKAGE_NAME "hwloc"
@@ -668,5 +668,9 @@
 /* Define this to the thread ID type */
 #define hwloc_thread_t HANDLE

+/* Define to 1 if you have the declaration of `GetModuleFileName', and to 0 if
+   you don't. */
+#define HAVE_DECL_GETMODULEFILENAME 1
+

 #endif /* HWLOC_CONFIGURE_H */
--- a/src/3rdparty/hwloc/include/private/debug.h
+++ b/src/3rdparty/hwloc/include/private/debug.h
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2017 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009, 2011 Université Bordeaux
 * Copyright © 2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -19,6 +19,10 @@
 #include <stdio.h>
 #endif

+#ifdef ANDROID
+extern void JNIDebug(char *text);
+#endif
+
 /* Compile-time assertion */
 #define HWLOC_BUILD_ASSERT(condition) ((void)sizeof(char[1 - 2*!(condition)]))

@@ -44,9 +48,17 @@ static __hwloc_inline void hwloc_debug(const char *s __hwloc_attribute_unused, .
 {
 #ifdef HWLOC_DEBUG
  if (hwloc_debug_enabled()) {
+#ifdef ANDROID
+    char buffer[256];
+#endif
    va_list ap;
    va_start(ap, s);
+#ifdef ANDROID
+    vsnprintf(buffer, sizeof(buffer), s, ap);
+    JNIDebug(buffer);
+#else
    vfprintf(stderr, s, ap);
+#endif
    va_end(ap);
  }
 #endif
@@ -57,21 +69,21 @@ static __hwloc_inline void hwloc_debug(const char *s __hwloc_attribute_unused, .
 if (hwloc_debug_enabled()) { \
  char *s; \
  hwloc_bitmap_asprintf(&s, bitmap); \
-  fprintf(stderr, fmt, s); \
+  hwloc_debug(fmt, s); \
  free(s); \
 } } while (0)
 #define hwloc_debug_1arg_bitmap(fmt, arg1, bitmap) do { \
 if (hwloc_debug_enabled()) { \
  char *s; \
  hwloc_bitmap_asprintf(&s, bitmap); \
-  fprintf(stderr, fmt, arg1, s); \
+  hwloc_debug(fmt, arg1, s); \
  free(s); \
 } } while (0)
 #define hwloc_debug_2args_bitmap(fmt, arg1, arg2, bitmap) do { \
 if (hwloc_debug_enabled()) { \
  char *s; \
  hwloc_bitmap_asprintf(&s, bitmap); \
-  fprintf(stderr, fmt, arg1, arg2, s); \
+  hwloc_debug(fmt, arg1, arg2, s); \
  free(s); \
 } } while (0)
 #else
--- a/src/3rdparty/hwloc/include/private/internal-components.h
+++ b/src/3rdparty/hwloc/include/private/internal-components.h
@@ -30,6 +30,7 @@ HWLOC_DECLSPEC extern const struct hwloc_component hwloc_x86_component;
 HWLOC_DECLSPEC extern const struct hwloc_component hwloc_cuda_component;
 HWLOC_DECLSPEC extern const struct hwloc_component hwloc_gl_component;
 HWLOC_DECLSPEC extern const struct hwloc_component hwloc_nvml_component;
+HWLOC_DECLSPEC extern const struct hwloc_component hwloc_rsmi_component;
 HWLOC_DECLSPEC extern const struct hwloc_component hwloc_opencl_component;
 HWLOC_DECLSPEC extern const struct hwloc_component hwloc_pci_component;

--- a/src/3rdparty/hwloc/include/private/private.h
+++ b/src/3rdparty/hwloc/include/private/private.h
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009      CNRS
- * Copyright © 2009-2019 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2012, 2020 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 *
@@ -40,7 +40,19 @@
 #endif
 #include <string.h>

-#define HWLOC_TOPOLOGY_ABI 0x20100 /* version of the layout of struct topology */
+#define HWLOC_TOPOLOGY_ABI 0x20400 /* version of the layout of struct topology */
+
+struct hwloc_internal_location_s {
+  enum hwloc_location_type_e type;
+  union {
+    struct {
+      hwloc_obj_t obj; /* cached between refreshes */
+      uint64_t gp_index;
+      hwloc_obj_type_t type;
+    } object; /* if type == HWLOC_LOCATION_TYPE_OBJECT */
+    hwloc_cpuset_t cpuset; /* if type == HWLOC_LOCATION_TYPE_CPUSET */
+  } location;
+};

 /*****************************************************
 * WARNING:
@@ -163,6 +175,50 @@ struct hwloc_topology {
  } *first_dist, *last_dist;
  unsigned next_dist_id;

+  /* memory attributes */
+  unsigned nr_memattrs;
+  struct hwloc_internal_memattr_s {
+    /* memattr info */
+    char *name; /* TODO unit is implicit, in the documentation of standard attributes, or in the name? */
+    unsigned long flags;
+#define HWLOC_IMATTR_FLAG_STATIC_NAME (1U<<0) /* no need to free name */
+#define HWLOC_IMATTR_FLAG_CACHE_VALID (1U<<1) /* target and initiator are valid */
+#define HWLOC_IMATTR_FLAG_CONVENIENCE (1U<<2) /* convenience attribute reporting values from non-memattr attributes (R/O and no actual targets stored) */
+    unsigned iflags;
+
+    /* array of values */
+    unsigned nr_targets;
+    struct hwloc_internal_memattr_target_s {
+      /* target object */
+      hwloc_obj_t obj; /* cached between refreshes */
+      hwloc_obj_type_t type;
+      unsigned os_index; /* only used temporarily during discovery when there's no obj/gp_index yet */
+      hwloc_uint64_t gp_index;
+
+      /* value if there is no initiator for this attr */
+      hwloc_uint64_t noinitiator_value;
+      /* initiators otherwise */
+      unsigned nr_initiators;
+      struct hwloc_internal_memattr_initiator_s {
+        struct hwloc_internal_location_s initiator;
+        hwloc_uint64_t value;
+      } *initiators;
+    } *targets;
+  } *memattrs;
+
+  /* hybridcpus */
+  unsigned nr_cpukinds;
+  unsigned nr_cpukinds_allocated;
+  struct hwloc_internal_cpukind_s {
+    hwloc_cpuset_t cpuset;
+#define HWLOC_CPUKIND_EFFICIENCY_UNKNOWN -1
+    int efficiency;
+    int forced_efficiency; /* returned by the hardware or OS if any */
+    hwloc_uint64_t ranking_value; /* internal value for ranking */
+    unsigned nr_infos;
+    struct hwloc_info_s *infos;
+  } *cpukinds;
+
  int grouping;
  int grouping_verbose;
  unsigned grouping_nbaccuracies;
@@ -240,8 +296,9 @@ extern void hwloc_topology_clear(struct hwloc_topology *topology);

 /* insert memory object as memory child of normal parent */
 extern struct hwloc_obj * hwloc__attach_memory_object(struct hwloc_topology *topology, hwloc_obj_t parent,
-						      hwloc_obj_t obj,
-						      hwloc_report_error_t report_error);
+                                                      hwloc_obj_t obj, const char *reason);
+
+extern hwloc_obj_t hwloc_get_obj_by_type_and_gp_index(hwloc_topology_t topology, hwloc_obj_type_t type, uint64_t gp_index);

 extern void hwloc_pci_discovery_init(struct hwloc_topology *topology);
 extern void hwloc_pci_discovery_prepare(struct hwloc_topology *topology);
@@ -261,6 +318,7 @@ extern hwloc_obj_t hwloc_find_insert_io_parent_by_complete_cpuset(struct hwloc_t
 extern int hwloc__add_info(struct hwloc_info_s **infosp, unsigned *countp, const char *name, const char *value);
 extern int hwloc__add_info_nodup(struct hwloc_info_s **infosp, unsigned *countp, const char *name, const char *value, int replace);
 extern int hwloc__move_infos(struct hwloc_info_s **dst_infosp, unsigned *dst_countp, struct hwloc_info_s **src_infosp, unsigned *src_countp);
+extern int hwloc__tma_dup_infos(struct hwloc_tma *tma, struct hwloc_info_s **dst_infosp, unsigned *dst_countp, struct hwloc_info_s *src_infos, unsigned src_count);
 extern void hwloc__free_infos(struct hwloc_info_s *infos, unsigned count);

 /* set native OS binding hooks */
@@ -354,6 +412,22 @@ extern int hwloc_internal_distances_add(hwloc_topology_t topology, const char *n
 extern int hwloc_internal_distances_add_by_index(hwloc_topology_t topology, const char *name, hwloc_obj_type_t unique_type, hwloc_obj_type_t *different_types, unsigned nbobjs, uint64_t *indexes, uint64_t *values, unsigned long kind, unsigned long flags);
 extern void hwloc_internal_distances_invalidate_cached_objs(hwloc_topology_t topology);

+extern void hwloc_internal_memattrs_init(hwloc_topology_t topology);
+extern void hwloc_internal_memattrs_prepare(hwloc_topology_t topology);
+extern void hwloc_internal_memattrs_destroy(hwloc_topology_t topology);
+extern void hwloc_internal_memattrs_need_refresh(hwloc_topology_t topology);
+extern void hwloc_internal_memattrs_refresh(hwloc_topology_t topology);
+extern int hwloc_internal_memattrs_dup(hwloc_topology_t new, hwloc_topology_t old);
+extern int hwloc_internal_memattr_set_value(hwloc_topology_t topology, hwloc_memattr_id_t id, hwloc_obj_type_t target_type, hwloc_uint64_t target_gp_index, unsigned target_os_index, struct hwloc_internal_location_s *initiator, hwloc_uint64_t value);
+
+extern void hwloc_internal_cpukinds_init(hwloc_topology_t topology);
+extern int hwloc_internal_cpukinds_rank(hwloc_topology_t topology);
+extern void hwloc_internal_cpukinds_destroy(hwloc_topology_t topology);
+extern int hwloc_internal_cpukinds_dup(hwloc_topology_t new, hwloc_topology_t old);
+#define HWLOC_CPUKINDS_REGISTER_FLAG_OVERWRITE_FORCED_EFFICIENCY (1<<0)
+extern int hwloc_internal_cpukinds_register(hwloc_topology_t topology, hwloc_cpuset_t cpuset, int forced_efficiency, const struct hwloc_info_s *infos, unsigned nr_infos, unsigned long flags);
+extern void hwloc_internal_cpukinds_restrict(hwloc_topology_t topology);
+
 /* encode src buffer into target buffer.
 * targsize must be at least 4*((srclength+2)/3)+1.
 * target will be 0-terminated.
--- a/src/3rdparty/hwloc/include/private/xml.h
+++ b/src/3rdparty/hwloc/include/private/xml.h
@@ -46,7 +46,7 @@ struct hwloc_xml_backend_data_s {
  int (*find_child)(struct hwloc__xml_import_state_s * state, struct hwloc__xml_import_state_s * childstate, char **tagp);
  int (*close_tag)(struct hwloc__xml_import_state_s * state); /* look for an explicit closing tag </name> */
  void (*close_child)(struct hwloc__xml_import_state_s * state);
-  int (*get_content)(struct hwloc__xml_import_state_s * state, char **beginp, size_t expected_length); /* return 0 on empty content (and sets beginp to empty string), 1 on actual content, -1 on error or unexpected content length */
+  int (*get_content)(struct hwloc__xml_import_state_s * state, const char **beginp, size_t expected_length); /* return 0 on empty content (and sets beginp to empty string), 1 on actual content, -1 on error or unexpected content length */
  void (*close_content)(struct hwloc__xml_import_state_s * state);
  char * msgprefix;
  void *data; /* libxml2 doc, or nolibxml buffer */
--- a/src/3rdparty/hwloc/src/bind.c
+++ b/src/3rdparty/hwloc/src/bind.c
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2019 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2010, 2012 Université Bordeaux
 * Copyright © 2011-2015 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -921,5 +921,6 @@ hwloc_set_binding_hooks(struct hwloc_topology *topology)
    DO(mem,get_area_membind);
    DO(mem,get_area_memlocation);
    DO(mem,alloc_membind);
+#undef DO
  }
 }
--- a/src/3rdparty/hwloc/src/bitmap.c
+++ b/src/3rdparty/hwloc/src/bitmap.c
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2018 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2011 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -818,7 +818,7 @@ int hwloc_bitmap_nr_ulongs(const struct hwloc_bitmap_s *set)
 		return -1;

 	last = hwloc_bitmap_last(set);
-	return (last + HWLOC_BITS_PER_LONG-1)/HWLOC_BITS_PER_LONG;
+	return (last + HWLOC_BITS_PER_LONG)/HWLOC_BITS_PER_LONG;
 }

 int hwloc_bitmap_only(struct hwloc_bitmap_s * set, unsigned cpu)
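The one-character change in `hwloc_bitmap_nr_ulongs()` above fixes an off-by-one: `last` is the zero-based index of the highest set bit, so the number of words needed is `last / BITS + 1`, not the ceiling of `last / BITS`. A small sketch comparing the two formulas is given below. It assumes 64 bits per `unsigned long` purely for illustration, and the function names are hypothetical.

```c
/* Assumption for illustration: 64-bit unsigned long. */
#define EXAMPLE_BITS_PER_LONG 64

/* Pre-patch formula: undercounts whenever `last` is a multiple of the word size. */
static int nr_ulongs_old(int last)
{
  return (last + EXAMPLE_BITS_PER_LONG - 1) / EXAMPLE_BITS_PER_LONG;
}

/* Patched formula: equivalent to last / BITS + 1 for last >= 0,
 * and yields 0 when last is -1. */
static int nr_ulongs_new(int last)
{
  return (last + EXAMPLE_BITS_PER_LONG) / EXAMPLE_BITS_PER_LONG;
}
```

With only bit 0 set (`last == 0`) the old formula reports 0 words and the new one reports 1; with bit 64 set the old formula reports 1 and the new one reports 2. For indexes such as 63 both agree on 1.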
--- a/src/3rdparty/hwloc/src/cpukinds.c
+++ b/src/3rdparty/hwloc/src/cpukinds.c
@@ -0,0 +1,649 @@
+/*
+ * Copyright © 2020-2021 Inria.  All rights reserved.
+ * See COPYING in top-level directory.
+ */
+
+#include "private/autogen/config.h"
+#include "hwloc.h"
+#include "private/private.h"
+#include "private/debug.h"
+
+
+/*****************
+ * Basics
+ */
+
+void
+hwloc_internal_cpukinds_init(struct hwloc_topology *topology)
+{
+  topology->cpukinds = NULL;
+  topology->nr_cpukinds = 0;
+  topology->nr_cpukinds_allocated = 0;
+}
+
+void
+hwloc_internal_cpukinds_destroy(struct hwloc_topology *topology)
+{
+  unsigned i;
+  for(i=0; i<topology->nr_cpukinds; i++) {
+    struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+    hwloc_bitmap_free(kind->cpuset);
+    hwloc__free_infos(kind->infos, kind->nr_infos);
+  }
+  free(topology->cpukinds);
+  topology->cpukinds = NULL;
+  topology->nr_cpukinds = 0;
+}
+
+int
+hwloc_internal_cpukinds_dup(hwloc_topology_t new, hwloc_topology_t old)
+{
+  struct hwloc_tma *tma = new->tma;
+  struct hwloc_internal_cpukind_s *kinds;
+  unsigned i;
+
+  kinds = hwloc_tma_malloc(tma, old->nr_cpukinds * sizeof(*kinds));
+  if (!kinds)
+    return -1;
+  new->cpukinds = kinds;
+  new->nr_cpukinds = old->nr_cpukinds;
+  memcpy(kinds, old->cpukinds, old->nr_cpukinds * sizeof(*kinds));
+
+  for(i=0;i<old->nr_cpukinds; i++) {
+    kinds[i].cpuset = hwloc_bitmap_tma_dup(tma, old->cpukinds[i].cpuset);
+    if (!kinds[i].cpuset) {
+      new->nr_cpukinds = i;
+      goto failed;
+    }
+    if (hwloc__tma_dup_infos(tma,
+                             &kinds[i].infos, &kinds[i].nr_infos,
+                             old->cpukinds[i].infos, old->cpukinds[i].nr_infos) < 0) {
+      assert(!tma || !tma->dontfree); /* this tma cannot fail to allocate */
+      hwloc_bitmap_free(kinds[i].cpuset);
+      new->nr_cpukinds = i;
+      goto failed;
+    }
+  }
+
+  return 0;
+
+ failed:
+  hwloc_internal_cpukinds_destroy(new);
+  return -1;
+}
+
+void
+hwloc_internal_cpukinds_restrict(hwloc_topology_t topology)
+{
+  unsigned i;
+  int removed = 0;
+  for(i=0; i<topology->nr_cpukinds; i++) {
+    struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+    hwloc_bitmap_and(kind->cpuset, kind->cpuset, hwloc_get_root_obj(topology)->cpuset);
+    if (hwloc_bitmap_iszero(kind->cpuset)) {
+      hwloc_bitmap_free(kind->cpuset);
+      hwloc__free_infos(kind->infos, kind->nr_infos);
+      memmove(kind, kind+1, (topology->nr_cpukinds - i - 1)*sizeof(*kind));
+      i--;
+      topology->nr_cpukinds--;
+      removed = 1;
+    }
+  }
+  if (removed)
+    hwloc_internal_cpukinds_rank(topology);
+}
+
+
+/********************
+ * Registering
+ */
+
+static __hwloc_inline int
+hwloc__cpukind_check_duplicate_info(struct hwloc_internal_cpukind_s *kind,
+                                    const char *name, const char *value)
+{
+  unsigned i;
+  for(i=0; i<kind->nr_infos; i++)
+    if (!strcmp(kind->infos[i].name, name)
+        && !strcmp(kind->infos[i].value, value))
+      return 1;
+  return 0;
+}
+
+static __hwloc_inline void
+hwloc__cpukind_add_infos(struct hwloc_internal_cpukind_s *kind,
+                         const struct hwloc_info_s *infos, unsigned nr_infos)
+{
+  unsigned i;
+  for(i=0; i<nr_infos; i++) {
+    if (hwloc__cpukind_check_duplicate_info(kind, infos[i].name, infos[i].value))
+      continue;
+    hwloc__add_info(&kind->infos, &kind->nr_infos, infos[i].name, infos[i].value);
+  }
+}
+
+int
+hwloc_internal_cpukinds_register(hwloc_topology_t topology, hwloc_cpuset_t cpuset,
+                                 int forced_efficiency,
+                                 const struct hwloc_info_s *infos, unsigned nr_infos,
+                                 unsigned long flags)
+{
+  struct hwloc_internal_cpukind_s *kinds;
+  unsigned i, max, bits, oldnr, newnr;
+
+  if (hwloc_bitmap_iszero(cpuset)) {
+    hwloc_bitmap_free(cpuset);
+    errno = EINVAL;
+    return -1;
+  }
+
+  if (flags & ~HWLOC_CPUKINDS_REGISTER_FLAG_OVERWRITE_FORCED_EFFICIENCY) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  /* TODO: for now, only Windows provides a forced efficiency.
+   * if another backend ever provides a conflicting value, the first backend value will be kept.
+   * (user-provided values are not an issue, they are meant to overwrite)
+   */
+
+  /* If we have N kinds currently, we may need 2N+1 kinds after inserting the new one:
+   * - each existing kind may get split into which PUs are in the new kind and which aren't.
+   * - some PUs might not have been in any kind yet.
+   */
+  max = 2 * topology->nr_cpukinds + 1;
+  /* Allocate the power-of-two above 2N+1. */
+  bits = hwloc_flsl(max-1) + 1;
+  max = 1U<<bits;
+  /* Allocate 8 minimum to avoid multiple reallocs */
+  if (max < 8)
+    max = 8;
+
+  /* Create or enlarge the array of kinds if needed */
+  kinds = topology->cpukinds;
+  if (max > topology->nr_cpukinds_allocated) {
+    kinds = realloc(kinds, max * sizeof(*kinds));
+    if (!kinds) {
+      hwloc_bitmap_free(cpuset);
+      return -1;
+    }
+    memset(&kinds[topology->nr_cpukinds_allocated], 0, (max - topology->nr_cpukinds_allocated) * sizeof(*kinds));
+    topology->nr_cpukinds_allocated = max;
+    topology->cpukinds = kinds;
+  }
+
+  newnr = oldnr = topology->nr_cpukinds;
+  for(i=0; i<oldnr; i++) {
+    int res = hwloc_bitmap_compare_inclusion(cpuset, kinds[i].cpuset);
+    if (res == HWLOC_BITMAP_INTERSECTS || res == HWLOC_BITMAP_INCLUDED) {
+      /* new kind with intersection of cpusets and union of infos */
+      kinds[newnr].cpuset = hwloc_bitmap_alloc();
+      kinds[newnr].efficiency = HWLOC_CPUKIND_EFFICIENCY_UNKNOWN;
+      kinds[newnr].forced_efficiency = forced_efficiency;
+      hwloc_bitmap_and(kinds[newnr].cpuset, cpuset, kinds[i].cpuset);
+      hwloc__cpukind_add_infos(&kinds[newnr], kinds[i].infos, kinds[i].nr_infos);
+      hwloc__cpukind_add_infos(&kinds[newnr], infos, nr_infos);
+      /* remove cpuset PUs from the existing kind that we just split */
+      hwloc_bitmap_andnot(kinds[i].cpuset, kinds[i].cpuset, kinds[newnr].cpuset);
+      /* clear cpuset PUs that were taken care of */
+      hwloc_bitmap_andnot(cpuset, cpuset, kinds[newnr].cpuset);
+
+      newnr++;
+
+    } else if (res == HWLOC_BITMAP_CONTAINS
+               || res == HWLOC_BITMAP_EQUAL) {
+      /* append new info to existing smaller (or equal) kind */
+      hwloc__cpukind_add_infos(&kinds[i], infos, nr_infos);
+      if ((flags & HWLOC_CPUKINDS_REGISTER_FLAG_OVERWRITE_FORCED_EFFICIENCY)
+          || kinds[i].forced_efficiency == HWLOC_CPUKIND_EFFICIENCY_UNKNOWN)
+        kinds[i].forced_efficiency = forced_efficiency;
+      /* clear cpuset PUs that were taken care of */
+      hwloc_bitmap_andnot(cpuset, cpuset, kinds[i].cpuset);
+
+    } else {
+      assert(res == HWLOC_BITMAP_DIFFERENT);
+      /* nothing to do */
+    }
+
+    /* don't compare with anything else if already empty */
+    if (hwloc_bitmap_iszero(cpuset))
+      break;
+  }
+
+  /* add a final kind with remaining PUs if any */
+  if (!hwloc_bitmap_iszero(cpuset)) {
+    kinds[newnr].cpuset = cpuset;
+    kinds[newnr].efficiency = HWLOC_CPUKIND_EFFICIENCY_UNKNOWN;
+    kinds[newnr].forced_efficiency = forced_efficiency;
+    hwloc__cpukind_add_infos(&kinds[newnr], infos, nr_infos);
+    newnr++;
+  } else {
+    hwloc_bitmap_free(cpuset);
+  }
+
+  topology->nr_cpukinds = newnr;
+  return 0;
+}
+
+int
+hwloc_cpukinds_register(hwloc_topology_t topology, hwloc_cpuset_t _cpuset,
+                        int forced_efficiency,
+                        unsigned nr_infos, struct hwloc_info_s *infos,
+                        unsigned long flags)
+{
+  hwloc_bitmap_t cpuset;
+  int err;
+
+  if (flags) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  if (!_cpuset || hwloc_bitmap_iszero(_cpuset)) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  cpuset = hwloc_bitmap_dup(_cpuset);
+  if (!cpuset)
+    return -1;
+
+  if (forced_efficiency < 0)
+    forced_efficiency = HWLOC_CPUKIND_EFFICIENCY_UNKNOWN;
+
+  err = hwloc_internal_cpukinds_register(topology, cpuset, forced_efficiency, infos, nr_infos, HWLOC_CPUKINDS_REGISTER_FLAG_OVERWRITE_FORCED_EFFICIENCY);
+  if (err < 0)
+    return err;
+
+  hwloc_internal_cpukinds_rank(topology);
+  return 0;
+}
+
+
+/*********************
+ * Ranking
+ */
+
+static int
+hwloc__cpukinds_check_duplicate_rankings(struct hwloc_topology *topology)
+{
+  unsigned i,j;
+  for(i=0; i<topology->nr_cpukinds; i++)
+    for(j=i+1; j<topology->nr_cpukinds; j++)
+      if (topology->cpukinds[i].ranking_value == topology->cpukinds[j].ranking_value)
+        /* if any duplicate, fail */
+        return -1;
+  return 0;
+}
+
+static int
+hwloc__cpukinds_try_rank_by_forced_efficiency(struct hwloc_topology *topology)
+{
+  unsigned i;
+
+  hwloc_debug("Trying to rank cpukinds by forced efficiency...\n");
+  for(i=0; i<topology->nr_cpukinds; i++) {
+    if (topology->cpukinds[i].forced_efficiency == HWLOC_CPUKIND_EFFICIENCY_UNKNOWN)
+      /* if any unknown, fail */
+      return -1;
+    topology->cpukinds[i].ranking_value = topology->cpukinds[i].forced_efficiency;
+  }
+
+  return hwloc__cpukinds_check_duplicate_rankings(topology);
+}
+
+struct hwloc_cpukinds_info_summary {
+  int have_max_freq;
+  int have_base_freq;
+  int have_intel_core_type;
+  struct hwloc_cpukind_info_summary {
+    unsigned intel_core_type; /* 1 for atom, 2 for core */
+    unsigned max_freq, base_freq; /* MHz, hence < 100000 */
+  } * summaries;
+};
+
+static void
+hwloc__cpukinds_summarize_info(struct hwloc_topology *topology,
+                               struct hwloc_cpukinds_info_summary *summary)
+{
+  unsigned i, j;
+
+  summary->have_max_freq = 1;
+  summary->have_base_freq = 1;
+  summary->have_intel_core_type = 1;
+
+  for(i=0; i<topology->nr_cpukinds; i++) {
+    struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+    for(j=0; j<kind->nr_infos; j++) {
+      struct hwloc_info_s *info = &kind->infos[j];
+      if (!strcmp(info->name, "FrequencyMaxMHz")) {
+        summary->summaries[i].max_freq = atoi(info->value);
+      } else if (!strcmp(info->name, "FrequencyBaseMHz")) {
+        summary->summaries[i].base_freq = atoi(info->value);
+      } else if (!strcmp(info->name, "CoreType")) {
+        if (!strcmp(info->value, "IntelAtom"))
+          summary->summaries[i].intel_core_type = 1;
+        else if (!strcmp(info->value, "IntelCore"))
+          summary->summaries[i].intel_core_type = 2;
+      }
+    }
+    hwloc_debug("cpukind #%u has intel_core_type %u max_freq %u base_freq %u\n",
+                i, summary->summaries[i].intel_core_type,
+                summary->summaries[i].max_freq, summary->summaries[i].base_freq);
+    if (!summary->summaries[i].base_freq)
+      summary->have_base_freq = 0;
+    if (!summary->summaries[i].max_freq)
+      summary->have_max_freq = 0;
+    if (!summary->summaries[i].intel_core_type)
+      summary->have_intel_core_type = 0;
+  }
+}
+
+enum hwloc_cpukinds_ranking {
+  HWLOC_CPUKINDS_RANKING_DEFAULT, /* forced + frequency on ARM, forced + coretype_frequency otherwise */
+  HWLOC_CPUKINDS_RANKING_NO_FORCED_EFFICIENCY, /* default without forced */
+  HWLOC_CPUKINDS_RANKING_FORCED_EFFICIENCY,
+  HWLOC_CPUKINDS_RANKING_CORETYPE_FREQUENCY,
+  HWLOC_CPUKINDS_RANKING_CORETYPE,
+  HWLOC_CPUKINDS_RANKING_FREQUENCY,
+  HWLOC_CPUKINDS_RANKING_FREQUENCY_MAX,
+  HWLOC_CPUKINDS_RANKING_FREQUENCY_BASE,
+  HWLOC_CPUKINDS_RANKING_NONE
+};
+
+static int
+hwloc__cpukinds_try_rank_by_info(struct hwloc_topology *topology,
+                                 enum hwloc_cpukinds_ranking heuristics,
+                                 struct hwloc_cpukinds_info_summary *summary)
+{
+  unsigned i;
+
+  if (HWLOC_CPUKINDS_RANKING_CORETYPE_FREQUENCY == heuristics) {
+    hwloc_debug("Trying to rank cpukinds by coretype+frequency...\n");
+    /* we need intel_core_type + (base or max freq) for all kinds */
+    if (!summary->have_intel_core_type
+        || (!summary->have_max_freq && !summary->have_base_freq))
+      return -1;
+    /* rank first by coretype (Core>>Atom) then by frequency, base if available, max otherwise */
+    for(i=0; i<topology->nr_cpukinds; i++) {
+      struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+      if (summary->have_base_freq)
+        kind->ranking_value = (summary->summaries[i].intel_core_type << 20) + summary->summaries[i].base_freq;
+      else
+        kind->ranking_value = (summary->summaries[i].intel_core_type << 20) + summary->summaries[i].max_freq;
+    }
+
+  } else if (HWLOC_CPUKINDS_RANKING_CORETYPE == heuristics) {
+    hwloc_debug("Trying to rank cpukinds by coretype...\n");
+    /* we need intel_core_type */
+    if (!summary->have_intel_core_type)
+      return -1;
+    /* rank by coretype (Core>>Atom) */
+    for(i=0; i<topology->nr_cpukinds; i++) {
+      struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+      kind->ranking_value = (summary->summaries[i].intel_core_type << 20);
+    }
+
+  } else if (HWLOC_CPUKINDS_RANKING_FREQUENCY == heuristics) {
+    hwloc_debug("Trying to rank cpukinds by frequency...\n");
+    /* we need base or max freq for all kinds */
+    if (!summary->have_max_freq && !summary->have_base_freq)
+      return -1;
+    /* rank by frequency, base if available, max otherwise */
+    for(i=0; i<topology->nr_cpukinds; i++) {
+      struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+      if (summary->have_base_freq)
+        kind->ranking_value = summary->summaries[i].base_freq;
+      else
+        kind->ranking_value = summary->summaries[i].max_freq;
+    }
+
+  } else if (HWLOC_CPUKINDS_RANKING_FREQUENCY_MAX == heuristics) {
+    hwloc_debug("Trying to rank cpukinds by frequency max...\n");
+    /* we need max freq for all kinds */
+    if (!summary->have_max_freq)
+      return -1;
+    /* rank by max frequency only */
+    for(i=0; i<topology->nr_cpukinds; i++) {
+      struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+      kind->ranking_value = summary->summaries[i].max_freq;
+    }
+
+  } else if (HWLOC_CPUKINDS_RANKING_FREQUENCY_BASE == heuristics) {
+    hwloc_debug("Trying to rank cpukinds by frequency base...\n");
+    /* we need base freq for all kinds */
+    if (!summary->have_base_freq)
+      return -1;
+    /* rank by base frequency only */
+    for(i=0; i<topology->nr_cpukinds; i++) {
+      struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+      kind->ranking_value = summary->summaries[i].base_freq;
+    }
+
+  } else assert(0);
+
+  return hwloc__cpukinds_check_duplicate_rankings(topology);
+}
+
+static int hwloc__cpukinds_compare_ranking_values(const void *_a, const void *_b)
+{
+  const struct hwloc_internal_cpukind_s *a = _a;
+  const struct hwloc_internal_cpukind_s *b = _b;
+  return a->ranking_value - b->ranking_value;
+}
+
+/* this function requires ranking values to be unique */
+static void
+hwloc__cpukinds_finalize_ranking(struct hwloc_topology *topology)
+{
+  unsigned i;
+  /* sort */
+  qsort(topology->cpukinds, topology->nr_cpukinds, sizeof(*topology->cpukinds), hwloc__cpukinds_compare_ranking_values);
+  /* define our own efficiency between 0 and N-1 */
+  for(i=0; i<topology->nr_cpukinds; i++)
+    topology->cpukinds[i].efficiency = i;
+}
+
+int
+hwloc_internal_cpukinds_rank(struct hwloc_topology *topology)
+{
+  enum hwloc_cpukinds_ranking heuristics;
+  char *env;
+  unsigned i;
+  int err;
+
+  if (!topology->nr_cpukinds)
+    return 0;
+
+  if (topology->nr_cpukinds == 1) {
+    topology->cpukinds[0].efficiency = 0;
+    return 0;
+  }
+
+  heuristics = HWLOC_CPUKINDS_RANKING_DEFAULT;
+  env = getenv("HWLOC_CPUKINDS_RANKING");
+  if (env) {
+    if (!strcmp(env, "default"))
+      heuristics = HWLOC_CPUKINDS_RANKING_DEFAULT;
+    else if (!strcmp(env, "none"))
+      heuristics = HWLOC_CPUKINDS_RANKING_NONE;
+    else if (!strcmp(env, "coretype+frequency"))
+      heuristics = HWLOC_CPUKINDS_RANKING_CORETYPE_FREQUENCY;
+    else if (!strcmp(env, "coretype"))
+      heuristics = HWLOC_CPUKINDS_RANKING_CORETYPE;
+    else if (!strcmp(env, "frequency"))
+      heuristics = HWLOC_CPUKINDS_RANKING_FREQUENCY;
+    else if (!strcmp(env, "frequency_max"))
+      heuristics = HWLOC_CPUKINDS_RANKING_FREQUENCY_MAX;
+    else if (!strcmp(env, "frequency_base"))
+      heuristics = HWLOC_CPUKINDS_RANKING_FREQUENCY_BASE;
+    else if (!strcmp(env, "forced_efficiency"))
+      heuristics = HWLOC_CPUKINDS_RANKING_FORCED_EFFICIENCY;
+    else if (!strcmp(env, "no_forced_efficiency"))
+      heuristics = HWLOC_CPUKINDS_RANKING_NO_FORCED_EFFICIENCY;
+    else if (!hwloc_hide_errors())
+      fprintf(stderr, "Failed to recognize HWLOC_CPUKINDS_RANKING value %s\n", env);
+  }
+
+  if (heuristics == HWLOC_CPUKINDS_RANKING_DEFAULT
+      || heuristics == HWLOC_CPUKINDS_RANKING_NO_FORCED_EFFICIENCY) {
+    /* default is forced_efficiency first */
+    struct hwloc_cpukinds_info_summary summary;
+    enum hwloc_cpukinds_ranking subheuristics;
+    const char *arch;
+
+    if (heuristics == HWLOC_CPUKINDS_RANKING_DEFAULT)
+      hwloc_debug("Using default ranking strategy...\n");
+    else
+      hwloc_debug("Using custom ranking strategy from HWLOC_CPUKINDS_RANKING=%s\n", env);
+
+    if (heuristics != HWLOC_CPUKINDS_RANKING_NO_FORCED_EFFICIENCY) {
+      err = hwloc__cpukinds_try_rank_by_forced_efficiency(topology);
+      if (!err)
+        goto ready;
+    }
+
+    summary.summaries = calloc(topology->nr_cpukinds, sizeof(*summary.summaries));
+    if (!summary.summaries)
+      goto failed;
+    hwloc__cpukinds_summarize_info(topology, &summary);
+
+    arch = hwloc_obj_get_info_by_name(topology->levels[0][0], "Architecture");
+    /* TODO: rather coretype_frequency only on x86/Intel? */
+    if (arch && (!strncmp(arch, "arm", 3) || !strncmp(arch, "aarch", 5)))
+      /* then frequency on ARM */
+      subheuristics = HWLOC_CPUKINDS_RANKING_FREQUENCY;
+    else
+      /* or coretype+frequency otherwise */
+      subheuristics = HWLOC_CPUKINDS_RANKING_CORETYPE_FREQUENCY;
+
+    err = hwloc__cpukinds_try_rank_by_info(topology, subheuristics, &summary);
+    free(summary.summaries);
+    if (!err)
+      goto ready;
+
+  } else if (heuristics == HWLOC_CPUKINDS_RANKING_FORCED_EFFICIENCY) {
+    hwloc_debug("Using custom ranking strategy from HWLOC_CPUKINDS_RANKING=%s\n", env);
+
+    err = hwloc__cpukinds_try_rank_by_forced_efficiency(topology);
+    if (!err)
+      goto ready;
+
+  } else if (heuristics != HWLOC_CPUKINDS_RANKING_NONE) {
+    /* custom heuristics */
+    struct hwloc_cpukinds_info_summary summary;
+
+    hwloc_debug("Using custom ranking strategy from HWLOC_CPUKINDS_RANKING=%s\n", env);
+
+    summary.summaries = calloc(topology->nr_cpukinds, sizeof(*summary.summaries));
+    if (!summary.summaries)
+      goto failed;
+    hwloc__cpukinds_summarize_info(topology, &summary);
+
+    err = hwloc__cpukinds_try_rank_by_info(topology, heuristics, &summary);
+    free(summary.summaries);
+    if (!err)
+      goto ready;
+  }
+
+ failed:
+  /* failed to rank, clear efficiencies */
+  for(i=0; i<topology->nr_cpukinds; i++)
+    topology->cpukinds[i].efficiency = HWLOC_CPUKIND_EFFICIENCY_UNKNOWN;
+  hwloc_debug("Failed to rank cpukinds.\n\n");
+  return 0;
+
+ ready:
+  for(i=0; i<topology->nr_cpukinds; i++)
+    hwloc_debug("cpukind #%u got ranking value %llu\n", i, (unsigned long long) topology->cpukinds[i].ranking_value);
+  hwloc__cpukinds_finalize_ranking(topology);
+#ifdef HWLOC_DEBUG
+  for(i=0; i<topology->nr_cpukinds; i++)
+    assert(topology->cpukinds[i].efficiency == (int) i);
+#endif
+  hwloc_debug("\n");
+  return 0;
+}
+
+
+/*****************
+ * Consulting
+ */
+
+int
+hwloc_cpukinds_get_nr(hwloc_topology_t topology, unsigned long flags)
+{
+  if (flags) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  return topology->nr_cpukinds;
+}
+
+int
+hwloc_cpukinds_get_info(hwloc_topology_t topology,
+                        unsigned id,
+                        hwloc_bitmap_t cpuset,
+                        int *efficiencyp,
+                        unsigned *nr_infosp, struct hwloc_info_s **infosp,
+                        unsigned long flags)
+{
+  struct hwloc_internal_cpukind_s *kind;
+
+  if (flags) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  if (id >= topology->nr_cpukinds) {
+    errno = ENOENT;
+    return -1;
+  }
+
+  kind = &topology->cpukinds[id];
+
+  if (cpuset)
+    hwloc_bitmap_copy(cpuset, kind->cpuset);
+
+  if (efficiencyp)
+    *efficiencyp = kind->efficiency;
+
+  if (nr_infosp && infosp) {
+    *nr_infosp = kind->nr_infos;
+    *infosp = kind->infos;
+  }
+  return 0;
+}
+
+int
+hwloc_cpukinds_get_by_cpuset(hwloc_topology_t topology,
+                             hwloc_const_bitmap_t cpuset,
+                             unsigned long flags)
+{
+  unsigned id;
+
+  if (flags) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  if (!cpuset || hwloc_bitmap_iszero(cpuset)) {
+    errno = EINVAL;
+    return -1;
+  }
+
+  for(id=0; id<topology->nr_cpukinds; id++) {
+    struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[id];
+    int res = hwloc_bitmap_compare_inclusion(cpuset, kind->cpuset);
+    if (res == HWLOC_BITMAP_EQUAL || res == HWLOC_BITMAP_INCLUDED) {
+      return (int) id;
+    } else if (res == HWLOC_BITMAP_INTERSECTS || res == HWLOC_BITMAP_CONTAINS) {
+      errno = EXDEV;
+      return -1;
+    }
+  }
+
+  errno = ENOENT;
+  return -1;
+}
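
Reviewer note on `hwloc_internal_cpukinds_register()` in the new file above: registering a cpuset keeps all kinds pairwise disjoint by splitting any existing kind that only partially overlaps the new set, which is why the code reserves room for up to 2N+1 kinds. The sketch below reproduces only that splitting step, under simplifying assumptions: a `uint64_t` mask replaces `hwloc_bitmap_t`, infos and efficiencies are omitted, and `kind_set`, `kinds_register` and `MAX_KINDS` are hypothetical names for illustration.

```c
#include <stdint.h>

#define MAX_KINDS 16  /* hypothetical fixed capacity for this sketch */

struct kind_set {
    uint64_t cpus[MAX_KINDS]; /* one bit per PU, kinds are pairwise disjoint */
    unsigned nr;
};

/* Register `mask`, splitting existing kinds so that all kinds stay disjoint.
 * Returns 0 on success, -1 on an empty mask or when 2N+1 kinds would not fit. */
static int kinds_register(struct kind_set *ks, uint64_t mask)
{
    unsigned i, oldnr = ks->nr, newnr = ks->nr;

    if (!mask || 2 * oldnr + 1 > MAX_KINDS)
        return -1;

    for (i = 0; i < oldnr && mask; i++) {
        uint64_t common = mask & ks->cpus[i];
        if (!common)
            continue;                /* disjoint: nothing to do */
        if (common != ks->cpus[i]) {
            /* partial overlap: move the shared PUs into a new kind */
            ks->cpus[newnr++] = common;
            ks->cpus[i] &= ~common;
        }
        /* else: the existing kind is entirely covered and stays as is
         * (the real code merges the new infos into it) */
        mask &= ~common;             /* these PUs are taken care of */
    }

    if (mask)
        ks->cpus[newnr++] = mask;    /* PUs that were not in any kind yet */

    ks->nr = newnr;
    return 0;
}
```

For example, registering `0x0F` and then `0x3C` leaves three kinds: `0x03` (the remainder of the first), `0x0C` (the shared PUs) and `0x30` (the previously unassigned PUs).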
--- a/src/3rdparty/hwloc/src/diff.c
+++ b/src/3rdparty/hwloc/src/diff.c
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2013-2019 Inria.  All rights reserved.
+ * Copyright © 2013-2020 Inria.  All rights reserved.
 * See COPYING in top-level directory.
 */

@@ -333,10 +333,8 @@ int hwloc_topology_diff_build(hwloc_topology_t topo1,

 	if (!err) {
 		if (SETS_DIFFERENT(allowed_cpuset, topo1, topo2)
-		    || SETS_DIFFERENT(allowed_nodeset, topo1, topo2)) {
-			hwloc_append_diff_too_complex(hwloc_get_root_obj(topo1), diffp, &lastdiff);
-			err = 1;
-		}
+		    || SETS_DIFFERENT(allowed_nodeset, topo1, topo2))
+                  goto roottoocomplex;
 	}

 	if (!err) {
@@ -346,33 +344,78 @@ int hwloc_topology_diff_build(hwloc_topology_t topo1,
 		dist1 = topo1->first_dist;
 		dist2 = topo2->first_dist;
 		while (dist1 || dist2) {
-			if (!!dist1 != !!dist2) {
-				hwloc_append_diff_too_complex(hwloc_get_root_obj(topo1), diffp, &lastdiff);
-				err = 1;
-				break;
-			}
+			if (!!dist1 != !!dist2)
+                          goto roottoocomplex;
 			if (dist1->unique_type != dist2->unique_type
 			    || dist1->different_types || dist2->different_types /* too lazy to support this case */
 			    || dist1->nbobjs != dist2->nbobjs
 			    || dist1->kind != dist2->kind
-			    || memcmp(dist1->values, dist2->values, dist1->nbobjs * dist1->nbobjs * sizeof(*dist1->values))) {
-				hwloc_append_diff_too_complex(hwloc_get_root_obj(topo1), diffp, &lastdiff);
-				err = 1;
-				break;
-			}
+			    || memcmp(dist1->values, dist2->values, dist1->nbobjs * dist1->nbobjs * sizeof(*dist1->values)))
+                          goto roottoocomplex;
 			for(i=0; i<dist1->nbobjs; i++)
 				/* gp_index isn't enforced above. so compare logical_index instead, which is enforced. requires distances refresh() above */
-				if (dist1->objs[i]->logical_index != dist2->objs[i]->logical_index) {
-					hwloc_append_diff_too_complex(hwloc_get_root_obj(topo1), diffp, &lastdiff);
-					err = 1;
-					break;
-				}
+				if (dist1->objs[i]->logical_index != dist2->objs[i]->logical_index)
+                                  goto roottoocomplex;
 			dist1 = dist1->next;
 			dist2 = dist2->next;
 		}
 	}

+        if (!err) {
+          /* memattrs */
+          hwloc_internal_memattrs_refresh(topo1);
+          hwloc_internal_memattrs_refresh(topo2);
+          if (topo1->nr_memattrs != topo2->nr_memattrs)
+            goto roottoocomplex;
+          for(i=0; i<topo1->nr_memattrs; i++) {
+            struct hwloc_internal_memattr_s *imattr1 = &topo1->memattrs[i], *imattr2 = &topo2->memattrs[i];
+            unsigned j;
+           if (strcmp(imattr1->name, imattr2->name)
+                || imattr1->flags != imattr2->flags
+                || imattr1->nr_targets != imattr2->nr_targets)
+              goto roottoocomplex;
+            if (i == HWLOC_MEMATTR_ID_CAPACITY
+                || i == HWLOC_MEMATTR_ID_LOCALITY)
+              /* no need to check virtual attributes, they were refreshed from other topology attributes, checked above */
+              continue;
+            for(j=0; j<imattr1->nr_targets; j++) {
+              struct hwloc_internal_memattr_target_s *imtg1 = &imattr1->targets[j], *imtg2 = &imattr2->targets[j];
+              if (imtg1->type != imtg2->type)
+                goto roottoocomplex;
+              if (imtg1->obj->logical_index != imtg2->obj->logical_index)
+                goto roottoocomplex;
+              if (imattr1->flags & HWLOC_MEMATTR_FLAG_NEED_INITIATOR) {
+                unsigned k;
+                for(k=0; k<imtg1->nr_initiators; k++) {
+                  struct hwloc_internal_memattr_initiator_s *imi1 = &imtg1->initiators[k], *imi2 = &imtg2->initiators[k];
+                  if (imi1->value != imi2->value
+                      || imi1->initiator.type != imi2->initiator.type)
+                    goto roottoocomplex;
+                  if (imi1->initiator.type == HWLOC_LOCATION_TYPE_CPUSET) {
+                    if (!hwloc_bitmap_isequal(imi1->initiator.location.cpuset, imi2->initiator.location.cpuset))
+                      goto roottoocomplex;
+                  } else if (imi1->initiator.type == HWLOC_LOCATION_TYPE_OBJECT) {
+                    if (imi1->initiator.location.object.type != imi2->initiator.location.object.type)
+                      goto roottoocomplex;
+                    if (imi1->initiator.location.object.obj->logical_index != imi2->initiator.location.object.obj->logical_index)
+                      goto roottoocomplex;
+                  } else {
+                    assert(0);
+                  }
+                }
+              } else {
+                if (imtg1->noinitiator_value != imtg2->noinitiator_value)
+                  goto roottoocomplex;
+              }
+            }
+          }
+        }
+
 	return err;
+
+ roottoocomplex:
+  hwloc_append_diff_too_complex(hwloc_get_root_obj(topo1), diffp, &lastdiff);
+  return 1;
 }

 /********************
--- a/src/3rdparty/hwloc/src/distances.c
+++ b/src/3rdparty/hwloc/src/distances.c
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2010-2019 Inria.  All rights reserved.
+ * Copyright © 2010-2020 Inria.  All rights reserved.
 * Copyright © 2011-2012 Université Bordeaux
 * Copyright © 2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -526,36 +526,6 @@ int hwloc_distances_add(hwloc_topology_t topology,
 * Refresh objects in distances
 */

-static hwloc_obj_t hwloc_find_obj_by_depth_and_gp_index(hwloc_topology_t topology, unsigned depth, uint64_t gp_index)
-{
-  hwloc_obj_t obj = hwloc_get_obj_by_depth(topology, depth, 0);
-  while (obj) {
-    if (obj->gp_index == gp_index)
-      return obj;
-    obj = obj->next_cousin;
-  }
-  return NULL;
-}
-
-static hwloc_obj_t hwloc_find_obj_by_type_and_gp_index(hwloc_topology_t topology, hwloc_obj_type_t type, uint64_t gp_index)
-{
-  int depth = hwloc_get_type_depth(topology, type);
-  if (depth == HWLOC_TYPE_DEPTH_UNKNOWN)
-    return NULL;
-  if (depth == HWLOC_TYPE_DEPTH_MULTIPLE) {
-    int topodepth = hwloc_topology_get_depth(topology);
-    for(depth=0; depth<topodepth; depth++) {
-      if (hwloc_get_depth_type(topology, depth) == type) {
-	hwloc_obj_t obj = hwloc_find_obj_by_depth_and_gp_index(topology, depth, gp_index);
-	if (obj)
-	  return obj;
-      }
-    }
-    return NULL;
-  }
-  return hwloc_find_obj_by_depth_and_gp_index(topology, depth, gp_index);
-}
-
 static void
 hwloc_internal_distances_restrict(hwloc_obj_t *objs,
 				  uint64_t *indexes,
@@ -612,7 +582,7 @@ hwloc_internal_distances_refresh_one(hwloc_topology_t topology,
      else
 	abort();
    } else {
-      obj = hwloc_find_obj_by_type_and_gp_index(topology, different_types ? different_types[i] : unique_type, indexes[i]);
+      obj = hwloc_get_obj_by_type_and_gp_index(topology, different_types ? different_types[i] : unique_type, indexes[i]);
    }
    objs[i] = obj;
    if (!obj)
@@ -874,26 +844,6 @@ hwloc_distances_get_by_type(hwloc_topology_t topology, hwloc_obj_type_t type,
 * Grouping objects according to distances
 */

-static void hwloc_report_user_distance_error(const char *msg, int line)
-{
-  static int reported = 0;
-
-  if (!reported && !hwloc_hide_errors()) {
-    fprintf(stderr, "****************************************************************************\n");
-    fprintf(stderr, "* hwloc %s was given invalid distances by the user.\n", HWLOC_VERSION);
-    fprintf(stderr, "*\n");
-    fprintf(stderr, "* %s\n", msg);
-    fprintf(stderr, "* Error occurred in topology.c line %d\n", line);
-    fprintf(stderr, "*\n");
-    fprintf(stderr, "* Please make sure that distances given through the programming API\n");
-    fprintf(stderr, "* do not contradict any other topology information.\n");
-    fprintf(stderr, "* \n");
-    fprintf(stderr, "* hwloc will now ignore this invalid topology information and continue.\n");
-    fprintf(stderr, "****************************************************************************\n");
-    reported = 1;
-  }
-}
-
 static int hwloc_compare_values(uint64_t a, uint64_t b, float accuracy)
 {
  if (accuracy != 0.0f && fabsf((float)a-(float)b) < (float)a * accuracy)
@@ -1086,7 +1036,7 @@ hwloc__groups_by_distances(struct hwloc_topology *topology,
          hwloc_debug_1arg_bitmap("adding Group object with %u objects and cpuset %s\n",
                                  groupsizes[i], group_obj->cpuset);
          res_obj = hwloc__insert_object_by_cpuset(topology, NULL, group_obj,
-						   (kind & HWLOC_DISTANCES_KIND_FROM_USER) ? hwloc_report_user_distance_error : hwloc_report_os_error);
+                                                   (kind & HWLOC_DISTANCES_KIND_FROM_USER) ? "distances:fromuser:group" : "distances:group");
 	  /* res_obj may be NULL on failure to insert. */
 	  if (!res_obj)
 	    failed++;
--- a/src/3rdparty/hwloc/src/memattrs.c
+++ b/src/3rdparty/hwloc/src/memattrs.c
--- a/src/3rdparty/hwloc/src/misc.c
+++ b/src/3rdparty/hwloc/src/misc.c
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2018 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2010 Université Bordeaux
 * Copyright © 2009-2018 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -114,7 +114,7 @@ void hwloc_add_uname_info(struct hwloc_topology *topology __hwloc_attribute_unus
 char *
 hwloc_progname(struct hwloc_topology *topology __hwloc_attribute_unused)
 {
-#if HAVE_DECL_GETMODULEFILENAME
+#if (defined HAVE_DECL_GETMODULEFILENAME) && HAVE_DECL_GETMODULEFILENAME
  char name[256], *local_basename;
  unsigned res = GetModuleFileName(NULL, name, sizeof(name));
  if (res == sizeof(name) || !res)
--- a/src/3rdparty/hwloc/src/pci-common.c
+++ b/src/3rdparty/hwloc/src/pci-common.c
@@ -232,7 +232,8 @@ enum hwloc_pci_busid_comparison_e {
  HWLOC_PCI_BUSID_LOWER,
  HWLOC_PCI_BUSID_HIGHER,
  HWLOC_PCI_BUSID_INCLUDED,
-  HWLOC_PCI_BUSID_SUPERSET
+  HWLOC_PCI_BUSID_SUPERSET,
+  HWLOC_PCI_BUSID_EQUAL
 };

 static enum hwloc_pci_busid_comparison_e
@@ -274,11 +275,8 @@ hwloc_pci_compare_busids(struct hwloc_obj *a, struct hwloc_obj *b)
  if (a->attr->pcidev.func > b->attr->pcidev.func)
    return HWLOC_PCI_BUSID_HIGHER;

-  /* Should never reach here.  Abort on both debug builds and
-     non-debug builds */
-  assert(0);
-  fprintf(stderr, "Bad assertion in hwloc %s:%d (aborting)\n", __FILE__, __LINE__);
-  exit(1);
+  /* Identical bus IDs: only reached when given invalid (duplicate) PCI information. */
+  return HWLOC_PCI_BUSID_EQUAL;
 }

 static void
@@ -329,6 +327,23 @@ hwloc_pci_add_object(struct hwloc_obj *parent, struct hwloc_obj **parent_io_firs
      }
      return;
    }
+    case HWLOC_PCI_BUSID_EQUAL: {
+      static int reported = 0;
+      if (!reported && !hwloc_hide_errors()) {
+        fprintf(stderr, "*********************************************************\n");
+        fprintf(stderr, "* hwloc %s received invalid PCI information.\n", HWLOC_VERSION);
+        fprintf(stderr, "*\n");
+        fprintf(stderr, "* Trying to insert PCI object %04x:%02x:%02x.%01x at %04x:%02x:%02x.%01x\n",
+                new->attr->pcidev.domain, new->attr->pcidev.bus, new->attr->pcidev.dev, new->attr->pcidev.func,
+                (*curp)->attr->pcidev.domain, (*curp)->attr->pcidev.bus, (*curp)->attr->pcidev.dev, (*curp)->attr->pcidev.func);
+        fprintf(stderr, "*\n");
+        fprintf(stderr, "* hwloc will now ignore this object and continue.\n");
+        fprintf(stderr, "*********************************************************\n");
+        reported = 1;
+      }
+      hwloc_free_unlinked_object(new);
+      return;
+    }
    }
  }
  /* add to the end of the list if higher than everybody */
@@ -425,39 +440,10 @@ hwloc_pcidisc_add_hostbridges(struct hwloc_topology *topology,

 static struct hwloc_obj *
 hwloc_pci_fixup_busid_parent(struct hwloc_topology *topology __hwloc_attribute_unused,
-			     struct hwloc_pcidev_attr_s *busid,
-			     struct hwloc_obj *parent)
+			     struct hwloc_pcidev_attr_s *busid __hwloc_attribute_unused,
+			     struct hwloc_obj *parent __hwloc_attribute_unused)
 {
-  /* Xeon E5v3 in cluster-on-die mode only have PCI on the first NUMA node of each package.
-   * but many dual-processor host report the second PCI hierarchy on 2nd NUMA of first package.
-   */
-  if (parent->depth >= 2
-      && parent->type == HWLOC_OBJ_NUMANODE
-      && parent->sibling_rank == 1 && parent->parent->arity == 2
-      && parent->parent->type == HWLOC_OBJ_PACKAGE
-      && parent->parent->sibling_rank == 0 && parent->parent->parent->arity == 2) {
-    const char *cpumodel = hwloc_obj_get_info_by_name(parent->parent, "CPUModel");
-    if (cpumodel && strstr(cpumodel, "Xeon")) {
-      if (!hwloc_hide_errors()) {
-	fprintf(stderr, "****************************************************************************\n");
-	fprintf(stderr, "* hwloc %s has encountered an incorrect PCI locality information.\n", HWLOC_VERSION);
-	fprintf(stderr, "* PCI bus %04x:%02x is supposedly close to 2nd NUMA node of 1st package,\n",
-		busid->domain, busid->bus);
-	fprintf(stderr, "* however hwloc believes this is impossible on this architecture.\n");
-	fprintf(stderr, "* Therefore the PCI bus will be moved to 1st NUMA node of 2nd package.\n");
-	fprintf(stderr, "*\n");
-	fprintf(stderr, "* If you feel this fixup is wrong, disable it by setting in your environment\n");
-	fprintf(stderr, "* HWLOC_PCI_%04x_%02x_LOCALCPUS= (empty value), and report the problem\n",
-		busid->domain, busid->bus);
-	fprintf(stderr, "* to the hwloc's user mailing list together with the XML output of lstopo.\n");
-	fprintf(stderr, "*\n");
-	fprintf(stderr, "* You may silence this message by setting HWLOC_HIDE_ERRORS=1 in your environment.\n");
-	fprintf(stderr, "****************************************************************************\n");
-      }
-      return parent->parent->next_sibling->first_child;
-    }
-  }
-
+  /* no quirk for now */
  return parent;
 }

--- a/src/3rdparty/hwloc/src/shmem.c
+++ b/src/3rdparty/hwloc/src/shmem.c
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2017-2019 Inria.  All rights reserved.
+ * Copyright © 2017-2020 Inria.  All rights reserved.
 * See COPYING in top-level directory.
 */

@@ -97,6 +97,7 @@ hwloc_shmem_topology_write(hwloc_topology_t topology,
   * without being able to free() them.
   */
  hwloc_internal_distances_refresh(topology);
+  hwloc_internal_memattrs_refresh(topology);

  header.header_version = HWLOC_SHMEM_HEADER_VERSION;
  header.header_length = sizeof(header);
@@ -134,8 +135,9 @@ hwloc_shmem_topology_write(hwloc_topology_t topology,

  assert((char *)mmap_res <= (char *)mmap_address + length);

-  /* now refresh the new distances so that adopters can use them without refreshing the R/O shmem mapping */
+  /* now refresh the new distances/memattrs so that adopters can use them without refreshing the R/O shmem mapping */
  hwloc_internal_distances_refresh(new);
+  hwloc_internal_memattrs_refresh(topology);

  /* topology is saved, release resources now */
  munmap(mmap_address, length);
@@ -214,11 +216,13 @@ hwloc_shmem_topology_adopt(hwloc_topology_t *topologyp,
  new->support.discovery = malloc(sizeof(*new->support.discovery));
  new->support.cpubind = malloc(sizeof(*new->support.cpubind));
  new->support.membind = malloc(sizeof(*new->support.membind));
-  if (!new->support.discovery || !new->support.cpubind || !new->support.membind)
+  new->support.misc = malloc(sizeof(*new->support.misc));
+  if (!new->support.discovery || !new->support.cpubind || !new->support.membind || !new->support.misc)
    goto out_with_support;
  memcpy(new->support.discovery, old->support.discovery, sizeof(*new->support.discovery));
  memcpy(new->support.cpubind, old->support.cpubind, sizeof(*new->support.cpubind));
  memcpy(new->support.membind, old->support.membind, sizeof(*new->support.membind));
+  memcpy(new->support.misc, old->support.misc, sizeof(*new->support.misc));
  hwloc_set_binding_hooks(new);
  /* clear userdata callbacks pointing to the writer process' functions */
  new->userdata_export_cb = NULL;
@@ -236,6 +240,7 @@ hwloc_shmem_topology_adopt(hwloc_topology_t *topologyp,
  free(new->support.discovery);
  free(new->support.cpubind);
  free(new->support.membind);
+  free(new->support.misc);
  free(new);
 out_with_components:
  hwloc_components_fini();
@@ -252,6 +257,7 @@ hwloc__topology_disadopt(hwloc_topology_t topology)
  free(topology->support.discovery);
  free(topology->support.cpubind);
  free(topology->support.membind);
+  free(topology->support.misc);
  free(topology);
 }

--- a/src/3rdparty/hwloc/src/topology-synthetic.c
+++ b/src/3rdparty/hwloc/src/topology-synthetic.c
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2019 Inria.  All rights reserved.
+ * Copyright © 2009-2020 Inria.  All rights reserved.
 * Copyright © 2009-2010 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -471,7 +471,7 @@ hwloc_backend_synthetic_init(struct hwloc_synthetic_backend_data_s *data,
    /* initialize parent arity to 0 so that the levels are not infinite */
    data->level[count-1].arity = 0;

-    while (*pos == ' ')
+    while (*pos == ' ' || *pos == '\n')
      pos++;

    if (!*pos)
@@ -912,7 +912,7 @@ hwloc_synthetic_insert_attached(struct hwloc_topology *topology,

  hwloc_synthetic_set_attr(&attached->attr, child);

-  hwloc_insert_object_by_cpuset(topology, child);
+  hwloc__insert_object_by_cpuset(topology, NULL, child, "synthetic:attached");

  hwloc_synthetic_insert_attached(topology, data, attached->next, set);
 }
@@ -964,7 +964,7 @@ hwloc__look_synthetic(struct hwloc_topology *topology,

    hwloc_synthetic_set_attr(&curlevel->attr, obj);

-    hwloc_insert_object_by_cpuset(topology, obj);
+    hwloc__insert_object_by_cpuset(topology, NULL, obj, "synthetic");
  }

  hwloc_synthetic_insert_attached(topology, data, curlevel->attached, set);
--- a/src/3rdparty/hwloc/src/topology-windows.c
+++ b/src/3rdparty/hwloc/src/topology-windows.c
@@ -93,9 +93,10 @@ typedef struct _GROUP_AFFINITY {
 #endif

 #ifndef HAVE_PROCESSOR_RELATIONSHIP
-typedef struct _PROCESSOR_RELATIONSHIP {
+typedef struct HWLOC_PROCESSOR_RELATIONSHIP {
  BYTE Flags;
-  BYTE Reserved[21];
+  BYTE EfficiencyClass; /* for RelationProcessorCore, higher means greater performance but less efficiency, only available in Win10+ */
+  BYTE Reserved[20];
  WORD GroupCount;
  GROUP_AFFINITY GroupMask[ANYSIZE_ARRAY];
 } PROCESSOR_RELATIONSHIP, *PPROCESSOR_RELATIONSHIP;
@@ -228,9 +229,12 @@ static PFN_VIRTUALFREEEX VirtualFreeExProc;
 typedef BOOL (WINAPI *PFN_QUERYWORKINGSETEX)(HANDLE hProcess, PVOID pv, DWORD cb);
 static PFN_QUERYWORKINGSETEX QueryWorkingSetExProc;

+typedef NTSTATUS (WINAPI *PFN_RTLGETVERSION)(OSVERSIONINFOEX*);
+PFN_RTLGETVERSION RtlGetVersionProc;
+
 static void hwloc_win_get_function_ptrs(void)
 {
-    HMODULE kernel32;
+  HMODULE kernel32, ntdll;

 #if HWLOC_HAVE_GCC_W_CAST_FUNCTION_TYPE
 #pragma GCC diagnostic ignored "-Wcast-function-type"
@@ -275,6 +279,9 @@ static void hwloc_win_get_function_ptrs(void)
        QueryWorkingSetExProc = (PFN_QUERYWORKINGSETEX) GetProcAddress(psapi, "QueryWorkingSetEx");
    }

+    ntdll = GetModuleHandle("ntdll");
+    RtlGetVersionProc = (PFN_RTLGETVERSION) GetProcAddress(ntdll, "RtlGetVersion");
+
 #if HWLOC_HAVE_GCC_W_CAST_FUNCTION_TYPE
 #pragma GCC diagnostic warning "-Wcast-function-type"
 #endif
@@ -734,6 +741,88 @@ hwloc_win_get_area_memlocation(hwloc_topology_t topology __hwloc_attribute_unuse
 }


+
+/*************************
+ * Efficiency classes
+ */
+
+struct hwloc_win_efficiency_classes {
+  unsigned nr_classes;
+  unsigned nr_classes_allocated;
+  struct hwloc_win_efficiency_class {
+    unsigned value;
+    hwloc_bitmap_t cpuset;
+  } *classes;
+};
+
+static void
+hwloc_win_efficiency_classes_init(struct hwloc_win_efficiency_classes *classes)
+{
+  classes->classes = NULL;
+  classes->nr_classes_allocated = 0;
+  classes->nr_classes = 0;
+}
+
+static int
+hwloc_win_efficiency_classes_add(struct hwloc_win_efficiency_classes *classes,
+                                 hwloc_const_bitmap_t cpuset,
+                                 unsigned value)
+{
+  unsigned i;
+
+  /* look for existing class with that efficiency value */
+  for(i=0; i<classes->nr_classes; i++) {
+    if (classes->classes[i].value == value) {
+      hwloc_bitmap_or(classes->classes[i].cpuset, classes->classes[i].cpuset, cpuset);
+      return 0;
+    }
+  }
+
+  /* extend the array if needed */
+  if (classes->nr_classes == classes->nr_classes_allocated) {
+    struct hwloc_win_efficiency_class *tmp;
+    unsigned new_nr_allocated = 2*classes->nr_classes_allocated;
+    if (!new_nr_allocated) {
+#define HWLOC_WIN_EFFICIENCY_CLASSES_DEFAULT_MAX 4 /* 2 should be enough in most cases */
+      new_nr_allocated = HWLOC_WIN_EFFICIENCY_CLASSES_DEFAULT_MAX;
+    }
+    tmp = realloc(classes->classes, new_nr_allocated * sizeof(*classes->classes));
+    if (!tmp)
+      return -1;
+    classes->classes = tmp;
+    classes->nr_classes_allocated = new_nr_allocated;
+  }
+
+  /* add new class */
+  classes->classes[classes->nr_classes].cpuset = hwloc_bitmap_alloc();
+  if (!classes->classes[classes->nr_classes].cpuset)
+    return -1;
+  classes->classes[classes->nr_classes].value = value;
+  hwloc_bitmap_copy(classes->classes[classes->nr_classes].cpuset, cpuset);
+  classes->nr_classes++;
+  return 0;
+}
+
+static void
+hwloc_win_efficiency_classes_register(hwloc_topology_t topology,
+                                      struct hwloc_win_efficiency_classes *classes)
+{
+  unsigned i;
+  for(i=0; i<classes->nr_classes; i++) {
+    hwloc_internal_cpukinds_register(topology, classes->classes[i].cpuset, classes->classes[i].value, NULL, 0, 0);
+    classes->classes[i].cpuset = NULL; /* given to cpukinds */
+  }
+}
+
+static void
+hwloc_win_efficiency_classes_destroy(struct hwloc_win_efficiency_classes *classes)
+{
+  unsigned i;
+  for(i=0; i<classes->nr_classes; i++)
+    hwloc_bitmap_free(classes->classes[i].cpuset);
+  free(classes->classes);
+}
+
 /*************************
 * discovery
 */
@@ -753,6 +842,12 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
  DWORD length;
  int gotnuma = 0;
  int gotnumamemory = 0;
+  OSVERSIONINFOEX osvi;
+  char versionstr[20];
+  char hostname[122] = "";
+  unsigned hostname_size = sizeof(hostname);
+  int has_efficiencyclass = 0;
+  struct hwloc_win_efficiency_classes eclasses;

  assert(dstatus->phase == HWLOC_DISC_PHASE_CPU);

@@ -760,6 +855,25 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
    /* somebody discovered things */
    return -1;

+  ZeroMemory(&osvi, sizeof(OSVERSIONINFOEX));
+  osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFOEX);
+
+  if (RtlGetVersionProc) {
+    /* RtlGetVersion() returns the currently-running Windows version */
+    RtlGetVersionProc(&osvi);
+  } else {
+    /* GetVersionEx() and IsWindows10OrGreater() depend on what the manifest says
+     * (manifest of the program, not of libhwloc.dll), they may return old versions
+     * if the currently-running Windows is not listed in the manifest.
+     */
+    GetVersionEx((LPOSVERSIONINFO)&osvi);
+  }
+
+  if (osvi.dwMajorVersion >= 10) {
+    has_efficiencyclass = 1;
+    hwloc_win_efficiency_classes_init(&eclasses);
+  }
+
  hwloc_alloc_root_sets(topology->levels[0][0]);

  GetSystemInfo(&SystemInfo);
@@ -887,7 +1001,7 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
 	  default:
 	    break;
 	}
-	hwloc_insert_object_by_cpuset(topology, obj);
+	hwloc__insert_object_by_cpuset(topology, NULL, obj, "windows:GetLogicalProcessorInformation");
      }

      free(procInfo);
@@ -919,6 +1033,7 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
 	   (void*) procInfo < (void*) ((uintptr_t) procInfoTotal + length);
 	   procInfo = (void*) ((uintptr_t) procInfo + procInfo->Size)) {
        unsigned num, i;
+        unsigned efficiency_class = 0;
        GROUP_AFFINITY *GroupMask;

        /* Ignore unknown caches */
@@ -953,6 +1068,11 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
 	    type = HWLOC_OBJ_CORE;
            num = procInfo->Processor.GroupCount;
            GroupMask = procInfo->Processor.GroupMask;
+            if (has_efficiencyclass)
+              /* the EfficiencyClass field didn't exist before Windows10 and recent MSVC headers,
+               * so just access it manually instead of trying to detect it.
+               */
+              efficiency_class = * ((&procInfo->Processor.Flags) + 1);
 	    break;
 	  case RelationGroup:
 	    /* So strange an interface... */
@@ -981,7 +1101,7 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
 		obj = hwloc_alloc_setup_object(topology, HWLOC_OBJ_GROUP, id);
 		obj->cpuset = set;
 		obj->attr->group.kind = HWLOC_GROUP_KIND_WINDOWS_PROCESSOR_GROUP;
-		hwloc_insert_object_by_cpuset(topology, obj);
+		hwloc__insert_object_by_cpuset(topology, NULL, obj, "windows:GetLogicalProcessorInformation:ProcessorGroup");
 	      } else
 		hwloc_bitmap_free(set);
 	    }
@@ -1005,6 +1125,11 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
        }
 	hwloc_debug_2args_bitmap("%s#%u bitmap %s\n", hwloc_obj_type_string(type), id, obj->cpuset);
 	switch (type) {
+        case HWLOC_OBJ_CORE: {
+          if (has_efficiencyclass)
+            hwloc_win_efficiency_classes_add(&eclasses, obj->cpuset, efficiency_class);
+          break;
+        }
 	  case HWLOC_OBJ_NUMANODE:
 	    {
 	      ULONGLONG avail;
@@ -1055,7 +1180,7 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
 	  default:
 	    break;
 	}
-	hwloc_insert_object_by_cpuset(topology, obj);
+	hwloc__insert_object_by_cpuset(topology, NULL, obj, "windows:GetLogicalProcessorInformationEx");
      }
      free(procInfoTotal);
  }
@@ -1076,29 +1201,88 @@ hwloc_look_windows(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
      hwloc_bitmap_only(obj->cpuset, idx);
      hwloc_debug_1arg_bitmap("cpu %u has cpuset %s\n",
 			      idx, obj->cpuset);
-      hwloc_insert_object_by_cpuset(topology, obj);
+      hwloc__insert_object_by_cpuset(topology, NULL, obj, "windows:ProcessorGroup:pu");
    } hwloc_bitmap_foreach_end();
    hwloc_bitmap_free(groups_pu_set);
  } else {
    /* no processor groups */
-    SYSTEM_INFO sysinfo;
    hwloc_obj_t obj;
    unsigned idx;
-    GetSystemInfo(&sysinfo);
    for(idx=0; idx<32; idx++)
-      if (sysinfo.dwActiveProcessorMask & (((DWORD_PTR)1)<<idx)) {
+      if (SystemInfo.dwActiveProcessorMask & (((DWORD_PTR)1)<<idx)) {
 	obj = hwloc_alloc_setup_object(topology, HWLOC_OBJ_PU, idx);
 	obj->cpuset = hwloc_bitmap_alloc();
 	hwloc_bitmap_only(obj->cpuset, idx);
 	hwloc_debug_1arg_bitmap("cpu %u has cpuset %s\n",
 				idx, obj->cpuset);
-	hwloc_insert_object_by_cpuset(topology, obj);
+	hwloc__insert_object_by_cpuset(topology, NULL, obj, "windows:pu");
      }
  }

+  if (has_efficiencyclass) {
+    topology->support.discovery->cpukind_efficiency = 1;
+    hwloc_win_efficiency_classes_register(topology, &eclasses);
+  }
+
 out:
+  if (has_efficiencyclass)
+    hwloc_win_efficiency_classes_destroy(&eclasses);
+
+  /* emulate uname instead of calling hwloc_add_uname_info() */
  hwloc_obj_add_info(topology->levels[0][0], "Backend", "Windows");
-  hwloc_add_uname_info(topology, NULL);
+  hwloc_obj_add_info(topology->levels[0][0], "OSName", "Windows");
+
+#if defined(__CYGWIN__)
+  hwloc_obj_add_info(topology->levels[0][0], "WindowsBuildEnvironment", "Cygwin");
+#elif defined(__MINGW32__)
+  hwloc_obj_add_info(topology->levels[0][0], "WindowsBuildEnvironment", "MinGW");
+#endif
+
+  /* see https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-osversioninfoexa */
+  if (osvi.dwMajorVersion == 10) {
+    if (osvi.dwMinorVersion == 0)
+      hwloc_obj_add_info(topology->levels[0][0], "OSRelease", "10");
+  } else if (osvi.dwMajorVersion == 6) {
+    if (osvi.dwMinorVersion == 3)
+      hwloc_obj_add_info(topology->levels[0][0], "OSRelease", "8.1"); /* or "Server 2012 R2" */
+    else if (osvi.dwMinorVersion == 2)
+      hwloc_obj_add_info(topology->levels[0][0], "OSRelease", "8"); /* or "Server 2012" */
+    else if (osvi.dwMinorVersion == 1)
+      hwloc_obj_add_info(topology->levels[0][0], "OSRelease", "7"); /* or "Server 2008 R2" */
+    else if (osvi.dwMinorVersion == 0)
+      hwloc_obj_add_info(topology->levels[0][0], "OSRelease", "Vista"); /* or "Server 2008" */
+  } /* earlier versions are ignored */
+
+  snprintf(versionstr, sizeof(versionstr), "%u.%u.%u", osvi.dwMajorVersion, osvi.dwMinorVersion, osvi.dwBuildNumber);
+  hwloc_obj_add_info(topology->levels[0][0], "OSVersion", versionstr);
+
+#if !defined(__CYGWIN__)
+  GetComputerName(hostname, &hostname_size);
+#else
+  gethostname(hostname, hostname_size);
+#endif
+  if (*hostname)
+    hwloc_obj_add_info(topology->levels[0][0], "Hostname", hostname);
+
+  /* convert to unix-like architecture strings */
+  switch (SystemInfo.wProcessorArchitecture) {
+  case 0:
+    hwloc_obj_add_info(topology->levels[0][0], "Architecture", "i686");
+    break;
+  case 9:
+    hwloc_obj_add_info(topology->levels[0][0], "Architecture", "x86_64");
+    break;
+  case 5:
+    hwloc_obj_add_info(topology->levels[0][0], "Architecture", "arm");
+    break;
+  case 12:
+    hwloc_obj_add_info(topology->levels[0][0], "Architecture", "arm64");
+    break;
+  case 6:
+    hwloc_obj_add_info(topology->levels[0][0], "Architecture", "ia64");
+    break;
+  }
+
  return 0;
 }

--- a/src/3rdparty/hwloc/src/topology-x86.c
+++ b/src/3rdparty/hwloc/src/topology-x86.c
@@ -1,5 +1,5 @@
 /*
- * Copyright © 2010-2019 Inria.  All rights reserved.
+ * Copyright © 2010-2021 Inria.  All rights reserved.
 * Copyright © 2010-2013 Université Bordeaux
 * Copyright © 2010-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -181,6 +181,7 @@ enum hwloc_x86_disc_flags {

 #define has_topoext(features) ((features)[6] & (1 << 22))
 #define has_x2apic(features) ((features)[4] & (1 << 21))
+#define has_hybrid(features) ((features)[18] & (1 << 15))

 struct cacheinfo {
  hwloc_obj_cache_type_t type;
@@ -217,6 +218,9 @@ struct procinfo {
  unsigned cpustepping;
  unsigned cpumodelnumber;
  unsigned cpufamilynumber;
+
+  unsigned hybridcoretype;
+  unsigned hybridnativemodel;
 };

 enum cpuid_type {
@@ -681,6 +685,15 @@ static void look_proc(struct hwloc_backend *backend, struct procinfo *infos, uns
    }
  }

+  if (highest_cpuid >= 0x1a && has_hybrid(features)) {
+    /* Get hybrid cpu information from cpuid 0x1a */
+    eax = 0x1a;
+    ecx = 0;
+    cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump);
+    infos->hybridcoretype = eax >> 24;
+    infos->hybridnativemodel = eax & 0xffffff;
+  }
+
  /*********************************************************************************
   * Get the hierarchy of thread, core, die, package, etc. from CPU-specific leaves
   */
@@ -751,7 +764,13 @@ static void look_proc(struct hwloc_backend *backend, struct procinfo *infos, uns
    /* default cacheid value */
    cache->cacheid = infos->apicid / cache->nbthreads_sharing;

-    if (cpuid_type == amd) {
+    if (cpuid_type == intel) {
+      /* round nbthreads_sharing to nearest power of two to build a mask (for clearing lower bits) */
+      unsigned bits = hwloc_flsl(cache->nbthreads_sharing-1);
+      unsigned mask = ~((1U<<bits) - 1);
+      cache->cacheid = infos->apicid & mask;
+
+    } else if (cpuid_type == amd) {
      /* AMD quirks */
      if (infos->cpufamilynumber == 0x17
 	  && cache->level == 3 && cache->nbthreads_sharing == 6) {
@@ -872,7 +891,7 @@ hwloc_x86_add_groups(hwloc_topology_t topology,
    obj->attr->group.dont_merge = dont_merge;
    hwloc_debug_2args_bitmap("os %s %u has cpuset %s\n",
 			     subtype, id, obj_cpuset);
-    hwloc_insert_object_by_cpuset(topology, obj);
+    hwloc__insert_object_by_cpuset(topology, NULL, obj, "x86:group");
  }
 }

@@ -889,6 +908,16 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns
  int gotnuma = 0;
  int fulldiscovery = (flags & HWLOC_X86_DISC_FLAG_FULL);

+#ifdef HWLOC_DEBUG
+  hwloc_debug("\nSummary of x86 CPUID topology:\n");
+  for(i=0; i<nbprocs; i++) {
+    hwloc_debug("PU %u present=%u apicid=%u on PKG %d CORE %d DIE %d NODE %d\n",
+                i, infos[i].present, infos[i].apicid,
+                infos[i].ids[PKG], infos[i].ids[CORE], infos[i].ids[DIE], infos[i].ids[NODE]);
+  }
+  hwloc_debug("\n");
+#endif
+
  for (i = 0; i < nbprocs; i++)
    if (infos[i].present) {
      hwloc_bitmap_set(complete_cpuset, i);
@@ -930,7 +959,7 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns

 	hwloc_debug_1arg_bitmap("os package %u has cpuset %s\n",
 				packageid, package_cpuset);
-	hwloc_insert_object_by_cpuset(topology, package);
+	hwloc__insert_object_by_cpuset(topology, NULL, package, "x86:package");

      } else {
 	/* Annotate packages previously-existing packages */
@@ -986,7 +1015,7 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns
      hwloc_bitmap_set(node->nodeset, nodeid);
      hwloc_debug_1arg_bitmap("os node %u has cpuset %s\n",
          nodeid, node_cpuset);
-      hwloc_insert_object_by_cpuset(topology, node);
+      hwloc__insert_object_by_cpuset(topology, NULL, node, "x86:numa");
      gotnuma++;
    }
  }
@@ -1033,7 +1062,7 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns
 	      unknown_obj->attr->group.subkind = level;
 	      hwloc_debug_2args_bitmap("os unknown%u %u has cpuset %s\n",
 				       level, unknownid, unknown_cpuset);
-	      hwloc_insert_object_by_cpuset(topology, unknown_obj);
+	      hwloc__insert_object_by_cpuset(topology, NULL, unknown_obj, "x86:group:unknown");
 	    }
 	  }
 	}
@@ -1073,7 +1102,7 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns
 	die->cpuset = die_cpuset;
 	hwloc_debug_1arg_bitmap("os die %u has cpuset %s\n",
 				dieid, die_cpuset);
-	hwloc_insert_object_by_cpuset(topology, die);
+	hwloc__insert_object_by_cpuset(topology, NULL, die, "x86:die");
      }
    }
  }
@@ -1111,7 +1140,7 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns
 	core->cpuset = core_cpuset;
 	hwloc_debug_1arg_bitmap("os core %u has cpuset %s\n",
 				coreid, core_cpuset);
-	hwloc_insert_object_by_cpuset(topology, core);
+	hwloc__insert_object_by_cpuset(topology, NULL, core, "x86:core");
      }
    }
  }
@@ -1125,7 +1154,7 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns
       obj->cpuset = hwloc_bitmap_alloc();
       hwloc_bitmap_only(obj->cpuset, i);
       hwloc_debug_1arg_bitmap("PU %u has cpuset %s\n", i, obj->cpuset);
-       hwloc_insert_object_by_cpuset(topology, obj);
+       hwloc__insert_object_by_cpuset(topology, NULL, obj, "x86:pu");
     }
  }

@@ -1208,7 +1237,7 @@ static void summarize(struct hwloc_backend *backend, struct procinfo *infos, uns
 	  hwloc_obj_add_info(cache, "Inclusive", infos[i].cache[l].inclusive ? "1" : "0");
 	  hwloc_debug_2args_bitmap("os L%u cache %u has cpuset %s\n",
 				   level, cacheid, cache_cpuset);
-	  hwloc_insert_object_by_cpuset(topology, cache);
+	  hwloc__insert_object_by_cpuset(topology, NULL, cache, "x86:cache");
 	}
      }
    }
@@ -1274,8 +1303,41 @@ look_procs(struct hwloc_backend *backend, struct procinfo *infos, unsigned long
    hwloc_bitmap_free(orig_cpuset);
  }

-  if (data->apicid_unique)
+  if (data->apicid_unique) {
    summarize(backend, infos, flags);
+
+    if (has_hybrid(features)) {
+      /* use hybrid info for cpukinds */
+      hwloc_bitmap_t atomset = hwloc_bitmap_alloc();
+      hwloc_bitmap_t coreset = hwloc_bitmap_alloc();
+      for(i=0; i<nbprocs; i++) {
+        if (infos[i].hybridcoretype == 0x20)
+          hwloc_bitmap_set(atomset, i);
+        else if (infos[i].hybridcoretype == 0x40)
+          hwloc_bitmap_set(coreset, i);
+      }
+      /* register IntelAtom set if any */
+      if (!hwloc_bitmap_iszero(atomset)) {
+        struct hwloc_info_s infoattr;
+        infoattr.name = (char *) "CoreType";
+        infoattr.value = (char *) "IntelAtom";
+        hwloc_internal_cpukinds_register(topology, atomset, HWLOC_CPUKIND_EFFICIENCY_UNKNOWN, &infoattr, 1, 0);
+        /* the cpuset is given to the callee */
+      } else {
+        hwloc_bitmap_free(atomset);
+      }
+      /* register IntelCore set if any */
+      if (!hwloc_bitmap_iszero(coreset)) {
+        struct hwloc_info_s infoattr;
+        infoattr.name = (char *) "CoreType";
+        infoattr.value = (char *) "IntelCore";
+        hwloc_internal_cpukinds_register(topology, coreset, HWLOC_CPUKIND_EFFICIENCY_UNKNOWN, &infoattr, 1, 0);
+        /* the cpuset is given to the callee */
+      } else {
+        hwloc_bitmap_free(coreset);
+      }
+    }
+  }
  /* if !data->apicid_unique, do nothing and return success, so that the caller does nothing either */

  return 0;
@@ -1354,7 +1416,7 @@ int hwloc_look_x86(struct hwloc_backend *backend, unsigned long flags)
  unsigned highest_cpuid;
  unsigned highest_ext_cpuid;
  /* This stores cpuid features with the same indexing as Linux */
-  unsigned features[10] = { 0 };
+  unsigned features[19] = { 0 };
  struct procinfo *infos = NULL;
  enum cpuid_type cpuid_type = unknown;
  hwloc_x86_os_state_t os_state;
@@ -1381,6 +1443,9 @@ int hwloc_look_x86(struct hwloc_backend *backend, unsigned long flags)
    /* check if binding works */
    memset(&hooks, 0, sizeof(hooks));
    support.membind = &memsupport;
+    /* We could just copy the main hooks (except in some corner cases),
+     * but the current overhead is negligible, so just always reget them.
+     */
    hwloc_set_native_binding_hooks(&hooks, &support);
    if (hooks.get_thisthread_cpubind && hooks.set_thisthread_cpubind) {
      get_cpubind = hooks.get_thisthread_cpubind;
@@ -1451,6 +1516,7 @@ int hwloc_look_x86(struct hwloc_backend *backend, unsigned long flags)
    ecx = 0;
    cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump);
    features[9] = ebx;
+    features[18] = edx;
  }

  if (cpuid_type != intel && highest_ext_cpuid >= 0x80000001) {
@@ -1531,7 +1597,8 @@ hwloc_x86_discover(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
  }

  if (topology->levels[0][0]->cpuset) {
-    /* somebody else discovered things */
+    /* somebody else discovered things, reconnect levels so that we can look at them */
+    hwloc_topology_reconnect(topology, 0);
    if (topology->nb_levels == 2 && topology->level_nbobjects[1] == data->nbprocs) {
      /* only PUs were discovered, as much as we would, complete the topology with everything else */
      alreadypus = 1;
@@ -1539,7 +1606,6 @@ hwloc_x86_discover(struct hwloc_backend *backend, struct hwloc_disc_status *dsta
    }

    /* several object types were added, we can't easily complete, just do partial discovery */
-    hwloc_topology_reconnect(topology, 0);
    ret = hwloc_look_x86(backend, flags);
    if (ret)
      hwloc_obj_add_info(topology->levels[0][0], "Backend", "x86");
--- a/src/3rdparty/hwloc/src/topology-xml-nolibxml.c
+++ b/src/3rdparty/hwloc/src/topology-xml-nolibxml.c
@@ -213,7 +213,7 @@ hwloc__nolibxml_import_close_child(hwloc__xml_import_state_t state)

 static int
 hwloc__nolibxml_import_get_content(hwloc__xml_import_state_t state,
-				   char **beginp, size_t expected_length)
+				   const char **beginp, size_t expected_length)
 {
  hwloc__nolibxml_import_state_data_t nstate = (void*) state->data;
  char *buffer = nstate->tagbuffer;
@@ -224,7 +224,7 @@ hwloc__nolibxml_import_get_content(hwloc__xml_import_state_t state,
  if (nstate->closed) {
    if (expected_length)
      return -1;
-    *beginp = (char *) "";
+    *beginp = "";
    return 0;
  }

--- a/src/3rdparty/hwloc/src/topology-xml.c
+++ b/src/3rdparty/hwloc/src/topology-xml.c
@@ -1,7 +1,7 @@
 /*
 * Copyright © 2009 CNRS
 * Copyright © 2009-2020 Inria.  All rights reserved.
- * Copyright © 2009-2011 Université Bordeaux
+ * Copyright © 2009-2011, 2020 Université Bordeaux
 * Copyright © 2009-2018 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
 */
@@ -481,10 +481,8 @@ hwloc__xml_import_object_attr(struct hwloc_topology *topology,
  }
 }

-
 static int
-hwloc__xml_import_info(struct hwloc_xml_backend_data_s *data,
-		       hwloc_obj_t obj,
+hwloc___xml_import_info(char **infonamep, char **infovaluep,
                        hwloc__xml_import_state_t state)
 {
  char *infoname = NULL;
@@ -502,6 +500,25 @@ hwloc__xml_import_info(struct hwloc_xml_backend_data_s *data,
      return -1;
  }

+  *infonamep = infoname;
+  *infovaluep = infovalue;
+
+  return state->global->close_tag(state);
+}
+
+static int
+hwloc__xml_import_obj_info(struct hwloc_xml_backend_data_s *data,
+                           hwloc_obj_t obj,
+                           hwloc__xml_import_state_t state)
+{
+  char *infoname = NULL;
+  char *infovalue = NULL;
+  int err;
+
+  err = hwloc___xml_import_info(&infoname, &infovalue, state);
+  if (err < 0)
+    return err;
+
  if (infoname) {
    /* empty strings are ignored by libxml */
    if (data->version_major < 2 &&
@@ -518,7 +535,7 @@ hwloc__xml_import_info(struct hwloc_xml_backend_data_s *data,
    }
  }

-  return state->global->close_tag(state);
+  return err;
 }

 static int
@@ -694,14 +711,15 @@ hwloc__xml_import_userdata(hwloc_topology_t topology __hwloc_attribute_unused, h
  }

  if (!topology->userdata_import_cb) {
-    char *buffer;
+    const char *buffer;
    size_t reallength = encoded ? BASE64_ENCODED_LENGTH(length) : length;
    ret = state->global->get_content(state, &buffer, reallength);
    if (ret < 0)
      return -1;

  } else if (topology->userdata_not_decoded) {
-      char *buffer, *fakename;
+      const char *buffer;
+      char *fakename;
      size_t reallength = encoded ? BASE64_ENCODED_LENGTH(length) : length;
      ret = state->global->get_content(state, &buffer, reallength);
      if (ret < 0)
@@ -714,7 +732,7 @@ hwloc__xml_import_userdata(hwloc_topology_t topology __hwloc_attribute_unused, h
      free(fakename);

  } else if (encoded && length) {
-      char *encoded_buffer;
+      const char *encoded_buffer;
      size_t encoded_length = BASE64_ENCODED_LENGTH(length);
      ret = state->global->get_content(state, &encoded_buffer, encoded_length);
      if (ret < 0)
@@ -734,7 +752,7 @@ hwloc__xml_import_userdata(hwloc_topology_t topology __hwloc_attribute_unused, h
      }

  } else { /* always handle length==0 in the non-encoded case */
-      char *buffer = (char *) "";
+      const char *buffer = "";
      if (length) {
 	ret = state->global->get_content(state, &buffer, length);
 	if (ret < 0)
@@ -888,7 +906,7 @@ hwloc__xml_import_object(hwloc_topology_t topology,
      }

    } else if (!strcmp(tag, "info")) {
-      ret = hwloc__xml_import_info(data, obj, &childstate);
+      ret = hwloc__xml_import_obj_info(data, obj, &childstate);
    } else if (data->version_major < 2 && !strcmp(tag, "distances")) {
      ret = hwloc__xml_v1import_distances(data, obj, &childstate);
    } else if (!strcmp(tag, "userdata")) {
@@ -1238,6 +1256,80 @@ hwloc__xml_import_object(hwloc_topology_t topology,
  return -1;
 }

+static int
+hwloc__xml_v2import_support(hwloc_topology_t topology,
+                            hwloc__xml_import_state_t state)
+{
+  char *name = NULL;
+  int value = 1; /* value is optional */
+  while (1) {
+    char *attrname, *attrvalue;
+    if (state->global->next_attr(state, &attrname, &attrvalue) < 0)
+      break;
+    if (!strcmp(attrname, "name"))
+      name = attrvalue;
+    else if (!strcmp(attrname, "value"))
+      value = atoi(attrvalue);
+    else {
+      if (hwloc__xml_verbose())
+	fprintf(stderr, "%s: ignoring unknown support attribute %s\n",
+		state->global->msgprefix, attrname);
+    }
+  }
+
+  if (name && topology->flags & HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT) {
+#ifdef HWLOC_DEBUG
+    HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_support) == 4*sizeof(void*));
+    HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_discovery_support) == 6);
+    HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_cpubind_support) == 11);
+    HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_membind_support) == 15);
+    HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_misc_support) == 1);
+#endif
+
+#define DO(_cat,_name) if (!strcmp(#_cat "." #_name, name)) topology->support._cat->_name = value
+    DO(discovery,pu);
+    else DO(discovery,numa);
+    else DO(discovery,numa_memory);
+    else DO(discovery,disallowed_pu);
+    else DO(discovery,disallowed_numa);
+    else DO(discovery,cpukind_efficiency);
+    else DO(cpubind,set_thisproc_cpubind);
+    else DO(cpubind,get_thisproc_cpubind);
+    else DO(cpubind,set_proc_cpubind);
+    else DO(cpubind,get_proc_cpubind);
+    else DO(cpubind,set_thisthread_cpubind);
+    else DO(cpubind,get_thisthread_cpubind);
+    else DO(cpubind,set_thread_cpubind);
+    else DO(cpubind,get_thread_cpubind);
+    else DO(cpubind,get_thisproc_last_cpu_location);
+    else DO(cpubind,get_proc_last_cpu_location);
+    else DO(cpubind,get_thisthread_last_cpu_location);
+    else DO(membind,set_thisproc_membind);
+    else DO(membind,get_thisproc_membind);
+    else DO(membind,set_proc_membind);
+    else DO(membind,get_proc_membind);
+    else DO(membind,set_thisthread_membind);
+    else DO(membind,get_thisthread_membind);
+    else DO(membind,set_area_membind);
+    else DO(membind,get_area_membind);
+    else DO(membind,alloc_membind);
+    else DO(membind,firsttouch_membind);
+    else DO(membind,bind_membind);
+    else DO(membind,interleave_membind);
+    else DO(membind,nexttouch_membind);
+    else DO(membind,migrate_membind);
+    else DO(membind,get_area_memlocation);
+
+    else if (!strcmp("custom.exported_support", name))
+      /* support was exported in a custom/fake field, mark it as imported here */
+      topology->support.misc->imported_support = 1;
+
+#undef DO
+  }
+
+  return 0;
+}
+
 static int
 hwloc__xml_v2import_distances(hwloc_topology_t topology,
 			      hwloc__xml_import_state_t state,
@@ -1317,7 +1409,8 @@ hwloc__xml_v2import_distances(hwloc_topology_t topology,
  nr_u64values = 0;
  while (1) {
    struct hwloc__xml_import_state_s childstate;
-    char *attrname, *attrvalue, *tag, *buffer;
+    char *attrname, *attrvalue, *tag;
+    const char *buffer;
    int length;
    int is_index = 0;
    int is_u64values = 0;
@@ -1356,7 +1449,7 @@ hwloc__xml_v2import_distances(hwloc_topology_t topology,

    if (is_index) {
      /* get indexes */
-      char *tmp, *tmp2;
+      const char *tmp, *tmp2;
      if (nr_indexes >= nbobjs) {
 	if (hwloc__xml_verbose())
 	  fprintf(stderr, "%s: %s with more than %u indexes\n",
@@ -1369,6 +1462,9 @@ hwloc__xml_v2import_distances(hwloc_topology_t topology,
 	unsigned long long u;
 	if (heterotypes) {
 	  hwloc_obj_type_t t = HWLOC_OBJ_TYPE_NONE;
+          if (!*tmp)
+            /* reached the end of this indexes attribute */
+            break;
 	  if (hwloc_type_sscanf(tmp, &t, NULL, 0) < 0) {
 	    if (hwloc__xml_verbose())
 	      fprintf(stderr, "%s: %s with unrecognized heterogeneous type %s\n",
@@ -1398,7 +1494,7 @@ hwloc__xml_v2import_distances(hwloc_topology_t topology,

    } else if (is_u64values) {
      /* get uint64_t values */
-      char *tmp;
+      const char *tmp;
      if (nr_u64values >= nbobjs*nbobjs) {
 	if (hwloc__xml_verbose())
 	  fprintf(stderr, "%s: %s with more than %u u64values\n",
@@ -1491,6 +1587,259 @@ hwloc__xml_v2import_distances(hwloc_topology_t topology,
 #undef _TAG_NAME
 }

+static int
+hwloc__xml_import_memattr_value(hwloc_topology_t topology,
+                                hwloc_memattr_id_t id,
+                                unsigned long flags,
+                                hwloc__xml_import_state_t state)
+{
+  char *target_obj_gp_index_s = NULL;
+  char *target_obj_type_s = NULL;
+  hwloc_uint64_t target_obj_gp_index;
+  char *value_s = NULL;
+  hwloc_uint64_t value;
+  char *initiator_cpuset_s = NULL;
+  char *initiator_obj_gp_index_s = NULL;
+  char *initiator_obj_type_s = NULL;
+  hwloc_obj_type_t target_obj_type = HWLOC_OBJ_TYPE_NONE;
+
+  while (1) {
+    char *attrname, *attrvalue;
+    if (state->global->next_attr(state, &attrname, &attrvalue) < 0)
+      break;
+    if (!strcmp(attrname, "target_obj_gp_index"))
+      target_obj_gp_index_s = attrvalue;
+    else if (!strcmp(attrname, "target_obj_type"))
+      target_obj_type_s = attrvalue;
+    else if (!strcmp(attrname, "value"))
+      value_s = attrvalue;
+    else if (!strcmp(attrname, "initiator_cpuset"))
+      initiator_cpuset_s = attrvalue;
+    else if (!strcmp(attrname, "initiator_obj_gp_index"))
+      initiator_obj_gp_index_s = attrvalue;
+    else if (!strcmp(attrname, "initiator_obj_type"))
+      initiator_obj_type_s = attrvalue;
+    else {
+      if (hwloc__xml_verbose())
+        fprintf(stderr, "%s: ignoring unknown memattr_value attribute %s\n",
+                state->global->msgprefix, attrname);
+      return -1;
+    }
+  }
+
+  if (!target_obj_type_s) {
+    if (hwloc__xml_verbose())
+      fprintf(stderr, "%s: ignoring memattr_value without target_obj_type.\n",
+              state->global->msgprefix);
+    return -1;
+  }
+  if (hwloc_type_sscanf(target_obj_type_s, &target_obj_type, NULL, 0) < 0) {
+    if (hwloc__xml_verbose())
+      fprintf(stderr, "%s: failed to identify memattr_value target object type %s\n",
+              state->global->msgprefix, target_obj_type_s);
+    return -1;
+  }
+
+  if (!value_s || !target_obj_gp_index_s) {
+    if (hwloc__xml_verbose())
+      fprintf(stderr, "%s: ignoring memattr_value without value or target_obj_gp_index\n",
+              state->global->msgprefix);
+    return -1;
+  }
+  target_obj_gp_index = strtoull(target_obj_gp_index_s, NULL, 10);
+  value = strtoull(value_s, NULL, 10);
+
+  if (flags & HWLOC_MEMATTR_FLAG_NEED_INITIATOR) {
+    /* add a value with initiator */
+    struct hwloc_internal_location_s loc;
+    if (!initiator_cpuset_s && (!initiator_obj_gp_index_s || !initiator_obj_type_s)) {
+      if (hwloc__xml_verbose())
+        fprintf(stderr, "%s: ignoring memattr_value without initiator attributes\n",
+                state->global->msgprefix);
+      return -1;
+    }
+
+    /* setup the initiator */
+    if (initiator_cpuset_s) {
+      loc.type = HWLOC_LOCATION_TYPE_CPUSET;
+      loc.location.cpuset = hwloc_bitmap_alloc();
+      if (!loc.location.cpuset) {
+        if (hwloc__xml_verbose())
+          fprintf(stderr, "%s: failed to allocate memattr_value initiator cpuset\n",
+                  state->global->msgprefix);
+        return -1;
+      }
+      hwloc_bitmap_sscanf(loc.location.cpuset, initiator_cpuset_s);
+    } else {
+      loc.type = HWLOC_LOCATION_TYPE_OBJECT;
+      loc.location.object.gp_index = strtoull(initiator_obj_gp_index_s, NULL, 10);
+      if (hwloc_type_sscanf(initiator_obj_type_s, &loc.location.object.type, NULL, 0) < 0) {
+        if (hwloc__xml_verbose())
+          fprintf(stderr, "%s: failed to identify memattr_value initiator object type %s\n",
+                  state->global->msgprefix, initiator_obj_type_s);
+        return -1;
+      }
+    }
+
+    hwloc_internal_memattr_set_value(topology, id, target_obj_type, target_obj_gp_index, (unsigned)-1, &loc, value);
+
+    if (loc.type == HWLOC_LOCATION_TYPE_CPUSET)
+      hwloc_bitmap_free(loc.location.cpuset);
+
+  } else {
+    /* add a value without initiator */
+    hwloc_internal_memattr_set_value(topology, id, target_obj_type, target_obj_gp_index, (unsigned)-1, NULL, value);
+  }
+
+  return 0;
+}
+
+static int
+hwloc__xml_import_memattr(hwloc_topology_t topology,
+                          hwloc__xml_import_state_t state)
+{
+  char *name = NULL;
+  unsigned long flags = (unsigned long) -1;
+  hwloc_memattr_id_t id = (hwloc_memattr_id_t) -1;
+  int ret;
+
+  while (1) {
+    char *attrname, *attrvalue;
+    if (state->global->next_attr(state, &attrname, &attrvalue) < 0)
+      break;
+    if (!strcmp(attrname, "name"))
+      name = attrvalue;
+    else if (!strcmp(attrname, "flags"))
+      flags = strtoul(attrvalue, NULL, 10);
+    else {
+      if (hwloc__xml_verbose())
+        fprintf(stderr, "%s: ignoring unknown memattr attribute %s\n",
+                state->global->msgprefix, attrname);
+      return -1;
+    }
+  }
+
+  if (name && flags != (unsigned long) -1) {
+    hwloc_memattr_id_t _id;
+
+    ret = hwloc_memattr_get_by_name(topology, name, &_id);
+    if (ret < 0) {
+      /* register a new attribute */
+      ret = hwloc_memattr_register(topology, name, flags, &_id);
+      if (!ret)
+        id = _id;
+    } else {
+      /* check the flags of the existing attribute  */
+      unsigned long mflags;
+      ret = hwloc_memattr_get_flags(topology, _id, &mflags);
+      if (!ret && mflags == flags)
+        id = _id;
+    }
+    /* if there's no matching attribute, id is -1 and values will be ignored below */
+  }
+
+  while (1) {
+    struct hwloc__xml_import_state_s childstate;
+    char *tag;
+
+    ret = state->global->find_child(state, &childstate, &tag);
+    if (ret <= 0)
+      break;
+
+    if (!strcmp(tag, "memattr_value")) {
+      ret = hwloc__xml_import_memattr_value(topology, id, flags, &childstate);
+    } else {
+      if (hwloc__xml_verbose())
+        fprintf(stderr, "%s: memattr with unrecognized child %s\n",
+                state->global->msgprefix, tag);
+      ret = -1;
+    }
+
+    if (ret < 0)
+      goto error;
+
+    state->global->close_child(&childstate);
+  }
+
+  return state->global->close_tag(state);
+
+ error:
+  return -1;
+}
+
+static int
+hwloc__xml_import_cpukind(hwloc_topology_t topology,
+                          hwloc__xml_import_state_t state)
+{
+  hwloc_bitmap_t cpuset = NULL;
+  int forced_efficiency = HWLOC_CPUKIND_EFFICIENCY_UNKNOWN;
+  unsigned nr_infos = 0;
+  struct hwloc_info_s *infos = NULL;
+  int ret;
+
+  while (1) {
+    char *attrname, *attrvalue;
+    if (state->global->next_attr(state, &attrname, &attrvalue) < 0)
+      break;
+    if (!strcmp(attrname, "cpuset")) {
+      if (!cpuset)
+        cpuset = hwloc_bitmap_alloc();
+      hwloc_bitmap_sscanf(cpuset, attrvalue);
+    } else if (!strcmp(attrname, "forced_efficiency")) {
+      forced_efficiency = atoi(attrvalue);
+    } else {
+      if (hwloc__xml_verbose())
+        fprintf(stderr, "%s: ignoring unknown cpukind attribute %s\n",
+                state->global->msgprefix, attrname);
+      hwloc_bitmap_free(cpuset);
+      return -1;
+    }
+  }
+
+  while (1) {
+    struct hwloc__xml_import_state_s childstate;
+    char *tag;
+
+    ret = state->global->find_child(state, &childstate, &tag);
+    if (ret <= 0)
+      break;
+
+    if (!strcmp(tag, "info")) {
+      char *infoname = NULL;
+      char *infovalue = NULL;
+      ret = hwloc___xml_import_info(&infoname, &infovalue, &childstate);
+      if (!ret && infoname && infovalue)
+        hwloc__add_info(&infos, &nr_infos, infoname, infovalue);
+    } else {
+      if (hwloc__xml_verbose())
+        fprintf(stderr, "%s: cpukind with unrecognized child %s\n",
+                state->global->msgprefix, tag);
+      ret = -1;
+    }
+
+    if (ret < 0)
+      goto error;
+
+    state->global->close_child(&childstate);
+  }
+
+  if (!cpuset) {
+    if (hwloc__xml_verbose())
+      fprintf(stderr, "%s: ignoring cpukind without cpuset\n",
+              state->global->msgprefix);
+    goto error;
+  }
+
+  hwloc_internal_cpukinds_register(topology, cpuset, forced_efficiency, infos, nr_infos, HWLOC_CPUKINDS_REGISTER_FLAG_OVERWRITE_FORCED_EFFICIENCY);
+
+  return state->global->close_tag(state);
+
+ error:
+  hwloc__free_infos(infos, nr_infos);
+  hwloc_bitmap_free(cpuset);
+  return -1;
+}
+
 static int
 hwloc__xml_import_diff_one(hwloc__xml_import_state_t state,
 			   hwloc_topology_diff_t *firstdiffp,
@@ -1759,6 +2108,18 @@ hwloc_look_xml(struct hwloc_backend *backend, struct hwloc_disc_status *dstatus)
 	ret = hwloc__xml_v2import_distances(topology, &childstate, 1);
 	if (ret < 0)
 	  goto failed;
+      } else if (!strcmp(tag, "support")) {
+	ret = hwloc__xml_v2import_support(topology, &childstate);
+	if (ret < 0)
+	  goto failed;
+      } else if (!strcmp(tag, "memattr")) {
+        ret = hwloc__xml_import_memattr(topology, &childstate);
+        if (ret < 0)
+          goto failed;
+      } else if (!strcmp(tag, "cpukind")) {
+        ret = hwloc__xml_import_cpukind(topology, &childstate);
+        if (ret < 0)
+          goto failed;
      } else {
 	if (hwloc__xml_verbose())
 	  fprintf(stderr, "%s: ignoring unknown tag `%s' after root object.\n",
@@ -1864,6 +2225,7 @@ done:
  /* keep the "Backend" information intact */
  /* we could add "BackendSource=XML" to notify that XML was used between the actual backend and here */

+  if (!(topology->flags & HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT)) {
    topology->support.discovery->pu = 1;
    topology->support.discovery->disallowed_pu = 1;
    if (data->nbnumanodes) {
@@ -1871,6 +2233,7 @@ done:
      topology->support.discovery->numa_memory = 1; // FIXME
      topology->support.discovery->disallowed_numa = 1;
    }
+  }

  if (data->look_done)
    data->look_done(data, 0);
@@ -2620,9 +2983,199 @@ hwloc__xml_v2export_distances(hwloc__xml_export_state_t parentstate, hwloc_topol
      hwloc___xml_v2export_distances(parentstate, dist);
 }

+static void
+hwloc__xml_v2export_support(hwloc__xml_export_state_t parentstate, hwloc_topology_t topology)
+{
+  struct hwloc__xml_export_state_s state;
+  char tmp[11];
+
+#ifdef HWLOC_DEBUG
+  HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_support) == 4*sizeof(void*));
+  HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_discovery_support) == 6);
+  HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_cpubind_support) == 11);
+  HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_membind_support) == 15);
+  HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_misc_support) == 1);
+#endif
+
+#define DO(_cat,_name) do {                                     \
+    if (topology->support._cat->_name) {                        \
+      parentstate->new_child(parentstate, &state, "support");   \
+      state.new_prop(&state, "name", #_cat "." #_name);         \
+      if (topology->support._cat->_name != 1) {                 \
+        sprintf(tmp, "%u", topology->support._cat->_name); \
+        state.new_prop(&state, "value", tmp);                   \
+      }                                                         \
+      state.end_object(&state, "support");                      \
+    }                                                           \
+  } while (0)
+
+  DO(discovery,pu);
+  DO(discovery,numa);
+  DO(discovery,numa_memory);
+  DO(discovery,disallowed_pu);
+  DO(discovery,disallowed_numa);
+  DO(discovery,cpukind_efficiency);
+  DO(cpubind,set_thisproc_cpubind);
+  DO(cpubind,get_thisproc_cpubind);
+  DO(cpubind,set_proc_cpubind);
+  DO(cpubind,get_proc_cpubind);
+  DO(cpubind,set_thisthread_cpubind);
+  DO(cpubind,get_thisthread_cpubind);
+  DO(cpubind,set_thread_cpubind);
+  DO(cpubind,get_thread_cpubind);
+  DO(cpubind,get_thisproc_last_cpu_location);
+  DO(cpubind,get_proc_last_cpu_location);
+  DO(cpubind,get_thisthread_last_cpu_location);
+  DO(membind,set_thisproc_membind);
+  DO(membind,get_thisproc_membind);
+  DO(membind,set_proc_membind);
+  DO(membind,get_proc_membind);
+  DO(membind,set_thisthread_membind);
+  DO(membind,get_thisthread_membind);
+  DO(membind,set_area_membind);
+  DO(membind,get_area_membind);
+  DO(membind,alloc_membind);
+  DO(membind,firsttouch_membind);
+  DO(membind,bind_membind);
+  DO(membind,interleave_membind);
+  DO(membind,nexttouch_membind);
+  DO(membind,migrate_membind);
+  DO(membind,get_area_memlocation);
+
+  /* misc.imported_support would be meaningless in the remote importer,
+   * but the importer needs to know whether we exported support or not
+   * (in case no support bit is set at all),
+   * use a custom/fake field to do so.
+   */
+  parentstate->new_child(parentstate, &state, "support");
+  state.new_prop(&state, "name", "custom.exported_support");
+  state.end_object(&state, "support");
+
+#undef DO
+}
+
+static void
+hwloc__xml_export_memattr_target(hwloc__xml_export_state_t state,
+                                 struct hwloc_internal_memattr_s *imattr,
+                                 struct hwloc_internal_memattr_target_s *imtg)
+{
+  struct hwloc__xml_export_state_s vstate;
+  char tmp[255];
+
+  if (imattr->flags & HWLOC_MEMATTR_FLAG_NEED_INITIATOR) {
+    /* export all initiators */
+    unsigned k;
+    for(k=0; k<imtg->nr_initiators; k++) {
+      struct hwloc_internal_memattr_initiator_s *imi = &imtg->initiators[k];
+      state->new_child(state, &vstate, "memattr_value");
+      vstate.new_prop(&vstate, "target_obj_type", hwloc_obj_type_string(imtg->type));
+      snprintf(tmp, sizeof(tmp), "%llu", (unsigned long long) imtg->gp_index);
+      vstate.new_prop(&vstate, "target_obj_gp_index", tmp);
+      snprintf(tmp, sizeof(tmp), "%llu", (unsigned long long) imi->value);
+      vstate.new_prop(&vstate, "value", tmp);
+      switch (imi->initiator.type) {
+      case HWLOC_LOCATION_TYPE_OBJECT:
+        snprintf(tmp, sizeof(tmp), "%llu", (unsigned long long) imi->initiator.location.object.gp_index);
+        vstate.new_prop(&vstate, "initiator_obj_gp_index", tmp);
+        vstate.new_prop(&vstate, "initiator_obj_type", hwloc_obj_type_string(imi->initiator.location.object.type));
+        break;
+      case HWLOC_LOCATION_TYPE_CPUSET: {
+        char *setstring;
+        hwloc_bitmap_asprintf(&setstring, imi->initiator.location.cpuset);
+        if (setstring)
+          vstate.new_prop(&vstate, "initiator_cpuset", setstring);
+        free(setstring);
+        break;
+      }
+      default:
+        assert(0);
+      }
+      vstate.end_object(&vstate, "memattr_value");
+    }
+  } else {
+    /* just export the global value */
+    state->new_child(state, &vstate, "memattr_value");
+    vstate.new_prop(&vstate, "target_obj_type", hwloc_obj_type_string(imtg->type));
+    snprintf(tmp, sizeof(tmp), "%llu", (unsigned long long) imtg->gp_index);
+    vstate.new_prop(&vstate, "target_obj_gp_index", tmp);
+    snprintf(tmp, sizeof(tmp), "%llu", (unsigned long long) imtg->noinitiator_value);
+    vstate.new_prop(&vstate, "value", tmp);
+    vstate.end_object(&vstate, "memattr_value");
+  }
+}
+
+static void
+hwloc__xml_export_memattrs(hwloc__xml_export_state_t state, hwloc_topology_t topology)
+{
+  unsigned id;
+  for(id=0; id<topology->nr_memattrs; id++) {
+    struct hwloc_internal_memattr_s *imattr;
+    struct hwloc__xml_export_state_s mstate;
+    char tmp[255];
+    unsigned j;
+
+    if (id == HWLOC_MEMATTR_ID_CAPACITY || id == HWLOC_MEMATTR_ID_LOCALITY)
+      /* no need to export virtual memattrs */
+      continue;
+
+    imattr = &topology->memattrs[id];
+    if ((id == HWLOC_MEMATTR_ID_LATENCY || id == HWLOC_MEMATTR_ID_BANDWIDTH)
+        && !imattr->nr_targets)
+      /* no need to export initial attributes that have no targets: every release that supports memattrs already defines them */
+      continue;
+
+    state->new_child(state, &mstate, "memattr");
+    mstate.new_prop(&mstate, "name", imattr->name);
+    snprintf(tmp, sizeof(tmp), "%lu", imattr->flags);
+    mstate.new_prop(&mstate, "flags", tmp);
+
+    for(j=0; j<imattr->nr_targets; j++)
+      hwloc__xml_export_memattr_target(&mstate, imattr, &imattr->targets[j]);
+
+    mstate.end_object(&mstate, "memattr");
+  }
+}
+
+static void
+hwloc__xml_export_cpukinds(hwloc__xml_export_state_t state, hwloc_topology_t topology)
+{
+  unsigned i;
+  for(i=0; i<topology->nr_cpukinds; i++) {
+    struct hwloc_internal_cpukind_s *kind = &topology->cpukinds[i];
+    struct hwloc__xml_export_state_s cstate;
+    char *setstring;
+    unsigned j;
+
+    state->new_child(state, &cstate, "cpukind");
+    hwloc_bitmap_asprintf(&setstring, kind->cpuset);
+    cstate.new_prop(&cstate, "cpuset", setstring);
+    free(setstring);
+    if (kind->forced_efficiency != HWLOC_CPUKIND_EFFICIENCY_UNKNOWN) {
+      char tmp[11];
+      snprintf(tmp, sizeof(tmp), "%d", kind->forced_efficiency);
+      cstate.new_prop(&cstate, "forced_efficiency", tmp);
+    }
+
+    for(j=0; j<kind->nr_infos; j++) {
+      char *name = hwloc__xml_export_safestrdup(kind->infos[j].name);
+      char *value = hwloc__xml_export_safestrdup(kind->infos[j].value);
+      struct hwloc__xml_export_state_s istate;
+      cstate.new_child(&cstate, &istate, "info");
+      istate.new_prop(&istate, "name", name);
+      istate.new_prop(&istate, "value", value);
+      istate.end_object(&istate, "info");
+      free(name);
+      free(value);
+    }
+
+    cstate.end_object(&cstate, "cpukind");
+  }
+}
+
 void
 hwloc__xml_export_topology(hwloc__xml_export_state_t state, hwloc_topology_t topology, unsigned long flags)
 {
+  char *env;
  hwloc_obj_t root = hwloc_get_root_obj(topology);

  if (flags & HWLOC_TOPOLOGY_EXPORT_XML_FLAG_V1) {
@@ -2665,6 +3218,11 @@ hwloc__xml_export_topology(hwloc__xml_export_state_t state, hwloc_topology_t top
  } else {
    hwloc__xml_v2export_object (state, topology, root, flags);
    hwloc__xml_v2export_distances (state, topology);
+    env = getenv("HWLOC_XML_EXPORT_SUPPORT");
+    if (!env || atoi(env))
+      hwloc__xml_v2export_support(state, topology);
+    hwloc__xml_export_memattrs(state, topology);
+    hwloc__xml_export_cpukinds(state, topology);
  }
 }

--- a/src/3rdparty/hwloc/src/topology.c
+++ b/src/3rdparty/hwloc/src/topology.c
@@ -1,6 +1,6 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2019 Inria.  All rights reserved.
+ * Copyright © 2009-2021 Inria.  All rights reserved.
 * Copyright © 2009-2012, 2020 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
@@ -75,16 +75,49 @@ int hwloc_hide_errors(void)
  return hide;
 }

-void hwloc_report_os_error(const char *msg, int line)
+
+/* format the obj info to print in error messages */
+static void
+report_insert_error_format_obj(char *buf, size_t buflen, hwloc_obj_t obj)
+{
+  char typestr[64];
+  char *cpusetstr;
+  char *nodesetstr = NULL;
+
+  hwloc_obj_type_snprintf(typestr, sizeof(typestr), obj, 0);
+  hwloc_bitmap_asprintf(&cpusetstr, obj->cpuset);
+  if (obj->nodeset) /* may be missing during insert */
+    hwloc_bitmap_asprintf(&nodesetstr, obj->nodeset);
+  if (obj->os_index != HWLOC_UNKNOWN_INDEX)
+    snprintf(buf, buflen, "%s (P#%u cpuset %s%s%s)",
+             typestr, obj->os_index, cpusetstr,
+             nodesetstr ? " nodeset " : "",
+             nodesetstr ? nodesetstr : "");
+  else
+    snprintf(buf, buflen, "%s (cpuset %s%s%s)",
+             typestr, cpusetstr,
+             nodesetstr ? " nodeset " : "",
+             nodesetstr ? nodesetstr : "");
+  free(cpusetstr);
+  free(nodesetstr);
+}
+
+static void report_insert_error(hwloc_obj_t new, hwloc_obj_t old, const char *msg, const char *reason)
 {
  static int reported = 0;

-  if (!reported && !hwloc_hide_errors()) {
+  if (reason && !reported && !hwloc_hide_errors()) {
+    char newstr[512];
+    char oldstr[512];
+    report_insert_error_format_obj(newstr, sizeof(newstr), new);
+    report_insert_error_format_obj(oldstr, sizeof(oldstr), old);
+
    fprintf(stderr, "****************************************************************************\n");
    fprintf(stderr, "* hwloc %s received invalid information from the operating system.\n", HWLOC_VERSION);
    fprintf(stderr, "*\n");
-    fprintf(stderr, "* %s\n", msg);
-    fprintf(stderr, "* Error occurred in topology.c line %d\n", line);
+    fprintf(stderr, "* Failed with: %s\n", msg);
+    fprintf(stderr, "* while inserting %s at %s\n", newstr, oldstr);
+    fprintf(stderr, "* coming from: %s\n", reason);
    fprintf(stderr, "*\n");
    fprintf(stderr, "* The following FAQ entry in the hwloc documentation may help:\n");
    fprintf(stderr, "*   What should I do when hwloc reports \"operating system\" warnings?\n");
@@ -264,7 +297,7 @@ hwloc_setup_pu_level(struct hwloc_topology *topology,

      hwloc_debug_2args_bitmap("cpu %u (os %u) has cpuset %s\n",
 		 cpu, oscpu, obj->cpuset);
-      hwloc_insert_object_by_cpuset(topology, obj);
+      hwloc__insert_object_by_cpuset(topology, NULL, obj, "core:pulevel");

      cpu++;
    }
@@ -347,6 +380,7 @@ hwloc_debug_print_object(int indent __hwloc_attribute_unused, hwloc_obj_t obj)
 static void
 hwloc_debug_print_objects(int indent __hwloc_attribute_unused, hwloc_obj_t obj)
 {
+  if (hwloc_debug_enabled() >= 2) {
    hwloc_obj_t child;
    hwloc_debug_print_object(indent, obj);
    for_each_child (child, obj)
@@ -358,6 +392,7 @@ hwloc_debug_print_objects(int indent __hwloc_attribute_unused, hwloc_obj_t obj)
    for_each_misc_child (child, obj)
      hwloc_debug_print_objects(indent + 1, child);
  }
+}
 #else /* !HWLOC_DEBUG */
 #define hwloc_debug_print_object(indent, obj) do { /* nothing */ } while (0)
 #define hwloc_debug_print_objects(indent, obj) do { /* nothing */ } while (0)
@@ -472,29 +507,33 @@ int hwloc_obj_add_info(hwloc_obj_t obj, const char *name, const char *value)
 }

 /* This function may be called with topology->tma set, it cannot free() or realloc() */
-static int hwloc__tma_dup_infos(struct hwloc_tma *tma, hwloc_obj_t new, hwloc_obj_t src)
+int hwloc__tma_dup_infos(struct hwloc_tma *tma,
+                         struct hwloc_info_s **newip, unsigned *newcp,
+                         struct hwloc_info_s *oldi, unsigned oldc)
 {
+  struct hwloc_info_s *newi;
  unsigned i, j;
-  new->infos = hwloc_tma_calloc(tma, src->infos_count * sizeof(*src->infos));
-  if (!new->infos)
+  newi = hwloc_tma_calloc(tma, oldc * sizeof(*newi));
+  if (!newi)
    return -1;
-  for(i=0; i<src->infos_count; i++) {
-    new->infos[i].name = hwloc_tma_strdup(tma, src->infos[i].name);
-    new->infos[i].value = hwloc_tma_strdup(tma, src->infos[i].value);
-    if (!new->infos[i].name || !new->infos[i].value)
+  for(i=0; i<oldc; i++) {
+    newi[i].name = hwloc_tma_strdup(tma, oldi[i].name);
+    newi[i].value = hwloc_tma_strdup(tma, oldi[i].value);
+    if (!newi[i].name || !newi[i].value)
      goto failed;
  }
-  new->infos_count = src->infos_count;
+  *newip = newi;
+  *newcp = oldc;
  return 0;

 failed:
  assert(!tma || !tma->dontfree); /* this tma cannot fail to allocate */
  for(j=0; j<=i; j++) {
-    free(new->infos[i].name);
-    free(new->infos[i].value);
+    free(newi[j].name);
+    free(newi[j].value);
  }
-  free(new->infos);
-  new->infos = NULL;
+  free(newi);
+  *newip = NULL;
  return -1;
 }

@@ -528,8 +567,9 @@ hwloc_free_unlinked_object(hwloc_obj_t obj)
 }

 /* Replace old with contents of new object, and make new freeable by the caller.
- * Only updates next_sibling/first_child pointers,
- * so may only be used during early discovery.
+ * Requires reconnect (for siblings pointers and group depth),
+ * fixup of sets (only the main cpuset was likely compared before merging),
+ * and update of total_memory and group depth.
 */
 static void
 hwloc_replace_linked_object(hwloc_obj_t old, hwloc_obj_t new)
@@ -812,7 +852,7 @@ hwloc__duplicate_object(struct hwloc_topology *newtopology,
  newobj->nodeset = hwloc_bitmap_tma_dup(tma, src->nodeset);
  newobj->complete_nodeset = hwloc_bitmap_tma_dup(tma, src->complete_nodeset);

-  hwloc__tma_dup_infos(tma, newobj, src);
+  hwloc__tma_dup_infos(tma, &newobj->infos, &newobj->infos_count, src->infos, src->infos_count);

  /* find our level */
  if (src->depth < 0) {
@@ -970,6 +1010,7 @@ hwloc__topology_dup(hwloc_topology_t *newp,
  memcpy(new->support.discovery, old->support.discovery, sizeof(*old->support.discovery));
  memcpy(new->support.cpubind, old->support.cpubind, sizeof(*old->support.cpubind));
  memcpy(new->support.membind, old->support.membind, sizeof(*old->support.membind));
+  memcpy(new->support.misc, old->support.misc, sizeof(*old->support.misc));

  new->allowed_cpuset = hwloc_bitmap_tma_dup(tma, old->allowed_cpuset);
  new->allowed_nodeset = hwloc_bitmap_tma_dup(tma, old->allowed_nodeset);
@@ -1008,6 +1049,14 @@ hwloc__topology_dup(hwloc_topology_t *newp,
  if (err < 0)
    goto out_with_topology;

+  err = hwloc_internal_memattrs_dup(new, old);
+  if (err < 0)
+    goto out_with_topology;
+
+  err = hwloc_internal_cpukinds_dup(new, old);
+  if (err < 0)
+    goto out_with_topology;
+
  /* we connected everything during duplication */
  new->modified = 0;

@@ -1229,31 +1278,6 @@ hwloc__object_cpusets_compare_first(hwloc_obj_t obj1, hwloc_obj_t obj2)
  return 0;
 }

-/* format the obj info to print in error messages */
-static void
-hwloc__report_error_format_obj(char *buf, size_t buflen, hwloc_obj_t obj)
-{
-	char typestr[64];
-	char *cpusetstr;
-	char *nodesetstr = NULL;
-	hwloc_obj_type_snprintf(typestr, sizeof(typestr), obj, 0);
-	hwloc_bitmap_asprintf(&cpusetstr, obj->cpuset);
-	if (obj->nodeset) /* may be missing during insert */
-	  hwloc_bitmap_asprintf(&nodesetstr, obj->nodeset);
-	if (obj->os_index != HWLOC_UNKNOWN_INDEX)
-	  snprintf(buf, buflen, "%s (P#%u cpuset %s%s%s)",
-		   typestr, obj->os_index, cpusetstr,
-		   nodesetstr ? " nodeset " : "",
-		   nodesetstr ? nodesetstr : "");
-	else
-	  snprintf(buf, buflen, "%s (cpuset %s%s%s)",
-		   typestr, cpusetstr,
-		   nodesetstr ? " nodeset " : "",
-		   nodesetstr ? nodesetstr : "");
-	free(cpusetstr);
-	free(nodesetstr);
-}
-
 /*
 * How to insert objects into the topology.
 *
@@ -1325,7 +1349,7 @@ merge_insert_equal(hwloc_obj_t new, hwloc_obj_t old)

 /* returns the result of merge, or NULL if not merged */
 static __hwloc_inline hwloc_obj_t
-hwloc__insert_try_merge_group(hwloc_obj_t old, hwloc_obj_t new)
+hwloc__insert_try_merge_group(hwloc_topology_t topology, hwloc_obj_t old, hwloc_obj_t new)
 {
  if (new->type == HWLOC_OBJ_GROUP && old->type == HWLOC_OBJ_GROUP) {
    /* which group do we keep? */
@@ -1336,6 +1360,7 @@ hwloc__insert_try_merge_group(hwloc_obj_t old, hwloc_obj_t new)

      /* keep the new one, it doesn't want to be merged */
      hwloc_replace_linked_object(old, new);
+      topology->modified = 1;
      return new;

    } else {
@@ -1343,9 +1368,12 @@ hwloc__insert_try_merge_group(hwloc_obj_t old, hwloc_obj_t new)
 	/* keep the old one, it doesn't want to be merged */
 	return old;

-      /* compare subkinds to decice who to keep */
-      if (new->attr->group.kind < old->attr->group.kind)
+      /* compare subkinds to decide which group to keep */
+      if (new->attr->group.kind < old->attr->group.kind) {
+        /* keep smaller kind */
 	hwloc_replace_linked_object(old, new);
+        topology->modified = 1;
+      }
      return old;
    }
  }
@@ -1371,6 +1399,7 @@ hwloc__insert_try_merge_group(hwloc_obj_t old, hwloc_obj_t new)
     * and let the caller free the new object
     */
    hwloc_replace_linked_object(old, new);
+    topology->modified = 1;
    return old;

  } else {
@@ -1390,9 +1419,9 @@ hwloc__insert_try_merge_group(hwloc_obj_t old, hwloc_obj_t new)
 */
 static struct hwloc_obj *
 hwloc___insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t cur, hwloc_obj_t obj,
-			        hwloc_report_error_t report_error)
+			        const char *reason)
 {
-  hwloc_obj_t child, next_child = NULL;
+  hwloc_obj_t child, next_child = NULL, tmp;
  /* These will always point to the pointer to their next last child. */
  hwloc_obj_t *cur_children = &cur->first_child;
  hwloc_obj_t *obj_children = &obj->first_child;
@@ -1412,7 +1441,7 @@ hwloc___insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t cur
    int setres = res;

    if (res == HWLOC_OBJ_EQUAL) {
-      hwloc_obj_t merged = hwloc__insert_try_merge_group(child, obj);
+      hwloc_obj_t merged = hwloc__insert_try_merge_group(topology, child, obj);
      if (merged)
 	return merged;
      /* otherwise compare actual types to decide of the inclusion */
@@ -1430,18 +1459,10 @@ hwloc___insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t cur

      case HWLOC_OBJ_INCLUDED:
 	/* OBJ is strictly contained is some child of CUR, go deeper.  */
-	return hwloc___insert_object_by_cpuset(topology, child, obj, report_error);
+	return hwloc___insert_object_by_cpuset(topology, child, obj, reason);

      case HWLOC_OBJ_INTERSECTS:
-        if (report_error) {
-	  char childstr[512];
-	  char objstr[512];
-	  char msg[1100];
-	  hwloc__report_error_format_obj(objstr, sizeof(objstr), obj);
-	  hwloc__report_error_format_obj(childstr, sizeof(childstr), child);
-	  snprintf(msg, sizeof(msg), "%s intersects with %s without inclusion!", objstr, childstr);
-	  report_error(msg, __LINE__);
-	}
+        report_insert_error(obj, child, "intersection without inclusion", reason);
 	goto putback;

      case HWLOC_OBJ_DIFFERENT:
@@ -1464,6 +1485,8 @@ hwloc___insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t cur
 	if (setres == HWLOC_OBJ_EQUAL) {
 	  obj->memory_first_child = child->memory_first_child;
 	  child->memory_first_child = NULL;
+	  for(tmp=obj->memory_first_child; tmp; tmp = tmp->next_sibling)
+	    tmp->parent = obj;
 	}
 	break;
    }
@@ -1483,7 +1506,9 @@ hwloc___insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t cur
  return obj;

 putback:
-  /* Put-back OBJ children in CUR and return an error. */
+  /* OBJ cannot be inserted.
+   * Put-back OBJ children in CUR and return an error.
+   */
  if (putp)
    cur_children = putp; /* No need to try to insert before where OBJ was supposed to go */
  else
@@ -1492,12 +1517,12 @@ hwloc___insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t cur
  while ((child = obj->first_child) != NULL) {
    /* Remove from OBJ */
    obj->first_child = child->next_sibling;
-    obj->parent = cur;
-    /* Find child position in CUR, and insert. */
+    /* Find child position in CUR, and reinsert it. */
    while (*cur_children && hwloc__object_cpusets_compare_first(*cur_children, child) < 0)
      cur_children = &(*cur_children)->next_sibling;
    child->next_sibling = *cur_children;
    *cur_children = child;
+    child->parent = cur;
  }
  return NULL;
 }
@@ -1521,7 +1546,7 @@ hwloc__find_obj_covering_memory_cpuset(struct hwloc_topology *topology, hwloc_ob

 static struct hwloc_obj *
 hwloc__find_insert_memory_parent(struct hwloc_topology *topology, hwloc_obj_t obj,
-				 hwloc_report_error_t report_error)
+                                 const char *reason)
 {
  hwloc_obj_t parent, group, result;

@@ -1573,7 +1598,7 @@ hwloc__find_insert_memory_parent(struct hwloc_topology *topology, hwloc_obj_t ob
    return parent;
  }

-  result = hwloc__insert_object_by_cpuset(topology, parent, group, report_error);
+  result = hwloc__insert_object_by_cpuset(topology, parent, group, reason);
  if (!result) {
    /* failed to insert, fallback to larger parent */
    return parent;
@@ -1586,8 +1611,7 @@ hwloc__find_insert_memory_parent(struct hwloc_topology *topology, hwloc_obj_t ob
 /* only works for MEMCACHE and NUMAnode with a single bit in nodeset */
 static hwloc_obj_t
 hwloc___attach_memory_object_by_nodeset(struct hwloc_topology *topology, hwloc_obj_t parent,
-					hwloc_obj_t obj,
-					hwloc_report_error_t report_error)
+					hwloc_obj_t obj, const char *reason)
 {
  hwloc_obj_t *curp = &parent->memory_first_child;
  unsigned first = hwloc_bitmap_first(obj->nodeset);
@@ -1611,20 +1635,12 @@ hwloc___attach_memory_object_by_nodeset(struct hwloc_topology *topology, hwloc_o
      if (obj->type == HWLOC_OBJ_NUMANODE) {
 	if (cur->type == HWLOC_OBJ_NUMANODE) {
 	  /* identical NUMA nodes? ignore the new one */
-	  if (report_error) {
-	    char curstr[512];
-	    char objstr[512];
-	    char msg[1100];
-	    hwloc__report_error_format_obj(curstr, sizeof(curstr), cur);
-	    hwloc__report_error_format_obj(objstr, sizeof(objstr), obj);
-	    snprintf(msg, sizeof(msg), "%s and %s have identical nodesets!", objstr, curstr);
-	    report_error(msg, __LINE__);
-	  }
+          report_insert_error(obj, cur, "NUMAnodes with identical nodesets", reason);
 	  return NULL;
 	}
 	assert(cur->type == HWLOC_OBJ_MEMCACHE);
 	/* insert the new NUMA node below that existing memcache */
-	return hwloc___attach_memory_object_by_nodeset(topology, cur, obj, report_error);
+	return hwloc___attach_memory_object_by_nodeset(topology, cur, obj, reason);

      } else {
 	assert(obj->type == HWLOC_OBJ_MEMCACHE);
@@ -1637,7 +1653,7 @@ hwloc___attach_memory_object_by_nodeset(struct hwloc_topology *topology, hwloc_o
 	     * (depth starts from the NUMA node).
 	     * insert the new memcache below the existing one
 	     */
-	    return hwloc___attach_memory_object_by_nodeset(topology, cur, obj, report_error);
+	    return hwloc___attach_memory_object_by_nodeset(topology, cur, obj, reason);
 	}
 	/* insert the memcache above the existing memcache or numa node */
 	obj->next_sibling = cur->next_sibling;
@@ -1673,8 +1689,7 @@ hwloc___attach_memory_object_by_nodeset(struct hwloc_topology *topology, hwloc_o
 */
 struct hwloc_obj *
 hwloc__attach_memory_object(struct hwloc_topology *topology, hwloc_obj_t parent,
-			    hwloc_obj_t obj,
-			    hwloc_report_error_t report_error)
+			    hwloc_obj_t obj, const char *reason)
 {
  hwloc_obj_t result;

@@ -1704,7 +1719,7 @@ hwloc__attach_memory_object(struct hwloc_topology *topology, hwloc_obj_t parent,
  hwloc_bitmap_copy(obj->complete_cpuset, parent->complete_cpuset);
 #endif

-  result = hwloc___attach_memory_object_by_nodeset(topology, parent, obj, report_error);
+  result = hwloc___attach_memory_object_by_nodeset(topology, parent, obj, reason);
  if (result == obj) {
    /* Add the bit to the top sets, and to the parent CPU-side object */
    if (obj->type == HWLOC_OBJ_NUMANODE) {
@@ -1722,8 +1737,7 @@ hwloc__attach_memory_object(struct hwloc_topology *topology, hwloc_obj_t parent,
 /* insertion routine that lets you change the error reporting callback */
 struct hwloc_obj *
 hwloc__insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t root,
-			       hwloc_obj_t obj,
-			       hwloc_report_error_t report_error)
+			       hwloc_obj_t obj, const char *reason)
 {
  struct hwloc_obj *result;

@@ -1740,20 +1754,20 @@ hwloc__insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t root

  if (hwloc__obj_type_is_memory(obj->type)) {
    if (!root) {
-      root = hwloc__find_insert_memory_parent(topology, obj, report_error);
+      root = hwloc__find_insert_memory_parent(topology, obj, reason);
      if (!root) {
 	hwloc_free_unlinked_object(obj);
 	return NULL;
      }
    }
-    return hwloc__attach_memory_object(topology, root, obj, report_error);
+    return hwloc__attach_memory_object(topology, root, obj, reason);
  }

  if (!root)
    /* Start at the top. */
    root = topology->levels[0][0];

-  result = hwloc___insert_object_by_cpuset(topology, root, obj, report_error);
+  result = hwloc___insert_object_by_cpuset(topology, root, obj, reason);
  if (result && result->type == HWLOC_OBJ_PU) {
      /* Add the bit to the top sets */
      if (hwloc_bitmap_isset(result->cpuset, result->os_index))
@@ -1769,12 +1783,6 @@ hwloc__insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t root

 /* the default insertion routine warns in case of error.
 * it's used by most backends */
-struct hwloc_obj *
-hwloc_insert_object_by_cpuset(struct hwloc_topology *topology, hwloc_obj_t obj)
-{
-  return hwloc__insert_object_by_cpuset(topology, NULL, obj, hwloc_report_os_error);
-}
-
 void
 hwloc_insert_object_by_parent(struct hwloc_topology *topology, hwloc_obj_t parent, hwloc_obj_t obj)
 {
@@ -1917,6 +1925,7 @@ hwloc_topology_insert_group_object(struct hwloc_topology *topology, hwloc_obj_t
      if (hwloc_bitmap_isset(nodeset, numa->os_index))
 	hwloc_bitmap_or(obj->cpuset, obj->cpuset, numa->cpuset);
  }
+  /* FIXME insert by nodeset to group NUMAs even if CPUless? */

  cmp = hwloc_obj_cmp_sets(obj, root);
  if (cmp == HWLOC_OBJ_INCLUDED) {
@@ -1928,12 +1937,24 @@ hwloc_topology_insert_group_object(struct hwloc_topology *topology, hwloc_obj_t

  if (!res)
    return NULL;
-  if (res != obj)
-    /* merged */
+
+  if (res != obj && res->type != HWLOC_OBJ_GROUP)
+    /* merged, not into a Group, nothing to update */
    return res;

+  /* res == obj means that the object was inserted.
+   * We need to reconnect levels, fill all its cpu/node sets,
+   * compute its total memory, group depth, etc.
+   *
+   * res != obj usually means that our new group was merged into an
+   * existing object, so nothing needs to be recomputed.
+   * However, if merging with an existing group, depending on their kinds,
+   * the contents of obj may overwrite the contents of the old group.
+   * This requires reconnecting levels, filling sets, recomputing total memory, etc.
+   */
+
  /* properly inserted */
-  hwloc_obj_add_children_sets(obj);
+  hwloc_obj_add_children_sets(res);
  if (hwloc_topology_reconnect(topology, 0) < 0)
    return NULL;

@@ -1945,7 +1966,7 @@ hwloc_topology_insert_group_object(struct hwloc_topology *topology, hwloc_obj_t
 #endif
    hwloc_topology_check(topology);

-  return obj;
+  return res;
 }

 hwloc_obj_t
@@ -2047,7 +2068,7 @@ hwloc_find_insert_io_parent_by_complete_cpuset(struct hwloc_topology *topology,
  hwloc_bitmap_and(cpuset, cpuset, hwloc_topology_get_topology_cpuset(topology));
  group_obj->cpuset = hwloc_bitmap_dup(cpuset);
  group_obj->attr->group.kind = HWLOC_GROUP_KIND_IO;
-  parent = hwloc__insert_object_by_cpuset(topology, largeparent, group_obj, hwloc_report_os_error);
+  parent = hwloc__insert_object_by_cpuset(topology, largeparent, group_obj, "topology:io_parent");
  if (!parent)
    /* Failed to insert the Group, maybe a conflicting cpuset */
    return largeparent;
@@ -3251,7 +3272,7 @@ hwloc_discover(struct hwloc_topology *topology,
   * produced by hwloc_setup_pu_level()
   */

-  /* To be able to just use hwloc_insert_object_by_cpuset to insert the object
+  /* To be able to just use hwloc__insert_object_by_cpuset to insert the object
   * in the topology according to the cpuset, the cpuset field must be
   * initialized.
   */
@@ -3356,7 +3377,7 @@ hwloc_discover(struct hwloc_topology *topology,
    hwloc_bitmap_set(node->nodeset, 0);
    memcpy(&node->attr->numanode, &topology->machine_memory, sizeof(topology->machine_memory));
    memset(&topology->machine_memory, 0, sizeof(topology->machine_memory));
-    hwloc_insert_object_by_cpuset(topology, node);
+    hwloc__insert_object_by_cpuset(topology, NULL, node, "core:defaultnumanode");
  } else {
    /* if we're sure we found all NUMA nodes without their sizes (x86 backend?),
     * we could split topology->total_memory in all of them.
@@ -3514,6 +3535,7 @@ hwloc_topology_setup_defaults(struct hwloc_topology *topology)
  memset(topology->support.discovery, 0, sizeof(*topology->support.discovery));
  memset(topology->support.cpubind, 0, sizeof(*topology->support.cpubind));
  memset(topology->support.membind, 0, sizeof(*topology->support.membind));
+  memset(topology->support.misc, 0, sizeof(*topology->support.misc));

  /* Only the System object on top by default */
  topology->next_gp_index = 1; /* keep 0 as an invalid value */
@@ -3590,6 +3612,7 @@ hwloc__topology_init (struct hwloc_topology **topologyp,
  topology->support.discovery = hwloc_tma_malloc(tma, sizeof(*topology->support.discovery));
  topology->support.cpubind = hwloc_tma_malloc(tma, sizeof(*topology->support.cpubind));
  topology->support.membind = hwloc_tma_malloc(tma, sizeof(*topology->support.membind));
+  topology->support.misc = hwloc_tma_malloc(tma, sizeof(*topology->support.misc));

  topology->nb_levels_allocated = nblevels; /* enough for default 10 levels = Mach+Pack+Die+NUMA+L3+L2+L1d+L1i+Co+PU */
  topology->levels = hwloc_tma_calloc(tma, topology->nb_levels_allocated * sizeof(*topology->levels));
@@ -3598,6 +3621,8 @@ hwloc__topology_init (struct hwloc_topology **topologyp,
  hwloc__topology_filter_init(topology);

  hwloc_internal_distances_init(topology);
+  hwloc_internal_memattrs_init(topology);
+  hwloc_internal_cpukinds_init(topology);

  topology->userdata_export_cb = NULL;
  topology->userdata_import_cb = NULL;
@@ -3691,7 +3716,7 @@ hwloc_topology_set_flags (struct hwloc_topology *topology, unsigned long flags)
    return -1;
  }

-  if (flags & ~(HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED|HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM|HWLOC_TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES)) {
+  if (flags & ~(HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED|HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM|HWLOC_TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES|HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT)) {
    errno = EINVAL;
    return -1;
  }
@@ -3827,7 +3852,9 @@ hwloc_topology_clear (struct hwloc_topology *topology)
 {
  /* no need to set to NULL after free() since callers will call setup_defaults() or just destroy the rest of the topology */
  unsigned l;
+  hwloc_internal_cpukinds_destroy(topology);
  hwloc_internal_distances_destroy(topology);
+  hwloc_internal_memattrs_destroy(topology);
  hwloc_free_object_and_children(topology->levels[0][0]);
  hwloc_bitmap_free(topology->allowed_cpuset);
  hwloc_bitmap_free(topology->allowed_nodeset);
@@ -3858,6 +3885,7 @@ hwloc_topology_destroy (struct hwloc_topology *topology)
  free(topology->support.discovery);
  free(topology->support.cpubind);
  free(topology->support.membind);
+  free(topology->support.misc);
  free(topology);
 }

@@ -3873,7 +3901,9 @@ hwloc_topology_load (struct hwloc_topology *topology)
    return -1;
  }

+  /* initialize envvar-related things */
  hwloc_internal_distances_prepare(topology);
+  hwloc_internal_memattrs_prepare(topology);

  if (getenv("HWLOC_XML_USERDATA_NOT_DECODED"))
    topology->userdata_not_decoded = 1;
@@ -3954,6 +3984,9 @@ hwloc_topology_load (struct hwloc_topology *topology)
 #endif
    hwloc_topology_check(topology);

+  /* Rank cpukinds */
+  hwloc_internal_cpukinds_rank(topology);
+
  /* Mark distances objs arrays as invalid since we may have removed objects
   * from the topology after adding the distances (remove_empty, etc).
   * It would be hard to actually verify whether it's needed.
@@ -3964,6 +3997,10 @@ hwloc_topology_load (struct hwloc_topology *topology)
   */
  hwloc_internal_distances_refresh(topology);

+  /* Same for memattrs */
+  hwloc_internal_memattrs_need_refresh(topology);
+  hwloc_internal_memattrs_refresh(topology);
+
  topology->is_loaded = 1;

  if (topology->backend_phases & HWLOC_DISC_PHASE_TWEAK) {
@@ -4246,10 +4283,12 @@ hwloc_topology_restrict(struct hwloc_topology *topology, hwloc_const_bitmap_t se

  /* some objects may have disappeared, we need to update distances objs arrays */
  hwloc_internal_distances_invalidate_cached_objs(topology);
+  hwloc_internal_memattrs_need_refresh(topology);

  hwloc_filter_levels_keep_structure(topology);
  hwloc_propagate_symmetric_subtree(topology, topology->levels[0][0]);
  propagate_total_memory(topology->levels[0][0]);
+  hwloc_internal_cpukinds_restrict(topology);

 #ifndef HWLOC_DEBUG
  if (getenv("HWLOC_DEBUG_CHECK"))
@@ -4334,6 +4373,15 @@ hwloc_topology_allow(struct hwloc_topology *topology,
  return -1;
 }

+int
+hwloc_topology_refresh(struct hwloc_topology *topology)
+{
+  hwloc_internal_cpukinds_rank(topology);
+  hwloc_internal_distances_refresh(topology);
+  hwloc_internal_memattrs_refresh(topology);
+  return 0;
+}
+
 int
 hwloc_topology_is_thissystem(struct hwloc_topology *topology)
 {
@@ -4628,6 +4676,9 @@ hwloc__check_misc_children(hwloc_topology_t topology, hwloc_bitmap_t gp_indexes,
 static void
 hwloc__check_object(hwloc_topology_t topology, hwloc_bitmap_t gp_indexes, hwloc_obj_t obj)
 {
+  hwloc_uint64_t total_memory;
+  hwloc_obj_t child;
+
  assert(!hwloc_bitmap_isset(gp_indexes, obj->gp_index));
  hwloc_bitmap_set(gp_indexes, obj->gp_index);

@@ -4685,6 +4736,18 @@ hwloc__check_object(hwloc_topology_t topology, hwloc_bitmap_t gp_indexes, hwloc_
    assert(hwloc_cache_type_by_depth_type(obj->attr->cache.depth, obj->attr->cache.type) == obj->type);
  }

+  /* check total memory */
+  total_memory = 0;
+  if (obj->type == HWLOC_OBJ_NUMANODE)
+    total_memory += obj->attr->numanode.local_memory;
+  for_each_child(child, obj) {
+    total_memory += child->total_memory;
+  }
+  for_each_memory_child(child, obj) {
+    total_memory += child->total_memory;
+  }
+  assert(total_memory == obj->total_memory);
+
  /* check children */
  hwloc__check_normal_children(topology, gp_indexes, obj);
  hwloc__check_memory_children(topology, gp_indexes, obj);
--- a/src/3rdparty/hwloc/src/traversal.c
+++ b/src/3rdparty/hwloc/src/traversal.c
@@ -1,7 +1,7 @@
 /*
 * Copyright © 2009 CNRS
- * Copyright © 2009-2019 Inria.  All rights reserved.
- * Copyright © 2009-2010 Université Bordeaux
+ * Copyright © 2009-2020 Inria.  All rights reserved.
+ * Copyright © 2009-2010, 2020 Université Bordeaux
 * Copyright © 2009-2011 Cisco Systems, Inc.  All rights reserved.
 * See COPYING in top-level directory.
 */
@@ -138,6 +138,37 @@ hwloc_obj_type_is_icache(hwloc_obj_type_t type)
  return hwloc__obj_type_is_icache(type);
 }

+static hwloc_obj_t hwloc_get_obj_by_depth_and_gp_index(hwloc_topology_t topology, unsigned depth, uint64_t gp_index)
+{
+  hwloc_obj_t obj = hwloc_get_obj_by_depth(topology, depth, 0);
+  while (obj) {
+    if (obj->gp_index == gp_index)
+      return obj;
+    obj = obj->next_cousin;
+  }
+  return NULL;
+}
+
+hwloc_obj_t hwloc_get_obj_by_type_and_gp_index(hwloc_topology_t topology, hwloc_obj_type_t type, uint64_t gp_index)
+{
+  int depth = hwloc_get_type_depth(topology, type);
+  if (depth == HWLOC_TYPE_DEPTH_UNKNOWN)
+    return NULL;
+  if (depth == HWLOC_TYPE_DEPTH_MULTIPLE) {
+    for(depth=1 /* no multiple machine levels */;
+	(unsigned) depth < topology->nb_levels-1 /* no multiple PU levels */;
+	depth++) {
+      if (hwloc_get_depth_type(topology, depth) == type) {
+	hwloc_obj_t obj = hwloc_get_obj_by_depth_and_gp_index(topology, depth, gp_index);
+	if (obj)
+	  return obj;
+      }
+    }
+    return NULL;
+  }
+  return hwloc_get_obj_by_depth_and_gp_index(topology, depth, gp_index);
+}
+
 unsigned hwloc_get_closest_objs (struct hwloc_topology *topology, struct hwloc_obj *src, struct hwloc_obj **objs, unsigned max)
 {
  struct hwloc_obj *parent, *nextparent, **src_objs;
@@ -654,7 +685,11 @@ hwloc_obj_attr_snprintf(char * __hwloc_restrict string, size_t size, hwloc_obj_t
    unsigned i;
    for(i=0; i<obj->infos_count; i++) {
      struct hwloc_info_s *info = &obj->infos[i];
-      const char *quote = strchr(info->value, ' ') ? "\"" : "";
+      const char *quote;
+      if (strchr(info->value, ' '))
+        quote = "\"";
+      else
+        quote = "";
      res = hwloc_snprintf(tmp, tmplen, "%s%s=%s%s%s",
 			     prefix,
 			     info->name,
@@ -673,3 +708,31 @@ hwloc_obj_attr_snprintf(char * __hwloc_restrict string, size_t size, hwloc_obj_t

  return ret;
 }
+
+int hwloc_bitmap_singlify_per_core(hwloc_topology_t topology, hwloc_bitmap_t cpuset, unsigned which)
+{
+  hwloc_obj_t core = NULL;
+  while ((core = hwloc_get_next_obj_covering_cpuset_by_type(topology, cpuset, HWLOC_OBJ_CORE, core)) != NULL) {
+    /* this core has some PUs in the cpuset, find the which-th one */
+    unsigned i = 0;
+    int pu = -1;
+    do {
+      pu = hwloc_bitmap_next(core->cpuset, pu);
+      if (pu == -1) {
+	/* no which-th PU in cpuset and core, remove the entire core */
+	hwloc_bitmap_andnot(cpuset, cpuset, core->cpuset);
+	break;
+      }
+      if (hwloc_bitmap_isset(cpuset, pu)) {
+	if (i == which) {
+	  /* remove the entire core except that exact pu */
+	  hwloc_bitmap_andnot(cpuset, cpuset, core->cpuset);
+	  hwloc_bitmap_set(cpuset, pu);
+	  break;
+	}
+	i++;
+      }
+    } while (1);
+  }
+  return 0;
+}
--- a/src/3rdparty/libcpuid/CMakeLists.txt
+++ b/src/3rdparty/libcpuid/CMakeLists.txt
@@ -1,38 +0,0 @@
-cmake_minimum_required (VERSION 2.8)
-project (cpuid C)
-
-add_definitions(/DVERSION="0.4.0")
-
-set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} -Os")
-
-set(HEADERS
-    libcpuid.h
-    libcpuid_types.h
-    libcpuid_constants.h
-    libcpuid_internal.h
-    amd_code_t.h
-    intel_code_t.h
-    recog_amd.h
-    recog_intel.h
-    asm-bits.h
-    libcpuid_util.h
-    )
-
-set(SOURCES
-    cpuid_main.c
-    asm-bits.c
-    recog_amd.c
-    recog_intel.c
-    libcpuid_util.c
-   )
-
-if (CMAKE_CL_64)
-    enable_language(ASM_MASM)
-    set(SOURCES_ASM masm-x64.asm)
-endif()
-
-add_library(cpuid STATIC
-    ${HEADERS}
-    ${SOURCES}
-    ${SOURCES_ASM}
-    )
--- a/src/3rdparty/libcpuid/amd_code_t.h
+++ b/src/3rdparty/libcpuid/amd_code_t.h
@@ -1,39 +0,0 @@
-/*
- * Copyright 2016  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * This file contains a list of internal codes we use in detection. It is
- * of no external use and isn't a complete list of AMD products.
- */
-	CODE2(OPTERON_800, 1000),
-	CODE(PHENOM),
-	CODE(PHENOM2),
-	CODE(FUSION_C),
-	CODE(FUSION_E),
-	CODE(FUSION_EA),
-	CODE(FUSION_Z),
-	CODE(FUSION_A),
-	
--- a/src/3rdparty/libcpuid/asm-bits.c
+++ b/src/3rdparty/libcpuid/asm-bits.c
@@ -1,836 +0,0 @@
-/*
- * Copyright 2008  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include "libcpuid.h"
-#include "asm-bits.h"
-
-int cpuid_exists_by_eflags(void)
-{
-#if defined(PLATFORM_X64)
-	return 1; /* CPUID is always present on the x86_64 */
-#elif defined(PLATFORM_X86)
-#  if defined(COMPILER_GCC) || defined(COMPILER_CLANG)
-	int result;
-	__asm __volatile(
-		"	pushfl\n"
-		"	pop	%%eax\n"
-		"	mov	%%eax,	%%ecx\n"
-		"	xor	$0x200000,	%%eax\n"
-		"	push	%%eax\n"
-		"	popfl\n"
-		"	pushfl\n"
-		"	pop	%%eax\n"
-		"	xor	%%ecx,	%%eax\n"
-		"	mov	%%eax,	%0\n"
-		"	push	%%ecx\n"
-		"	popfl\n"
-		: "=m"(result)
-		: :"eax", "ecx", "memory");
-	return (result != 0);
-#  elif defined(COMPILER_MICROSOFT)
-	int result;
-	__asm {
-		pushfd
-		pop	eax
-		mov	ecx,	eax
-		xor	eax,	0x200000
-		push	eax
-		popfd
-		pushfd
-		pop	eax
-		xor	eax,	ecx
-		mov	result,	eax
-		push	ecx
-		popfd
-	};
-	return (result != 0);
-#  else
-	return 0;
-#  endif /* COMPILER_MICROSOFT */
-#elif defined(PLATFORM_ARM)
-  return 0;
-#else
-	return 0;
-#endif /* PLATFORM_X86 */
-}
-
-#ifdef INLINE_ASM_SUPPORTED
-/* 
- * with MSVC/AMD64, the exec_cpuid() and cpu_rdtsc() functions
- * are implemented in separate .asm files. Otherwise, use inline assembly
- */
-void exec_cpuid(uint32_t *regs)
-{
-#  if defined(COMPILER_GCC) || defined(COMPILER_CLANG)
-#	ifdef PLATFORM_X64
-	__asm __volatile(
-		"	mov	%0,	%%rdi\n"
-
-		"	push	%%rbx\n"
-		"	push	%%rcx\n"
-		"	push	%%rdx\n"
-		
-		"	mov	(%%rdi),	%%eax\n"
-		"	mov	4(%%rdi),	%%ebx\n"
-		"	mov	8(%%rdi),	%%ecx\n"
-		"	mov	12(%%rdi),	%%edx\n"
-		
-		"	cpuid\n"
-		
-		"	movl	%%eax,	(%%rdi)\n"
-		"	movl	%%ebx,	4(%%rdi)\n"
-		"	movl	%%ecx,	8(%%rdi)\n"
-		"	movl	%%edx,	12(%%rdi)\n"
-		"	pop	%%rdx\n"
-		"	pop	%%rcx\n"
-		"	pop	%%rbx\n"
-		:
-		:"m"(regs)
-		:"memory", "eax", "rdi"
-	);
-#	elif defined(PLATFORM_X86)
-	__asm __volatile(
-		"	mov	%0,	%%edi\n"
-
-		"	push	%%ebx\n"
-		"	push	%%ecx\n"
-		"	push	%%edx\n"
-		
-		"	mov	(%%edi),	%%eax\n"
-		"	mov	4(%%edi),	%%ebx\n"
-		"	mov	8(%%edi),	%%ecx\n"
-		"	mov	12(%%edi),	%%edx\n"
-		
-		"	cpuid\n"
-		
-		"	mov	%%eax,	(%%edi)\n"
-		"	mov	%%ebx,	4(%%edi)\n"
-		"	mov	%%ecx,	8(%%edi)\n"
-		"	mov	%%edx,	12(%%edi)\n"
-		"	pop	%%edx\n"
-		"	pop	%%ecx\n"
-		"	pop	%%ebx\n"
-		:
-		:"m"(regs)
-		:"memory", "eax", "edi"
-	);
-#	elif defined(PLATFORM_ARM)
-#	endif /* COMPILER_GCC */
-#else
-#  ifdef COMPILER_MICROSOFT
-	__asm {
-		push	ebx
-		push	ecx
-		push	edx
-		push	edi
-		mov	edi,	regs
-		
-		mov	eax,	[edi]
-		mov	ebx,	[edi+4]
-		mov	ecx,	[edi+8]
-		mov	edx,	[edi+12]
-		
-		cpuid
-		
-		mov	[edi],		eax
-		mov	[edi+4],	ebx
-		mov	[edi+8],	ecx
-		mov	[edi+12],	edx
-		
-		pop	edi
-		pop	edx
-		pop	ecx
-		pop	ebx
-	}
-#  else
-#    error "Unsupported compiler"
-#  endif /* COMPILER_MICROSOFT */
-#endif
-}
-#endif /* INLINE_ASSEMBLY_SUPPORTED */
-
-#ifdef INLINE_ASM_SUPPORTED
-void cpu_rdtsc(uint64_t* result)
-{
-	uint32_t low_part, hi_part;
-#if defined(COMPILER_GCC) || defined(COMPILER_CLANG)
-#ifdef PLATFORM_ARM
-  low_part = 0;
-  hi_part = 0;
-#else
-	__asm __volatile (
-		"	rdtsc\n"
-		"	mov	%%eax,	%0\n"
-		"	mov	%%edx,	%1\n"
-		:"=m"(low_part), "=m"(hi_part)::"memory", "eax", "edx"
-	);
-#endif
-#else
-#  ifdef COMPILER_MICROSOFT
-	__asm {
-		rdtsc
-		mov	low_part,	eax
-		mov	hi_part,	edx
-	};
-#  else
-#    error "Unsupported compiler"
-#  endif /* COMPILER_MICROSOFT */
-#endif /* COMPILER_GCC */
-	*result = (uint64_t)low_part + (((uint64_t) hi_part) << 32);
-}
-#endif /* INLINE_ASM_SUPPORTED */
-
-#ifdef INLINE_ASM_SUPPORTED
-void busy_sse_loop(int cycles)
-{
-#  if defined(COMPILER_GCC) || defined(COMPILER_CLANG)
-#ifndef __APPLE__
-#	define XALIGN ".balign 16\n"
-#else
-#	define XALIGN ".align 4\n"
-#endif
-#ifdef PLATFORM_ARM
-#else
-	__asm __volatile (
-		"	xorps	%%xmm0,	%%xmm0\n"
-		"	xorps	%%xmm1,	%%xmm1\n"
-		"	xorps	%%xmm2,	%%xmm2\n"
-		"	xorps	%%xmm3,	%%xmm3\n"
-		"	xorps	%%xmm4,	%%xmm4\n"
-		"	xorps	%%xmm5,	%%xmm5\n"
-		"	xorps	%%xmm6,	%%xmm6\n"
-		"	xorps	%%xmm7,	%%xmm7\n"
-		XALIGN
-		/* ".bsLoop:\n" */
-		"1:\n"
-		// 0:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 1:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 2:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 3:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 4:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 5:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 6:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 7:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 8:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		// 9:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//10:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//11:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//12:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//13:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//14:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//15:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//16:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//17:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//18:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//19:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//20:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//21:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//22:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//23:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//24:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//25:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//26:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//27:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//28:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//29:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//30:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		//31:
-		"	addps	%%xmm1, %%xmm0\n"
-		"	addps	%%xmm2, %%xmm1\n"
-		"	addps	%%xmm3, %%xmm2\n"
-		"	addps	%%xmm4, %%xmm3\n"
-		"	addps	%%xmm5, %%xmm4\n"
-		"	addps	%%xmm6, %%xmm5\n"
-		"	addps	%%xmm7, %%xmm6\n"
-		"	addps	%%xmm0, %%xmm7\n"
-		
-		"	dec	%%eax\n"
-		/* "jnz	.bsLoop\n" */
-		"	jnz	1b\n"
-		::"a"(cycles)
-	);
-#endif
-#else
-#  ifdef COMPILER_MICROSOFT
-	__asm {
-		mov	eax,	cycles
-		xorps	xmm0,	xmm0
-		xorps	xmm1,	xmm1
-		xorps	xmm2,	xmm2
-		xorps	xmm3,	xmm3
-		xorps	xmm4,	xmm4
-		xorps	xmm5,	xmm5
-		xorps	xmm6,	xmm6
-		xorps	xmm7,	xmm7
-		//--
-		align 16
-bsLoop:
-		// 0:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 1:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 2:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 3:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 4:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 5:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 6:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 7:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 8:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 9:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 10:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 11:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 12:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 13:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 14:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 15:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 16:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 17:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 18:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 19:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 20:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 21:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 22:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 23:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 24:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 25:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 26:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 27:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 28:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 29:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 30:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		// 31:
-		addps	xmm0,	xmm1
-		addps	xmm1,	xmm2
-		addps	xmm2,	xmm3
-		addps	xmm3,	xmm4
-		addps	xmm4,	xmm5
-		addps	xmm5,	xmm6
-		addps	xmm6,	xmm7
-		addps	xmm7,	xmm0
-		//----------------------
-		dec		eax
-		jnz		bsLoop
-	}
-#  else
-#    error "Unsupported compiler"
-#  endif /* COMPILER_MICROSOFT */
-#endif /* COMPILER_GCC */
-}
-#endif /* INLINE_ASSEMBLY_SUPPORTED */
--- a/src/3rdparty/libcpuid/asm-bits.h
+++ b/src/3rdparty/libcpuid/asm-bits.h
@@ -1,71 +0,0 @@
-/*
- * Copyright 2008  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-#ifndef __ASM_BITS_H__
-#define __ASM_BITS_H__
-#include "libcpuid.h"
-
-/* Determine Compiler: */
-#if defined(_MSC_VER)
-#if !defined(COMPILER_MICROSOFT)
-#	define COMPILER_MICROSOFT
-#endif
-#elif defined(__GNUC__)
-#if !defined(COMPILER_GCC)
-#	define COMPILER_GCC
-#endif
-#elif defined(__clang__)
-#if !defined(COMPILER_CLANG)
-#	define COMPILER_CLANG
-#endif
-#endif
-
-/* Determine Platform */
-#if defined(__x86_64__) || defined(_M_AMD64)
-#if !defined(PLATFORM_X64)
-#	define PLATFORM_X64
-#endif
-#elif defined(__i386__) || defined(_M_IX86)
-#if !defined(PLATFORM_X86)
-#	define PLATFORM_X86
-#endif
-#elif defined(__ARMEL__)
-#if !defined(PLATFORM_ARM)
-#	define PLATFORM_ARM
-#endif
-#endif
-
-/* Under Windows/AMD64 with MSVC, inline assembly isn't supported */
-#if (((defined(COMPILER_GCC) || defined(COMPILER_CLANG))) &&  \
-     (defined(PLATFORM_X64) || defined(PLATFORM_X86) || defined(PLATFORM_ARM))) || \
-	 (defined(COMPILER_MICROSOFT) && defined(PLATFORM_X86))
-#	define INLINE_ASM_SUPPORTED
-#endif
-
-int cpuid_exists_by_eflags(void);
-void exec_cpuid(uint32_t *regs);
-void busy_sse_loop(int cycles);
-
-#endif /* __ASM_BITS_H__ */
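
Note on the removed header above: it tests `__GNUC__` before `__clang__`. Clang also defines `__GNUC__`, so the `COMPILER_CLANG` branch is never reached on Clang builds and those builds are treated as `COMPILER_GCC`. A minimal sketch of the same compile-time detection with Clang tested first is given below. All `SKETCH_*` macros and `sketch_*` functions are hypothetical names for illustration only, not part of libcpuid or xmrig, and the ARM branch of the original rule is left out.

```c
#include <stddef.h>

/* Hypothetical SKETCH_* macros for illustration; not libcpuid names. */
#if defined(_MSC_VER)
#  define SKETCH_COMPILER "msvc"
#  define SKETCH_MSVC 1
#elif defined(__clang__)            /* tested before __GNUC__ on purpose */
#  define SKETCH_COMPILER "clang"
#  define SKETCH_GNU_STYLE_ASM 1
#elif defined(__GNUC__)
#  define SKETCH_COMPILER "gcc"
#  define SKETCH_GNU_STYLE_ASM 1
#else
#  define SKETCH_COMPILER "unknown"
#endif

#if defined(__x86_64__) || defined(_M_AMD64)
#  define SKETCH_PLATFORM "x64"
#  define SKETCH_X64 1
#elif defined(__i386__) || defined(_M_IX86)
#  define SKETCH_PLATFORM "x86"
#  define SKETCH_X86 1
#else
#  define SKETCH_PLATFORM "other"
#endif

/* Same idea as the removed rule, x86 only: GNU-style asm on x86/x64,
 * MSVC __asm blocks on 32-bit x86 only. */
#if (defined(SKETCH_GNU_STYLE_ASM) && (defined(SKETCH_X64) || defined(SKETCH_X86))) || \
    (defined(SKETCH_MSVC) && defined(SKETCH_X86))
#  define SKETCH_INLINE_ASM 1
#else
#  define SKETCH_INLINE_ASM 0
#endif

/* Small accessors so the result can be inspected at run time. */
const char *sketch_compiler(void)   { return SKETCH_COMPILER; }
const char *sketch_platform(void)   { return SKETCH_PLATFORM; }
int         sketch_inline_asm(void) { return SKETCH_INLINE_ASM; }
```

The removed `asm-bits.c` keys its GNU-style assembly off `COMPILER_GCC`, so the original ordering was harmless there. The reordering only matters if code needs to tell the two compilers apart.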
--- a/src/3rdparty/libcpuid/cpuid_main.c
+++ b/src/3rdparty/libcpuid/cpuid_main.c
@@ -1,389 +0,0 @@
-/*
- * Copyright 2008  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-#include "libcpuid.h"
-#include "libcpuid_internal.h"
-#include "recog_intel.h"
-#include "recog_amd.h"
-#include "asm-bits.h"
-#include "libcpuid_util.h"
-#ifdef HAVE_CONFIG_H
-#include "config.h"
-#endif
-#include <stdio.h>
-#include <string.h>
-#include <stdlib.h>
-
-/* Implementation: */
-
-static int _libcpiud_errno = ERR_OK;
-
-int set_error(cpu_error_t err)
-{
-	_libcpiud_errno = (int) err;
-	return (int) err;
-}
-
-static void cpu_id_t_constructor(struct cpu_id_t* id)
-{
-	memset(id, 0, sizeof(struct cpu_id_t));
-	id->l1_data_cache = id->l1_instruction_cache = id->l2_cache = id->l3_cache = id->l4_cache = -1;
-	id->l1_assoc = id->l2_assoc = id->l3_assoc = id->l4_assoc = -1;
-	id->l1_cacheline = id->l2_cacheline = id->l3_cacheline = id->l4_cacheline = -1;
-	id->sse_size = -1;
-}
-
-/* get_total_cpus() system specific code: uses OS routines to determine total number of CPUs */
-#ifdef __APPLE__
-#include <unistd.h>
-#include <mach/clock_types.h>
-#include <mach/clock.h>
-#include <mach/mach.h>
-static int get_total_cpus(void)
-{
-	kern_return_t kr;
-	host_basic_info_data_t basic_info;
-	host_info_t info = (host_info_t)&basic_info;
-	host_flavor_t flavor = HOST_BASIC_INFO;
-	mach_msg_type_number_t count = HOST_BASIC_INFO_COUNT;
-	kr = host_info(mach_host_self(), flavor, info, &count);
-	if (kr != KERN_SUCCESS) return 1;
-	return basic_info.avail_cpus;
-}
-#define GET_TOTAL_CPUS_DEFINED
-#endif
-
-#ifdef _WIN32
-#include <windows.h>
-static int get_total_cpus(void)
-{
-	SYSTEM_INFO system_info;
-	GetSystemInfo(&system_info);
-	return system_info.dwNumberOfProcessors;
-}
-#define GET_TOTAL_CPUS_DEFINED
-#endif
-
-#if defined linux || defined __linux__ || defined __sun
-#include <sys/sysinfo.h>
-#include <unistd.h>
- 
-static int get_total_cpus(void)
-{
-	return sysconf(_SC_NPROCESSORS_ONLN);
-}
-#define GET_TOTAL_CPUS_DEFINED
-#endif
-
-#if defined __FreeBSD__ || defined __OpenBSD__ || defined __NetBSD__ || defined __bsdi__ || defined __QNX__
-#include <sys/types.h>
-#include <sys/sysctl.h>
-
-static int get_total_cpus(void)
-{
-	int mib[2] = { CTL_HW, HW_NCPU };
-	int ncpus;
-	size_t len = sizeof(ncpus);
-	if (sysctl(mib, 2, &ncpus, &len, (void *) 0, 0) != 0) return 1;
-	return ncpus;
-}
-#define GET_TOTAL_CPUS_DEFINED
-#endif
-
-#ifndef GET_TOTAL_CPUS_DEFINED
-static int get_total_cpus(void)
-{
-	static int warning_printed = 0;
-	if (!warning_printed) {
-		warning_printed = 1;
-		warnf("Your system is not supported by libcpuid -- don't know how to detect the\n");
-		warnf("total number of CPUs on your system. It will be reported as 1.\n");
-		warnf("Please use cpu_id_t.logical_cpus field instead.\n");
-	}
-	return 1;
-}
-#endif /* GET_TOTAL_CPUS_DEFINED */
-
-
-static void load_features_common(struct cpu_raw_data_t* raw, struct cpu_id_t* data)
-{
-	const struct feature_map_t matchtable_edx1[] = {
-		{  0, CPU_FEATURE_FPU },
-		{  1, CPU_FEATURE_VME },
-		{  2, CPU_FEATURE_DE },
-		{  3, CPU_FEATURE_PSE },
-		{  4, CPU_FEATURE_TSC },
-		{  5, CPU_FEATURE_MSR },
-		{  6, CPU_FEATURE_PAE },
-		{  7, CPU_FEATURE_MCE },
-		{  8, CPU_FEATURE_CX8 },
-		{  9, CPU_FEATURE_APIC },
-		{ 11, CPU_FEATURE_SEP },
-		{ 12, CPU_FEATURE_MTRR },
-		{ 13, CPU_FEATURE_PGE },
-		{ 14, CPU_FEATURE_MCA },
-		{ 15, CPU_FEATURE_CMOV },
-		{ 16, CPU_FEATURE_PAT },
-		{ 17, CPU_FEATURE_PSE36 },
-		{ 19, CPU_FEATURE_CLFLUSH },
-		{ 23, CPU_FEATURE_MMX },
-		{ 24, CPU_FEATURE_FXSR },
-		{ 25, CPU_FEATURE_SSE },
-		{ 26, CPU_FEATURE_SSE2 },
-		{ 28, CPU_FEATURE_HT },
-	};
-	const struct feature_map_t matchtable_ecx1[] = {
-		{  0, CPU_FEATURE_PNI },
-		{  1, CPU_FEATURE_PCLMUL },
-		{  3, CPU_FEATURE_MONITOR },
-		{  9, CPU_FEATURE_SSSE3 },
-		{ 12, CPU_FEATURE_FMA3 },
-		{ 13, CPU_FEATURE_CX16 },
-		{ 19, CPU_FEATURE_SSE4_1 },
-		{ 20, CPU_FEATURE_SSE4_2 },
-		{ 22, CPU_FEATURE_MOVBE },
-		{ 23, CPU_FEATURE_POPCNT },
-		{ 25, CPU_FEATURE_AES },
-		{ 26, CPU_FEATURE_XSAVE },
-		{ 27, CPU_FEATURE_OSXSAVE },
-		{ 28, CPU_FEATURE_AVX },
-		{ 29, CPU_FEATURE_F16C },
-		{ 30, CPU_FEATURE_RDRAND },
-	};
-	const struct feature_map_t matchtable_ebx7[] = {
-		{  3, CPU_FEATURE_BMI1 },
-		{  5, CPU_FEATURE_AVX2 },
-		{  8, CPU_FEATURE_BMI2 },
-	};
-	const struct feature_map_t matchtable_edx81[] = {
-		{ 11, CPU_FEATURE_SYSCALL },
-		{ 27, CPU_FEATURE_RDTSCP },
-		{ 29, CPU_FEATURE_LM },
-	};
-	const struct feature_map_t matchtable_ecx81[] = {
-		{  0, CPU_FEATURE_LAHF_LM },
-	};
-	const struct feature_map_t matchtable_edx87[] = {
-		{  8, CPU_FEATURE_CONSTANT_TSC },
-	};
-	if (raw->basic_cpuid[0][0] >= 1) {
-		match_features(matchtable_edx1, COUNT_OF(matchtable_edx1), raw->basic_cpuid[1][3], data);
-		match_features(matchtable_ecx1, COUNT_OF(matchtable_ecx1), raw->basic_cpuid[1][2], data);
-	}
-	if (raw->basic_cpuid[0][0] >= 7) {
-		match_features(matchtable_ebx7, COUNT_OF(matchtable_ebx7), raw->basic_cpuid[7][1], data);
-	}
-	if (raw->ext_cpuid[0][0] >= 0x80000001) {
-		match_features(matchtable_edx81, COUNT_OF(matchtable_edx81), raw->ext_cpuid[1][3], data);
-		match_features(matchtable_ecx81, COUNT_OF(matchtable_ecx81), raw->ext_cpuid[1][2], data);
-	}
-	if (raw->ext_cpuid[0][0] >= 0x80000007) {
-		match_features(matchtable_edx87, COUNT_OF(matchtable_edx87), raw->ext_cpuid[7][3], data);
-	}
-	if (data->flags[CPU_FEATURE_SSE]) {
-		/* apply guesswork to check if the SSE unit width is 128 bit */
-		switch (data->vendor) {
-			case VENDOR_AMD:
-				data->sse_size = (data->ext_family >= 16 && data->ext_family != 17) ? 128 : 64;
-				break;
-			case VENDOR_INTEL:
-				data->sse_size = (data->family == 6 && data->ext_model >= 15) ? 128 : 64;
-				break;
-			default:
-				break;
-		}
-		/* leave the CPU_FEATURE_128BIT_SSE_AUTH 0; the advanced per-vendor detection routines
-		 * will set it accordingly if they detect the needed bit */
-	}
-}
-
-static cpu_vendor_t cpuid_vendor_identify(const uint32_t *raw_vendor, char *vendor_str)
-{
-	int i;
-	cpu_vendor_t vendor = VENDOR_UNKNOWN;
-	const struct { cpu_vendor_t vendor; char match[16]; }
-	matchtable[NUM_CPU_VENDORS] = {
-		/* source: http://www.sandpile.org/ia32/cpuid.htm */
-		{ VENDOR_INTEL		, "GenuineIntel" },
-		{ VENDOR_AMD		, "AuthenticAMD" },
-		{ VENDOR_CYRIX		, "CyrixInstead" },
-		{ VENDOR_NEXGEN		, "NexGenDriven" },
-		{ VENDOR_TRANSMETA	, "GenuineTMx86" },
-		{ VENDOR_UMC		, "UMC UMC UMC " },
-		{ VENDOR_CENTAUR	, "CentaurHauls" },
-		{ VENDOR_RISE		, "RiseRiseRise" },
-		{ VENDOR_SIS		, "SiS SiS SiS " },
-		{ VENDOR_NSC		, "Geode by NSC" },
-	};
-
-	memcpy(vendor_str + 0, &raw_vendor[1], 4);
-	memcpy(vendor_str + 4, &raw_vendor[3], 4);
-	memcpy(vendor_str + 8, &raw_vendor[2], 4);
-	vendor_str[12] = 0;
-
-	/* Determine vendor: */
-	for (i = 0; i < NUM_CPU_VENDORS; i++)
-		if (!strcmp(vendor_str, matchtable[i].match)) {
-			vendor = matchtable[i].vendor;
-			break;
-		}
-	return vendor;
-}
-
-static int cpuid_basic_identify(struct cpu_raw_data_t* raw, struct cpu_id_t* data)
-{
-	int i, j, basic, xmodel, xfamily, ext;
-	char brandstr[64] = {0};
-	data->vendor = cpuid_vendor_identify(raw->basic_cpuid[0], data->vendor_str);
-
-	if (data->vendor == VENDOR_UNKNOWN)
-		return set_error(ERR_CPU_UNKN);
-	basic = raw->basic_cpuid[0][0];
-	if (basic >= 1) {
-		data->family = (raw->basic_cpuid[1][0] >> 8) & 0xf;
-		data->model = (raw->basic_cpuid[1][0] >> 4) & 0xf;
-		data->stepping = raw->basic_cpuid[1][0] & 0xf;
-		xmodel = (raw->basic_cpuid[1][0] >> 16) & 0xf;
-		xfamily = (raw->basic_cpuid[1][0] >> 20) & 0xff;
-		if (data->vendor == VENDOR_AMD && data->family < 0xf)
-			data->ext_family = data->family;
-		else
-			data->ext_family = data->family + xfamily;
-		data->ext_model = data->model + (xmodel << 4);
-	}
-	ext = raw->ext_cpuid[0][0] - 0x80000000;
-	
-	/* obtain the brand string, if present: */
-	if (ext >= 4) {
-		for (i = 0; i < 3; i++)
-			for (j = 0; j < 4; j++)
-				memcpy(brandstr + i * 16 + j * 4,
-				       &raw->ext_cpuid[2 + i][j], 4);
-		brandstr[48] = 0;
-		i = 0;
-		while (brandstr[i] == ' ') i++;
-		strncpy(data->brand_str, brandstr + i, sizeof(data->brand_str));
-		data->brand_str[48] = 0;
-	}
-	load_features_common(raw, data);
-	data->total_logical_cpus = get_total_cpus();
-	return set_error(ERR_OK);
-}
-
-/* Interface: */
-
-int cpuid_get_total_cpus(void)
-{
-	return get_total_cpus();
-}
-
-int cpuid_present(void)
-{
-	return cpuid_exists_by_eflags();
-}
-
-void cpu_exec_cpuid(uint32_t eax, uint32_t* regs)
-{
-	regs[0] = eax;
-	regs[1] = regs[2] = regs[3] = 0;
-	exec_cpuid(regs);
-}
-
-void cpu_exec_cpuid_ext(uint32_t* regs)
-{
-	exec_cpuid(regs);
-}
-
-int cpuid_get_raw_data(struct cpu_raw_data_t* data)
-{
-	unsigned i;
-	if (!cpuid_present())
-		return set_error(ERR_NO_CPUID);
-	for (i = 0; i < 32; i++)
-		cpu_exec_cpuid(i, data->basic_cpuid[i]);
-	for (i = 0; i < 32; i++)
-		cpu_exec_cpuid(0x80000000 + i, data->ext_cpuid[i]);
-	for (i = 0; i < MAX_INTELFN4_LEVEL; i++) {
-		memset(data->intel_fn4[i], 0, sizeof(data->intel_fn4[i]));
-		data->intel_fn4[i][0] = 4;
-		data->intel_fn4[i][2] = i;
-		cpu_exec_cpuid_ext(data->intel_fn4[i]);
-	}
-	for (i = 0; i < MAX_INTELFN11_LEVEL; i++) {
-		memset(data->intel_fn11[i], 0, sizeof(data->intel_fn11[i]));
-		data->intel_fn11[i][0] = 11;
-		data->intel_fn11[i][2] = i;
-		cpu_exec_cpuid_ext(data->intel_fn11[i]);
-	}
-	for (i = 0; i < MAX_INTELFN12H_LEVEL; i++) {
-		memset(data->intel_fn12h[i], 0, sizeof(data->intel_fn12h[i]));
-		data->intel_fn12h[i][0] = 0x12;
-		data->intel_fn12h[i][2] = i;
-		cpu_exec_cpuid_ext(data->intel_fn12h[i]);
-	}
-	for (i = 0; i < MAX_INTELFN14H_LEVEL; i++) {
-		memset(data->intel_fn14h[i], 0, sizeof(data->intel_fn14h[i]));
-		data->intel_fn14h[i][0] = 0x14;
-		data->intel_fn14h[i][2] = i;
-		cpu_exec_cpuid_ext(data->intel_fn14h[i]);
-	}
-	return set_error(ERR_OK);
-}
-
-int cpu_ident_internal(struct cpu_raw_data_t* raw, struct cpu_id_t* data, struct internal_id_info_t* internal)
-{
-	int r;
-	struct cpu_raw_data_t myraw;
-	if (!raw) {
-		if ((r = cpuid_get_raw_data(&myraw)) < 0)
-			return set_error(r);
-		raw = &myraw;
-	}
-	cpu_id_t_constructor(data);
-	if ((r = cpuid_basic_identify(raw, data)) < 0)
-		return set_error(r);
-	switch (data->vendor) {
-		case VENDOR_INTEL:
-			r = cpuid_identify_intel(raw, data, internal);
-			break;
-		case VENDOR_AMD:
-			r = cpuid_identify_amd(raw, data, internal);
-			break;
-		default:
-			break;
-	}
-	return set_error(r);
-}
-
-int cpu_identify(struct cpu_raw_data_t* raw, struct cpu_id_t* data)
-{
-	struct internal_id_info_t throwaway;
-	return cpu_ident_internal(raw, data, &throwaway);
-}
-
-const char* cpuid_lib_version(void)
-{
-	return VERSION;
-}
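
For readers of the removed `cpuid_basic_identify()` and `cpuid_vendor_identify()`, the two decoding steps can be sketched in isolation as follows. This is an illustrative sketch, not the library's code: `sketch_signature`, `sketch_decode_signature()` and `sketch_vendor_string()` are hypothetical names, the family/model rule mirrors the removed lines rather than any vendor manual, and the vendor-string copy assumes a little-endian host (true on x86).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type; libcpuid keeps these fields in struct cpu_id_t. */
struct sketch_signature {
	int family, model, stepping;
	int ext_family, ext_model;
};

/* Decode EAX of CPUID leaf 1 the way the removed code does.
 * is_amd != 0 selects the AMD rule: the extended family field is only
 * added when the base family is 0xF. */
struct sketch_signature sketch_decode_signature(uint32_t eax, int is_amd)
{
	struct sketch_signature s;
	int xmodel  = (int)((eax >> 16) & 0xf);
	int xfamily = (int)((eax >> 20) & 0xff);

	s.family   = (int)((eax >> 8) & 0xf);
	s.model    = (int)((eax >> 4) & 0xf);
	s.stepping = (int)(eax & 0xf);

	if (is_amd && s.family < 0xf)
		s.ext_family = s.family;
	else
		s.ext_family = s.family + xfamily;
	s.ext_model = s.model + (xmodel << 4);
	return s;
}

/* Build the 12-character vendor string from CPUID leaf 0.
 * regs is {EAX, EBX, ECX, EDX}; the string is stored in the order
 * EBX, EDX, ECX. out must hold at least 13 bytes. */
void sketch_vendor_string(const uint32_t regs[4], char out[13])
{
	assert(regs != NULL && out != NULL);
	memcpy(out + 0, &regs[1], 4);
	memcpy(out + 4, &regs[3], 4);
	memcpy(out + 8, &regs[2], 4);
	out[12] = '\0';
}
```

As a worked example, EAX = 0x000906EA decodes to family 6, model 0xE, stepping 0xA, and with the extended model field 9 the combined model is 0x9E. With the AMD rule, EAX = 0x00800F11 gives an extended family of 0xF + 8 = 23.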
--- a/src/3rdparty/libcpuid/intel_code_t.h
+++ b/src/3rdparty/libcpuid/intel_code_t.h
@@ -1,58 +0,0 @@
-/*
- * Copyright 2016  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-/*
- * This file contains a list of internal codes we use in detection. It is
- * of no external use and isn't a complete list of Intel products.
- */
-	CODE2(PENTIUM, 2000),
-	
-	CODE(IRWIN),
-	CODE(POTOMAC),
-	CODE(GAINESTOWN),
-	CODE(WESTMERE),
-	
-	CODE(PENTIUM_M),
-	CODE(NOT_CELERON),	
-	
-	CODE(CORE_SOLO),
-	CODE(MOBILE_CORE_SOLO),
-	CODE(CORE_DUO),
-	CODE(MOBILE_CORE_DUO),
-	
-	CODE(WOLFDALE),
-	CODE(MEROM),
-	CODE(PENRYN),
-	CODE(QUAD_CORE),
-	CODE(DUAL_CORE_HT),
-	CODE(QUAD_CORE_HT),
-	CODE(MORE_THAN_QUADCORE),
-	CODE(PENTIUM_D),
-	
-	CODE(SILVERTHORNE),
-	CODE(DIAMONDVILLE),
-	CODE(PINEVIEW),
-	CODE(CEDARVIEW),
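
The removed list above has no meaning on its own: every entry is a `CODE(...)` or `CODE2(...)` invocation that only takes effect once the including file defines those macros. How libcpuid's own sources define them is not shown in this part of the diff, so the sketch below is an assumption: it demonstrates the common X-macro pattern with a shortened, hypothetical `SKETCH_CODE_LIST` standing in for the `#include`, expanded once into an enum and once into a name table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened stand-in for the included list. */
#define SKETCH_CODE_LIST \
	CODE2(PENTIUM, 2000), \
	CODE(IRWIN), \
	CODE(POTOMAC), \
	CODE(GAINESTOWN),

/* First expansion: plain enumerators; CODE2 pins an explicit value. */
#define CODE(x) x
#define CODE2(x, y) x = y
enum sketch_code { SKETCH_NA = 0, SKETCH_CODE_LIST };
#undef CODE
#undef CODE2

/* Second expansion: (value, name) pairs built from the same list. */
struct sketch_code_entry { int code; const char *name; };
#define CODE(x) { x, #x }
#define CODE2(x, y) CODE(x)
static const struct sketch_code_entry sketch_names[] = { SKETCH_CODE_LIST };
#undef CODE
#undef CODE2

/* Return the textual name of a code, or NULL when it is not in the list. */
const char *sketch_lookup_name(int code)
{
	size_t i;
	for (i = 0; i < sizeof(sketch_names) / sizeof(sketch_names[0]); i++)
		if (sketch_names[i].code == code)
			return sketch_names[i].name;
	return NULL;
}
```

Keeping the list in a single place means the enum and the name table cannot drift apart. Entries after `CODE2(PENTIUM, 2000)` are numbered consecutively from 2001.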
--- a/src/3rdparty/libcpuid/libcpuid.h
+++ b/src/3rdparty/libcpuid/libcpuid.h
@@ -1,678 +0,0 @@
-/*
- * Copyright 2008  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-#ifndef __LIBCPUID_H__
-#define __LIBCPUID_H__
-/**
- * \file     libcpuid.h
- * \author   Veselin Georgiev
- * \date     Oct 2008
- * \version  0.4.0
- *
- * Version history:
- *
- * * 0.1.0 (2008-10-15): initial adaptation from wxfractgui sources
- * * 0.1.1 (2009-07-06): Added intel_fn11 fields to cpu_raw_data_t to handle
- *                       new processor topology enumeration required on Core i7
- * * 0.1.2 (2009-09-26): Added support for MSR reading through self-extracting
- *                       kernel driver on Win32.
- * * 0.1.3 (2010-04-20): Added support for more accurate CPU clock
- *                       measurements with cpu_clock_by_ic()
- * * 0.2.0 (2011-10-11): Support for AMD Bulldozer CPUs, 128-bit SSE unit size
- *                       checking. A backwards-incompatible change, since the
- *                       sizeof cpu_id_t is now different.
- * * 0.2.1 (2012-05-26): Support for Ivy Bridge, and detecting the presence of
- *                       the RdRand instruction.
- * * 0.2.2 (2015-11-04): Support for newer processors up to Haswell and Vishera.
- *                       Fix clock detection in cpu_clock_by_ic() for Bulldozer.
- *                       More entries supported in cpu_msrinfo().
- *                       *BSD and Solaris support (unofficial).
- * * 0.3.0 (2016-07-09): Support for Skylake; MSR ops in FreeBSD; INFO_VOLTAGE
- *                       for AMD CPUs. Level 4 cache support for Crystalwell
- *                       (a backwards-incompatible change since the sizeof
- *                        cpu_raw_data_t is now different).
- * * 0.4.0 (2016-09-30): Better detection of AMD clock multiplier with msrinfo.
- *                       Support for Intel SGX detection
- *                       (a backwards-incompatible change since the sizeof
- *                        cpu_raw_data_t and cpu_id_t is now different).
- */
-
-/** @mainpage A simple libcpuid introduction
- * 
- * LibCPUID provides CPU identification and access to the CPUID and RDTSC
- * instructions on the x86.
- * <p>
- * To execute CPUID, use \ref cpu_exec_cpuid <br>
- * To execute RDTSC, use \ref cpu_rdtsc <br>
- * To fetch the CPUID info needed for CPU identification, use
- *   \ref cpuid_get_raw_data <br>
- * To make sense of that data (decode, extract features), use \ref cpu_identify <br>
- * To detect the CPU speed, use either \ref cpu_clock, \ref cpu_clock_by_os,
- * \ref cpu_tsc_mark + \ref cpu_tsc_unmark + \ref cpu_clock_by_mark,
- * \ref cpu_clock_measure or \ref cpu_clock_by_ic.
- * Read carefully for pros/cons of each method. <br>
- * 
- * To read MSRs, use \ref cpu_msr_driver_open to get a handle, and then
- * \ref cpu_rdmsr for querying abilities. Some MSR decoding is available on recent
- * CPUs, and can be queried through \ref cpu_msrinfo; the various types of queries
- * are described in \ref cpu_msrinfo_request_t.
- * </p>
- */
-
-/** @defgroup libcpuid LibCPUID
- * @brief LibCPUID provides CPU identification
- @{ */
-
-/* Include some integer type specifications: */
-#include "libcpuid_types.h"
-
-/* Some limits and other constants */
-#include "libcpuid_constants.h"
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-/**
- * @brief CPU vendor, as guessed from the Vendor String.
- */
-typedef enum {
-	VENDOR_INTEL = 0,  /*!< Intel CPU */
-	VENDOR_AMD,        /*!< AMD CPU */
-	VENDOR_CYRIX,      /*!< Cyrix CPU */
-	VENDOR_NEXGEN,     /*!< NexGen CPU */
-	VENDOR_TRANSMETA,  /*!< Transmeta CPU */
-	VENDOR_UMC,        /*!< x86 CPU by UMC */
-	VENDOR_CENTAUR,    /*!< x86 CPU by IDT */
-	VENDOR_RISE,       /*!< x86 CPU by Rise Technology */
-	VENDOR_SIS,        /*!< x86 CPU by SiS */
-	VENDOR_NSC,        /*!< x86 CPU by National Semiconductor */
-	
-	NUM_CPU_VENDORS,   /*!< Valid CPU vendor ids: 0..NUM_CPU_VENDORS - 1 */
-	VENDOR_UNKNOWN = -1,
-} cpu_vendor_t;
-#define NUM_CPU_VENDORS NUM_CPU_VENDORS
-
-/**
- * @brief Contains just the raw CPUID data.
- *
- * This contains only the most basic CPU data, required to do identification
- * and feature recognition. Every processor should be identifiable using this
- * data only.
- */
-struct cpu_raw_data_t {
-	/** contains results of CPUID for eax = 0, 1, ...*/
-	uint32_t basic_cpuid[MAX_CPUID_LEVEL][4];
-
-	/** contains results of CPUID for eax = 0x80000000, 0x80000001, ...*/
-	uint32_t ext_cpuid[MAX_EXT_CPUID_LEVEL][4];
-	
-	/** when the CPU is intel and it supports deterministic cache
-	    information: this contains the results of CPUID for eax = 4
-	    and ecx = 0, 1, ... */
-	uint32_t intel_fn4[MAX_INTELFN4_LEVEL][4];
-	
-	/** when the CPU is intel and it supports leaf 0Bh (Extended Topology
-	    enumeration leaf), this stores the result of CPUID with 
-	    eax = 11 and ecx = 0, 1, 2... */
-	uint32_t intel_fn11[MAX_INTELFN11_LEVEL][4];
-	
-	/** when the CPU is intel and supports leaf 12h (SGX enumeration leaf),
-	 *  this stores the result of CPUID with eax = 0x12 and
-	 *  ecx = 0, 1, 2... */
-	uint32_t intel_fn12h[MAX_INTELFN12H_LEVEL][4];
-
-	/** when the CPU is intel and supports leaf 14h (Intel Processor Trace
-	 *  capabilities leaf),
-	 *  this stores the result of CPUID with eax = 0x14 and
-	 *  ecx = 0, 1, 2... */
-	uint32_t intel_fn14h[MAX_INTELFN14H_LEVEL][4];
-};
-
-/**
- * @brief This contains information about SGX features of the processor
- * Example usage:
- * @code
- * ...
- * struct cpu_raw_data_t raw;
- * struct cpu_id_t id;
- * 
- * if (cpuid_get_raw_data(&raw) == 0 && cpu_identify(&raw, &id) == 0 && id.sgx.present) {
- *   printf("SGX is present.\n");
- *   printf("SGX1 instructions: %s.\n", id.sgx.flags[INTEL_SGX1] ? "present" : "absent");
- *   printf("SGX2 instructions: %s.\n", id.sgx.flags[INTEL_SGX2] ? "present" : "absent");
- *   printf("Max 32-bit enclave size: 2^%d bytes.\n", id.sgx.max_enclave_32bit);
- *   printf("Max 64-bit enclave size: 2^%d bytes.\n", id.sgx.max_enclave_64bit);
- *   for (int i = 0; i < id.sgx.num_epc_sections; i++) {
- *     struct cpu_epc_t epc = cpuid_get_epc(i, NULL);
- *     printf("EPC section #%d: address = %llx, size = %llu bytes.\n", i, (unsigned long long) epc.start_addr, (unsigned long long) epc.length);
- *   }
- * } else {
- *   printf("SGX is not present.\n");
- * }
- * @endcode
- */ 
-struct cpu_sgx_t {
-	/** Whether SGX is present (boolean) */
-	uint32_t present;
-	
-	/** Max enclave size in 32-bit mode. This is a power-of-two value:
-	 *  if it is "31", then the max enclave size is 2^31 bytes (2 GiB).
-	 */
-	uint8_t max_enclave_32bit;
-	
-	/** Max enclave size in 64-bit mode. This is a power-of-two value:
-	 *  if it is "36", then the max enclave size is 2^36 bytes (64 GiB).
-	 */
-	uint8_t max_enclave_64bit;
-	
-	/**
-	 * contains SGX feature flags. See the \ref cpu_sgx_feature_t
-	 * "INTEL_SGX*" macros below.
-	 */
-	uint8_t flags[SGX_FLAGS_MAX];
-	
-	/** number of Enclave Page Cache (EPC) sections. Info for each
-	 *  section is available through the \ref cpuid_get_epc() function
-	 */
-	int num_epc_sections;
-	
-	/** bit vector of the supported extended features that can be written
-	 *  to the MISC region of the SSA (Save State Area)
-	 */ 
-	uint32_t misc_select;
-	
-	/** a bit vector of the attributes that can be set to SECS.ATTRIBUTES
-	 *  via ECREATE. Corresponds to bits 0-63 (incl.) of SECS.ATTRIBUTES.
-	 */ 
-	uint64_t secs_attributes;
-	
-	/** a bit vector of the bits that can be set in the XSAVE feature
-	 *  request mask; Corresponds to bits 64-127 of SECS.ATTRIBUTES.
-	 */
-	uint64_t secs_xfrm;
-};
-
-/**
- * @brief This contains the recognized CPU features/info
- */
-struct cpu_id_t {
-	/** contains the CPU vendor string, e.g. "GenuineIntel" */
-	char vendor_str[VENDOR_STR_MAX];
-	
-	/** contains the brand string, e.g. "Intel(R) Xeon(TM) CPU 2.40GHz" */
-	char brand_str[BRAND_STR_MAX];
-	
-	/** contains the recognized CPU vendor */
-	cpu_vendor_t vendor;
-	
-	/**
-	 * contains CPU flags. Used to test for features. See
-	 * the \ref cpu_feature_t "CPU_FEATURE_*" macros below.
-	 * @see Features
-	 */
-	uint8_t flags[CPU_FLAGS_MAX];
-	
-	/** CPU family */
-	int32_t family;
-	
-	/** CPU model */
-	int32_t model;
-	
-	/** CPU stepping */
-	int32_t stepping;
-	
-	/** CPU extended family */
-	int32_t ext_family;
-	
-	/** CPU extended model */
-	int32_t ext_model;
-	
-	/** Number of CPU cores on the current processor */
-	int32_t num_cores;
-	
-	/**
-	 * Number of logical processors on the current processor.
-	 * Could be more than the number of physical cores,
-	 * e.g. when the processor has HyperThreading.
-	 */
-	int32_t num_logical_cpus;
-	
-	/**
-	 * The total number of logical processors.
-	 * The same value is available through \ref cpuid_get_total_cpus.
-	 *
-	 * This is num_logical_cpus * {total physical processors in the system}
-	 * (but only on a real system, under a VM this number may be lower).
-	 *
-	 * If you're writing a multithreaded program and you want to run it on
-	 * all CPUs, this is the number of threads you need.
-	 *
-	 * @note in a VM, this will exactly match the number of CPUs set in
-	 *       the VM's configuration.
-	 *
-	 */
-	int32_t total_logical_cpus;
-	
-	/**
-	 * L1 data cache size in KB. Could be zero, if the CPU lacks cache.
-	 * If the size cannot be determined, it will be -1.
-	 */
-	int32_t l1_data_cache;
-	
-	/**
-	 * L1 instruction cache size in KB. Could be zero, if the CPU lacks
-	 * cache. If the size cannot be determined, it will be -1.
-	 * @note On some Intel CPUs, whose instruction cache is in fact
-	 * a trace cache, the size will be expressed in K uOps.
-	 */
-	int32_t l1_instruction_cache;
-	
-	/**
-	 * L2 cache size in KB. Could be zero, if the CPU lacks L2 cache.
-	 * If the size of the cache could not be determined, it will be -1
-	 */
-	int32_t l2_cache;
-	
-	/** L3 cache size in KB. Zero on most systems */
-	int32_t l3_cache;
-
-	/** L4 cache size in KB. Zero on most systems */
-	int32_t l4_cache;
-	
-	/** Cache associativity for the L1 data cache. -1 if undetermined */
-	int32_t l1_assoc;
-	
-	/** Cache associativity for the L2 cache. -1 if undetermined */
-	int32_t l2_assoc;
-	
-	/** Cache associativity for the L3 cache. -1 if undetermined */
-	int32_t l3_assoc;
-
-	/** Cache associativity for the L4 cache. -1 if undetermined */
-	int32_t l4_assoc;
-	
-	/** Cache-line size for L1 data cache. -1 if undetermined */
-	int32_t l1_cacheline;
-	
-	/** Cache-line size for L2 cache. -1 if undetermined */
-	int32_t l2_cacheline;
-	
-	/** Cache-line size for L3 cache. -1 if undetermined */
-	int32_t l3_cacheline;
-	
-	/** Cache-line size for L4 cache. -1 if undetermined */
-	int32_t l4_cacheline;
-
-	/**
-	 * The brief and human-friendly CPU codename, which was recognized.<br>
-	 * Examples:
-	 * @code
-	 * +--------+--------+-------+-------+-------+---------------------------------------+-----------------------+
-	 * | Vendor | Family | Model | Step. | Cache |       Brand String                    | cpu_id_t.cpu_codename |
-	 * +--------+--------+-------+-------+-------+---------------------------------------+-----------------------+
-	 * | AMD    |      6 |     8 |     0 |   256 | (not available - will be ignored)     | "K6-2"                |
-	 * | Intel  |     15 |     2 |     5 |   512 | "Intel(R) Xeon(TM) CPU 2.40GHz"       | "Xeon (Prestonia)"    |
-	 * | Intel  |      6 |    15 |    11 |  4096 | "Intel(R) Core(TM)2 Duo CPU E6550..." | "Conroe (Core 2 Duo)" |
-	 * | AMD    |     15 |    35 |     2 |  1024 | "Dual Core AMD Opteron(tm) Proces..." | "Opteron (Dual Core)" |
-	 * +--------+--------+-------+-------+-------+---------------------------------------+-----------------------+
-	 * @endcode
-	 */
-	char cpu_codename[64];
-	
-	/** SSE execution unit size (64 or 128; -1 if N/A) */
-	int32_t sse_size;
-	
-	/**
-	 * contains miscellaneous detection information. Used to test for specifics of
-	 * certain detected features. See \ref cpu_hint_t "CPU_HINT_*" macros below.
-	 * @see Hints
-	 */
-	uint8_t detection_hints[CPU_HINTS_MAX];
-	
-	/** contains information about SGX features of the processor, if present */
-	struct cpu_sgx_t sgx;
-};
-
-/**
- * @brief CPU feature identifiers
- *
- * Usage:
- * @code
- * ...
- * struct cpu_raw_data_t raw;
- * struct cpu_id_t id;
- * if (cpuid_get_raw_data(&raw) == 0 && cpu_identify(&raw, &id) == 0) {
- *     if (id.flags[CPU_FEATURE_SSE2]) {
- *         // The CPU has SSE2...
- *         ...
- *     } else {
- *         // no SSE2
- *     }
- * } else {
- *   // processor cannot be determined.
- * }
- * @endcode
- */
-typedef enum {
-	CPU_FEATURE_FPU = 0,	/*!< Floating point unit */
-	CPU_FEATURE_VME,	/*!< Virtual mode extension */
-	CPU_FEATURE_DE,		/*!< Debugging extension */
-	CPU_FEATURE_PSE,	/*!< Page size extension */
-	CPU_FEATURE_TSC,	/*!< Time-stamp counter */
-	CPU_FEATURE_MSR,	/*!< Model-specific registers, RDMSR/WRMSR supported */
-	CPU_FEATURE_PAE,	/*!< Physical address extension */
-	CPU_FEATURE_MCE,	/*!< Machine check exception */
-	CPU_FEATURE_CX8,	/*!< CMPXCHG8B instruction supported */
-	CPU_FEATURE_APIC,	/*!< APIC support */
-	CPU_FEATURE_MTRR,	/*!< Memory type range registers */
-	CPU_FEATURE_SEP,	/*!< SYSENTER / SYSEXIT instructions supported */
-	CPU_FEATURE_PGE,	/*!< Page global enable */
-	CPU_FEATURE_MCA,	/*!< Machine check architecture */
-	CPU_FEATURE_CMOV,	/*!< CMOVxx instructions supported */
-	CPU_FEATURE_PAT,	/*!< Page attribute table */
-	CPU_FEATURE_PSE36,	/*!< 36-bit page address extension */
-	CPU_FEATURE_PN,		/*!< Processor serial # implemented (Intel P3 only) */
-	CPU_FEATURE_CLFLUSH,	/*!< CLFLUSH instruction supported */
-	CPU_FEATURE_DTS,	/*!< Debug store supported */
-	CPU_FEATURE_ACPI,	/*!< ACPI support (power states) */
-	CPU_FEATURE_MMX,	/*!< MMX instruction set supported */
-	CPU_FEATURE_FXSR,	/*!< FXSAVE / FXRSTOR supported */
-	CPU_FEATURE_SSE,	/*!< Streaming-SIMD Extensions (SSE) supported */
-	CPU_FEATURE_SSE2,	/*!< SSE2 instructions supported */
-	CPU_FEATURE_SS,		/*!< Self-snoop */
-	CPU_FEATURE_HT,		/*!< Hyper-threading supported (but might be disabled) */
-	CPU_FEATURE_TM,		/*!< Thermal monitor */
-	CPU_FEATURE_IA64,	/*!< IA64 supported (Itanium only) */
-	CPU_FEATURE_PBE,	/*!< Pending-break enable */
-	CPU_FEATURE_PNI,	/*!< PNI (SSE3) instructions supported */
-	CPU_FEATURE_PCLMUL,	/*!< PCLMULQDQ instruction supported */
-	CPU_FEATURE_DTS64,	/*!< 64-bit Debug store supported */
-	CPU_FEATURE_MONITOR,	/*!< MONITOR / MWAIT supported */
-	CPU_FEATURE_DS_CPL,	/*!< CPL Qualified Debug Store */
-	CPU_FEATURE_VMX,	/*!< Virtualization technology supported */
-	CPU_FEATURE_SMX,	/*!< Safer mode extensions */
-	CPU_FEATURE_EST,	/*!< Enhanced SpeedStep */
-	CPU_FEATURE_TM2,	/*!< Thermal monitor 2 */
-	CPU_FEATURE_SSSE3,	/*!< SSSE3 instructions supported (this is different from SSE3!) */
-	CPU_FEATURE_CID,	/*!< Context ID supported */
-	CPU_FEATURE_CX16,	/*!< CMPXCHG16B instruction supported */
-	CPU_FEATURE_XTPR,	/*!< Send Task Priority Messages disable */
-	CPU_FEATURE_PDCM,	/*!< Performance capabilities MSR supported */
-	CPU_FEATURE_DCA,	/*!< Direct cache access supported */
-	CPU_FEATURE_SSE4_1,	/*!< SSE 4.1 instructions supported */
-	CPU_FEATURE_SSE4_2,	/*!< SSE 4.2 instructions supported */
-	CPU_FEATURE_SYSCALL,	/*!< SYSCALL / SYSRET instructions supported */
-	CPU_FEATURE_XD,		/*!< Execute disable bit supported */
-	CPU_FEATURE_MOVBE,	/*!< MOVBE instruction supported */
-	CPU_FEATURE_POPCNT,	/*!< POPCNT instruction supported */
-	CPU_FEATURE_AES,	/*!< AES* instructions supported */
-	CPU_FEATURE_XSAVE,	/*!< XSAVE/XRSTOR/etc instructions supported */
-	CPU_FEATURE_OSXSAVE,	/*!< non-privileged copy of OSXSAVE supported */
-	CPU_FEATURE_AVX,	/*!< Advanced vector extensions supported */
-	CPU_FEATURE_MMXEXT,	/*!< AMD MMX-extended instructions supported */
-	CPU_FEATURE_3DNOW,	/*!< AMD 3DNow! instructions supported */
-	CPU_FEATURE_3DNOWEXT,	/*!< AMD 3DNow! extended instructions supported */
-	CPU_FEATURE_NX,		/*!< No-execute bit supported */
-	CPU_FEATURE_FXSR_OPT,	/*!< FFXSR: FXSAVE and FXRSTOR optimizations */
-	CPU_FEATURE_RDTSCP,	/*!< RDTSCP instruction supported (AMD-only) */
-	CPU_FEATURE_LM,		/*!< Long mode (x86_64/EM64T) supported */
-	CPU_FEATURE_LAHF_LM,	/*!< LAHF/SAHF supported in 64-bit mode */
-	CPU_FEATURE_CMP_LEGACY,	/*!< core multi-processing legacy mode */
-	CPU_FEATURE_SVM,	/*!< AMD Secure virtual machine */
-	CPU_FEATURE_ABM,	/*!< LZCNT instruction support */
-	CPU_FEATURE_MISALIGNSSE,/*!< Misaligned SSE supported */
-	CPU_FEATURE_SSE4A,	/*!< SSE 4a from AMD */
-	CPU_FEATURE_3DNOWPREFETCH,	/*!< PREFETCH/PREFETCHW support */
-	CPU_FEATURE_OSVW,	/*!< OS Visible Workaround (AMD) */
-	CPU_FEATURE_IBS,	/*!< Instruction-based sampling */
-	CPU_FEATURE_SSE5,	/*!< SSE 5 instructions supported (deprecated, will never be 1) */
-	CPU_FEATURE_SKINIT,	/*!< SKINIT / STGI supported */
-	CPU_FEATURE_WDT,	/*!< Watchdog timer support */
-	CPU_FEATURE_TS,		/*!< Temperature sensor */
-	CPU_FEATURE_FID,	/*!< Frequency ID control */
-	CPU_FEATURE_VID,	/*!< Voltage ID control */
-	CPU_FEATURE_TTP,	/*!< THERMTRIP */
-	CPU_FEATURE_TM_AMD,	/*!< AMD-specified hardware thermal control */
-	CPU_FEATURE_STC,	/*!< Software thermal control */
-	CPU_FEATURE_100MHZSTEPS,/*!< 100 MHz multiplier control */
-	CPU_FEATURE_HWPSTATE,	/*!< Hardware P-state control */
-	CPU_FEATURE_CONSTANT_TSC,	/*!< TSC ticks at constant rate */
-	CPU_FEATURE_XOP,	/*!< The XOP instruction set (same as the old CPU_FEATURE_SSE5) */
-	CPU_FEATURE_FMA3,	/*!< The FMA3 instruction set */
-	CPU_FEATURE_FMA4,	/*!< The FMA4 instruction set */
-	CPU_FEATURE_TBM,	/*!< Trailing bit manipulation instruction support */
-	CPU_FEATURE_F16C,	/*!< 16-bit FP convert instruction support */
-	CPU_FEATURE_RDRAND,     /*!< RdRand instruction */
-	CPU_FEATURE_X2APIC,     /*!< x2APIC, APIC_BASE.EXTD, MSRs 0000_0800h...0000_0BFFh 64-bit ICR (+030h but not +031h), no DFR (+00Eh), SELF_IPI (+040h) also see standard level 0000_000Bh */
-	CPU_FEATURE_CPB,	/*!< Core performance boost */
-	CPU_FEATURE_APERFMPERF,	/*!< MPERF/APERF MSRs support */
-	CPU_FEATURE_PFI,	/*!< Processor Feedback Interface support */
-	CPU_FEATURE_PA,		/*!< Processor accumulator */
-	CPU_FEATURE_AVX2,	/*!< AVX2 instructions */
-	CPU_FEATURE_BMI1,	/*!< BMI1 instructions */
-	CPU_FEATURE_BMI2,	/*!< BMI2 instructions */
-	CPU_FEATURE_HLE,	/*!< Hardware Lock Elision prefixes */
-	CPU_FEATURE_RTM,	/*!< Restricted Transactional Memory instructions */
-	CPU_FEATURE_AVX512F,	/*!< AVX-512 Foundation */
-	CPU_FEATURE_AVX512DQ,	/*!< AVX-512 Double/Quad granular insns */
-	CPU_FEATURE_AVX512PF,	/*!< AVX-512 Prefetch */
-	CPU_FEATURE_AVX512ER,	/*!< AVX-512 Exponential/Reciprocal */
-	CPU_FEATURE_AVX512CD,	/*!< AVX-512 Conflict detection */
-	CPU_FEATURE_SHA_NI,	/*!< SHA-1/SHA-256 instructions */
-	CPU_FEATURE_AVX512BW,	/*!< AVX-512 Byte/Word granular insns */
-	CPU_FEATURE_AVX512VL,	/*!< AVX-512 128/256 vector length extensions */
-	CPU_FEATURE_SGX,	/*!< SGX extensions. Non-authoritative, check cpu_id_t::sgx::present to verify presence */
-	CPU_FEATURE_RDSEED,	/*!< RDSEED instruction */
-	CPU_FEATURE_ADX,	/*!< ADX extensions (arbitrary precision) */
-	/* termination: */
-	NUM_CPU_FEATURES,
-} cpu_feature_t;
-
-/**
- * @brief CPU detection hints identifiers
- *
- * Usage: similar to the flags usage
- */
-typedef enum {
-	CPU_HINT_SSE_SIZE_AUTH = 0,	/*!< SSE unit size is authoritative (not only a Family/Model guesswork, but based on an actual CPUID bit) */
-	/* termination */
-	NUM_CPU_HINTS,
-} cpu_hint_t;
-
-/**
- * @brief SGX features flags
- * \see cpu_sgx_t
- *
- * Usage:
- * @code
- * ...
- * struct cpu_raw_data_t raw;
- * struct cpu_id_t id;
- * if (cpuid_get_raw_data(&raw) == 0 && cpu_identify(&raw, &id) == 0 && id.sgx.present) {
- *     if (id.sgx.flags[INTEL_SGX1]) {
- *         // The CPU has SGX1 instructions support...
- *         ...
- *     } else {
- *         // no SGX
- *     }
- * } else {
- *   // processor cannot be determined.
- * }
- * @endcode
- */
- 
-typedef enum {
-	INTEL_SGX1,		/*!< SGX1 instructions support */
-	INTEL_SGX2,		/*!< SGX2 instructions support */
-	
-	/* termination: */
-	NUM_SGX_FEATURES,
-} cpu_sgx_feature_t;
-
-/**
- * @brief Describes common library error codes
- */
-typedef enum {
-	ERR_OK       =  0,	/*!< No error */
-	ERR_NO_CPUID = -1,	/*!< CPUID instruction is not supported */
-	ERR_NO_RDTSC = -2,	/*!< RDTSC instruction is not supported */
-	ERR_NO_MEM   = -3,	/*!< Memory allocation failed */
-	ERR_OPEN     = -4,	/*!< File open operation failed */
-	ERR_BADFMT   = -5,	/*!< Bad file format */
-	ERR_NOT_IMP  = -6,	/*!< Not implemented */
-	ERR_CPU_UNKN = -7,	/*!< Unsupported processor */
-	ERR_NO_RDMSR = -8,	/*!< RDMSR instruction is not supported */
-	ERR_NO_DRIVER= -9,	/*!< RDMSR driver error (generic) */
-	ERR_NO_PERMS = -10,	/*!< No permissions to install RDMSR driver */
-	ERR_EXTRACT  = -11,	/*!< Cannot extract RDMSR driver (read only media?) */
-	ERR_HANDLE   = -12,	/*!< Bad handle */
-	ERR_INVMSR   = -13,	/*!< Invalid MSR */
-	ERR_INVCNB   = -14,	/*!< Invalid core number */
-	ERR_HANDLE_R = -15,	/*!< Error on handle read */
-	ERR_INVRANGE = -16,	/*!< Invalid given range */
-} cpu_error_t;
-
-/**
- * @brief Internal structure, used in cpu_tsc_mark, cpu_tsc_unmark and
- *        cpu_clock_by_mark
- */
-struct cpu_mark_t {
-	uint64_t tsc;		/*!< Time-stamp from RDTSC */
-	uint64_t sys_clock;	/*!< In microsecond resolution */
-};
-
-/**
- * @brief Returns the total number of logical CPU threads (even if CPUID is not present).
- *
- * Under VM, this number (and total_logical_cpus, since they are fetched with the same code)
- * may be nonsensical, i.e. might not equal NumPhysicalCPUs*NumCoresPerCPU*HyperThreading.
- * This is because no matter how many logical threads the host machine has, you may limit them
- * in the VM to any number you like. **This** is the number returned by cpuid_get_total_cpus().
- *
- * @returns Number of logical CPU threads available. Equals the \ref cpu_id_t::total_logical_cpus.
- */
-int cpuid_get_total_cpus(void);
-
-/**
- * @brief Checks if the CPUID instruction is supported
- * @retval 1 if CPUID is present
- * @retval 0 the CPU doesn't have CPUID.
- */
-int cpuid_present(void);
-
-/**
- * @brief Executes the CPUID instruction
- * @param eax - the value of the EAX register when executing CPUID
- * @param regs - the results will be stored here. regs[0] = EAX, regs[1] = EBX, ...
- * @note CPUID will be executed with EAX set to the given value and EBX, ECX,
- *       EDX set to zero.
- */
-void cpu_exec_cpuid(uint32_t eax, uint32_t* regs);
-
-/**
- * @brief Executes the CPUID instruction with the given input registers
- * @note This is just a bit more generic version of cpu_exec_cpuid - it allows
- *       you to control all the registers.
- * @param regs - Input/output. Prior to executing CPUID, EAX, EBX, ECX and
- *               EDX will be set to regs[0], regs[1], regs[2] and regs[3].
- *               After CPUID, this array will contain the results.
- */
-void cpu_exec_cpuid_ext(uint32_t* regs);
-
-/**
- * @brief Obtains the raw CPUID data from the current CPU
- * @param data - a pointer to cpu_raw_data_t structure
- * @returns zero if successful, and some negative number on error.
- *          The error message can be obtained by calling \ref cpuid_error.
- *          @see cpu_error_t
- */
-int cpuid_get_raw_data(struct cpu_raw_data_t* data);
-
-/**
- * @brief Identifies the CPU
- * @param raw - Input - a pointer to the raw CPUID data, which is obtained
- *              either by cpuid_get_raw_data or cpuid_deserialize_raw_data.
- *              Can also be NULL, in which case the function calls
- *              cpuid_get_raw_data itself.
- * @param data - Output - the decoded CPU features/info is written here.
- * @note The function normally does not fail, even if some of the information
- *       cannot be obtained. Even when the CPU is new and thus unknown to
- *       libcpuid, some generic info, such as "AMD K9 family CPU" will be
- *       written to data.cpu_codename, and most other things, such as the
- *       CPU flags, cache sizes, etc. should be detected correctly anyway.
- *       However, the function CAN fail, if the CPU is completely alien to
- *       libcpuid.
- * @note While cpu_identify() and cpuid_get_raw_data() are fast for most
- *       purposes, running them several thousand times per second can hamper
- *       performance significantly. Specifically, avoid writing a "cpu feature
- *       checker" wrapper function, which calls cpu_identify and returns the
- *       value of some flag, if that function is going to be called frequently.
- * @returns zero if successful, and some negative number on error.
- *          The error message can be obtained by calling \ref cpuid_error.
- *          @see cpu_error_t
- */
-int cpu_identify(struct cpu_raw_data_t* raw, struct cpu_id_t* data);
-
-/**
- * @brief The return value of cpuid_get_epc().
- * @details
- * Describes an EPC (Enclave Page Cache) layout (physical address and size).
- * A CPU may have one or more EPC areas, and information about each is
- * fetched via \ref cpuid_get_epc.
- */ 
-struct cpu_epc_t {
-	uint64_t start_addr;
-	uint64_t length;
-};
-
-/**
- * @brief Fetches information about an EPC (Enclave Page Cache) area.
- * @param index - zero-based index, valid range [0..cpu_id_t.sgx.num_epc_sections)
- * @param raw   - a pointer to fetched raw CPUID data. Needed only for testing,
- *                you can safely pass NULL here (if you pass a real structure,
- *                it will be used for fetching the leaf 12h data if index < 2;
- *                otherwise the real CPUID instruction will be used).
- * @returns the requested data. If the CPU doesn't support SGX, or if
- *          index >= cpu_id_t.sgx.num_epc_sections, both fields of the returned
- *          structure will be zeros.
- */
-struct cpu_epc_t cpuid_get_epc(int index, const struct cpu_raw_data_t* raw);
-
-/**
- * @brief Returns the libcpuid version
- *
- * @returns the string representation of the libcpuid version, like "0.1.1"
- */
-const char* cpuid_lib_version(void);
-
-#ifdef __cplusplus
-} /* extern "C" */
-#endif
-
-
-/** @} */
-
-#endif /* __LIBCPUID_H__ */
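The `cpu_sgx_t` comments in the removed header describe `max_enclave_32bit` and `max_enclave_64bit` as power-of-two exponents rather than byte counts. A minimal sketch of that decoding follows; the helper name `enclave_size_bytes` is hypothetical and is not a libcpuid function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of libcpuid): converts a power-of-two
 * exponent, as stored in cpu_sgx_t.max_enclave_32bit / max_enclave_64bit,
 * into a size in bytes. */
uint64_t enclave_size_bytes(uint8_t log2_size)
{
	/* Shifting a 64-bit value by 64 or more is undefined behavior. */
	assert(log2_size < 64);
	return (uint64_t) 1 << log2_size;
}
```

With the figures quoted in the struct comments, `enclave_size_bytes(31)` yields 2147483648 (2 GiB) and `enclave_size_bytes(36)` yields 68719476736 (64 GiB).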
--- a/src/3rdparty/libcpuid/libcpuid_constants.h
+++ b/src/3rdparty/libcpuid/libcpuid_constants.h
@@ -1,47 +0,0 @@
-/*
- * Copyright 2008  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-/**
- * @File     libcpuid_constants.h
- * @Author   Veselin Georgiev
- * @Brief    Some limits and constants for libcpuid
- */
-
-#ifndef __LIBCPUID_CONSTANTS_H__
-#define __LIBCPUID_CONSTANTS_H__
-
-#define VENDOR_STR_MAX		16
-#define BRAND_STR_MAX		64
-#define CPU_FLAGS_MAX		128
-#define MAX_CPUID_LEVEL		32
-#define MAX_EXT_CPUID_LEVEL	32
-#define MAX_INTELFN4_LEVEL	8
-#define MAX_INTELFN11_LEVEL	4
-#define MAX_INTELFN12H_LEVEL	4
-#define MAX_INTELFN14H_LEVEL	4
-#define CPU_HINTS_MAX		16
-#define SGX_FLAGS_MAX		14
-
-#endif /* __LIBCPUID_CONSTANTS_H__ */
--- a/src/3rdparty/libcpuid/libcpuid_internal.h
+++ b/src/3rdparty/libcpuid/libcpuid_internal.h
@@ -1,107 +0,0 @@
-/*
- * Copyright 2016  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-#ifndef __LIBCPUID_INTERNAL_H__
-#define __LIBCPUID_INTERNAL_H__
-/*
- * This file contains internal undocumented declarations and function prototypes
- * for the workings of the internal library infrastructure.
- */
-
-enum _common_codes_t {
-	NA = 0,
-	NC, /* No code */
-};
-
-#define CODE(x) x
-#define CODE2(x, y) x = y
-enum _amd_code_t {
-	#include "amd_code_t.h"
-};
-typedef enum _amd_code_t amd_code_t;
-
-enum _intel_code_t {
-	#include "intel_code_t.h"
-};
-typedef enum _intel_code_t intel_code_t;
-#undef CODE
-#undef CODE2
-
-struct internal_id_info_t {
-	union {
-		amd_code_t   amd;
-		intel_code_t intel;
-	} code;
-	uint64_t bits;
-	int score; // detection (matchtable) score
-};
-
-#define LBIT(x) (((long long) 1) << (x))
-
-enum _common_bits_t {
-	_M_                     = LBIT(  0 ),
-	MOBILE_                 = LBIT(  1 ),
-	_MP_                    = LBIT(  2 ),
-};
-
-// additional detection bits for Intel CPUs:
-enum _intel_bits_t {
-	PENTIUM_                = LBIT( 10 ),
-	CELERON_                = LBIT( 11 ),
-	CORE_                   = LBIT( 12 ),
-	_I_                     = LBIT( 13 ),
-	_3                      = LBIT( 14 ),
-	_5                      = LBIT( 15 ),
-	_7                      = LBIT( 16 ),
-	_9                      = LBIT( 17 ),
-	XEON_                   = LBIT( 18 ),
-	ATOM_                   = LBIT( 19 ),
-};
-typedef enum _intel_bits_t intel_bits_t;
-
-enum _amd_bits_t {
-	ATHLON_      = LBIT( 10 ),
-	_XP_         = LBIT( 11 ),
-	DURON_       = LBIT( 12 ),
-	SEMPRON_     = LBIT( 13 ),
-	OPTERON_     = LBIT( 14 ),
-	TURION_      = LBIT( 15 ),
-	_LV_         = LBIT( 16 ),
-	_64_         = LBIT( 17 ),
-	_X2          = LBIT( 18 ),
-	_X3          = LBIT( 19 ),
-	_X4          = LBIT( 20 ),
-	_X6          = LBIT( 21 ),
-	_FX          = LBIT( 22 ),
-	_APU_        = LBIT( 23 ),
-};
-typedef enum _amd_bits_t amd_bits_t;
-
-
-
-int cpu_ident_internal(struct cpu_raw_data_t* raw, struct cpu_id_t* data, 
-		       struct internal_id_info_t* internal);
-
-#endif /* __LIBCPUID_INTERNAL_H__ */
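The `_intel_bits_t` and `_amd_bits_t` enums above assign one bit per brand keyword through `LBIT`, and `internal_id_info_t.bits` carries the combined mask. The matching logic itself is not part of this header, so the following is only a sketch of how such a mask could be tested. The `SKETCH_*` names and `has_all_bits` are illustrative and do not exist in libcpuid.

```c
#include <stdint.h>

/* Same idea as LBIT in libcpuid_internal.h, with the argument parenthesized. */
#define SKETCH_LBIT(x) (((uint64_t) 1) << (x))

/* Illustrative copies of three Intel detection bits (positions taken from
 * _intel_bits_t: CORE_ = 12, _I_ = 13, _7 = 16). */
#define SKETCH_CORE SKETCH_LBIT(12)
#define SKETCH_I    SKETCH_LBIT(13)
#define SKETCH_7    SKETCH_LBIT(16)

/* Returns 1 when every bit set in `required` is also set in `bits`. */
int has_all_bits(uint64_t bits, uint64_t required)
{
	return (bits & required) == required;
}
```

A mask built as `SKETCH_CORE | SKETCH_I | SKETCH_7` would satisfy a requirement of `SKETCH_CORE | SKETCH_7`, while a mask lacking `SKETCH_7` would not.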
--- a/src/3rdparty/libcpuid/libcpuid_types.h
+++ b/src/3rdparty/libcpuid/libcpuid_types.h
@@ -1,63 +0,0 @@
-/*
- * Copyright 2008  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-/**
- * @File     libcpuid_types.h
- * @Author   Veselin Georgiev
- * @Brief    Type specifications for libcpuid.
- */
-
-#ifndef __LIBCPUID_TYPES_H__
-#define __LIBCPUID_TYPES_H__
-
-#if !defined(_MSC_VER) || _MSC_VER >= 1600
-#  include <stdint.h>
-#else
-/* we have to provide our own: */
-#  if !defined(__int32_t_defined)
-typedef int int32_t;
-#  endif
-
-#  if !defined(__uint32_t_defined)
-typedef unsigned uint32_t;
-#  endif
-
-typedef signed char		int8_t;
-typedef unsigned char		uint8_t;
-typedef signed short		int16_t;
-typedef unsigned short		uint16_t;
-#if (defined _MSC_VER) && (_MSC_VER <= 1300)
-	/* MSVC 6.0: no long longs ... */
-	typedef signed __int64		int64_t;
-	typedef unsigned __int64	uint64_t;
-#else
-	/* all other sane compilers: */
-	typedef signed long long   int64_t;
-	typedef unsigned long long uint64_t;
-#endif
-
-#endif
-
-#endif /* __LIBCPUID_TYPES_H__ */
--- a/src/3rdparty/libcpuid/libcpuid_util.c
+++ b/src/3rdparty/libcpuid/libcpuid_util.c
@@ -1,93 +0,0 @@
-/*
- * Copyright 2008  Veselin Georgiev,
- * anrieffNOSPAM @ mgail_DOT.com (convert to gmail)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
- * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
- * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <stdarg.h>
-#include <string.h>
-#include <ctype.h>
-#include "libcpuid.h"
-#include "libcpuid_util.h"
-
-void match_features(const struct feature_map_t* matchtable, int count, uint32_t reg, struct cpu_id_t* data)
-{
-	int i;
-	for (i = 0; i < count; i++)
-		if (reg & (1u << matchtable[i].bit))
-			data->flags[matchtable[i].feature] = 1;
-}
-
-static int xmatch_entry(char c, const char* p)
-{
-	int i, j;
-	if (c == 0) return -1;
-	if (c == p[0]) return 1;
-	if (p[0] == '.') return 1;
-	if (p[0] == '#' && isdigit(c)) return 1;
-	if (p[0] == '[') {
-		j = 1;
-		while (p[j] && p[j] != ']') j++;
-		if (!p[j]) return -1;
-		for (i = 1; i < j; i++)
-			if (p[i] == c) return j + 1;
-	}
-	return -1;
-}
-
-int match_pattern(const char* s, const char* p)
-{
-	int i, j, dj, k, n, m;
-	n = (int) strlen(s);
-	m = (int) strlen(p);
-	for (i = 0; i < n; i++) {
-		if (xmatch_entry(s[i], p) != -1) {
-			j = 0;
-			k = 0;
-			while (j < m && ((dj = xmatch_entry(s[i + k], p + j)) != -1)) {
-				k++;
-				j += dj;
-			}
-			if (j == m) return i + 1;
-		}
-	}
-	return 0;
-}
-
-struct cpu_id_t* get_cached_cpuid(void)
-{
-	static int initialized = 0;
-	static struct cpu_id_t id;
-	if (initialized) return &id;
-	if (cpu_identify(NULL, &id))
-		memset(&id, 0, sizeof(id));
-	initialized = 1;
-	return &id;
-}
-
-int match_all(uint64_t bits, uint64_t mask)
-{
-	return (bits & mask) == mask;
-}