mirror of https://github.com/xmrig/xmrig.git synced 2026-04-17 13:02:57 -04:00

Compare commits


96 Commits

Author SHA1 Message Date
xmrig
a935641274 Merge pull request #3783 from SChernykh/dev
Fixed initial dataset prefetch for RandomX v2
2026-02-16 03:44:35 +07:00
SChernykh
976a08efb4 Fixed initial dataset prefetch for RandomX v2 2026-02-15 21:35:54 +01:00
xmrig
17aa97e543 Merge pull request #3782 from SChernykh/dev
RandomX v2: don't update dataset when switching to/from it
2026-02-15 20:54:01 +07:00
SChernykh
48b29fd68b RandomX v2: don't update dataset when switching to/from it 2026-02-13 13:34:11 +01:00
XMRig
6454a0abf2 Removed unused -P command line option. 2026-02-09 14:50:42 +07:00
xmrig
8464d474d4 Merge pull request #3778 from SChernykh/dev
ARM64 fixes
2026-02-04 14:33:37 +07:00
SChernykh
26ee1cd291 ARM64 fixes
- Added a check for free memory before enabling NUMA
- Removed duplicate AES tables from the ARM64 JIT compiler

Fixes #3729 and #3777
2026-02-04 00:07:16 +01:00
xmrig
d9b39e8c32 Merge pull request #3776 from SChernykh/dev
Sync changes with xmrig-proxy
2026-02-02 18:24:25 +07:00
SChernykh
05dd6dcc40 Sync changes with xmrig-proxy 2026-02-02 12:11:01 +01:00
xmrig
316a3673bc Merge pull request #3775 from SChernykh/dev
RandomX v2: added `commitment` field to stratum submit message
2026-02-01 19:57:41 +07:00
SChernykh
42c943c03f RandomX v2: added commitment field to stratum submit message 2026-02-01 12:32:58 +01:00
xmrig
c2c8080783 Merge pull request #3774 from SChernykh/dev
RandomX v2 (RISC-V)
2026-02-01 12:16:08 +07:00
SChernykh
d82d7f3f20 RandomX v2 (RISC-V) 2026-01-31 21:50:38 +01:00
xmrig
a189d84fcd Merge pull request #3772 from SChernykh/dev
RandomX v2 (ARM64)
2026-01-31 23:34:33 +07:00
SChernykh
cb6001945e RandomX v2 (ARM64) 2026-01-31 17:30:35 +01:00
xmrig
f16a06eb67 Merge pull request #3769 from SChernykh/dev
RandomX v2 (initial support)
2026-01-30 22:09:57 +07:00
SChernykh
9d71358f46 RandomX v2 + commitments 2026-01-30 16:07:25 +01:00
xmrig
5a80c65d31 Merge pull request #3765 from oxyzenQ/dev-typos
fix: cross typos, details below:
2026-01-22 12:57:13 +07:00
rezky_nightky
67cc6cfd1c fix: cross typos, details below:
What I did
Repository rules / exclusions
I didn’t find repo-specific spelling tooling already in place (no existing codespell config).
Given this is a C/C++ repo with vendored sources, I treated src/3rdparty/** as third-party and excluded it from typo fixing (and encoded that in the new .codespellrc).
Typos fixed (first-party only)
Docs
CHANGELOG.md: perfomance -> performance
doc/API.md: optionaly -> optionally, Offical -> Official
doc/BENCHMARK.md: parameteres -> parameters
doc/CPU.md: loosing -> losing, instuctions -> instructions
doc/CHANGELOG_OLD.md: multiple obvious text typos like Breaked -> Broken, singal -> signal, previos -> previous, secons -> seconds, automaticaly -> automatically, perfomance -> performance
Code comments / doc comments (safe text-only changes)
src/base/crypto/sha3.cpp: Inteface -> Interface (comment banner)
src/backend/opencl/cl/cn/cryptonight.cl: performe -> perform, crashs -> crashes (comments)
src/backend/opencl/cl/kawpow/kawpow.cl: regsters -> registers, intial -> initial (comments)
src/crypto/randomx/aes_hash.cpp: intial -> initial (comment)
src/crypto/randomx/intrin_portable.h: cant -> can't (comment)
src/crypto/randomx/randomx.h: intialization -> initialization (doc comment)
src/crypto/cn/c_jh.c: intital -> initial (comment)
src/crypto/cn/skein_port.h: varaiable -> variable (comment)
src/backend/opencl/cl/cn/wolf-skein.cl: Build-in -> Built-in (comment)
What I intentionally did NOT change
Anything under src/3rdparty/** (vendored).
A few remaining codespell hits are either:
Upstream/embedded sources we excluded (groestl256.cl, jh.cl contain Projet)
Potentially valid identifier/name (Carmel CPU codename)
Low-risk token in codegen comments (vor inside an instruction comment)
These are handled via ignore rules in .codespellrc instead of modifying code.

Added: .codespellrc
Created /.codespellrc with:

skip entries for vendored / embedded upstream areas:
./src/3rdparty
./src/crypto/ghostrider
./src/crypto/randomx/blake2
./src/crypto/cn/sse2neon.h
./src/backend/opencl/cl/cn/groestl256.cl
./src/backend/opencl/cl/cn/jh.cl
ignore-words-list for:
Carmel
vor
Verification
codespell . --config ./.codespellrc now exits clean (exit code 0).

Signed-off-by: rezky_nightky <with.rezky@gmail.com>
2026-01-21 22:36:59 +07:00
XMRig
db24bf5154 Revert "Merge branch 'pr3764' into dev"
This reverts commit 0d9a372e49, reversing
changes made to 1a04bf2904.
2026-01-21 21:32:51 +07:00
XMRig
0d9a372e49 Merge branch 'pr3764' into dev 2026-01-21 21:27:41 +07:00
XMRig
c1e3d386fe Merge branch 'master' of https://github.com/oxyzenQ/xmrig into pr3764 2026-01-21 21:27:11 +07:00
rezky_nightky
5ca4828255 feat: stability improvements, see details below
Key stability improvements made (deterministic + bounded)
1) Bounded memory usage in long-running stats
Fixed unbounded growth in NetworkState latency tracking:
Replaced std::vector<uint16_t> m_latency + push_back() with a fixed-size ring buffer (kLatencyWindow = 1024) and explicit counters.
Median latency computation now operates on at most 1024 samples, preventing memory growth and avoiding performance cliffs from ever-growing copies/sorts.
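A minimal sketch of the bounded-window idea in this item (kLatencyWindow comes from the commit message; the class and member names here are illustrative, not the actual NetworkState code):

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

class LatencyWindow {
public:
    static constexpr size_t kLatencyWindow = 1024;

    void add(uint16_t ms)
    {
        m_samples[m_pos] = ms;
        m_pos = (m_pos + 1) % kLatencyWindow;   // overwrite the oldest sample
        if (m_count < kLatencyWindow) {
            ++m_count;                          // grows only up to the window size
        }
    }

    // Median over at most kLatencyWindow samples: a bounded copy + selection,
    // so the cost no longer grows with connection lifetime.
    uint16_t median() const
    {
        if (m_count == 0) {
            return 0;
        }
        std::vector<uint16_t> tmp(m_samples.begin(), m_samples.begin() + m_count);
        std::nth_element(tmp.begin(), tmp.begin() + tmp.size() / 2, tmp.end());
        return tmp[tmp.size() / 2];
    }

private:
    std::array<uint16_t, kLatencyWindow> m_samples {};
    size_t m_pos   = 0;
    size_t m_count = 0;
};
```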
2) Prevent crash/UAF on shutdown + more predictable teardown
Controller shutdown ordering (Controller::stop()):
Now stops m_miner before destroying m_network.
This reduces chances of worker threads submitting results into a network listener that’s already destroyed.
Thread teardown hardening (backend/common/Thread.h):
Destructor now checks std::thread::joinable() before join().
Avoids std::terminate() if a thread object exists but never started due to early exit/error paths.
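A short sketch of the joinable() check, assuming a wrapper class that owns a std::thread (names are illustrative):

```cpp
#include <thread>

class Thread {
public:
    ~Thread()
    {
        // Joining a thread that was never started (or was already joined)
        // would call std::terminate(); checking joinable() first makes
        // early-exit/error paths safe.
        if (m_thread.joinable()) {
            m_thread.join();
        }
    }

private:
    std::thread m_thread;
};
```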
3) Fixed real leaks (including executable memory)
Executable memory leak fixed (crypto/cn/CnCtx.cpp):
CnCtx::create() allocates executable memory for generated_code via VirtualMemory::allocateExecutableMemory(0x4000, ...).
Previously CnCtx::release() only _mm_free()’d the struct, leaking the executable mapping.
Now CnCtx::release() frees generated_code before freeing the ctx.
GPU verification leak fixed (net/JobResults.cpp):
In getResults() (GPU result verification), a cryptonight_ctx was created via CnCtx::create() but never released.
Added CnCtx::release(ctx, 1).
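The pattern behind the executable-memory fix, as a hypothetical sketch (the struct layout, the freeExecutable() helper, and plain free() stand in for the real xmrig APIs):

```cpp
#include <cstddef>
#include <cstdlib>

struct cryptonight_ctx {
    void *generated_code;   // executable mapping allocated at create() time
    // ... other per-context fields ...
};

static void freeExecutable(void *p, size_t /*size*/)
{
    // Stub for illustration; the real code returns the executable mapping
    // to the OS instead of leaking it.
    (void)p;
}

static void release(cryptonight_ctx **ctx, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (ctx[i] == nullptr) {
            continue;
        }
        freeExecutable(ctx[i]->generated_code, 0x4000);  // previously skipped -> leak
        free(ctx[i]);                                    // then free the struct itself
    }
}
```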
4) JobResults: bounded queues + backpressure + safe shutdown semantics
The old JobResults could:

enqueue unlimited std::list items (m_results, m_bundles) → unbounded RAM,
call uv_queue_work per async batch → unbounded libuv threadpool backlog,
delete handler directly while worker threads might still submit → potential crash/UAF.
Changes made:

Hard queue limits:
kMaxQueuedResults = 4096
kMaxQueuedBundles = 256
Excess is dropped (bounded behavior under load).
Async coalescing:
Only one pending async notification at a time (m_pendingAsync), reducing eventfd/uv wake storms.
Bounded libuv work scheduling:
Only one uv_queue_work is scheduled at a time (m_workScheduled), preventing CPU starvation and unpredictable backlog.
Safe shutdown:
JobResults::stop() now detaches global handler first, then calls handler->stop().
Shutdown detaches m_listener, clears queues, and defers deletion until in-flight work is done.
Defensive bound on GPU result count:
Clamp count to 0xFF inside JobResults as well, not just in the caller, to guard against corrupted kernels/drivers.
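A minimal, generic sketch of the bounded-queue plus coalesced-wakeup idea (std::mutex/std::deque here; the real code is built on libuv, and all names are illustrative):

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(size_t limit) : m_limit(limit) {}

    // Returns true if the caller should raise a wakeup (none pending yet).
    bool push(T value)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_items.size() >= m_limit) {
            return false;                  // bounded behavior under load: drop excess
        }
        m_items.push_back(std::move(value));
        if (m_pendingNotify) {
            return false;                  // coalesce: at most one pending notification
        }
        m_pendingNotify = true;
        return true;
    }

    std::deque<T> drain()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pendingNotify = false;           // the next push may raise a new notification
        std::deque<T> out;
        out.swap(m_items);
        return out;
    }

private:
    const size_t m_limit;
    std::mutex m_mutex;
    std::deque<T> m_items;
    bool m_pendingNotify = false;
};
```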
5) Idempotent cleanup
VirtualMemory::destroy() now sets pool = nullptr after delete:
prevents accidental double-delete on repeated teardown paths.
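The idempotent-teardown pattern in one line, sketched with illustrative names:

```cpp
struct MemoryPool { /* ... */ };

static MemoryPool *pool = nullptr;

static void destroy()
{
    delete pool;      // delete on nullptr is a no-op, so a second call is harmless...
    pool = nullptr;   // ...once the pointer is cleared after the first teardown
}
```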
Verification performed
codespell . --config ./.codespellrc: clean
CMake configure + build completed successfully (Release build)

Signed-off-by: rezky_nightky <with.rezky@gmail.com>
2026-01-21 21:22:43 +07:00
XMRig
1a04bf2904 Merge branch 'pr3762' into dev 2026-01-21 21:22:34 +07:00
XMRig
5feb764b27 Merge branch 'fix-keepalive-timer' of https://github.com/HashVault/vltrig into pr3762 2026-01-21 21:21:48 +07:00
rezky_nightky
cb7511507f fix: cross typos, details below:
What I did
Repository rules / exclusions
I didn’t find repo-specific spelling tooling already in place (no existing codespell config).
Given this is a C/C++ repo with vendored sources, I treated src/3rdparty/** as third-party and excluded it from typo fixing (and encoded that in the new .codespellrc).
Typos fixed (first-party only)
Docs
CHANGELOG.md: perfomance -> performance
doc/API.md: optionaly -> optionally, Offical -> Official
doc/BENCHMARK.md: parameteres -> parameters
doc/CPU.md: loosing -> losing, instuctions -> instructions
doc/CHANGELOG_OLD.md: multiple obvious text typos like Breaked -> Broken, singal -> signal, previos -> previous, secons -> seconds, automaticaly -> automatically, perfomance -> performance
Code comments / doc comments (safe text-only changes)
src/base/crypto/sha3.cpp: Inteface -> Interface (comment banner)
src/backend/opencl/cl/cn/cryptonight.cl: performe -> perform, crashs -> crashes (comments)
src/backend/opencl/cl/kawpow/kawpow.cl: regsters -> registers, intial -> initial (comments)
src/crypto/randomx/aes_hash.cpp: intial -> initial (comment)
src/crypto/randomx/intrin_portable.h: cant -> can't (comment)
src/crypto/randomx/randomx.h: intialization -> initialization (doc comment)
src/crypto/cn/c_jh.c: intital -> initial (comment)
src/crypto/cn/skein_port.h: varaiable -> variable (comment)
src/backend/opencl/cl/cn/wolf-skein.cl: Build-in -> Built-in (comment)
What I intentionally did NOT change
Anything under src/3rdparty/** (vendored).
A few remaining codespell hits are either:
Upstream/embedded sources we excluded (groestl256.cl, jh.cl contain Projet)
Potentially valid identifier/name (Carmel CPU codename)
Low-risk token in codegen comments (vor inside an instruction comment)
These are handled via ignore rules in .codespellrc instead of modifying code.

Added: .codespellrc
Created /.codespellrc with:

skip entries for vendored / embedded upstream areas:
./src/3rdparty
./src/crypto/ghostrider
./src/crypto/randomx/blake2
./src/crypto/cn/sse2neon.h
./src/backend/opencl/cl/cn/groestl256.cl
./src/backend/opencl/cl/cn/jh.cl
ignore-words-list for:
Carmel
vor
Verification
codespell . --config ./.codespellrc now exits clean (exit code 0).

Signed-off-by: rezky_nightky <with.rezky@gmail.com>
2026-01-21 20:14:59 +07:00
HashVault
6e6eab1763 Fix keepalive timer logic
- Reset timer on send instead of receive (pool needs to know we're alive)
- Remove timer disable after first ping to enable continuous keepalives
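A conceptual sketch of both changes (re-arm on send, never disabled after the first ping); the class, the 60-second interval, and all names are illustrative, not the actual timer code:

```cpp
#include <chrono>

class KeepAlive {
public:
    using Clock = std::chrono::steady_clock;

    void onSend()                                // called for every message sent to the pool
    {
        m_deadline = Clock::now() + kInterval;   // re-arm on send, not on receive
    }

    bool shouldPing(Clock::time_point now) const
    {
        return now >= m_deadline;                // stays armed after the first ping
    }

private:
    static constexpr std::chrono::seconds kInterval { 60 };
    Clock::time_point m_deadline = Clock::now() + kInterval;
};
```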
2026-01-20 14:39:06 +03:00
xmrig
f35f9d7241 Merge pull request #3759 from SChernykh/dev
Optimized VAES code
2026-01-17 21:55:01 +07:00
SChernykh
45d0a15c98 Optimized VAES code
Use only 1 mask instead of 2
2026-01-16 20:43:35 +01:00
xmrig
f4845cbd68 Merge pull request #3758 from SChernykh/dev
RandomX: added VAES-512 support for Zen5
2026-01-16 19:07:09 +07:00
SChernykh
ed80a8a828 RandomX: added VAES-512 support for Zen5
+0.1-0.2% hashrate improvement.
2026-01-16 13:04:40 +01:00
xmrig
9e5492eecc Merge pull request #3757 from SChernykh/dev
Improved RISC-V code
2026-01-15 19:51:57 +07:00
SChernykh
e41b28ef78 Improved RISC-V code 2026-01-15 12:48:55 +01:00
xmrig
1bd59129c4 Merge pull request #3750 from SChernykh/dev
RISC-V: use vector hardware AES instead of scalar
2026-01-01 15:43:36 +07:00
SChernykh
8ccf7de304 RISC-V: use vector hardware AES instead of scalar 2025-12-31 23:37:55 +01:00
xmrig
30ffb9cb27 Merge pull request #3749 from SChernykh/dev
RISC-V: detect and use hardware AES
2025-12-30 14:13:44 +07:00
SChernykh
d3a84c4b52 RISC-V: detect and use hardware AES 2025-12-29 22:10:07 +01:00
xmrig
eb49237aaa Merge pull request #3748 from SChernykh/dev
RISC-V: auto-detect and use vector code for all RandomX AES functions
2025-12-28 13:12:50 +07:00
SChernykh
e1efd3dc7f RISC-V: auto-detect and use vector code for all RandomX AES functions 2025-12-27 21:30:14 +01:00
xmrig
e3d0135708 Merge pull request #3746 from SChernykh/dev
RISC-V: vectorized RandomX main loop
2025-12-27 18:40:47 +07:00
SChernykh
f661e1eb30 RISC-V: vectorized RandomX main loop 2025-12-26 22:11:39 +01:00
XMRig
99488751f1 v6.25.1-dev 2025-12-23 20:53:43 +07:00
XMRig
5fb0321c84 Merge branch 'master' into dev 2025-12-23 20:53:11 +07:00
XMRig
753859caea v6.25.0 2025-12-23 19:44:52 +07:00
XMRig
712a5a5e66 Merge branch 'dev' 2025-12-23 19:44:21 +07:00
XMRig
290a0de6e5 v6.25.0-dev 2025-12-23 19:37:24 +07:00
xmrig
e0564b5fdd Merge pull request #3743 from SChernykh/dev
Linux: added support for transparent huge pages
2025-12-12 01:20:03 +07:00
SChernykh
482a1f0b40 Linux: added support for transparent huge pages 2025-12-11 11:23:18 +01:00
xmrig
856813c1ae Merge pull request #3740 from SChernykh/dev
RISC-V: added vectorized soft AES
2025-12-06 19:39:47 +07:00
SChernykh
23da1a90f5 RISC-V: added vectorized soft AES 2025-12-05 21:09:22 +01:00
xmrig
7981e4a76a Merge pull request #3736 from SChernykh/dev
RISC-V: added vectorized dataset init
2025-12-01 10:46:03 +07:00
SChernykh
7ef5142a52 RISC-V: added vectorized dataset init (activated by setting init-avx2 to 1 in config.json) 2025-11-30 19:15:15 +01:00
xmrig
db5c6d9190 Merge pull request #3733 from void-512/master
Add detection for MSVC/2026
2025-11-13 15:52:43 +07:00
Tony Wang
e88009d575 add detection for MSVC/2026 2025-11-12 17:32:57 -05:00
XMRig
5115597e7f Improved compatibility for automatically enabling huge pages on Linux systems without NUMA support. 2025-11-07 01:55:00 +07:00
xmrig
4cdc35f966 Merge pull request #3731 from user0-07161/dev-haiku-os-support
feat: initial haiku os support
2025-11-05 18:47:22 +07:00
user0-07161
b02519b9f5 feat: initial support for haiku 2025-11-04 13:58:01 +00:00
XMRig
a44b21cef3 Cleanup 2025-10-27 19:18:52 +07:00
XMRig
ea832899f2 Fixed macOS build. 2025-10-23 11:17:59 +07:00
xmrig
3ecacf0ac2 Merge pull request #3725 from SChernykh/dev
RISC-V integration and JIT compiler
2025-10-23 11:02:21 +07:00
SChernykh
27c8e60919 Removed unused files 2025-10-22 23:31:02 +02:00
SChernykh
985fe06e8d RISC-V: test for instruction extensions 2025-10-22 19:21:26 +02:00
SChernykh
75b63ddde9 RISC-V JIT compiler 2025-10-22 19:00:20 +02:00
slayingripper
643b65f2c0 RISC-V Integration 2025-10-22 18:57:20 +02:00
xmrig
116ba1828f Merge pull request #3722 from SChernykh/dev
Added Zen4 (Hawk Point) CPUs detection
2025-10-15 13:23:36 +07:00
SChernykh
da5a5674b4 Added Zen4 (Hawk Point) CPUs detection 2025-10-15 08:07:58 +02:00
xmrig
6cc4819cec Merge pull request #3719 from SChernykh/dev
Fix: correct FCMP++ version number
2025-10-05 18:28:21 +07:00
SChernykh
a659397c41 Fix: correct FCMP++ version number 2025-10-05 13:24:55 +02:00
xmrig
20acfd0d79 Merge pull request #3718 from SChernykh/dev
Solo mining: added support for FCMP++ hardfork
2025-10-05 18:04:23 +07:00
SChernykh
da683d8c3e Solo mining: added support for FCMP++ hardfork 2025-10-05 13:00:21 +02:00
XMRig
255565b533 Merge branch 'xtophyr-master' into dev 2025-09-22 21:31:28 +07:00
XMRig
878e83bf59 Merge branch 'master' of https://github.com/xtophyr/xmrig into xtophyr-master 2025-09-22 21:31:14 +07:00
Christopher Wright
7abf17cb59 adjust instruction/register suffixes to compile with gcc-based assemblers. 2025-09-21 14:57:42 -04:00
Christopher Wright
eeec5ecd10 undo this change 2025-09-20 08:38:40 -04:00
Christopher Wright
93f5067999 minor Aarch64 JIT changes (better instruction selection, don't emit instructions that add 0, etc) 2025-09-20 08:32:32 -04:00
XMRig
dd6671bc59 Merge branch 'dev' of github.com:xmrig/xmrig into dev 2025-06-29 12:29:01 +07:00
XMRig
a1ee2fd9d2 Improved LibreSSL support. 2025-06-29 12:28:35 +07:00
xmrig
2619131176 Merge pull request #3680 from benthetechguy/armhf
Add armv8l to list of 32 bit ARM targets
2025-06-25 04:14:22 +07:00
Ben Westover
1161f230c5 Add armv8l to list of 32 bit ARM targets
armv8l is what CMAKE_SYSTEM_PROCESSOR is set to when an ARMv8 processor
is in 32-bit mode, so it should be added to the ARMv7 target list even
though it's v8 because it's 32 bits. Currently, it's not in any ARM
target list which means x86 is assumed and the build fails.
2025-06-24 15:28:01 -04:00
XMRig
d2363ba28b v6.24.1-dev 2025-06-23 08:37:15 +07:00
XMRig
1676da1fe9 Merge branch 'master' into dev 2025-06-23 08:36:52 +07:00
XMRig
6e4a5a6d94 v6.24.0 2025-06-23 07:44:53 +07:00
XMRig
273133aa63 Merge branch 'dev' 2025-06-23 07:44:05 +07:00
xmrig
c69e30c9a0 Update CHANGELOG.md 2025-06-23 05:39:26 +07:00
XMRig
6a690ba1e9 More DNS cleanup. 2025-06-20 23:45:53 +07:00
XMRig
545aef0937 v6.24.0-dev 2025-06-20 08:34:58 +07:00
xmrig
9fa66d3242 Merge pull request #3678 from xmrig/dns_ip_version
Improved IPv6 support.
2025-06-20 08:33:50 +07:00
XMRig
ec286c7fef Improved IPv6 support. 2025-06-20 07:39:52 +07:00
xmrig
e28d663d80 Merge pull request #3677 from SChernykh/dev
Tweaked autoconfig for AMD CPUs with < 2 MB L3 cache per thread, again (hopefully the last time)
2025-06-19 18:07:54 +07:00
SChernykh
aba1ad8cfc Tweaked autoconfig for AMD CPUs with < 2 MB L3 cache per thread, again (hopefully the last time) 2025-06-19 12:58:31 +02:00
xmrig
bf44ed52e9 Merge pull request #3674 from benthetechguy/armhf
cflags: Add lax-vector-conversions on ARMv7
2025-06-19 04:46:02 +07:00
Ben Westover
762c435fa8 cflags: Add lax-vector-conversions on ARMv7
lax-vector-conversions is enabled in the CXXFLAGS but not CFLAGS for ARMv7.
This commit adds it to CFLAGS which fixes the ARMv7 build (Fixes: #3673).
2025-06-18 16:38:05 -04:00
xmrig
48faf0a11b Merge pull request #3671 from SChernykh/dev
Hwloc: fixed detection of L2 cache size for some complex NUMA topologies
2025-06-17 18:52:43 +07:00
SChernykh
d125d22d27 Hwloc: fixed detection of L2 cache size for some complex NUMA topologies 2025-06-17 13:49:02 +02:00
XMRig
9f3591ae0d v6.23.1-dev 2025-06-16 21:29:17 +07:00
XMRig
6bbbcc71f1 Merge branch 'master' into dev 2025-06-16 21:28:48 +07:00
139 changed files with 16463 additions and 6746 deletions

.codespellrc (new file, 3 lines added)

@@ -0,0 +1,3 @@
[codespell]
skip = ./src/3rdparty,./src/crypto/ghostrider,./src/crypto/randomx/blake2,./src/crypto/cn/sse2neon.h,./src/backend/opencl/cl/cn/groestl256.cl,./src/backend/opencl/cl/cn/jh.cl
ignore-words-list = Carmel,vor

.gitignore (vendored, 2 lines added)

@@ -4,3 +4,5 @@ scripts/deps
/CMakeLists.txt.user
/.idea
/src/backend/opencl/cl/cn/cryptonight_gen.cl
.vscode
/.qtcreator


@@ -1,3 +1,23 @@
# v6.25.0
- [#3680](https://github.com/xmrig/xmrig/pull/3680) Added `armv8l` to the list of 32-bit ARM targets.
- [#3708](https://github.com/xmrig/xmrig/pull/3708) Minor Aarch64 JIT changes (better instruction selection, don't emit instructions that add 0, etc).
- [#3718](https://github.com/xmrig/xmrig/pull/3718) Solo mining: added support for FCMP++ hardfork.
- [#3722](https://github.com/xmrig/xmrig/pull/3722) Added Zen4 (Hawk Point) CPUs detection.
- [#3725](https://github.com/xmrig/xmrig/pull/3725) Added **RISC-V** support with JIT compiler.
- [#3731](https://github.com/xmrig/xmrig/pull/3731) Added initial Haiku OS support.
- [#3733](https://github.com/xmrig/xmrig/pull/3733) Added detection for MSVC/2026.
- [#3736](https://github.com/xmrig/xmrig/pull/3736) RISC-V: added vectorized dataset init.
- [#3740](https://github.com/xmrig/xmrig/pull/3740) RISC-V: added vectorized soft AES.
- [#3743](https://github.com/xmrig/xmrig/pull/3743) Linux: added support for transparent huge pages.
- Improved LibreSSL support.
- Improved compatibility for automatically enabling huge pages on Linux systems without NUMA support.
# v6.24.0
- [#3671](https://github.com/xmrig/xmrig/pull/3671) Fixed detection of L2 cache size for some complex NUMA topologies.
- [#3674](https://github.com/xmrig/xmrig/pull/3674) Fixed ARMv7 build.
- [#3677](https://github.com/xmrig/xmrig/pull/3677) Fixed auto-config for AMD CPUs with less than 2 MB L3 cache per thread.
- [#3678](https://github.com/xmrig/xmrig/pull/3678) Improved IPv6 support: the new default settings use IPv6 equally with IPv4.
# v6.23.0
- [#3668](https://github.com/xmrig/xmrig/issues/3668) Added support for Windows ARM64.
- [#3665](https://github.com/xmrig/xmrig/pull/3665) Tweaked auto-config for AMD CPUs with < 2 MB L3 cache per thread.
@@ -140,7 +160,7 @@
# v6.16.2
- [#2751](https://github.com/xmrig/xmrig/pull/2751) Fixed crash on CPUs supporting VAES and running GCC-compiled xmrig.
- [#2761](https://github.com/xmrig/xmrig/pull/2761) Fixed broken auto-tuning in GCC Windows build.
- [#2771](https://github.com/xmrig/xmrig/issues/2771) Fixed environment variables support for GhostRider and KawPow.
- [#2771](https://github.com/xmrig/xmrig/issues/2771) Fixed environment variables support for GhostRider and KawPow.
- [#2769](https://github.com/xmrig/xmrig/pull/2769) Performance fixes:
- Fixed several performance bottlenecks introduced in v6.16.1.
- Fixed overall GCC-compiled build performance, it's the same speed as MSVC build now.
@@ -448,7 +468,7 @@
- Compiler for Windows gcc builds updated to v10.1.
# v5.11.1
- [#1652](https://github.com/xmrig/xmrig/pull/1652) Up to 1% RandomX perfomance improvement on recent AMD CPUs.
- [#1652](https://github.com/xmrig/xmrig/pull/1652) Up to 1% RandomX performance improvement on recent AMD CPUs.
- [#1306](https://github.com/xmrig/xmrig/issues/1306) Fixed possible double connection to a pool.
- [#1654](https://github.com/xmrig/xmrig/issues/1654) Fixed build with LibreSSL.
@@ -554,9 +574,9 @@
- Added automatic huge pages configuration on Linux if use the miner with root privileges.
- **Added [automatic Intel prefetchers configuration](https://xmrig.com/docs/miner/randomx-optimization-guide#intel-specific-optimizations) on Linux.**
- Added new option `wrmsr` in `randomx` object with command line equivalent `--randomx-wrmsr=6`.
- [#1396](https://github.com/xmrig/xmrig/pull/1396) [#1401](https://github.com/xmrig/xmrig/pull/1401) New performance optimizations for Ryzen CPUs.
- [#1385](https://github.com/xmrig/xmrig/issues/1385) Added `max-threads-hint` option support for RandomX dataset initialization threads.
- [#1386](https://github.com/xmrig/xmrig/issues/1386) Added `priority` option support for RandomX dataset initialization threads.
- [#1396](https://github.com/xmrig/xmrig/pull/1396) [#1401](https://github.com/xmrig/xmrig/pull/1401) New performance optimizations for Ryzen CPUs.
- [#1385](https://github.com/xmrig/xmrig/issues/1385) Added `max-threads-hint` option support for RandomX dataset initialization threads.
- [#1386](https://github.com/xmrig/xmrig/issues/1386) Added `priority` option support for RandomX dataset initialization threads.
- For official builds all dependencies (libuv, hwloc, openssl) updated to recent versions.
- Windows `msvc` builds now use Visual Studio 2019 instead of 2017.
@@ -602,7 +622,7 @@ This release based on 4.x.x series and include all features from v4.6.2-beta, ch
- Removed command line option `--http-enabled`, HTTP API enabled automatically if any other `--http-*` option provided.
- [#1172](https://github.com/xmrig/xmrig/issues/1172) **Added OpenCL mining backend.**
- [#268](https://github.com/xmrig/xmrig-amd/pull/268) [#270](https://github.com/xmrig/xmrig-amd/pull/270) [#271](https://github.com/xmrig/xmrig-amd/pull/271) [#273](https://github.com/xmrig/xmrig-amd/pull/273) [#274](https://github.com/xmrig/xmrig-amd/pull/274) [#1171](https://github.com/xmrig/xmrig/pull/1171) Added RandomX support for OpenCL, thanks [@SChernykh](https://github.com/SChernykh).
- Algorithm `cn/wow` removed, as no longer alive.
- Algorithm `cn/wow` removed, as no longer alive.
# Previous versions
[doc/CHANGELOG_OLD.md](doc/CHANGELOG_OLD.md)


@@ -95,7 +95,7 @@ set(HEADERS_CRYPTO
src/crypto/common/VirtualMemory.h
)
if (XMRIG_ARM)
if (XMRIG_ARM OR XMRIG_RISCV)
set(HEADERS_CRYPTO "${HEADERS_CRYPTO}" src/crypto/cn/CryptoNight_arm.h)
else()
set(HEADERS_CRYPTO "${HEADERS_CRYPTO}" src/crypto/cn/CryptoNight_x86.h)


@@ -10,7 +10,7 @@
XMRig is a high performance, open source, cross platform RandomX, KawPow, CryptoNight and [GhostRider](https://github.com/xmrig/xmrig/tree/master/src/crypto/ghostrider#readme) unified CPU/GPU miner and [RandomX benchmark](https://xmrig.com/benchmark). Official binaries are available for Windows, Linux, macOS and FreeBSD.
## Mining backends
- **CPU** (x86/x64/ARMv7/ARMv8)
- **CPU** (x86/x64/ARMv7/ARMv8/RISC-V)
- **OpenCL** for AMD GPUs.
- **CUDA** for NVIDIA GPUs via external [CUDA plugin](https://github.com/xmrig/xmrig-cuda).


@@ -1,4 +1,4 @@
if (WITH_ASM AND NOT XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8)
if (WITH_ASM AND NOT XMRIG_ARM AND NOT XMRIG_RISCV AND CMAKE_SIZEOF_VOID_P EQUAL 8)
set(XMRIG_ASM_LIBRARY "xmrig-asm")
if (CMAKE_C_COMPILER_ID MATCHES MSVC)


@@ -21,6 +21,19 @@ if (NOT VAES_SUPPORTED)
set(WITH_VAES OFF)
endif()
# Detect RISC-V architecture early (before it's used below)
if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(riscv64|riscv|rv64)$")
set(RISCV_TARGET 64)
set(XMRIG_RISCV ON)
add_definitions(-DXMRIG_RISCV)
message(STATUS "Detected RISC-V 64-bit architecture (${CMAKE_SYSTEM_PROCESSOR})")
elseif (CMAKE_SYSTEM_PROCESSOR MATCHES "^(riscv32|rv32)$")
set(RISCV_TARGET 32)
set(XMRIG_RISCV ON)
add_definitions(-DXMRIG_RISCV)
message(STATUS "Detected RISC-V 32-bit architecture (${CMAKE_SYSTEM_PROCESSOR})")
endif()
if (XMRIG_64_BIT AND CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64)$")
add_definitions(-DRAPIDJSON_SSE2)
else()
@@ -29,6 +42,120 @@ else()
set(WITH_VAES OFF)
endif()
# Disable x86-specific features for RISC-V
if (XMRIG_RISCV)
set(WITH_SSE4_1 OFF)
set(WITH_AVX2 OFF)
set(WITH_VAES OFF)
# default build uses the RV64GC baseline
set(RVARCH "rv64gc")
enable_language(ASM)
try_run(RANDOMX_VECTOR_RUN_FAIL
RANDOMX_VECTOR_COMPILE_OK
${CMAKE_CURRENT_BINARY_DIR}/
${CMAKE_CURRENT_SOURCE_DIR}/src/crypto/randomx/tests/riscv64_vector.s
COMPILE_DEFINITIONS "-march=rv64gcv")
if (RANDOMX_VECTOR_COMPILE_OK AND NOT RANDOMX_VECTOR_RUN_FAIL)
set(RVARCH_V ON)
message(STATUS "RISC-V vector extension detected")
else()
set(RVARCH_V OFF)
endif()
try_run(RANDOMX_ZICBOP_RUN_FAIL
RANDOMX_ZICBOP_COMPILE_OK
${CMAKE_CURRENT_BINARY_DIR}/
${CMAKE_CURRENT_SOURCE_DIR}/src/crypto/randomx/tests/riscv64_zicbop.s
COMPILE_DEFINITIONS "-march=rv64gc_zicbop")
if (RANDOMX_ZICBOP_COMPILE_OK AND NOT RANDOMX_ZICBOP_RUN_FAIL)
set(RVARCH_ZICBOP ON)
message(STATUS "RISC-V zicbop extension detected")
else()
set(RVARCH_ZICBOP OFF)
endif()
try_run(RANDOMX_ZBA_RUN_FAIL
RANDOMX_ZBA_COMPILE_OK
${CMAKE_CURRENT_BINARY_DIR}/
${CMAKE_CURRENT_SOURCE_DIR}/src/crypto/randomx/tests/riscv64_zba.s
COMPILE_DEFINITIONS "-march=rv64gc_zba")
if (RANDOMX_ZBA_COMPILE_OK AND NOT RANDOMX_ZBA_RUN_FAIL)
set(RVARCH_ZBA ON)
message(STATUS "RISC-V zba extension detected")
else()
set(RVARCH_ZBA OFF)
endif()
try_run(RANDOMX_ZBB_RUN_FAIL
RANDOMX_ZBB_COMPILE_OK
${CMAKE_CURRENT_BINARY_DIR}/
${CMAKE_CURRENT_SOURCE_DIR}/src/crypto/randomx/tests/riscv64_zbb.s
COMPILE_DEFINITIONS "-march=rv64gc_zbb")
if (RANDOMX_ZBB_COMPILE_OK AND NOT RANDOMX_ZBB_RUN_FAIL)
set(RVARCH_ZBB ON)
message(STATUS "RISC-V zbb extension detected")
else()
set(RVARCH_ZBB OFF)
endif()
try_run(RANDOMX_ZVKB_RUN_FAIL
RANDOMX_ZVKB_COMPILE_OK
${CMAKE_CURRENT_BINARY_DIR}/
${CMAKE_CURRENT_SOURCE_DIR}/src/crypto/randomx/tests/riscv64_zvkb.s
COMPILE_DEFINITIONS "-march=rv64gcv_zvkb")
if (RANDOMX_ZVKB_COMPILE_OK AND NOT RANDOMX_ZVKB_RUN_FAIL)
set(RVARCH_ZVKB ON)
message(STATUS "RISC-V zvkb extension detected")
else()
set(RVARCH_ZVKB OFF)
endif()
try_run(RANDOMX_ZVKNED_RUN_FAIL
RANDOMX_ZVKNED_COMPILE_OK
${CMAKE_CURRENT_BINARY_DIR}/
${CMAKE_CURRENT_SOURCE_DIR}/src/crypto/randomx/tests/riscv64_zvkned.s
COMPILE_DEFINITIONS "-march=rv64gcv_zvkned")
if (RANDOMX_ZVKNED_COMPILE_OK AND NOT RANDOMX_ZVKNED_RUN_FAIL)
set(RVARCH_ZVKNED ON)
message(STATUS "RISC-V zvkned extension detected")
else()
set(RVARCH_ZVKNED OFF)
endif()
# for native builds, enable Zba and Zbb if supported by the CPU
if (ARCH STREQUAL "native")
if (RVARCH_V)
set(RVARCH "${RVARCH}v")
endif()
if (RVARCH_ZICBOP)
set(RVARCH "${RVARCH}_zicbop")
endif()
if (RVARCH_ZBA)
set(RVARCH "${RVARCH}_zba")
endif()
if (RVARCH_ZBB)
set(RVARCH "${RVARCH}_zbb")
endif()
if (RVARCH_ZVKB)
set(RVARCH "${RVARCH}_zvkb")
endif()
if (RVARCH_ZVKNED)
set(RVARCH "${RVARCH}_zvkned")
endif()
endif()
message(STATUS "Using -march=${RVARCH}")
endif()
add_definitions(-DRAPIDJSON_WRITE_DEFAULT_FLAGS=6) # rapidjson::kWriteNanAndInfFlag | rapidjson::kWriteNanAndInfNullFlag
if (ARM_V8)
@@ -40,7 +167,7 @@ endif()
if (NOT ARM_TARGET)
if (CMAKE_SYSTEM_PROCESSOR MATCHES "^(aarch64|arm64|ARM64|armv8-a)$")
set(ARM_TARGET 8)
elseif (CMAKE_SYSTEM_PROCESSOR MATCHES "^(armv7|armv7f|armv7s|armv7k|armv7-a|armv7l|armv7ve)$")
elseif (CMAKE_SYSTEM_PROCESSOR MATCHES "^(armv7|armv7f|armv7s|armv7k|armv7-a|armv7l|armv7ve|armv8l)$")
set(ARM_TARGET 7)
endif()
endif()


@@ -26,8 +26,13 @@ if (CMAKE_CXX_COMPILER_ID MATCHES GNU)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${ARM8_CXX_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${ARM8_CXX_FLAGS} -flax-vector-conversions")
elseif (ARM_TARGET EQUAL 7)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=armv7-a -mfpu=neon")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=armv7-a -mfpu=neon -flax-vector-conversions")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=armv7-a -mfpu=neon -flax-vector-conversions")
elseif (XMRIG_RISCV)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=${RVARCH}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=${RVARCH}")
add_definitions(-DHAVE_ROTR)
else()
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -maes")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -maes")
@@ -41,6 +46,8 @@ if (CMAKE_CXX_COMPILER_ID MATCHES GNU)
else()
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static -Wl,--large-address-aware")
endif()
elseif(CMAKE_SYSTEM_NAME STREQUAL "Haiku")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static-libgcc")
else()
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static-libgcc -static-libstdc++")
endif()
@@ -74,6 +81,11 @@ elseif (CMAKE_CXX_COMPILER_ID MATCHES Clang)
elseif (ARM_TARGET EQUAL 7)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mfpu=neon -march=${CMAKE_SYSTEM_PROCESSOR}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfpu=neon -march=${CMAKE_SYSTEM_PROCESSOR}")
elseif (XMRIG_RISCV)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -march=${RVARCH}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=${RVARCH}")
add_definitions(-DHAVE_ROTR)
else()
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -maes")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -maes")


@@ -17,6 +17,10 @@ else()
set(XMRIG_OS_LINUX ON)
elseif(CMAKE_SYSTEM_NAME STREQUAL FreeBSD OR CMAKE_SYSTEM_NAME STREQUAL DragonFly)
set(XMRIG_OS_FREEBSD ON)
elseif(CMAKE_SYSTEM_NAME STREQUAL OpenBSD)
set(XMRIG_OS_OPENBSD ON)
elseif(CMAKE_SYSTEM_NAME STREQUAL "Haiku")
set(XMRIG_OS_HAIKU ON)
endif()
endif()
@@ -43,6 +47,10 @@ elseif(XMRIG_OS_UNIX)
add_definitions(-DXMRIG_OS_LINUX)
elseif (XMRIG_OS_FREEBSD)
add_definitions(-DXMRIG_OS_FREEBSD)
elseif (XMRIG_OS_OPENBSD)
add_definitions(-DXMRIG_OS_OPENBSD)
elseif (XMRIG_OS_HAIKU)
add_definitions(-DXMRIG_OS_HAIKU)
endif()
endif()


@@ -62,7 +62,7 @@ if (WITH_RANDOMX)
src/crypto/randomx/jit_compiler_x86_static.asm
src/crypto/randomx/jit_compiler_x86.cpp
)
elseif (WITH_ASM AND NOT XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8)
elseif (WITH_ASM AND NOT XMRIG_ARM AND NOT XMRIG_RISCV AND CMAKE_SIZEOF_VOID_P EQUAL 8)
list(APPEND SOURCES_CRYPTO
src/crypto/randomx/jit_compiler_x86_static.S
src/crypto/randomx/jit_compiler_x86.cpp
@@ -80,6 +80,39 @@ if (WITH_RANDOMX)
else()
set_property(SOURCE src/crypto/randomx/jit_compiler_a64_static.S PROPERTY LANGUAGE C)
endif()
elseif (XMRIG_RISCV AND CMAKE_SIZEOF_VOID_P EQUAL 8)
list(APPEND SOURCES_CRYPTO
src/crypto/randomx/jit_compiler_rv64_static.S
src/crypto/randomx/jit_compiler_rv64_vector_static.S
src/crypto/randomx/jit_compiler_rv64.cpp
src/crypto/randomx/jit_compiler_rv64_vector.cpp
src/crypto/randomx/aes_hash_rv64_vector.cpp
src/crypto/randomx/aes_hash_rv64_zvkned.cpp
)
# cheat because cmake and ccache hate each other
set_property(SOURCE src/crypto/randomx/jit_compiler_rv64_static.S PROPERTY LANGUAGE C)
set_property(SOURCE src/crypto/randomx/jit_compiler_rv64_vector_static.S PROPERTY LANGUAGE C)
set(RV64_VECTOR_FILE_ARCH "rv64gcv")
if (ARCH STREQUAL "native")
if (RVARCH_ZICBOP)
set(RV64_VECTOR_FILE_ARCH "${RV64_VECTOR_FILE_ARCH}_zicbop")
endif()
if (RVARCH_ZBA)
set(RV64_VECTOR_FILE_ARCH "${RV64_VECTOR_FILE_ARCH}_zba")
endif()
if (RVARCH_ZBB)
set(RV64_VECTOR_FILE_ARCH "${RV64_VECTOR_FILE_ARCH}_zbb")
endif()
if (RVARCH_ZVKB)
set(RV64_VECTOR_FILE_ARCH "${RV64_VECTOR_FILE_ARCH}_zvkb")
endif()
endif()
set_source_files_properties(src/crypto/randomx/jit_compiler_rv64_vector_static.S PROPERTIES COMPILE_FLAGS "-march=${RV64_VECTOR_FILE_ARCH}_zvkned")
set_source_files_properties(src/crypto/randomx/aes_hash_rv64_vector.cpp PROPERTIES COMPILE_FLAGS "-O3 -march=${RV64_VECTOR_FILE_ARCH}")
set_source_files_properties(src/crypto/randomx/aes_hash_rv64_zvkned.cpp PROPERTIES COMPILE_FLAGS "-O3 -march=${RV64_VECTOR_FILE_ARCH}_zvkned")
else()
list(APPEND SOURCES_CRYPTO
src/crypto/randomx/jit_compiler_fallback.cpp
@@ -116,7 +149,7 @@ if (WITH_RANDOMX)
)
endif()
if (WITH_MSR AND NOT XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8 AND (XMRIG_OS_WIN OR XMRIG_OS_LINUX))
if (WITH_MSR AND NOT XMRIG_ARM AND NOT XMRIG_RISCV AND CMAKE_SIZEOF_VOID_P EQUAL 8 AND (XMRIG_OS_WIN OR XMRIG_OS_LINUX))
add_definitions(/DXMRIG_FEATURE_MSR)
add_definitions(/DXMRIG_FIX_RYZEN)
message("-- WITH_MSR=ON")
@@ -157,6 +190,15 @@ if (WITH_RANDOMX)
list(APPEND HEADERS_CRYPTO src/crypto/rx/Profiler.h)
list(APPEND SOURCES_CRYPTO src/crypto/rx/Profiler.cpp)
endif()
if (WITH_VAES)
set(SOURCES_CRYPTO "${SOURCES_CRYPTO}" src/crypto/randomx/aes_hash_vaes512.cpp)
if (CMAKE_C_COMPILER_ID MATCHES MSVC)
set_source_files_properties(src/crypto/randomx/aes_hash_vaes512.cpp PROPERTIES COMPILE_FLAGS "/arch:AVX512")
elseif (CMAKE_C_COMPILER_ID MATCHES GNU OR CMAKE_C_COMPILER_ID MATCHES Clang)
set_source_files_properties(src/crypto/randomx/aes_hash_vaes512.cpp PROPERTIES COMPILE_FLAGS "-mavx512f -mvaes")
endif()
endif()
else()
remove_definitions(/DXMRIG_ALGO_RANDOMX)
endif()


@@ -1,8 +1,8 @@
# HTTP API
If you want use HTTP API you need enable it (`"enabled": true,`) then choice `port` and optionaly `host`. API not available if miner built without HTTP support (`-DWITH_HTTP=OFF`).
If you want use HTTP API you need enable it (`"enabled": true,`) then choice `port` and optionally `host`. API not available if miner built without HTTP support (`-DWITH_HTTP=OFF`).
Offical HTTP client for API: http://workers.xmrig.info/
Official HTTP client for API: http://workers.xmrig.info/
Example configuration:


@@ -17,7 +17,7 @@ Double check that you see `Huge pages 100%` both for dataset and for all threads
### Benchmark with custom config
You can run benchmark with any configuration you want. Just start without command line parameteres, use regular config.json and add `"benchmark":"1M",` on the next line after pool url.
You can run benchmark with any configuration you want. Just start without command line parameters, use regular config.json and add `"benchmark":"1M",` on the next line after pool url.
# Stress test
@@ -26,4 +26,4 @@ You can also run continuous stress-test that is as close to the real RandomX min
xmrig --stress
xmrig --stress -a rx/wow
```
This will require Internet connection and will run indefinitely.
This will require Internet connection and will run indefinitely.


@@ -57,7 +57,7 @@
# v4.0.0-beta
- [#1172](https://github.com/xmrig/xmrig/issues/1172) **Added OpenCL mining backend.**
- [#268](https://github.com/xmrig/xmrig-amd/pull/268) [#270](https://github.com/xmrig/xmrig-amd/pull/270) [#271](https://github.com/xmrig/xmrig-amd/pull/271) [#273](https://github.com/xmrig/xmrig-amd/pull/273) [#274](https://github.com/xmrig/xmrig-amd/pull/274) [#1171](https://github.com/xmrig/xmrig/pull/1171) Added RandomX support for OpenCL, thanks [@SChernykh](https://github.com/SChernykh).
- Algorithm `cn/wow` removed, as no longer alive.
- Algorithm `cn/wow` removed, as no longer alive.
# v3.2.0
- Added per pool option `coin` with single possible value `monero` for pools without algorithm negotiation, for upcoming Monero fork.
@@ -103,7 +103,7 @@
- [#1105](https://github.com/xmrig/xmrig/issues/1105) Improved auto configuration for `cn-pico` algorithm.
- Added commands `pause` and `resume` via JSON RPC 2.0 API (`POST /json_rpc`).
- Added command line option `--export-topology` for export hwloc topology to a XML file.
- Breaked backward compatibility with previous configs and command line, `variant` option replaced to `algo`, global option `algo` removed, all CPU related settings moved to `cpu` object.
- Broken backward compatibility with previous configs and command line, `variant` option replaced to `algo`, global option `algo` removed, all CPU related settings moved to `cpu` object.
- Options `av`, `safe` and `max-cpu-usage` removed.
- Algorithm `cn/msr` renamed to `cn/fast`.
- Algorithm `cn/xtl` removed.
@@ -122,7 +122,7 @@
- [#1092](https://github.com/xmrig/xmrig/issues/1092) Fixed crash if wrong CPU affinity used.
- [#1103](https://github.com/xmrig/xmrig/issues/1103) Improved auto configuration for RandomX for CPUs where L2 cache is limiting factor.
- [#1105](https://github.com/xmrig/xmrig/issues/1105) Improved auto configuration for `cn-pico` algorithm.
- [#1106](https://github.com/xmrig/xmrig/issues/1106) Fixed `hugepages` field in summary API.
- [#1106](https://github.com/xmrig/xmrig/issues/1106) Fixed `hugepages` field in summary API.
- Added alternative short format for CPU threads.
- Changed format for CPU threads with intensity above 1.
- Name for reference RandomX configuration changed to `rx/test` to avoid potential conflicts in future.
@@ -150,7 +150,7 @@
- [#1050](https://github.com/xmrig/xmrig/pull/1050) Added RandomXL algorithm for [Loki](https://loki.network/), algorithm name used by miner is `randomx/loki` or `rx/loki`.
- Added [flexible](https://github.com/xmrig/xmrig/blob/evo/doc/CPU.md) multi algorithm configuration.
- Added unlimited switching between incompatible algorithms, all mining options can be changed in runtime.
- Breaked backward compatibility with previous configs and command line, `variant` option replaced to `algo`, global option `algo` removed, all CPU related settings moved to `cpu` object.
- Broken backward compatibility with previous configs and command line, `variant` option replaced to `algo`, global option `algo` removed, all CPU related settings moved to `cpu` object.
- Options `av`, `safe` and `max-cpu-usage` removed.
- Algorithm `cn/msr` renamed to `cn/fast`.
- Algorithm `cn/xtl` removed.
@@ -183,7 +183,7 @@
- [#314](https://github.com/xmrig/xmrig-proxy/issues/314) Added donate over proxy feature.
- Added new option `donate-over-proxy`.
- Added real graceful exit.
# v2.14.4
- [#992](https://github.com/xmrig/xmrig/pull/992) Fixed compilation with Clang 3.5.
- [#1012](https://github.com/xmrig/xmrig/pull/1012) Fixed compilation with Clang 9.0.
@@ -250,7 +250,7 @@
# v2.8.1
- [#768](https://github.com/xmrig/xmrig/issues/768) Fixed build with Visual Studio 2015.
- [#769](https://github.com/xmrig/xmrig/issues/769) Fixed regression, some ANSI escape sequences was in log with disabled colors.
- [#777](https://github.com/xmrig/xmrig/issues/777) Better report about pool connection issues.
- [#777](https://github.com/xmrig/xmrig/issues/777) Better report about pool connection issues.
- Simplified checks for ASM auto detection, only AES support necessary.
- Added missing options to `--help` output.
@@ -259,7 +259,7 @@
- Added global and per thread option `"asm"` and command line equivalent.
- **[#758](https://github.com/xmrig/xmrig/issues/758) Added SSL/TLS support for secure connections to pools.**
- Added per pool options `"tls"` and `"tls-fingerprint"` and command line equivalents.
- [#767](https://github.com/xmrig/xmrig/issues/767) Added config autosave feature, same with GPU miners.
- [#767](https://github.com/xmrig/xmrig/issues/767) Added config autosave feature, same with GPU miners.
- [#245](https://github.com/xmrig/xmrig-proxy/issues/245) Fixed API ID collision when run multiple miners on same machine.
- [#757](https://github.com/xmrig/xmrig/issues/757) Fixed send buffer overflow.
@@ -346,7 +346,7 @@
# v2.4.4
- Added libmicrohttpd version to --version output.
- Fixed bug in singal handler, in some cases miner wasn't shutdown properly.
- Fixed bug in signal handler, in some cases miner wasn't shutdown properly.
- Fixed recent MSVC 2017 version detection.
- [#279](https://github.com/xmrig/xmrig/pull/279) Fixed build on some macOS versions.
@@ -359,7 +359,7 @@
# v2.4.2
- [#60](https://github.com/xmrig/xmrig/issues/60) Added FreeBSD support, thanks [vcambur](https://github.com/vcambur).
- [#153](https://github.com/xmrig/xmrig/issues/153) Fixed issues with dwarfpool.com.
# v2.4.1
- [#147](https://github.com/xmrig/xmrig/issues/147) Fixed comparability with monero-stratum.
@@ -371,7 +371,7 @@
- [#101](https://github.com/xmrig/xmrig/issues/101) Fixed MSVC 2017 (15.3) compile time version detection.
- [#108](https://github.com/xmrig/xmrig/issues/108) Silently ignore invalid values for `donate-level` option.
- [#111](https://github.com/xmrig/xmrig/issues/111) Fixed build without AEON support.
# v2.3.1
- [#68](https://github.com/xmrig/xmrig/issues/68) Fixed compatibility with Docker containers, was nothing print on console.
@@ -398,7 +398,7 @@
# v2.1.0
- [#40](https://github.com/xmrig/xmrig/issues/40)
Improved miner shutdown, fixed crash on exit for Linux and OS X.
- Fixed, login request was contain malformed JSON if username or password has some special characters for example `\`.
- Fixed, login request was contain malformed JSON if username or password has some special characters for example `\`.
- [#220](https://github.com/fireice-uk/xmr-stak-cpu/pull/220) Better support for Round Robin DNS, IP address now always chosen randomly instead of stuck on first one.
- Changed donation address, new [xmrig-proxy](https://github.com/xmrig/xmrig-proxy) is coming soon.
@@ -418,16 +418,16 @@ Improved miner shutdown, fixed crash on exit for Linux and OS X.
- Fixed Windows XP support.
- Fixed regression, option `--no-color` was not fully disable colored output.
- Show resolved pool IP address in miner output.
# v1.0.1
- Fix broken software AES implementation, app has crashed if CPU not support AES-NI, only version 1.0.0 affected.
# v1.0.0
- Miner complete rewritten in C++ with libuv.
- This version should be fully compatible (except config file) with previos versions, many new nice features will come in next versions.
- This is still beta. If you found regression, stability or perfomance issues or have an idea for new feature please fell free to open new [issue](https://github.com/xmrig/xmrig/issues/new).
- This version should be fully compatible (except config file) with previous versions, many new nice features will come in next versions.
- This is still beta. If you found regression, stability or performance issues or have an idea for new feature please fell free to open new [issue](https://github.com/xmrig/xmrig/issues/new).
- Added new option `--print-time=N`, print hashrate report every N seconds.
- New hashrate reports, by default every 60 secons.
- New hashrate reports, by default every 60 seconds.
- Added Microsoft Visual C++ 2015 and 2017 support.
- Removed dependency on libcurl.
- To compile this version from source please switch to [dev](https://github.com/xmrig/xmrig/tree/dev) branch.
@@ -440,7 +440,7 @@ Improved miner shutdown, fixed crash on exit for Linux and OS X.
- Fixed gcc 7.1 support.
# v0.8.1
- Added nicehash support, detects automaticaly by pool URL, for example `cryptonight.eu.nicehash.com:3355` or manually via option `--nicehash`.
- Added nicehash support, detects automatically by pool URL, for example `cryptonight.eu.nicehash.com:3355` or manually via option `--nicehash`.
# v0.8.0
- Added double hash mode, also known as lower power mode. `--av=2` and `--av=4`.


@@ -124,7 +124,7 @@ Force enable (`true`) or disable (`false`) hardware AES support. Default value `
Mining threads priority, value from `1` (lowest priority) to `5` (highest possible priority). Default value `null` means miner don't change threads priority at all. Setting priority higher than 2 can make your PC unresponsive.
#### `memory-pool` (since v4.3.0)
Use continuous, persistent memory block for mining threads, useful for preserve huge pages allocation while algorithm switching. Possible values `false` (feature disabled, by default) or `true` or specific count of 2 MB huge pages. It helps to avoid loosing huge pages for scratchpads when RandomX dataset is updated and mining threads restart after a 2-3 days of mining.
Use continuous, persistent memory block for mining threads, useful for preserve huge pages allocation while algorithm switching. Possible values `false` (feature disabled, by default) or `true` or specific count of 2 MB huge pages. It helps to avoid losing huge pages for scratchpads when RandomX dataset is updated and mining threads restart after a 2-3 days of mining.
#### `yield` (since v5.1.1)
Prefer system better system response/stability `true` (default value) or maximum hashrate `false`.
@@ -133,7 +133,7 @@ Prefer system better system response/stability `true` (default value) or maximum
Enable/configure or disable ASM optimizations. Possible values: `true`, `false`, `"intel"`, `"ryzen"`, `"bulldozer"`.
#### `argon2-impl` (since v3.1.0)
Allow override automatically detected Argon2 implementation, this option added mostly for debug purposes, default value `null` means autodetect. This is used in RandomX dataset initialization and also in some other mining algorithms. Other possible values: `"x86_64"`, `"SSE2"`, `"SSSE3"`, `"XOP"`, `"AVX2"`, `"AVX-512F"`. Manual selection has no safe guards - if your CPU doesn't support required instuctions, miner will crash.
Allow override automatically detected Argon2 implementation, this option added mostly for debug purposes, default value `null` means autodetect. This is used in RandomX dataset initialization and also in some other mining algorithms. Other possible values: `"x86_64"`, `"SSE2"`, `"SSSE3"`, `"XOP"`, `"AVX2"`, `"AVX-512F"`. Manual selection has no safe guards - if your CPU doesn't support required instructions, miner will crash.
#### `astrobwt-max-size`
AstroBWT algorithm: skip hashes with large stage 2 size, default: `550`, min: `400`, max: `1200`. Optimal value depends on your CPU/GPU

doc/RISCV_PERF_TUNING.md (new file, 365 lines added)

@@ -0,0 +1,365 @@
# RISC-V Performance Optimization Guide
This guide provides comprehensive instructions for optimizing XMRig on RISC-V architectures.
## Build Optimizations
### Compiler Flags Applied Automatically
The CMake build now applies aggressive RISC-V-specific optimizations:
```cmake
# RISC-V ISA with extensions
-march=rv64gcv_zba_zbb_zbc_zbs
# Aggressive compiler optimizations
-funroll-loops # Unroll loops for ILP (instruction-level parallelism)
-fomit-frame-pointer # Free up frame pointer register (RISC-V has limited registers)
-fno-common # Better code generation for global variables
-finline-functions # Inline more functions for better cache locality
-ffast-math # Relaxed FP semantics (safe for mining)
-flto # Link-time optimization for cross-module inlining
# Release build additions
-minline-atomics # Inline atomic operations for faster synchronization
```
### Optimal Build Command
```bash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j$(nproc)
```
**Expected build time**: 5-15 minutes depending on CPU
## Runtime Optimizations
### 1. Memory Configuration (Most Important)
Enable huge pages to reduce TLB misses and fragmentation:
#### Enable 2MB Huge Pages
```bash
# Calculate required huge pages (1 page = 2MB)
# For 2 GB dataset: 1024 pages
# For cache + dataset: 1536 pages minimum
sudo sysctl -w vm.nr_hugepages=2048
```
Verify:
```bash
grep HugePages /proc/meminfo
# Expected: HugePages_Free should be close to nr_hugepages
```
#### Enable 1GB Huge Pages (Optional but Recommended)
```bash
# Run provided helper script
sudo ./scripts/enable_1gb_pages.sh
# Verify 1GB pages are available
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
# Should be: >= 1 (one 1GB page)
```
Update config.json:
```json
{
"cpu": {
"huge-pages": true
},
"randomx": {
"1gb-pages": true
}
}
```
### 2. RandomX Mode Selection
| Mode | Memory | Init Time | Throughput | Recommendation |
|------|--------|-----------|-----------|-----------------|
| **light** | 256 MB | 10 sec | Low | Testing, resource-constrained |
| **fast** | 2 GB | 2-5 min* | High | Production (with huge pages) |
| **auto** | 2 GB | Varies | High | Default (uses fast if possible) |
*With optimizations; can be 30+ minutes without huge pages
**For RISC-V, use fast mode with huge pages enabled.**
### 3. Dataset Initialization Threads
Optimal thread count = 60-75% of CPU cores (leaves headroom for OS/other tasks)
```json
{
"randomx": {
"init": 4
}
}
```
Or auto-detect (rewritten for RISC-V):
```json
{
"randomx": {
"init": -1
}
}
```
### 4. CPU Affinity (Optional)
Pin threads to specific cores for better cache locality:
```json
{
"cpu": {
"rx/0": [
{ "threads": 1, "affinity": 0 },
{ "threads": 1, "affinity": 1 },
{ "threads": 1, "affinity": 2 },
{ "threads": 1, "affinity": 3 }
]
}
}
```
### 5. CPU Governor (Linux)
Set to performance mode for maximum throughput:
```bash
# Check current governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Set to performance (requires root)
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# Verify
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Should output: performance
```
## Configuration Examples
### Minimum (Testing)
```json
{
"randomx": {
"mode": "light"
},
"cpu": {
"huge-pages": false
}
}
```
### Recommended (Balanced)
```json
{
"randomx": {
"mode": "auto",
"init": 4,
"1gb-pages": true
},
"cpu": {
"huge-pages": true,
"priority": 2
}
}
```
### Maximum Performance (Production)
```json
{
"randomx": {
"mode": "fast",
"init": -1,
"1gb-pages": true,
"scratchpad_prefetch_mode": 1
},
"cpu": {
"huge-pages": true,
"priority": 3,
"yield": false
}
}
```
## CLI Equivalents
```bash
# Light mode
./xmrig --randomx-mode=light
# Fast mode with 4 init threads
./xmrig --randomx-mode=fast --randomx-init=4
# Benchmark
./xmrig --bench=1M --algo=rx/0
# Benchmark Wownero variant (1 MB scratchpad)
./xmrig --bench=1M --algo=rx/wow
# Mine to pool
./xmrig -o pool.example.com:3333 -u YOUR_WALLET -p x
```
## Performance Diagnostics
### Check if Vector Extensions are Detected
Look for `FEATURES:` line in output:
```
* CPU: ky,x60 (uarch ky,x1)
* FEATURES: rv64imafdcv zba zbb zbc zbs
```
- `v`: Vector extension (RVV) ✓
- `zba`, `zbb`, `zbc`, `zbs`: Bit manipulation ✓
- If missing, make sure build used `-march=rv64gcv_zba_zbb_zbc_zbs`
### Verify Huge Pages at Runtime
```bash
# Run xmrig with --bench=1M and check output
./xmrig --bench=1M
# Look for line like:
# HUGE PAGES 100% 1 / 1 (1024 MB)
```
- Should show 100% for dataset AND threads
- If less, increase `vm.nr_hugepages` and reboot
### Monitor Performance
```bash
# Run benchmark multiple times to find stable hashrate
./xmrig --bench=1M --algo=rx/0
./xmrig --bench=10M --algo=rx/0
./xmrig --bench=100M --algo=rx/0
# Check system load and memory during mining
while true; do free -h; grep HugePages /proc/meminfo; sleep 2; done
```
## Expected Performance
### Hardware: Orange Pi RV2 (Ky X1, 8 cores @ ~1.5 GHz)
| Config | Mode | Hashrate | Init Time |
|--------|------|----------|-----------|
| Scalar (baseline) | fast | 30 H/s | 10 min |
| Scalar + huge pages | fast | 33 H/s | 2 min |
| RVV (if enabled) | fast | 70-100 H/s | 3 min |
*Actual results depend on CPU frequency, memory speed, and load*
## Troubleshooting
### Long Initialization Times (30+ minutes)
**Cause**: Huge pages not enabled, system using swap
**Solution**:
1. Enable huge pages: `sudo sysctl -w vm.nr_hugepages=2048`
2. Reboot: `sudo reboot`
3. Reduce mining threads to free memory
4. Check available memory: `free -h`
### Low Hashrate (50% of expected)
**Cause**: CPU governor set to power-save, no huge pages, high contention
**Solution**:
1. Set governor to performance: `echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor`
2. Enable huge pages
3. Reduce number of mining threads
4. Check system load: `top` or `htop`
### Dataset Init Crashes or Hangs
**Cause**: Insufficient memory, corrupted huge pages
**Solution**:
1. Disable huge pages temporarily: set `huge-pages: false` in config
2. Reduce mining threads
3. Reboot and re-enable huge pages
4. Try light mode: `--randomx-mode=light`
### Out of Memory During Benchmark
**Cause**: Not enough RAM for dataset + cache + threads
**Solution**:
1. Use light mode: `--randomx-mode=light`
2. Reduce mining threads: `--threads=1`
3. Increase available memory (kill other processes)
4. Check: `free -h` before mining
## Advanced Tuning
### Vector Length (VLEN) Detection
RISC-V vector extension variable length (VLEN) affects performance:
```bash
# Check VLEN on your CPU
cat /proc/cpuinfo | grep vlen
# Expected values:
# - 128 bits (16 bytes) = minimum
# - 256 bits (32 bytes) = common
# - 512 bits (64 bytes) = high performance
```
Larger VLEN generally means better performance for vectorized operations.
### Prefetch Optimization
The code automatically optimizes memory prefetching for RISC-V:
```
scratchpad_prefetch_mode: 0 = disabled (slowest)
scratchpad_prefetch_mode: 1 = prefetch.r (default, recommended)
scratchpad_prefetch_mode: 2 = prefetch.w (experimental)
```
### Memory Bandwidth Saturation
If experiencing memory bandwidth saturation (high latency):
1. Reduce mining threads
2. Increase L2/L3 cache by mining fewer threads per core
3. Enable cache QoS (AMD Ryzen): `cache_qos: true`
## Building with Custom Flags
To build with custom RISC-V flags:
```bash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_FLAGS="-march=rv64gcv_zba_zbb_zbc_zbs -O3 -funroll-loops -fomit-frame-pointer" \
..
make -j$(nproc)
```
## Future Optimizations
- [ ] Zbk* (crypto) support detection and usage
- [ ] Optimal VLEN-aware algorithm selection
- [ ] Per-core memory affinity (NUMA support)
- [ ] Dynamic thread count adjustment based on thermals
- [ ] Cross-compile optimizations for various RISC-V cores
## References
- [RISC-V Vector Extension Spec](https://github.com/riscv/riscv-v-spec)
- [RISC-V Bit Manipulation Spec](https://github.com/riscv/riscv-bitmanip)
- [RISC-V Crypto Spec](https://github.com/riscv/riscv-crypto)
- [XMRig Documentation](https://xmrig.com/docs)
---
For further optimization, enable RVV intrinsics by replacing `sse2rvv.h` with `sse2rvv_optimized.h` in the build.


@@ -12,7 +12,7 @@ if grep -E 'AMD Ryzen|AMD EPYC|AuthenticAMD' /proc/cpuinfo > /dev/null;
then
if grep "cpu family[[:space:]]\{1,\}:[[:space:]]25" /proc/cpuinfo > /dev/null;
then
if grep "model[[:space:]]\{1,\}:[[:space:]]97" /proc/cpuinfo > /dev/null;
if grep "model[[:space:]]\{1,\}:[[:space:]]\(97\|117\)" /proc/cpuinfo > /dev/null;
then
echo "Detected Zen4 CPU"
wrmsr -a 0xc0011020 0x4400000000000


@@ -35,7 +35,7 @@ if (CMAKE_C_COMPILER_ID MATCHES MSVC)
add_feature_impl(xop "" HAVE_XOP)
add_feature_impl(avx2 "/arch:AVX2" HAVE_AVX2)
add_feature_impl(avx512f "/arch:AVX512F" HAVE_AVX512F)
elseif (NOT XMRIG_ARM AND CMAKE_SIZEOF_VOID_P EQUAL 8)
elseif (NOT XMRIG_ARM AND NOT XMRIG_RISCV AND CMAKE_SIZEOF_VOID_P EQUAL 8)
function(add_feature_impl FEATURE GCC_FLAG DEF)
add_library(argon2-${FEATURE} STATIC arch/x86_64/lib/argon2-${FEATURE}.c)
target_include_directories(argon2-${FEATURE} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/../../)


@@ -31,7 +31,7 @@
#include <libkern/OSByteOrder.h>
#define ethash_swap_u32(input_) OSSwapInt32(input_)
#define ethash_swap_u64(input_) OSSwapInt64(input_)
#elif defined(__FreeBSD__) || defined(__DragonFly__) || defined(__NetBSD__)
#elif defined(__FreeBSD__) || defined(__DragonFly__) || defined(__NetBSD__) || defined(__HAIKU__)
#define ethash_swap_u32(input_) bswap32(input_)
#define ethash_swap_u64(input_) bswap64(input_)
#elif defined(__OpenBSD__)


@@ -89,11 +89,16 @@ static void print_cpu(const Config *)
{
const auto info = Cpu::info();
Log::print(GREEN_BOLD(" * ") WHITE_BOLD("%-13s%s (%zu)") " %s %sAES%s",
Log::print(GREEN_BOLD(" * ") WHITE_BOLD("%-13s%s (%zu)") " %s %s%sAES%s",
"CPU",
info->brand(),
info->packages(),
ICpuInfo::is64bit() ? GREEN_BOLD("64-bit") : RED_BOLD("32-bit"),
#ifdef XMRIG_RISCV
info->hasRISCV_Vector() ? GREEN_BOLD_S "RVV " : RED_BOLD_S "-RVV ",
#else
"",
#endif
info->hasAES() ? GREEN_BOLD_S : RED_BOLD_S "-",
info->isVM() ? RED_BOLD_S " VM" : ""
);


@@ -48,6 +48,24 @@ static const std::map<int, std::map<uint32_t, uint64_t> > hashCheck = {
{ 9000000U, 0x323935102AB6B45CULL },
{ 10000000U, 0xB5231262E2792B26ULL }
}},
{ Algorithm::RX_V2, {
# ifndef NDEBUG
{ 10000U, 0x57d2051d099613a4ULL },
{ 20000U, 0x0bae0155cc797f01ULL },
# endif
{ 250000U, 0x18cf741a71484072ULL },
{ 500000U, 0xcd8c3e6ec31b2faeULL },
{ 1000000U, 0x88d6b8fb70cd479dULL },
{ 2000000U, 0x0e16828d236a1a63ULL },
{ 3000000U, 0x2739bdd0f25b83a6ULL },
{ 4000000U, 0x32f42d9006d2d34bULL },
{ 5000000U, 0x16d9c6286cb82251ULL },
{ 6000000U, 0x1f916ae19d6bcf07ULL },
{ 7000000U, 0x1f474f99a873948fULL },
{ 8000000U, 0x8d67e0ddf05476bbULL },
{ 9000000U, 0x3ebf37dcd5c4a215ULL },
{ 10000000U, 0x7efbddff3f30fb74ULL }
}},
{ Algorithm::RX_WOW, {
# ifndef NDEBUG
{ 10000U, 0x6B0918757100B338ULL },
@@ -88,6 +106,24 @@ static const std::map<int, std::map<uint32_t, uint64_t> > hashCheck1T = {
{ 9000000U, 0xC6D39EF59213A07CULL },
{ 10000000U, 0x95E6BAE68DD779CDULL }
}},
{ Algorithm::RX_V2, {
# ifndef NDEBUG
{ 10000, 0x90eb7c07cd9e0d90ULL },
{ 20000, 0x6523a3658d7d9930ULL },
# endif
{ 250000, 0xf83b6d9d355ee5b1ULL },
{ 500000, 0xbea3c1bf1465e9abULL },
{ 1000000, 0x9e16f7cb56b366e1ULL },
{ 2000000, 0x3b5e671f47e15e55ULL },
{ 3000000, 0xec5819c180df03e2ULL },
{ 4000000, 0x19d31b498f86aad4ULL },
{ 5000000, 0x2487626c75cd12ccULL },
{ 6000000, 0xa323a25a5286c39aULL },
{ 7000000, 0xa123b100f3104dfcULL },
{ 8000000, 0x602db9d83bfa0ddcULL },
{ 9000000, 0x98da909e579765ddULL },
{ 10000000, 0x3a45b7247cec9895ULL }
}},
{ Algorithm::RX_WOW, {
# ifndef NDEBUG
{ 10000U, 0x9EC1B9B8C8C7F082ULL },

View File

@@ -87,14 +87,14 @@ xmrig::CpuWorker<N>::CpuWorker(size_t id, const CpuLaunchData &data) :
if (!cn_heavyZen3Memory) {
// Round up number of threads to the multiple of 8
const size_t num_threads = ((m_threads + 7) / 8) * 8;
cn_heavyZen3Memory = new VirtualMemory(m_algorithm.l3() * num_threads, data.hugePages, false, false, node());
cn_heavyZen3Memory = new VirtualMemory(m_algorithm.l3() * num_threads, data.hugePages, false, false, node(), VirtualMemory::kDefaultHugePageSize);
}
m_memory = cn_heavyZen3Memory;
}
else
# endif
{
m_memory = new VirtualMemory(m_algorithm.l3() * N, data.hugePages, false, true, node());
m_memory = new VirtualMemory(m_algorithm.l3() * N, data.hugePages, false, true, node(), VirtualMemory::kDefaultHugePageSize);
}
# ifdef XMRIG_ALGO_GHOSTRIDER
@@ -256,7 +256,10 @@ void xmrig::CpuWorker<N>::start()
# ifdef XMRIG_ALGO_RANDOMX
bool first = true;
alignas(16) uint64_t tempHash[8] = {};
alignas(64) uint64_t tempHash[8] = {};
size_t prev_job_size = 0;
alignas(64) uint8_t prev_job[Job::kMaxBlobSize] = {};
# endif
while (!Nonce::isOutdated(Nonce::CPU, m_job.sequence())) {
@@ -297,6 +300,11 @@ void xmrig::CpuWorker<N>::start()
job.generateMinerSignature(m_job.blob(), job.size(), miner_signature_ptr);
}
randomx_calculate_hash_first(m_vm, tempHash, m_job.blob(), job.size());
if (RandomX_CurrentConfig.Tweak_V2_COMMITMENT) {
prev_job_size = job.size();
memcpy(prev_job, m_job.blob(), prev_job_size);
}
}
if (!nextRound()) {
@@ -307,7 +315,15 @@ void xmrig::CpuWorker<N>::start()
memcpy(miner_signature_saved, miner_signature_ptr, sizeof(miner_signature_saved));
job.generateMinerSignature(m_job.blob(), job.size(), miner_signature_ptr);
}
randomx_calculate_hash_next(m_vm, tempHash, m_job.blob(), job.size(), m_hash);
if (RandomX_CurrentConfig.Tweak_V2_COMMITMENT) {
memcpy(m_commitment, m_hash, RANDOMX_HASH_SIZE);
randomx_calculate_commitment(prev_job, prev_job_size, m_hash, m_hash);
prev_job_size = job.size();
memcpy(prev_job, m_job.blob(), prev_job_size);
}
}
else
# endif
@@ -347,8 +363,20 @@ void xmrig::CpuWorker<N>::start()
}
else
# endif
if (value < job.target()) {
JobResults::submit(job, current_job_nonces[i], m_hash + (i * 32), job.hasMinerSignature() ? miner_signature_saved : nullptr);
uint8_t* extra_data = nullptr;
if (job.algorithm().family() == Algorithm::RANDOM_X) {
if (RandomX_CurrentConfig.Tweak_V2_COMMITMENT) {
extra_data = m_commitment;
}
else if (job.hasMinerSignature()) {
extra_data = miner_signature_saved;
}
}
JobResults::submit(job, current_job_nonces[i], m_hash + (i * 32), extra_data);
}
}
m_count += N;

View File

@@ -83,6 +83,7 @@ private:
void allocateCnCtx();
void consumeJob();
alignas(8) uint8_t m_commitment[N * 32]{ 0 };
alignas(8) uint8_t m_hash[N * 32]{ 0 };
const Algorithm m_algorithm;
const Assembly m_assembly;

View File

@@ -46,7 +46,12 @@ else()
set(CPUID_LIB "")
endif()
if (XMRIG_ARM)
if (XMRIG_RISCV)
list(APPEND SOURCES_BACKEND_CPU
src/backend/cpu/platform/lscpu_riscv.cpp
src/backend/cpu/platform/BasicCpuInfo_riscv.cpp
)
elseif (XMRIG_ARM)
list(APPEND SOURCES_BACKEND_CPU src/backend/cpu/platform/BasicCpuInfo_arm.cpp)
if (XMRIG_OS_WIN)

View File

@@ -85,13 +85,14 @@ public:
FLAG_POPCNT,
FLAG_CAT_L3,
FLAG_VM,
FLAG_RISCV_VECTOR,
FLAG_MAX
};
ICpuInfo() = default;
virtual ~ICpuInfo() = default;
# if defined(__x86_64__) || defined(_M_AMD64) || defined (__arm64__) || defined (__aarch64__)
# if defined(__x86_64__) || defined(_M_AMD64) || defined (__arm64__) || defined (__aarch64__) || defined(__riscv) && (__riscv_xlen == 64)
inline constexpr static bool is64bit() { return true; }
# else
inline constexpr static bool is64bit() { return false; }
@@ -109,6 +110,7 @@ public:
virtual bool hasOneGbPages() const = 0;
virtual bool hasXOP() const = 0;
virtual bool isVM() const = 0;
virtual bool hasRISCV_Vector() const = 0;
virtual bool jccErratum() const = 0;
virtual const char *backend() const = 0;
virtual const char *brand() const = 0;

View File

@@ -58,8 +58,8 @@
namespace xmrig {
constexpr size_t kCpuFlagsSize = 15;
static const std::array<const char *, kCpuFlagsSize> flagNames = { "aes", "vaes", "avx", "avx2", "avx512f", "bmi2", "osxsave", "pdpe1gb", "sse2", "ssse3", "sse4.1", "xop", "popcnt", "cat_l3", "vm" };
constexpr size_t kCpuFlagsSize = 16;
static const std::array<const char *, kCpuFlagsSize> flagNames = { "aes", "vaes", "avx", "avx2", "avx512f", "bmi2", "osxsave", "pdpe1gb", "sse2", "ssse3", "sse4.1", "xop", "popcnt", "cat_l3", "vm", "rvv" };
static_assert(kCpuFlagsSize == ICpuInfo::FLAG_MAX, "kCpuFlagsSize and FLAG_MAX mismatch");
@@ -250,7 +250,7 @@ xmrig::BasicCpuInfo::BasicCpuInfo() :
break;
case 0x19:
if (m_model == 0x61) {
if ((m_model == 0x61) || (m_model == 0x75)) {
m_arch = ARCH_ZEN4;
m_msrMod = MSR_MOD_RYZEN_19H_ZEN4;
}

View File

@@ -52,6 +52,7 @@ protected:
inline bool hasOneGbPages() const override { return has(FLAG_PDPE1GB); }
inline bool hasXOP() const override { return has(FLAG_XOP); }
inline bool isVM() const override { return has(FLAG_VM); }
inline bool hasRISCV_Vector() const override { return has(FLAG_RISCV_VECTOR); }
inline bool jccErratum() const override { return m_jccErratum; }
inline const char *brand() const override { return m_brand; }
inline const std::vector<int32_t> &units() const override { return m_units; }
@@ -65,7 +66,7 @@ protected:
inline Vendor vendor() const override { return m_vendor; }
inline uint32_t model() const override
{
# ifndef XMRIG_ARM
# if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
return m_model;
# else
return 0;
@@ -80,7 +81,7 @@ protected:
Vendor m_vendor = VENDOR_UNKNOWN;
private:
# ifndef XMRIG_ARM
# if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
uint32_t m_procInfo = 0;
uint32_t m_family = 0;
uint32_t m_model = 0;

View File

@@ -0,0 +1,119 @@
/* XMRig
* Copyright (c) 2025 Slayingripper <https://github.com/Slayingripper>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2017-2019 XMR-Stak <https://github.com/fireice-uk>, <https://github.com/psychocrypt>
* Copyright (c) 2016-2025 XMRig <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <array>
#include <cstring>
#include <fstream>
#include <thread>
#include "backend/cpu/platform/BasicCpuInfo.h"
#include "base/tools/String.h"
#include "3rdparty/rapidjson/document.h"
namespace xmrig {
extern String cpu_name_riscv();
extern bool has_riscv_vector();
extern bool has_riscv_aes();
} // namespace xmrig
xmrig::BasicCpuInfo::BasicCpuInfo() :
m_threads(std::thread::hardware_concurrency())
{
m_units.resize(m_threads);
for (int32_t i = 0; i < static_cast<int32_t>(m_threads); ++i) {
m_units[i] = i;
}
memcpy(m_brand, "RISC-V", 6);
auto name = cpu_name_riscv();
if (!name.isNull()) {
strncpy(m_brand, name.data(), sizeof(m_brand) - 1);
}
// Check for vector extensions
m_flags.set(FLAG_RISCV_VECTOR, has_riscv_vector());
// Check for AES extensions (Zknd/Zkne)
m_flags.set(FLAG_AES, has_riscv_aes());
// RISC-V typically supports 1GB huge pages
m_flags.set(FLAG_PDPE1GB, std::ifstream("/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages").good());
}
const char *xmrig::BasicCpuInfo::backend() const
{
return "basic/1";
}
xmrig::CpuThreads xmrig::BasicCpuInfo::threads(const Algorithm &algorithm, uint32_t) const
{
# ifdef XMRIG_ALGO_GHOSTRIDER
if (algorithm.family() == Algorithm::GHOSTRIDER) {
return CpuThreads(threads(), 8);
}
# endif
return CpuThreads(threads());
}
rapidjson::Value xmrig::BasicCpuInfo::toJSON(rapidjson::Document &doc) const
{
using namespace rapidjson;
auto &allocator = doc.GetAllocator();
Value out(kObjectType);
out.AddMember("brand", StringRef(brand()), allocator);
out.AddMember("aes", hasAES(), allocator);
out.AddMember("avx2", false, allocator);
out.AddMember("x64", is64bit(), allocator); // DEPRECATED will be removed in the next major release.
out.AddMember("64_bit", is64bit(), allocator);
out.AddMember("l2", static_cast<uint64_t>(L2()), allocator);
out.AddMember("l3", static_cast<uint64_t>(L3()), allocator);
out.AddMember("cores", static_cast<uint64_t>(cores()), allocator);
out.AddMember("threads", static_cast<uint64_t>(threads()), allocator);
out.AddMember("packages", static_cast<uint64_t>(packages()), allocator);
out.AddMember("nodes", static_cast<uint64_t>(nodes()), allocator);
out.AddMember("backend", StringRef(backend()), allocator);
out.AddMember("msr", "none", allocator);
out.AddMember("assembly", "none", allocator);
out.AddMember("arch", "riscv64", allocator);
Value flags(kArrayType);
if (hasAES()) {
flags.PushBack("aes", allocator);
}
out.AddMember("flags", flags, allocator);
return out;
}

View File

@@ -87,7 +87,7 @@ static inline size_t countByType(hwloc_topology_t topology, hwloc_obj_type_t typ
}
#ifndef XMRIG_ARM
#if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
static inline std::vector<hwloc_obj_t> findByType(hwloc_obj_t obj, hwloc_obj_type_t type)
{
std::vector<hwloc_obj_t> out;
@@ -207,7 +207,7 @@ bool xmrig::HwlocCpuInfo::membind(hwloc_const_bitmap_t nodeset)
xmrig::CpuThreads xmrig::HwlocCpuInfo::threads(const Algorithm &algorithm, uint32_t limit) const
{
# ifndef XMRIG_ARM
# if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
if (L2() == 0 && L3() == 0) {
return BasicCpuInfo::threads(algorithm, limit);
}
@@ -277,7 +277,7 @@ xmrig::CpuThreads xmrig::HwlocCpuInfo::allThreads(const Algorithm &algorithm, ui
void xmrig::HwlocCpuInfo::processTopLevelCache(hwloc_obj_t cache, const Algorithm &algorithm, CpuThreads &threads, size_t limit) const
{
# ifndef XMRIG_ARM
# if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
constexpr size_t oneMiB = 1024U * 1024U;
size_t PUs = countByType(cache, HWLOC_OBJ_PU);
@@ -311,17 +311,17 @@ void xmrig::HwlocCpuInfo::processTopLevelCache(hwloc_obj_t cache, const Algorith
uint32_t intensity = algorithm.maxIntensity() == 1 ? 0 : 1;
if (cache->attr->cache.depth == 3) {
for (size_t i = 0; i < cache->arity; ++i) {
hwloc_obj_t l2 = cache->children[i];
auto process_L2 = [&L2, &L2_associativity, L3_exclusive, this, &extra, scratchpad](hwloc_obj_t l2) {
if (!hwloc_obj_type_is_cache(l2->type) || l2->attr == nullptr) {
continue;
return;
}
L2 += l2->attr->cache.size;
L2_associativity = l2->attr->cache.associativity;
if (L3_exclusive) {
if (vendor() == VENDOR_AMD) {
if ((vendor() == VENDOR_AMD) && ((arch() == ARCH_ZEN4) || (arch() == ARCH_ZEN5))) {
// Use extra L2 only on newer CPUs because older CPUs (Zen 3 and older) don't benefit from it.
// For some reason, AMD CPUs can use only half of the exclusive L2/L3 cache combo efficiently
extra += std::min<size_t>(l2->attr->cache.size / 2, scratchpad);
}
@@ -329,6 +329,18 @@ void xmrig::HwlocCpuInfo::processTopLevelCache(hwloc_obj_t cache, const Algorith
extra += scratchpad;
}
}
};
for (size_t i = 0; i < cache->arity; ++i) {
hwloc_obj_t ch = cache->children[i];
if (ch->type == HWLOC_OBJ_GROUP) {
for (size_t j = 0; j < ch->arity; ++j) {
process_L2(ch->children[j]);
}
}
else {
process_L2(ch);
}
}
}

View File

@@ -0,0 +1,150 @@
/* XMRig
* Copyright (c) 2025 Slayingripper <https://github.com/Slayingripper>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "base/tools/String.h"
#include "3rdparty/fmt/core.h"
#include <cstdio>
#include <cstring>
#include <string>
namespace xmrig {
struct riscv_cpu_desc
{
String model;
String isa;
String uarch;
bool has_vector = false;
bool has_aes = false;
inline bool isReady() const { return !isa.isNull(); }
};
static bool lookup_riscv(char *line, const char *pattern, String &value)
{
char *p = strstr(line, pattern);
if (!p) {
return false;
}
p += strlen(pattern);
while (isspace(*p)) {
++p;
}
if (*p == ':') {
++p;
}
while (isspace(*p)) {
++p;
}
// Remove trailing newline
size_t len = strlen(p);
if (len > 0 && p[len - 1] == '\n') {
p[len - 1] = '\0';
}
// Ensure we call the const char* assignment (which performs a copy)
// instead of the char* overload (which would take ownership of the pointer)
value = (const char*)p;
return true;
}
static bool read_riscv_cpuinfo(riscv_cpu_desc *desc)
{
auto fp = fopen("/proc/cpuinfo", "r");
if (!fp) {
return false;
}
char buf[2048]; // Larger buffer for long ISA strings
while (fgets(buf, sizeof(buf), fp) != nullptr) {
lookup_riscv(buf, "model name", desc->model);
if (lookup_riscv(buf, "isa", desc->isa)) {
desc->isa.toLower();
for (const String& s : desc->isa.split('_')) {
const char* p = s.data();
const size_t n = s.size();
if ((s.size() > 4) && (memcmp(p, "rv64", 4) == 0)) {
for (size_t i = 4; i < n; ++i) {
if (p[i] == 'v') {
desc->has_vector = true;
break;
}
}
}
else if (s == "zve64d") {
desc->has_vector = true;
}
else if ((s == "zvkn") || (s == "zvknc") || (s == "zvkned") || (s == "zvkng")){
desc->has_aes = true;
}
}
}
lookup_riscv(buf, "uarch", desc->uarch);
if (desc->isReady()) {
break;
}
}
fclose(fp);
return desc->isReady();
}
String cpu_name_riscv()
{
riscv_cpu_desc desc;
if (read_riscv_cpuinfo(&desc)) {
if (!desc.uarch.isNull()) {
return fmt::format("{} ({})", desc.model, desc.uarch).c_str();
}
return desc.model;
}
return "RISC-V";
}
bool has_riscv_vector()
{
riscv_cpu_desc desc;
if (read_riscv_cpuinfo(&desc)) {
return desc.has_vector;
}
return false;
}
bool has_riscv_aes()
{
riscv_cpu_desc desc;
if (read_riscv_cpuinfo(&desc)) {
return desc.has_aes;
}
return false;
}
} // namespace xmrig

View File

@@ -19,6 +19,7 @@
#define ALGO_CN_PICO_TLO 0x63120274
#define ALGO_CN_UPX2 0x63110200
#define ALGO_RX_0 0x72151200
#define ALGO_RX_V2 0x72151202
#define ALGO_RX_WOW 0x72141177
#define ALGO_RX_ARQMA 0x72121061
#define ALGO_RX_SFX 0x72151273

View File

@@ -706,7 +706,7 @@ __kernel void cn2(__global uint4 *Scratchpad, __global ulong *states, __global u
}
# if (ALGO_FAMILY == FAMILY_CN_HEAVY)
/* Also left over threads performe this loop.
/* Also left over threads perform this loop.
* The left over thread results will be ignored
*/
#pragma unroll 16
@@ -1005,7 +1005,7 @@ __kernel void Groestl(__global ulong *states, __global uint *BranchBuf, __global
ulong State[8] = { 0UL, 0UL, 0UL, 0UL, 0UL, 0UL, 0UL, 0x0001000000000000UL };
ulong H[8], M[8];
// BUG: AMD driver 19.7.X crashs if this is written as loop
// BUG: AMD driver 19.7.X crashes if this is written as loop
// Thx AMD for so bad software
{
((ulong8 *)M)[0] = vload8(0, states);

File diff suppressed because it is too large.

View File

@@ -10,7 +10,7 @@
#else
# define STATIC
/* taken from https://www.khronos.org/registry/OpenCL/extensions/amd/cl_amd_media_ops.txt
* Build-in Function
* Built-in Function
* uintn amd_bitalign (uintn src0, uintn src1, uintn src2)
* Description
* dst.s0 = (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> (src2.s0 & 31))

View File

@@ -77,7 +77,7 @@ void keccak_f800_round(uint32_t st[25], const int r)
void keccak_f800(uint32_t* st)
{
// Complete all 22 rounds as a separate impl to
// evaluate only first 8 words is wasteful of regsters
// evaluate only first 8 words is wasteful of registers
for (int r = 0; r < 22; r++) {
keccak_f800_round(st, r);
}
@@ -181,7 +181,7 @@ __kernel void progpow_search(__global dag_t const* g_dag, __global uint* job_blo
for (int i = 10; i < 25; i++)
state[i] = ravencoin_rndc[i-10];
// Run intial keccak round
// Run initial keccak round
keccak_f800(state);
for (int i = 0; i < 8; i++)

File diff suppressed because it is too large.

View File

@@ -77,6 +77,7 @@ const char *Algorithm::kCN_UPX2 = "cn/upx2";
#ifdef XMRIG_ALGO_RANDOMX
const char *Algorithm::kRX = "rx";
const char *Algorithm::kRX_0 = "rx/0";
const char *Algorithm::kRX_V2 = "rx/2";
const char *Algorithm::kRX_WOW = "rx/wow";
const char *Algorithm::kRX_ARQ = "rx/arq";
const char *Algorithm::kRX_GRAFT = "rx/graft";
@@ -143,6 +144,7 @@ static const std::map<uint32_t, const char *> kAlgorithmNames = {
# ifdef XMRIG_ALGO_RANDOMX
ALGO_NAME(RX_0),
ALGO_NAME(RX_V2),
ALGO_NAME(RX_WOW),
ALGO_NAME(RX_ARQ),
ALGO_NAME(RX_GRAFT),
@@ -253,6 +255,8 @@ static const std::map<const char *, Algorithm::Id, aliasCompare> kAlgorithmAlias
ALGO_ALIAS(RX_0, "rx/test"),
ALGO_ALIAS(RX_0, "randomx"),
ALGO_ALIAS(RX_0, "rx"),
ALGO_ALIAS_AUTO(RX_V2), ALGO_ALIAS(RX_V2, "randomx/v2"),
ALGO_ALIAS(RX_V2, "rx/v2"),
ALGO_ALIAS_AUTO(RX_WOW), ALGO_ALIAS(RX_WOW, "randomx/wow"),
ALGO_ALIAS(RX_WOW, "randomwow"),
ALGO_ALIAS_AUTO(RX_ARQ), ALGO_ALIAS(RX_ARQ, "randomx/arq"),
@@ -350,7 +354,7 @@ std::vector<xmrig::Algorithm> xmrig::Algorithm::all(const std::function<bool(con
CN_HEAVY_0, CN_HEAVY_TUBE, CN_HEAVY_XHV,
CN_PICO_0, CN_PICO_TLO,
CN_UPX2,
RX_0, RX_WOW, RX_ARQ, RX_GRAFT, RX_SFX, RX_YADA,
RX_0, RX_V2, RX_WOW, RX_ARQ, RX_GRAFT, RX_SFX, RX_YADA,
AR2_CHUKWA, AR2_CHUKWA_V2, AR2_WRKZ,
KAWPOW_RVN,
GHOSTRIDER_RTM

View File

@@ -73,6 +73,7 @@ public:
CN_GR_5 = 0x63120105, // "cn/turtle-lite" GhostRider
GHOSTRIDER_RTM = 0x6c150000, // "ghostrider" GhostRider
RX_0 = 0x72151200, // "rx/0" RandomX (reference configuration).
RX_V2 = 0x72151202, // "rx/2" RandomX (Monero v2).
RX_WOW = 0x72141177, // "rx/wow" RandomWOW (Wownero).
RX_ARQ = 0x72121061, // "rx/arq" RandomARQ (Arqma).
RX_GRAFT = 0x72151267, // "rx/graft" RandomGRAFT (Graft).
@@ -139,6 +140,7 @@ public:
# ifdef XMRIG_ALGO_RANDOMX
static const char *kRX;
static const char *kRX_0;
static const char* kRX_V2;
static const char *kRX_WOW;
static const char *kRX_ARQ;
static const char *kRX_GRAFT;

View File

@@ -48,7 +48,7 @@
#define KECCAK_ROUNDS 24
/* *************************** Public Inteface ************************ */
/* *************************** Public Interface ************************ */
/* For Init or Reset call these: */
sha3_return_t

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -71,11 +71,11 @@ char *xmrig::Platform::createUserAgent()
#ifndef XMRIG_FEATURE_HWLOC
#ifdef __DragonFly__
#if defined(__DragonFly__) || defined(XMRIG_OS_OPENBSD) || defined(XMRIG_OS_HAIKU)
bool xmrig::Platform::setThreadAffinity(uint64_t cpu_id)
{
return true;
return false;
}
#else

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -18,14 +18,12 @@
#include <cstdio>
#ifdef _MSC_VER
# include "getopt/getopt.h"
#else
# include <getopt.h>
#endif
#include "base/kernel/config/BaseTransform.h"
#include "base/io/json/JsonChain.h"
#include "base/io/log/Log.h"
@@ -37,7 +35,6 @@
#include "base/net/stratum/Pools.h"
#include "core/config/Config_platform.h"
#ifdef XMRIG_FEATURE_TLS
# include "base/net/tls/TlsConfig.h"
#endif
@@ -47,9 +44,9 @@ void xmrig::BaseTransform::load(JsonChain &chain, Process *process, IConfigTrans
{
using namespace rapidjson;
int key = 0;
int argc = process->arguments().argc();
char **argv = process->arguments().argv();
int key = 0;
const int argc = process->arguments().argc();
char **argv = process->arguments().argv();
Document doc(kObjectType);
@@ -262,7 +259,8 @@ void xmrig::BaseTransform::transform(rapidjson::Document &doc, int key, const ch
case IConfig::DaemonKey: /* --daemon */
case IConfig::SubmitToOriginKey: /* --submit-to-origin */
case IConfig::VerboseKey: /* --verbose */
case IConfig::DnsIPv6Key: /* --dns-ipv6 */
case IConfig::DnsIPv4Key: /* --ipv4 */
case IConfig::DnsIPv6Key: /* --ipv6 */
return transformBoolean(doc, key, true);
case IConfig::ColorKey: /* --no-color */
@@ -323,8 +321,11 @@ void xmrig::BaseTransform::transformBoolean(rapidjson::Document &doc, int key, b
case IConfig::NoTitleKey: /* --no-title */
return set(doc, BaseConfig::kTitle, enable);
case IConfig::DnsIPv6Key: /* --dns-ipv6 */
return set(doc, DnsConfig::kField, DnsConfig::kIPv6, enable);
case IConfig::DnsIPv4Key: /* --ipv4 */
return set(doc, DnsConfig::kField, DnsConfig::kIPv, 4);
case IConfig::DnsIPv6Key: /* --ipv6 */
return set(doc, DnsConfig::kField, DnsConfig::kIPv, 6);
default:
break;

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,9 +16,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_ICONFIG_H
#define XMRIG_ICONFIG_H
#pragma once
#include "3rdparty/rapidjson/fwd.h"
@@ -82,7 +80,8 @@ public:
HugePageSizeKey = 1050,
PauseOnActiveKey = 1051,
SubmitToOriginKey = 1052,
DnsIPv6Key = 1053,
DnsIPv4Key = '4',
DnsIPv6Key = '6',
DnsTtlKey = 1054,
SpendSecretKey = 1055,
DaemonZMQPortKey = 1056,
@@ -177,7 +176,4 @@ public:
};
} /* namespace xmrig */
#endif // XMRIG_ICONFIG_H
} // namespace xmrig

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,21 +16,16 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_IDNSBACKEND_H
#define XMRIG_IDNSBACKEND_H
#pragma once
#include "base/tools/Object.h"
#include <memory>
namespace xmrig {
class DnsConfig;
class DnsRecords;
class DnsRequest;
class IDnsListener;
class String;
@@ -43,12 +38,8 @@ public:
IDnsBackend() = default;
virtual ~IDnsBackend() = default;
virtual const DnsRecords &records() const = 0;
virtual std::shared_ptr<DnsRequest> resolve(const String &host, IDnsListener *listener, uint64_t ttl) = 0;
virtual void resolve(const String &host, const std::weak_ptr<IDnsListener> &listener, const DnsConfig &config) = 0;
};
} /* namespace xmrig */
#endif // XMRIG_IDNSBACKEND_H
} // namespace xmrig

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -18,6 +18,7 @@
#include "base/net/dns/Dns.h"
#include "base/net/dns/DnsRequest.h"
#include "base/net/dns/DnsUvBackend.h"
@@ -25,17 +26,21 @@ namespace xmrig {
DnsConfig Dns::m_config;
std::map<String, std::shared_ptr<IDnsBackend> > Dns::m_backends;
std::map<String, std::shared_ptr<IDnsBackend>> Dns::m_backends;
} // namespace xmrig
std::shared_ptr<xmrig::DnsRequest> xmrig::Dns::resolve(const String &host, IDnsListener *listener, uint64_t ttl)
std::shared_ptr<xmrig::DnsRequest> xmrig::Dns::resolve(const String &host, IDnsListener *listener)
{
auto req = std::make_shared<DnsRequest>(listener);
if (m_backends.find(host) == m_backends.end()) {
m_backends.insert({ host, std::make_shared<DnsUvBackend>() });
}
return m_backends.at(host)->resolve(host, listener, ttl == 0 ? m_config.ttl() : ttl);
m_backends.at(host)->resolve(host, req, m_config);
return req;
}

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -43,7 +43,7 @@ public:
inline static const DnsConfig &config() { return m_config; }
inline static void set(const DnsConfig &config) { m_config = config; }
static std::shared_ptr<DnsRequest> resolve(const String &host, IDnsListener *listener, uint64_t ttl = 0);
static std::shared_ptr<DnsRequest> resolve(const String &host, IDnsListener *listener);
private:
static DnsConfig m_config;

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -20,15 +20,15 @@
#include "3rdparty/rapidjson/document.h"
#include "base/io/json/Json.h"
#include <algorithm>
#include <uv.h>
namespace xmrig {
const char *DnsConfig::kField = "dns";
const char *DnsConfig::kIPv6 = "ipv6";
const char *DnsConfig::kIPv = "ip_version";
const char *DnsConfig::kTTL = "ttl";
@@ -37,8 +37,26 @@ const char *DnsConfig::kTTL = "ttl";
xmrig::DnsConfig::DnsConfig(const rapidjson::Value &value)
{
m_ipv6 = Json::getBool(value, kIPv6, m_ipv6);
m_ttl = std::max(Json::getUint(value, kTTL, m_ttl), 1U);
const uint32_t ipv = Json::getUint(value, kIPv, m_ipv);
if (ipv == 0 || ipv == 4 || ipv == 6) {
m_ipv = ipv;
}
m_ttl = std::max(Json::getUint(value, kTTL, m_ttl), 1U);
}
int xmrig::DnsConfig::ai_family() const
{
if (m_ipv == 4) {
return AF_INET;
}
if (m_ipv == 6) {
return AF_INET6;
}
return AF_UNSPEC;
}
@@ -49,8 +67,8 @@ rapidjson::Value xmrig::DnsConfig::toJSON(rapidjson::Document &doc) const
auto &allocator = doc.GetAllocator();
Value obj(kObjectType);
obj.AddMember(StringRef(kIPv6), m_ipv6, allocator);
obj.AddMember(StringRef(kTTL), m_ttl, allocator);
obj.AddMember(StringRef(kIPv), m_ipv, allocator);
obj.AddMember(StringRef(kTTL), m_ttl, allocator);
return obj;
}

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,9 +16,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_DNSCONFIG_H
#define XMRIG_DNSCONFIG_H
#pragma once
#include "3rdparty/rapidjson/fwd.h"
@@ -30,25 +28,22 @@ class DnsConfig
{
public:
static const char *kField;
static const char *kIPv6;
static const char *kIPv;
static const char *kTTL;
DnsConfig() = default;
DnsConfig(const rapidjson::Value &value);
inline bool isIPv6() const { return m_ipv6; }
inline uint32_t ipv() const { return m_ipv; }
inline uint32_t ttl() const { return m_ttl * 1000U; }
int ai_family() const;
rapidjson::Value toJSON(rapidjson::Document &doc) const;
private:
bool m_ipv6 = false;
uint32_t m_ttl = 30U;
uint32_t m_ttl = 30U;
uint32_t m_ipv = 0U;
};
} /* namespace xmrig */
#endif /* XMRIG_DNSCONFIG_H */
} // namespace xmrig

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2023 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2023 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,19 +16,16 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <uv.h>
#include "base/net/dns/DnsRecord.h"
xmrig::DnsRecord::DnsRecord(const addrinfo *addr) :
m_type(addr->ai_family == AF_INET6 ? AAAA : (addr->ai_family == AF_INET ? A : Unknown))
xmrig::DnsRecord::DnsRecord(const addrinfo *addr)
{
static_assert(sizeof(m_data) >= sizeof(sockaddr_in6), "Not enough storage for IPv6 address.");
memcpy(m_data, addr->ai_addr, m_type == AAAA ? sizeof(sockaddr_in6) : sizeof(sockaddr_in));
memcpy(m_data, addr->ai_addr, addr->ai_family == AF_INET6 ? sizeof(sockaddr_in6) : sizeof(sockaddr_in));
}
@@ -44,7 +41,7 @@ xmrig::String xmrig::DnsRecord::ip() const
{
char *buf = nullptr;
if (m_type == AAAA) {
if (reinterpret_cast<const sockaddr &>(m_data).sa_family == AF_INET6) {
buf = new char[45]();
uv_ip6_name(reinterpret_cast<const sockaddr_in6*>(m_data), buf, 45);
}

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,14 +16,11 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_DNSRECORD_H
#define XMRIG_DNSRECORD_H
#pragma once
struct addrinfo;
struct sockaddr;
#include "base/tools/String.h"
@@ -33,28 +30,15 @@ namespace xmrig {
class DnsRecord
{
public:
enum Type : uint32_t {
Unknown,
A,
AAAA
};
DnsRecord() {}
DnsRecord(const addrinfo *addr);
const sockaddr *addr(uint16_t port = 0) const;
String ip() const;
inline bool isValid() const { return m_type != Unknown; }
inline Type type() const { return m_type; }
private:
mutable uint8_t m_data[28]{};
const Type m_type = Unknown;
};
} /* namespace xmrig */
#endif /* XMRIG_DNSRECORD_H */
} // namespace xmrig

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -18,90 +18,96 @@
#include <uv.h>
#include "base/net/dns/DnsRecords.h"
#include "base/net/dns/Dns.h"
const xmrig::DnsRecord &xmrig::DnsRecords::get(DnsRecord::Type prefered) const
namespace {
static size_t dns_records_count(const addrinfo *res, int &ai_family)
{
size_t ipv4 = 0;
size_t ipv6 = 0;
while (res != nullptr) {
if (res->ai_family == AF_INET) {
++ipv4;
}
if (res->ai_family == AF_INET6) {
++ipv6;
}
res = res->ai_next;
}
if (ai_family == AF_INET6 && !ipv6) {
ai_family = AF_INET;
}
switch (ai_family) {
case AF_UNSPEC:
return ipv4 + ipv6;
case AF_INET:
return ipv4;
case AF_INET6:
return ipv6;
default:
break;
}
return 0;
}
} // namespace
xmrig::DnsRecords::DnsRecords(const addrinfo *res, int ai_family)
{
size_t size = dns_records_count(res, ai_family);
if (!size) {
return;
}
m_records.reserve(size);
if (ai_family == AF_UNSPEC) {
while (res != nullptr) {
if (res->ai_family == AF_INET || res->ai_family == AF_INET6) {
m_records.emplace_back(res);
}
res = res->ai_next;
};
} else {
while (res != nullptr) {
if (res->ai_family == ai_family) {
m_records.emplace_back(res);
}
res = res->ai_next;
};
}
size = m_records.size();
if (size > 1) {
m_index = static_cast<size_t>(rand()) % size; // NOLINT(concurrency-mt-unsafe, cert-msc30-c, cert-msc50-cpp)
}
}
const xmrig::DnsRecord &xmrig::DnsRecords::get() const
{
static const DnsRecord defaultRecord;
if (isEmpty()) {
return defaultRecord;
}
const size_t ipv4 = m_ipv4.size();
const size_t ipv6 = m_ipv6.size();
if (ipv6 && (prefered == DnsRecord::AAAA || Dns::config().isIPv6() || !ipv4)) {
return m_ipv6[ipv6 == 1 ? 0 : static_cast<size_t>(rand()) % ipv6]; // NOLINT(concurrency-mt-unsafe, cert-msc30-c, cert-msc50-cpp)
}
if (ipv4) {
return m_ipv4[ipv4 == 1 ? 0 : static_cast<size_t>(rand()) % ipv4]; // NOLINT(concurrency-mt-unsafe, cert-msc30-c, cert-msc50-cpp)
const size_t size = m_records.size();
if (size > 0) {
return m_records[m_index++ % size];
}
return defaultRecord;
}
size_t xmrig::DnsRecords::count(DnsRecord::Type type) const
{
if (type == DnsRecord::A) {
return m_ipv4.size();
}
if (type == DnsRecord::AAAA) {
return m_ipv6.size();
}
return m_ipv4.size() + m_ipv6.size();
}
void xmrig::DnsRecords::clear()
{
m_ipv4.clear();
m_ipv6.clear();
}
void xmrig::DnsRecords::parse(addrinfo *res)
{
clear();
addrinfo *ptr = res;
size_t ipv4 = 0;
size_t ipv6 = 0;
while (ptr != nullptr) {
if (ptr->ai_family == AF_INET) {
++ipv4;
}
else if (ptr->ai_family == AF_INET6) {
++ipv6;
}
ptr = ptr->ai_next;
}
if (ipv4 == 0 && ipv6 == 0) {
return;
}
m_ipv4.reserve(ipv4);
m_ipv6.reserve(ipv6);
ptr = res;
while (ptr != nullptr) {
if (ptr->ai_family == AF_INET) {
m_ipv4.emplace_back(ptr);
}
else if (ptr->ai_family == AF_INET6) {
m_ipv6.emplace_back(ptr);
}
ptr = ptr->ai_next;
}
}

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,9 +16,7 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_DNSRECORDS_H
#define XMRIG_DNSRECORDS_H
#pragma once
#include "base/net/dns/DnsRecord.h"
@@ -29,20 +27,19 @@ namespace xmrig {
class DnsRecords
{
public:
inline bool isEmpty() const { return m_ipv4.empty() && m_ipv6.empty(); }
DnsRecords() = default;
DnsRecords(const addrinfo *res, int ai_family);
const DnsRecord &get(DnsRecord::Type prefered = DnsRecord::Unknown) const;
size_t count(DnsRecord::Type type = DnsRecord::Unknown) const;
void clear();
void parse(addrinfo *res);
inline bool isEmpty() const { return m_records.empty(); }
inline const std::vector<DnsRecord> &records() const { return m_records; }
inline size_t size() const { return m_records.size(); }
const DnsRecord &get() const;
private:
std::vector<DnsRecord> m_ipv4;
std::vector<DnsRecord> m_ipv6;
mutable size_t m_index = 0;
std::vector<DnsRecord> m_records;
};
} /* namespace xmrig */
#endif /* XMRIG_DNSRECORDS_H */
} // namespace xmrig

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,35 +16,30 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_DNSREQUEST_H
#define XMRIG_DNSREQUEST_H
#pragma once
#include "base/tools/Object.h"
#include <cstdint>
#include "base/kernel/interfaces/IDnsListener.h"
namespace xmrig {
class IDnsListener;
class DnsRequest
class DnsRequest : public IDnsListener
{
public:
XMRIG_DISABLE_COPY_MOVE_DEFAULT(DnsRequest)
DnsRequest(IDnsListener *listener) : listener(listener) {}
~DnsRequest() = default;
inline DnsRequest(IDnsListener *listener) : m_listener(listener) {}
~DnsRequest() override = default;
IDnsListener *listener;
protected:
inline void onResolved(const DnsRecords &records, int status, const char *error) override {
m_listener->onResolved(records, status, error);
}
private:
IDnsListener *m_listener;
};
} /* namespace xmrig */
#endif /* XMRIG_DNSREQUEST_H */
} // namespace xmrig

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2023 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2023 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,13 +16,11 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <uv.h>
#include "base/net/dns/DnsUvBackend.h"
#include "base/kernel/interfaces/IDnsListener.h"
#include "base/net/dns/DnsRequest.h"
#include "base/net/dns/DnsConfig.h"
#include "base/tools/Chrono.h"
@@ -73,21 +71,23 @@ xmrig::DnsUvBackend::~DnsUvBackend()
}
std::shared_ptr<xmrig::DnsRequest> xmrig::DnsUvBackend::resolve(const String &host, IDnsListener *listener, uint64_t ttl)
void xmrig::DnsUvBackend::resolve(const String &host, const std::weak_ptr<IDnsListener> &listener, const DnsConfig &config)
{
auto req = std::make_shared<DnsRequest>(listener);
m_queue.emplace_back(listener);
if (Chrono::currentMSecsSinceEpoch() - m_ts <= ttl && !m_records.isEmpty()) {
req->listener->onResolved(m_records, 0, nullptr);
} else {
m_queue.emplace(req);
if (Chrono::currentMSecsSinceEpoch() - m_ts <= config.ttl()) {
return notify();
}
if (m_queue.size() == 1 && !resolve(host)) {
done();
if (m_req) {
return;
}
return req;
m_ai_family = config.ai_family();
if (!resolve(host)) {
notify();
}
}
@@ -102,44 +102,46 @@ bool xmrig::DnsUvBackend::resolve(const String &host)
}
void xmrig::DnsUvBackend::done()
void xmrig::DnsUvBackend::notify()
{
const char *error = m_status < 0 ? uv_strerror(m_status) : nullptr;
while (!m_queue.empty()) {
auto req = std::move(m_queue.front()).lock();
if (req) {
req->listener->onResolved(m_records, m_status, error);
for (const auto &l : m_queue) {
auto listener = l.lock();
if (listener) {
listener->onResolved(m_records, m_status, error);
}
m_queue.pop();
}
m_queue.clear();
m_req.reset();
}
void xmrig::DnsUvBackend::onResolved(int status, addrinfo *res)
{
m_ts = Chrono::currentMSecsSinceEpoch();
m_status = status;
m_ts = Chrono::currentMSecsSinceEpoch();
if ((m_status = status) < 0) {
return done();
if (m_status < 0) {
m_records = {};
return notify();
}
m_records.parse(res);
m_records = { res, m_ai_family };
if (m_records.isEmpty()) {
m_status = UV_EAI_NONAME;
}
done();
notify();
}
void xmrig::DnsUvBackend::onResolved(uv_getaddrinfo_t *req, int status, addrinfo *res)
{
auto backend = getStorage().get(req->data);
auto *backend = getStorage().get(req->data);
if (backend) {
backend->onResolved(status, res);
}

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -16,16 +16,13 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_DNSUVBACKEND_H
#define XMRIG_DNSUVBACKEND_H
#pragma once
#include "base/kernel/interfaces/IDnsBackend.h"
#include "base/net/dns/DnsRecords.h"
#include "base/net/tools/Storage.h"
#include <queue>
#include <deque>
using uv_getaddrinfo_t = struct uv_getaddrinfo_s;
@@ -43,20 +40,19 @@ public:
~DnsUvBackend() override;
protected:
inline const DnsRecords &records() const override { return m_records; }
std::shared_ptr<DnsRequest> resolve(const String &host, IDnsListener *listener, uint64_t ttl) override;
void resolve(const String &host, const std::weak_ptr<IDnsListener> &listener, const DnsConfig &config) override;
private:
bool resolve(const String &host);
void done();
void notify();
void onResolved(int status, addrinfo *res);
static void onResolved(uv_getaddrinfo_t *req, int status, addrinfo *res);
DnsRecords m_records;
int m_ai_family = 0;
int m_status = 0;
std::queue<std::weak_ptr<DnsRequest> > m_queue;
std::deque<std::weak_ptr<IDnsListener>> m_queue;
std::shared_ptr<uv_getaddrinfo_t> m_req;
uint64_t m_ts = 0;
uintptr_t m_key;
@@ -66,7 +62,4 @@ private:
};
} /* namespace xmrig */
#endif /* XMRIG_DNSUVBACKEND_H */
} // namespace xmrig

View File

@@ -81,7 +81,7 @@ xmrig::Client::Client(int id, const char *agent, IClientListener *listener) :
BaseClient(id, listener),
m_agent(agent),
m_sendBuf(1024),
m_tempBuf(256)
m_tempBuf(320)
{
m_reader.setListener(this);
m_key = m_storage.add(this);
@@ -199,6 +199,7 @@ int64_t xmrig::Client::submit(const JobResult &result)
char *nonce = m_tempBuf.data();
char *data = m_tempBuf.data() + 16;
char *signature = m_tempBuf.data() + 88;
char *commitment = m_tempBuf.data() + 224;
Cvt::toHex(nonce, sizeof(uint32_t) * 2 + 1, reinterpret_cast<const uint8_t *>(&result.nonce), sizeof(uint32_t));
Cvt::toHex(data, 65, result.result(), 32);
@@ -206,6 +207,10 @@ int64_t xmrig::Client::submit(const JobResult &result)
if (result.minerSignature()) {
Cvt::toHex(signature, 129, result.minerSignature(), 64);
}
if (result.commitment()) {
Cvt::toHex(commitment, 65, result.commitment(), 32);
}
# endif
Document doc(kObjectType);
@@ -227,6 +232,16 @@ int64_t xmrig::Client::submit(const JobResult &result)
}
# endif
# ifndef XMRIG_PROXY_PROJECT
if (result.commitment()) {
params.AddMember("commitment", StringRef(commitment), allocator);
}
# else
if (result.commitment) {
params.AddMember("commitment", StringRef(result.commitment), allocator);
}
# endif
if (has<EXT_ALGO>() && result.algorithm.isValid()) {
params.AddMember("algo", StringRef(result.algorithm.name()), allocator);
}
@@ -554,6 +569,7 @@ int64_t xmrig::Client::send(size_t size)
}
m_expire = Chrono::steadyMSecs() + kResponseTimeout;
startTimeout();
return m_sequence++;
}
@@ -661,8 +677,6 @@ void xmrig::Client::onClose()
void xmrig::Client::parse(char *line, size_t len)
{
startTimeout();
LOG_DEBUG("[%s] received (%d bytes): \"%.*s\"", url(), len, static_cast<int>(len), line);
if (len < 22 || line[0] != '{') {
@@ -857,8 +871,6 @@ void xmrig::Client::parseResponse(int64_t id, const rapidjson::Value &result, co
void xmrig::Client::ping()
{
send(snprintf(m_sendBuf.data(), m_sendBuf.size(), "{\"id\":%" PRId64 ",\"jsonrpc\":\"2.0\",\"method\":\"keepalived\",\"params\":{\"id\":\"%s\"}}\n", m_sequence, m_rpcId.data()));
m_keepAlive = 0;
}

View File

@@ -410,6 +410,7 @@ bool xmrig::DaemonClient::parseJob(const rapidjson::Value &params, int *code)
m_blocktemplate.offset(BlockTemplate::TX_EXTRA_NONCE_OFFSET) - k,
m_blocktemplate.txExtraNonce().size(),
m_blocktemplate.minerTxMerkleTreeBranch(),
m_blocktemplate.minerTxMerkleTreePath(),
m_blocktemplate.outputType() == 3
);
# endif

View File

@@ -269,6 +269,7 @@ void xmrig::Job::copy(const Job &other)
m_minerTxExtraNonceOffset = other.m_minerTxExtraNonceOffset;
m_minerTxExtraNonceSize = other.m_minerTxExtraNonceSize;
m_minerTxMerkleTreeBranch = other.m_minerTxMerkleTreeBranch;
m_minerTxMerkleTreePath = other.m_minerTxMerkleTreePath;
m_hasViewTag = other.m_hasViewTag;
# else
memcpy(m_ephPublicKey, other.m_ephPublicKey, sizeof(m_ephPublicKey));
@@ -325,6 +326,7 @@ void xmrig::Job::move(Job &&other)
m_minerTxExtraNonceOffset = other.m_minerTxExtraNonceOffset;
m_minerTxExtraNonceSize = other.m_minerTxExtraNonceSize;
m_minerTxMerkleTreeBranch = std::move(other.m_minerTxMerkleTreeBranch);
m_minerTxMerkleTreePath = other.m_minerTxMerkleTreePath;
m_hasViewTag = other.m_hasViewTag;
# else
memcpy(m_ephPublicKey, other.m_ephPublicKey, sizeof(m_ephPublicKey));
@@ -349,7 +351,7 @@ void xmrig::Job::setSpendSecretKey(const uint8_t *key)
}
void xmrig::Job::setMinerTx(const uint8_t *begin, const uint8_t *end, size_t minerTxEphPubKeyOffset, size_t minerTxPubKeyOffset, size_t minerTxExtraNonceOffset, size_t minerTxExtraNonceSize, const Buffer &minerTxMerkleTreeBranch, bool hasViewTag)
void xmrig::Job::setMinerTx(const uint8_t *begin, const uint8_t *end, size_t minerTxEphPubKeyOffset, size_t minerTxPubKeyOffset, size_t minerTxExtraNonceOffset, size_t minerTxExtraNonceSize, const Buffer &minerTxMerkleTreeBranch, uint32_t minerTxMerkleTreePath, bool hasViewTag)
{
m_minerTxPrefix.assign(begin, end);
m_minerTxEphPubKeyOffset = minerTxEphPubKeyOffset;
@@ -357,6 +359,7 @@ void xmrig::Job::setMinerTx(const uint8_t *begin, const uint8_t *end, size_t min
m_minerTxExtraNonceOffset = minerTxExtraNonceOffset;
m_minerTxExtraNonceSize = minerTxExtraNonceSize;
m_minerTxMerkleTreeBranch = minerTxMerkleTreeBranch;
m_minerTxMerkleTreePath = minerTxMerkleTreePath;
m_hasViewTag = hasViewTag;
}
@@ -401,7 +404,7 @@ void xmrig::Job::generateHashingBlob(String &blob) const
{
uint8_t root_hash[32];
const uint8_t* p = m_minerTxPrefix.data();
BlockTemplate::calculateRootHash(p, p + m_minerTxPrefix.size(), m_minerTxMerkleTreeBranch, root_hash);
BlockTemplate::calculateRootHash(p, p + m_minerTxPrefix.size(), m_minerTxMerkleTreeBranch, m_minerTxMerkleTreePath, root_hash);
uint64_t root_hash_offset = nonceOffset() + nonceSize();

View File

@@ -121,7 +121,7 @@ public:
inline bool hasViewTag() const { return m_hasViewTag; }
void setSpendSecretKey(const uint8_t* key);
void setMinerTx(const uint8_t* begin, const uint8_t* end, size_t minerTxEphPubKeyOffset, size_t minerTxPubKeyOffset, size_t minerTxExtraNonceOffset, size_t minerTxExtraNonceSize, const Buffer& minerTxMerkleTreeBranch, bool hasViewTag);
void setMinerTx(const uint8_t* begin, const uint8_t* end, size_t minerTxEphPubKeyOffset, size_t minerTxPubKeyOffset, size_t minerTxExtraNonceOffset, size_t minerTxExtraNonceSize, const Buffer& minerTxMerkleTreeBranch, uint32_t minerTxMerkleTreePath, bool hasViewTag);
void setViewTagInMinerTx(uint8_t view_tag);
void setExtraNonceInMinerTx(uint32_t extra_nonce);
void generateSignatureData(String& signatureData, uint8_t& view_tag) const;
@@ -179,6 +179,7 @@ private:
size_t m_minerTxExtraNonceOffset = 0;
size_t m_minerTxExtraNonceSize = 0;
Buffer m_minerTxMerkleTreeBranch;
uint32_t m_minerTxMerkleTreePath = 0;
bool m_hasViewTag = false;
# else
// Miner signatures

View File

@@ -1,7 +1,7 @@
/* XMRig
* Copyright (c) 2018 Lee Clagett <https://github.com/vtnerd>
* Copyright (c) 2018-2023 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2023 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -45,7 +45,7 @@ namespace xmrig {
// https://wiki.openssl.org/index.php/Diffie-Hellman_parameters
#if OPENSSL_VERSION_NUMBER < 0x30000000L || defined(LIBRESSL_VERSION_NUMBER)
#if OPENSSL_VERSION_NUMBER < 0x30000000L || (defined(LIBRESSL_VERSION_NUMBER) && !defined(LIBRESSL_HAS_TLS1_3))
static DH *get_dh2048()
{
static unsigned char dhp_2048[] = {
@@ -152,7 +152,7 @@ bool xmrig::TlsContext::load(const TlsConfig &config)
SSL_CTX_set_options(m_ctx, SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
SSL_CTX_set_options(m_ctx, SSL_OP_CIPHER_SERVER_PREFERENCE);
# if OPENSSL_VERSION_NUMBER >= 0x1010100fL && !defined(LIBRESSL_VERSION_NUMBER)
# if OPENSSL_VERSION_NUMBER >= 0x1010100fL || defined(LIBRESSL_HAS_TLS1_3)
SSL_CTX_set_max_early_data(m_ctx, 0);
# endif
@@ -180,7 +180,7 @@ bool xmrig::TlsContext::setCipherSuites(const char *ciphersuites)
return true;
}
# if OPENSSL_VERSION_NUMBER >= 0x1010100fL && !defined(LIBRESSL_VERSION_NUMBER)
# if OPENSSL_VERSION_NUMBER >= 0x1010100fL || defined(LIBRESSL_HAS_TLS1_3)
if (SSL_CTX_set_ciphersuites(m_ctx, ciphersuites) == 1) {
return true;
}
@@ -194,7 +194,7 @@ bool xmrig::TlsContext::setCipherSuites(const char *ciphersuites)
bool xmrig::TlsContext::setDH(const char *dhparam)
{
# if OPENSSL_VERSION_NUMBER < 0x30000000L || defined(LIBRESSL_VERSION_NUMBER)
# if OPENSSL_VERSION_NUMBER < 0x30000000L || (defined(LIBRESSL_VERSION_NUMBER) && !defined(LIBRESSL_HAS_TLS1_3))
DH *dh = nullptr;
if (dhparam != nullptr) {

View File

@@ -48,69 +48,98 @@ void xmrig::BlockTemplate::calculateMinerTxHash(const uint8_t *prefix_begin, con
}
void xmrig::BlockTemplate::calculateRootHash(const uint8_t *prefix_begin, const uint8_t *prefix_end, const Buffer &miner_tx_merkle_tree_branch, uint8_t *root_hash)
void xmrig::BlockTemplate::calculateRootHash(const uint8_t *prefix_begin, const uint8_t *prefix_end, const Buffer &miner_tx_merkle_tree_branch, uint32_t miner_tx_merkle_tree_path, uint8_t *root_hash)
{
calculateMinerTxHash(prefix_begin, prefix_end, root_hash);
for (size_t i = 0; i < miner_tx_merkle_tree_branch.size(); i += kHashSize) {
const size_t depth = miner_tx_merkle_tree_branch.size() / kHashSize;
for (size_t d = 0; d < depth; ++d) {
uint8_t h[kHashSize * 2];
memcpy(h, root_hash, kHashSize);
memcpy(h + kHashSize, miner_tx_merkle_tree_branch.data() + i, kHashSize);
const uint32_t t = (miner_tx_merkle_tree_path >> (depth - d - 1)) & 1;
memcpy(h + kHashSize * t, root_hash, kHashSize);
memcpy(h + kHashSize * (t ^ 1), miner_tx_merkle_tree_branch.data() + d * kHashSize, kHashSize);
keccak(h, kHashSize * 2, root_hash, kHashSize);
}
}
void xmrig::BlockTemplate::calculateMerkleTreeHash()
void xmrig::BlockTemplate::calculateMerkleTreeHash(uint32_t index)
{
m_minerTxMerkleTreeBranch.clear();
m_minerTxMerkleTreePath = 0;
const uint64_t count = m_numHashes + 1;
const size_t count = m_hashes.size() / kHashSize;
const uint8_t *h = m_hashes.data();
if (count == 1) {
if (count == 1) {
memcpy(m_rootHash, h, kHashSize);
}
else if (count == 2) {
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), h + kHashSize, h + kHashSize * 2);
}
else if (count == 2) {
keccak(h, kHashSize * 2, m_rootHash, kHashSize);
}
else {
size_t i = 0;
size_t j = 0;
size_t cnt = 0;
for (i = 0, cnt = 1; cnt <= count; ++i, cnt <<= 1) {}
m_minerTxMerkleTreeBranch.reserve(1);
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), h + kHashSize * (index ^ 1), h + kHashSize * ((index ^ 1) + 1));
m_minerTxMerkleTreePath = static_cast<uint32_t>(index);
}
else {
uint8_t h2[kHashSize];
memcpy(h2, h + kHashSize * index, kHashSize);
cnt >>= 1;
size_t cnt = 1, proof_max_size = 0;
do {
cnt <<= 1;
++proof_max_size;
} while (cnt <= count);
cnt >>= 1;
m_minerTxMerkleTreeBranch.reserve(kHashSize * (i - 1));
m_minerTxMerkleTreeBranch.reserve(proof_max_size);
Buffer ints(cnt * kHashSize);
memcpy(ints.data(), h, (cnt * 2 - count) * kHashSize);
for (i = cnt * 2 - count, j = cnt * 2 - count; j < cnt; i += 2, ++j) {
if (i == 0) {
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), h + kHashSize, h + kHashSize * 2);
}
keccak(h + i * kHashSize, kHashSize * 2, ints.data() + j * kHashSize, kHashSize);
}
const size_t k = cnt * 2 - count;
memcpy(ints.data(), h, k * kHashSize);
while (cnt > 2) {
cnt >>= 1;
for (i = 0, j = 0; j < cnt; i += 2, ++j) {
if (i == 0) {
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), ints.data() + kHashSize, ints.data() + kHashSize * 2);
}
keccak(ints.data() + i * kHashSize, kHashSize * 2, ints.data() + j * kHashSize, kHashSize);
}
}
for (size_t i = k, j = k; j < cnt; i += 2, ++j) {
keccak(h + i * kHashSize, kHashSize * 2, ints.data() + j * kHashSize, kHashSize);
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), ints.data() + kHashSize, ints.data() + kHashSize * 2);
keccak(ints.data(), kHashSize * 2, m_rootHash, kHashSize);
}
if (memcmp(h + i * kHashSize, h2, kHashSize) == 0) {
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), h + kHashSize * (i + 1), h + kHashSize * (i + 2));
memcpy(h2, ints.data() + j * kHashSize, kHashSize);
}
else if (memcmp(h + (i + 1) * kHashSize, h2, kHashSize) == 0) {
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), h + kHashSize * i, h + kHashSize * (i + 1));
memcpy(h2, ints.data() + j * kHashSize, kHashSize);
m_minerTxMerkleTreePath = 1;
}
}
while (cnt >= 2) {
cnt >>= 1;
for (size_t i = 0, j = 0; j < cnt; i += 2, ++j) {
uint8_t tmp[kHashSize];
keccak(ints.data() + i * kHashSize, kHashSize * 2, tmp, kHashSize);
if (memcmp(ints.data() + i * kHashSize, h2, kHashSize) == 0) {
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), ints.data() + kHashSize * (i + 1), ints.data() + kHashSize * (i + 2));
memcpy(h2, tmp, kHashSize);
m_minerTxMerkleTreePath <<= 1;
}
else if (memcmp(ints.data() + (i + 1) * kHashSize, h2, kHashSize) == 0) {
m_minerTxMerkleTreeBranch.insert(m_minerTxMerkleTreeBranch.end(), ints.data() + kHashSize * i, ints.data() + kHashSize * (i + 1));
memcpy(h2, tmp, kHashSize);
m_minerTxMerkleTreePath = (m_minerTxMerkleTreePath << 1) | 1;
}
memcpy(ints.data() + j * kHashSize, tmp, kHashSize);
}
}
memcpy(m_rootHash, ints.data(), kHashSize);
}
}
@@ -241,8 +270,13 @@ bool xmrig::BlockTemplate::parse(bool hashes)
ar(m_amount);
ar(m_outputType);
// output type must be txout_to_key (2) or txout_to_tagged_key (3)
if ((m_outputType != 2) && (m_outputType != 3)) {
const bool is_fcmp_pp = (m_coin == Coin::MONERO) && (m_version.first >= 17);
// output type must be txout_to_key (2) or txout_to_tagged_key (3) for versions < 17, and txout_to_carrot_v1 (0) for version FCMP++
if (is_fcmp_pp && (m_outputType == 0)) {
// all good
}
else if ((m_outputType != 2) && (m_outputType != 3)) {
return false;
}
@@ -250,6 +284,11 @@ bool xmrig::BlockTemplate::parse(bool hashes)
ar(m_ephPublicKey, kKeySize);
if (is_fcmp_pp) {
ar(m_carrotViewTag);
ar(m_janusAnchor);
}
if (m_coin == Coin::ZEPHYR) {
if (m_outputType != 2) {
return false;
@@ -365,16 +404,42 @@ bool xmrig::BlockTemplate::parse(bool hashes)
ar(m_numHashes);
if (hashes) {
m_hashes.resize((m_numHashes + 1) * kHashSize);
calculateMinerTxHash(blob(MINER_TX_PREFIX_OFFSET), blob(MINER_TX_PREFIX_END_OFFSET), m_hashes.data());
// FCMP++ layout:
//
// index 0 fcmp_pp_n_tree_layers + 31 zero bytes
// index 1 fcmp_pp_tree_root
// index 2 coinbase transaction hash
// index 3+ other transaction hashes
//
// pre-FCMP++ layout:
//
// index 0 coinbase transaction hash
// index 1+ other transaction hashes
//
const uint32_t coinbase_tx_index = is_fcmp_pp ? 2 : 0;
m_hashes.clear();
m_hashes.resize((coinbase_tx_index + m_numHashes + 1) * kHashSize);
uint8_t* data = m_hashes.data() + coinbase_tx_index * kHashSize;
calculateMinerTxHash(blob(MINER_TX_PREFIX_OFFSET), blob(MINER_TX_PREFIX_END_OFFSET), data);
for (uint64_t i = 1; i <= m_numHashes; ++i) {
Span h;
ar(h, kHashSize);
memcpy(m_hashes.data() + i * kHashSize, h.data(), kHashSize);
memcpy(data + i * kHashSize, h.data(), kHashSize);
}
calculateMerkleTreeHash();
if (is_fcmp_pp) {
ar(m_FCMPTreeLayers);
ar(m_FCMPTreeRoot);
m_hashes[0] = m_FCMPTreeLayers;
memcpy(m_hashes.data() + kHashSize, m_FCMPTreeRoot, kHashSize);
}
calculateMerkleTreeHash(coinbase_tx_index);
}
return true;
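
To make the layout comment above concrete: under FCMP++ every transaction hash simply moves up by coinbase_tx_index slots, with slot 0 holding the tree-layer count and slot 1 the FCMP++ tree root. A tiny helper illustrating the offset arithmetic (hypothetical, not part of the parser; assumes the same 32-byte hash size):

#include <cstddef>
#include <cstdint>

// Byte offset of the i-th non-coinbase transaction hash inside m_hashes (i starts at 1).
inline size_t txHashOffset(bool isFcmpPP, uint64_t i)
{
    constexpr size_t kHashSize = 32;
    const size_t coinbaseTxIndex = isFcmpPP ? 2 : 0; // FCMP++: slot 0 = layer count, slot 1 = tree root
    return (coinbaseTxIndex + i) * kHashSize;        // slot coinbaseTxIndex itself holds the coinbase tx hash
}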

View File

@@ -93,6 +93,7 @@ public:
inline uint64_t numHashes() const { return m_numHashes; }
inline const Buffer &hashes() const { return m_hashes; }
inline const Buffer &minerTxMerkleTreeBranch() const { return m_minerTxMerkleTreeBranch; }
inline uint32_t minerTxMerkleTreePath() const { return m_minerTxMerkleTreePath; }
inline const uint8_t *rootHash() const { return m_rootHash; }
inline Buffer generateHashingBlob() const
@@ -104,13 +105,13 @@ public:
}
static void calculateMinerTxHash(const uint8_t *prefix_begin, const uint8_t *prefix_end, uint8_t *hash);
static void calculateRootHash(const uint8_t *prefix_begin, const uint8_t *prefix_end, const Buffer &miner_tx_merkle_tree_branch, uint8_t *root_hash);
static void calculateRootHash(const uint8_t *prefix_begin, const uint8_t *prefix_end, const Buffer &miner_tx_merkle_tree_branch, uint32_t miner_tx_merkle_tree_path, uint8_t *root_hash);
bool parse(const Buffer &blocktemplate, const Coin &coin, bool hashes = kCalcHashes);
bool parse(const char *blocktemplate, size_t size, const Coin &coin, bool hashes);
bool parse(const rapidjson::Value &blocktemplate, const Coin &coin, bool hashes = kCalcHashes);
bool parse(const String &blocktemplate, const Coin &coin, bool hashes = kCalcHashes);
void calculateMerkleTreeHash();
void calculateMerkleTreeHash(uint32_t index);
void generateHashingBlob(Buffer &out) const;
private:
@@ -147,7 +148,12 @@ private:
uint64_t m_numHashes = 0;
Buffer m_hashes;
Buffer m_minerTxMerkleTreeBranch;
uint32_t m_minerTxMerkleTreePath = 0;
uint8_t m_rootHash[kHashSize]{};
uint8_t m_carrotViewTag[3]{};
uint8_t m_janusAnchor[16]{};
uint8_t m_FCMPTreeLayers = 0;
uint8_t m_FCMPTreeRoot[kHashSize]{};
};

View File

@@ -93,7 +93,7 @@
"dhparam": null
},
"dns": {
"ipv6": false,
"ip_version": 0,
"ttl": 30
},
"user-agent": null,

View File

@@ -59,6 +59,7 @@
# include "crypto/rx/Profiler.h"
# include "crypto/rx/Rx.h"
# include "crypto/rx/RxConfig.h"
# include "crypto/rx/RxAlgo.h"
#endif
@@ -556,11 +557,12 @@ void xmrig::Miner::setJob(const Job &job, bool donate)
}
# ifdef XMRIG_ALGO_RANDOMX
if (job.algorithm().family() == Algorithm::RANDOM_X && !Rx::isReady(job)) {
if (job.algorithm().family() == Algorithm::RANDOM_X) {
if (d_ptr->algorithm != job.algorithm()) {
stop();
RxAlgo::apply(job.algorithm());
}
else {
else if (!Rx::isReady(job)) {
Nonce::pause(true);
Nonce::touch();
}

View File

@@ -1,24 +1,11 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2026 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2026 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
* SPDX-License-Identifier: GPL-3.0-or-later
*/
#ifndef XMRIG_CONFIG_PLATFORM_H
#define XMRIG_CONFIG_PLATFORM_H
#pragma once
#ifdef _MSC_VER
# include "getopt/getopt.h"
@@ -28,13 +15,12 @@
#include "base/kernel/interfaces/IConfig.h"
#include "version.h"
namespace xmrig {
static const char short_options[] = "a:c:kBp:Px:r:R:s:t:T:o:u:O:v:l:Sx:";
static const char short_options[] = "a:c:kBp:x:r:R:s:t:T:o:u:O:v:l:S46";
static const option options[] = {
@@ -99,7 +85,8 @@ static const option options[] = {
{ "no-title", 0, nullptr, IConfig::NoTitleKey },
{ "pause-on-battery", 0, nullptr, IConfig::PauseOnBatteryKey },
{ "pause-on-active", 1, nullptr, IConfig::PauseOnActiveKey },
{ "dns-ipv6", 0, nullptr, IConfig::DnsIPv6Key },
{ "ipv4", 0, nullptr, IConfig::DnsIPv4Key },
{ "ipv6", 0, nullptr, IConfig::DnsIPv6Key },
{ "dns-ttl", 1, nullptr, IConfig::DnsTtlKey },
{ "spend-secret-key", 1, nullptr, IConfig::SpendSecretKey },
# ifdef XMRIG_FEATURE_BENCHMARK
@@ -169,6 +156,3 @@ static const option options[] = {
} // namespace xmrig
#endif /* XMRIG_CONFIG_PLATFORM_H */

View File

@@ -4,8 +4,8 @@
* Copyright (c) 2014 Lucas Jones <https://github.com/lucasjones>
* Copyright (c) 2014-2016 Wolf9466 <https://github.com/OhGodAPet>
* Copyright (c) 2016 Jay D Dee <jayddee246@gmail.com>
* Copyright (c) 2018-2024 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2024 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -21,13 +21,10 @@
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef XMRIG_USAGE_H
#define XMRIG_USAGE_H
#pragma once
#include "version.h"
#include <string>
@@ -59,7 +56,8 @@ static inline const std::string &usage()
u += " --tls-fingerprint=HEX pool TLS certificate fingerprint for strict certificate pinning\n";
# endif
u += " --dns-ipv6 prefer IPv6 records from DNS responses\n";
u += " -4, --ipv4 resolve names to IPv4 addresses\n";
u += " -6, --ipv6 resolve names to IPv6 addresses\n";
u += " --dns-ttl=N N seconds (default: 30) TTL for internal DNS cache\n";
# ifdef XMRIG_FEATURE_HTTP
@@ -205,6 +203,4 @@ static inline const std::string &usage()
}
} /* namespace xmrig */
#endif /* XMRIG_USAGE_H */
} // namespace xmrig
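
A quick illustration of the new resolver flags (the pool host and wallet below are placeholders; -o and -u are the existing pool URL and user options, and -6 is the short form of --ipv6):

./xmrig -o pool.example.com:3333 -u YOUR_WALLET --ipv4
./xmrig -o pool.example.com:3333 -u YOUR_WALLET -6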

View File

@@ -23,7 +23,7 @@
#include "crypto/common/VirtualMemory.h"
#if defined(XMRIG_ARM)
#if defined(XMRIG_ARM) || defined(XMRIG_RISCV)
# include "crypto/cn/CryptoNight_arm.h"
#else
# include "crypto/cn/CryptoNight_x86.h"

View File

@@ -30,7 +30,7 @@
#include <stddef.h>
#include <stdint.h>
#if defined _MSC_VER || defined XMRIG_ARM
#if defined _MSC_VER || defined XMRIG_ARM || defined XMRIG_RISCV
# define ABI_ATTRIBUTE
#else
# define ABI_ATTRIBUTE __attribute__((ms_abi))

View File

@@ -27,6 +27,9 @@
#ifndef XMRIG_CRYPTONIGHT_ARM_H
#define XMRIG_CRYPTONIGHT_ARM_H
#ifdef XMRIG_RISCV
# include "crypto/cn/sse2rvv.h"
#endif
#include "base/crypto/keccak.h"
#include "crypto/cn/CnAlgo.h"

View File

@@ -30,7 +30,7 @@
#include <math.h>
// VARIANT ALTERATIONS
#ifndef XMRIG_ARM
#if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
# define VARIANT1_INIT(part) \
uint64_t tweak1_2_##part = 0; \
if (BASE == Algorithm::CN_1) { \
@@ -60,7 +60,7 @@
}
#ifndef XMRIG_ARM
#if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
# define VARIANT2_INIT(part) \
__m128i division_result_xmm_##part = _mm_cvtsi64_si128(static_cast<int64_t>(h##part[12])); \
__m128i sqrt_result_xmm_##part = _mm_cvtsi64_si128(static_cast<int64_t>(h##part[13]));

View File

@@ -235,7 +235,7 @@ static HashReturn Init(hashState *state, int hashbitlen)
/*initialize the initial hash value of JH*/
state->hashbitlen = hashbitlen;
/*load the intital hash value into state*/
/*load the initial hash value into state*/
switch (hashbitlen)
{
case 224: memcpy(state->x,JH224_H0,128); break;

View File

@@ -48,7 +48,7 @@
multiple of size / 8)
ptr_cast(x,size) casts a pointer to a pointer to a
varaiable of length 'size' bits
variable of length 'size' bits
*/
#define ui_type(size) uint##size##_t

View File

@@ -29,6 +29,8 @@
#if defined(XMRIG_ARM)
# include "crypto/cn/sse2neon.h"
#elif defined(XMRIG_RISCV)
# include "crypto/cn/sse2rvv.h"
#elif defined(__GNUC__)
# include <x86intrin.h>
#else

src/crypto/cn/sse2rvv.h Normal file
View File

@@ -0,0 +1,748 @@
/* XMRig
* Copyright (c) 2025 Slayingripper <https://github.com/Slayingripper>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/*
* SSE to RISC-V Vector (RVV) optimized compatibility header
* Provides both scalar fallback and vectorized implementations using RVV intrinsics
*
* Based on sse2neon.h concepts, adapted for RISC-V architecture with RVV extensions
* Original sse2neon.h: https://github.com/DLTcollab/sse2neon
*/
#ifndef XMRIG_SSE2RVV_OPTIMIZED_H
#define XMRIG_SSE2RVV_OPTIMIZED_H
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#include <string.h>
/* Check if RVV is available */
#if defined(__riscv_vector)
#include <riscv_vector.h>
#define USE_RVV_INTRINSICS 1
#else
#define USE_RVV_INTRINSICS 0
#endif
/* 128-bit vector type */
typedef union {
uint8_t u8[16];
uint16_t u16[8];
uint32_t u32[4];
uint64_t u64[2];
int8_t i8[16];
int16_t i16[8];
int32_t i32[4];
int64_t i64[2];
} __m128i_union;
typedef __m128i_union __m128i;
/* Set operations */
static inline __m128i _mm_set_epi32(int e3, int e2, int e1, int e0)
{
__m128i result;
result.i32[0] = e0;
result.i32[1] = e1;
result.i32[2] = e2;
result.i32[3] = e3;
return result;
}
static inline __m128i _mm_set_epi64x(int64_t e1, int64_t e0)
{
__m128i result;
result.i64[0] = e0;
result.i64[1] = e1;
return result;
}
static inline __m128i _mm_setzero_si128(void)
{
__m128i result;
memset(&result, 0, sizeof(result));
return result;
}
/* Extract/insert operations */
static inline int _mm_cvtsi128_si32(__m128i a)
{
return a.i32[0];
}
static inline int64_t _mm_cvtsi128_si64(__m128i a)
{
return a.i64[0];
}
static inline __m128i _mm_cvtsi32_si128(int a)
{
__m128i result = _mm_setzero_si128();
result.i32[0] = a;
return result;
}
static inline __m128i _mm_cvtsi64_si128(int64_t a)
{
__m128i result = _mm_setzero_si128();
result.i64[0] = a;
return result;
}
/* Shuffle operations */
static inline __m128i _mm_shuffle_epi32(__m128i a, int imm8)
{
__m128i result;
result.u32[0] = a.u32[(imm8 >> 0) & 0x3];
result.u32[1] = a.u32[(imm8 >> 2) & 0x3];
result.u32[2] = a.u32[(imm8 >> 4) & 0x3];
result.u32[3] = a.u32[(imm8 >> 6) & 0x3];
return result;
}
/* Logical operations - optimized with RVV when available */
static inline __m128i _mm_xor_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vxor_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] ^ b.u64[0];
result.u64[1] = a.u64[1] ^ b.u64[1];
return result;
#endif
}
static inline __m128i _mm_or_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vor_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] | b.u64[0];
result.u64[1] = a.u64[1] | b.u64[1];
return result;
#endif
}
static inline __m128i _mm_and_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vand_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] & b.u64[0];
result.u64[1] = a.u64[1] & b.u64[1];
return result;
#endif
}
static inline __m128i _mm_andnot_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vnot_a = __riscv_vnot_v_u64m1(va, vl);
vuint64m1_t vr = __riscv_vand_vv_u64m1(vnot_a, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = (~a.u64[0]) & b.u64[0];
result.u64[1] = (~a.u64[1]) & b.u64[1];
return result;
#endif
}
/* Shift operations */
static inline __m128i _mm_slli_si128(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
size_t vl = __riscv_vsetvl_e8m1(16);
vuint8m1_t va = __riscv_vle8_v_u8m1(a.u8, vl);
vuint8m1_t vr = __riscv_vslideup_vx_u8m1(__riscv_vmv_v_x_u8m1(0, vl), va, count, vl);
__riscv_vse8_v_u8m1(result.u8, vr, vl);
return result;
#else
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
for (int i = 0; i < 16 - count; i++) {
result.u8[i + count] = a.u8[i];
}
return result;
#endif
}
static inline __m128i _mm_srli_si128(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
size_t vl = __riscv_vsetvl_e8m1(16);
vuint8m1_t va = __riscv_vle8_v_u8m1(a.u8, vl);
vuint8m1_t vr = __riscv_vslidedown_vx_u8m1(va, count, vl);
__riscv_vse8_v_u8m1(result.u8, vr, vl);
return result;
#else
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
for (int i = count; i < 16; i++) {
result.u8[i - count] = a.u8[i];
}
return result;
#endif
}
static inline __m128i _mm_slli_epi64(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vr = __riscv_vsll_vx_u64m1(va, imm8, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
result.u64[0] = a.u64[0] << imm8;
result.u64[1] = a.u64[1] << imm8;
}
return result;
#endif
}
static inline __m128i _mm_srli_epi64(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vr = __riscv_vsrl_vx_u64m1(va, imm8, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
result.u64[0] = a.u64[0] >> imm8;
result.u64[1] = a.u64[1] >> imm8;
}
return result;
#endif
}
/* Load/store operations - optimized with RVV */
static inline __m128i _mm_load_si128(const __m128i* p)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t v = __riscv_vle64_v_u64m1((const uint64_t*)p, vl);
__riscv_vse64_v_u64m1(result.u64, v, vl);
return result;
#else
__m128i result;
memcpy(&result, p, sizeof(__m128i));
return result;
#endif
}
static inline __m128i _mm_loadu_si128(const __m128i* p)
{
__m128i result;
memcpy(&result, p, sizeof(__m128i));
return result;
}
static inline void _mm_store_si128(__m128i* p, __m128i a)
{
#if USE_RVV_INTRINSICS
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t v = __riscv_vle64_v_u64m1(a.u64, vl);
__riscv_vse64_v_u64m1((uint64_t*)p, v, vl);
#else
memcpy(p, &a, sizeof(__m128i));
#endif
}
static inline void _mm_storeu_si128(__m128i* p, __m128i a)
{
memcpy(p, &a, sizeof(__m128i));
}
/* Arithmetic operations - optimized with RVV */
static inline __m128i _mm_add_epi64(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vadd_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] + b.u64[0];
result.u64[1] = a.u64[1] + b.u64[1];
return result;
#endif
}
static inline __m128i _mm_add_epi32(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e32m1(4);
vuint32m1_t va = __riscv_vle32_v_u32m1(a.u32, vl);
vuint32m1_t vb = __riscv_vle32_v_u32m1(b.u32, vl);
vuint32m1_t vr = __riscv_vadd_vv_u32m1(va, vb, vl);
__riscv_vse32_v_u32m1(result.u32, vr, vl);
return result;
#else
__m128i result;
for (int i = 0; i < 4; i++) {
result.i32[i] = a.i32[i] + b.i32[i];
}
return result;
#endif
}
static inline __m128i _mm_sub_epi64(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vsub_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] - b.u64[0];
result.u64[1] = a.u64[1] - b.u64[1];
return result;
#endif
}
static inline __m128i _mm_mul_epu32(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va_lo = __riscv_vzext_vf2_u64m1(__riscv_vle32_v_u32mf2(&a.u32[0], 2), vl);
vuint64m1_t vb_lo = __riscv_vzext_vf2_u64m1(__riscv_vle32_v_u32mf2(&b.u32[0], 2), vl);
vuint64m1_t vr = __riscv_vmul_vv_u64m1(va_lo, vb_lo, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = (uint64_t)a.u32[0] * (uint64_t)b.u32[0];
result.u64[1] = (uint64_t)a.u32[2] * (uint64_t)b.u32[2];
return result;
#endif
}
/* Unpack operations */
static inline __m128i _mm_unpacklo_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0];
result.u64[1] = b.u64[0];
return result;
}
static inline __m128i _mm_unpackhi_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[1];
result.u64[1] = b.u64[1];
return result;
}
/* Pause instruction for spin-wait loops */
static inline void _mm_pause(void)
{
/* RISC-V pause hint if available (requires Zihintpause extension) */
#if defined(__riscv_zihintpause)
__asm__ __volatile__("pause");
#else
__asm__ __volatile__("nop");
#endif
}
/* Memory fence - optimized for RISC-V */
static inline void _mm_mfence(void)
{
__asm__ __volatile__("fence rw,rw" ::: "memory");
}
static inline void _mm_lfence(void)
{
__asm__ __volatile__("fence r,r" ::: "memory");
}
static inline void _mm_sfence(void)
{
__asm__ __volatile__("fence w,w" ::: "memory");
}
/* Comparison operations */
static inline __m128i _mm_cmpeq_epi32(__m128i a, __m128i b)
{
__m128i result;
for (int i = 0; i < 4; i++) {
result.u32[i] = (a.u32[i] == b.u32[i]) ? 0xFFFFFFFF : 0;
}
return result;
}
static inline __m128i _mm_cmpeq_epi64(__m128i a, __m128i b)
{
__m128i result;
for (int i = 0; i < 2; i++) {
result.u64[i] = (a.u64[i] == b.u64[i]) ? 0xFFFFFFFFFFFFFFFFULL : 0;
}
return result;
}
/* Additional shift operations */
static inline __m128i _mm_slli_epi32(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 31) {
memset(&result, 0, sizeof(result));
} else {
size_t vl = __riscv_vsetvl_e32m1(4);
vuint32m1_t va = __riscv_vle32_v_u32m1(a.u32, vl);
vuint32m1_t vr = __riscv_vsll_vx_u32m1(va, imm8, vl);
__riscv_vse32_v_u32m1(result.u32, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 31) {
for (int i = 0; i < 4; i++) result.u32[i] = 0;
} else {
for (int i = 0; i < 4; i++) {
result.u32[i] = a.u32[i] << imm8;
}
}
return result;
#endif
}
static inline __m128i _mm_srli_epi32(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 31) {
memset(&result, 0, sizeof(result));
} else {
size_t vl = __riscv_vsetvl_e32m1(4);
vuint32m1_t va = __riscv_vle32_v_u32m1(a.u32, vl);
vuint32m1_t vr = __riscv_vsrl_vx_u32m1(va, imm8, vl);
__riscv_vse32_v_u32m1(result.u32, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 31) {
for (int i = 0; i < 4; i++) result.u32[i] = 0;
} else {
for (int i = 0; i < 4; i++) {
result.u32[i] = a.u32[i] >> imm8;
}
}
return result;
#endif
}
/* 64-bit integer operations */
static inline __m128i _mm_set1_epi64x(int64_t a)
{
__m128i result;
result.i64[0] = a;
result.i64[1] = a;
return result;
}
/* Float type for compatibility */
typedef __m128i __m128;
/* Float operations - simplified scalar implementations */
static inline __m128 _mm_set1_ps(float a)
{
__m128 result;
uint32_t val;
memcpy(&val, &a, sizeof(float));
for (int i = 0; i < 4; i++) {
result.u32[i] = val;
}
return result;
}
static inline __m128 _mm_setzero_ps(void)
{
__m128 result;
memset(&result, 0, sizeof(result));
return result;
}
static inline __m128 _mm_add_ps(__m128 a, __m128 b)
{
__m128 result;
float fa[4], fb[4], fr[4];
memcpy(fa, &a, sizeof(__m128));
memcpy(fb, &b, sizeof(__m128));
for (int i = 0; i < 4; i++) {
fr[i] = fa[i] + fb[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128 _mm_mul_ps(__m128 a, __m128 b)
{
__m128 result;
float fa[4], fb[4], fr[4];
memcpy(fa, &a, sizeof(__m128));
memcpy(fb, &b, sizeof(__m128));
for (int i = 0; i < 4; i++) {
fr[i] = fa[i] * fb[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128 _mm_and_ps(__m128 a, __m128 b)
{
__m128 result;
result.u64[0] = a.u64[0] & b.u64[0];
result.u64[1] = a.u64[1] & b.u64[1];
return result;
}
static inline __m128 _mm_or_ps(__m128 a, __m128 b)
{
__m128 result;
result.u64[0] = a.u64[0] | b.u64[0];
result.u64[1] = a.u64[1] | b.u64[1];
return result;
}
static inline __m128 _mm_cvtepi32_ps(__m128i a)
{
__m128 result;
float fr[4];
for (int i = 0; i < 4; i++) {
fr[i] = (float)a.i32[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128i _mm_cvttps_epi32(__m128 a)
{
__m128i result;
float fa[4];
memcpy(fa, &a, sizeof(__m128));
for (int i = 0; i < 4; i++) {
result.i32[i] = (int32_t)fa[i];
}
return result;
}
/* Casting operations */
static inline __m128 _mm_castsi128_ps(__m128i a)
{
__m128 result;
memcpy(&result, &a, sizeof(__m128));
return result;
}
static inline __m128i _mm_castps_si128(__m128 a)
{
__m128i result;
memcpy(&result, &a, sizeof(__m128));
return result;
}
/* Additional set operations */
static inline __m128i _mm_set1_epi32(int a)
{
__m128i result;
for (int i = 0; i < 4; i++) {
result.i32[i] = a;
}
return result;
}
/* AES instructions - placeholders for soft_aes compatibility */
static inline __m128i _mm_aesenc_si128(__m128i a, __m128i roundkey)
{
return _mm_xor_si128(a, roundkey);
}
static inline __m128i _mm_aeskeygenassist_si128(__m128i a, const int rcon)
{
return a;
}
/* Rotate right operation for soft_aes.h */
static inline uint32_t _rotr(uint32_t value, unsigned int count)
{
const unsigned int mask = 31;
count &= mask;
return (value >> count) | (value << ((-count) & mask));
}
/* ARM NEON compatibility types and intrinsics for RISC-V */
typedef __m128i_union uint64x2_t;
typedef __m128i_union uint8x16_t;
typedef __m128i_union int64x2_t;
typedef __m128i_union int32x4_t;
static inline uint64x2_t vld1q_u64(const uint64_t *ptr)
{
uint64x2_t result;
result.u64[0] = ptr[0];
result.u64[1] = ptr[1];
return result;
}
static inline int64x2_t vld1q_s64(const int64_t *ptr)
{
int64x2_t result;
result.i64[0] = ptr[0];
result.i64[1] = ptr[1];
return result;
}
static inline void vst1q_u64(uint64_t *ptr, uint64x2_t val)
{
ptr[0] = val.u64[0];
ptr[1] = val.u64[1];
}
static inline uint64x2_t veorq_u64(uint64x2_t a, uint64x2_t b)
{
return _mm_xor_si128(a, b);
}
static inline uint64x2_t vaddq_u64(uint64x2_t a, uint64x2_t b)
{
return _mm_add_epi64(a, b);
}
static inline uint64x2_t vreinterpretq_u64_u8(uint8x16_t a)
{
uint64x2_t result;
memcpy(&result, &a, sizeof(uint64x2_t));
return result;
}
static inline uint64_t vgetq_lane_u64(uint64x2_t v, int lane)
{
return v.u64[lane];
}
static inline int64_t vgetq_lane_s64(int64x2_t v, int lane)
{
return v.i64[lane];
}
static inline int32_t vgetq_lane_s32(int32x4_t v, int lane)
{
return v.i32[lane];
}
typedef struct { uint64_t val[1]; } uint64x1_t;
static inline uint64x1_t vcreate_u64(uint64_t a)
{
uint64x1_t result;
result.val[0] = a;
return result;
}
static inline uint64x2_t vcombine_u64(uint64x1_t low, uint64x1_t high)
{
uint64x2_t result;
result.u64[0] = low.val[0];
result.u64[1] = high.val[0];
return result;
}
#ifdef __cplusplus
}
#endif
#endif /* XMRIG_SSE2RVV_OPTIMIZED_H */
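
A short usage sketch of this compatibility layer: existing SSE-style call sites compile unchanged on RISC-V and take the RVV path only when __riscv_vector is defined. The harness below is hypothetical (direct .u64 access works here because __m128i is a plain union in this header, unlike the native x86 type; the include path assumes the tree layout above):

#include "crypto/cn/sse2rvv.h"
#include <cstdio>

int main()
{
    const __m128i a = _mm_set_epi64x(0x1122334455667788LL, (int64_t)0x99aabbccddeeff00ULL);
    const __m128i b = _mm_set1_epi64x(0x0f0f0f0f0f0f0f0fLL);

    __m128i x = _mm_xor_si128(a, b);           // RVV path when __riscv_vector is defined, scalar otherwise
    x = _mm_add_epi64(x, _mm_slli_epi64(b, 4));
    x = _mm_shuffle_epi32(x, 0x4E);            // swap the two 64-bit halves (same semantics as on x86)

    printf("%016llx %016llx\n",
           (unsigned long long)x.u64[1],
           (unsigned long long)x.u64[0]);
    return 0;
}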

View File

@@ -0,0 +1,748 @@
/* XMRig
* Copyright (c) 2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/*
* SSE to RISC-V Vector (RVV) optimized compatibility header
* Provides both scalar fallback and vectorized implementations using RVV intrinsics
*/
#ifndef XMRIG_SSE2RVV_OPTIMIZED_H
#define XMRIG_SSE2RVV_OPTIMIZED_H
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#include <string.h>
/* Check if RVV is available */
#if defined(__riscv_vector)
#include <riscv_vector.h>
#define USE_RVV_INTRINSICS 1
#else
#define USE_RVV_INTRINSICS 0
#endif
/* 128-bit vector type */
typedef union {
uint8_t u8[16];
uint16_t u16[8];
uint32_t u32[4];
uint64_t u64[2];
int8_t i8[16];
int16_t i16[8];
int32_t i32[4];
int64_t i64[2];
#if USE_RVV_INTRINSICS
vuint64m1_t rvv_u64;
vuint32m1_t rvv_u32;
vuint8m1_t rvv_u8;
#endif
} __m128i_union;
typedef __m128i_union __m128i;
/* Set operations */
static inline __m128i _mm_set_epi32(int e3, int e2, int e1, int e0)
{
__m128i result;
result.i32[0] = e0;
result.i32[1] = e1;
result.i32[2] = e2;
result.i32[3] = e3;
return result;
}
static inline __m128i _mm_set_epi64x(int64_t e1, int64_t e0)
{
__m128i result;
result.i64[0] = e0;
result.i64[1] = e1;
return result;
}
static inline __m128i _mm_setzero_si128(void)
{
__m128i result;
memset(&result, 0, sizeof(result));
return result;
}
/* Extract/insert operations */
static inline int _mm_cvtsi128_si32(__m128i a)
{
return a.i32[0];
}
static inline int64_t _mm_cvtsi128_si64(__m128i a)
{
return a.i64[0];
}
static inline __m128i _mm_cvtsi32_si128(int a)
{
__m128i result = _mm_setzero_si128();
result.i32[0] = a;
return result;
}
static inline __m128i _mm_cvtsi64_si128(int64_t a)
{
__m128i result = _mm_setzero_si128();
result.i64[0] = a;
return result;
}
/* Shuffle operations */
static inline __m128i _mm_shuffle_epi32(__m128i a, int imm8)
{
__m128i result;
result.u32[0] = a.u32[(imm8 >> 0) & 0x3];
result.u32[1] = a.u32[(imm8 >> 2) & 0x3];
result.u32[2] = a.u32[(imm8 >> 4) & 0x3];
result.u32[3] = a.u32[(imm8 >> 6) & 0x3];
return result;
}
/* Logical operations - optimized with RVV when available */
static inline __m128i _mm_xor_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vxor_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] ^ b.u64[0];
result.u64[1] = a.u64[1] ^ b.u64[1];
return result;
#endif
}
static inline __m128i _mm_or_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vor_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] | b.u64[0];
result.u64[1] = a.u64[1] | b.u64[1];
return result;
#endif
}
static inline __m128i _mm_and_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vand_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] & b.u64[0];
result.u64[1] = a.u64[1] & b.u64[1];
return result;
#endif
}
static inline __m128i _mm_andnot_si128(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vnot_a = __riscv_vnot_v_u64m1(va, vl);
vuint64m1_t vr = __riscv_vand_vv_u64m1(vnot_a, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = (~a.u64[0]) & b.u64[0];
result.u64[1] = (~a.u64[1]) & b.u64[1];
return result;
#endif
}
/* Shift operations */
static inline __m128i _mm_slli_si128(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
size_t vl = __riscv_vsetvl_e8m1(16);
vuint8m1_t va = __riscv_vle8_v_u8m1(a.u8, vl);
vuint8m1_t vr = __riscv_vslideup_vx_u8m1(__riscv_vmv_v_x_u8m1(0, vl), va, count, vl);
__riscv_vse8_v_u8m1(result.u8, vr, vl);
return result;
#else
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
for (int i = 0; i < 16 - count; i++) {
result.u8[i + count] = a.u8[i];
}
return result;
#endif
}
static inline __m128i _mm_srli_si128(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
size_t vl = __riscv_vsetvl_e8m1(16);
vuint8m1_t va = __riscv_vle8_v_u8m1(a.u8, vl);
vuint8m1_t vr = __riscv_vslidedown_vx_u8m1(va, count, vl);
__riscv_vse8_v_u8m1(result.u8, vr, vl);
return result;
#else
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
for (int i = count; i < 16; i++) {
result.u8[i - count] = a.u8[i];
}
return result;
#endif
}
static inline __m128i _mm_slli_epi64(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vr = __riscv_vsll_vx_u64m1(va, imm8, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
result.u64[0] = a.u64[0] << imm8;
result.u64[1] = a.u64[1] << imm8;
}
return result;
#endif
}
static inline __m128i _mm_srli_epi64(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vr = __riscv_vsrl_vx_u64m1(va, imm8, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
result.u64[0] = a.u64[0] >> imm8;
result.u64[1] = a.u64[1] >> imm8;
}
return result;
#endif
}
/* Load/store operations - optimized with RVV */
static inline __m128i _mm_load_si128(const __m128i* p)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t v = __riscv_vle64_v_u64m1((const uint64_t*)p, vl);
__riscv_vse64_v_u64m1(result.u64, v, vl);
return result;
#else
__m128i result;
memcpy(&result, p, sizeof(__m128i));
return result;
#endif
}
static inline __m128i _mm_loadu_si128(const __m128i* p)
{
__m128i result;
memcpy(&result, p, sizeof(__m128i));
return result;
}
static inline void _mm_store_si128(__m128i* p, __m128i a)
{
#if USE_RVV_INTRINSICS
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t v = __riscv_vle64_v_u64m1(a.u64, vl);
__riscv_vse64_v_u64m1((uint64_t*)p, v, vl);
#else
memcpy(p, &a, sizeof(__m128i));
#endif
}
static inline void _mm_storeu_si128(__m128i* p, __m128i a)
{
memcpy(p, &a, sizeof(__m128i));
}
/* Arithmetic operations - optimized with RVV */
static inline __m128i _mm_add_epi64(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vadd_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] + b.u64[0];
result.u64[1] = a.u64[1] + b.u64[1];
return result;
#endif
}
static inline __m128i _mm_add_epi32(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e32m1(4);
vuint32m1_t va = __riscv_vle32_v_u32m1(a.u32, vl);
vuint32m1_t vb = __riscv_vle32_v_u32m1(b.u32, vl);
vuint32m1_t vr = __riscv_vadd_vv_u32m1(va, vb, vl);
__riscv_vse32_v_u32m1(result.u32, vr, vl);
return result;
#else
__m128i result;
for (int i = 0; i < 4; i++) {
result.i32[i] = a.i32[i] + b.i32[i];
}
return result;
#endif
}
static inline __m128i _mm_sub_epi64(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va = __riscv_vle64_v_u64m1(a.u64, vl);
vuint64m1_t vb = __riscv_vle64_v_u64m1(b.u64, vl);
vuint64m1_t vr = __riscv_vsub_vv_u64m1(va, vb, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = a.u64[0] - b.u64[0];
result.u64[1] = a.u64[1] - b.u64[1];
return result;
#endif
}
static inline __m128i _mm_mul_epu32(__m128i a, __m128i b)
{
#if USE_RVV_INTRINSICS
__m128i result;
size_t vl = __riscv_vsetvl_e64m1(2);
vuint64m1_t va_lo = __riscv_vzext_vf2_u64m1(__riscv_vle32_v_u32mf2(&a.u32[0], 2), vl);
vuint64m1_t vb_lo = __riscv_vzext_vf2_u64m1(__riscv_vle32_v_u32mf2(&b.u32[0], 2), vl);
vuint64m1_t vr = __riscv_vmul_vv_u64m1(va_lo, vb_lo, vl);
__riscv_vse64_v_u64m1(result.u64, vr, vl);
return result;
#else
__m128i result;
result.u64[0] = (uint64_t)a.u32[0] * (uint64_t)b.u32[0];
result.u64[1] = (uint64_t)a.u32[2] * (uint64_t)b.u32[2];
return result;
#endif
}
/* Unpack operations */
static inline __m128i _mm_unpacklo_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0];
result.u64[1] = b.u64[0];
return result;
}
static inline __m128i _mm_unpackhi_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[1];
result.u64[1] = b.u64[1];
return result;
}
/* Pause instruction for spin-wait loops */
static inline void _mm_pause(void)
{
/* RISC-V pause hint if available (requires Zihintpause extension) */
#if defined(__riscv_zihintpause)
__asm__ __volatile__("pause");
#else
__asm__ __volatile__("nop");
#endif
}
/* Memory fence - optimized for RISC-V */
static inline void _mm_mfence(void)
{
__asm__ __volatile__("fence rw,rw" ::: "memory");
}
static inline void _mm_lfence(void)
{
__asm__ __volatile__("fence r,r" ::: "memory");
}
static inline void _mm_sfence(void)
{
__asm__ __volatile__("fence w,w" ::: "memory");
}
/* Comparison operations */
static inline __m128i _mm_cmpeq_epi32(__m128i a, __m128i b)
{
__m128i result;
for (int i = 0; i < 4; i++) {
result.u32[i] = (a.u32[i] == b.u32[i]) ? 0xFFFFFFFF : 0;
}
return result;
}
static inline __m128i _mm_cmpeq_epi64(__m128i a, __m128i b)
{
__m128i result;
for (int i = 0; i < 2; i++) {
result.u64[i] = (a.u64[i] == b.u64[i]) ? 0xFFFFFFFFFFFFFFFFULL : 0;
}
return result;
}
/* Additional shift operations */
static inline __m128i _mm_slli_epi32(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 31) {
memset(&result, 0, sizeof(result));
} else {
size_t vl = __riscv_vsetvl_e32m1(4);
vuint32m1_t va = __riscv_vle32_v_u32m1(a.u32, vl);
vuint32m1_t vr = __riscv_vsll_vx_u32m1(va, imm8, vl);
__riscv_vse32_v_u32m1(result.u32, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 31) {
for (int i = 0; i < 4; i++) result.u32[i] = 0;
} else {
for (int i = 0; i < 4; i++) {
result.u32[i] = a.u32[i] << imm8;
}
}
return result;
#endif
}
static inline __m128i _mm_srli_epi32(__m128i a, int imm8)
{
#if USE_RVV_INTRINSICS
__m128i result;
if (imm8 > 31) {
memset(&result, 0, sizeof(result));
} else {
size_t vl = __riscv_vsetvl_e32m1(4);
vuint32m1_t va = __riscv_vle32_v_u32m1(a.u32, vl);
vuint32m1_t vr = __riscv_vsrl_vx_u32m1(va, imm8, vl);
__riscv_vse32_v_u32m1(result.u32, vr, vl);
}
return result;
#else
__m128i result;
if (imm8 > 31) {
for (int i = 0; i < 4; i++) result.u32[i] = 0;
} else {
for (int i = 0; i < 4; i++) {
result.u32[i] = a.u32[i] >> imm8;
}
}
return result;
#endif
}
/* 64-bit integer operations */
static inline __m128i _mm_set1_epi64x(int64_t a)
{
__m128i result;
result.i64[0] = a;
result.i64[1] = a;
return result;
}
/* Float type for compatibility */
typedef __m128i __m128;
/* Float operations - simplified scalar implementations */
static inline __m128 _mm_set1_ps(float a)
{
__m128 result;
uint32_t val;
memcpy(&val, &a, sizeof(float));
for (int i = 0; i < 4; i++) {
result.u32[i] = val;
}
return result;
}
static inline __m128 _mm_setzero_ps(void)
{
__m128 result;
memset(&result, 0, sizeof(result));
return result;
}
static inline __m128 _mm_add_ps(__m128 a, __m128 b)
{
__m128 result;
float fa[4], fb[4], fr[4];
memcpy(fa, &a, sizeof(__m128));
memcpy(fb, &b, sizeof(__m128));
for (int i = 0; i < 4; i++) {
fr[i] = fa[i] + fb[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128 _mm_mul_ps(__m128 a, __m128 b)
{
__m128 result;
float fa[4], fb[4], fr[4];
memcpy(fa, &a, sizeof(__m128));
memcpy(fb, &b, sizeof(__m128));
for (int i = 0; i < 4; i++) {
fr[i] = fa[i] * fb[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128 _mm_and_ps(__m128 a, __m128 b)
{
__m128 result;
result.u64[0] = a.u64[0] & b.u64[0];
result.u64[1] = a.u64[1] & b.u64[1];
return result;
}
static inline __m128 _mm_or_ps(__m128 a, __m128 b)
{
__m128 result;
result.u64[0] = a.u64[0] | b.u64[0];
result.u64[1] = a.u64[1] | b.u64[1];
return result;
}
static inline __m128 _mm_cvtepi32_ps(__m128i a)
{
__m128 result;
float fr[4];
for (int i = 0; i < 4; i++) {
fr[i] = (float)a.i32[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128i _mm_cvttps_epi32(__m128 a)
{
__m128i result;
float fa[4];
memcpy(fa, &a, sizeof(__m128));
for (int i = 0; i < 4; i++) {
result.i32[i] = (int32_t)fa[i];
}
return result;
}
/* Casting operations */
static inline __m128 _mm_castsi128_ps(__m128i a)
{
__m128 result;
memcpy(&result, &a, sizeof(__m128));
return result;
}
static inline __m128i _mm_castps_si128(__m128 a)
{
__m128i result;
memcpy(&result, &a, sizeof(__m128));
return result;
}
/* Additional set operations */
static inline __m128i _mm_set1_epi32(int a)
{
__m128i result;
for (int i = 0; i < 4; i++) {
result.i32[i] = a;
}
return result;
}
/* AES instructions - placeholders for soft_aes compatibility */
static inline __m128i _mm_aesenc_si128(__m128i a, __m128i roundkey)
{
return _mm_xor_si128(a, roundkey);
}
static inline __m128i _mm_aeskeygenassist_si128(__m128i a, const int rcon)
{
return a;
}
/* Rotate right operation for soft_aes.h */
static inline uint32_t _rotr(uint32_t value, unsigned int count)
{
const unsigned int mask = 31;
count &= mask;
return (value >> count) | (value << ((-count) & mask));
}
/* ARM NEON compatibility types and intrinsics for RISC-V */
typedef __m128i_union uint64x2_t;
typedef __m128i_union uint8x16_t;
typedef __m128i_union int64x2_t;
typedef __m128i_union int32x4_t;
static inline uint64x2_t vld1q_u64(const uint64_t *ptr)
{
uint64x2_t result;
result.u64[0] = ptr[0];
result.u64[1] = ptr[1];
return result;
}
static inline int64x2_t vld1q_s64(const int64_t *ptr)
{
int64x2_t result;
result.i64[0] = ptr[0];
result.i64[1] = ptr[1];
return result;
}
static inline void vst1q_u64(uint64_t *ptr, uint64x2_t val)
{
ptr[0] = val.u64[0];
ptr[1] = val.u64[1];
}
static inline uint64x2_t veorq_u64(uint64x2_t a, uint64x2_t b)
{
return _mm_xor_si128(a, b);
}
static inline uint64x2_t vaddq_u64(uint64x2_t a, uint64x2_t b)
{
return _mm_add_epi64(a, b);
}
static inline uint64x2_t vreinterpretq_u64_u8(uint8x16_t a)
{
uint64x2_t result;
memcpy(&result, &a, sizeof(uint64x2_t));
return result;
}
static inline uint64_t vgetq_lane_u64(uint64x2_t v, int lane)
{
return v.u64[lane];
}
static inline int64_t vgetq_lane_s64(int64x2_t v, int lane)
{
return v.i64[lane];
}
static inline int32_t vgetq_lane_s32(int32x4_t v, int lane)
{
return v.i32[lane];
}
typedef struct { uint64_t val[1]; } uint64x1_t;
static inline uint64x1_t vcreate_u64(uint64_t a)
{
uint64x1_t result;
result.val[0] = a;
return result;
}
static inline uint64x2_t vcombine_u64(uint64x1_t low, uint64x1_t high)
{
uint64x2_t result;
result.u64[0] = low.val[0];
result.u64[1] = high.val[0];
return result;
}
#ifdef __cplusplus
}
#endif
#endif /* XMRIG_SSE2RVV_OPTIMIZED_H */

View File

@@ -0,0 +1,571 @@
/* XMRig
* Copyright (c) 2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/*
* SSE to RISC-V compatibility header
* Provides scalar implementations of SSE intrinsics for RISC-V architecture
*/
#ifndef XMRIG_SSE2RVV_H
#define XMRIG_SSE2RVV_H
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#include <string.h>
/* 128-bit vector type */
typedef union {
uint8_t u8[16];
uint16_t u16[8];
uint32_t u32[4];
uint64_t u64[2];
int8_t i8[16];
int16_t i16[8];
int32_t i32[4];
int64_t i64[2];
} __m128i_union;
typedef __m128i_union __m128i;
/* Set operations */
static inline __m128i _mm_set_epi32(int e3, int e2, int e1, int e0)
{
__m128i result;
result.i32[0] = e0;
result.i32[1] = e1;
result.i32[2] = e2;
result.i32[3] = e3;
return result;
}
static inline __m128i _mm_set_epi64x(int64_t e1, int64_t e0)
{
__m128i result;
result.i64[0] = e0;
result.i64[1] = e1;
return result;
}
static inline __m128i _mm_setzero_si128(void)
{
__m128i result;
memset(&result, 0, sizeof(result));
return result;
}
/* Extract/insert operations */
static inline int _mm_cvtsi128_si32(__m128i a)
{
return a.i32[0];
}
static inline int64_t _mm_cvtsi128_si64(__m128i a)
{
return a.i64[0];
}
static inline __m128i _mm_cvtsi32_si128(int a)
{
__m128i result = _mm_setzero_si128();
result.i32[0] = a;
return result;
}
static inline __m128i _mm_cvtsi64_si128(int64_t a)
{
__m128i result = _mm_setzero_si128();
result.i64[0] = a;
return result;
}
/* Shuffle operations */
static inline __m128i _mm_shuffle_epi32(__m128i a, int imm8)
{
__m128i result;
result.u32[0] = a.u32[(imm8 >> 0) & 0x3];
result.u32[1] = a.u32[(imm8 >> 2) & 0x3];
result.u32[2] = a.u32[(imm8 >> 4) & 0x3];
result.u32[3] = a.u32[(imm8 >> 6) & 0x3];
return result;
}
/* Logical operations */
static inline __m128i _mm_xor_si128(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0] ^ b.u64[0];
result.u64[1] = a.u64[1] ^ b.u64[1];
return result;
}
static inline __m128i _mm_or_si128(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0] | b.u64[0];
result.u64[1] = a.u64[1] | b.u64[1];
return result;
}
static inline __m128i _mm_and_si128(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0] & b.u64[0];
result.u64[1] = a.u64[1] & b.u64[1];
return result;
}
static inline __m128i _mm_andnot_si128(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = (~a.u64[0]) & b.u64[0];
result.u64[1] = (~a.u64[1]) & b.u64[1];
return result;
}
/* Shift operations */
static inline __m128i _mm_slli_si128(__m128i a, int imm8)
{
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
for (int i = 0; i < 16 - count; i++) {
result.u8[i + count] = a.u8[i];
}
return result;
}
static inline __m128i _mm_srli_si128(__m128i a, int imm8)
{
__m128i result = _mm_setzero_si128();
int count = imm8 & 0xFF;
if (count > 15) return result;
for (int i = count; i < 16; i++) {
result.u8[i - count] = a.u8[i];
}
return result;
}
static inline __m128i _mm_slli_epi64(__m128i a, int imm8)
{
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
result.u64[0] = a.u64[0] << imm8;
result.u64[1] = a.u64[1] << imm8;
}
return result;
}
static inline __m128i _mm_srli_epi64(__m128i a, int imm8)
{
__m128i result;
if (imm8 > 63) {
result.u64[0] = 0;
result.u64[1] = 0;
} else {
result.u64[0] = a.u64[0] >> imm8;
result.u64[1] = a.u64[1] >> imm8;
}
return result;
}
/* Load/store operations */
static inline __m128i _mm_load_si128(const __m128i* p)
{
__m128i result;
memcpy(&result, p, sizeof(__m128i));
return result;
}
static inline __m128i _mm_loadu_si128(const __m128i* p)
{
__m128i result;
memcpy(&result, p, sizeof(__m128i));
return result;
}
static inline void _mm_store_si128(__m128i* p, __m128i a)
{
memcpy(p, &a, sizeof(__m128i));
}
static inline void _mm_storeu_si128(__m128i* p, __m128i a)
{
memcpy(p, &a, sizeof(__m128i));
}
/* Arithmetic operations */
static inline __m128i _mm_add_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0] + b.u64[0];
result.u64[1] = a.u64[1] + b.u64[1];
return result;
}
static inline __m128i _mm_add_epi32(__m128i a, __m128i b)
{
__m128i result;
for (int i = 0; i < 4; i++) {
result.i32[i] = a.i32[i] + b.i32[i];
}
return result;
}
static inline __m128i _mm_sub_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0] - b.u64[0];
result.u64[1] = a.u64[1] - b.u64[1];
return result;
}
static inline __m128i _mm_mul_epu32(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = (uint64_t)a.u32[0] * (uint64_t)b.u32[0];
result.u64[1] = (uint64_t)a.u32[2] * (uint64_t)b.u32[2];
return result;
}
/* Unpack operations */
static inline __m128i _mm_unpacklo_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[0];
result.u64[1] = b.u64[0];
return result;
}
static inline __m128i _mm_unpackhi_epi64(__m128i a, __m128i b)
{
__m128i result;
result.u64[0] = a.u64[1];
result.u64[1] = b.u64[1];
return result;
}
/* Pause instruction for spin-wait loops */
static inline void _mm_pause(void)
{
/* RISC-V doesn't have a direct equivalent to x86 PAUSE
* Use a simple NOP or yield hint */
__asm__ __volatile__("nop");
}
/* Memory fence */
static inline void _mm_mfence(void)
{
__asm__ __volatile__("fence" ::: "memory");
}
static inline void _mm_lfence(void)
{
__asm__ __volatile__("fence r,r" ::: "memory");
}
static inline void _mm_sfence(void)
{
__asm__ __volatile__("fence w,w" ::: "memory");
}
/* Comparison operations */
static inline __m128i _mm_cmpeq_epi32(__m128i a, __m128i b)
{
__m128i result;
for (int i = 0; i < 4; i++) {
result.u32[i] = (a.u32[i] == b.u32[i]) ? 0xFFFFFFFF : 0;
}
return result;
}
static inline __m128i _mm_cmpeq_epi64(__m128i a, __m128i b)
{
__m128i result;
for (int i = 0; i < 2; i++) {
result.u64[i] = (a.u64[i] == b.u64[i]) ? 0xFFFFFFFFFFFFFFFFULL : 0;
}
return result;
}
/* Additional shift operations */
static inline __m128i _mm_slli_epi32(__m128i a, int imm8)
{
__m128i result;
if (imm8 > 31) {
for (int i = 0; i < 4; i++) result.u32[i] = 0;
} else {
for (int i = 0; i < 4; i++) {
result.u32[i] = a.u32[i] << imm8;
}
}
return result;
}
static inline __m128i _mm_srli_epi32(__m128i a, int imm8)
{
__m128i result;
if (imm8 > 31) {
for (int i = 0; i < 4; i++) result.u32[i] = 0;
} else {
for (int i = 0; i < 4; i++) {
result.u32[i] = a.u32[i] >> imm8;
}
}
return result;
}
/* 64-bit integer operations */
static inline __m128i _mm_set1_epi64x(int64_t a)
{
__m128i result;
result.i64[0] = a;
result.i64[1] = a;
return result;
}
/* Float type for compatibility - we'll treat it as int for simplicity */
typedef __m128i __m128;
/* Float operations - simplified scalar implementations */
static inline __m128 _mm_set1_ps(float a)
{
__m128 result;
uint32_t val;
memcpy(&val, &a, sizeof(float));
for (int i = 0; i < 4; i++) {
result.u32[i] = val;
}
return result;
}
static inline __m128 _mm_setzero_ps(void)
{
__m128 result;
memset(&result, 0, sizeof(result));
return result;
}
static inline __m128 _mm_add_ps(__m128 a, __m128 b)
{
__m128 result;
float fa[4], fb[4], fr[4];
memcpy(fa, &a, sizeof(__m128));
memcpy(fb, &b, sizeof(__m128));
for (int i = 0; i < 4; i++) {
fr[i] = fa[i] + fb[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128 _mm_mul_ps(__m128 a, __m128 b)
{
__m128 result;
float fa[4], fb[4], fr[4];
memcpy(fa, &a, sizeof(__m128));
memcpy(fb, &b, sizeof(__m128));
for (int i = 0; i < 4; i++) {
fr[i] = fa[i] * fb[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128 _mm_and_ps(__m128 a, __m128 b)
{
__m128 result;
result.u64[0] = a.u64[0] & b.u64[0];
result.u64[1] = a.u64[1] & b.u64[1];
return result;
}
static inline __m128 _mm_or_ps(__m128 a, __m128 b)
{
__m128 result;
result.u64[0] = a.u64[0] | b.u64[0];
result.u64[1] = a.u64[1] | b.u64[1];
return result;
}
static inline __m128 _mm_cvtepi32_ps(__m128i a)
{
__m128 result;
float fr[4];
for (int i = 0; i < 4; i++) {
fr[i] = (float)a.i32[i];
}
memcpy(&result, fr, sizeof(__m128));
return result;
}
static inline __m128i _mm_cvttps_epi32(__m128 a)
{
__m128i result;
float fa[4];
memcpy(fa, &a, sizeof(__m128));
for (int i = 0; i < 4; i++) {
result.i32[i] = (int32_t)fa[i];
}
return result;
}
/* Casting operations */
static inline __m128 _mm_castsi128_ps(__m128i a)
{
__m128 result;
memcpy(&result, &a, sizeof(__m128));
return result;
}
static inline __m128i _mm_castps_si128(__m128 a)
{
__m128i result;
memcpy(&result, &a, sizeof(__m128));
return result;
}
/* Additional set operations */
static inline __m128i _mm_set1_epi32(int a)
{
__m128i result;
for (int i = 0; i < 4; i++) {
result.i32[i] = a;
}
return result;
}
/* AES instructions - these are placeholders, actual AES is done via soft_aes.h */
/* On RISC-V without crypto extensions, these should never be called directly */
/* They are only here for compilation compatibility */
static inline __m128i _mm_aesenc_si128(__m128i a, __m128i roundkey)
{
/* This is a placeholder - actual implementation should use soft_aes */
/* If this function is called, it means SOFT_AES template parameter wasn't used */
/* We return a XOR as a minimal fallback, but proper code should use soft_aesenc */
return _mm_xor_si128(a, roundkey);
}
static inline __m128i _mm_aeskeygenassist_si128(__m128i a, const int rcon)
{
/* Placeholder for AES key generation - should use soft_aeskeygenassist */
return a;
}
/* Rotate right operation for soft_aes.h */
static inline uint32_t _rotr(uint32_t value, unsigned int count)
{
const unsigned int mask = 31;
count &= mask;
return (value >> count) | (value << ((-count) & mask));
}
/* ARM NEON compatibility types and intrinsics for RISC-V */
typedef __m128i_union uint64x2_t;
typedef __m128i_union uint8x16_t;
typedef __m128i_union int64x2_t;
typedef __m128i_union int32x4_t;
static inline uint64x2_t vld1q_u64(const uint64_t *ptr)
{
uint64x2_t result;
result.u64[0] = ptr[0];
result.u64[1] = ptr[1];
return result;
}
static inline int64x2_t vld1q_s64(const int64_t *ptr)
{
int64x2_t result;
result.i64[0] = ptr[0];
result.i64[1] = ptr[1];
return result;
}
static inline void vst1q_u64(uint64_t *ptr, uint64x2_t val)
{
ptr[0] = val.u64[0];
ptr[1] = val.u64[1];
}
static inline uint64x2_t veorq_u64(uint64x2_t a, uint64x2_t b)
{
uint64x2_t result;
result.u64[0] = a.u64[0] ^ b.u64[0];
result.u64[1] = a.u64[1] ^ b.u64[1];
return result;
}
static inline uint64x2_t vaddq_u64(uint64x2_t a, uint64x2_t b)
{
uint64x2_t result;
result.u64[0] = a.u64[0] + b.u64[0];
result.u64[1] = a.u64[1] + b.u64[1];
return result;
}
static inline uint64x2_t vreinterpretq_u64_u8(uint8x16_t a)
{
uint64x2_t result;
memcpy(&result, &a, sizeof(uint64x2_t));
return result;
}
static inline uint64_t vgetq_lane_u64(uint64x2_t v, int lane)
{
return v.u64[lane];
}
static inline int64_t vgetq_lane_s64(int64x2_t v, int lane)
{
return v.i64[lane];
}
static inline int32_t vgetq_lane_s32(int32x4_t v, int lane)
{
return v.i32[lane];
}
typedef struct { uint64_t val[1]; } uint64x1_t;
static inline uint64x1_t vcreate_u64(uint64_t a)
{
uint64x1_t result;
result.val[0] = a;
return result;
}
static inline uint64x2_t vcombine_u64(uint64x1_t low, uint64x1_t high)
{
uint64x2_t result;
result.u64[0] = low.val[0];
result.u64[1] = high.val[0];
return result;
}
#ifdef __cplusplus
}
#endif
#endif /* XMRIG_SSE2RVV_H */
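
Several of these scalar fallbacks have subtle lane semantics; _mm_mul_epu32 above, for example, multiplies only the even-indexed 32-bit lanes (0 and 2), matching the SSE2 definition. A minimal sanity check, callable from any quick test harness (hypothetical test code, not part of the header):

#include "crypto/cn/sse2rvv.h"
#include <cassert>

static void check_mul_epu32()
{
    const __m128i a = _mm_set_epi32(40, 3, 20, 5);   // lanes: [0]=5, [1]=20, [2]=3, [3]=40
    const __m128i b = _mm_set_epi32(50, 7, 60, 9);   // lanes: [0]=9, [1]=60, [2]=7, [3]=50
    const __m128i r = _mm_mul_epu32(a, b);

    assert(r.u64[0] == 45);   // 5 * 9  (lane 0 of each input)
    assert(r.u64[1] == 21);   // 3 * 7  (lane 2 of each input)
}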

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -35,15 +35,69 @@ constexpr size_t twoMiB = 2U * 1024U * 1024U;
constexpr size_t oneGiB = 1024U * 1024U * 1024U;
static inline std::string sysfs_path(uint32_t node, size_t hugePageSize, bool nr)
static bool sysfs_write(const std::string &path, uint64_t value)
{
std::ofstream file(path, std::ios::out | std::ios::binary | std::ios::trunc);
if (!file.is_open()) {
return false;
}
file << value;
file.flush();
return true;
}
static int64_t sysfs_read(const std::string &path)
{
std::ifstream file(path);
if (!file.is_open()) {
return -1;
}
uint64_t value = 0;
file >> value;
return value;
}
static std::string sysfs_path(uint32_t node, size_t hugePageSize, bool nr)
{
return fmt::format("/sys/devices/system/node/node{}/hugepages/hugepages-{}kB/{}_hugepages", node, hugePageSize / 1024, nr ? "nr" : "free");
}
static inline bool write_nr_hugepages(uint32_t node, size_t hugePageSize, uint64_t count) { return LinuxMemory::write(sysfs_path(node, hugePageSize, true).c_str(), count); }
static inline int64_t free_hugepages(uint32_t node, size_t hugePageSize) { return LinuxMemory::read(sysfs_path(node, hugePageSize, false).c_str()); }
static inline int64_t nr_hugepages(uint32_t node, size_t hugePageSize) { return LinuxMemory::read(sysfs_path(node, hugePageSize, true).c_str()); }
static std::string sysfs_path(size_t hugePageSize, bool nr)
{
return fmt::format("/sys/kernel/mm/hugepages/hugepages-{}kB/{}_hugepages", hugePageSize / 1024, nr ? "nr" : "free");
}
static bool write_nr_hugepages(uint32_t node, size_t hugePageSize, uint64_t count)
{
if (sysfs_write(sysfs_path(node, hugePageSize, true), count)) {
return true;
}
return sysfs_write(sysfs_path(hugePageSize, true), count);
}
static int64_t sysfs_read_hugepages(uint32_t node, size_t hugePageSize, bool nr)
{
const int64_t value = sysfs_read(sysfs_path(node, hugePageSize, nr));
if (value >= 0) {
return value;
}
return sysfs_read(sysfs_path(hugePageSize, nr));
}
static inline int64_t free_hugepages(uint32_t node, size_t hugePageSize) { return sysfs_read_hugepages(node, hugePageSize, false); }
static inline int64_t nr_hugepages(uint32_t node, size_t hugePageSize) { return sysfs_read_hugepages(node, hugePageSize, true); }
} // namespace xmrig
@@ -62,31 +116,3 @@ bool xmrig::LinuxMemory::reserve(size_t size, uint32_t node, size_t hugePageSize
return write_nr_hugepages(node, hugePageSize, std::max<size_t>(nr_hugepages(node, hugePageSize), 0) + (required - available));
}
bool xmrig::LinuxMemory::write(const char *path, uint64_t value)
{
std::ofstream file(path, std::ios::out | std::ios::binary | std::ios::trunc);
if (!file.is_open()) {
return false;
}
file << value;
file.flush();
return true;
}
int64_t xmrig::LinuxMemory::read(const char *path)
{
std::ifstream file(path);
if (!file.is_open()) {
return -1;
}
uint64_t value = 0;
file >> value;
return value;
}

View File

@@ -1,6 +1,6 @@
/* XMRig
* Copyright (c) 2018-2021 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2021 XMRig <https://github.com/xmrig>, <support@xmrig.com>
* Copyright (c) 2018-2025 SChernykh <https://github.com/SChernykh>
* Copyright (c) 2016-2025 XMRig <https://github.com/xmrig>, <support@xmrig.com>
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -31,13 +31,10 @@ class LinuxMemory
{
public:
static bool reserve(size_t size, uint32_t node, size_t hugePageSize);
static bool write(const char *path, uint64_t value);
static int64_t read(const char *path);
};
} /* namespace xmrig */
} // namespace xmrig
#endif /* XMRIG_LINUXMEMORY_H */
#endif // XMRIG_LINUXMEMORY_H

View File

@@ -49,7 +49,7 @@ xmrig::MemoryPool::MemoryPool(size_t size, bool hugePages, uint32_t node)
constexpr size_t alignment = 1 << 24;
m_memory = new VirtualMemory(size * pageSize + alignment, hugePages, false, false, node);
m_memory = new VirtualMemory(size * pageSize + alignment, hugePages, false, false, node, VirtualMemory::kDefaultHugePageSize);
m_alignOffset = (alignment - (((size_t)m_memory->scratchpad()) % alignment)) % alignment;
}

View File

@@ -75,6 +75,16 @@ xmrig::VirtualMemory::VirtualMemory(size_t size, bool hugePages, bool oneGbPages
}
m_scratchpad = static_cast<uint8_t*>(_mm_malloc(m_size, alignSize));
// Huge pages failed to allocate, so try to enable transparent huge pages for the range instead
if (alignSize >= kDefaultHugePageSize) {
if (m_scratchpad) {
adviseLargePages(m_scratchpad, m_size);
}
else {
m_scratchpad = static_cast<uint8_t*>(_mm_malloc(m_size, 64));
}
}
}

View File

@@ -65,6 +65,7 @@ public:
static void *allocateExecutableMemory(size_t size, bool hugePages);
static void *allocateLargePagesMemory(size_t size);
static void *allocateOneGbPagesMemory(size_t size);
static bool adviseLargePages(void *p, size_t size);
static void destroy();
static void flushInstructionCache(void *p, size_t size);
static void freeLargePagesMemory(void *p, size_t size);

View File

@@ -86,7 +86,7 @@ bool xmrig::VirtualMemory::isHugepagesAvailable()
{
# ifdef XMRIG_OS_LINUX
return std::ifstream("/proc/sys/vm/nr_hugepages").good() || std::ifstream("/sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages").good();
# elif defined(XMRIG_OS_MACOS) && defined(XMRIG_ARM)
# elif defined(XMRIG_OS_MACOS) && defined(XMRIG_ARM) || defined(XMRIG_OS_HAIKU)
return false;
# else
return true;
@@ -156,7 +156,8 @@ void *xmrig::VirtualMemory::allocateExecutableMemory(size_t size, bool hugePages
if (!mem) {
mem = mmap(0, size, PROT_READ | PROT_WRITE | SECURE_PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
# elif defined(XMRIG_OS_HAIKU)
void *mem = mmap(0, size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
# else
void *mem = nullptr;
@@ -181,6 +182,8 @@ void *xmrig::VirtualMemory::allocateLargePagesMemory(size_t size)
void *mem = mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, VM_FLAGS_SUPERPAGE_SIZE_2MB, 0);
# elif defined(XMRIG_OS_FREEBSD)
void *mem = mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_ALIGNED_SUPER | MAP_PREFAULT_READ, -1, 0);
# elif defined(XMRIG_OS_HAIKU)
void *mem = nullptr;
# else
void *mem = mmap(0, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_POPULATE | hugePagesFlag(hugePageSize()), 0, 0);
# endif
@@ -273,6 +276,16 @@ bool xmrig::VirtualMemory::allocateOneGbPagesMemory()
}
bool xmrig::VirtualMemory::adviseLargePages(void *p, size_t size)
{
# ifdef XMRIG_OS_LINUX
return (madvise(p, size, MADV_HUGEPAGE) == 0);
# else
return false;
# endif
}
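A minimal standalone sketch of what adviseLargePages() relies on: on Linux, madvise(MADV_HUGEPAGE) asks the kernel to back an anonymous mapping with transparent huge pages where possible (subject to /sys/kernel/mm/transparent_hugepage/enabled); the hint is advisory and never fails the allocation itself:

// Illustrative only, not part of the XMRig sources.
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

int thp_advise_example()
{
    const size_t size = 8u << 20; // 8 MiB, a multiple of the 2 MiB THP size
    void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return 1;
    }
    // Hint that this range should use transparent huge pages; purely advisory.
    if (madvise(p, size, MADV_HUGEPAGE) != 0) {
        std::perror("madvise");
    }
    munmap(p, size);
    return 0;
}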
void xmrig::VirtualMemory::freeLargePagesMemory()
{
if (m_flags.test(FLAG_LOCK)) {

View File

@@ -260,6 +260,12 @@ bool xmrig::VirtualMemory::allocateOneGbPagesMemory()
}
bool xmrig::VirtualMemory::adviseLargePages(void *p, size_t size)
{
return false;
}
void xmrig::VirtualMemory::freeLargePagesMemory()
{
freeLargePagesMemory(m_scratchpad, m_size);

View File

@@ -26,7 +26,7 @@
#define XMRIG_MM_MALLOC_PORTABLE_H
#if defined(XMRIG_ARM) && !defined(__clang__)
#if (defined(XMRIG_ARM) || defined(XMRIG_RISCV)) && !defined(__clang__)
#include <stdlib.h>

View File

@@ -57,6 +57,9 @@
#if defined(XMRIG_ARM)
# include "crypto/cn/sse2neon.h"
#elif defined(XMRIG_RISCV)
// RISC-V doesn't have SSE/NEON; provide minimal compatibility
# define _mm_pause() __asm__ __volatile__("nop")
#elif defined(__GNUC__)
# include <x86intrin.h>
#else
@@ -286,7 +289,7 @@ struct HelperThread
void benchmark()
{
#ifndef XMRIG_ARM
#if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
static std::atomic<int> done{ 0 };
if (done.exchange(1)) {
return;
@@ -478,7 +481,7 @@ static inline bool findByType(hwloc_obj_t obj, hwloc_obj_type_t type, func lambd
HelperThread* create_helper_thread(int64_t cpu_index, int priority, const std::vector<int64_t>& affinities)
{
#ifndef XMRIG_ARM
#if !defined(XMRIG_ARM) && !defined(XMRIG_RISCV)
hwloc_bitmap_t helper_cpu_set = hwloc_bitmap_alloc();
hwloc_bitmap_t main_threads_set = hwloc_bitmap_alloc();
@@ -807,7 +810,7 @@ void hash_octa(const uint8_t* data, size_t size, uint8_t* output, cryptonight_ct
uint32_t cn_indices[6];
select_indices(cn_indices, seed);
#ifdef XMRIG_ARM
#if defined(XMRIG_ARM) || defined(XMRIG_RISCV)
uint32_t step[6] = { 1, 1, 1, 1, 1, 1 };
#else
uint32_t step[6] = { 4, 4, 1, 2, 4, 4 };

View File

@@ -38,6 +38,13 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include "crypto/randomx/common.hpp"
#include "crypto/rx/Profiler.h"
#include "backend/cpu/Cpu.h"
#ifdef XMRIG_RISCV
#include "crypto/randomx/aes_hash_rv64_vector.hpp"
#include "crypto/randomx/aes_hash_rv64_zvkned.hpp"
#endif
#define AES_HASH_1R_STATE0 0xd7983aad, 0xcc82db47, 0x9fa856de, 0x92b52c0d
#define AES_HASH_1R_STATE1 0xace78057, 0xf59e125a, 0x15c7b798, 0x338d996e
#define AES_HASH_1R_STATE2 0xe8a07ce4, 0x5079506b, 0xae62c7d0, 0x6a770017
@@ -59,14 +66,27 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Hashing throughput: >20 GiB/s per CPU core with hardware AES
*/
template<int softAes>
void hashAes1Rx4(const void *input, size_t inputSize, void *hash) {
void hashAes1Rx4(const void *input, size_t inputSize, void *hash)
{
#ifdef XMRIG_RISCV
if (xmrig::Cpu::info()->hasAES()) {
hashAes1Rx4_zvkned(input, inputSize, hash);
return;
}
if (xmrig::Cpu::info()->hasRISCV_Vector()) {
hashAes1Rx4_RVV(input, inputSize, hash);
return;
}
#endif
const uint8_t* inptr = (uint8_t*)input;
const uint8_t* inputEnd = inptr + inputSize;
rx_vec_i128 state0, state1, state2, state3;
rx_vec_i128 in0, in1, in2, in3;
//intial state
//initial state
state0 = rx_set_int_vec_i128(AES_HASH_1R_STATE0);
state1 = rx_set_int_vec_i128(AES_HASH_1R_STATE1);
state2 = rx_set_int_vec_i128(AES_HASH_1R_STATE2);
@@ -127,7 +147,20 @@ template void hashAes1Rx4<true>(const void *input, size_t inputSize, void *hash)
calls to this function.
*/
template<int softAes>
void fillAes1Rx4(void *state, size_t outputSize, void *buffer) {
void fillAes1Rx4(void *state, size_t outputSize, void *buffer)
{
#ifdef XMRIG_RISCV
if (xmrig::Cpu::info()->hasAES()) {
fillAes1Rx4_zvkned(state, outputSize, buffer);
return;
}
if (xmrig::Cpu::info()->hasRISCV_Vector()) {
fillAes1Rx4_RVV(state, outputSize, buffer);
return;
}
#endif
const uint8_t* outptr = (uint8_t*)buffer;
const uint8_t* outputEnd = outptr + outputSize;
@@ -171,7 +204,20 @@ static constexpr randomx::Instruction inst{ 0xFF, 7, 7, 0xFF, 0xFFFFFFFFU };
alignas(16) static const randomx::Instruction inst_mask[2] = { inst, inst };
template<int softAes>
void fillAes4Rx4(void *state, size_t outputSize, void *buffer) {
void fillAes4Rx4(void *state, size_t outputSize, void *buffer)
{
#ifdef XMRIG_RISCV
if (xmrig::Cpu::info()->hasAES()) {
fillAes4Rx4_zvkned(state, outputSize, buffer);
return;
}
if (xmrig::Cpu::info()->hasRISCV_Vector()) {
fillAes4Rx4_RVV(state, outputSize, buffer);
return;
}
#endif
const uint8_t* outptr = (uint8_t*)buffer;
const uint8_t* outputEnd = outptr + outputSize;
@@ -235,10 +281,34 @@ void fillAes4Rx4(void *state, size_t outputSize, void *buffer) {
template void fillAes4Rx4<true>(void *state, size_t outputSize, void *buffer);
template void fillAes4Rx4<false>(void *state, size_t outputSize, void *buffer);
#ifdef XMRIG_VAES
void hashAndFillAes1Rx4_VAES512(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state);
#endif
template<int softAes, int unroll>
void hashAndFillAes1Rx4(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state) {
void hashAndFillAes1Rx4(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state)
{
PROFILE_SCOPE(RandomX_AES);
#ifdef XMRIG_RISCV
if (xmrig::Cpu::info()->hasAES()) {
hashAndFillAes1Rx4_zvkned(scratchpad, scratchpadSize, hash, fill_state);
return;
}
if (xmrig::Cpu::info()->hasRISCV_Vector()) {
hashAndFillAes1Rx4_RVV(scratchpad, scratchpadSize, hash, fill_state);
return;
}
#endif
#ifdef XMRIG_VAES
if (xmrig::Cpu::info()->arch() == xmrig::ICpuInfo::ARCH_ZEN5) {
hashAndFillAes1Rx4_VAES512(scratchpad, scratchpadSize, hash, fill_state);
return;
}
#endif
uint8_t* scratchpadPtr = (uint8_t*)scratchpad;
const uint8_t* scratchpadEnd = scratchpadPtr + scratchpadSize;
@@ -386,43 +456,54 @@ hashAndFillAes1Rx4_impl* softAESImpl = &hashAndFillAes1Rx4<1,1>;
void SelectSoftAESImpl(size_t threadsCount)
{
constexpr uint64_t test_length_ms = 100;
const std::array<hashAndFillAes1Rx4_impl *, 4> impl = {
&hashAndFillAes1Rx4<1,1>,
&hashAndFillAes1Rx4<2,1>,
&hashAndFillAes1Rx4<2,2>,
&hashAndFillAes1Rx4<2,4>,
};
size_t fast_idx = 0;
double fast_speed = 0.0;
for (size_t run = 0; run < 3; ++run) {
for (size_t i = 0; i < impl.size(); ++i) {
const double t1 = xmrig::Chrono::highResolutionMSecs();
std::vector<uint32_t> count(threadsCount, 0);
std::vector<std::thread> threads;
for (size_t t = 0; t < threadsCount; ++t) {
threads.emplace_back([&, t]() {
std::vector<uint8_t> scratchpad(10 * 1024);
alignas(16) uint8_t hash[64] = {};
alignas(16) uint8_t state[64] = {};
do {
(*impl[i])(scratchpad.data(), scratchpad.size(), hash, state);
++count[t];
} while (xmrig::Chrono::highResolutionMSecs() - t1 < test_length_ms);
});
}
uint32_t total = 0;
for (size_t t = 0; t < threadsCount; ++t) {
threads[t].join();
total += count[t];
}
const double t2 = xmrig::Chrono::highResolutionMSecs();
const double speed = total * 1e3 / (t2 - t1);
if (speed > fast_speed) {
fast_idx = i;
fast_speed = speed;
}
}
}
softAESImpl = impl[fast_idx];
constexpr uint64_t test_length_ms = 100;
const std::array<hashAndFillAes1Rx4_impl *, 4> impl = {
&hashAndFillAes1Rx4<1,1>,
&hashAndFillAes1Rx4<2,1>,
&hashAndFillAes1Rx4<2,2>,
&hashAndFillAes1Rx4<2,4>,
};
size_t fast_idx = 0;
double fast_speed = 0.0;
for (size_t run = 0; run < 3; ++run) {
for (size_t i = 0; i < impl.size(); ++i) {
const double t1 = xmrig::Chrono::highResolutionMSecs();
std::vector<uint32_t> count(threadsCount, 0);
std::vector<std::thread> threads;
for (size_t t = 0; t < threadsCount; ++t) {
threads.emplace_back([&, t]() {
std::vector<uint8_t> scratchpad(10 * 1024);
alignas(16) uint8_t hash[64] = {};
alignas(16) uint8_t state[64] = {};
do {
(*impl[i])(scratchpad.data(), scratchpad.size(), hash, state);
++count[t];
} while (xmrig::Chrono::highResolutionMSecs() - t1 < test_length_ms);
});
}
uint32_t total = 0;
for (size_t t = 0; t < threadsCount; ++t) {
threads[t].join();
total += count[t];
}
const double t2 = xmrig::Chrono::highResolutionMSecs();
const double speed = total * 1e3 / (t2 - t1);
if (speed > fast_speed) {
fast_idx = i;
fast_speed = speed;
}
}
}
softAESImpl = impl[fast_idx];
}
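A hedged usage sketch of the routine above: it times each of the four unroll variants for roughly 100 ms per run across three runs and leaves the fastest one in the softAESImpl pointer, so a caller only needs something like the following (function name here is illustrative):

// Illustrative only; assumes the declarations above are visible.
#include <cstddef>

void init_soft_aes_example(size_t threads)
{
    SelectSoftAESImpl(threads); // ~3 runs x 4 variants x 100 ms of benchmarking
    // afterwards, hot-path calls go through the selected implementation:
    // (*softAESImpl)(scratchpad, scratchpadSize, hash, fill_state);
}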

View File

@@ -0,0 +1,272 @@
/*
Copyright (c) 2025 SChernykh <https://github.com/SChernykh>
Copyright (c) 2025 XMRig <support@xmrig.com>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <riscv_vector.h>
#include "crypto/randomx/soft_aes.h"
#include "crypto/randomx/randomx.h"
static FORCE_INLINE vuint32m1_t softaes_vector_double(
vuint32m1_t in,
vuint32m1_t key,
vuint8m1_t i0, vuint8m1_t i1, vuint8m1_t i2, vuint8m1_t i3,
const uint32_t* lut0, const uint32_t* lut1, const uint32_t *lut2, const uint32_t* lut3)
{
const vuint8m1_t in8 = __riscv_vreinterpret_v_u32m1_u8m1(in);
const vuint32m1_t index0 = __riscv_vreinterpret_v_u8m1_u32m1(__riscv_vrgather_vv_u8m1(in8, i0, 32));
const vuint32m1_t index1 = __riscv_vreinterpret_v_u8m1_u32m1(__riscv_vrgather_vv_u8m1(in8, i1, 32));
const vuint32m1_t index2 = __riscv_vreinterpret_v_u8m1_u32m1(__riscv_vrgather_vv_u8m1(in8, i2, 32));
const vuint32m1_t index3 = __riscv_vreinterpret_v_u8m1_u32m1(__riscv_vrgather_vv_u8m1(in8, i3, 32));
vuint32m1_t s0 = __riscv_vluxei32_v_u32m1(lut0, __riscv_vsll_vx_u32m1(index0, 2, 8), 8);
vuint32m1_t s1 = __riscv_vluxei32_v_u32m1(lut1, __riscv_vsll_vx_u32m1(index1, 2, 8), 8);
vuint32m1_t s2 = __riscv_vluxei32_v_u32m1(lut2, __riscv_vsll_vx_u32m1(index2, 2, 8), 8);
vuint32m1_t s3 = __riscv_vluxei32_v_u32m1(lut3, __riscv_vsll_vx_u32m1(index3, 2, 8), 8);
s0 = __riscv_vxor_vv_u32m1(s0, s1, 8);
s2 = __riscv_vxor_vv_u32m1(s2, s3, 8);
s0 = __riscv_vxor_vv_u32m1(s0, s2, 8);
return __riscv_vxor_vv_u32m1(s0, key, 8);
}
static constexpr uint32_t AES_HASH_1R_STATE02[8] = { 0x92b52c0d, 0x9fa856de, 0xcc82db47, 0xd7983aad, 0x6a770017, 0xae62c7d0, 0x5079506b, 0xe8a07ce4 };
static constexpr uint32_t AES_HASH_1R_STATE13[8] = { 0x338d996e, 0x15c7b798, 0xf59e125a, 0xace78057, 0x630a240c, 0x07ad828d, 0x79a10005, 0x7e994948 };
static constexpr uint32_t AES_GEN_1R_KEY02[8] = { 0x6daca553, 0x62716609, 0xdbb5552b, 0xb4f44917, 0x3f1262f1, 0x9f947ec6, 0xf4c0794f, 0x3e20e345 };
static constexpr uint32_t AES_GEN_1R_KEY13[8] = { 0x6d7caf07, 0x846a710d, 0x1725d378, 0x0da1dc4e, 0x6aef8135, 0xb1ba317c, 0x16314c88, 0x49169154 };
static constexpr uint32_t AES_HASH_1R_XKEY00[8] = { 0xf6fa8389, 0x8b24949f, 0x90dc56bf, 0x06890201, 0xf6fa8389, 0x8b24949f, 0x90dc56bf, 0x06890201 };
static constexpr uint32_t AES_HASH_1R_XKEY11[8] = { 0x61b263d1, 0x51f4e03c, 0xee1043c6, 0xed18f99b, 0x61b263d1, 0x51f4e03c, 0xee1043c6, 0xed18f99b };
static constexpr uint32_t AES_HASH_STRIDE_X2[8] = { 0, 4, 8, 12, 32, 36, 40, 44 };
static constexpr uint32_t AES_HASH_STRIDE_X4[8] = { 0, 4, 8, 12, 64, 68, 72, 76 };
#define lutEnc0 lutEnc[0]
#define lutEnc1 lutEnc[1]
#define lutEnc2 lutEnc[2]
#define lutEnc3 lutEnc[3]
#define lutDec0 lutDec[0]
#define lutDec1 lutDec[1]
#define lutDec2 lutDec[2]
#define lutDec3 lutDec[3]
void hashAes1Rx4_RVV(const void *input, size_t inputSize, void *hash) {
const uint8_t* inptr = (const uint8_t*)input;
const uint8_t* inputEnd = inptr + inputSize;
//initial state
vuint32m1_t state02 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE02, 8);
vuint32m1_t state13 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE13, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
const vuint8m1_t lutenc_index0 = __riscv_vle8_v_u8m1(lutEncIndex[0], 32);
const vuint8m1_t lutenc_index1 = __riscv_vle8_v_u8m1(lutEncIndex[1], 32);
const vuint8m1_t lutenc_index2 = __riscv_vle8_v_u8m1(lutEncIndex[2], 32);
const vuint8m1_t lutenc_index3 = __riscv_vle8_v_u8m1(lutEncIndex[3], 32);
const vuint8m1_t& lutdec_index0 = lutenc_index0;
const vuint8m1_t lutdec_index1 = __riscv_vle8_v_u8m1(lutDecIndex[1], 32);
const vuint8m1_t& lutdec_index2 = lutenc_index2;
const vuint8m1_t lutdec_index3 = __riscv_vle8_v_u8m1(lutDecIndex[3], 32);
//process 64 bytes at a time in 4 lanes
while (inptr < inputEnd) {
state02 = softaes_vector_double(state02, __riscv_vluxei32_v_u32m1((uint32_t*)inptr + 0, stride, 8), lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
state13 = softaes_vector_double(state13, __riscv_vluxei32_v_u32m1((uint32_t*)inptr + 4, stride, 8), lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
inptr += 64;
}
//two extra rounds to achieve full diffusion
const vuint32m1_t xkey00 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY00, 8);
const vuint32m1_t xkey11 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY11, 8);
state02 = softaes_vector_double(state02, xkey00, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
state13 = softaes_vector_double(state13, xkey00, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
state02 = softaes_vector_double(state02, xkey11, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
state13 = softaes_vector_double(state13, xkey11, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
//output hash
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 4, stride, state13, 8);
}
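One note on the addressing used above, stated as an assumption from the RVV indexed-load semantics (vluxei32 treats its indices as byte offsets): the AES_HASH_STRIDE_X2 constants {0, 4, 8, 12, 32, 36, 40, 44} make a single 8-lane load gather two of the four 16-byte AES columns at once, which is why the states are named state02 and state13. A scalar sketch of the same gather, with a hypothetical helper name:

// Scalar sketch of the strided gather above (illustrative, not XMRig code),
// assuming vluxei32 indices are byte offsets relative to the base pointer.
#include <cstdint>
#include <cstring>

static void gather_state02(const uint8_t *base, uint32_t out[8])
{
    static const uint32_t kStrideX2[8] = { 0, 4, 8, 12, 32, 36, 40, 44 };
    for (int i = 0; i < 8; ++i) {
        // Lanes 0-3 read bytes 0..15 (column 0), lanes 4-7 read bytes 32..47 (column 2).
        std::memcpy(&out[i], base + kStrideX2[i], sizeof(uint32_t));
    }
}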
void fillAes1Rx4_RVV(void *state, size_t outputSize, void *buffer) {
const uint8_t* outptr = (uint8_t*)buffer;
const uint8_t* outputEnd = outptr + outputSize;
const vuint32m1_t key02 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY02, 8);
const vuint32m1_t key13 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY13, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
vuint32m1_t state02 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 0, stride, 8);
vuint32m1_t state13 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 4, stride, 8);
const vuint8m1_t lutenc_index0 = __riscv_vle8_v_u8m1(lutEncIndex[0], 32);
const vuint8m1_t lutenc_index1 = __riscv_vle8_v_u8m1(lutEncIndex[1], 32);
const vuint8m1_t lutenc_index2 = __riscv_vle8_v_u8m1(lutEncIndex[2], 32);
const vuint8m1_t lutenc_index3 = __riscv_vle8_v_u8m1(lutEncIndex[3], 32);
const vuint8m1_t& lutdec_index0 = lutenc_index0;
const vuint8m1_t lutdec_index1 = __riscv_vle8_v_u8m1(lutDecIndex[1], 32);
const vuint8m1_t& lutdec_index2 = lutenc_index2;
const vuint8m1_t lutdec_index3 = __riscv_vle8_v_u8m1(lutDecIndex[3], 32);
while (outptr < outputEnd) {
state02 = softaes_vector_double(state02, key02, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
state13 = softaes_vector_double(state13, key13, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 4, stride, state13, 8);
outptr += 64;
}
__riscv_vsuxei32_v_u32m1((uint32_t*)state + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)state + 4, stride, state13, 8);
}
void fillAes4Rx4_RVV(void *state, size_t outputSize, void *buffer) {
const uint8_t* outptr = (uint8_t*)buffer;
const uint8_t* outputEnd = outptr + outputSize;
const vuint32m1_t stride4 = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X4, 8);
const vuint32m1_t key04 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 0), stride4, 8);
const vuint32m1_t key15 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 1), stride4, 8);
const vuint32m1_t key26 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 2), stride4, 8);
const vuint32m1_t key37 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 3), stride4, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
vuint32m1_t state02 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 0, stride, 8);
vuint32m1_t state13 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 4, stride, 8);
const vuint8m1_t lutenc_index0 = __riscv_vle8_v_u8m1(lutEncIndex[0], 32);
const vuint8m1_t lutenc_index1 = __riscv_vle8_v_u8m1(lutEncIndex[1], 32);
const vuint8m1_t lutenc_index2 = __riscv_vle8_v_u8m1(lutEncIndex[2], 32);
const vuint8m1_t lutenc_index3 = __riscv_vle8_v_u8m1(lutEncIndex[3], 32);
const vuint8m1_t& lutdec_index0 = lutenc_index0;
const vuint8m1_t lutdec_index1 = __riscv_vle8_v_u8m1(lutDecIndex[1], 32);
const vuint8m1_t& lutdec_index2 = lutenc_index2;
const vuint8m1_t lutdec_index3 = __riscv_vle8_v_u8m1(lutDecIndex[3], 32);
while (outptr < outputEnd) {
state02 = softaes_vector_double(state02, key04, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
state13 = softaes_vector_double(state13, key04, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
state02 = softaes_vector_double(state02, key15, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
state13 = softaes_vector_double(state13, key15, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
state02 = softaes_vector_double(state02, key26, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
state13 = softaes_vector_double(state13, key26, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
state02 = softaes_vector_double(state02, key37, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
state13 = softaes_vector_double(state13, key37, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 4, stride, state13, 8);
outptr += 64;
}
}
void hashAndFillAes1Rx4_RVV(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state) {
uint8_t* scratchpadPtr = (uint8_t*)scratchpad;
const uint8_t* scratchpadEnd = scratchpadPtr + scratchpadSize;
vuint32m1_t hash_state02 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE02, 8);
vuint32m1_t hash_state13 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE13, 8);
const vuint32m1_t key02 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY02, 8);
const vuint32m1_t key13 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY13, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
vuint32m1_t fill_state02 = __riscv_vluxei32_v_u32m1((uint32_t*)fill_state + 0, stride, 8);
vuint32m1_t fill_state13 = __riscv_vluxei32_v_u32m1((uint32_t*)fill_state + 4, stride, 8);
const vuint8m1_t lutenc_index0 = __riscv_vle8_v_u8m1(lutEncIndex[0], 32);
const vuint8m1_t lutenc_index1 = __riscv_vle8_v_u8m1(lutEncIndex[1], 32);
const vuint8m1_t lutenc_index2 = __riscv_vle8_v_u8m1(lutEncIndex[2], 32);
const vuint8m1_t lutenc_index3 = __riscv_vle8_v_u8m1(lutEncIndex[3], 32);
const vuint8m1_t& lutdec_index0 = lutenc_index0;
const vuint8m1_t lutdec_index1 = __riscv_vle8_v_u8m1(lutDecIndex[1], 32);
const vuint8m1_t& lutdec_index2 = lutenc_index2;
const vuint8m1_t lutdec_index3 = __riscv_vle8_v_u8m1(lutDecIndex[3], 32);
//process 64 bytes at a time in 4 lanes
while (scratchpadPtr < scratchpadEnd) {
#define HASH_STATE(k) \
hash_state02 = softaes_vector_double(hash_state02, __riscv_vluxei32_v_u32m1((uint32_t*)scratchpadPtr + k * 16 + 0, stride, 8), lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3); \
hash_state13 = softaes_vector_double(hash_state13, __riscv_vluxei32_v_u32m1((uint32_t*)scratchpadPtr + k * 16 + 4, stride, 8), lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
#define FILL_STATE(k) \
fill_state02 = softaes_vector_double(fill_state02, key02, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3); \
fill_state13 = softaes_vector_double(fill_state13, key13, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3); \
__riscv_vsuxei32_v_u32m1((uint32_t*)scratchpadPtr + k * 16 + 0, stride, fill_state02, 8); \
__riscv_vsuxei32_v_u32m1((uint32_t*)scratchpadPtr + k * 16 + 4, stride, fill_state13, 8);
HASH_STATE(0);
HASH_STATE(1);
FILL_STATE(0);
FILL_STATE(1);
scratchpadPtr += 128;
}
#undef HASH_STATE
#undef FILL_STATE
__riscv_vsuxei32_v_u32m1((uint32_t*)fill_state + 0, stride, fill_state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)fill_state + 4, stride, fill_state13, 8);
//two extra rounds to achieve full diffusion
const vuint32m1_t xkey00 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY00, 8);
const vuint32m1_t xkey11 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY11, 8);
hash_state02 = softaes_vector_double(hash_state02, xkey00, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
hash_state13 = softaes_vector_double(hash_state13, xkey00, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
hash_state02 = softaes_vector_double(hash_state02, xkey11, lutenc_index0, lutenc_index1, lutenc_index2, lutenc_index3, lutEnc0, lutEnc1, lutEnc2, lutEnc3);
hash_state13 = softaes_vector_double(hash_state13, xkey11, lutdec_index0, lutdec_index1, lutdec_index2, lutdec_index3, lutDec0, lutDec1, lutDec2, lutDec3);
//output hash
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 0, stride, hash_state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 4, stride, hash_state13, 8);
}

View File

@@ -0,0 +1,35 @@
/*
Copyright (c) 2025 SChernykh <https://github.com/SChernykh>
Copyright (c) 2025 XMRig <support@xmrig.com>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#pragma once
void hashAes1Rx4_RVV(const void *input, size_t inputSize, void *hash);
void fillAes1Rx4_RVV(void *state, size_t outputSize, void *buffer);
void fillAes4Rx4_RVV(void *state, size_t outputSize, void *buffer);
void hashAndFillAes1Rx4_RVV(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state);

View File

@@ -0,0 +1,199 @@
/*
Copyright (c) 2025 SChernykh <https://github.com/SChernykh>
Copyright (c) 2025 XMRig <support@xmrig.com>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include "crypto/randomx/aes_hash.hpp"
#include "crypto/randomx/randomx.h"
#include "crypto/rx/Profiler.h"
#include <riscv_vector.h>
static FORCE_INLINE vuint32m1_t aesenc_zvkned(vuint32m1_t a, vuint32m1_t b) { return __riscv_vaesem_vv_u32m1(a, b, 8); }
static FORCE_INLINE vuint32m1_t aesdec_zvkned(vuint32m1_t a, vuint32m1_t b, vuint32m1_t zero) { return __riscv_vxor_vv_u32m1(__riscv_vaesdm_vv_u32m1(a, zero, 8), b, 8); }
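One note on the two helpers above, stated as an assumption from the Zvkned and x86 round definitions rather than from the XMRig sources: vaesem adds the round key after MixColumns, matching x86 aesenc directly, while vaesdm adds the round key before InvMixColumns, so the decrypt helper runs vaesdm with an all-zero key and XORs the real key afterwards to reproduce x86 aesdec semantics:

// Illustrative equivalence (assumption, not XMRig code):
//   x86 aesenc(a, k) = MixColumns(SubBytes(ShiftRows(a))) ^ k          == vaesem(a, k)
//   x86 aesdec(a, k) = InvMixColumns(InvSubBytes(InvShiftRows(a))) ^ k == vaesdm(a, 0) ^ k
// because vaesdm(a, k) = InvMixColumns(InvSubBytes(InvShiftRows(a)) ^ k).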
static constexpr uint32_t AES_HASH_1R_STATE02[8] = { 0x92b52c0d, 0x9fa856de, 0xcc82db47, 0xd7983aad, 0x6a770017, 0xae62c7d0, 0x5079506b, 0xe8a07ce4 };
static constexpr uint32_t AES_HASH_1R_STATE13[8] = { 0x338d996e, 0x15c7b798, 0xf59e125a, 0xace78057, 0x630a240c, 0x07ad828d, 0x79a10005, 0x7e994948 };
static constexpr uint32_t AES_GEN_1R_KEY02[8] = { 0x6daca553, 0x62716609, 0xdbb5552b, 0xb4f44917, 0x3f1262f1, 0x9f947ec6, 0xf4c0794f, 0x3e20e345 };
static constexpr uint32_t AES_GEN_1R_KEY13[8] = { 0x6d7caf07, 0x846a710d, 0x1725d378, 0x0da1dc4e, 0x6aef8135, 0xb1ba317c, 0x16314c88, 0x49169154 };
static constexpr uint32_t AES_HASH_1R_XKEY00[8] = { 0xf6fa8389, 0x8b24949f, 0x90dc56bf, 0x06890201, 0xf6fa8389, 0x8b24949f, 0x90dc56bf, 0x06890201 };
static constexpr uint32_t AES_HASH_1R_XKEY11[8] = { 0x61b263d1, 0x51f4e03c, 0xee1043c6, 0xed18f99b, 0x61b263d1, 0x51f4e03c, 0xee1043c6, 0xed18f99b };
static constexpr uint32_t AES_HASH_STRIDE_X2[8] = { 0, 4, 8, 12, 32, 36, 40, 44 };
static constexpr uint32_t AES_HASH_STRIDE_X4[8] = { 0, 4, 8, 12, 64, 68, 72, 76 };
void hashAes1Rx4_zvkned(const void *input, size_t inputSize, void *hash)
{
const uint8_t* inptr = (const uint8_t*)input;
const uint8_t* inputEnd = inptr + inputSize;
//initial state
vuint32m1_t state02 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE02, 8);
vuint32m1_t state13 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE13, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
const vuint32m1_t zero = {};
//process 64 bytes at a time in 4 lanes
while (inptr < inputEnd) {
state02 = aesenc_zvkned(state02, __riscv_vluxei32_v_u32m1((uint32_t*)inptr + 0, stride, 8));
state13 = aesdec_zvkned(state13, __riscv_vluxei32_v_u32m1((uint32_t*)inptr + 4, stride, 8), zero);
inptr += 64;
}
//two extra rounds to achieve full diffusion
const vuint32m1_t xkey00 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY00, 8);
const vuint32m1_t xkey11 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY11, 8);
state02 = aesenc_zvkned(state02, xkey00);
state13 = aesdec_zvkned(state13, xkey00, zero);
state02 = aesenc_zvkned(state02, xkey11);
state13 = aesdec_zvkned(state13, xkey11, zero);
//output hash
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 4, stride, state13, 8);
}
void fillAes1Rx4_zvkned(void *state, size_t outputSize, void *buffer)
{
const uint8_t* outptr = (uint8_t*)buffer;
const uint8_t* outputEnd = outptr + outputSize;
const vuint32m1_t key02 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY02, 8);
const vuint32m1_t key13 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY13, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
const vuint32m1_t zero = {};
vuint32m1_t state02 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 0, stride, 8);
vuint32m1_t state13 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 4, stride, 8);
while (outptr < outputEnd) {
state02 = aesdec_zvkned(state02, key02, zero);
state13 = aesenc_zvkned(state13, key13);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 4, stride, state13, 8);
outptr += 64;
}
__riscv_vsuxei32_v_u32m1((uint32_t*)state + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)state + 4, stride, state13, 8);
}
void fillAes4Rx4_zvkned(void *state, size_t outputSize, void *buffer)
{
const uint8_t* outptr = (uint8_t*)buffer;
const uint8_t* outputEnd = outptr + outputSize;
const vuint32m1_t stride4 = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X4, 8);
const vuint32m1_t key04 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 0), stride4, 8);
const vuint32m1_t key15 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 1), stride4, 8);
const vuint32m1_t key26 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 2), stride4, 8);
const vuint32m1_t key37 = __riscv_vluxei32_v_u32m1((uint32_t*)(RandomX_CurrentConfig.fillAes4Rx4_Key + 3), stride4, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
const vuint32m1_t zero = {};
vuint32m1_t state02 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 0, stride, 8);
vuint32m1_t state13 = __riscv_vluxei32_v_u32m1((uint32_t*)state + 4, stride, 8);
while (outptr < outputEnd) {
state02 = aesdec_zvkned(state02, key04, zero);
state13 = aesenc_zvkned(state13, key04);
state02 = aesdec_zvkned(state02, key15, zero);
state13 = aesenc_zvkned(state13, key15);
state02 = aesdec_zvkned(state02, key26, zero);
state13 = aesenc_zvkned(state13, key26);
state02 = aesdec_zvkned(state02, key37, zero);
state13 = aesenc_zvkned(state13, key37);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 0, stride, state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)outptr + 4, stride, state13, 8);
outptr += 64;
}
}
void hashAndFillAes1Rx4_zvkned(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state)
{
uint8_t* scratchpadPtr = (uint8_t*)scratchpad;
const uint8_t* scratchpadEnd = scratchpadPtr + scratchpadSize;
vuint32m1_t hash_state02 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE02, 8);
vuint32m1_t hash_state13 = __riscv_vle32_v_u32m1(AES_HASH_1R_STATE13, 8);
const vuint32m1_t key02 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY02, 8);
const vuint32m1_t key13 = __riscv_vle32_v_u32m1(AES_GEN_1R_KEY13, 8);
const vuint32m1_t stride = __riscv_vle32_v_u32m1(AES_HASH_STRIDE_X2, 8);
const vuint32m1_t zero = {};
vuint32m1_t fill_state02 = __riscv_vluxei32_v_u32m1((uint32_t*)fill_state + 0, stride, 8);
vuint32m1_t fill_state13 = __riscv_vluxei32_v_u32m1((uint32_t*)fill_state + 4, stride, 8);
//process 64 bytes at a time in 4 lanes
while (scratchpadPtr < scratchpadEnd) {
hash_state02 = aesenc_zvkned(hash_state02, __riscv_vluxei32_v_u32m1((uint32_t*)scratchpadPtr + 0, stride, 8));
hash_state13 = aesdec_zvkned(hash_state13, __riscv_vluxei32_v_u32m1((uint32_t*)scratchpadPtr + 4, stride, 8), zero);
fill_state02 = aesdec_zvkned(fill_state02, key02, zero);
fill_state13 = aesenc_zvkned(fill_state13, key13);
__riscv_vsuxei32_v_u32m1((uint32_t*)scratchpadPtr + 0, stride, fill_state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)scratchpadPtr + 4, stride, fill_state13, 8);
scratchpadPtr += 64;
}
__riscv_vsuxei32_v_u32m1((uint32_t*)fill_state + 0, stride, fill_state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)fill_state + 4, stride, fill_state13, 8);
//two extra rounds to achieve full diffusion
const vuint32m1_t xkey00 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY00, 8);
const vuint32m1_t xkey11 = __riscv_vle32_v_u32m1(AES_HASH_1R_XKEY11, 8);
hash_state02 = aesenc_zvkned(hash_state02, xkey00);
hash_state13 = aesdec_zvkned(hash_state13, xkey00, zero);
hash_state02 = aesenc_zvkned(hash_state02, xkey11);
hash_state13 = aesdec_zvkned(hash_state13, xkey11, zero);
//output hash
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 0, stride, hash_state02, 8);
__riscv_vsuxei32_v_u32m1((uint32_t*)hash + 4, stride, hash_state13, 8);
}

View File

@@ -0,0 +1,35 @@
/*
Copyright (c) 2025 SChernykh <https://github.com/SChernykh>
Copyright (c) 2025 XMRig <support@xmrig.com>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#pragma once
void hashAes1Rx4_zvkned(const void *input, size_t inputSize, void *hash);
void fillAes1Rx4_zvkned(void *state, size_t outputSize, void *buffer);
void fillAes4Rx4_zvkned(void *state, size_t outputSize, void *buffer);
void hashAndFillAes1Rx4_zvkned(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state);

View File

@@ -0,0 +1,148 @@
/*
Copyright (c) 2018-2019, tevador <tevador@gmail.com>
Copyright (c) 2026 XMRig <support@xmrig.com>
Copyright (c) 2026 SChernykh <https://github.com/SChernykh>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <cstddef>
#include <cstdint>
#include <immintrin.h>
#define REVERSE_4(A, B, C, D) D, C, B, A
alignas(64) static const uint32_t AES_HASH_1R_STATE[] = {
REVERSE_4(0xd7983aad, 0xcc82db47, 0x9fa856de, 0x92b52c0d),
REVERSE_4(0xace78057, 0xf59e125a, 0x15c7b798, 0x338d996e),
REVERSE_4(0xe8a07ce4, 0x5079506b, 0xae62c7d0, 0x6a770017),
REVERSE_4(0x7e994948, 0x79a10005, 0x07ad828d, 0x630a240c)
};
alignas(64) static const uint32_t AES_GEN_1R_KEY[] = {
REVERSE_4(0xb4f44917, 0xdbb5552b, 0x62716609, 0x6daca553),
REVERSE_4(0x0da1dc4e, 0x1725d378, 0x846a710d, 0x6d7caf07),
REVERSE_4(0x3e20e345, 0xf4c0794f, 0x9f947ec6, 0x3f1262f1),
REVERSE_4(0x49169154, 0x16314c88, 0xb1ba317c, 0x6aef8135)
};
alignas(64) static const uint32_t AES_HASH_1R_XKEY0[] = {
REVERSE_4(0x06890201, 0x90dc56bf, 0x8b24949f, 0xf6fa8389),
REVERSE_4(0x06890201, 0x90dc56bf, 0x8b24949f, 0xf6fa8389),
REVERSE_4(0x06890201, 0x90dc56bf, 0x8b24949f, 0xf6fa8389),
REVERSE_4(0x06890201, 0x90dc56bf, 0x8b24949f, 0xf6fa8389)
};
alignas(64) static const uint32_t AES_HASH_1R_XKEY1[] = {
REVERSE_4(0xed18f99b, 0xee1043c6, 0x51f4e03c, 0x61b263d1),
REVERSE_4(0xed18f99b, 0xee1043c6, 0x51f4e03c, 0x61b263d1),
REVERSE_4(0xed18f99b, 0xee1043c6, 0x51f4e03c, 0x61b263d1),
REVERSE_4(0xed18f99b, 0xee1043c6, 0x51f4e03c, 0x61b263d1)
};
void hashAndFillAes1Rx4_VAES512(void *scratchpad, size_t scratchpadSize, void *hash, void* fill_state)
{
uint8_t* scratchpadPtr = (uint8_t*)scratchpad;
const uint8_t* scratchpadEnd = scratchpadPtr + scratchpadSize;
const __m512i fill_key = _mm512_load_si512(AES_GEN_1R_KEY);
const __m512i initial_hash_state = _mm512_load_si512(AES_HASH_1R_STATE);
const __m512i initial_fill_state = _mm512_load_si512(fill_state);
constexpr uint8_t mask = 0b11001100;
// enc_data[0] = hash_state[0]
// enc_data[1] = fill_state[1]
// enc_data[2] = hash_state[2]
// enc_data[3] = fill_state[3]
__m512i enc_data = _mm512_mask_blend_epi64(mask, initial_hash_state, initial_fill_state);
// dec_data[0] = fill_state[0]
// dec_data[1] = hash_state[1]
// dec_data[2] = fill_state[2]
// dec_data[3] = hash_state[3]
__m512i dec_data = _mm512_mask_blend_epi64(mask, initial_fill_state, initial_hash_state);
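A small worked example of the blend mask, assuming the usual _mm512_mask_blend_epi64 semantics (one mask bit per 64-bit element; a set bit selects that element from the second operand). The constant 0b11001100 sets bits 2, 3, 6 and 7, i.e. the 128-bit lanes 1 and 3, which is exactly the interleaving described in the comments above:

//   qword index : 7 6 5 4 3 2 1 0
//   mask bit    : 1 1 0 0 1 1 0 0
//   source      : b b a a b b a a   (a = first operand, b = second operand)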
constexpr int PREFETCH_DISTANCE = 7168;
const uint8_t* prefetchPtr = scratchpadPtr + PREFETCH_DISTANCE;
scratchpadEnd -= PREFETCH_DISTANCE;
for (const uint8_t* p = scratchpadPtr; p < prefetchPtr; p += 256) {
_mm_prefetch((const char*)(p + 0), _MM_HINT_T0);
_mm_prefetch((const char*)(p + 64), _MM_HINT_T0);
_mm_prefetch((const char*)(p + 128), _MM_HINT_T0);
_mm_prefetch((const char*)(p + 192), _MM_HINT_T0);
}
for (int i = 0; i < 2; ++i) {
while (scratchpadPtr < scratchpadEnd) {
const __m512i scratchpad_data = _mm512_load_si512(scratchpadPtr);
// enc_key[0] = scratchpad_data[0]
// enc_key[1] = fill_key[1]
// enc_key[2] = scratchpad_data[2]
// enc_key[3] = fill_key[3]
enc_data = _mm512_aesenc_epi128(enc_data, _mm512_mask_blend_epi64(mask, scratchpad_data, fill_key));
// dec_key[0] = fill_key[0]
// dec_key[1] = scratchpad_data[1]
// dec_key[2] = fill_key[2]
// dec_key[3] = scratchpad_data[3]
dec_data = _mm512_aesdec_epi128(dec_data, _mm512_mask_blend_epi64(mask, fill_key, scratchpad_data));
// fill_state[0] = dec_data[0]
// fill_state[1] = enc_data[1]
// fill_state[2] = dec_data[2]
// fill_state[3] = enc_data[3]
_mm512_store_si512(scratchpadPtr, _mm512_mask_blend_epi64(mask, dec_data, enc_data));
_mm_prefetch((const char*)prefetchPtr, _MM_HINT_T0);
scratchpadPtr += 64;
prefetchPtr += 64;
}
prefetchPtr = (const uint8_t*) scratchpad;
scratchpadEnd += PREFETCH_DISTANCE;
}
_mm512_store_si512(fill_state, _mm512_mask_blend_epi64(mask, dec_data, enc_data));
//two extra rounds to achieve full diffusion
const __m512i xkey0 = _mm512_load_si512(AES_HASH_1R_XKEY0);
const __m512i xkey1 = _mm512_load_si512(AES_HASH_1R_XKEY1);
enc_data = _mm512_aesenc_epi128(enc_data, xkey0);
dec_data = _mm512_aesdec_epi128(dec_data, xkey0);
enc_data = _mm512_aesenc_epi128(enc_data, xkey1);
dec_data = _mm512_aesdec_epi128(dec_data, xkey1);
//output hash
_mm512_store_si512(hash, _mm512_mask_blend_epi64(mask, enc_data, dec_data));
// Just in case
_mm256_zeroupper();
}

View File

@@ -1,5 +1,5 @@
;# save VM register values
add rsp, 40
add rsp, 248
pop rcx
mov qword ptr [rcx+0], r8
mov qword ptr [rcx+8], r9

View File

@@ -0,0 +1,30 @@
mov rcx, [rsp+24]
mov qword ptr [rcx+0], r8
mov qword ptr [rcx+8], r9
mov qword ptr [rcx+16], r10
mov qword ptr [rcx+24], r11
mov qword ptr [rcx+32], r12
mov qword ptr [rcx+40], r13
mov qword ptr [rcx+48], r14
mov qword ptr [rcx+56], r15
mov rcx, [rsp+16]
aesenc xmm0, xmm4
aesdec xmm1, xmm4
aesenc xmm2, xmm4
aesdec xmm3, xmm4
aesenc xmm0, xmm5
aesdec xmm1, xmm5
aesenc xmm2, xmm5
aesdec xmm3, xmm5
aesenc xmm0, xmm6
aesdec xmm1, xmm6
aesenc xmm2, xmm6
aesdec xmm3, xmm6
aesenc xmm0, xmm7
aesdec xmm1, xmm7
aesenc xmm2, xmm7
aesdec xmm3, xmm7
movapd xmmword ptr [rcx+0], xmm0
movapd xmmword ptr [rcx+16], xmm1
movapd xmmword ptr [rcx+32], xmm2
movapd xmmword ptr [rcx+48], xmm3

View File

@@ -0,0 +1,196 @@
mov rcx, [rsp+24]
mov qword ptr [rcx+0], r8
mov qword ptr [rcx+8], r9
mov qword ptr [rcx+16], r10
mov qword ptr [rcx+24], r11
mov qword ptr [rcx+32], r12
mov qword ptr [rcx+40], r13
mov qword ptr [rcx+48], r14
mov qword ptr [rcx+56], r15
movapd xmmword ptr [rsp+40], xmm0
movapd xmmword ptr [rsp+56], xmm1
movapd xmmword ptr [rsp+72], xmm2
movapd xmmword ptr [rsp+88], xmm3
movapd xmmword ptr [rsp+104], xmm4
movapd xmmword ptr [rsp+120], xmm5
movapd xmmword ptr [rsp+136], xmm6
movapd xmmword ptr [rsp+152], xmm7
mov [rsp+168], rax
mov [rsp+176], rbx
mov [rsp+184], rdx
mov [rsp+192], rsi
mov [rsp+200], rdi
mov [rsp+208], rbp
mov [rsp+216], r8
mov [rsp+224], r9
mov r8, [rsp+232] ;# aes_lut_enc
mov r9, [rsp+240] ;# aes_lut_dec
movapd xmm12, xmmword ptr [rsp-8] ;# "call" will overwrite IMUL_RCP's data on stack, so save it
lea rsi, [rsp+104]
lea rdi, [rsp+40]
call soft_aes_enc
lea rdi, [rsp+56]
call soft_aes_dec
lea rdi, [rsp+72]
call soft_aes_enc
lea rdi, [rsp+88]
call soft_aes_dec
lea rsi, [rsp+120]
lea rdi, [rsp+40]
call soft_aes_enc
lea rdi, [rsp+56]
call soft_aes_dec
lea rdi, [rsp+72]
call soft_aes_enc
lea rdi, [rsp+88]
call soft_aes_dec
lea rsi, [rsp+136]
lea rdi, [rsp+40]
call soft_aes_enc
lea rdi, [rsp+56]
call soft_aes_dec
lea rdi, [rsp+72]
call soft_aes_enc
lea rdi, [rsp+88]
call soft_aes_dec
lea rsi, [rsp+152]
lea rdi, [rsp+40]
call soft_aes_enc
lea rdi, [rsp+56]
call soft_aes_dec
lea rdi, [rsp+72]
call soft_aes_enc
lea rdi, [rsp+88]
call soft_aes_dec
movapd xmmword ptr [rsp-8], xmm12
jmp soft_aes_end
soft_aes_enc:
mov eax, dword ptr [rsi+0]
mov ebx, dword ptr [rsi+4]
mov ecx, dword ptr [rsi+8]
mov edx, dword ptr [rsi+12]
movzx ebp, byte ptr [rdi+0]
xor eax, dword ptr [r8+rbp*4]
movzx ebp, byte ptr [rdi+1]
xor edx, dword ptr [r8+rbp*4+1024]
movzx ebp, byte ptr [rdi+2]
xor ecx, dword ptr [r8+rbp*4+2048]
movzx ebp, byte ptr [rdi+3]
xor ebx, dword ptr [r8+rbp*4+3072]
movzx ebp, byte ptr [rdi+4]
xor ebx, dword ptr [r8+rbp*4]
movzx ebp, byte ptr [rdi+5]
xor eax, dword ptr [r8+rbp*4+1024]
movzx ebp, byte ptr [rdi+6]
xor edx, dword ptr [r8+rbp*4+2048]
movzx ebp, byte ptr [rdi+7]
xor ecx, dword ptr [r8+rbp*4+3072]
movzx ebp, byte ptr [rdi+8]
xor ecx, dword ptr [r8+rbp*4]
movzx ebp, byte ptr [rdi+9]
xor ebx, dword ptr [r8+rbp*4+1024]
movzx ebp, byte ptr [rdi+10]
xor eax, dword ptr [r8+rbp*4+2048]
movzx ebp, byte ptr [rdi+11]
xor edx, dword ptr [r8+rbp*4+3072]
movzx ebp, byte ptr [rdi+12]
xor edx, dword ptr [r8+rbp*4]
movzx ebp, byte ptr [rdi+13]
xor ecx, dword ptr [r8+rbp*4+1024]
movzx ebp, byte ptr [rdi+14]
xor ebx, dword ptr [r8+rbp*4+2048]
movzx ebp, byte ptr [rdi+15]
xor eax, dword ptr [r8+rbp*4+3072]
mov dword ptr [rdi+0], eax
mov dword ptr [rdi+4], ebx
mov dword ptr [rdi+8], ecx
mov dword ptr [rdi+12], edx
ret
soft_aes_dec:
mov eax, dword ptr [rsi+0]
mov ebx, dword ptr [rsi+4]
mov ecx, dword ptr [rsi+8]
mov edx, dword ptr [rsi+12]
movzx ebp, byte ptr [rdi+0]
xor eax, dword ptr [r9+rbp*4]
movzx ebp, byte ptr [rdi+1]
xor ebx, dword ptr [r9+rbp*4+1024]
movzx ebp, byte ptr [rdi+2]
xor ecx, dword ptr [r9+rbp*4+2048]
movzx ebp, byte ptr [rdi+3]
xor edx, dword ptr [r9+rbp*4+3072]
movzx ebp, byte ptr [rdi+4]
xor ebx, dword ptr [r9+rbp*4]
movzx ebp, byte ptr [rdi+5]
xor ecx, dword ptr [r9+rbp*4+1024]
movzx ebp, byte ptr [rdi+6]
xor edx, dword ptr [r9+rbp*4+2048]
movzx ebp, byte ptr [rdi+7]
xor eax, dword ptr [r9+rbp*4+3072]
movzx ebp, byte ptr [rdi+8]
xor ecx, dword ptr [r9+rbp*4]
movzx ebp, byte ptr [rdi+9]
xor edx, dword ptr [r9+rbp*4+1024]
movzx ebp, byte ptr [rdi+10]
xor eax, dword ptr [r9+rbp*4+2048]
movzx ebp, byte ptr [rdi+11]
xor ebx, dword ptr [r9+rbp*4+3072]
movzx ebp, byte ptr [rdi+12]
xor edx, dword ptr [r9+rbp*4]
movzx ebp, byte ptr [rdi+13]
xor eax, dword ptr [r9+rbp*4+1024]
movzx ebp, byte ptr [rdi+14]
xor ebx, dword ptr [r9+rbp*4+2048]
movzx ebp, byte ptr [rdi+15]
xor ecx, dword ptr [r9+rbp*4+3072]
mov dword ptr [rdi+0], eax
mov dword ptr [rdi+4], ebx
mov dword ptr [rdi+8], ecx
mov dword ptr [rdi+12], edx
ret
soft_aes_end:
mov rax, [rsp+168]
mov rbx, [rsp+176]
mov rcx, [rsp+16]
mov rdx, [rsp+184]
mov rsi, [rsp+192]
mov rdi, [rsp+200]
mov rbp, [rsp+208]
mov r8, [rsp+216]
mov r9, [rsp+224]
movapd xmm0, xmmword ptr [rsp+40]
movapd xmm1, xmmword ptr [rsp+56]
movapd xmm2, xmmword ptr [rsp+72]
movapd xmm3, xmmword ptr [rsp+88]
movapd xmmword ptr [rcx+0], xmm0
movapd xmmword ptr [rcx+16], xmm1
movapd xmmword ptr [rcx+32], xmm2
movapd xmmword ptr [rcx+48], xmm3

View File

@@ -13,12 +13,6 @@
mov rbp, qword ptr [rsi] ;# "mx", "ma"
mov rdi, qword ptr [rsi+8] ;# uint8_t* dataset
;# dataset prefetch for the first iteration of the main loop
mov rax, rbp
shr rax, 32
and eax, RANDOMX_DATASET_BASE_MASK
prefetchnta byte ptr [rdi+rax]
mov rsi, rdx ;# uint8_t* scratchpad
mov rax, rbp

View File

@@ -25,12 +25,6 @@
mov rbp, qword ptr [rdx] ;# "mx", "ma"
mov rdi, qword ptr [rdx+8] ;# uint8_t* dataset
;# dataset prefetch for the first iteration of the main loop
mov rax, rbp
shr rax, 32
and eax, RANDOMX_DATASET_BASE_MASK
prefetchnta byte ptr [rdi+rax]
mov rsi, r8 ;# uint8_t* scratchpad
mov rbx, r9 ;# loop counter

View File

@@ -0,0 +1,16 @@
mov ecx, ebp ;# ecx = ma
and ecx, RANDOMX_DATASET_BASE_MASK
xor r8, qword ptr [rdi+rcx]
xor rbp, rax ;# modify "ma"
mov edx, ebp ;# edx = "ma"
ror rbp, 32 ;# swap "ma" and "mx"
and edx, RANDOMX_DATASET_BASE_MASK
prefetchnta byte ptr [rdi+rdx]
xor r9, qword ptr [rdi+rcx+8]
xor r10, qword ptr [rdi+rcx+16]
xor r11, qword ptr [rdi+rcx+24]
xor r12, qword ptr [rdi+rcx+32]
xor r13, qword ptr [rdi+rcx+40]
xor r14, qword ptr [rdi+rcx+48]
xor r15, qword ptr [rdi+rcx+56]
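A C-level sketch of the block above: the eight integer registers r8..r15 are XORed with one 64-byte dataset item at offset (ma & RANDOMX_DATASET_BASE_MASK), while the address of the next item is computed and prefetched with prefetchnta. Names below are illustrative, and the mask constant is assumed to be visible under the same name:

// Illustrative sketch only (rdi = dataset base, r[] = the eight integer registers).
#include <cstdint>
#include <cstring>

static inline void dataset_read_sketch(uint64_t r[8], const uint8_t *dataset, uint32_t ma)
{
    const uint8_t *item = dataset + (ma & RANDOMX_DATASET_BASE_MASK);
    for (int i = 0; i < 8; ++i) {
        uint64_t v;
        std::memcpy(&v, item + i * 8, sizeof(v));
        r[i] ^= v; // one 64-byte dataset item mixed into the registers
    }
    // the asm above also advances "ma", swaps it with "mx", and prefetches the next item
}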

View File

@@ -225,7 +225,10 @@ namespace randomx {
}
static void exe_CFROUND(RANDOMX_EXE_ARGS) {
rx_set_rounding_mode(rotr64(*ibc.isrc, static_cast<uint32_t>(ibc.imm)) % 4);
uint64_t isrc = rotr64(*ibc.isrc, ibc.imm);
if (!RandomX_CurrentConfig.Tweak_V2_CFROUND || ((isrc & 60) == 0)) {
rx_set_rounding_mode(isrc % 4);
}
}
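A small check of the gating above: 60 is 0b00111100, so (isrc & 60) == 0 requires bits 2 through 5 of the rotated source to be clear; for uniformly distributed values that appears to let CFROUND actually change the rounding mode in only 1 out of 16 cases when the v2 tweak is enabled. Illustration only:

// Illustrative check, not XMRig code.
#include <cstdint>
#include <cstdio>

int cfround_mask_check()
{
    int hits = 0;
    for (uint32_t v = 0; v < 64; ++v) {
        if ((v & 60u) == 0) {
            ++hits; // v in {0, 1, 2, 3}
        }
    }
    std::printf("%d of 64 low-6-bit patterns pass -> 1/16\n", hits);
    return 0;
}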
static void exe_ISTORE(RANDOMX_EXE_ARGS) {

View File

@@ -111,6 +111,10 @@ namespace randomx {
#define RANDOMX_HAVE_COMPILER 1
class JitCompilerA64;
using JitCompiler = JitCompilerA64;
#elif defined(__riscv) && defined(__riscv_xlen) && (__riscv_xlen == 64)
#define RANDOMX_HAVE_COMPILER 1
class JitCompilerRV64;
using JitCompiler = JitCompilerRV64;
#else
#define RANDOMX_HAVE_COMPILER 0
class JitCompilerFallback;

View File

@@ -41,7 +41,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#define RANDOMX_DATASET_MAX_SIZE 2181038080
// Increase it if some configs use larger programs
#define RANDOMX_PROGRAM_MAX_SIZE 280
#define RANDOMX_PROGRAM_MAX_SIZE 384
// Increase it if some configs use larger scratchpad
#define RANDOMX_SCRATCHPAD_L3_MAX_SIZE 2097152

View File

@@ -174,7 +174,7 @@ FORCE_INLINE void rx_set_rounding_mode(uint32_t mode) {
_mm_setcsr(rx_mxcsr_default | (mode << 13));
}
#elif defined(__PPC64__) && defined(__ALTIVEC__) && defined(__VSX__) //sadly only POWER7 and newer will be able to use SIMD acceleration. Earlier processors cant use doubles or 64 bit integers with SIMD
#elif defined(__PPC64__) && defined(__ALTIVEC__) && defined(__VSX__) //sadly only POWER7 and newer will be able to use SIMD acceleration. Earlier processors can't use doubles or 64 bit integers with SIMD
#include <cstdint>
#include <stdexcept>
#include <cstdlib>
@@ -577,8 +577,13 @@ inline void* rx_aligned_alloc(size_t size, size_t align) {
# define rx_aligned_free(a) free(a)
#endif
#ifdef __GNUC__
#define rx_prefetch_nta(x) __builtin_prefetch((x), 0, 0)
#define rx_prefetch_t0(x) __builtin_prefetch((x), 0, 3)
#else
#define rx_prefetch_nta(x)
#define rx_prefetch_t0(x)
#endif
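For context on the macros above: GCC/Clang's __builtin_prefetch takes the address, a read/write flag, and a temporal-locality hint from 0 to 3, where 0 means no temporal locality (non-temporal prefetch, prefetchnta on x86) and 3 means keep the line in all cache levels (prefetcht0 on x86), which is presumably why the two macros pass 0 and 3. A minimal usage sketch, assuming intrin_portable.h is included:

// Illustrative only.
#include <cstddef>
#include <cstdint>

static uint64_t sum_with_prefetch(const uint64_t *data, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i + 64 < n) {
            rx_prefetch_t0(data + i + 64); // pull a future cache line in early
        }
        sum += data[i];
    }
    return sum;
}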
FORCE_INLINE rx_vec_f128 rx_load_vec_f128(const double* pd) {
rx_vec_f128 x;

View File

@@ -32,6 +32,8 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include "crypto/randomx/jit_compiler_x86.hpp"
#elif defined(__aarch64__)
#include "crypto/randomx/jit_compiler_a64.hpp"
#elif defined(__riscv) && defined(__riscv_xlen) && (__riscv_xlen == 64)
#include "crypto/randomx/jit_compiler_rv64.hpp"
#else
#include "crypto/randomx/jit_compiler_fallback.hpp"
#endif

Some files were not shown because too many files have changed in this diff.