
Add risc-v vector extension tests #1003


Merged (19 commits, Jul 11, 2025)

Conversation

nadime15
Contributor

@nadime15 nadime15 commented Jun 4, 2025

This commit adds a submodule containing precompiled tests (around 1800) for the RISC-V vector extension and integrates them into the CMake testing workflow.

Due to their long runtime, these tests are executed weekly via scheduled CI runs or can be triggered manually. Developers can also run these tests locally by enabling the RVV_TESTS flag during the CMake configuration.
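For reference, running them locally would look roughly like this (a sketch only, assuming the RVV_TESTS option above; the build directory and ctest flags are just examples):

# configure with the vector tests enabled, then build and run them via ctest
cmake -S . -B build -DRVV_TESTS=ON
cmake --build build --parallel
ctest --test-dir build --output-on-failure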

These tests were compiled with:
https://github.com/chipsalliance/riscv-vector-tests/tree/c093610

and with the following ISA options:
--isa=rv64gcv_zvl128b_zve64d_zfh_zfhmin_zvfh

Configured with:
VLEN=128 and ELEN=64

This is still a work in progress, but I would appreciate early feedback and would like to discuss a few points. For now, I have hosted the binaries myself, but the idea is to move them to an official RISC-V repository soon.

Discussion points:

  1. Should we rename the binaries to include a .elf suffix, similar to what we do for riscv-tests?
  2. Should we also generate disassembly dumps (*.dump) like in riscv-tests?
  3. I set the default VLEN in config.json to 128 (7) to avoid maintaining multiple JSON files.
  4. Personally, I don't think it makes sense to run these tests on every push, so I added a weekly cron job instead. They can be manually triggered by developers or by us in case someone pushes changes that affect the vector extension.
  5. I excluded the tests for the vector crypto extension because I thought it would save much more time, but in the end I might have only saved 2-3 minutes. I think it might make sense to add them back to the tests?
  6. I reuse existing workflows, which make no use of caches etc. Should I create a whole separate workflow, or are people OK with reusing the existing one?

The whole workflow takes around 40 minutes; here is an example:
https://github.com/nadime15/sail-riscv/actions/runs/15445714226/job/43474657958


github-actions bot commented Jun 4, 2025

Test Results

2 099 tests  +1 697   2 099 ✅ +1 697   17m 49s ⏱️ +14m 55s
    1 suite  ±    0       0 💤 ±    0
    1 file   ±    0       0 ❌ ±    0

Results for commit b8e56c2. ± Comparison against base commit 63661df.

This pull request removes 392 and adds 2089 tests. Note that renamed tests count towards both.
rv32d_rv32mi-p-breakpoint.elf ‑ rv32d_rv32mi-p-breakpoint.elf
rv32d_rv32mi-p-csr.elf ‑ rv32d_rv32mi-p-csr.elf
rv32d_rv32mi-p-illegal.elf ‑ rv32d_rv32mi-p-illegal.elf
rv32d_rv32mi-p-ma_addr.elf ‑ rv32d_rv32mi-p-ma_addr.elf
rv32d_rv32mi-p-ma_fetch.elf ‑ rv32d_rv32mi-p-ma_fetch.elf
rv32d_rv32mi-p-mcsr.elf ‑ rv32d_rv32mi-p-mcsr.elf
rv32d_rv32mi-p-sbreak.elf ‑ rv32d_rv32mi-p-sbreak.elf
rv32d_rv32mi-p-scall.elf ‑ rv32d_rv32mi-p-scall.elf
rv32d_rv32mi-p-shamt.elf ‑ rv32d_rv32mi-p-shamt.elf
rv32d_rv32si-p-csr.elf ‑ rv32d_rv32si-p-csr.elf
…
rv32d_2025-06-22/riscv-tests/rv32mi-p-breakpoint ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-breakpoint
rv32d_2025-06-22/riscv-tests/rv32mi-p-csr ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-csr
rv32d_2025-06-22/riscv-tests/rv32mi-p-illegal ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-illegal
rv32d_2025-06-22/riscv-tests/rv32mi-p-instret_overflow ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-instret_overflow
rv32d_2025-06-22/riscv-tests/rv32mi-p-lh-misaligned ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-lh-misaligned
rv32d_2025-06-22/riscv-tests/rv32mi-p-lw-misaligned ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-lw-misaligned
rv32d_2025-06-22/riscv-tests/rv32mi-p-ma_addr ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-ma_addr
rv32d_2025-06-22/riscv-tests/rv32mi-p-ma_fetch ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-ma_fetch
rv32d_2025-06-22/riscv-tests/rv32mi-p-mcsr ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-mcsr
rv32d_2025-06-22/riscv-tests/rv32mi-p-pmpaddr ‑ rv32d_2025-06-22/riscv-tests/rv32mi-p-pmpaddr
…

♻️ This comment has been updated with latest results.

@arichardson
Collaborator

Since this is already set up as a weekly run, would it be possible to build it as part of the CI flow? Using something like ExternalProject_Add() with a fixed git commit hash?

@nadime15
Contributor Author

nadime15 commented Jun 4, 2025

I think we decided to add the tests as a git submodule instead of using an external project. I purposely didn’t pin it to a specific commit hash so it tracks master. But honestly, I don't mind either way.

@Alasdair
Collaborator

Alasdair commented Jun 4, 2025

I think the idea was we could create a separate repository that would pull these tests in as a submodule, then that repository would have a weekly job that would build the tests how we need them. That job could upload them as a release that we could pull into a job on this repo to test more regularly.

@arichardson
Collaborator

I think the idea was we could create a separate repository that would pull these tests in as a submodule, then that repository would have a weekly job that would build the tests how we need them. That job could upload them as a release that we could pull into a job on this repo to test more regularly.

That's also fine. Since git is not a great way of storing binaries, I'd much prefer having it stored as a github release.

@nadime15
Contributor Author

nadime15 commented Jun 4, 2025

That's also fine. Since git is not a great way of storing binaries, I'd much prefer having it stored as a github release.

Sure sounds good!

I think the idea was we could create a separate repository that would pull these tests in as a submodule, then that repository would have a weekly job that would build the tests how we need them. That job could upload them as a release that we could pull into a job on this repo to test more regularly.

I think I don't understand the submodule part then. What do you mean by "...a separate repository that would pull these tests in as a submodule, then that repository would have a weekly job that would build the tests how we need them."?

The tests are all auto-generated by a script and not manually written, so what exactly do you want to pull in as a submodule and then build?

From what I understand, we create a separate repository that builds different versions of the vector tests using the vector test suite (with varying VLEN, XLEN, etc.) and publishes weekly releases (probably monthly rather than weekly, since the RVV test suite basically never changes).

And finally, we use these releases as part of a CI job in this project.

@pmundkur
Collaborator

pmundkur commented Jun 4, 2025

The separate repo will pull in riscv-tests, riscv-vector-tests and perhaps other test repos as submodules.

@nadime15
Contributor Author

nadime15 commented Jun 4, 2025

Couldn't I just install them via GitHub Actions and then build and release them? What's the advantage of having them as submodules?

@arichardson
Collaborator

Couldn't I just install them via GitHub Actions and then build and release them? What's the advantage of having them as submodules?

I agree there is no need to have them as submodules. In my experience submodules just create lots of pain.

@jordancarlin
Collaborator

The benefit of submodules in the testing repo instead of just cloning with GitHub Actions during the CI jobs is that it pins the version of the tests we are using to a specific commit. We probably don't want the tests updating randomly because that could introduce unexpected failures. We're better off controlling when we update the sources.

Technically we could pin the version the CI job installs, but if we do it with submodules we could use something like Dependabot to automatically open a PR when there is an updated version of any of the repos available.

@pmundkur pmundkur added the tgmm-agenda label (Tagged for the next Golden Model meeting agenda) Jun 6, 2025
@pmundkur
Collaborator

pmundkur commented Jun 6, 2025

Discussion points:

1. Should we rename the binaries to include a .elf suffix, similar to what we do for riscv-tests?

Not sure why those had an .elf suffix; perhaps to make them easier to glob from a script? As long as the files are easily listed in CMake or a script, it shouldn't matter.

2. Should we also generate disassembly dumps (*.dump) like in riscv-tests?

That's not needed.

3. I set the default VLEN in config.json to 128 (7) to avoid maintaining multiple JSON files.

The repository should have a script to generate the appropriate configs, and a script/CMakeLists to run the simulator (or simulators until we have a unified build) with the appropriate configs for the tests.

4. Personally, I don't think it makes sense to run these tests on every push, so I added a weekly cron job instead. They can be manually triggered by developers or by us in case someone pushes changes that affect the vector extension.

Perhaps a daily job is also ok for the vector tests, depending on how long they take. A week might be too long.

5. I excluded the tests for the vector crypto extension because I thought it would save much more time, but in the end I might have only saved 2-3 minutes. I think it might make sense to add them back to the tests?

Agreed, it makes sense to add back.

6. I reuse existing workflows, which make no use of caches etc. Should I create a whole separate workflow, or are people OK with reusing the existing one?

The whole workflow takes around 40 minutes; here is an example: https://github.com/nadime15/sail-riscv/actions/runs/15445714226/job/43474657958

Don't know enough to comment.

@pmundkur
Collaborator

pmundkur commented Jun 6, 2025

@jordancarlin
Collaborator

If the whole workflow only takes 40 minutes, I'd be tempted to have it run on all PRs. The Lean workflow is 30-45+ minutes (depending on the cache), so it wouldn't even extend the total CI time for a PR.

@nadime15 nadime15 force-pushed the add_rvv_test branch 2 times, most recently from 0ec93b2 to 2bc4fe5 on June 8, 2025 at 22:32
@nadime15
Contributor Author

nadime15 commented Jun 8, 2025

I have reworked the flow, made a few changes, and followed @Alasdair's description. (It still needs some work, like clearer step names and general cleanup.) I think it doesn't really make sense to recompile the tests weekly, so instead I followed @jordancarlin's idea to use Dependabot. I am not fully there yet, but I have at least enabled Dependabot. It will now monitor changes in the underlying submodules and open a pull request if one of them updates (so far, riscv-tests and riscv-vector-tests).

A feature I am working on, but have not fully implemented, is to auto-merge Dependabot's PRs if an additional workflow successfully recompiles all tests with the updated commit. If that passes, we automatically merge the PR (which updates the submodule hash) and create a new release based on the latest riscv-tests or riscv-vector-tests commit.

I created a new repository that compiles tests from riscv-tests and riscv-vector-tests for VLEN = [512, 256, 128] and ELEN = [64, 32]. Thanks to parallel execution, this works pretty well (it takes around 3h in total). See the screenshot:

[Screenshot from 2025-06-08 18:35]

I needed to upload all binaries as separate chunks in the release because bundling them into a single tarball exceeds GitHub's 2GB per file limit.
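The chunking itself is nothing fancy; roughly this (file names are placeholders):

# split the bundled tests into chunks below GitHub's 2GB per-file release limit
tar czf vector-tests.tar.gz tests/
split -b 1900M -d vector-tests.tar.gz vector-tests.tar.gz.part
# consumers reassemble and unpack
cat vector-tests.tar.gz.part* > vector-tests.tar.gz
tar xzf vector-tests.tar.gz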

This commit replaces the old workflow I had with a new one that always fetches the latest release and runs (in theory) the full test suite. For now, to save time, I'm only testing riscv-vector-tests with VLEN=512 and ELEN=64.

Here is the link to one workflow, which adds about 2,500 tests.

I was wondering how to handle different VLEN and ELEN values. Should I dynamically create new config files directly in the workflow with the corresponding parameters? We could also simply use the default config file and stick to VLEN = 512 and ELEN = 64 without testing different VLEN/ELEN values.

Edit: I just noticed that the workflow has failed due to:
System.IO.IOException: No space left on device : '/home/runner/runners/2.325.0/_diag/Worker_20250608-221729-utc.log' Unhandled exception. System.IO.IOException: No space left on device : ...

It stopped at around 687/3097 tests.

@jordancarlin
Collaborator

This is definitely moving in the right direction. A few thoughts/comments:

I needed to upload all binaries as separate chunks in the release because bundling them into a single tarball exceeds GitHub's 2GB per file limit.

I think it makes more sense to upload each suite as a separate release artifact. So one tarball for riscv-tests and one for each of the riscv-vector-test variants that you are compiling. Then we can have a matrix workflow in the sail repo where a separate job runs each set of tests to cut down the runtime. It should also help avoid file size issues.

I was wondering how to handle different VLEN and ELEN values. Should I dynamically create new config files directly in the workflow with the corresponding parameters? We could also simply use the default config file and stick to VLEN = 512 and ELEN = 64 without testing different VLEN/ELEN values.

What about using jq to modify VLEN/ELEN in the config file directly in the workflow? Seems like the simplest solution. I think testing multiple VLEN/ELEN combinations has value.
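Something along these lines (the key names and paths here are placeholders; the real config.json layout may differ):

# patch VLEN/ELEN in a copy of the default config for one matrix entry
jq '.vlen = 256 | .elen = 64' config/default.json > config_vlen256_elen64.json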

Edit: I just noticed that the workflow has failed due to:
System.IO.IOException: No space left on device : '/home/runner/runners/2.325.0/_diag/Worker_20250608-221729-utc.log' Unhandled exception. System.IO.IOException: No space left on device : ...

It stopped at around 687/3097 tests.

I have a script that I've been using in other GitHub CI jobs that increases the available space on the runner from ~24GB to ~61GB by deleting lots of unneeded things. I think that should be plenty of space for this. https://github.com/openhwgroup/cvw/blob/main/.github/scripts/cli-space-cleanup.sh
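It boils down to deleting large preinstalled toolchains the job doesn't need, roughly along these lines (the paths are the usual suspects on ubuntu-latest runners, not a copy of the linked script):

# free up runner disk space by removing unneeded preinstalled software
sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc
sudo docker image prune --all --force
df -h /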

@nadime15
Contributor Author

nadime15 commented Jun 9, 2025

I think it makes more sense to upload each suite as a separate release artifact. So one tarball for riscv-tests and one for each of the riscv-vector-test variants that you are compiling. Then we can have a matrix workflow in the sail repo where a separate job runs each set of tests to cut down the runtime. It should also help avoid file size issues.

Yeah, this would kinda work too; the only issue is that once you set VLEN ≥ 1024 (which we might do at some point), you hit the same problem again. No idea if that really matters or if people even care about those cases, but I think it is worth mentioning.

I was wondering whether we could have one job that grabs all the parts, reconstructs them into a single tarball, unpacks it, uploads the artifact, and then launches multiple jobs. It's basically what you're describing, just with a slightly different starting point.

What about using jq to modify VLEN/ELEN in the config file directly in the workflow? Seems like the simplest solution. I think testing multiple VLEN/ELEN combinations has value.

Ahh ok, I was not familiar with that tool, makes sense!

I have a script that I've been using in other GitHub CI jobs that increases the available space on the runner from ~24GB to ~61GB by deleting lots of unneeded things. I think that should be plenty of space for this. https://github.com/openhwgroup/cvw/blob/main/.github/scripts/cli-space-cleanup.sh

That is helpful! Not 100% sure, but I think ctest (?) stores logs for each test, and that might be why I am running out of space. The vector test logs are insanely long, so it might make sense to disable logging if that is really what is causing it.

@nadime15
Contributor Author

I have updated a few things as @jordancarlin suggested. Each test set now gets its own tarball, and the new workflow pulls them all in parallel and runs them:

See:
[Screenshot from 2025-06-10 23:06]

and one example run:
[Screenshot from 2025-06-10 23:07]

As discussed on Monday, I also added the option to run the tests locally. By default, it fetches riscv-tests and riscv-vector-tests. It checks VLEN and ELEN dynamically and downloads the matching test set if the values are valid.

One thought: it might be worth adding a CMake option to enable/disable running the vector tests locally (off by default). The default config (VLEN=512, ELEN=64) takes over 3h, while something like VLEN=128, ELEN=32 finishes in ~45min. For GitHub workflows on push, we could tweak VLEN/ELEN on the fly with jq if needed.

Collaborator

@Timmmm Timmmm left a comment


I feel like there must be a better way to handle testing multiple vlen/elen. This way seems very awkward.

I think the best option would be to have each VLEN/ELEN as an option. Something like this:

foreach (vlen IN ITEMS 128 256 512)
  foreach (elen IN ITEMS 32 64)
    option(ENABLE_RISCV_VECTOR_TESTS_V${vlen}_E${elen} "Enable the riscv-vector-tests with vlen=${vlen}, elen=${elen}")
    if (ENABLE_RISCV_VECTOR_TESTS_V${vlen}_E${elen})
      download_riscv_tests(...)
      file(GLOB elfs_rv32d ...)
      file(GLOB elfs_rv64d ...)

      foreach (arch IN ITEMS "rv32d" "rv64d")
        add_custom_command(... call `jq` to generate config JSON ...)

        foreach (elf IN LISTS elfs_${arch})
          file(RELATIVE_PATH elf_name "${CMAKE_CURRENT_SOURCE_DIR}" ${elf})

          add_test(
            NAME "${arch}_${elf_name}"
            COMMAND
              $<TARGET_FILE:riscv_sim_${arch}>
              --config "${CMAKE_CURRENT_BINARY_DIR}/config_${arch}_${vlen}_${elen}.json"
              ${elf}
          )
        endforeach ()
      endforeach ()
    endif ()
  endforeach ()
endforeach ()

Then I think you don't even need to change CI at all; just enable those CMake options.
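Locally (or in CI) that would then just be something like this (the option name is taken from the sketch above; the rest is assumed):

# opt in to one VLEN/ELEN combination, build, and run the tests
cmake -S . -B build -DENABLE_RISCV_VECTOR_TESTS_V128_E64=ON
cmake --build build --parallel
ctest --test-dir build -j "$(nproc)"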

@nadime15
Contributor Author

About your change: I like it. The only thing I am unsure about is using jq, since not everyone has it installed. And even if they do, they might have a version that does not support certain features (mine, jq-1.6, can't do log2).

@Timmmm
Collaborator

Timmmm commented Jun 24, 2025

Yeah, and now that I'm thinking about it, our JSON format supports comments, which jq presumably doesn't handle. We don't have any comments yet, but it would be nice to have them.

Maybe we should instead use configure_file() with a template, like @arichardson was suggesting for #870.

@nadime15
Contributor Author

nadime15 commented Jul 4, 2025

@Timmmm I gave this another try and incorporated all your suggestions. I think the code is cleaner now and easier to understand.

I explicitly disabled tracing for Sail (I assumed the --quiet option for ctest would stop ctest from logging the simulator's output, but it does not). Disabling it had a big impact and saved a lot of time, and probably resources too; the tests now run without any errors.

I have only tried one run so far, but I’ll rerun it to be sure. Here is the workflow.

Updated RISC-V Vector Test Runtimes (New without logging vs Previous)

Job                               New Time     Prev Time
run-riscv-vector-tests (128, 32)  23m 27s      40m 10s
run-riscv-vector-tests (128, 64)  28m 33s      45m 36s
run-riscv-vector-tests (256, 32)  41m 28s      1h 13m 19s
run-riscv-vector-tests (256, 64)  50m 08s      1h 22m 31s
run-riscv-vector-tests (512, 32)  1h 31m 38s   2h 43m 56s
run-riscv-vector-tests (512, 64)  1h 58m 02s   3h 16m 15s
run-riscv-tests                   10m 13s      16m 44s
Total                             6h 3m 29s    10h 18m 31s

Collaborator

@Timmmm Timmmm left a comment


LGTM - fantastic improvement!

Collaborator

@jordancarlin jordancarlin left a comment


One minor suggestion, but LGTM! This will be a great improvement.

@pmundkur pmundkur added the "will be merged" label (Scheduled to be merged in a few days if nobody objects) and removed the tgmm-agenda label Jul 11, 2025
@pmundkur pmundkur added this pull request to the merge queue Jul 11, 2025
Merged via the queue into riscv:master with commit 5487fa7 Jul 11, 2025
7 checks passed
@Timmmm
Collaborator

Timmmm commented Jul 14, 2025

Commit message should have been edited before merging this. :-/
