Integration testing harness #2359

dead10ck · 2022-05-01T13:42:03Z

I was already working on an integration testing branch in the background for a little while, and I thought since this is such a critical code path, it warrants introducing the integration testing harness at the same time as #2267. This is very much a "just getting the ball rolling" approach, and anything is up for discussion. I fully expect the integration test framework to change over time.

Some things to note about this implementation:

In order to facilitate easier testing, I moved the toml config parsing back into the main function. This undoes the move that was done in Add support for local language configuration #1249.
The integration tests require shutting off actual rendering, so to do this, a feature branch was introduced to disable rendering for the integration tests. Tests are run by doing cargo test --features integration
We need some way to run the Application with some starting document and input keys, and then inspect the internal state while the Application is still running. We also need to run the actual main event loop in order to handle more things than just the key press event—we need to drive document writes, config changes, and eventually LSP messages, if and when any tests are added for these things. The problem is the main event loop doesn't exit until all documents are closed, and then it discards a lot of internal state.

So in order to inspect internal state, I first just did this with a simple timeout, but it quickly became clear that this is problematic, as the amount of time that a test finishes can differ dramatically, and it can lead to non-determinism: if the host system becomes busy for some reason, tests fail because they time out too soon, even when there isn't a problem. Additionally, we can't just use very long timeouts as this would artificially make it take a very long time for each integration test to finish, even if the actual work they did was done very quickly.

So I came up with the idea that we could piggy-back on the idle timer: since the event loop is using a biased event loop, and we submit all key strokes ahead of actually running the Application, the idle timer should only fire when every key has been handled. To implement this, the event loop was split into one which exits after a successful idle timer handle, and the main call site simply loops over this as well. Additionally, the idle timer is reset every time any event gets handled successfully. This lets the application signal when it is done processing the test key strokes, and leaves the internal state open for inspection.

Another bug was discovered along the way while writing these integration tests: when a document is first opened, the cursor has the initial state of (0, 0), and is not updated to reflect the document's contents until the first cursor movement. This led to some weird bugs where the selection did not end up in the right direction after an edit. As far as I have been able to find, this does not cause any bugs in the actual text transformation, so it's likely this did not cause any real harm to users, but nonetheless, it was making testing very confusing until it was fixed. Now, if there is text in the opened document, the cursor is placed over the whole first grapheme cluster.

This PR was written along with #2267, so there are tests that pertain to that PR, but I left them in and ignored them, for purposes of demonstrating different kinds of tests. I'll unignore them along with that PR.

Open questions

How do we handle the feature flag? Should we simply leave it as is, and require adding the flag to run integration tests, or should we alias test to do test --features integration? The latter would help by not requiring contributors to remember that you have to do this to run the integration tests, but has the issue that eventually, we may have tests that depend upon external state, such as an LSP on the build system. Perhaps there could be an xtask or something? I'm open to suggestions.
When we do have tests that need consistent external state, how do we do this? A docker container?

archseer · 2022-05-11T01:29:52Z

helix-term/src/application.rs

+    pub async fn event_loop_until_idle<S>(&mut self, input_stream: &mut S) -> bool
+    where
+        S: Stream<Item = crossterm::Result<crossterm::event::Event>> + Unpin,
+    {
+        loop {
+            if self.editor.should_close() {
+                return false;
+            }
+
            use futures_util::StreamExt;

            tokio::select! {
                biased;

-                event = reader.next() => {
-                    self.handle_terminal_events(event)
+                Some(event) = input_stream.next() => {
+                    self.handle_terminal_events(event);
+                    self.editor.reset_idle_timer();
                }
                Some(signal) = self.signals.next() => {
                    self.handle_signals(signal).await;
+                    self.editor.reset_idle_timer();
                }
                Some((id, call)) = self.editor.language_servers.incoming.next() => {
                    self.handle_language_server_message(call, id).await;
                    // limit render calls for fast language server messages
                    let last = self.editor.language_servers.incoming.is_empty();
-                    if last || last_render.elapsed() > deadline {
+
+                    if last || self.last_render.elapsed() > LSP_DEADLINE {
                        self.render();
-                        last_render = Instant::now();
+                        self.last_render = Instant::now();
                    }
+
+                    self.editor.reset_idle_timer();
                }
                Some(payload) = self.editor.debugger_events.next() => {
                    let needs_render = self.editor.handle_debugger_message(payload).await;
                    if needs_render {
                        self.render();
                    }
+                    self.editor.reset_idle_timer();
                }
                Some(config_event) = self.editor.config_events.1.recv() => {
                    self.handle_config_events(config_event);
                    self.render();
+                    self.editor.reset_idle_timer();
                }
                Some(callback) = self.jobs.futures.next() => {
                    self.jobs.handle_callback(&mut self.editor, &mut self.compositor, callback);
                    self.render();
+                    self.editor.reset_idle_timer();
                }
                Some(callback) = self.jobs.wait_futures.next() => {
                    self.jobs.handle_callback(&mut self.editor, &mut self.compositor, callback);
                    self.render();
+                    self.editor.reset_idle_timer();
                }
                _ = &mut self.editor.idle_timer => {
                    // idle timeout
                    self.editor.clear_idle_timer();
                    self.handle_idle_timeout();
+                    return true;
                }
            }


Only had the time for a brief look, but I don't like the event loop changes.

It adds an extra async function call to what was before a single loop (that's very frequently called)

Events & callbacks should not interact with the idle timer because it's intended to only be affected by keypresses. Continually resetting the idle timer on LSP events for example will ensure the timer never triggers for the first 10-20s while rust-analyzer boots up and continually spews log messages.

Alternate frontends will re-implement the application loop themselves, so maybe the solution is to treat the integration tests as a separate frontend? i.e. define a separate test::Application that has no rendering and declares it's own loop. I started working on this previously by moving the handle_debugger_message onto helix-view/Editor. We'd need to do the same for handle_language_server_message so we'd be able to share most of the code.

(Also, I move commands & parts of UI to helix-view on this branch: https://github.com/helix-editor/helix/tree/gui)

Only had the time for a brief look, but I don't like the event loop changes.

It adds an extra async function call to what was before a single loop (that's very frequently called)

Could you please elaborate on what your concern is about this? Is it performance? Aren't we just talking about an extra jump that happens once every few hundred milliseconds in the worst case?

Events & callbacks should not interact with the idle timer because it's intended to only be affected by keypresses. Continually resetting the idle timer on LSP events for example will ensure the timer never triggers for the first 10-20s while rust-analyzer boots up and continually spews log messages.

I believe this is already the case. It's a biased select, so it will handle key presses, signals, LSP, etc, all before it polls the idle timer. I believe the only practical difference this makes is adding a few hundred ms once that initialization finishes.

Also, even if this were not the case, the LSP would not be able to serve completions until it's fully initialized anyway.

Perhaps we could make the reset calls conditional with the integration feature?

Alternate frontends will re-implement the application loop themselves, so maybe the solution is to treat the integration tests as a separate frontend? i.e. define a separate test::Application that has no rendering and declares it's own loop.

This was my first thought when I ran into this problem as well, but then I considered that we want to be able to test all features, and I realized we would just be copy/pasting the entire loop with a few small tweaks. And this kind of defeats the point of integration testing, which is to test the very same code that the user will be running.

We want to be able to test things like :wq saves the file and exits the Application, in that order. Or doesn't exit when there's an error saving.

And we don't want to have to worry about updating the testing harness's event loop every time we touch the term Application's event loop.

@archseer I consolidated the reset calls into a single place and put it behind the integration feature flag. This way, non test builds will see no difference in behavior; and additionally, since key presses are the only thing that triggers idle timer resets, this reduces the frequency that the call stack exits and reenters the inner event loop to once every idle timer fire, which should make the performance impact negligible. How to you feel about this?

Replying to @archseer's message from Matrix:

right, but that's the loop that the terminal app will be running (with the rendering disabled anyway). What about the UI app etc? Are we going to start running integration tests per frontend?

For me, significantly shrinking application.rs to a small event loop + platform specific stuff, then having a custom loop is a reasonable tradeoff. We're primarily interested in covering commands & how they interact with the editor state

This makes a lot of sense; however, I think if the goal is to provide a sensible abstraction for different frontends, this is already an orthogonal problem. The Application event loop is driving a whole bunch of stuff that should be frontend-agnostic, like config changes, LSP message handling, debug events, and file writes. As it is, a GUI frontend would also need to copy/paste a lot of the event loop.

What you're saying makes sense, but I think fixing that would require a larger refactor to better enable different frontends, and might be outside the scope of this PR.

I wish we could somehow force inline https://github.com/helix-editor/helix/pull/2359/files#diff-d9796de334de9f547d45fd8b7476be53c771200ab48f28266e5f62e3493b5df9R252= to avoid more async state generated by the compiler

If it's really that concerning, maybe the event loop logic could be made into a macro? But does the extra await really cause that much trouble?

archseer · 2022-05-29T16:24:18Z

Can you give this a quick rebase? It should fix the the tests

dead10ck · 2022-05-29T18:21:50Z

Can you give this a quick rebase? It should fix the the tests

Sure thing, done. Also, I mentioned this in the chat room, but if you'd like, I could take some extra time to refactor at least the main event loop to make all the events happen behind a single function, somewhat similar to how I had to put all the editor events behind one function in the write path PR. That was due to borrowing restrictions, but it could be extended to all events. That way, it would be more reasonable to make a separate test Application object. I'd be happy to do that if that would help decouple the integration tests a bit from the term Application.

archseer · 2022-05-30T01:25:06Z

I'm okay with that happening in a follow-up, this PR is already quite large and it's useful as a standalone.

dead10ck · 2022-05-30T01:52:17Z

Ok cool. Any thoughts on how to handle the feature flag? There are two things I could think of that I wrote at the bottom of the PR description. If merged as is, users would have to explicitly opt in to integration tests.

archseer · 2022-05-31T03:15:47Z

I would add an integration-test alias to .cargo/config then run that as a separate step on the CI (cargo integration-test). That way it's clear if unit tests failed, or integration tests. We can document how to run the tests in the README, plus if it fails on the CI it's easy to figure out what failed: just run the same command.

helix-term/src/commands.rs

helix-term/src/compositor.rs

helix-view/src/editor.rs

helix-term/src/application.rs

helix-loader/Cargo.toml

helix-term/tests/test/write.rs

helix-term/tests/integration/movement.rs

archseer · 2022-06-03T05:50:50Z

helix-term/src/compositor.rs

+
+#[cfg(not(feature = "integration"))]
+use tui::backend::CrosstermBackend;
+
+#[cfg(not(feature = "integration"))]
 use std::io::stdout;
-use tui::backend::{Backend, CrosstermBackend};
+
+#[cfg(not(feature = "integration"))]
 type Terminal = tui::terminal::Terminal<CrosstermBackend<std::io::Stdout>>;

+#[cfg(feature = "integration")]
+type Terminal = tui::terminal::Terminal<TestBackend>;
+


Consider cherry picking these changes:

14f9878 57d4a9b e0f9d86

Oh interesting. Would that goal be to get rid of the feature flags? I think we'd still have to have them, since we're still building an Application.

Or are you just asking me to do it to get the code change into master?

It's part of an upcoming refactor and I think it would simplify the amount of cfg clauses you need

I'm sorry, I tried, but the code seems to be built on top of other bigger breaking API changes, like adding a new RenderContext. It's difficult to take just those changes without the rest of the branch.

archseer · 2022-06-03T05:52:23Z

.github/workflows/build.yml

-      - name: Copy minimal languages config
-        run: cp .github/workflows/languages.toml ./languages.toml


need the full languages config for integration tests

Do we? I'd list any languages we test with but avoid fetching uncommon grammars that won't ever be used in tests, that way we're not waiting for 50+ grammars to build.

Sure, we can do that. It just means that a contributor will have to remember to add a language if they want to add tests for it. And it will be a bit confusing because their tests will pass locally for them, but fail on GitHub, and for not obvious reasons. It was failing for me on the C indentation test, but the failure was the wrong number of spaces. It took a while to figure out it was because of the grammar missing.

That's a good point, maybe we should include languages.toml in the test directory, then conditionally compile using that languages.toml over the one in the root dir if running tests?

Hmm, I just tried implementing this, but I just realized, the feature flag would help if we were only running one test command, but if we separate regular unit tests from integration tests into separate phases, then I'm not sure how to detect if we're running unit tests without the integration feature flag.

helix-term/tests/test/write.rs

helix-term/src/application.rs

helix-term/src/ui/mod.rs

helix-view/src/document.rs

* Use new macro syntax for encoding sequences of keys * Make convenience helpers for common test pattern * Use indoc for inline indented raw strings * Add feature flag for integration testing to disable rendering

When a new View of a Document is created, a default cursor of 0, 0 is created, and it does not get normalized to a single width cursor until at least one movement of the cursor happens. This appears to have no practical negative effect that I could find, but it makes tests difficult to work with, since the initial selection is not what you expect it to be. This changes the initial selection of a new View to be the width of the first grapheme in the text.

Use the Application's main event loop to allow LSP, file writes, etc

During write-quit, if the file fails to be written for any reason, helix will still quit without saving the changes. This fixes this behavior by introducing fallibility to the asynchronous job queues. This will also benefit all contexts which may depend on these job queues. Fixes helix-editor#1575

archseer

Let's get this merged! The other pending comments can always be resolved later. Great work! 🎉

dead10ck mentioned this pull request May 1, 2022

Write path fixes #2267

Merged

dead10ck force-pushed the test-harness branch from 0f76ce7 to 806569c Compare May 1, 2022 18:44

archseer reviewed May 11, 2022

View reviewed changes

dead10ck force-pushed the test-harness branch 2 times, most recently from 8fc5075 to 22a3b33 Compare May 17, 2022 03:02

the-mikedavis added the S-waiting-on-review Status: Awaiting review from a maintainer. label May 18, 2022

dead10ck force-pushed the test-harness branch 2 times, most recently from 3292fc2 to 074c4ed Compare May 23, 2022 12:53

dead10ck force-pushed the test-harness branch from 074c4ed to 994ce4a Compare May 29, 2022 17:44

pickfire reviewed Jun 1, 2022

View reviewed changes

helix-term/src/commands.rs Outdated Show resolved Hide resolved

pickfire reviewed Jun 1, 2022

View reviewed changes

helix-term/src/compositor.rs Show resolved Hide resolved

pickfire reviewed Jun 1, 2022

View reviewed changes

helix-view/src/editor.rs Show resolved Hide resolved

dead10ck force-pushed the test-harness branch from de04691 to 721c381 Compare June 2, 2022 04:25

pickfire reviewed Jun 2, 2022

View reviewed changes

helix-term/src/application.rs Outdated Show resolved Hide resolved

archseer reviewed Jun 3, 2022

View reviewed changes

helix-loader/Cargo.toml Outdated Show resolved Hide resolved

archseer reviewed Jun 3, 2022

View reviewed changes

dead10ck force-pushed the test-harness branch 2 times, most recently from 636d6a0 to 01eb375 Compare June 4, 2022 19:36

dead10ck force-pushed the test-harness branch from 9650b0f to 68c42a0 Compare June 14, 2022 02:03

dead10ck and others added 6 commits June 18, 2022 23:54

move config parsing back into main

0623a72

Simplify handle_terminal_events signature

adb6cd5

Integration testing harness

308cab3

improve test harness

502d329

* Use new macro syntax for encoding sequences of keys * Make convenience helpers for common test pattern * Use indoc for inline indented raw strings * Add feature flag for integration testing to disable rendering

refactor helpers, use new test helpers

84bbe6b

dead10ck added 23 commits June 18, 2022 23:54

use main application event loop

ee705dc

Use the Application's main event loop to allow LSP, file writes, etc

tests for serialized writes

07fc80a

tests for buffer-close

4012096

use idle timer instead of fixed timeout

2386c81

add integration feature to github tests

2fbf833

use Results in integration tests for more error context

1533f48

Add more context; Editor::open doesn't need to own path

ed950fc

use test terminal backend for integration tests

652cdda

use env var for integration test log level

cb0440b

don't read from stdin for integration tests

4e34ee7

need the full languages config for integration tests

28e94fb

use system's appropriate line ending

ef8fe5a

use a read only file to ensure write failure

acf9317

rename top level module to satisfy cargo fmt

8d8d389

ignore failing write path tests until fixes are merged

374724f

consolidate idle timer logic, make conditional

526c9be

rename test helpers

7c0bca1

add test for write-quit happy path

fac36bc

add integration-test cargo alias

086b63a

update docs for integration tests

65bf683

factor new Application with file arg to function

665286c

replace phrase in tests

5f7c247

dead10ck force-pushed the test-harness branch from fe21a1f to 5f7c247 Compare June 19, 2022 04:01

Merge branch 'master' into test-harness

458b89e

archseer approved these changes Jun 21, 2022

View reviewed changes

the-mikedavis removed the S-waiting-on-review Status: Awaiting review from a maintainer. label Jun 21, 2022

archseer merged commit 19dccad into helix-editor:master Jun 21, 2022

the-mikedavis mentioned this pull request Jun 23, 2022

:wq! should not quit (and lose the changes) if there are no permissions to write the file #2864

Open

dead10ck deleted the test-harness branch January 14, 2025 03:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integration testing harness #2359

Integration testing harness #2359

dead10ck commented May 1, 2022 •

edited

Loading

archseer May 11, 2022

dead10ck May 11, 2022 •

edited

Loading

dead10ck May 17, 2022

dead10ck May 17, 2022

archseer commented May 29, 2022

dead10ck commented May 29, 2022

archseer commented May 30, 2022

dead10ck commented May 30, 2022

archseer commented May 31, 2022

archseer Jun 3, 2022

dead10ck Jun 4, 2022

archseer Jun 6, 2022

dead10ck Jun 19, 2022

archseer Jun 3, 2022

dead10ck Jun 3, 2022

archseer Jun 6, 2022

dead10ck Jun 14, 2022

archseer left a comment

		- name: Copy minimal languages config
		run: cp .github/workflows/languages.toml ./languages.toml

Integration testing harness #2359

Integration testing harness #2359

Conversation

dead10ck commented May 1, 2022 • edited Loading

Open questions

Choose a reason for hiding this comment

dead10ck May 11, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

archseer commented May 29, 2022

dead10ck commented May 29, 2022

archseer commented May 30, 2022

dead10ck commented May 30, 2022

archseer commented May 31, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

archseer left a comment

Choose a reason for hiding this comment

dead10ck commented May 1, 2022 •

edited

Loading

dead10ck May 11, 2022 •

edited

Loading