Re-run flaky tests by ethomson · Pull Request #5140 · libgit2/libgit2

ethomson · 2019-06-23T19:25:25Z

Re-run the flaky online integration tests. Since these hit actual network endpoints and those may be occasionally down, we want to re-run them up to 5 times. This prevents us from failing the builds just because the badssl endpoint is down, for example.
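As a rough illustration of the retry behaviour described above, here is a minimal bash sketch. It is not the code from this PR: the function name `run_with_retries` and its argument order are assumptions made for the example.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not taken from ci/test.sh): run a command and
# re-run it on failure, up to a maximum number of attempts.
# Usage: run_with_retries MAX_ATTEMPTS command [args...]
run_with_retries() {
	local max_attempts=$1
	shift
	local attempt=1

	while true; do
		# Success on any attempt ends the loop immediately.
		if "$@"; then
			return 0
		fi

		# Stop once the attempt budget is used up.
		if [ "$attempt" -ge "$max_attempts" ]; then
			echo "Giving up after $attempt attempts: $*" >&2
			return 1
		fi

		echo "Attempt $attempt failed; re-running: $*" >&2
		attempt=$((attempt + 1))
	done
}
```

A call such as `run_with_retries 5 some_test_command` would then report failure only if all five attempts fail, which is the property the description asks for.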

pks-t

I'm generally 👍 on this, feels like it is long overdue. I've been wondering about whether this is the right layer to fix this, though. Wouldn't it make more sense to have this in the test code itself? Like that, developers would benefit from it, too, and we'd be able to mark single tests as flaky instead of having to re-do all tests of a particular test suite.

It's fine if your answer to the above is "no", though. :)

pks-t · 2019-06-24T14:01:43Z

ci/test.ps1

 	$TestCommand += " -r${BuildDir}\results_${TestName}.xml"

-	Invoke-Expression $TestCommand
-	if ($LastExitCode -ne 0) { $global:Success = $false }


One thing I wondered about recently: do we really need the PowerShell file at all? I saw that Azure does support Bash on Windows via Git for Windows. If that's the case, then we should drop this file completely so that we don't always have to keep both in sync. I'd naturally volunteer to do that.

That's true, it does. I have no love for PowerShell... I'm happy to do that, too, but I've got a few other things going on before I can look at it.

ethomson · 2019-06-24T14:16:23Z

> I'm generally 👍 on this, feels like it is long overdue. I've been wondering about whether this is the right layer to fix this, though. Wouldn't it make more sense to have this in the test code itself? Like that, developers would benefit from it, too, and we'd be able to mark single tests as flaky instead of having to re-do all tests of a particular test suite.

Right. This was my first thought as well, actually. But then I realized it would require more thought than I wanted to give it. 😀

I think there are two issues here:

1. We need a way to decorate tests as flaky. This could be as simple as /* flaky */ at the end of a declaration... in fact, that's probably the smart way to do this as far as clar's parsing goes.

   (But then I started yak shaving, because I wish clar's parser was smart enough to deal with commented-out tests. So then I started thinking more about using LLVM or something, and then things got really out of control.)

   We could also mark a suite as flaky on the command line, which might actually be smarter still, e.g. -fonline.

2. All tests would need to get tightened up (or we'd fail them for memory leaks). This is not impossible by any means, but it is an added effort to go through and identify them. But if we do the whole-suite marking on the command line then we'd need to do a bunch en masse.

🤷‍♂

So I decided to knock this out quickly so that I'd be less angry. 😉
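The suite-level marking idea from the comment above could also be approximated in the CI script rather than in clar itself. The sketch below is purely illustrative: `FLAKY_SUITES` and `attempts_for_suite` are hypothetical names, and nothing here reflects an existing clar option.

```shell
#!/usr/bin/env bash

# Hypothetical configuration: space-separated names of suites that are
# allowed to be re-run. "online" is the suite discussed in this PR.
FLAKY_SUITES="online"

# Hypothetical helper: print how many attempts a suite should get.
# Suites listed in FLAKY_SUITES get 5 attempts, all others get 1.
attempts_for_suite() {
	local suite=$1
	local flaky

	for flaky in $FLAKY_SUITES; do
		if [ "$suite" = "$flaky" ]; then
			echo 5
			return 0
		fi
	done

	echo 1
}
```

Keeping the list in one variable means only suites known to hit the network get extra attempts, while a failure in any other suite still fails the build on the first run.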

pks-t · 2019-06-24T14:36:13Z

ci/test.sh

+
+	if [ "$FAILED" -ne 0 ]; then
+		SUCCESS=0
+	fi


Previously we were calling the failure function here; don't we have to do that now, too?

Ach, that ended up dead. I removed it, but I also improved the error message a little, inspired by that function.
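For readers unfamiliar with the pattern in the hunk above, the following sketch shows one way a script can record a suite failure, print a message with the exit code, and keep running the remaining suites. The `run_suite` wrapper and the message wording are assumptions for illustration, not the contents of ci/test.sh.

```shell
#!/usr/bin/env bash

# Overall result flag, in the style of the SUCCESS variable in the hunk.
SUCCESS=1

# Hypothetical wrapper: run one suite and record a failure without
# aborting the script, so later suites still get a chance to run.
# Usage: run_suite NAME command [args...]
run_suite() {
	local name=$1
	shift
	local failed=0

	# Capture the command's exit status if it fails.
	"$@" || failed=$?

	if [ "$failed" -ne 0 ]; then
		echo "Test suite '$name' exited with code: $failed" >&2
		SUCCESS=0
	fi
}
```

At the end of such a script, checking `SUCCESS` and exiting non-zero when it is 0 gives CI a single pass/fail result while still reporting every failing suite.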

pks-t · 2019-06-24T14:36:45Z

Fair enough ;) So let's get this merged quickly -- we can still improve in the future.

pks-t approved these changes Jun 24, 2019

pks-t reviewed Jun 24, 2019

ethomson force-pushed the ethomson/flaky_ci branch 3 times, most recently from b9f1625 to 1dc004e, June 24, 2019 21:27

ethomson added 2 commits June 24, 2019 22:54

ci: add flaky test re-execution on Unix

6d8a34a

Our online tests are occasionally flaky since they hit real network endpoints. Re-run them up to 5 times if they fail, to allow us to avoid having to fail the whole build.

ci: add flaky test re-execution on Windows

c7b4ce5

Our online tests are occasionally flaky since they hit real network endpoints. Re-run them up to 5 times if they fail, to allow us to avoid having to fail the whole build.

ethomson force-pushed the ethomson/flaky_ci branch from 1dc004e to c7b4ce5, June 24, 2019 21:54

ethomson merged commit a064920 into master Jun 24, 2019

implausible mentioned this pull request Jul 23, 2019

Bump libgit2 nodegit/nodegit#1705

Merged

ethomson deleted the ethomson/flaky_ci branch February 2, 2020 13:07

snyk-bot mentioned this pull request Feb 23, 2020

[Snyk] Upgrade nodegit from 0.4.1 to 0.26.4 saurabharch/Breezeblocks#1

Open

pks-t added the backport-v0.28.5 label Mar 26, 2020

snyk-bot mentioned this pull request Apr 22, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 aminatakonate000/Graviton-App#4

Open

snyk-bot mentioned this pull request May 5, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 Barnstorm-Online/ngp-openapi-generator#1

Open
