Skip to content

RATIS-2487. Trigger installSnapshot if leader cannot get previous entry#1420

Merged
szetszwo merged 8 commits intoapache:masterfrom
ivandika3:RATIS-2487
Apr 3, 2026
Merged

RATIS-2487. Trigger installSnapshot if leader cannot get previous entry#1420
szetszwo merged 8 commits intoapache:masterfrom
ivandika3:RATIS-2487

Conversation

@ivandika3
Copy link
Copy Markdown
Contributor

@ivandika3 ivandika3 commented Apr 2, 2026

What changes were proposed in this pull request?

See https://issues.apache.org/jira/browse/RATIS-2487

The current approach is to make newAppendEntriesRequest returns null which will skip the current iteration and change the shouldInstallSnapshot (if installSnapshot is enabled) and shouldNotifyToInstallSnapshot (if installSnapshot is disabled) so that subsequent LogAppender run will trigger snapshot since the leader cannot get the previous log (otherwise the logAppender is not going to send any RPC to the follower).

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-2487

How was this patch tested?

UT.

@@ -238,6 +217,15 @@ public AppendEntriesRequestProto newAppendEntriesRequest(long callId, boolean he
final long snapshotIndex = follower.getSnapshotIndex();
final long leaderNext = getRaftLog().getNextIndex();
final long followerNext = follower.getNextIndex();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems getNextIndex is called twice in this method, we might need to call it only once per method.

Copy link
Copy Markdown
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivandika3 , thanks for working on this!

The change looks good. The current shouldInstallSnapshot code is distributed in three difference places, LogAppender, LogAppenderBase and GrpcLogAppender. How about we refactor the code and move them together? See https://issues.apache.org/jira/secure/attachment/13081541/1420_review.patch

@ivandika3 ivandika3 marked this pull request as ready for review April 3, 2026 06:07
@ivandika3
Copy link
Copy Markdown
Contributor Author

ivandika3 commented Apr 3, 2026

@szetszwo Thanks for the review and the suggestion. Updated.

Edit: TestRaftAsyncWithNetty has been flaky for a while.

Copy link
Copy Markdown
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

You are right that TestRaftAsyncWithNetty became flaky recently. Cc @slfan1989

@szetszwo szetszwo merged commit 98a4c48 into apache:master Apr 3, 2026
31 of 32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants