Skip to content

Add Redis MOVED Redirection support#3639

Merged
liviuchircu merged 11 commits into
OpenSIPS:masterfrom
NormB:redis_moved
May 19, 2025
Merged

Add Redis MOVED Redirection support#3639
liviuchircu merged 11 commits into
OpenSIPS:masterfrom
NormB:redis_moved

Conversation

@NormB
Copy link
Copy Markdown
Member

@NormB NormB commented May 2, 2025

Summary

This update adds Redis MOVED Redirection support.

Details

When Redis is operating as a cluster, it will sometimes respond to a request with a MOVED reply. The contents of the MOVED reply contain a slot hash, endpoint and port that the request should be redirected toward. This is more fully described here: https://redis.io/docs/latest/operate/oss_and_stack/reference/cluster-spec/

This update partially supports issue: #2811

Solution

A request to Redis that is responded to with a MOVED will cause the redisReply to have an error state. When this error was received, the list of existing connections are attempted without any consideration to the endpoint returned in the redisReply.

A second issue was uncovered when issuing Redis requests with keys that contain hashtags which are parameters enclosed in braces {}. When hashtags are used, the slot hash is NOT computed using the entire key, but instead, the slot hash is determined using the contents of the first {hashtag}. The routine used to compare a slot hash against the range of slots in each Redis node does not take this into consideration.

This PR adds two updates to the error handling logic because a MOVED Redirection is returned as an error by Redis.

The first update parses the MOVED reply into it's components: slot, endpoint and port.
The second update uses the data from the MOVED reply components to perform a lookup for a node that matches the slot (range), endpoint and port. If a match is found, the node is returned to the error handler and the request is retried.

Compatibility

This update is only triggered when an error redisReply is received with a MOVED Redirection. Existing errors are handled the way they were before so this update should not cause any problems with non-MOVED errors.

Closing issues

Updates #2811

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for Redis MOVED redirection responses in clustered Redis setups. It introduces new constants, helper functions, and error handling logic to parse MOVED replies and redirect connections appropriately.

  • Added new constants and inline functions for parsing MOVED and ASK responses.
  • Implemented get_redis_connection_by_endpoint and parse_moved_reply to extract redirection info.
  • Updated error handling in the Redis query logic to manage MOVED replies by retrying with the correct connection.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
modules/cachedb_redis/cachedb_redis_utils.h Added macros and function declarations to support MOVED and ASK reply parsing.
modules/cachedb_redis/cachedb_redis_utils.c Implemented helper functions including get_redis_connection_by_endpoint and inline match_prefix.
modules/cachedb_redis/cachedb_redis_dbase.h Introduced redis_moved and const_str typedefs for storing redirection information.
modules/cachedb_redis/cachedb_redis_dbase.c Extended error handling in _redis_run_command to parse MOVED replies and retry redirection.

Comment thread modules/cachedb_redis/cachedb_redis_utils.c Outdated
Comment thread modules/cachedb_redis/cachedb_redis_dbase.c Outdated
NormB and others added 2 commits May 2, 2025 08:04
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@NormB NormB changed the title Add Redis MOVED redirection support Add Redis MOVED Redirection support May 2, 2025
Copy link
Copy Markdown
Member

@liviuchircu liviuchircu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work here, @NormB! I took a look here and the only big question is around that "slot" parameter returned in the MOVED reply, and whether we are handling it correctly. Is this something present in all MOVED replies, or is it only related to your use of the use of the {hashtags} Redis feature, leading to this extra "slot" data being returned?

I'm asking because if MOVED always includes a "slot", then comparing this slot against the pre-cached (and never to be changed...) "start_slot" and "end_slot" might not yield the correct Node anymore. Imagine if I add +1 Redis Cluster node and all slots re-configure. Your OpenSIPS won't do another "CLUSTER NODES" query and re-learn the nodes/slots until you restart it (per current implementation, IIRC). If this is the case, aren't we comparing apples to oranges? (hash slots post-reconfigure to hash slots pre-reconfigure).

Finally, I noticed some newly introduced spaces vs. tabs mixtures in the following functions, see if you can set up your Edit to always use Tabs, to be on the safe side:

  • _redis_run_command
  • parse_moved_reply
  • get_redis_connection_by_endpoint

Looking forward to getting this one merged! Cheers!

@liviuchircu liviuchircu added this to the 3.6-dev milestone May 15, 2025
@NormB
Copy link
Copy Markdown
Member Author

NormB commented May 15, 2025

The comparison of the MOVED data does use the "slot" number in the reply. This may be a little too aggressive in the compare logic. Redirecting the original request to the MOVED endpoint:port without taking the slot into consideration would be more lenient. Update: The most recent commit removes the "slot" comparison and now only uses the "endpoint" & "port".

This update, as noted is one step toward providing cluster support. It is not a complete cluster implementation. For example: When a node is added / removed from the cluster, the cached list of nodes must be updated which is outside of the scope of this change. The edge case when the node has an unknown endpoint, is also outside the scope of this change as it relates to a cluster node changing state (or more accurately, changing port).

Below is the related documentation:

https://redis.io/docs/latest/operate/oss_and_stack/reference/cluster-spec/#moved-redirection

MOVED Redirection

A Redis client is free to send queries to every node in the cluster, including replica nodes. The node will analyze the query, and if it is acceptable (that is, only a single key is mentioned in the query, or the multiple keys mentioned are all to the same hash slot) it will lookup what node is responsible for the hash slot where the key or keys belong.

If the hash slot is served by the node, the query is simply processed, otherwise the node will check its internal hash slot to node map, and will reply to the client with a MOVED error, like in the following example:

GET x
-MOVED 3999 127.0.0.1:6381

The error includes the hash slot of the key (3999) and the endpoint:port of the instance that can serve the query. The client needs to reissue the query to the specified node's endpoint address and port.

An empty endpoint indicates that the server node has an unknown endpoint, and the client should send the next request to the same endpoint as the current request but with the provided port.

@liviuchircu liviuchircu self-requested a review May 19, 2025 07:56
Comment on lines +117 to +121
if (match_prefix(redis_info->endpoint.s, redis_info->endpoint.len, it->ip, strlen(it->ip))) {
if (match_prefix(redis_info->endpoint.s, redis_info->endpoint.len, it->ip, strlen(it->ip))) {
if (it->port == redis_info->port) {
if (it->start_slot <= redis_info->slot && it->end_slot >= redis_info->slot) {
// Removed slot comparison as it may be a little too aggressive of a match
// Code is still here in the event that it needs to be added back in
//if (it->start_slot <= redis_info->slot && it->end_slot >= redis_info->slot) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on removing the slot comparison for now, until the cluster "topology scanning" support is extended to be also performed at runtime, perhaps a one-time operation after each MOVED reply, ideally.

But isn't there still an issue with your MOVED implementation whenever we add a +1 node to the Redis Cluster? All of a sudden, some of the OpenSIPS queries get a MOVED redirection to the new cluster node (the keys are balanced across all nodes). But this new function you added, get_redis_connection_by_endpoint() tries to match the new endpoint against the existing list of nodes, which is the incomplete one (before the +1 node was added). Won't the lookup fail? Please correct me if I'm wrong.

@liviuchircu liviuchircu merged commit fe8642d into OpenSIPS:master May 19, 2025
86 checks passed
@liviuchircu
Copy link
Copy Markdown
Member

Merged, as step #1 towards full MOVED support. Step #2 is related to the Redis Cluster configuration (including hash slots), and perhaps launching a CLUSTER NODES topology scanning command (+ redis node list rebuild) on each new MOVED command received.

liviuchircu pushed a commit to liviuchircu/opensips that referenced this pull request May 20, 2025
* Updates to support Redis MOVED redirection

* Add missing function definition to the include

* Update some comments

* Formatting updates

* Update modules/cachedb_redis/cachedb_redis_utils.c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update modules/cachedb_redis/cachedb_redis_dbase.c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add missing include

* Fix: A little too aggressive in freeing the redisReply object

* Remove ASK defines until there is logic to implement it.

* Remove slot comparison when a MOVED is returned / fix space vs. tab formatting to be consistent

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@NormB NormB deleted the redis_moved branch May 21, 2025 02:17
NormB added a commit to NormB/opensips that referenced this pull request May 9, 2026
* Updates to support Redis MOVED redirection

* Add missing function definition to the include

* Update some comments

* Formatting updates

* Update modules/cachedb_redis/cachedb_redis_utils.c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update modules/cachedb_redis/cachedb_redis_dbase.c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add missing include

* Fix: A little too aggressive in freeing the redisReply object

* Remove ASK defines until there is logic to implement it.

* Remove slot comparison when a MOVED is returned / fix space vs. tab formatting to be consistent

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants