Pārlūkot izejas kodu

fix(nodes): sync "start after first connect" expiry so un-activated nodes do not reset it (#5319)

* fix(nodes): stop un-activated nodes from resetting "start after first connect" expiry

In a multi-node setup a client is attached to inbounds on several nodes, but
its `client_traffics` row is shared per-email (the column is `gorm:"unique"`).
With "start expiry after first connect", the expiry is stored as a negative
duration and each node converts it to an absolute deadline (now+duration) the
first time the client connects *there*.

The master's per-node traffic merge wrote `expiry_time = ?` unconditionally for
every node sync. So a node where the client never connected keeps reporting the
un-activated negative duration and clobbers the absolute deadline that the node
where the client *did* connect had already activated — last writer wins. The
shared row flip-flops and usually lands back on the negative value, so the main
panel shows the timer "not started" while the active node counts down, and the
subscription (which reads this row and recomputes negative as now+duration on
every fetch) reports a perpetually-resetting, wrong expiry and usage.

Guard the merge so an un-activated (<= 0) value reported by a node can never
reset an already-activated absolute deadline. A positive node value is still
adopted, so a node that legitimately moves the deadline forward (traffic reset /
auto-renew) still propagates. The rule lives in both the SQL CASE used by the
merge and a small `mergeActivationExpiry` helper (kept in lockstep) that the
structural-change check reuses so the guard does not trigger spurious config
re-pushes.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>

* fix(nodes): cast expiry merge params to BIGINT for Postgres

The "start after first connect" merge guard introduced the comparison
`? <= 0` in the client_traffics expiry_time CASE. There Postgres infers
the parameter type as int4 from the literal 0, so binding a real expiry
value — a negative start-after-connect duration or a positive absolute
deadline (~1.7e12 ms) — overflows int4 and the whole setRemoteTrafficLocked
transaction fails, breaking node traffic and expiry sync on Postgres.
SQLite (dynamic typing) was unaffected.

Wrap both params in CAST(? AS BIGINT) (portable across SQLite and
Postgres) so the parameter is typed bigint, matching the explicit casts
the sibling GreatestExpr/ClientTrafficEnableMergeExpr helpers already use.

Verified against Postgres 16: TestNodeFirstConnectExpiry_NotClobbered
failed before this change and passes after; SQLite suite unchanged.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <[email protected]>
Co-authored-by: Sanaei <[email protected]>
Rouzbeh† 1 dienu atpakaļ
vecāks
revīzija
628406117e

+ 32 - 4
internal/web/service/inbound_node.go

@@ -153,6 +153,25 @@ func (s *InboundService) upsertNodeBaseline(tx *gorm.DB, nodeID int, email strin
 	}).Create(&model.NodeClientTraffic{NodeId: nodeID, Email: email, Up: up, Down: down}).Error
 }
 
+// mergeActivationExpiry reconciles a node-reported client expiry with the value
+// already stored on the master. "Start after first connect" persists a negative
+// duration that each node converts to an absolute deadline (now+duration) the
+// first time the client connects there. The per-email client_traffics row is
+// shared across every node, so a node that has not yet seen a first connection
+// keeps reporting the negative duration — which must never reset a deadline
+// another node already activated.
+//
+// A node may legitimately move an already-activated deadline forward (traffic
+// reset / auto-renew extends it), so any positive node value is still adopted —
+// only an un-activated (<= 0) value is rejected once an absolute deadline
+// exists. Kept in lockstep with the SQL CASE in setRemoteTrafficLocked.
+func mergeActivationExpiry(existing, node int64) int64 {
+	if existing > 0 && node <= 0 {
+		return existing
+	}
+	return node
+}
+
 func (s *InboundService) SetRemoteTraffic(nodeID int, snap *runtime.TrafficSnapshot, dirty bool) (bool, error) {
 	var structuralChange bool
 	err := submitTrafficWrite(func() error {
@@ -544,22 +563,31 @@ func (s *InboundService) setRemoteTrafficLocked(nodeID int, snap *runtime.Traffi
 			if existing := centralCSByEmail[cs.Email]; existing != nil &&
 				(existing.Enable != cs.Enable ||
 					existing.Total != cs.Total ||
-					existing.ExpiryTime != cs.ExpiryTime ||
+					existing.ExpiryTime != mergeActivationExpiry(existing.ExpiryTime, cs.ExpiryTime) ||
 					existing.Reset != cs.Reset) {
 				structuralChange = true
 			}
 
 			enableExpr := database.ClientTrafficEnableMergeExpr()
+			// expiry_time merge mirrors mergeActivationExpiry: a node that has not
+			// yet seen the client's first connection keeps reporting the negative
+			// "start after first connect" duration, which must never reset the
+			// absolute deadline another node already activated. A positive node
+			// value is still adopted (e.g. auto-renew moves the deadline forward).
+			// CAST(? AS BIGINT): in the `<= 0` comparison Postgres would otherwise
+			// infer int4 from the literal and overflow on real expiry values.
 			if err := tx.Exec(
 				fmt.Sprintf(
 					`UPDATE client_traffics
-					 SET up = up + ?, down = down + ?, enable = %s, total = ?, expiry_time = ?, reset = ?,
-					     last_online = %s
+					 SET up = up + ?, down = down + ?, enable = %s, total = ?,
+					     expiry_time = CASE WHEN expiry_time > 0 AND CAST(? AS BIGINT) <= 0 THEN expiry_time ELSE CAST(? AS BIGINT) END,
+					     reset = ?, last_online = %s
 					 WHERE email = ?`,
 					enableExpr,
 					database.GreatestExpr("last_online", "?"),
 				),
-				deltaUp, deltaDown, cs.Enable, cs.Total, cs.ExpiryTime, cs.Reset,
+				deltaUp, deltaDown, cs.Enable, cs.Total,
+				cs.ExpiryTime, cs.ExpiryTime, cs.Reset,
 				cs.LastOnline, cs.Email,
 			).Error; err != nil {
 				return false, err

+ 141 - 0
internal/web/service/node_client_expiry_sync_test.go

@@ -0,0 +1,141 @@
+package service
+
+import (
+	"testing"
+
+	"github.com/mhsanaei/3x-ui/v3/internal/xray"
+)
+
+// TestMergeActivationExpiry covers the pure reconciliation rule in isolation.
+func TestMergeActivationExpiry(t *testing.T) {
+	const (
+		dur   = int64(-2592000000) // 30 days as a "start after first connect" duration
+		early = int64(1000)        // earliest absolute deadline (first connection)
+		late  = int64(2000)        // a later absolute deadline
+	)
+	cases := []struct {
+		name           string
+		existing, node int64
+		want           int64
+	}{
+		{"master unset takes node duration", 0, dur, dur},
+		{"master unset takes node activation", 0, early, early},
+		{"activation adopted over stored duration", dur, early, early},
+		{"node still un-activated does not reset deadline", early, dur, early},
+		{"node un-activated zero does not reset deadline", early, 0, early},
+		{"node renewal extends the deadline forward", early, late, late},
+		{"node positive adopted even if earlier", late, early, early},
+		{"both un-activated keep node value", dur, dur, dur},
+	}
+	for _, c := range cases {
+		t.Run(c.name, func(t *testing.T) {
+			if got := mergeActivationExpiry(c.existing, c.node); got != c.want {
+				t.Fatalf("mergeActivationExpiry(%d,%d) = %d, want %d", c.existing, c.node, got, c.want)
+			}
+		})
+	}
+}
+
+// TestNodeFirstConnectExpiry_NotClobbered reproduces the multi-node bug: a
+// client is attached to inbounds on two nodes with a "start after first connect"
+// expiry. The client connects only on node 1, which activates an absolute
+// deadline; node 2 never sees a connection and keeps reporting the negative
+// duration. The shared per-email client_traffics row must hold the activated
+// deadline — a later node-2 sync must not reset it back to "not started".
+func TestNodeFirstConnectExpiry_NotClobbered(t *testing.T) {
+	db := initTrafficTestDB(t)
+	createNodeInbound(t, db, 1, "n1-in", 41001)
+	createNodeInbound(t, db, 2, "n2-in", 41002)
+	svc := &InboundService{}
+
+	const email = "delayed"
+	const duration = int64(-2592000000) // 30 days, not yet started
+
+	// Both nodes start out reporting the un-activated negative duration.
+	syncNode(t, svc, 1, "n1-in", xray.ClientTraffic{Email: email, Up: 0, Down: 0, ExpiryTime: duration, Enable: true})
+	syncNode(t, svc, 2, "n2-in", xray.ClientTraffic{Email: email, Up: 0, Down: 0, ExpiryTime: duration, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != duration {
+		t.Fatalf("before any connection: expiry = %d, want %d", got, duration)
+	}
+
+	// Client connects on node 1: it activates an absolute deadline.
+	const activated = int64(1893456000000) // some absolute ms timestamp
+	syncNode(t, svc, 1, "n1-in", xray.ClientTraffic{Email: email, Up: 100, Down: 100, ExpiryTime: activated, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != activated {
+		t.Fatalf("after node 1 activation: expiry = %d, want %d", got, activated)
+	}
+
+	// Node 2 (no connection there) keeps reporting the negative duration. This
+	// must NOT reset the activated deadline.
+	syncNode(t, svc, 2, "n2-in", xray.ClientTraffic{Email: email, Up: 0, Down: 0, ExpiryTime: duration, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != activated {
+		t.Fatalf("node 2 clobbered the activated deadline: expiry = %d, want %d", got, activated)
+	}
+
+	// Subsequent node 1 syncs keep the same absolute deadline.
+	syncNode(t, svc, 1, "n1-in", xray.ClientTraffic{Email: email, Up: 200, Down: 200, ExpiryTime: activated, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != activated {
+		t.Fatalf("after further node 1 sync: expiry = %d, want %d", got, activated)
+	}
+}
+
+// TestNodeFirstConnectExpiry_NotClobbered_WithSettings exercises the full
+// production sync path — snapshots carrying real settings JSON, which drives the
+// GetClients/SyncInbound branch inside setRemoteTrafficLocked — to prove that
+// branch does not re-derive the per-email client_traffics.expiry_time from the
+// node's (still negative) settings and undo the merge guard.
+func TestNodeFirstConnectExpiry_NotClobbered_WithSettings(t *testing.T) {
+	db := initTrafficTestDB(t)
+	createNodeInboundWithClient(t, db, 1, "n1-in", 41001, "delayed")
+	createNodeInboundWithClient(t, db, 2, "n2-in", 41002, "delayed")
+	svc := &InboundService{}
+
+	const email = "delayed"
+	const duration = int64(-2592000000)
+	const activated = int64(1893456000000)
+
+	negSettings := `{"clients":[{"email":"delayed","enable":true,"expiryTime":-2592000000}]}`
+	actSettings := `{"clients":[{"email":"delayed","enable":true,"expiryTime":1893456000000}]}`
+
+	// Both nodes start un-activated.
+	syncNodeWithSettings(t, svc, 1, "n1-in", negSettings, xray.ClientTraffic{Email: email, ExpiryTime: duration, Enable: true})
+	syncNodeWithSettings(t, svc, 2, "n2-in", negSettings, xray.ClientTraffic{Email: email, ExpiryTime: duration, Enable: true})
+
+	// Node 1 activates (both its ClientStats and its settings now carry the
+	// absolute deadline, like a real node after adjustTraffics).
+	syncNodeWithSettings(t, svc, 1, "n1-in", actSettings, xray.ClientTraffic{Email: email, Up: 100, Down: 100, ExpiryTime: activated, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != activated {
+		t.Fatalf("after node 1 activation: expiry = %d, want %d", got, activated)
+	}
+
+	// Node 2 still reports the negative duration in BOTH ClientStats and
+	// settings. Neither the merge nor SyncInbound may reset the deadline.
+	syncNodeWithSettings(t, svc, 2, "n2-in", negSettings, xray.ClientTraffic{Email: email, ExpiryTime: duration, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != activated {
+		t.Fatalf("node 2 settings-sync clobbered the deadline: expiry = %d, want %d", got, activated)
+	}
+}
+
+// TestNodeRenewExtendsExpiry guards against over-correcting: a node that renews
+// a client (traffic reset / auto-renew) legitimately moves the deadline FORWARD
+// to a later absolute timestamp, and that must still propagate to the master.
+// The guard only rejects un-activated (<= 0) values, never a positive one.
+func TestNodeRenewExtendsExpiry(t *testing.T) {
+	db := initTrafficTestDB(t)
+	createNodeInbound(t, db, 1, "n1-in", 41001)
+	svc := &InboundService{}
+
+	const email = "renewing"
+	const first = int64(1893456000000)
+	const renewed = first + int64(2592000000) // +30 days after auto-renew
+
+	syncNode(t, svc, 1, "n1-in", xray.ClientTraffic{Email: email, Up: 10, Down: 10, ExpiryTime: first, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != first {
+		t.Fatalf("after activation: expiry = %d, want %d", got, first)
+	}
+
+	syncNode(t, svc, 1, "n1-in", xray.ClientTraffic{Email: email, Up: 20, Down: 20, ExpiryTime: renewed, Enable: true})
+	if got := readTraffic(t, db, email).ExpiryTime; got != renewed {
+		t.Fatalf("node renewal did not propagate: expiry = %d, want %d", got, renewed)
+	}
+}