zfc/bindbox-game

Fork 0

win 5ede909be4 docs: map existing codebase

2026-03-21 16:01:32 +08:00

15 KiB

Raw Permalink Blame History

Codebase Concerns

Analysis Date: 2026-03-21

Security Considerations

WeChat AppSecret logged in plaintext:

Risk: WeChat AppSecret (OAuth credentials) are written to application logs at Info/Error level, exposing secrets in log files and any log aggregation systems.
Files:
- internal/service/user/login_weixin.go:52 — s.logger.Info("DEBUG: LoginWeixin Config", zap.String("AppSecret", wcfg.AppSecret))
- internal/api/user/phone_bind.go:59 — h.logger.Error("...", zap.String("app_secret", wxCfg.AppSecret))
- internal/api/common/openid_app.go:44 — h.logger.Info("GetOpenID Config", zap.String("AppSecret", wxcfg.AppSecret))
Current mitigation: None. These are active, non-conditional log calls.
Recommendations: Remove all AppSecret fields from log calls immediately. If debugging is needed, log only AppID (never AppSecret).

Hardcoded internal API key fallback:

Risk: If Internal.ApiKey config is missing or empty, the system falls back to the hardcoded literal "bindbox-internal-secret-2024". Any attacker with knowledge of this default can call all internal game settlement endpoints.
Files: internal/router/router.go:99
Current mitigation: Config key overrides the default when set.
Recommendations: Remove the hardcoded default. Fail-closed: if expectedKey == "", reject all requests with 503 rather than falling back to a known string.

Hardcoded Nakama server key:

Risk: internal/api/game/handler.go:207 contains nakamaKey := "defaultkey" which is the well-known Nakama default key, used when config is absent.
Files: internal/api/game/handler.go:207
Current mitigation: Config value overrides when present.
Recommendations: Panic at startup if Nakama key is not configured in production mode.

CORS allows all origins with credentials:

Risk: AllowedOrigins: []string{"*"} combined with AllowCredentials: true is an invalid CORS configuration per spec (browsers block it) and signals the intent to allow arbitrary origins was not fully thought through.
Files: internal/pkg/cors/cors.go:14,34
Current mitigation: Browsers enforce the spec restriction, partially preventing exploitation.
Recommendations: Replace "*" with an explicit allowlist of trusted origins.

pprof profiling endpoint exposed in production:

Risk: core.WithEnablePProf() is unconditionally passed in internal/router/router.go:45. The Go pprof endpoints at /debug/pprof/* expose heap dumps, goroutine stacks, CPU profiles, and memory layout — high-value information for attackers.
Files: internal/router/router.go:45, internal/pkg/core/core.go:252-258
Current mitigation: None — pprof is always enabled.
Recommendations: Gate WithEnablePProf() behind an environment check (ENV != "pro").

Tech Debt

Skipped Redis ticket validation in game token service:

Issue: internal/service/game/token.go:127-136 has a commented-out return statement under a // TODO: 临时跳过 Redis 验证. When the Redis key is not found, validation is bypassed and the game token is accepted regardless. This permanently disables single-use ticket enforcement.
Files: internal/service/game/token.go:127-136
Impact: A valid JWT can be replayed indefinitely once issued; single-use semantics are broken.
Fix approach: Restore the return nil, fmt.Errorf("ticket not found or expired") line. Investigate why tickets expire from Redis before use (likely TTL too short or Redis key prefix mismatch).

Stub minesweeper game handlers:

Issue: internal/api/internal/minesweeper/handler.go contains two handlers (VerifyTicket, SettleGame) that are entirely mock implementations. Both contain // TODO: 实际验证逻辑 and return hardcoded responses. The settle handler always returns success: true and a mock reward string.
Files: internal/api/internal/minesweeper/handler.go:46-78
Impact: Minesweeper game settlement and ticket verification are non-functional. Any caller receives a fake success response regardless of actual state.
Fix approach: Implement proper Redis ticket lookup (matching the game token service), deduct tickets, and grant actual rewards on win.

TODO counts in douyin order sync:

Issue: internal/api/admin/douyin_orders_admin.go:302,337 have GrantedCount: 0 and RefundedCount: 0 with TODO comments noting these should return actual counts from the sync/grant functions. Admin UI shows incorrect stats (always 0) for these operations.
Files: internal/api/admin/douyin_orders_admin.go:302,337
Impact: Douyin sync reports are inaccurate. Operators cannot confirm how many prizes were actually granted or refunded per sync run.
Fix approach: Update SyncRefundStatus and GrantLivestreamPrizes to return counts, propagate to response.

Duplicate user handler registration with undocumented intent:

Issue: internal/router/router.go:83 has an explicit // TODO: Check if userHandler and userAppHandler are redundant or distinct. comment. Two user handler instances exist with unclear separation of responsibility.
Files: internal/router/router.go:82-86
Impact: Risk of inconsistent behavior — changes to one handler may be expected to apply to both but don't.
Fix approach: Audit both handler paths; consolidate or document the distinction.

Known Bugs

Debug fmt.Printf statements left in production code:

Symptoms: Over 35 fmt.Printf("[DEBUG]...") calls across production code path are unconditionally executed in all environments, writing to stdout rather than the structured Zap logger. This pollutes logs, degrades performance, and leaks business data.
Files (representative):
- internal/service/douyin/order_sync.go:648,710,731,800,812,824,829,840,846,850,855,859,872
- internal/service/sysconfig/dynamic_config.go:198
- internal/service/activity/activity_order_service.go:150,158,163,207,235,265
- internal/service/user/order_timeout.go:51,60,150
- internal/service/user/coupon_transfer.go:97
- internal/api/user/login_app.go:59
- internal/api/admin/lottery_admin.go:535
- internal/api/activity/lottery_app.go:67
- internal/api/activity/issue_choices_app.go:76,79,82
- internal/service/game/token.go:78,131,141,147,152,159
- internal/pkg/wechat/code2session.go:22
Trigger: Always — these are unconditional print calls in hot paths.
Fix: Remove all fmt.Printf calls; replace necessary observability with s.logger.Debug(...) calls gated by log level.

Silently discarded errors in critical paths:

Symptoms: Multiple writes/inserts ignore errors via _ = h.repo.GetDbW().Exec(...) and _ = h.repo.GetDbR().Raw(...).Scan(...). Failed inventory updates, ledger entries, and coupon operations produce no error response.
Files (representative):
- internal/api/admin/pay_refund_admin.go:155,174,195,198,224,227,235,239,262
- internal/api/activity/lottery_app.go:429,639,667
- internal/api/activity/issues_app.go:80
- internal/api/admin/users_profile.go:172,188,197,222,246,249
- internal/api/admin/activity_commitment_admin.go:62,63,64,66,96
Trigger: On database errors in refund, inventory, and query flows.
Fix: Wrap these in proper error handling; at minimum log errors; for writes in financial flows, propagate errors to callers.

Performance Bottlenecks

Dashboard admin handler is a 2,666-line monolith:

Problem: internal/api/admin/dashboard_admin.go contains the entire dashboard implementation in a single file. Many dashboard queries perform multiple full-table scans on orders, inventory, and draw_logs tables without pagination constraints.
Files: internal/api/admin/dashboard_admin.go
Cause: Dashboard endpoints aggregate across all-time data. Complex multi-table JOINs (up to 7 tables in dashboard_spending.go) run inline per request.
Improvement path: Introduce materialized summaries or scheduled background computation for dashboard aggregates. Split the file into per-widget files (max 400 lines each).

Unguarded Find() calls without LIMIT in handler layer:

Problem: Several handlers call .Find() on potentially unbounded result sets.
Files:
- internal/api/app/categories.go:44 — all active categories loaded at once
- internal/api/activity/issue_choices_app.go:50 — all reward settings for an issue
- internal/api/activity/draw_logs_app.go:133,146,164 — users, rewards, products hydration in loops
- internal/api/app/product_category.go:47
Cause: Missing Limit() calls; no pagination on supporting queries.
Improvement path: Add .Limit(500) guards and paginate public-facing endpoints. For in-memory hydration loops, batch-query by IN clause (already done in some places) but add bounds.

time.Sleep in hot production paths:

Problem: internal/pkg/wechat/shipping.go uses time.Sleep(time.Second) and time.Sleep(2 * time.Second) between WeChat API calls, blocking goroutines for up to 4 seconds per operation. Under load, this exhausts the goroutine pool.
Files: internal/pkg/wechat/shipping.go:138,165,215,246
Cause: Naive retry/rate-limit handling.
Improvement path: Replace with exponential backoff using time.After or context-aware wait, and move to a worker pool pattern.

Fragile Areas

Handler layer bypasses service layer for DB writes:

Files: internal/api/admin/pay_refund_admin.go, internal/api/activity/lottery_app.go, internal/api/activity/issue_choices_app.go
Why fragile: 113 direct GetDbW() calls exist in the API handler layer. Business logic (inventory updates, ledger entries, item card resets) is scattered across handlers and services, making transactional consistency hard to enforce and audit.
Safe modification: Any change to refund or inventory logic must trace all three locations (handler, service, scheduler). Do not add new business writes in handlers.
Test coverage: No dedicated tests for pay_refund_admin.go refund flows.

Douyin order sync with goroutine fan-out and mutex:

Files: internal/service/douyin/order_sync.go
Why fragile: The sync loop at line 1056 spawns a goroutine per order item with a shared sync.Mutex for counter updates. Errors from individual goroutines are collected in a slice protected by mutex, but goroutine lifecycle is managed only through a sync.WaitGroup. A single panicking goroutine will be caught by the task center worker recover, but may leave the mutex in an inconsistent state.
Safe modification: Add goroutine-level panic recovery inside the fan-out goroutine (line 1056). Do not increase fan-out concurrency without adding semaphore limiting.
Test coverage: 1 test file for a 1,094-line service.

Task center service is a 1,665-line file with embedded BUG FIX comments:

Files: internal/service/task_center/service.go
Why fragile: The file contains multiple // BUG FIX: comments at lines 715, 809, 1558, 1581, 1609, 1629 indicating patch-on-patch fixes. The quantity resolution logic for reward payloads has been fixed three times (lines 1558, 1581, 1609, 1629) with slight variations — these must be kept consistent.
Safe modification: When modifying reward quantity parsing, update all four sites. Do not add a fifth variant.
Test coverage: 4 test files cover primarily list filtering and invite logic, not the reward claim path.

Dynamic config service with global singleton and panic:

Files: internal/service/sysconfig/global.go:34-37, internal/service/sysconfig/dynamic_config.go
Why fragile: GetGlobalDynamicConfig() panics if called before InitGlobalDynamicConfig(). Any package that calls this at init time or before main.go initialization order completes will crash the process.
Safe modification: Always call InitGlobalDynamicConfig() as the first step after DB initialization in main.go. Do not call GetGlobalDynamicConfig() at package init() scope.
Test coverage: No tests in internal/service/sysconfig/.

Test Coverage Gaps

User service layer (37 source files, 2 test files):

What's not tested: Login flow, coupon add/transfer, order timeout, address share, expiration task, WeChat integration wrappers.
Files: internal/service/user/ (35 untested files including login_weixin.go, coupon_add.go, order_timeout.go, address_share.go)
Risk: Regressions in payment-adjacent and user lifecycle code go undetected.
Priority: High

Activity service (24 source files, 3 test files):

What's not tested: lottery_process.go, activity_order_service.go, scheduler.go, matching_game.go, most of the 24 files.
Files: internal/service/activity/
Risk: Core lottery and order creation flows have no automated test coverage. Any refactor risks silent breakage.
Priority: High

Admin service (6 source files, 0 test files):

What's not tested: All of internal/service/admin/.
Files: internal/service/admin/
Risk: Admin-level business logic untested.
Priority: Medium

Snapshot and recycle services:

What's not tested: internal/service/snapshot/ (2 files, 0 tests), internal/service/recycle/ (1 file, 0 tests).
Risk: Snapshot replay and recycle operations silently broken.
Priority: Medium

Sysconfig service:

What's not tested: internal/service/sysconfig/ (3 files, 0 tests) including the global singleton and dynamic config loader.
Risk: Config loading failures or key formatting bugs go undetected.
Priority: Medium

Scaling Limits

Matching game in-memory state:

Current capacity: Game state for the matching card game (MatchingGameState) is stored in internal/service/activity/matching_game.go using an in-memory sync.Mutex-protected struct. This is per-process state.
Limit: Cannot scale horizontally — a second server instance has no visibility into game state of the first.
Scaling path: Migrate game state to Redis using atomic operations or use a dedicated game state store (e.g., Nakama, which is already referenced in the codebase).

Scheduler goroutines without stop signal:

Current capacity: internal/service/activity/scheduler.go:38 and internal/service/douyin/scheduler.go:29 spawn bare goroutines that loop forever on time.Sleep(30 * time.Second).
Limit: Cannot be stopped gracefully on shutdown — goroutines outlive the shutdown context.
Scaling path: Thread a context.Context into the scheduler loop and break on ctx.Done().

Dependencies at Risk

Proliferation of debug command tools in cmd/:

Risk: cmd/ contains 9+ one-off debug/diagnostic tools (debug_task_270, debug_check_coupon_22, fix_openid, exploit_verify, check_order, etc.) that have hardcoded database credentials or connection strings for one-time use. These tools may be committed with active credentials and are not maintained.
Impact: Security exposure if credentials are embedded; build confusion if these break the CI pipeline.
Migration plan: Move all one-off tools to a tools/ directory with a clear no-deploy policy, or delete after use.

Concerns audit: 2026-03-21

15 KiB Raw Permalink Blame History