bindbox-game/.planning/research/PITFALLS.md

# Pitfalls Research

**Domain:** Go/GORM/MySQL financial analytics — profit/loss aggregation functions
**Researched:** 2026-03-21
**Confidence:** HIGH (derived directly from existing codebase evidence + confirmed Go/MySQL behavior)

---

## Critical Pitfalls

### Pitfall 1: MySQL SUM with Division Returns Decimal, Not SIGNED Integer

**What goes wrong:**
When a `SUM()` expression includes any division operation (e.g., `SUM(amount * draw_count / total_count)`), MySQL returns the result as a `Decimal` type, not `BIGINT`. Scanning a Decimal into a Go `int64` field silently returns `0`. The dashboard code already hit this and left a comment documenting it.

Evidence from `dashboard_activity.go:174`:
```
// 注意: MySQL SUM()运算涉及除法时会返回Decimal类型，需要Scan到float64
```
The fix used there: scan revenue stats into `float64`, then cast to `int64` in Go.

**Why it happens:**
MySQL promotes arithmetic involving division to Decimal to preserve fractional precision. GORM's `Scan()` does not coerce types — it matches Go field types exactly, and `int64` ≠ Decimal causes a silent zero.

**How to avoid:**
Wrap any `SUM` that contains division with `CAST(... AS SIGNED)` in the SQL itself. This forces integer rounding at the database layer and lets you scan directly into `int64`. The existing cost query in `dashboard_activity.go:237` already uses this pattern:
```sql
CAST(SUM(...) AS SIGNED) as total_cost
```
Use `CAST(... AS SIGNED)` on every aggregated column that involves division. Never scan division-containing SUM results directly into `int64` without the cast.

**Warning signs:**
- Aggregated monetary fields come back as `0` even when data exists
- Revenue stats are non-zero but cost stats are zero (or vice versa)
- Struct fields stay at their zero values after `Scan()`

**Phase to address:** Implementation phase — apply during every query that uses proportional allocation (e.g., distributing an order's revenue across multiple activities via `draw_count / total_count`).

---

### Pitfall 2: Double-Counting Revenue When One Order Spans Multiple Activities

**What goes wrong:**
A single order can result in draw logs across multiple activities (e.g., a user plays activity A and activity B in one checkout). If you `SUM(orders.actual_amount)` grouped by activity without proportional allocation, the full order amount is counted in every activity it touches. The existing dashboard already experienced this and added two-level subquery attribution.

Evidence from `dashboard_activity.go:197-212`: the fix was to compute `draw_count per (order, activity)` and `total_count per order` in two separate subqueries, then scale the order amount by the ratio `draw_count / total_count`.

**Why it happens:**
Aggregation joins `orders` to `activity_draw_logs` which is a one-to-many relationship. Without explicit proration, the order amount fans out to every matching activity row.

**How to avoid:**
Always attribute revenue using the subquery pattern:
```sql
JOIN (
    SELECT order_id, activity_id, COUNT(*) as draw_count
    FROM activity_draw_logs JOIN activity_issues ON ...
    GROUP BY order_id, activity_id
) as order_activity_draws ON order_activity_draws.order_id = orders.id
JOIN (
    SELECT order_id, COUNT(*) as total_count
    FROM activity_draw_logs GROUP BY order_id
) as order_total_draws ON order_total_draws.order_id = orders.id
```
Then multiply: `orders.actual_amount * order_activity_draws.draw_count / order_total_draws.total_count`. For the user-dimension function, this pattern still applies if a user's order touches multiple issues.

**Warning signs:**
- Total revenue across all activities exceeds the sum of all actual order payments
- A user's computed spending is greater than what WeChat Pay received
- Profit rates are implausibly negative across many activities

**Phase to address:** Implementation phase — design the user-dimension and activity-dimension query structure before writing SQL.

---

### Pitfall 3: Mixing Game-Pass Orders into Cash Revenue (Calculation Mouth-Discrepancy)

**What goes wrong:**
Game-pass orders (次卡) have `actual_amount = 0` and `source_type = 4` (or `order_no LIKE 'GP%'` or remark containing `use_game_pass`). Including them in `SUM(actual_amount + discount_amount)` makes their "revenue" appear as zero, understating total income. Including them in cost without crediting their imputed value makes every game-pass activity show a loss.

The codebase defines three detection conditions in `internal/service/finance/profit_metrics.go:IsGamePassOrder`. These must all be checked — any single condition is insufficient because historical data uses different conventions.

**Why it happens:**
Game-pass orders are structurally identical to regular orders but have zero monetary value. Treating all orders uniformly by summing `actual_amount` misses the imputed value of the subscription the user already paid.

**How to avoid:**
Use strict mutual exclusion in SQL:
- If game-pass order: revenue = `draw_count * activity.price_draw`, discount = 0, cash = 0
- If cash/coupon order: revenue = `actual_amount + discount_amount`, game-pass value = 0
- Use `CASE WHEN (source_type=4 OR order_no LIKE 'GP%' OR (actual_amount=0 AND remark LIKE '%use_game_pass%')) THEN ... ELSE ...` in every SUM

Never add `actual_amount + discount_amount + game_pass_value` as if they are additive columns of the same thing. They are alternative values for the same economic event.

**Warning signs:**
- Activities with many game-pass players show profit rates near -100%
- Total platform revenue is suspiciously lower than WeChat Pay reports
- `SpendingPaidCoupon` and `SpendingGamePass` are both non-zero for the same order

**Phase to address:** Implementation phase — encode the mutual-exclusion rule in query construction helpers before writing any aggregate SQL.

---

### Pitfall 4: Silently Ignoring Scan Errors on Aggregation Queries

**What goes wrong:**
Several existing dashboard queries call `db.Table(...).Select(...).Scan(&stats)` without checking the returned error. If the query fails (schema mismatch, column rename, database failover), `stats` remains an empty slice, downstream computations produce zero results, and no error is returned to the caller. The data looks correct (all zeros) rather than erroring.

Evidence from `dashboard_activity.go:146-158` — `drawStats` scan has no `.Error` check. The pattern appears in multiple places throughout the dashboard handlers.

**Why it happens:**
GORM's method chaining makes it easy to forget error handling. The pattern `db.Table(...).Scan(&x)` is syntactically identical whether you check `.Error` or not. In exploratory handler code that was never tested, errors were skipped for brevity.

**How to avoid:**
The new `internal/service/finance/` package must check every query error:
```go
if err := db.Table(...).Scan(&result).Error; err != nil {
    return nil, fmt.Errorf("profit_loss query failed: %w", err)
}
```
Service functions should return `error` as second return value — not swallow errors internally. The existing `profit_metrics.go` pure functions have no DB access and are fine; the DB-querying functions must propagate errors.

**Warning signs:**
- Function returns zero values with no error in tests against an empty SQLite db
- Aggregation results are uniformly zero across all parameters
- Schema changes (column renames, table renames) cause silent failures

**Phase to address:** Implementation phase — establish error-check convention in the first function written; testing phase — assert non-nil error on deliberately broken queries.

---

### Pitfall 5: Omitting Refunded Orders from Cost Calculation

**What goes wrong:**
Inventory items (`user_inventory`) awarded from a subsequently refunded order should be excluded from cost. If you compute cost by summing `user_inventory.value_cents` grouped by `activity_id` without filtering on `orders.status`, you count the cost of prizes from refunded orders but don't count their revenue — making the platform appear to have given away prizes for free.

The existing code in `dashboard_activity.go:250-251` already had to special-case this:
```go
Where("(orders.status = ? OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)", 2)
```
Note the legacy data escape hatch: some old inventory rows have `order_id = 0` or NULL and cannot be filtered by order status. This must be preserved.

**Why it happens:**
`user_inventory` records are created when prizes are awarded, which happens before the refund window closes. Refunds do not delete inventory rows — they update `orders.status` to 4. Naive aggregation on `user_inventory` ignores order status entirely.

**How to avoid:**
Always join `orders` to `user_inventory` via `order_id` and include the legacy escape hatch:
```sql
LEFT JOIN orders ON orders.id = user_inventory.order_id
WHERE (orders.status = 2 OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)
AND COALESCE(user_inventory.remark, '') NOT LIKE '%void%'
```
The `void` remark filter is also required — manually voided inventory entries should never count as platform cost.

**Warning signs:**
- Platform cost is higher than expected for activities with known refund activity
- Cost-side totals don't reconcile with accounting system data
- Test cases with a refunded order still show non-zero cost

**Phase to address:** Implementation phase — add a test case with a refunded order and verify cost = 0 for that order's prizes.

---

### Pitfall 6: Using Write DB (DbW) for Analytics Queries

**What goes wrong:**
The project has master-slave read-write splitting. Analytics queries that run on `GetDbW()` (master) instead of `GetDbR()` (replica) add latency to the write path, can block replication, and in the worst case cause master overload under concurrent analytics requests.

The CONCERNS.md already flags 113 direct `GetDbW()` calls in the handler layer. The pattern of bypassing the correct DB connection is established in the codebase and can propagate to new code.

**Why it happens:**
`GetDbR()` and `GetDbW()` look identical in usage. Developers copying from handler code that was written for writes will use `GetDbW()` by accident. The finance service package does not yet have established conventions.

**How to avoid:**
The new `internal/service/finance/` service must accept a `*gorm.DB` read-only handle at construction time (inject `repo.GetDbR()`), not a full repository. Document in the function signatures or struct fields that only the read replica is used:
```go
type ProfitLossService struct {
    dbR    *gorm.DB  // read replica only — never use for writes
    logger *zap.Logger
}
```
Never call `repo.GetDbW()` inside finance analytics functions.

**Warning signs:**
- MySQL master replication lag increases when analytics endpoint is called
- Write latency spikes during dashboard loads
- `GetDbW()` appears in `internal/service/finance/` source files

**Phase to address:** Implementation phase — inject read-only DB handle in constructor; testing phase — verify with a mock that only the read DB is called.

---

## Technical Debt Patterns

| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|----------|-------------------|----------------|-----------------|
| Scan revenue into `float64` instead of fixing SQL with `CAST(AS SIGNED)` | Avoids SQL rewrite | Floating-point rounding on monetary values (e.g., 0.1 + 0.2 ≠ 0.3 in IEEE 754) | Never for monetary fields — always use `CAST(AS SIGNED)` |
| In-memory sort + full table fetch for custom sort order | Simpler than `ORDER BY` with computed columns | Loads unbounded rows into Go heap when activity count grows | Only acceptable if total row count is bounded by pagination elsewhere |
| Hardcoding game-pass detection conditions in each query | Avoids abstraction overhead | Three different detection conditions must stay in sync across multiple queries | Never — centralize detection in `IsGamePassOrder()` already defined in `finance` package |
| Skip error check on `Scan()` | Fewer lines of code | Silent wrong data; impossible to distinguish "query returned zero rows" from "query failed" | Never for financial data |
| Use `AVG(multiplier)` across draws as the cost multiplier | One query instead of per-row | Hides per-order multiplier variance; a 2x card on one draw inflates cost for all draws in the group | Acceptable for summary statistics; not for per-order breakdowns |

---

## Integration Gotchas

| Integration | Common Mistake | Correct Approach |
|-------------|----------------|-----------------|
| GORM `Scan` into anonymous struct | Forgetting to qualify column names in SELECT causes ambiguous column error when multiple tables have `id`, `created_at`, etc. | Always alias computed columns explicitly: `SELECT orders.user_id as user_id`, not `SELECT user_id` |
| GORM raw SQL with `Raw()` + `Scan()` | Parameterized values passed in wrong order cause SQL to silently use zero values | Verify query with `db.Statement.SQL.String()` during development; test with non-trivial input values |
| MySQL `COALESCE` with nullable int columns | `COALESCE(NULL, 0)` works but `COALESCE(column, 0)` on a non-nullable column with value `0` returns `0` — `NULLIF` needed to distinguish "not set" from "explicitly zero" | Use `COALESCE(NULLIF(value_cents, 0), fallback_1, fallback_2, 0)` pattern already established in existing cost queries |
| Multiple ID lists in `WHERE IN (?)` with GORM | Passing an empty slice `[]int64{}` produces invalid SQL `WHERE id IN ()` in some GORM versions | Guard with `if len(ids) == 0 { return emptyResult, nil }` before building the query |
| Read replica lag | Querying replica immediately after a write (e.g., after seeding test data) can return stale results | In tests, use write DB handle or wait for sync; in production, this is acceptable for analytics |

---

## Performance Traps

| Trap | Symptoms | Prevention | When It Breaks |
|------|----------|------------|----------------|
| Fetching all activities before computing profit/loss (no predicate pushdown) | 100% CPU on `Find(&activities)`, slow response time | Apply all filters (status, name, date range) in the initial `query` before scanning, then pass `activityIDs` to subsequent queries | When activity count exceeds ~1,000 |
| Correlated subquery inside SUM for every row | Query time grows O(n²) with draw log volume | Pre-aggregate into a derived table subquery joined once, not per-row | When draw_logs table exceeds ~500K rows |
| No index on `activity_draw_logs.order_id` or `user_inventory.activity_id` | Sequential scan on every analytics query | Verify indexes exist with `SHOW INDEX FROM activity_draw_logs`; add composite index `(issue_id, order_id)` if missing | From day one on tables with writes |
| Loading all activities into memory for in-application sort | Memory spike on large result sets; no benefit if caller only wants top-10 | Accept this tradeoff only when total activities < 500; add a hard cap with an error if exceeded | When activity count exceeds ~500 |
| Querying `user_inventory` without `status IN (1, 3)` filter | Voided/cancelled inventory items inflate cost | Always filter: `WHERE user_inventory.status IN (1, 3)` | Immediately — even small void counts distort cost |

---

## Security Mistakes

| Mistake | Risk | Prevention |
|---------|------|------------|
| Interpolating user-supplied `user_id` or `activity_id` into raw SQL string instead of parameterized query | SQL injection — attacker can exfiltrate all financial data | Always use parameterized queries: `.Where("user_id IN ?", ids)` not `fmt.Sprintf("user_id IN (%s)", idsStr)` |
| Exposing raw profit/loss data without admin role check | Non-admin users can read platform margin data | The new service functions are Service layer — callers (API handlers) must apply `RequireAdminRole()` middleware; document this requirement in the function's GoDoc |
| Logging query parameters that contain user IDs | User ID lists in error logs can be correlated with financial data | Log query failure with a count, not the full ID list: `"profit_loss query failed for %d users: %v"` |

---

## "Looks Done But Isn't" Checklist

- [ ] **Game-pass mutual exclusion:** Verify that `SpendingPaidCoupon` and `SpendingGamePass` are never both non-zero for the same order. Write a test case with a mixed-type order set.
- [ ] **Refunded order exclusion:** Add a test case where an order is refunded (status=4) and verify it contributes zero to both revenue and cost.
- [ ] **Legacy zero order_id:** Confirm inventory rows with `order_id = 0` are included in cost (not excluded by the orders JOIN). Add a test row with `order_id = 0` and verify it appears in cost.
- [ ] **Empty parameter handling:** Call both functions with nil/empty `userIDs` and nil/empty `activityID` — verify they return all-data aggregation, not empty results or SQL errors.
- [ ] **All five asset types covered:** Points, coupons, item cards, physical products, fragments. Verify all five appear in the breakdown output. Missing one silently understates cost.
- [ ] **CAST on division SUM:** Open every query with a `/` operator in a SUM and confirm `CAST(... AS SIGNED)` wraps the entire expression.
- [ ] **Read-only DB used:** Grep for `GetDbW` inside `internal/service/finance/` — result must be empty.
- [ ] **Error propagation:** Every `Scan()` call inside finance functions must have its `.Error` checked and returned to the caller.

---

## Recovery Strategies

| Pitfall | Recovery Cost | Recovery Steps |
|---------|---------------|----------------|
| Decimal-to-int64 silent zero | LOW | Add `CAST(AS SIGNED)` to affected SQL; rerun query — no data migration needed |
| Revenue double-counting discovered post-launch | MEDIUM | Backfill correct totals by recomputing with fixed query over historical data; notify operators of corrected figures |
| Wrong DB handle (write instead of read) | LOW | Change constructor injection; no data impact |
| Missing refund exclusion | MEDIUM | Recompute affected period's profit/loss with corrected query; mark old reports as superseded |
| Silently swallowed errors causing wrong zeros | LOW-MEDIUM | Add error checks; add alerting on zero-result aggregations where data is expected; audit logs for the affected period |

---

## Pitfall-to-Phase Mapping

| Pitfall | Prevention Phase | Verification |
|---------|-----------------|--------------|
| Decimal/int64 scan mismatch | Implementation — SQL design | Integration test: query with division-containing SUM, assert non-zero int64 result |
| Revenue double-counting | Implementation — query structure design | Test: one order across two activities; assert sum of per-activity revenue equals order total |
| Game-pass mutual exclusion | Implementation — use `IsGamePassOrder()` helper | Unit test: game-pass order contributes to `SpendingGamePass` only, not `SpendingPaidCoupon` |
| Ignored Scan errors | Implementation — code review gate | Test: deliberately broken query (wrong table name); assert returned error is non-nil |
| Refunded order in cost | Implementation — WHERE clause | Test: refunded order inventory; assert cost contribution is zero |
| Write DB used | Implementation — constructor injection | Grep check in CI: `GetDbW` must not appear in `internal/service/finance/` |
| Missing LIMIT on supporting queries | Implementation — query design | Load test with 1000 activities; verify response time stays under 2s |

---

## Sources

- `internal/api/admin/dashboard_activity.go` — direct evidence of BUG FIX comments for Decimal/int64, double-counting, game-pass misclassification (lines 173-175, 274-275, 544-545)
- `internal/api/admin/dashboard_spending.go` — evidence of multi-join aggregation patterns and game-pass CASE expressions
- `internal/service/finance/profit_metrics.go` — `IsGamePassOrder()` three-condition detection; `ComputeProfit()` integer arithmetic; established pattern for cost multiplier
- `internal/service/finance/profit_metrics_test.go` — existing test coverage confirming pure-function behavior
- `.planning/codebase/CONCERNS.md` — flagged 113 `GetDbW()` calls in handler layer, silently swallowed errors in financial paths, missing error checks in `pay_refund_admin.go`
- Go `database/sql` specification — `Scan()` does not coerce types; MySQL 8.x documentation — SUM with division promotes to Decimal

---
*Pitfalls research for: Go/GORM/MySQL profit/loss analytics (Bindbox Game)*
*Researched: 2026-03-21*