288 lines
20 KiB
Markdown
288 lines
20 KiB
Markdown
# Pitfalls Research
|
||
|
||
**Domain:** Go/GORM/MySQL financial analytics — profit/loss aggregation functions
|
||
**Researched:** 2026-03-21
|
||
**Confidence:** HIGH (derived directly from existing codebase evidence + confirmed Go/MySQL behavior)
|
||
|
||
---
|
||
|
||
## Critical Pitfalls
|
||
|
||
### Pitfall 1: MySQL SUM with Division Returns Decimal, Not SIGNED Integer
|
||
|
||
**What goes wrong:**
|
||
When a `SUM()` expression includes any division operation (e.g., `SUM(amount * draw_count / total_count)`), MySQL returns the result as a `Decimal` type, not `BIGINT`. Scanning a Decimal into a Go `int64` field silently returns `0`. The dashboard code already hit this and left a comment documenting it.
|
||
|
||
Evidence from `dashboard_activity.go:174`:
|
||
```
|
||
// 注意: MySQL SUM()运算涉及除法时会返回Decimal类型,需要Scan到float64
|
||
```
|
||
The fix used there: scan revenue stats into `float64`, then cast to `int64` in Go.
|
||
|
||
**Why it happens:**
|
||
MySQL promotes arithmetic involving division to Decimal to preserve fractional precision. GORM's `Scan()` does not coerce types — it matches Go field types exactly, and `int64` ≠ Decimal causes a silent zero.
|
||
|
||
**How to avoid:**
|
||
Wrap any `SUM` that contains division with `CAST(... AS SIGNED)` in the SQL itself. This forces integer rounding at the database layer and lets you scan directly into `int64`. The existing cost query in `dashboard_activity.go:237` already uses this pattern:
|
||
```sql
|
||
CAST(SUM(...) AS SIGNED) as total_cost
|
||
```
|
||
Use `CAST(... AS SIGNED)` on every aggregated column that involves division. Never scan division-containing SUM results directly into `int64` without the cast.
|
||
|
||
**Warning signs:**
|
||
- Aggregated monetary fields come back as `0` even when data exists
|
||
- Revenue stats are non-zero but cost stats are zero (or vice versa)
|
||
- Struct fields stay at their zero values after `Scan()`
|
||
|
||
**Phase to address:** Implementation phase — apply during every query that uses proportional allocation (e.g., distributing an order's revenue across multiple activities via `draw_count / total_count`).
|
||
|
||
---
|
||
|
||
### Pitfall 2: Double-Counting Revenue When One Order Spans Multiple Activities
|
||
|
||
**What goes wrong:**
|
||
A single order can result in draw logs across multiple activities (e.g., a user plays activity A and activity B in one checkout). If you `SUM(orders.actual_amount)` grouped by activity without proportional allocation, the full order amount is counted in every activity it touches. The existing dashboard already experienced this and added two-level subquery attribution.
|
||
|
||
Evidence from `dashboard_activity.go:197-212`: the fix was to compute `draw_count per (order, activity)` and `total_count per order` in two separate subqueries, then scale the order amount by the ratio `draw_count / total_count`.
|
||
|
||
**Why it happens:**
|
||
Aggregation joins `orders` to `activity_draw_logs` which is a one-to-many relationship. Without explicit proration, the order amount fans out to every matching activity row.
|
||
|
||
**How to avoid:**
|
||
Always attribute revenue using the subquery pattern:
|
||
```sql
|
||
JOIN (
|
||
SELECT order_id, activity_id, COUNT(*) as draw_count
|
||
FROM activity_draw_logs JOIN activity_issues ON ...
|
||
GROUP BY order_id, activity_id
|
||
) as order_activity_draws ON order_activity_draws.order_id = orders.id
|
||
JOIN (
|
||
SELECT order_id, COUNT(*) as total_count
|
||
FROM activity_draw_logs GROUP BY order_id
|
||
) as order_total_draws ON order_total_draws.order_id = orders.id
|
||
```
|
||
Then multiply: `orders.actual_amount * order_activity_draws.draw_count / order_total_draws.total_count`. For the user-dimension function, this pattern still applies if a user's order touches multiple issues.
|
||
|
||
**Warning signs:**
|
||
- Total revenue across all activities exceeds the sum of all actual order payments
|
||
- A user's computed spending is greater than what WeChat Pay received
|
||
- Profit rates are implausibly negative across many activities
|
||
|
||
**Phase to address:** Implementation phase — design the user-dimension and activity-dimension query structure before writing SQL.
|
||
|
||
---
|
||
|
||
### Pitfall 3: Mixing Game-Pass Orders into Cash Revenue (Calculation Mouth-Discrepancy)
|
||
|
||
**What goes wrong:**
|
||
Game-pass orders (次卡) have `actual_amount = 0` and `source_type = 4` (or `order_no LIKE 'GP%'` or remark containing `use_game_pass`). Including them in `SUM(actual_amount + discount_amount)` makes their "revenue" appear as zero, understating total income. Including them in cost without crediting their imputed value makes every game-pass activity show a loss.
|
||
|
||
The codebase defines three detection conditions in `internal/service/finance/profit_metrics.go:IsGamePassOrder`. These must all be checked — any single condition is insufficient because historical data uses different conventions.
|
||
|
||
**Why it happens:**
|
||
Game-pass orders are structurally identical to regular orders but have zero monetary value. Treating all orders uniformly by summing `actual_amount` misses the imputed value of the subscription the user already paid.
|
||
|
||
**How to avoid:**
|
||
Use strict mutual exclusion in SQL:
|
||
- If game-pass order: revenue = `draw_count * activity.price_draw`, discount = 0, cash = 0
|
||
- If cash/coupon order: revenue = `actual_amount + discount_amount`, game-pass value = 0
|
||
- Use `CASE WHEN (source_type=4 OR order_no LIKE 'GP%' OR (actual_amount=0 AND remark LIKE '%use_game_pass%')) THEN ... ELSE ...` in every SUM
|
||
|
||
Never add `actual_amount + discount_amount + game_pass_value` as if they are additive columns of the same thing. They are alternative values for the same economic event.
|
||
|
||
**Warning signs:**
|
||
- Activities with many game-pass players show profit rates near -100%
|
||
- Total platform revenue is suspiciously lower than WeChat Pay reports
|
||
- `SpendingPaidCoupon` and `SpendingGamePass` are both non-zero for the same order
|
||
|
||
**Phase to address:** Implementation phase — encode the mutual-exclusion rule in query construction helpers before writing any aggregate SQL.
|
||
|
||
---
|
||
|
||
### Pitfall 4: Silently Ignoring Scan Errors on Aggregation Queries
|
||
|
||
**What goes wrong:**
|
||
Several existing dashboard queries call `db.Table(...).Select(...).Scan(&stats)` without checking the returned error. If the query fails (schema mismatch, column rename, database failover), `stats` remains an empty slice, downstream computations produce zero results, and no error is returned to the caller. The data looks correct (all zeros) rather than erroring.
|
||
|
||
Evidence from `dashboard_activity.go:146-158` — `drawStats` scan has no `.Error` check. The pattern appears in multiple places throughout the dashboard handlers.
|
||
|
||
**Why it happens:**
|
||
GORM's method chaining makes it easy to forget error handling. The pattern `db.Table(...).Scan(&x)` is syntactically identical whether you check `.Error` or not. In exploratory handler code that was never tested, errors were skipped for brevity.
|
||
|
||
**How to avoid:**
|
||
The new `internal/service/finance/` package must check every query error:
|
||
```go
|
||
if err := db.Table(...).Scan(&result).Error; err != nil {
|
||
return nil, fmt.Errorf("profit_loss query failed: %w", err)
|
||
}
|
||
```
|
||
Service functions should return `error` as second return value — not swallow errors internally. The existing `profit_metrics.go` pure functions have no DB access and are fine; the DB-querying functions must propagate errors.
|
||
|
||
**Warning signs:**
|
||
- Function returns zero values with no error in tests against an empty SQLite db
|
||
- Aggregation results are uniformly zero across all parameters
|
||
- Schema changes (column renames, table renames) cause silent failures
|
||
|
||
**Phase to address:** Implementation phase — establish error-check convention in the first function written; testing phase — assert non-nil error on deliberately broken queries.
|
||
|
||
---
|
||
|
||
### Pitfall 5: Omitting Refunded Orders from Cost Calculation
|
||
|
||
**What goes wrong:**
|
||
Inventory items (`user_inventory`) awarded from a subsequently refunded order should be excluded from cost. If you compute cost by summing `user_inventory.value_cents` grouped by `activity_id` without filtering on `orders.status`, you count the cost of prizes from refunded orders but don't count their revenue — making the platform appear to have given away prizes for free.
|
||
|
||
The existing code in `dashboard_activity.go:250-251` already had to special-case this:
|
||
```go
|
||
Where("(orders.status = ? OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)", 2)
|
||
```
|
||
Note the legacy data escape hatch: some old inventory rows have `order_id = 0` or NULL and cannot be filtered by order status. This must be preserved.
|
||
|
||
**Why it happens:**
|
||
`user_inventory` records are created when prizes are awarded, which happens before the refund window closes. Refunds do not delete inventory rows — they update `orders.status` to 4. Naive aggregation on `user_inventory` ignores order status entirely.
|
||
|
||
**How to avoid:**
|
||
Always join `orders` to `user_inventory` via `order_id` and include the legacy escape hatch:
|
||
```sql
|
||
LEFT JOIN orders ON orders.id = user_inventory.order_id
|
||
WHERE (orders.status = 2 OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)
|
||
AND COALESCE(user_inventory.remark, '') NOT LIKE '%void%'
|
||
```
|
||
The `void` remark filter is also required — manually voided inventory entries should never count as platform cost.
|
||
|
||
**Warning signs:**
|
||
- Platform cost is higher than expected for activities with known refund activity
|
||
- Cost-side totals don't reconcile with accounting system data
|
||
- Test cases with a refunded order still show non-zero cost
|
||
|
||
**Phase to address:** Implementation phase — add a test case with a refunded order and verify cost = 0 for that order's prizes.
|
||
|
||
---
|
||
|
||
### Pitfall 6: Using Write DB (DbW) for Analytics Queries
|
||
|
||
**What goes wrong:**
|
||
The project has master-slave read-write splitting. Analytics queries that run on `GetDbW()` (master) instead of `GetDbR()` (replica) add latency to the write path, can block replication, and in the worst case cause master overload under concurrent analytics requests.
|
||
|
||
The CONCERNS.md already flags 113 direct `GetDbW()` calls in the handler layer. The pattern of bypassing the correct DB connection is established in the codebase and can propagate to new code.
|
||
|
||
**Why it happens:**
|
||
`GetDbR()` and `GetDbW()` look identical in usage. Developers copying from handler code that was written for writes will use `GetDbW()` by accident. The finance service package does not yet have established conventions.
|
||
|
||
**How to avoid:**
|
||
The new `internal/service/finance/` service must accept a `*gorm.DB` read-only handle at construction time (inject `repo.GetDbR()`), not a full repository. Document in the function signatures or struct fields that only the read replica is used:
|
||
```go
|
||
type ProfitLossService struct {
|
||
dbR *gorm.DB // read replica only — never use for writes
|
||
logger *zap.Logger
|
||
}
|
||
```
|
||
Never call `repo.GetDbW()` inside finance analytics functions.
|
||
|
||
**Warning signs:**
|
||
- MySQL master replication lag increases when analytics endpoint is called
|
||
- Write latency spikes during dashboard loads
|
||
- `GetDbW()` appears in `internal/service/finance/` source files
|
||
|
||
**Phase to address:** Implementation phase — inject read-only DB handle in constructor; testing phase — verify with a mock that only the read DB is called.
|
||
|
||
---
|
||
|
||
## Technical Debt Patterns
|
||
|
||
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|
||
|----------|-------------------|----------------|-----------------|
|
||
| Scan revenue into `float64` instead of fixing SQL with `CAST(AS SIGNED)` | Avoids SQL rewrite | Floating-point rounding on monetary values (e.g., 0.1 + 0.2 ≠ 0.3 in IEEE 754) | Never for monetary fields — always use `CAST(AS SIGNED)` |
|
||
| In-memory sort + full table fetch for custom sort order | Simpler than `ORDER BY` with computed columns | Loads unbounded rows into Go heap when activity count grows | Only acceptable if total row count is bounded by pagination elsewhere |
|
||
| Hardcoding game-pass detection conditions in each query | Avoids abstraction overhead | Three different detection conditions must stay in sync across multiple queries | Never — centralize detection in `IsGamePassOrder()` already defined in `finance` package |
|
||
| Skip error check on `Scan()` | Fewer lines of code | Silent wrong data; impossible to distinguish "query returned zero rows" from "query failed" | Never for financial data |
|
||
| Use `AVG(multiplier)` across draws as the cost multiplier | One query instead of per-row | Hides per-order multiplier variance; a 2x card on one draw inflates cost for all draws in the group | Acceptable for summary statistics; not for per-order breakdowns |
|
||
|
||
---
|
||
|
||
## Integration Gotchas
|
||
|
||
| Integration | Common Mistake | Correct Approach |
|
||
|-------------|----------------|-----------------|
|
||
| GORM `Scan` into anonymous struct | Forgetting to qualify column names in SELECT causes ambiguous column error when multiple tables have `id`, `created_at`, etc. | Always alias computed columns explicitly: `SELECT orders.user_id as user_id`, not `SELECT user_id` |
|
||
| GORM raw SQL with `Raw()` + `Scan()` | Parameterized values passed in wrong order cause SQL to silently use zero values | Verify query with `db.Statement.SQL.String()` during development; test with non-trivial input values |
|
||
| MySQL `COALESCE` with nullable int columns | `COALESCE(NULL, 0)` works but `COALESCE(column, 0)` on a non-nullable column with value `0` returns `0` — `NULLIF` needed to distinguish "not set" from "explicitly zero" | Use `COALESCE(NULLIF(value_cents, 0), fallback_1, fallback_2, 0)` pattern already established in existing cost queries |
|
||
| Multiple ID lists in `WHERE IN (?)` with GORM | Passing an empty slice `[]int64{}` produces invalid SQL `WHERE id IN ()` in some GORM versions | Guard with `if len(ids) == 0 { return emptyResult, nil }` before building the query |
|
||
| Read replica lag | Querying replica immediately after a write (e.g., after seeding test data) can return stale results | In tests, use write DB handle or wait for sync; in production, this is acceptable for analytics |
|
||
|
||
---
|
||
|
||
## Performance Traps
|
||
|
||
| Trap | Symptoms | Prevention | When It Breaks |
|
||
|------|----------|------------|----------------|
|
||
| Fetching all activities before computing profit/loss (no predicate pushdown) | 100% CPU on `Find(&activities)`, slow response time | Apply all filters (status, name, date range) in the initial `query` before scanning, then pass `activityIDs` to subsequent queries | When activity count exceeds ~1,000 |
|
||
| Correlated subquery inside SUM for every row | Query time grows O(n²) with draw log volume | Pre-aggregate into a derived table subquery joined once, not per-row | When draw_logs table exceeds ~500K rows |
|
||
| No index on `activity_draw_logs.order_id` or `user_inventory.activity_id` | Sequential scan on every analytics query | Verify indexes exist with `SHOW INDEX FROM activity_draw_logs`; add composite index `(issue_id, order_id)` if missing | From day one on tables with writes |
|
||
| Loading all activities into memory for in-application sort | Memory spike on large result sets; no benefit if caller only wants top-10 | Accept this tradeoff only when total activities < 500; add a hard cap with an error if exceeded | When activity count exceeds ~500 |
|
||
| Querying `user_inventory` without `status IN (1, 3)` filter | Voided/cancelled inventory items inflate cost | Always filter: `WHERE user_inventory.status IN (1, 3)` | Immediately — even small void counts distort cost |
|
||
|
||
---
|
||
|
||
## Security Mistakes
|
||
|
||
| Mistake | Risk | Prevention |
|
||
|---------|------|------------|
|
||
| Interpolating user-supplied `user_id` or `activity_id` into raw SQL string instead of parameterized query | SQL injection — attacker can exfiltrate all financial data | Always use parameterized queries: `.Where("user_id IN ?", ids)` not `fmt.Sprintf("user_id IN (%s)", idsStr)` |
|
||
| Exposing raw profit/loss data without admin role check | Non-admin users can read platform margin data | The new service functions are Service layer — callers (API handlers) must apply `RequireAdminRole()` middleware; document this requirement in the function's GoDoc |
|
||
| Logging query parameters that contain user IDs | User ID lists in error logs can be correlated with financial data | Log query failure with a count, not the full ID list: `"profit_loss query failed for %d users: %v"` |
|
||
|
||
---
|
||
|
||
## "Looks Done But Isn't" Checklist
|
||
|
||
- [ ] **Game-pass mutual exclusion:** Verify that `SpendingPaidCoupon` and `SpendingGamePass` are never both non-zero for the same order. Write a test case with a mixed-type order set.
|
||
- [ ] **Refunded order exclusion:** Add a test case where an order is refunded (status=4) and verify it contributes zero to both revenue and cost.
|
||
- [ ] **Legacy zero order_id:** Confirm inventory rows with `order_id = 0` are included in cost (not excluded by the orders JOIN). Add a test row with `order_id = 0` and verify it appears in cost.
|
||
- [ ] **Empty parameter handling:** Call both functions with nil/empty `userIDs` and nil/empty `activityID` — verify they return all-data aggregation, not empty results or SQL errors.
|
||
- [ ] **All five asset types covered:** Points, coupons, item cards, physical products, fragments. Verify all five appear in the breakdown output. Missing one silently understates cost.
|
||
- [ ] **CAST on division SUM:** Open every query with a `/` operator in a SUM and confirm `CAST(... AS SIGNED)` wraps the entire expression.
|
||
- [ ] **Read-only DB used:** Grep for `GetDbW` inside `internal/service/finance/` — result must be empty.
|
||
- [ ] **Error propagation:** Every `Scan()` call inside finance functions must have its `.Error` checked and returned to the caller.
|
||
|
||
---
|
||
|
||
## Recovery Strategies
|
||
|
||
| Pitfall | Recovery Cost | Recovery Steps |
|
||
|---------|---------------|----------------|
|
||
| Decimal-to-int64 silent zero | LOW | Add `CAST(AS SIGNED)` to affected SQL; rerun query — no data migration needed |
|
||
| Revenue double-counting discovered post-launch | MEDIUM | Backfill correct totals by recomputing with fixed query over historical data; notify operators of corrected figures |
|
||
| Wrong DB handle (write instead of read) | LOW | Change constructor injection; no data impact |
|
||
| Missing refund exclusion | MEDIUM | Recompute affected period's profit/loss with corrected query; mark old reports as superseded |
|
||
| Silently swallowed errors causing wrong zeros | LOW-MEDIUM | Add error checks; add alerting on zero-result aggregations where data is expected; audit logs for the affected period |
|
||
|
||
---
|
||
|
||
## Pitfall-to-Phase Mapping
|
||
|
||
| Pitfall | Prevention Phase | Verification |
|
||
|---------|-----------------|--------------|
|
||
| Decimal/int64 scan mismatch | Implementation — SQL design | Integration test: query with division-containing SUM, assert non-zero int64 result |
|
||
| Revenue double-counting | Implementation — query structure design | Test: one order across two activities; assert sum of per-activity revenue equals order total |
|
||
| Game-pass mutual exclusion | Implementation — use `IsGamePassOrder()` helper | Unit test: game-pass order contributes to `SpendingGamePass` only, not `SpendingPaidCoupon` |
|
||
| Ignored Scan errors | Implementation — code review gate | Test: deliberately broken query (wrong table name); assert returned error is non-nil |
|
||
| Refunded order in cost | Implementation — WHERE clause | Test: refunded order inventory; assert cost contribution is zero |
|
||
| Write DB used | Implementation — constructor injection | Grep check in CI: `GetDbW` must not appear in `internal/service/finance/` |
|
||
| Missing LIMIT on supporting queries | Implementation — query design | Load test with 1000 activities; verify response time stays under 2s |
|
||
|
||
---
|
||
|
||
## Sources
|
||
|
||
- `internal/api/admin/dashboard_activity.go` — direct evidence of BUG FIX comments for Decimal/int64, double-counting, game-pass misclassification (lines 173-175, 274-275, 544-545)
|
||
- `internal/api/admin/dashboard_spending.go` — evidence of multi-join aggregation patterns and game-pass CASE expressions
|
||
- `internal/service/finance/profit_metrics.go` — `IsGamePassOrder()` three-condition detection; `ComputeProfit()` integer arithmetic; established pattern for cost multiplier
|
||
- `internal/service/finance/profit_metrics_test.go` — existing test coverage confirming pure-function behavior
|
||
- `.planning/codebase/CONCERNS.md` — flagged 113 `GetDbW()` calls in handler layer, silently swallowed errors in financial paths, missing error checks in `pay_refund_admin.go`
|
||
- Go `database/sql` specification — `Scan()` does not coerce types; MySQL 8.x documentation — SUM with division promotes to Decimal
|
||
|
||
---
|
||
*Pitfalls research for: Go/GORM/MySQL profit/loss analytics (Bindbox Game)*
|
||
*Researched: 2026-03-21*
|