docs: complete project research

win 2026-03-21 16:28:48 +08:00
parent fd2c9e242e
commit 9e3d893938
4 changed files with 1026 additions and 0 deletions


@ -0,0 +1,194 @@
# Feature Research
**Domain:** Profit/Loss analytics functions — platform-perspective P&L aggregation for a game/e-commerce platform
**Researched:** 2026-03-21
**Confidence:** HIGH (based on direct codebase analysis of existing analytics + domain patterns)
---
## Feature Landscape
### Table Stakes (Users Expect These)
These are the non-negotiable capabilities that any reusable P&L service function must have.
Operators calling these functions expect all of the following to "just work."
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| Revenue calculation (actual_amount + discount_amount) | Core platform-perspective income; existing Dashboard already does this | LOW | Coupon discount must be added back: it's real value received |
| Game-pass order classification | Orders with source_type=4, order_no LIKE 'GP%', or remark containing 'use_game_pass' need separate treatment | LOW | Logic already exists in `finance.IsGamePassOrder` — must be reused, not reimplemented |
| Game-pass value derivation (draw_count × activity_price) | Zero-cash orders have economic value; existing logic computes it correctly | LOW | `finance.ComputeGamePassValue` exists; new functions must call it |
| Prize cost calculation with item-card multiplier | Item cards double/triple prize value output; omitting multiplier understates cost | MEDIUM | `finance.ComputePrizeCostWithMultiplier` exists; value comes from `system_item_cards.reward_multiplier_x1000` |
| Profit = spending - prize_cost | Core formula; operators see profit and profit_rate | LOW | `finance.ComputeProfit` exists and returns (int64, float64) |
| Time-range filter (optional) | All existing dashboard analytics support time scoping | LOW | Must accept `*time.Time` for start/end; nil = all-time |
| User-dimension aggregation (one or many user IDs) | Operators look up whale users; existing `GetUserSpendingDashboard` does single-user only | MEDIUM | New function must accept `[]int64`; empty = all users |
| Activity-dimension aggregation (one activity ID) | Per-activity P&L is the primary ops view; `DashboardActivityProfitLoss` does this at handler level | MEDIUM | New function wraps the same logic as a reusable service method |
| "All asset types" as default (nil asset type = all) | PROJECT.md requires all params optional | LOW | Asset-type filter is additive; absence means no filter |
| Summary + per-asset-type breakdown in return value | Operators need total AND split by asset class | MEDIUM | Return struct must carry both `Summary` and `[]AssetBreakdown` |
| Refund/cancelled order exclusion | Orders in status 3 (cancelled) or 4 (refunded) must NOT count as revenue | LOW | Already enforced in existing Dashboard SQL; must be replicated |
| Voided inventory exclusion | Inventory with remark LIKE '%void%' or status=2 represents decomposed assets; must be excluded from prize cost | LOW | Pattern already established in existing queries |
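
The revenue, multiplier, and profit rows above reduce to a few lines of integer arithmetic. The following is a minimal sketch of that arithmetic only. The function names `prizeCostWithMultiplier` and `profit` are hypothetical stand-ins, not the signatures of `finance.ComputePrizeCostWithMultiplier` or `finance.ComputeProfit`. It also assumes that `reward_multiplier_x1000` stores 1000 for a 1x card and that the profit rate is profit divided by spending.

```go
package main

import "fmt"

// prizeCostWithMultiplier scales a prize's base value by an item-card
// multiplier stored in thousandths (assumption: 1000 means 1x, 2000 means 2x).
// A non-positive multiplier is treated as "no card", i.e. 1x.
func prizeCostWithMultiplier(baseCents, multiplierX1000 int64) int64 {
	if multiplierX1000 <= 0 {
		multiplierX1000 = 1000
	}
	return baseCents * multiplierX1000 / 1000
}

// profit returns spending minus prize cost, plus the profit rate
// (assumption: rate = profit / spending, and 0 when spending is 0).
func profit(spendingCents, prizeCostCents int64) (int64, float64) {
	p := spendingCents - prizeCostCents
	if spendingCents == 0 {
		return p, 0
	}
	return p, float64(p) / float64(spendingCents)
}

func main() {
	// 900 cents paid plus a 100-cent coupon discount added back.
	revenue := int64(900 + 100)
	// A 300-cent prize doubled by a 2x item card.
	cost := prizeCostWithMultiplier(300, 2000)
	p, rate := profit(revenue, cost)
	fmt.Println(revenue, cost, p, rate)
}
```
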
### Differentiators (Competitive Advantage)
Features that go beyond what the existing dashboard provides, making the new service layer genuinely more reusable.
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| Multi-user batch support (`[]int64` user IDs) | Existing dashboard only handles single user at a time; batch enables cross-user analytics (e.g., cohort P&L) | MEDIUM | Accept empty slice as "all users"; pass through as SQL IN clause |
| Composable filter struct (asset type, dimension ID, time range all optional) | Callers can mix and match filters without writing bespoke queries | MEDIUM | Use a `ProfitLossFilter` options struct with pointer fields for optionality |
| Canonical `AssetType` enum covering all 5 types | Points, coupon, item-card, physical-good, fragment — each type maps to different source tables | MEDIUM | Defining the enum properly prevents future callers guessing string/int values |
| Per-asset-type cost tracking (not just total) | Operators want to see "how much did item-card prizes cost vs physical goods" — the Dashboard conflates them | HIGH | Requires separate GROUP BY legs or CASE-based aggregation per type |
| Canonical spending classification reuse | New functions must call `finance.ClassifyOrderSpending` — not re-derive the rule — so calculation stays consistent everywhere | LOW | This is a correctness feature; prevents drift from the Dashboard numbers |
| Read-only DB enforcement (`DbR`) | Statistics queries must route to the read replica; new functions must accept a `*gorm.DB` injected from the caller (already DbR-aware) | LOW | Function signature should accept `db *gorm.DB` so callers can pass `h.repo.GetDbR()` |
### Anti-Features (Commonly Requested, Often Problematic)
| Feature | Why Requested | Why Problematic | Alternative |
|---------|---------------|-----------------|-------------|
| Caching / memoization inside the service function | "Stats queries are slow" | The service layer is not the right place for caching; it would break test isolation and caller control over staleness | Let the HTTP handler or a future cache layer wrap the call; the function stays pure |
| Real-time streaming / push notifications for P&L changes | "Alert me when profit drops" | Out of scope for v1 per PROJECT.md; adds event infrastructure complexity | Defer to a future monitoring milestone |
| Automatic pagination inside the aggregate function | "Return page X of users by profit" | Pagination belongs at the API layer; the service function returning a flat result set is more composable | Callers receive the full aggregated slice and paginate themselves |
| Reusing `DashboardActivityProfitLoss` handler logic directly | "Don't duplicate code" | The handler is tightly coupled to HTTP context, request parsing, and response formatting; pulling it into service layer would invert the dependency | New functions in `internal/service/finance/` are fresh implementations using shared `finance.*` primitives |
| Storing computed P&L in a materialized table | "Pre-compute for speed" | Requires write access and schema migration; risks stale data bugs | Query on demand from DbR; optimize with indexes if needed later |
| Returning string-formatted amounts (e.g. "¥12.50") | "UI-ready output" | Formatting belongs in the presentation layer; service functions should return raw int64 cents | Callers convert cents to display strings |
---
## Feature Dependencies
```
[Time-range filter]
└──requires──> [Optional *time.Time parameters]
[Multi-user aggregation]
└──requires──> [Revenue calculation]
└──requires──> [Game-pass classification]
└──requires──> [Prize cost with multiplier]
└──requires──> [Refund/void exclusion]
[Activity-dimension aggregation]
└──requires──> [Revenue calculation]
└──requires──> [Game-pass classification]
└──requires──> [Prize cost with multiplier]
└──requires──> [Refund/void exclusion]
[Per-asset-type breakdown]
└──requires──> [Canonical AssetType enum]
└──enhances──> [User-dimension aggregation]
└──enhances──> [Activity-dimension aggregation]
[Composable filter struct]
└──enhances──> [User-dimension aggregation]
└──enhances──> [Activity-dimension aggregation]
[Canonical spending classification reuse]
└──requires──> [finance.ClassifyOrderSpending (existing)]
└──prevents-conflict──> [Game-pass classification (must not re-derive)]
[Read-only DB enforcement]
└──requires──> [Caller passes *gorm.DB from DbR]
```
### Dependency Notes
- **Per-asset-type breakdown requires AssetType enum:** Without a canonical type definition, callers and implementations will use ad-hoc int/string values that drift.
- **Multi-user aggregation requires all revenue/cost sub-features:** The aggregation is just a GROUP BY wrapper around the same revenue and cost logic.
- **Canonical spending classification must reuse existing `finance.*` functions:** The existing Dashboard and the new service functions must produce identical numbers for the same data. Any divergence in classification logic breaks operator trust in the analytics.
- **Composable filter struct enhances both dimension functions:** A `ProfitLossFilter` struct with optional fields (asset types, IDs, time range) is shared between the user-dimension and activity-dimension functions — same struct, different dimension-ID field used.
---
## MVP Definition
### Launch With (v1)
The minimum that makes both service functions useful and correct.
- [x] `ProfitLossFilter` struct — optional asset types, optional user/activity IDs, optional time range
- [x] `QueryUserProfitLoss(db, filter) (ProfitLossResult, error)` — aggregates across specified user IDs
- [x] `QueryActivityProfitLoss(db, filter) (ProfitLossResult, error)` — aggregates for a single activity ID
- [x] `ProfitLossResult` struct — total revenue, total cost, profit, profit_rate, plus `[]AssetBreakdown`
- [x] Canonical `AssetType` constants: Points, Coupon, ItemCard, PhysicalGood, Fragment
- [x] Revenue calculation reusing `finance.ClassifyOrderSpending` (existing)
- [x] Prize cost calculation reusing `finance.ComputePrizeCostWithMultiplier` (existing)
- [x] Refund (status 3/4) and voided inventory exclusion
- [x] Time-range filter applied consistently to both orders and inventory tables
- [x] Unit tests covering: normal order, game-pass order, mixed, empty result, nil filter
### Add After Validation (v1.x)
- [ ] Per-asset-type breakdown populated (requires extending SQL GROUP BY or running separate legs per type)
- Trigger: ops team requests drill-down beyond total numbers
- [ ] Fragment asset type cost integration via `fragment_synthesis_logs`
- Trigger: fragment economy becomes significant in platform revenue reports
- [ ] Batch activity IDs support (`[]int64` activity IDs, not just one)
- Trigger: ops needs cross-activity comparison in a single call
### Future Consideration (v2+)
- [ ] Caching wrapper (Redis TTL-based) around the query functions
- Defer: not needed until query latency becomes user-visible (>2s)
- [ ] Incremental / time-bucketed aggregation (daily snapshots stored in a stats table)
- Defer: requires schema additions and migration planning
- [ ] Douyin (livestream) order integration into the user-dimension function
- Defer: currently only in the HTTP-layer spending leaderboard; integrating it requires joining `douyin_orders` which adds complexity and is outside the 5 declared asset types
---
## Feature Prioritization Matrix
| Feature | Operator Value | Implementation Cost | Priority |
|---------|---------------|---------------------|----------|
| Revenue calculation (reuse existing `finance.*`) | HIGH | LOW | P1 |
| Game-pass classification (reuse existing) | HIGH | LOW | P1 |
| Prize cost with multiplier (reuse existing) | HIGH | LOW | P1 |
| Refund/void exclusion | HIGH | LOW | P1 |
| Time-range filter | HIGH | LOW | P1 |
| User-dimension aggregation | HIGH | MEDIUM | P1 |
| Activity-dimension aggregation | HIGH | MEDIUM | P1 |
| `ProfitLossFilter` composable struct | HIGH | LOW | P1 |
| `ProfitLossResult` with Summary + Breakdown | HIGH | LOW | P1 |
| Canonical `AssetType` enum | MEDIUM | LOW | P1 |
| Multi-user batch ([]int64) | MEDIUM | LOW | P1 |
| Per-asset-type breakdown (5 types) | MEDIUM | HIGH | P2 |
| Fragment synthesis cost integration | LOW | MEDIUM | P2 |
| Batch activity IDs support | LOW | LOW | P2 |
| Read-only DB routing enforcement | HIGH | LOW | P1 (design constraint, not optional) |
**Priority key:**
- P1: Must have for launch — without these the functions are not useful or correct
- P2: Should have — adds analytical depth, add when P1 is proven
- P3: Nice to have — future milestone
---
## Competitor Feature Analysis
This is an internal platform analytics function, not a user-facing product.
The relevant "competition" is the existing Dashboard code that this service layer must be consistent with and eventually replace as the canonical source of truth.
| Feature | Existing Dashboard (DashboardActivityProfitLoss) | Existing Dashboard (GetUserSpendingDashboard) | New Service Functions |
|---------|-------------------------------------------------|----------------------------------------------|----------------------|
| Reusability | None — HTTP handler only | None — HTTP handler only | Core goal: callable from anywhere |
| Multi-user support | No — activity-scoped | No — single user ID only | Yes — []int64 user IDs |
| Asset-type breakdown | Implicit (physical goods via inventory) | Implicit | Explicit enum + breakdown slice |
| Time-range | Not supported | Supported | Supported (optional) |
| Spending classification | Inline SQL CASE | Inline SQL CASE | Calls `finance.ClassifyOrderSpending` |
| Douyin/livestream | Not included | Included (separate leg) | Out of scope for v1 |
| Calculation consistency | Source of truth today | Source of truth today | Must match exactly |
| Fragment asset type | Not supported | Not supported | Enum defined; cost TBD in v1.x |
---
## Sources
- Direct analysis of `/internal/service/finance/profit_metrics.go` — existing shared primitives
- Direct analysis of `/internal/api/admin/dashboard_activity.go` — activity P&L implementation
- Direct analysis of `/internal/api/admin/dashboard_spending.go` — user spending leaderboard
- Direct analysis of `/internal/api/admin/dashboard_user_spending.go` — per-user spending drill-down
- Direct analysis of GORM models: `orders`, `user_inventory`, `user_points_ledger`, `user_coupon_ledger`, `fragment_synthesis_logs`
- PROJECT.md requirements (validated requirements section)
---
*Feature research for: Bindbox Game profit/loss analytics service layer*
*Researched: 2026-03-21*


@ -0,0 +1,287 @@
# Pitfalls Research
**Domain:** Go/GORM/MySQL financial analytics — profit/loss aggregation functions
**Researched:** 2026-03-21
**Confidence:** HIGH (derived directly from existing codebase evidence + confirmed Go/MySQL behavior)
---
## Critical Pitfalls
### Pitfall 1: MySQL SUM with Division Returns Decimal, Not SIGNED Integer
**What goes wrong:**
When a `SUM()` expression includes any division operation (e.g., `SUM(amount * draw_count / total_count)`), MySQL returns the result as a `Decimal` type, not `BIGINT`. Scanning a Decimal into a Go `int64` field silently returns `0`. The dashboard code already hit this and left a comment documenting it.
Evidence from `dashboard_activity.go:174`:
```
// Note: when a MySQL SUM() expression involves division it returns a Decimal, which must be scanned into float64
```
The fix used there: scan revenue stats into `float64`, then cast to `int64` in Go.
**Why it happens:**
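MySQL promotes arithmetic involving division to Decimal to preserve fractional precision. The Go MySQL driver typically returns such a value as text with fractional digits (for example `1234.5000`), which does not parse as an `int64`. When the resulting scan error goes unchecked (see Pitfall 4), the field is simply left at its zero value.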
**How to avoid:**
Wrap any `SUM` that contains division with `CAST(... AS SIGNED)` in the SQL itself. This forces integer rounding at the database layer and lets you scan directly into `int64`. The existing cost query in `dashboard_activity.go:237` already uses this pattern:
```sql
CAST(SUM(...) AS SIGNED) as total_cost
```
Use `CAST(... AS SIGNED)` on every aggregated column that involves division. Never scan division-containing SUM results directly into `int64` without the cast.
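The parse step behind the zero value can be reproduced without a database. The sketch below assumes the aggregate arrives from the driver as text. `parseCents` is a hypothetical helper that only illustrates the string-to-integer conversion, not GORM's internal scan path.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseCents mimics converting a textual aggregate into int64 cents.
func parseCents(raw string) (int64, error) {
	return strconv.ParseInt(raw, 10, 64)
}

func main() {
	// "1234" is what an integer-valued aggregate looks like as text;
	// "1234.5000" is what a division inside SUM() can produce.
	for _, raw := range []string{"1234", "1234.5000"} {
		cents, err := parseCents(raw)
		fmt.Printf("raw=%q cents=%d err=%v\n", raw, cents, err)
	}
}
```

The second value fails to parse and leaves `cents` at zero, which is why an unchecked error looks like "no data" rather than a failure.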
**Warning signs:**
- Aggregated monetary fields come back as `0` even when data exists
- Revenue stats are non-zero but cost stats are zero (or vice versa)
- Struct fields stay at their zero values after `Scan()`
**Phase to address:** Implementation phase — apply during every query that uses proportional allocation (e.g., distributing an order's revenue across multiple activities via `draw_count / total_count`).
---
### Pitfall 2: Double-Counting Revenue When One Order Spans Multiple Activities
**What goes wrong:**
A single order can result in draw logs across multiple activities (e.g., a user plays activity A and activity B in one checkout). If you `SUM(orders.actual_amount)` grouped by activity without proportional allocation, the full order amount is counted in every activity it touches. The existing dashboard already experienced this and added two-level subquery attribution.
Evidence from `dashboard_activity.go:197-212`: the fix was to compute `draw_count per (order, activity)` and `total_count per order` in two separate subqueries, then scale the order amount by the ratio `draw_count / total_count`.
**Why it happens:**
Aggregation joins `orders` to `activity_draw_logs` which is a one-to-many relationship. Without explicit proration, the order amount fans out to every matching activity row.
**How to avoid:**
Always attribute revenue using the subquery pattern:
```sql
JOIN (
SELECT order_id, activity_id, COUNT(*) as draw_count
FROM activity_draw_logs JOIN activity_issues ON ...
GROUP BY order_id, activity_id
) as order_activity_draws ON order_activity_draws.order_id = orders.id
JOIN (
SELECT order_id, COUNT(*) as total_count
FROM activity_draw_logs GROUP BY order_id
) as order_total_draws ON order_total_draws.order_id = orders.id
```
Then multiply: `orders.actual_amount * order_activity_draws.draw_count / order_total_draws.total_count`. For the user-dimension function, this pattern still applies if a user's order touches multiple issues.
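The same proration can be worked through in Go for a single order, which is useful when building expected values for tests. This is a sketch under assumptions: the type `activityDraws` and the function `prorate` are hypothetical names, and the input rows stand in for the output of the `order_activity_draws` subquery.

```go
package main

import "fmt"

// activityDraws is one row of the per-(order, activity) draw-count subquery.
type activityDraws struct {
	ActivityID int64
	DrawCount  int64
}

// prorate splits amountCents across activities in proportion to their draw
// counts, mirroring amount * draw_count / total_count. Integer division
// truncates, so the shares may sum to slightly less than amountCents.
func prorate(amountCents int64, rows []activityDraws) map[int64]int64 {
	var total int64
	for _, r := range rows {
		total += r.DrawCount
	}
	shares := make(map[int64]int64, len(rows))
	if total == 0 {
		return shares
	}
	for _, r := range rows {
		shares[r.ActivityID] += amountCents * r.DrawCount / total
	}
	return shares
}

func main() {
	// One 1000-cent order: two draws in activity 1, one draw in activity 2.
	rows := []activityDraws{{ActivityID: 1, DrawCount: 2}, {ActivityID: 2, DrawCount: 1}}
	shares := prorate(1000, rows)
	fmt.Println(shares[1], shares[2], shares[1]+shares[2])
}
```

With truncating division the two shares here add up to one cent less than the order total, so a reconciliation test should either allow for rounding or assign the remainder to one activity explicitly.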
**Warning signs:**
- Total revenue across all activities exceeds the sum of all actual order payments
- A user's computed spending is greater than what WeChat Pay received
- Profit rates are implausibly negative across many activities
**Phase to address:** Implementation phase — design the user-dimension and activity-dimension query structure before writing SQL.
---
### Pitfall 3: Mixing Game-Pass Orders into Cash Revenue (Inconsistent Calculation Basis)
**What goes wrong:**
Game-pass orders (次卡) have `actual_amount = 0` and `source_type = 4` (or `order_no LIKE 'GP%'` or remark containing `use_game_pass`). Including them in `SUM(actual_amount + discount_amount)` makes their "revenue" appear as zero, understating total income. Including them in cost without crediting their imputed value makes every game-pass activity show a loss.
The codebase defines three detection conditions in `internal/service/finance/profit_metrics.go:IsGamePassOrder`. These must all be checked — any single condition is insufficient because historical data uses different conventions.
**Why it happens:**
Game-pass orders are structurally identical to regular orders but have zero monetary value. Treating all orders uniformly by summing `actual_amount` misses the imputed value of the game pass the user has already paid for.
**How to avoid:**
Use strict mutual exclusion in SQL:
- If game-pass order: revenue = `draw_count * activity.price_draw`, discount = 0, cash = 0
- If cash/coupon order: revenue = `actual_amount + discount_amount`, game-pass value = 0
- Use `CASE WHEN (source_type=4 OR order_no LIKE 'GP%' OR (actual_amount=0 AND remark LIKE '%use_game_pass%')) THEN ... ELSE ...` in every SUM
Never add `actual_amount + discount_amount + game_pass_value` as if they are additive columns of the same thing. They are alternative values for the same economic event.
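The mutual-exclusion rule can be expressed as a single branch, which makes it easy to unit test. The sketch below is illustrative only: the `order` and `spending` types and the functions `isGamePass` and `classify` are hypothetical and do not reproduce the signatures of `finance.IsGamePassOrder` or `finance.ClassifyOrderSpending`. The three detection conditions are copied from the `CASE` expression above.

```go
package main

import (
	"fmt"
	"strings"
)

// order holds only the columns the classification rule reads.
type order struct {
	SourceType     int
	OrderNo        string
	Remark         string
	ActualAmount   int64
	DiscountAmount int64
	DrawCount      int64
}

// spending has two buckets; at most one is non-zero for a given order.
type spending struct {
	PaidCoupon int64
	GamePass   int64
}

// isGamePass applies the three detection conditions from the CASE expression.
func isGamePass(o order) bool {
	return o.SourceType == 4 ||
		strings.HasPrefix(o.OrderNo, "GP") ||
		(o.ActualAmount == 0 && strings.Contains(o.Remark, "use_game_pass"))
}

// classify returns either imputed game-pass value or cash-plus-coupon
// revenue, never both.
func classify(o order, priceDrawCents int64) spending {
	if isGamePass(o) {
		return spending{GamePass: o.DrawCount * priceDrawCents}
	}
	return spending{PaidCoupon: o.ActualAmount + o.DiscountAmount}
}

func main() {
	pass := order{SourceType: 4, DrawCount: 3}
	cash := order{SourceType: 1, OrderNo: "A1001", ActualAmount: 900, DiscountAmount: 100, DrawCount: 2}
	fmt.Println(classify(pass, 500), classify(cash, 500))
}
```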
**Warning signs:**
- Activities with many game-pass players show profit rates near -100%
- Total platform revenue is suspiciously lower than WeChat Pay reports
- `SpendingPaidCoupon` and `SpendingGamePass` are both non-zero for the same order
**Phase to address:** Implementation phase — encode the mutual-exclusion rule in query construction helpers before writing any aggregate SQL.
---
### Pitfall 4: Silently Ignoring Scan Errors on Aggregation Queries
**What goes wrong:**
Several existing dashboard queries call `db.Table(...).Select(...).Scan(&stats)` without checking the returned error. If the query fails (schema mismatch, column rename, database failover), `stats` remains an empty slice, downstream computations produce zero results, and no error is returned to the caller. The data looks correct (all zeros) rather than erroring.
Evidence from `dashboard_activity.go:146-158`: the `drawStats` scan has no `.Error` check. The pattern appears in multiple places throughout the dashboard handlers.
**Why it happens:**
GORM's method chaining makes it easy to forget error handling. The pattern `db.Table(...).Scan(&x)` is syntactically identical whether you check `.Error` or not. In exploratory handler code that was never tested, errors were skipped for brevity.
**How to avoid:**
The new `internal/service/finance/` package must check every query error:
```go
if err := db.Table(...).Scan(&result).Error; err != nil {
return nil, fmt.Errorf("profit_loss query failed: %w", err)
}
```
Service functions should return `error` as second return value — not swallow errors internally. The existing `profit_metrics.go` pure functions have no DB access and are fine; the DB-querying functions must propagate errors.
**Warning signs:**
- Function returns zero values with no error in tests against an empty SQLite db
- Aggregation results are uniformly zero across all parameters
- Schema changes (column renames, table renames) cause silent failures
**Phase to address:** Implementation phase — establish error-check convention in the first function written; testing phase — assert non-nil error on deliberately broken queries.
---
### Pitfall 5: Counting Prizes from Refunded Orders as Cost
**What goes wrong:**
Inventory items (`user_inventory`) awarded from a subsequently refunded order should be excluded from cost. If you compute cost by summing `user_inventory.value_cents` grouped by `activity_id` without filtering on `orders.status`, you count the cost of prizes from refunded orders but don't count their revenue — making the platform appear to have given away prizes for free.
The existing code in `dashboard_activity.go:250-251` already had to special-case this:
```go
Where("(orders.status = ? OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)", 2)
```
Note the legacy data escape hatch: some old inventory rows have `order_id = 0` or NULL and cannot be filtered by order status. This must be preserved.
**Why it happens:**
`user_inventory` records are created when prizes are awarded, which happens before the refund window closes. Refunds do not delete inventory rows — they update `orders.status` to 4. Naive aggregation on `user_inventory` ignores order status entirely.
**How to avoid:**
Always join `orders` to `user_inventory` via `order_id` and include the legacy escape hatch:
```sql
LEFT JOIN orders ON orders.id = user_inventory.order_id
WHERE (orders.status = 2 OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)
AND COALESCE(user_inventory.remark, '') NOT LIKE '%void%'
```
The `void` remark filter is also required — manually voided inventory entries should never count as platform cost.
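For building test fixtures, the same inclusion rule can be mirrored as a Go predicate. This is a sketch with hypothetical names (`inventoryRow`, `countsTowardCost`, `totalCost`); it models a NULL or zero `order_id` as `OrderID == 0` and assumes, as the SQL above does, that status 2 is the only order status whose prizes count.

```go
package main

import (
	"fmt"
	"strings"
)

// inventoryRow is an inventory record joined with its order's status.
type inventoryRow struct {
	OrderID     int64 // 0 models legacy rows with order_id = 0 or NULL
	OrderStatus int   // status of the joined order; ignored when OrderID is 0
	Remark      string
	ValueCents  int64
}

const orderStatusPaid = 2

// countsTowardCost mirrors the WHERE clause: voided rows never count;
// otherwise legacy rows and rows from paid orders do.
func countsTowardCost(r inventoryRow) bool {
	if strings.Contains(r.Remark, "void") {
		return false
	}
	return r.OrderID == 0 || r.OrderStatus == orderStatusPaid
}

// totalCost sums the value of every row that passes the predicate.
func totalCost(rows []inventoryRow) int64 {
	var sum int64
	for _, r := range rows {
		if countsTowardCost(r) {
			sum += r.ValueCents
		}
	}
	return sum
}

func main() {
	rows := []inventoryRow{
		{OrderID: 1, OrderStatus: 2, ValueCents: 300},                 // paid
		{OrderID: 2, OrderStatus: 4, ValueCents: 500},                 // refunded
		{OrderID: 0, ValueCents: 200},                                 // legacy
		{OrderID: 3, OrderStatus: 2, Remark: "void", ValueCents: 100}, // voided
	}
	fmt.Println(totalCost(rows))
}
```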
**Warning signs:**
- Platform cost is higher than expected for activities with known refund activity
- Cost-side totals don't reconcile with accounting system data
- Test cases with a refunded order still show non-zero cost
**Phase to address:** Implementation phase — add a test case with a refunded order and verify cost = 0 for that order's prizes.
---
### Pitfall 6: Using Write DB (DbW) for Analytics Queries
**What goes wrong:**
The project has master-slave read-write splitting. Analytics queries that run on `GetDbW()` (master) instead of `GetDbR()` (replica) add latency to the write path, can block replication, and in the worst case cause master overload under concurrent analytics requests.
The CONCERNS.md already flags 113 direct `GetDbW()` calls in the handler layer. The pattern of bypassing the correct DB connection is established in the codebase and can propagate to new code.
**Why it happens:**
`GetDbR()` and `GetDbW()` look identical in usage. Developers copying from handler code that was written for writes will use `GetDbW()` by accident. The finance service package does not yet have established conventions.
**How to avoid:**
The new `internal/service/finance/` service must accept a `*gorm.DB` read-only handle at construction time (inject `repo.GetDbR()`), not a full repository. Document in the function signatures or struct fields that only the read replica is used:
```go
type ProfitLossService struct {
dbR *gorm.DB // read replica only — never use for writes
logger *zap.Logger
}
```
Never call `repo.GetDbW()` inside finance analytics functions.
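A guard against `GetDbW()` creeping into the finance package can be as simple as a text search. In CI this would usually be a plain `grep`; the sketch below does the same check in Go over an in-memory map so that it stays self-contained. The function `filesUsingWriteDB` and the file name `profit_loss.go` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// filesUsingWriteDB returns the sorted names of source files under dirPrefix
// whose content mentions GetDbW.
func filesUsingWriteDB(sources map[string]string, dirPrefix string) []string {
	var offenders []string
	for name, content := range sources {
		if strings.HasPrefix(name, dirPrefix) && strings.Contains(content, "GetDbW") {
			offenders = append(offenders, name)
		}
	}
	sort.Strings(offenders)
	return offenders
}

func main() {
	sources := map[string]string{
		// Hypothetical new file that wrongly grabs the write handle.
		"internal/service/finance/profit_loss.go": "db := repo.GetDbW()",
		// Handler-layer usage is outside the checked directory.
		"internal/api/admin/dashboard_activity.go": "db := h.repo.GetDbW()",
	}
	fmt.Println(filesUsingWriteDB(sources, "internal/service/finance/"))
}
```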
**Warning signs:**
- MySQL master replication lag increases when analytics endpoint is called
- Write latency spikes during dashboard loads
- `GetDbW()` appears in `internal/service/finance/` source files
**Phase to address:** Implementation phase — inject read-only DB handle in constructor; testing phase — verify with a mock that only the read DB is called.
---
## Technical Debt Patterns
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|----------|-------------------|----------------|-----------------|
| Scan revenue into `float64` instead of fixing SQL with `CAST(AS SIGNED)` | Avoids SQL rewrite | Floating-point rounding on monetary values (e.g., 0.1 + 0.2 ≠ 0.3 in IEEE 754) | Never for monetary fields — always use `CAST(AS SIGNED)` |
| In-memory sort + full table fetch for custom sort order | Simpler than `ORDER BY` with computed columns | Loads unbounded rows into Go heap when activity count grows | Only acceptable if total row count is bounded by pagination elsewhere |
| Hardcoding game-pass detection conditions in each query | Avoids abstraction overhead | Three different detection conditions must stay in sync across multiple queries | Never — centralize detection in `IsGamePassOrder()` already defined in `finance` package |
| Skip error check on `Scan()` | Fewer lines of code | Silent wrong data; impossible to distinguish "query returned zero rows" from "query failed" | Never for financial data |
| Use `AVG(multiplier)` across draws as the cost multiplier | One query instead of per-row | Hides per-order multiplier variance; a 2x card on one draw inflates cost for all draws in the group | Acceptable for summary statistics; not for per-order breakdowns |
---
## Integration Gotchas
| Integration | Common Mistake | Correct Approach |
|-------------|----------------|-----------------|
| GORM `Scan` into anonymous struct | Forgetting to qualify column names in SELECT causes ambiguous column error when multiple tables have `id`, `created_at`, etc. | Always alias computed columns explicitly: `SELECT orders.user_id as user_id`, not `SELECT user_id` |
| GORM raw SQL with `Raw()` + `Scan()` | Parameterized values passed in wrong order cause SQL to silently use zero values | Verify query with `db.Statement.SQL.String()` during development; test with non-trivial input values |
| MySQL `COALESCE` with nullable int columns | `COALESCE(column, 0)` only replaces NULL; a column that stores `0` to mean "not set" still yields `0`, so later fallbacks are never reached. `NULLIF` is needed to distinguish "not set" from "explicitly zero" | Use `COALESCE(NULLIF(value_cents, 0), fallback_1, fallback_2, 0)` pattern already established in existing cost queries |
| Multiple ID lists in `WHERE IN (?)` with GORM | Passing an empty slice `[]int64{}` produces invalid SQL `WHERE id IN ()` in some GORM versions | Because an empty ID list means "no filter" for these functions, add the `IN` clause only when `len(ids) > 0`; never pass an empty slice to `IN (?)` |
| Read replica lag | Querying replica immediately after a write (e.g., after seeding test data) can return stale results | In tests, use write DB handle or wait for sync; in production, this is acceptable for analytics |
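
Since this document treats an empty user-ID list as "all users", the `IN` clause is best added conditionally. The sketch below shows one way to do that with plain strings. `userFilter` is a hypothetical helper, and the column name `orders.user_id` is taken from the aliasing example in the table above.

```go
package main

import (
	"fmt"
	"strings"
)

// userFilter returns a WHERE fragment and its bind arguments. An empty ID
// list yields no fragment at all, so the query aggregates over every user.
func userFilter(ids []int64) (string, []any) {
	if len(ids) == 0 {
		return "", nil
	}
	// One placeholder per ID keeps the query fully parameterized.
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(ids)), ",")
	args := make([]any, len(ids))
	for i, id := range ids {
		args[i] = id
	}
	return "orders.user_id IN (" + placeholders + ")", args
}

func main() {
	clause, args := userFilter([]int64{7, 9})
	fmt.Println(clause, args)

	clause, args = userFilter(nil)
	fmt.Printf("%q %v\n", clause, args)
}
```
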
---
## Performance Traps
| Trap | Symptoms | Prevention | When It Breaks |
|------|----------|------------|----------------|
| Fetching all activities before computing profit/loss (no predicate pushdown) | 100% CPU on `Find(&activities)`, slow response time | Apply all filters (status, name, date range) in the initial `query` before scanning, then pass `activityIDs` to subsequent queries | When activity count exceeds ~1,000 |
| Correlated subquery inside SUM for every row | Query time grows O(n²) with draw log volume | Pre-aggregate into a derived table subquery joined once, not per-row | When draw_logs table exceeds ~500K rows |
| No index on `activity_draw_logs.order_id` or `user_inventory.activity_id` | Full table scan on every analytics query | Verify indexes exist with `SHOW INDEX FROM activity_draw_logs`; add composite index `(issue_id, order_id)` if missing | From day one on tables with writes |
| Loading all activities into memory for in-application sort | Memory spike on large result sets; no benefit if caller only wants top-10 | Accept this tradeoff only when total activities < 500; add a hard cap with an error if exceeded | When activity count exceeds ~500 |
| Querying `user_inventory` without `status IN (1, 3)` filter | Voided/cancelled inventory items inflate cost | Always filter: `WHERE user_inventory.status IN (1, 3)` | Immediately — even small void counts distort cost |
---
## Security Mistakes
| Mistake | Risk | Prevention |
|---------|------|------------|
| Interpolating user-supplied `user_id` or `activity_id` into raw SQL string instead of parameterized query | SQL injection — attacker can exfiltrate all financial data | Always use parameterized queries: `.Where("user_id IN ?", ids)` not `fmt.Sprintf("user_id IN (%s)", idsStr)` |
| Exposing raw profit/loss data without admin role check | Non-admin users can read platform margin data | The new service functions are Service layer — callers (API handlers) must apply `RequireAdminRole()` middleware; document this requirement in the function's GoDoc |
| Logging query parameters that contain user IDs | User ID lists in error logs can be correlated with financial data | Log query failure with a count, not the full ID list: `"profit_loss query failed for %d users: %v"` |
---
## "Looks Done But Isn't" Checklist
- [ ] **Game-pass mutual exclusion:** Verify that `SpendingPaidCoupon` and `SpendingGamePass` are never both non-zero for the same order. Write a test case with a mixed-type order set.
- [ ] **Refunded order exclusion:** Add a test case where an order is refunded (status=4) and verify it contributes zero to both revenue and cost.
- [ ] **Legacy zero order_id:** Confirm inventory rows with `order_id = 0` are included in cost (not excluded by the orders JOIN). Add a test row with `order_id = 0` and verify it appears in cost.
- [ ] **Empty parameter handling:** Call both functions with nil/empty `userIDs` and nil/empty `activityID` — verify they return all-data aggregation, not empty results or SQL errors.
- [ ] **All five asset types covered:** Points, coupons, item cards, physical products, fragments. Verify all five appear in the breakdown output. Missing one silently understates cost.
- [ ] **CAST on division SUM:** Open every query with a `/` operator in a SUM and confirm `CAST(... AS SIGNED)` wraps the entire expression.
- [ ] **Read-only DB used:** Grep for `GetDbW` inside `internal/service/finance/` — result must be empty.
- [ ] **Error propagation:** Every `Scan()` call inside finance functions must have its `.Error` checked and returned to the caller.
---
## Recovery Strategies
| Pitfall | Recovery Cost | Recovery Steps |
|---------|---------------|----------------|
| Decimal-to-int64 silent zero | LOW | Add `CAST(AS SIGNED)` to affected SQL; rerun query — no data migration needed |
| Revenue double-counting discovered post-launch | MEDIUM | Backfill correct totals by recomputing with fixed query over historical data; notify operators of corrected figures |
| Wrong DB handle (write instead of read) | LOW | Change constructor injection; no data impact |
| Missing refund exclusion | MEDIUM | Recompute affected period's profit/loss with corrected query; mark old reports as superseded |
| Silently swallowed errors causing wrong zeros | LOW-MEDIUM | Add error checks; add alerting on zero-result aggregations where data is expected; audit logs for the affected period |
---
## Pitfall-to-Phase Mapping
| Pitfall | Prevention Phase | Verification |
|---------|-----------------|--------------|
| Decimal/int64 scan mismatch | Implementation — SQL design | Integration test: query with division-containing SUM, assert non-zero int64 result |
| Revenue double-counting | Implementation — query structure design | Test: one order across two activities; assert sum of per-activity revenue equals order total |
| Game-pass mutual exclusion | Implementation — use `IsGamePassOrder()` helper | Unit test: game-pass order contributes to `SpendingGamePass` only, not `SpendingPaidCoupon` |
| Ignored Scan errors | Implementation — code review gate | Test: deliberately broken query (wrong table name); assert returned error is non-nil |
| Refunded order in cost | Implementation — WHERE clause | Test: refunded order inventory; assert cost contribution is zero |
| Write DB used | Implementation — constructor injection | Grep check in CI: `GetDbW` must not appear in `internal/service/finance/` |
| Missing LIMIT on supporting queries | Implementation — query design | Load test with 1000 activities; verify response time stays under 2s |
---
## Sources
- `internal/api/admin/dashboard_activity.go` — direct evidence of BUG FIX comments for Decimal/int64, double-counting, game-pass misclassification (lines 173-175, 274-275, 544-545)
- `internal/api/admin/dashboard_spending.go` — evidence of multi-join aggregation patterns and game-pass CASE expressions
- `internal/service/finance/profit_metrics.go` — `IsGamePassOrder()` three-condition detection; `ComputeProfit()` integer arithmetic; established pattern for cost multiplier
- `internal/service/finance/profit_metrics_test.go` — existing test coverage confirming pure-function behavior
- `.planning/codebase/CONCERNS.md` — flagged 113 `GetDbW()` calls in handler layer, silently swallowed errors in financial paths, missing error checks in `pay_refund_admin.go`
- Go `database/sql` specification — `Scan()` does not coerce types; MySQL 8.x documentation — SUM with division promotes to Decimal
---
*Pitfalls research for: Go/GORM/MySQL profit/loss analytics (Bindbox Game)*
*Researched: 2026-03-21*

354
.planning/research/STACK.md Normal file

@ -0,0 +1,354 @@
# Technology Stack
**Project:** Bindbox Game — Profit/Loss Analytics Functions
**Researched:** 2026-03-21
**Scope:** Service-layer multi-dimensional financial aggregation in an existing Go 1.24 / GORM 1.25 / MySQL project
---
## Existing Stack (Confirmed from Codebase)
The following are already in use and must not be replaced or duplicated.
| Layer | Technology | Version | Notes |
|-------|-----------|---------|-------|
| Language | Go | 1.24.0 | toolchain go1.24.2 |
| ORM | gorm.io/gorm | 1.25.9 | with `gorm.io/gen v0.3.26` |
| Database | MySQL | 8.x (inferred) | read/write split via `gorm.io/plugin/dbresolver` |
| DB driver | github.com/go-sql-driver/mysql | 1.7.1 | |
| Logger | go.uber.org/zap (wrapped) | 1.26.0 | project custom `logger.CustomLogger` interface |
| Test DB | gorm.io/driver/sqlite | 1.4.3 | in-memory SQLite via `NewSQLiteRepoForTest()` |
| Test assertions | github.com/stretchr/testify | 1.11.1 | |
| SQL mock | github.com/DATA-DOG/go-sqlmock | 1.5.2 | |
No new runtime dependencies are required for this milestone.
---
## Recommended Patterns for Analytics Functions
### 1. Query Execution: `db.Raw()` + Named Scan Struct for Complex Aggregations
**Confidence: HIGH** (verified from existing codebase usage in `dashboard_activity.go`, `dashboard_spending.go`)
The project already uses two GORM query styles:
**Style A — GORM builder with `.Select()` + `.Scan()`** (for joins + GROUP BY with multiple aggregated columns):
```go
type revenueRow struct {
	DimensionID  int64
	TotalRevenue int64 // fen; monetary values stay int64
	TotalCost    int64 // fen
}

var rows []revenueRow
err := db.Table(model.TableNameOrders).
	Select(`
		orders.user_id as dimension_id,
		SUM(...) as total_revenue,
		SUM(...) as total_cost
	`).
	Joins("LEFT JOIN ...").
	Where("orders.status = ?", 2).
	Group("orders.user_id").
	Scan(&rows).Error
if err != nil {
	return nil, err // never swallow scan errors in financial paths
}
```
Use this style when:
- Grouping by a single dimension (user_id, activity_id)
- The aggregation fits in one SQL pass
- The query does not require correlated subqueries that GORM cannot model
**Style B — `db.Raw()` + `.Scan()`** (for queries with inline derived tables / CTEs):
```go
err := db.Raw(`
	SELECT user_id, SUM(revenue) as total_revenue
	FROM (
		SELECT user_id, actual_amount + discount_amount as revenue
		FROM orders WHERE status = 2
	) t
	WHERE user_id IN (?)
	GROUP BY user_id
`, userIDs).Scan(&rows).Error
if err != nil {
	return nil, err
}
```
Use this style when:
- The query has more than two levels of subqueries
- GORM's builder would produce ambiguous `deleted_at` injection (known GORM pitfall, already documented in `dashboard_activity.go` comments)
- Conditional aggregation across multiple joins is complex enough to be unmaintainable in builder form
**Recommendation:** Use Style A (builder) as the default. Drop to Style B only when builder clarity degrades — which happens when the query has more than 2 subquery levels.
---
### 2. Service Constructor Pattern
**Confidence: HIGH** (established project pattern)
All services follow this exact signature:
```go
package finance

import (
	"context"

	"bindbox-game/internal/pkg/logger"
	"bindbox-game/internal/repository/mysql"
	"bindbox-game/internal/repository/mysql/dao"
)

type Service interface {
	QueryUserProfitLoss(ctx context.Context, params UserProfitLossParams) (*ProfitLossResult, error)
	QueryActivityProfitLoss(ctx context.Context, params ActivityProfitLossParams) (*ProfitLossResult, error)
}

type service struct {
	logger logger.CustomLogger
	readDB *dao.Query // analytics always use read replica
	repo   mysql.Repo // for direct *gorm.DB access when needed
}

func New(l logger.CustomLogger, db mysql.Repo) Service {
	return &service{
		logger: l,
		readDB: dao.Use(db.GetDbR()),
		repo:   db,
	}
}
```
Always use `db.GetDbR()` for analytics (read replica). Never use `GetDbW()` in the new `finance` service. The `writeDB` field should not exist in this service.
---
### 3. Multi-Dimensional Aggregation: Fan-Out + In-Memory Merge
**Confidence: HIGH** (established by `DashboardPlayerSpendingLeaderboard` and `DashboardActivityProfitLoss`)
The codebase consistently uses this pattern for analytics requiring data from multiple tables:
1. Fetch the primary dimension IDs in one query (user IDs or activity IDs)
2. Execute N independent scan queries, one per data source (orders, inventory, draw_logs, etc.)
3. Accumulate results into a `map[int64]*ResultItem`
4. Apply in-memory business logic (e.g. `ComputeProfit`, `ClassifyOrderSpending`)
5. Return the merged result
```go
// Step 1: collect IDs
dimensionIDs := []int64{...}

// Step 2: fan out, one scan per logical data source; check every Scan error
var revRows []revRow
if err := db.Table(...).Select(...).Where("user_id IN ?", dimensionIDs).Group("user_id").Scan(&revRows).Error; err != nil {
	return nil, err
}
var costRows []costRow
if err := db.Table(...).Select(...).Where("user_id IN ?", dimensionIDs).Group("user_id").Scan(&costRows).Error; err != nil {
	return nil, err
}

// Step 3: merge into map. Entries must be allocated first; writing through
// resultMap[id] on a missing key would dereference a nil pointer.
resultMap := make(map[int64]*ProfitLossResult, len(dimensionIDs))
for _, id := range dimensionIDs {
	resultMap[id] = &ProfitLossResult{}
}
for _, r := range revRows {
	resultMap[r.UserID].TotalRevenue = r.Total
}
for _, c := range costRows {
	resultMap[c.UserID].TotalCost = c.Total
}

// Step 4: apply finance functions
for _, item := range resultMap {
	item.TotalProfit, item.ProfitRate = financesvc.ComputeProfit(item.TotalRevenue, item.TotalCost)
}
```
**Why this approach over a single mega-JOIN:**
- Avoids Cartesian products when joining tables with 1-to-many relationships (draw_logs × inventory × orders)
- Individual queries are independently cacheable in future
- Easier to test each data segment in isolation
- Avoids MySQL's `GROUP BY` optimizer struggling with multi-table fan-out
---
### 4. Optional Parameter Pattern with Struct
**Confidence: HIGH** (aligns with project idiom and Go best practices for analytics functions)
The new functions must accept all-optional parameters (no asset type = all types, no IDs = all records, no time range = all time). Use a plain struct — not variadic options or functional options — consistent with how the project already expresses request inputs:
```go
// AssetType constants, defined in the finance package
type AssetType int

const (
	AssetTypeAll      AssetType = 0 // zero value = "all types"
	AssetTypePoints   AssetType = 1
	AssetTypeCoupon   AssetType = 2
	AssetTypeItemCard AssetType = 3
	AssetTypeProduct  AssetType = 4
	AssetTypeFragment AssetType = 5
)

type UserProfitLossParams struct {
	AssetTypes []AssetType // empty = all types
	UserIDs    []int64     // empty = all users
	StartTime  *time.Time  // nil = no lower bound
	EndTime    *time.Time  // nil = no upper bound
}

type ActivityProfitLossParams struct {
	AssetTypes  []AssetType // empty = all types
	ActivityIDs []int64     // empty = all activities
	StartTime   *time.Time
	EndTime     *time.Time
}
```
Do NOT use `time.Time` zero values as sentinels — pointer semantics make optionality explicit and avoid the zero-time edge case in GORM queries.
---
### 5. Result Type Design
**Confidence: HIGH** (matches the finance domain model already in place)
```go
// ProfitLossBreakdown is one asset-type slice within the result.
type ProfitLossBreakdown struct {
	AssetType AssetType `json:"asset_type"`
	Revenue   int64     `json:"revenue"` // platform income (fen)
	Cost      int64     `json:"cost"`    // prize cost (fen)
	Profit    int64     `json:"profit"`  // revenue - cost (fen)
}

// ProfitLossResult is returned by both dimension functions.
type ProfitLossResult struct {
	TotalRevenue int64                 `json:"total_revenue"`
	TotalCost    int64                 `json:"total_cost"`
	TotalProfit  int64                 `json:"total_profit"`
	ProfitRate   float64               `json:"profit_rate"`
	Breakdown    []ProfitLossBreakdown `json:"breakdown"`
}
```
Keep all monetary values as `int64` fen (1/100 RMB), consistent with the entire codebase. Never use `float64` for monetary storage — only for profit rate display.
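
As an illustration of the int64-fen rule, the sketch below sums breakdown entries into totals and converts to `float64` only for the rate. The types are trimmed copies for illustration, `summarize` is a hypothetical helper, and the rate definition (profit divided by revenue, 0 when revenue is 0) is an assumption; the existing `finance.ComputeProfit` is the authority and may define it differently.

```go
package main

import "fmt"

// breakdown is a trimmed, illustrative version of ProfitLossBreakdown.
type breakdown struct {
	Revenue int64 // fen
	Cost    int64 // fen
}

// result is a trimmed, illustrative version of ProfitLossResult.
type result struct {
	TotalRevenue int64
	TotalCost    int64
	TotalProfit  int64
	ProfitRate   float64
}

// summarize adds up breakdown entries using integer arithmetic only.
// The float conversion happens once, for the display-only rate.
func summarize(items []breakdown) result {
	var r result
	for _, b := range items {
		r.TotalRevenue += b.Revenue
		r.TotalCost += b.Cost
	}
	r.TotalProfit = r.TotalRevenue - r.TotalCost
	if r.TotalRevenue > 0 { // assumed convention: rate is 0 when there is no revenue
		r.ProfitRate = float64(r.TotalProfit) / float64(r.TotalRevenue)
	}
	return r
}

func main() {
	r := summarize([]breakdown{
		{Revenue: 10000, Cost: 6000}, // 100.00 RMB in, 60.00 RMB out
		{Revenue: 5000, Cost: 1500},
	})
	fmt.Printf("%+v\n", r)
}
```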
---
### 6. Existing Finance Utilities (Reuse, Do Not Reimplement)
**Confidence: HIGH** (verified in `internal/service/finance/profit_metrics.go`)
These functions are already tested and must be reused in the new service:
| Function | Purpose |
|----------|---------|
| `ClassifyOrderSpending(sourceType, orderNo, actualAmount, discountAmount, remark, gamePassValue)` | Classifies order as game-pass or paid-coupon and returns `SpendingBreakdown` |
| `IsGamePassOrder(sourceType, orderNo, actualAmount, remark)` | Boolean test for game pass order |
| `ComputeGamePassValue(drawCount, activityPrice)` | Calculates game pass monetary value |
| `ComputePrizeCostWithMultiplier(baseCost, multiplierX1000)` | Applies item card multiplier to base cost |
| `ComputeProfit(spending, prizeCost)` | Returns `(profit int64, profitRate float64)` |
| `NormalizeMultiplierX1000(multiplierX1000)` | Clamps multiplier to minimum 1000 |
Where possible, the new aggregation functions should call these per row in Go while processing scan results, rather than re-expressing the same logic inside SQL expressions.
---
### 7. SQL Aggregation Best Practices for This Codebase
**Confidence: HIGH** (derived from existing queries and MySQL behavior)
**Use `CAST(... AS SIGNED)` for SUM over expressions involving division:**
MySQL returns `DECIMAL` for `SUM(x / y)` even when inputs are `BIGINT`. This causes GORM scan failures into `int64`. The existing code already uses `CAST(SUM(...) AS SIGNED)`.
**Use `COALESCE(NULLIF(col, 0), fallback1, fallback2, 0)` for value resolution:**
The price priority chain for inventory items is established in the codebase:
```sql
COALESCE(NULLIF(user_inventory.value_cents, 0),
activity_reward_settings.price_snapshot_cents,
products.price,
0)
```
Always use this chain when resolving item cost — do not use `products.price` alone as it may be stale.
**Use `GREATEST(COALESCE(multiplier, 1000), 1000)` for multiplier safety:**
Clamps NULL, zero, negative, or any sub-1000 multiplier to 1000 (1.0x), so a bad multiplier can never shrink or zero out the prize cost.
**Add `deleted_at IS NULL` explicitly in raw subqueries:**
When writing raw subqueries inside `.Joins()`, explicitly add `deleted_at IS NULL` conditions. GORM does NOT auto-inject soft-delete conditions inside string literals passed to `.Joins()`. This is a known pitfall documented in the existing code comments.
**Time range filtering — use explicit column prefix:**
```go
if params.StartTime != nil {
	db = db.Where("orders.created_at >= ?", *params.StartTime)
}
if params.EndTime != nil {
	db = db.Where("orders.created_at <= ?", *params.EndTime)
}
```
Always prefix column names with table names in multi-join queries to prevent `ambiguous column` errors.
---
### 8. Testing Pattern
**Confidence: HIGH** (established pattern in `profit_metrics_test.go` and `testrepo_sqlite.go`)
**Unit tests for pure finance logic** (no DB): test all functions in `profit_metrics.go` and the new calculation logic directly. These should cover boundary cases (zero revenue, zero cost, all-optional params, single asset type).
**Integration tests for scan functions**: use `NewSQLiteRepoForTest()` to create an in-memory SQLite DB. Note the limitations:
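- `CAST(... AS SIGNED)` does not behave as in MySQL: SQLite treats the unrecognized type name `SIGNED` as NUMERIC affinity, so it does not force an integer result. Use `CAST(... AS INTEGER)` in test-only helper SQL, or restructure the scan to accept `float64` and convert in Go
- `LIKE 'GP%'` can match differently: SQLite's `LIKE` is case-insensitive for ASCII by default, while MySQL's behavior depends on the column collation. Keep game-pass detection in Go-layer logic where possible, not in SQL CASE expressions during testing
- The `GREATEST()` MySQL function is not available in SQLite (multi-argument `MAX()` is the nearest equivalent); abstract multiplier logic into Go helpers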
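
One way to keep the multiplier clamp portable between MySQL and SQLite is to scan the raw base cost and multiplier and apply the clamp in Go. The sketch below uses hypothetical names (`normalizeX1000`, `costWithMultiplier`); the real `NormalizeMultiplierX1000` and `ComputePrizeCostWithMultiplier` in `profit_metrics.go` should be called instead, and their rounding behavior may differ from the truncating integer division assumed here.

```go
package main

import "fmt"

// normalizeX1000 mirrors GREATEST(COALESCE(multiplier, 1000), 1000):
// anything below 1000 (including 0 from a NULL scan) becomes 1000, i.e. 1.0x.
func normalizeX1000(multiplierX1000 int64) int64 {
	if multiplierX1000 < 1000 {
		return 1000
	}
	return multiplierX1000
}

// costWithMultiplier scales a base cost in fen by a x1000 fixed-point multiplier.
// Integer division truncates toward zero (an assumption of this sketch).
func costWithMultiplier(baseCost, multiplierX1000 int64) int64 {
	return baseCost * normalizeX1000(multiplierX1000) / 1000
}

func main() {
	fmt.Println(costWithMultiplier(500, 2000)) // doubled by a 2.0x item card
	fmt.Println(costWithMultiplier(500, 0))    // missing multiplier falls back to 1.0x
	fmt.Println(costWithMultiplier(333, 1500)) // 1.5x with truncation
}
```

Because the clamp lives in Go, the same test runs unchanged against the SQLite harness and the MySQL replica.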
Recommended test structure for the new service:
```
internal/service/finance/
├── profit_metrics.go (existing — pure business logic, no DB)
├── profit_metrics_test.go (existing — pure unit tests)
├── service.go (NEW — Service interface + constructor)
├── params.go (NEW — param structs, AssetType constants, result types)
├── query_user.go (NEW — UserProfitLoss scan logic)
├── query_activity.go (NEW — ActivityProfitLoss scan logic)
└── service_test.go (NEW — integration tests using NewSQLiteRepoForTest())
```
Keep each query file under 300 lines. If `query_user.go` grows beyond that, split by data source (e.g. `query_user_revenue.go`, `query_user_cost.go`).
---
## What NOT to Do
| Anti-Pattern | Why | What to Do Instead |
|-------------|-----|-------------------|
| Single mega-JOIN across orders + inventory + draw_logs + products | Produces Cartesian products; MySQL optimizer struggles; query becomes unmaintainable | Fan-out into separate `.Scan()` calls per data source, merge in Go |
| `float64` for monetary storage in result structs | Precision loss at large values; inconsistent with codebase | Use `int64` (fen); only use `float64` for display-only fields like `profit_rate` |
| Using GORM GEN query builder for complex aggregations | GEN is designed for CRUD; `.Select()` + `.Group()` via GEN is awkward for multi-table GROUP BY with conditional SUM | Use `db.GetDbR().Table(...).Select(raw).Joins(...).Scan()` directly |
| Returning raw `*gorm.DB` from the service layer | Leaks ORM dependency upward; breaks testability | Return typed result structs |
| Putting business logic (e.g. game-pass classification) inside SQL CASE expressions | Hard to test; differs between MySQL and SQLite; duplicates logic from `finance` package | Compute classification in Go after scanning raw amounts |
| Accepting `time.Time{}` zero value to mean "no filter" | Zero time is a valid timestamp; causes subtle bugs | Use `*time.Time`; nil means "no filter" |
| Writing analytics queries to the write DB | Unnecessary load on master; read replica exists exactly for this purpose | Always use `repo.GetDbR()` in analytics service |
| Reusing the existing dashboard handler logic directly | Dashboard logic is tightly coupled to HTTP handler, specific response shape, and pagination | Implement fresh service-layer functions with clean params/result types |
---
## Dependency Additions
None required. All necessary libraries are already in `go.mod`.
---
## Sources
- Codebase analysis: `internal/service/finance/profit_metrics.go` (existing finance utilities)
- Codebase analysis: `internal/api/admin/dashboard_activity.go` (activity-dimension aggregation pattern)
- Codebase analysis: `internal/api/admin/dashboard_spending.go` (user-dimension aggregation pattern)
- Codebase analysis: `internal/repository/mysql/mysql.go` (Repo interface, DbR/DbW split)
- Codebase analysis: `internal/repository/mysql/testrepo_sqlite.go` (test DB pattern)
- Codebase analysis: `internal/service/user/user.go` (Service interface + constructor pattern)
- GORM v1.25 docs: soft-delete not injected into raw JOIN strings — HIGH confidence (matches existing code comments)
- MySQL docs: `SUM()` returns DECIMAL when expression involves division — HIGH confidence (matches `CAST(... AS SIGNED)` usage in codebase)
---
*Stack analysis: 2026-03-21*


@ -0,0 +1,191 @@
# Project Research Summary
**Project:** Bindbox Game — Profit/Loss Analytics Service Layer
**Domain:** Go/GORM/MySQL financial aggregation functions for a game/e-commerce platform
**Researched:** 2026-03-21
**Confidence:** HIGH
## Executive Summary
This milestone implements two reusable service-layer functions, `QueryUserProfitLoss` and `QueryActivityProfitLoss`, that aggregate platform-perspective profit and loss data across user and activity dimensions. The domain is internal financial analytics on an existing Go 1.24 / GORM 1.25 / MySQL 8.x stack. No new runtime dependencies are required. All necessary shared logic (game-pass classification, prize cost with multiplier, profit computation) already exists in `internal/service/finance/profit_metrics.go` and must be reused without reimplementation. The key architectural decision is a dedicated `internal/service/finance/` package with a clean `Service` interface, a constructor that takes `mysql.Repo` but reads only through `GetDbR()` (read replica), and typed input/output structs, rather than an extension of existing HTTP handler logic.
The recommended query approach is the fan-out + in-memory merge pattern already established in the codebase: issue separate, independently scoped `db.Table(...).Scan()` calls per data source (orders, inventory, draw logs, ledger), then merge results in Go using a `map[int64]*ProfitLossResult`. This avoids Cartesian products from multi-table JOINs, keeps individual queries testable in isolation, and remains compatible with the SQLite test harness. Raw SQL (`db.Raw()`) should be used only when the query requires more than two levels of subqueries, consistent with the existing codebase convention.
The highest-severity risks are revenue double-counting (when one order spans multiple activities), silent scan errors (a `Scan()` whose `.Error` is never checked makes a failed query indistinguishable from genuine all-zero results), and misclassifying game-pass orders as zero-revenue orders. All three have prior-art evidence in the existing dashboard code, with fix patterns already established. The new service layer must enforce: `CAST(... AS SIGNED)` on any `SUM` containing division, strict mutual exclusion between game-pass and cash revenue paths, refunded-order exclusion from both revenue and cost, and error propagation from every `Scan()` call.
---
## Key Findings
### Recommended Stack
The existing stack is fully sufficient. Go 1.24, GORM 1.25 with `gorm.io/gen v0.3.26`, MySQL 8.x with read/write split via `gorm.io/plugin/dbresolver`, `go.uber.org/zap` (wrapped as `logger.CustomLogger`), `testify` for assertions, and in-memory SQLite via `NewSQLiteRepoForTest()` for integration tests. Adding no new dependencies reduces risk and keeps the codebase coherent.
**Core technologies:**
- **Go 1.24**: Primary language — existing toolchain, no change
- **GORM 1.25**: ORM — use `db.Table().Select().Scan()` (Style A) for single-dimension GROUP BY; `db.Raw().Scan()` (Style B) for multi-level subqueries
- **MySQL 8.x (DbR)**: Read replica — all analytics queries must route here via `repo.GetDbR()`
- **`logger.CustomLogger`**: Project-standard logger — inject at constructor, not package-level
- **SQLite (test only)**: In-memory test DB — `NewSQLiteRepoForTest()`; note SQLite does not support `CAST(AS SIGNED)` or `GREATEST()` — abstract these into Go helpers
- **Existing `finance.*` utilities**: `ClassifyOrderSpending`, `IsGamePassOrder`, `ComputeGamePassValue`, `ComputePrizeCostWithMultiplier`, `ComputeProfit`, `NormalizeMultiplierX1000` — all must be called, never re-derived
### Expected Features
**Must have (table stakes — P1):**
- Revenue calculation: `actual_amount + discount_amount` (coupon discount adds back real value) with strict refund/void exclusion
- Game-pass order classification via `finance.IsGamePassOrder` — three-condition detection, mutual exclusion from cash revenue
- Game-pass value derivation: `draw_count × activity_price` via `finance.ComputeGamePassValue`
- Prize cost with item-card multiplier via `finance.ComputePrizeCostWithMultiplier`
- Profit calculation via `finance.ComputeProfit` returning `(int64, float64)`
- Time-range filter: `*time.Time` start/end, nil means no bound — never use zero-value sentinel
- User-dimension aggregation: `QueryUserProfitLoss(ctx, UserProfitLossParams)` accepting `[]int64` user IDs (empty = all users)
- Activity-dimension aggregation: `QueryActivityProfitLoss(ctx, ActivityProfitLossParams)` accepting one activity ID in P1 (carried in `ActivityIDs []int64` so batch support does not change the interface later)
- `ProfitLossResult` struct: total revenue, cost, profit, profit_rate, plus `[]ProfitLossBreakdown`
- Canonical `AssetType` enum: Points (1), Coupon (2), ItemCard (3), Product (4), Fragment (5), All (0)
- All monetary values stored as `int64` fen — never `float64` for storage
- Read-only DB routing: constructor injects `repo.GetDbR()` only — `GetDbW()` must not appear anywhere in this package
- Error propagation: every `Scan()` error checked and returned, never swallowed
**Should have (differentiators — P2):**
- Per-asset-type cost breakdown (5 types as separate breakdown slice entries)
- Fragment asset type cost integration via `fragment_synthesis_logs`
- Batch activity IDs support (`[]int64` activity IDs, not just one)
**Defer (v2+):**
- Redis TTL caching wrapper around the query functions (defer until query latency exceeds 2s)
- Incremental / time-bucketed aggregation with materialized stats tables (requires schema additions)
- Douyin (livestream) order integration into user-dimension function
### Architecture Approach
Note: A separate ARCHITECTURE.md was not produced; architecture findings are synthesized from STACK.md and codebase analysis embedded in all three research files.
The architecture follows the established layered pattern: new package `internal/service/finance/` with a `Service` interface and constructor accepting `logger.CustomLogger` and `mysql.Repo`. Business logic lives exclusively in service functions; HTTP handlers call service functions and handle pagination, auth, and response formatting. The fan-out query pattern (multiple targeted `Scan()` calls merged in Go) replaces any attempt at a single mega-JOIN. Pure finance computation functions (no DB access) remain in `profit_metrics.go`; new DB-querying logic lives in separate, focused files.
**Major components:**
1. **`service.go`** — `Service` interface definition + `New(logger, repo)` constructor; stores a read-only handle derived from `repo.GetDbR()` and `logger`
2. **`params.go`** — `AssetType` constants, `UserProfitLossParams`, `ActivityProfitLossParams`, `ProfitLossResult`, `ProfitLossBreakdown` types; shared between both query files
3. **`query_user.go`** — `QueryUserProfitLoss` implementation: ID collection, fan-out scans, in-memory merge calling existing `finance.*` utilities
4. **`query_activity.go`** — `QueryActivityProfitLoss` implementation: same pattern, activity-dimension scoping with proportional revenue attribution
5. **`profit_metrics.go`** (existing) — pure business logic functions; no modification needed
6. **`service_test.go`** — integration tests using `NewSQLiteRepoForTest()`; covers boundary cases (zero revenue, refunded orders, game-pass, legacy `order_id=0` inventory)
### Critical Pitfalls
1. **MySQL `SUM` with division returns Decimal, not SIGNED integer** — Wrap every `SUM(... / ...)` expression with `CAST(... AS SIGNED)` in SQL. Scanning Decimal into `int64` silently returns 0. Already hit in `dashboard_activity.go:174`; the fix is established — apply it consistently.
2. **Revenue double-counting when one order spans multiple activities** — Use the two-level subquery attribution pattern from `dashboard_activity.go:197-212`: compute `draw_count per (order, activity)` and `total_count per order` in separate derived tables, then prorate: `actual_amount * draw_count / total_count`. Naive `SUM(actual_amount)` grouped by activity fans out the full order to every matching activity.
3. **Game-pass orders misclassified as zero-revenue orders** — Use strict mutual exclusion: if `IsGamePassOrder()` returns true, revenue = `draw_count × activity_price`; otherwise revenue = `actual_amount + discount_amount`. Never sum both paths together. Three detection conditions must all be checked — not just `source_type=4`.
4. **Silently ignored `Scan()` errors causing all-zero results** — Every `Scan()` call must check `.Error` and return the error to the caller. "All zeros" is indistinguishable from a failed query without this check. This pattern is missing from existing dashboard code and must be corrected in the new package.
5. **Refunded order inventory counted as prize cost** — Always `LEFT JOIN` `user_inventory` to `orders` on `order_id` and filter `orders.status = 2`, with the legacy escape hatch `OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL` so that inventory rows without an order are kept. Add `NOT LIKE '%void%'` on `user_inventory.remark`. Refunds update `orders.status` but do not delete inventory rows.
6. **Writing analytics queries to the write DB (DbW)** — The constructor must accept and store only `repo.GetDbR()`. No call to `GetDbW()` should exist in `internal/service/finance/`. Enforce via grep in CI.
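
The proration in pitfall 2 can be illustrated outside SQL. The sketch below is a hypothetical Go helper, `prorate`, that splits one order amount across activities by draw count, as in `actual_amount * draw_count / total_count`. Assigning the truncation remainder to the last share is an assumption added here so the shares sum back to the order total; the existing dashboard query is not known to do this.

```go
package main

import "fmt"

// prorate splits amount (fen) across activities in proportion to their draw counts.
// Integer division truncates, so the leftover fen are added to the last share
// (an assumption of this sketch) to keep the sum equal to amount.
func prorate(amount int64, draws []int64) []int64 {
	var total int64
	for _, d := range draws {
		total += d
	}
	shares := make([]int64, len(draws))
	if total == 0 {
		return shares // no draws: nothing to attribute
	}
	var assigned int64
	for i, d := range draws {
		shares[i] = amount * d / total
		assigned += shares[i]
	}
	shares[len(shares)-1] += amount - assigned
	return shares
}

func main() {
	// One 10.00 RMB order with 1 draw in activity A and 2 draws in activity B.
	shares := prorate(1000, []int64{1, 2})
	fmt.Println(shares, shares[0]+shares[1])
}
```

Without the remainder step the two shares would be 333 and 666 fen, one fen short of the order total, which matters for any test asserting that per-activity revenue sums to the order amount.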
---
## Implications for Roadmap
Based on combined research, a 3-phase structure is recommended. All P1 features are tightly interdependent (revenue depends on game-pass classification, cost depends on multiplier logic, profit depends on both) so they belong in one implementation phase. Per-asset-type breakdown is isolated enough to be a second phase. Testing and hardening constitute a third phase.
### Phase 1: Foundation and Core P&L Functions
**Rationale:** All P1 features share the same data sources and query infrastructure. Building them together ensures the fan-out pattern, error handling convention, and read-replica routing are established consistently from the start. Deferring any P1 feature creates inconsistency in the result struct and breaks the `ProfitLossResult` contract for callers.
**Delivers:** Working `QueryUserProfitLoss` and `QueryActivityProfitLoss` with correct revenue (cash + game-pass), correct cost (with multiplier), correct profit, time-range filter, and refund/void exclusion. New package skeleton: `service.go`, `params.go`, `query_user.go`, `query_activity.go`.
**Addresses (from FEATURES.md):** All P1 features — revenue calculation, game-pass classification/derivation, prize cost with multiplier, profit formula, time-range filter, user-dimension aggregation, activity-dimension aggregation, composable filter struct, result type, AssetType enum, multi-user batch, read-DB enforcement.
**Avoids (from PITFALLS.md):** Decimal/int64 scan mismatch (CAST), revenue double-counting (subquery attribution), game-pass mutual exclusion, write-DB usage, silently swallowed scan errors.
**Research flag:** Standard patterns — established codebase conventions are documented; no additional research phase needed.
### Phase 2: Per-Asset-Type Breakdown
**Rationale:** The breakdown slice in `ProfitLossResult` can be populated as a separate set of GROUP BY legs per asset type once the core aggregation is proven correct. This requires extending the SQL or adding additional scan passes — but must not alter the top-level totals, making it safe to do independently.
**Delivers:** Populated `[]ProfitLossBreakdown` in the result, with one entry per `AssetType` (Points, Coupon, ItemCard, Product, Fragment). Fragment cost integration from `fragment_synthesis_logs`.
**Addresses (from FEATURES.md):** Per-asset-type breakdown (P2), Fragment synthesis cost (P2).
**Avoids (from PITFALLS.md):** Missing asset type silently understating cost ("Looks Done But Isn't" checklist item).
**Research flag:** Needs shallow research — Fragment synthesis log schema and join path require verification against the current DB model before implementation.
### Phase 3: Batch Activity IDs and Hardening
**Rationale:** Batch activity ID support (`[]int64`) is a low-complexity extension once the single-activity path is correct. Hardening (additional test cases, CI grep gate, load test) consolidates correctness guarantees.
**Delivers:** `QueryActivityProfitLoss` accepting `[]int64` activity IDs. CI enforcement of `GetDbW` absence. Integration tests covering all "Looks Done But Isn't" checklist items. Load test verification with 1,000 activities.
**Addresses (from FEATURES.md):** Batch activity IDs support (P2).
**Avoids (from PITFALLS.md):** Empty `[]int64{}` producing invalid SQL `WHERE IN ()`; performance trap of unbounded activity fetch; missing LIMIT guard.
**Research flag:** Standard patterns — no research phase needed.
### Phase Ordering Rationale
- Phase 1 must precede Phase 2 because the `ProfitLossResult` struct and fan-out pattern must be stable before extending it with per-type breakdown legs.
- Phase 2 must precede Phase 3 because batch activity ID support needs the full result struct (including breakdown) to be defined first.
- The proportional revenue attribution pattern (subquery join) is the most complex SQL in Phase 1 and must be designed before any other query is written — it anchors the activity-dimension function's correctness.
- SQLite test compatibility limits matter for Phase 1: `CAST(AS SIGNED)` and `GREATEST()` must be abstracted into Go helpers or test-specific SQL variants before the integration test suite is written.
### Research Flags
Phases needing deeper research during planning:
- **Phase 2:** Fragment synthesis log schema — verify `fragment_synthesis_logs` table columns, join path to `user_inventory` or `activity_id`, and whether the cost model for fragments differs from physical goods.
Phases with standard patterns (skip research-phase):
- **Phase 1:** All patterns are documented in the existing codebase with explicit prior-art examples.
- **Phase 3:** Batch ID extension and CI hardening are mechanical changes on established patterns.
---
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | All findings verified directly from `go.mod`, existing source files, and inline code comments — no speculation |
| Features | HIGH | Derived from direct codebase analysis of existing dashboard implementations and `profit_metrics.go`; requirements validated against PROJECT.md |
| Architecture | HIGH | Architecture inferred from STACK.md (no separate ARCHITECTURE.md produced); all patterns confirmed from multiple existing service examples in codebase |
| Pitfalls | HIGH | Every pitfall has direct evidence — code comments, inline bug fixes, or `.planning/codebase/CONCERNS.md` entries in the existing codebase |
**Overall confidence:** HIGH
### Gaps to Address
- **ARCHITECTURE.md was not produced** by the parallel research phase. Architecture guidance was successfully recovered from STACK.md (which contained the service constructor pattern, file structure, and query patterns) and from codebase analysis referenced in FEATURES.md and PITFALLS.md. No meaningful gap results — all architectural decisions are documented in this summary.
- **Fragment asset type cost model** is undefined for v1. The `fragment_synthesis_logs` table exists and the `AssetType` enum entry is defined, but the exact cost calculation formula and join path are not yet verified. Address in Phase 2 planning with a focused schema review.
- **SQLite test compatibility**: `CAST(AS SIGNED)`, `GREATEST()`, and `LIKE 'GP%'` are MySQL-specific. Integration tests on SQLite will require either Go-layer abstraction of these expressions or conditional SQL paths. This is a known constraint; address during Phase 1 test writing, not a blocker.
- **Batch activity IDs** deferred to Phase 3. The current `QueryActivityProfitLoss` design assumes one activity ID. The parameter struct should use `[]int64` from the start (even if Phase 1 only enforces `len(activityIDs) == 1`) to avoid a breaking interface change in Phase 3.
---
## Sources
### Primary (HIGH confidence — direct codebase analysis)
- `internal/service/finance/profit_metrics.go` — existing shared finance primitives; `IsGamePassOrder`, `ComputeProfit`, `ComputePrizeCostWithMultiplier`, `ClassifyOrderSpending`
- `internal/api/admin/dashboard_activity.go` — activity-dimension aggregation, prior-art bug fixes for Decimal/int64, double-counting, game-pass classification (lines 146-274)
- `internal/api/admin/dashboard_spending.go` — user-dimension aggregation, multi-join fan-out pattern
- `internal/api/admin/dashboard_user_spending.go` — per-user spending drill-down
- `internal/repository/mysql/mysql.go` — `Repo` interface, `GetDbR()` / `GetDbW()` split
- `internal/repository/mysql/testrepo_sqlite.go` — `NewSQLiteRepoForTest()` pattern
- `internal/service/user/user.go` — canonical `Service` interface + constructor pattern
- `.planning/codebase/CONCERNS.md` — flagged 113 `GetDbW()` calls in handler layer; silently swallowed errors in financial paths
- `go.mod` — confirmed dependency versions (Go 1.24.0, GORM 1.25.9, testify 1.11.1)
### Secondary (HIGH confidence — official documentation cross-referenced with codebase evidence)
- GORM v1.25 docs: soft-delete not auto-injected into raw JOIN strings — confirmed by existing code comments
- MySQL 8.x docs: `SUM()` with division promotes to Decimal — confirmed by `dashboard_activity.go:174` comment and `CAST(AS SIGNED)` fix pattern
---
*Research completed: 2026-03-21*
*Ready for roadmap: yes*