# Coding Conventions

**Analysis Date:** 2026-03-21

## Naming Patterns

**Files:**
- Snake_case for all Python files: `company_storage.py`, `company_cleaner.py`, `clickhouse_repo.py`
- Private/internal modules prefixed with underscore: `_base.py`, `_boss_api.py`, `_boss_client.py`, `_boss_sign.py`, `_http_client.py`
- Platform-named service files: `boss.py`, `qcwy.py`, `zhilian.py` under `app/services/crawler/`
- Router files named after domain: `keyword.py`, `analytics.py`, `cleaning.py`

**Classes:**
- PascalCase throughout: `CleaningService`, `KeywordController`, `ClickHouseBaseRepo`, `JobAnalyticsRepo`
- Services: `{Domain}Service`, e.g. `BossService`, `QcwyService`, `ZhilianService`, `IngestService`, `AnalyticsService`
- Controllers: `{Domain}Controller`, e.g. `KeywordController`
- Repos: `{Domain}Repo` or `{Domain}BaseRepo`, e.g. `ClickHouseBaseRepo`, `JobAnalyticsRepo`
- Models (Tortoise ORM): `{Platform}{Entity}`, e.g. `BossKeyword`, `QcwyCompany`, `ZhilianCompany`
- Schemas (Pydantic): `{Entity}Base`, `{Entity}Create`, `{Entity}Update`, `{Entity}Out` (see `app/schemas/keyword.py`)

**Functions and Methods:**
- Snake_case for all functions and methods: `get_available`, `report_page_progress`, `store_batch`, `build_insert_row`
- Private helpers prefixed with underscore: `_apply_proxy`, `_ensure_boss_token_loaded`, `_pick_first`, `_nested_get`, `_clean_text`, `_model_for_source`
- Async dependency factories follow the pattern `get_{service/controller}()`: `get_ingest_service`, `get_analytics_service`, `get_keyword_controller`

**Variables:**
- Snake_case: `data_list`, `platform_type`, `check_duplicate`, `page_size`
- Module-level constants: UPPER_SNAKE_CASE, e.g. `COMPANY_SOURCES`, `QUEUE_TERMINAL_STATUSES`
- Class-level constants: UPPER_SNAKE_CASE prefixed with `_`, e.g. `_TOKEN_REFRESH_INTERVAL = 3600`

**Types and Enums:**
- Enums use a PascalCase class name and UPPER_SNAKE_CASE member names: `PlatformType.BOSS`, `ChannelType.MINI`, `DataType.JOB`
- Enum values are lowercase strings matching URL slugs:
`"boss"`, `"mini"`, `"job"` (see `app/schemas/ingest.py`)
- Enums inherit from `(str, Enum)`, enabling direct string comparison

## Code Style

**Formatting:**
- Tool: `black` v24.10.0
- Line length: 120 characters (set in `pyproject.toml` `[tool.black]` and `[tool.ruff]`)
- Target Python versions: 3.10, 3.11 (black), 3.13 (Pipfile)

**Linting:**
- Tool: `ruff` v0.9.1 (configured in `pyproject.toml`)
- Ignored rules: `F403` (star imports), `F405` (may be undefined from star import)
- Star imports from internal modules are allowed (used in `app/models/__init__.py`, `app/services/ingest/__init__.py`)

**Import Sorting:**
- Tool: `isort` v5.13.2
- No explicit isort config found; follows default ordering

## Import Organization

**Order:**
1. Standard library (`from __future__`, `os`, `re`, `typing`, `datetime`, `json`)
2. Third-party (`fastapi`, `pydantic`, `tortoise`, `loguru`, `clickhouse_connect`)
3. Internal app imports (`from app.core.`, `from app.models.`, `from app.services.`, `from app.schemas.`)

**Example from `app/api/v1/analytics.py`:**
```python
from typing import Optional
from datetime import datetime, date, timezone
from zoneinfo import ZoneInfo

from fastapi import APIRouter, Depends, Query

from app.core.clickhouse import clickhouse_manager
from app.services.analytics_service import AnalyticsService
from app.schemas.analytics import JobStatisticsResponse
```

**Path Aliases:**
- None; all imports use full `app.` prefix paths
- `from app.log import logger` is the canonical loguru import path

**Star Imports:**
- Used only in `__init__.py` re-export files: `from .admin import *` in `app/models/__init__.py`
- `# noqa: F401, F403` comments suppress lint warnings for intentional star imports

## Error Handling

**Patterns:**
- Services return `Dict[str, Any]` result objects with `"success"`, `"code"`, `"message"` fields instead of raising exceptions to callers
- Controllers return a dict with `"code": 200/400/404` and `"message"` for all outcomes
- API route handlers do
NOT use try/except; they rely on services returning structured results
- Service methods wrap low-level calls in `try/except Exception as e`, log the error, then return `False` or an error dict

**Service-level error handling example** (`app/services/cleaning.py`):
```python
except Exception as e:
    logger.error(f"Error processing item {target}: {e}")
    return {
        "success": False,
        "target": target,
        "error": str(e),
        "storage_status": "error",
        "remote_sent": False
    }
```

**Repository-level:** `ClickHouseBaseRepo` does not swallow exceptions; they propagate to the service layer.

**Auth exceptions:** `app/core/dependency.py` raises `HTTPException(status_code=401/403)` directly, the standard FastAPI pattern for auth failures.

## Logging

**Framework:** `loguru` v0.7.3

**Import:** `from app.log import logger` (centralized re-export) or `from loguru import logger` (direct)

**Patterns:**
- `logger.info(f"...")` for normal operation events
- `logger.warning(f"...")` for non-fatal recoverable issues (e.g., token not found, API soft failures)
- `logger.error(f"...")` for caught exceptions and operation failures
- F-string interpolation used consistently for message formatting
- No structured fields (no `logger.bind()` usage observed)

**Example:**
```python
logger.info(f"获取招聘详情: {job_id}")  # "Fetching job detail"
logger.warning(f"Boss get_job_detail failed: {result.error}")
logger.error(f"批量插入失败: {e}")  # "Batch insert failed"
```

## API Response Format

**Two response styles coexist:**

**Style 1: Direct dict return** (most routes in new modules like `app/api/v1/job/job.py`, `app/api/v1/analytics.py`):
```python
return {"code": 200, "data": result, "message": "ok"}
```

**Style 2: JSONResponse subclasses** (older RBAC routes, defined in `app/schemas/base.py`):
```python
Success(code=200, msg="OK", data=data)
Fail(code=400, msg="error message")
SuccessExtra(code=200, data=data, total=100, page=1, page_size=20)
```

**Paginated responses** include: `code`, `data` (list), `total`, `page`, `page_size`

## Comments

**When to Comment:**
- Docstrings
on public methods describing purpose, not implementation: `"""获取可用关键词,优先返回断点续爬和失败重试的关键词"""` ("Get available keywords, prioritizing resumed crawls and failed retries")
- Inline comments for priority logic and algorithm steps: `# 优先级 1: 断点续爬 (partial)` ("Priority 1: resume interrupted crawl (partial)")
- Module-level docstrings for context: `"""Boss直聘 Service — 基于新算法文件的封装"""` ("Boss Zhipin Service: wrapper built on the new algorithm files")
- `# noqa` comments for intentional lint suppressions

**JSDoc/TSDoc:**
- Not applicable (Python backend)
- Docstrings are brief single-line or short multi-line Chinese descriptions

## Function Design

**Size:** Functions tend to be 10-50 lines; service methods like `process_single_item` in `app/services/cleaning.py` grow to ~70 lines due to multi-platform dispatch

**Parameters:**
- Keyword arguments with defaults preferred for optional params
- Pydantic schemas used for HTTP request bodies (never raw dicts from router params)
- `Optional[str]` with `= None` default for optional parameters

**Return Values:**
- Services return `Dict[str, Any]` with consistent keys (`code`, `message`, `data`)
- Private helpers return `Optional[T]` or primitive types
- Async functions return awaitable results (no mixing of sync/async)

## Module Design

**Exports:**
- `app/models/__init__.py` uses `from .{module} import *` to flatten model imports
- Router modules export a single named router variable: `router = APIRouter(...)` or `{domain}_router = APIRouter(...)`
- Service classes are imported directly by name

**Barrel Files (`__init__.py`):**
- `app/models/__init__.py`: re-exports all model classes
- `app/services/ingest/__init__.py`: re-exports `IngestService` and config registrations
- `app/api/v1/__init__.py`: aggregates all routers into `v1_router`

**Dependency Injection:**
- FastAPI `Depends()` used for service/controller instantiation in route handlers
- Dependency factory functions named `get_{service}()` and defined in the same file as the router
- Shared auth dependencies: `DependAuth`, `DependPermission` in `app/core/dependency.py`

## Tortoise ORM Model Conventions

**Base class:** All models inherit from
`app/models/base.py:BaseModel` (which extends `tortoise.models.Model` with `id = BigIntField(pk=True)`)

**Timestamp mixin:** `TimestampMixin` adds `created_at` (auto_now_add) and `updated_at` (auto_now); it is applied via multiple inheritance

**Abstract base models:** Platform variants use an abstract base plus concrete subclasses:
```python
from tortoise.models import Model

class BaseKeyword(Model):
    # abstract = True in Meta
    ...

class BossKeyword(BaseKeyword):
    class Meta:
        table = "boss_keyword"
```

**Field descriptions:** All fields include a `description=` parameter for documentation

## Pydantic Schema Conventions

- All schemas inherit from `pydantic.BaseModel`
- All fields use `Field(...)` with `description=` for documentation
- Enums inherit from `(str, Enum)` for JSON serialization compatibility
- Output schemas include `class Config: from_attributes = True` to support ORM mode
- Validation patterns use `Field(..., pattern="^(boss|qcwy|zhilian)$")` for enum-like string fields

---

*Convention analysis: 2026-03-21*
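
The `(str, Enum)` convention noted under Pydantic Schema Conventions can be illustrated with a minimal, standard-library-only sketch. It is not taken from `app/schemas/ingest.py`: only `PlatformType.BOSS` and its `"boss"` value appear in this document, so the other two members and the `parse_platform` helper are assumptions made for illustration.

```python
import json
from enum import Enum


class PlatformType(str, Enum):
    """PascalCase class, UPPER_SNAKE_CASE member names, lowercase slug values."""

    BOSS = "boss"
    QCWY = "qcwy"  # assumed member, inferred from the platform names in this document
    ZHILIAN = "zhilian"  # assumed member, inferred from the platform names in this document


def parse_platform(slug: str) -> PlatformType:
    """Hypothetical helper: look up a member by its URL slug (raises ValueError if unknown)."""
    return PlatformType(slug)


platform = parse_platform("boss")

# The str mixin makes a member compare equal to its plain string value.
is_boss = platform == "boss"

# The str mixin also lets json.dumps serialize the member as its raw value.
payload = json.dumps({"platform": platform})
```

Because each member is itself a `str`, it can be used as a dict key, compared against a path parameter, or placed in a JSON response without calling `.value` first.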