Plan 01 - DATA-01: 30-day window dedup fix: - dedup.py: both single-field and double-field SQL queries now include AND created_at > now() - INTERVAL 30 DAY - tests/ingest/test_dedup.py: 6 mock tests validating 30-day window Plan 02 - DATA-04: company vs search job channel separation: - schemas/ingest.py: ChannelType.COMPANY = 'company' - configs/boss.py: register channel='company' config - configs/qcwy.py: register channel='company' config - configs/zhilian.py: register channel='company' config - company_jobs_sync.py: store_batch(..., 'mini', ...) → (..., 'company', ...) DATA-02: confirmed already complete (job.py has /data/batch-async endpoint) DATA-03: confirmed already complete (company_cleaner.py full pipeline) Full regression: 112 passed (106 existing + 6 new)
export DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build -t zfc931912343/admin-crawler:v2.1 . docker push zfc931912343/admin-crawler:v2.1
docker build -t zfc931912343/boss-crawler:v1 . docker push zfc931912343/boss-crawler:v1
sudo docker rm -f admin-crawler &&sudo docker run -d --restart=always --name=admin-crawler --log-driver=json-file --log-opt max-size=10m --log-opt max-file=7 -p 9999:80 nbg2akd8w5diy8.xuanyuan.run/zfc931912343/admin-crawler:v1.5
docker run -d
--name mysql-server
--restart always
-p 3306:3306
-v /opt/mysql/data:/var/lib/mysql
-e MYSQL_ROOT_PASSWORD=jobdata123
-e MYSQL_DATABASE=job_data
mysql:8.0
--character-set-server=utf8mb4
--collation-server=utf8mb4_unicode_ci
Languages
Python
69.3%
Vue
22.8%
JavaScript
6.7%
Dockerfile
0.3%
Makefile
0.3%
Other
0.6%