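2dfan Automatic Sign-In

Edit History

2024-12-05 18:11:45 First edit

Initial post.

2024-12-06 15:51:45 Second edit

Improved the main code: enabled incognito mode (so the save-password prompt does not pop up) and reworked the login flow (it now retries the login when an error occurs).

2025-01-13 11:51:45 Third edit

Bug fix: the script failed when other people ran it from a fork.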

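Development History

I started out using selenium for the browser automation, but it turned out that Cloudflare detects it and the verification fails. It also requires downloading a chromedriver that matches the browser version, which is a hassle, so it was not a good fit.
I was about to give up when, while browsing GitHub projects, I stumbled on DrissionPage, a browser automation library that is simpler than selenium yet powerful. I tried it against the Cloudflare verification and, to my surprise, it passed, so I built everything else on this library.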

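Clicking the Verification Checkbox

This is the most critical step. The elements inside the Cloudflare widget cannot be reached with ordinary page locators, so they are hard to click. The simplest approach is to click at fixed screen coordinates, but that is very limiting. With the help of an AI I found a GitHub project that uses OpenCV template matching; I had somehow forgotten that option existed. Running it confirmed that it does find the checkbox position and click it. All it needs is a template image and a screenshot of the page, plus a small adjustment to the click position for your setup, and the verification passes. The main code is as follows: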

"Hansimov/captcha-bypass"

import cv2
import pyautogui
from mss import mss
from pathlib import Path
from PIL import ImageGrab

class ImageMatcher:
    def __init__(self, source_image_path, template_image_path):
        self.source_image = cv2.imread(str(source_image_path))
        self.template_image = cv2.imread(str(template_image_path))
        self.detected_image_path = source_image_path.parent / "screenshot_detected.png"

    def match(self):
        res = cv2.matchTemplate(
            self.source_image, self.template_image, cv2.TM_CCOEFF_NORMED
        )
        _, _, _, match_location = cv2.minMaxLoc(res)
        match_left = match_location[0]
        match_top = match_location[1]
        match_right = match_location[0] + self.template_image.shape[1]
        match_bottom = match_location[1] + self.template_image.shape[0]
        match_region = (match_left, match_top, match_right, match_bottom)

        self.match_region = match_region
        return match_region

    def draw_rectangle(self):
        cv2.rectangle(
            img=self.source_image,
            pt1=self.match_region[:2],
            pt2=self.match_region[2:],
            color=(0, 255, 0),
            thickness=2,
        )
        cv2.imwrite(str(self.detected_image_path), self.source_image)
    
class CaptchaBypasser:
    def __init__(self):
        self.captcha_image_path = (
            Path(__file__).parent / "captcha-verify-you-are-human-eg.png"  # "-eg" template for automated deployment; drop "-eg" when running locally
        )
        self.screen_shot_image_path = Path(__file__).parent / "screenshot.png"

    def get_screen_shots(self):
        ImageGrab.grab(all_screens=True).save(self.screen_shot_image_path)

    def get_captcha_location(self):
        with mss() as sct:
            all_monitor = sct.monitors[0]
            monitor_left_offset = all_monitor["left"]
            monitor_top_offset = all_monitor["top"]

        image_matcher = ImageMatcher(
            source_image_path=self.screen_shot_image_path,
            template_image_path=self.captcha_image_path,
        )

        match_region = image_matcher.match()
        image_matcher.draw_rectangle()
        match_region_in_monitor = (
            match_region[0] + monitor_left_offset,
            match_region[1] + monitor_top_offset,
            match_region[2] + monitor_left_offset,
            match_region[3] + monitor_top_offset,
        )
        checkbox_center = (
            int(match_region_in_monitor[0] + 40),  # 60 locally, 40 in automated deployment
            int((match_region_in_monitor[1] + match_region_in_monitor[3]) / 2),
        )

        # Draw the click point and save the image
        cv2.circle(
            img=image_matcher.source_image,
            center=checkbox_center,
            radius=2,
            color=(0, 0, 255),
            thickness=-1,
        )
        cv2.imwrite(str(image_matcher.detected_image_path), image_matcher.source_image)
        return checkbox_center

    def click_target_checkbox(self):
        captcha_checkbox_center = self.get_captcha_location()
        pyautogui.moveTo(*captcha_checkbox_center)
        pyautogui.click()

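    def run(self):
        self.get_screen_shots()
        # click_target_checkbox() locates the checkbox itself, so a separate
        # get_captcha_location() call here would only repeat the template match.
        self.click_target_checkbox()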


if __name__ == "__main__":
    captcha_bypasser = CaptchaBypasser()
    captcha_bypasser.run()
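
To make the coordinate arithmetic in get_captcha_location easier to follow, here is a minimal sketch of that step on its own, with no screen capture or OpenCV involved. The function name click_point and its parameters are made up for this illustration and are not part of the original script; x_inset plays the role of the 40/60 value in the code above.

```python
from typing import Tuple

Region = Tuple[int, int, int, int]  # (left, top, right, bottom) in screenshot pixels


def click_point(match_region: Region, monitor_offset: Tuple[int, int], x_inset: int) -> Tuple[int, int]:
    """Return the screen coordinates to click for a matched template region.

    match_region   -- where the template was found inside the screenshot
    monitor_offset -- (left, top) of the combined monitor area, as reported by mss
    x_inset        -- horizontal distance from the region's left edge to the checkbox
    """
    left, top, _right, bottom = match_region
    offset_x, offset_y = monitor_offset
    # Horizontally, step a fixed number of pixels in from the left edge.
    x = int(left + offset_x + x_inset)
    # Vertically, aim at the middle of the matched region.
    y = int((top + offset_y + bottom + offset_y) / 2)
    return x, y


if __name__ == "__main__":
    # Hypothetical match at (500, 300)-(800, 360) on a single monitor at (0, 0).
    print(click_point((500, 300, 800, 360), (0, 0), 40))
```

Keeping this calculation in a small pure function makes it possible to check the offsets without a browser or a display.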

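When running locally, use the template image with the Chinese text; for automated deployment, use the one with the English text. The horizontal offset also differs: 60 locally and 40 in automated deployment.
In the main code, just import this module and run:

captcha_bypasser = CaptchaBypasser()
captcha_bypasser.run()

This clicks the checkbox and passes the verification.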
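
Switching the template image and the offset by hand between local and deployed runs is easy to forget. One possible way to automate it, sketched here under assumptions, is to check the GITHUB_ACTIONS environment variable, which GitHub sets to "true" on its runners. The function captcha_settings is hypothetical, and the local file name without "-eg" is inferred from the comment in the code above rather than confirmed.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def captcha_settings(base_dir: Path, env: Optional[Mapping[str, str]] = None) -> Tuple[Path, int]:
    """Pick the template image and horizontal click offset for the current environment."""
    env = os.environ if env is None else env
    # GitHub Actions sets GITHUB_ACTIONS=true for every workflow step.
    on_actions = env.get("GITHUB_ACTIONS") == "true"
    if on_actions:
        # English template, offset 40 (values taken from the script above).
        return base_dir / "captcha-verify-you-are-human-eg.png", 40
    # Chinese template, offset 60; the file name is an assumption.
    return base_dir / "captcha-verify-you-are-human.png", 60


if __name__ == "__main__":
    template, x_inset = captcha_settings(Path("."))
    print(template, x_inset)
```

CaptchaBypasser.__init__ could call such a helper instead of hard-coding the path and the offset.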

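Automated Deployment

This again runs on GitHub Actions. I got the initial workflow from GPT and then edited it until it was complete. Create a yml file under .github/workflows. The main code is: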

name: 2dfan Task Runner

on:
  push:
    branches:
      - main
  schedule:
    - cron: '0 19 * * *'  # 19:00 UTC is 03:00 Beijing time (next day)
    - cron: '0 22 * * *'  # 22:00 UTC is 06:00 Beijing time (next day)
  workflow_dispatch:  # manual trigger

jobs:
  run-task:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: 3.9

    - name: Install dependencies
      run: |
        sudo apt-get update
        sudo apt-get install -y xvfb libx11-dev xauth fonts-noto-cjk  # fonts-noto-cjk provides the Chinese fonts
        python -m pip install --upgrade pip
        pip install -r requirements.txt

    - name: Start Xvfb
      run: |
        nohup Xvfb :99 -screen 0 1280x1024x24 &

    - name: Set DISPLAY and XAUTHORITY environment variables
      run: |
        echo "DISPLAY=:99" >> $GITHUB_ENV
        echo "XAUTHORITY=/home/runner/.Xauthority" >> $GITHUB_ENV
        touch /home/runner/.Xauthority  # create an empty Xauthority file

    - name: Run Python script
      env:
        USER_EMAIL: ${{ secrets.USER_EMAIL }}
        USER_PASSWORD: ${{ secrets.USER_PASSWORD }}
      run: |
        python 2dfan_DrissionPage.py

    - name: Upload screenshots as artifacts
      uses: actions/upload-artifact@v4
      with:
        name: screenshots
        path: |
          ./screenshot.png
          ./screenshot_detected.png
          ./pic1.png
          ./pic2.png
          ./pic3.png
          ./pic4.png

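One of the difficulties with automated deployment is that the automation needs a screen, so a virtual display, Xvfb, is used here. The dependency step installs the virtual display, the Chinese fonts (so that Chinese text on the page renders properly), and the required Python packages (listed in requirements.txt). The next steps hold the key configuration that keeps the virtual display working. Then the Python script runs; for this, two GitHub Actions secrets were added beforehand, the email and password of the 2dfan account used to log in. The last step is for debugging: since the screen cannot be seen, the script takes screenshots and the workflow collects them as an artifact, which downloads as screenshots.zip. The images are saved in the same directory as the script, hence the paths ./xxx.png. The workflow is triggered by a push, on a schedule, or manually.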
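
GitHub Actions evaluates cron schedules in UTC, so the hours in the workflow have to be converted by hand. The following small sketch double-checks the two entries above; the helper name utc_hour_to_beijing is made up for this example, and Beijing time is modelled as a fixed UTC+8 offset.

```python
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8), "UTC+8")


def utc_hour_to_beijing(hour_utc: int) -> str:
    """Convert an hour of the day in UTC to Beijing wall-clock time (HH:MM)."""
    # The date is arbitrary; only the hour matters for a daily cron entry.
    moment = datetime(2024, 12, 5, hour_utc, 0, tzinfo=timezone.utc)
    return moment.astimezone(BEIJING).strftime("%H:%M")


if __name__ == "__main__":
    for hour in (19, 22):
        print(f"{hour:02d}:00 UTC -> {utc_hour_to_beijing(hour)} Beijing time")
```
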
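Addendum: when too many workflow runs pile up and need to be deleted in bulk, this can be done from the command line:

Install the GitHub CLI.
In a cmd terminal, run
gh auth login
Follow the prompts to log in, and make sure to select the right permissions.
Verify that the login succeeded:
gh auth status
Run the bulk-delete command. This is run in Git Bash, because cmd does not support xargs:
gh api repos/USER/REPO/actions/runs --paginate | \
jq -r '.workflow_runs[].id' | \
tr -d '\r' | \
xargs -I {} gh api -X DELETE repos/USER/REPO/actions/runs/{}
Here USER is the GitHub username and REPO is the repository to operate on (note that both appear in two places). The command deletes all workflow runs under that repository's Actions.
To delete a single run:
gh api -X DELETE repos/USER/REPO/actions/runs/123456789
The number at the end is the workflow run ID.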
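
For a terminal without xargs or jq, the same loop can be written in Python. This is only a sketch under assumptions: it relies on gh being installed and logged in, it uses the --jq option of gh api in place of the external jq program, and the function names are invented for this example.

```python
import subprocess
from typing import List


def parse_run_ids(text: str) -> List[int]:
    """Turn the one-ID-per-line output of `gh api ... --jq` into integers."""
    # splitlines() also copes with \r\n endings, which the shell version removes with tr.
    return [int(line) for line in text.splitlines() if line.strip()]


def delete_command(user: str, repo: str, run_id: int) -> List[str]:
    """Build the gh invocation that deletes a single workflow run."""
    return ["gh", "api", "-X", "DELETE", f"repos/{user}/{repo}/actions/runs/{run_id}"]


def delete_all_runs(user: str, repo: str) -> None:
    """List every workflow run of the repository and delete them one by one."""
    listing = subprocess.run(
        ["gh", "api", f"repos/{user}/{repo}/actions/runs", "--paginate",
         "--jq", ".workflow_runs[].id"],
        capture_output=True, text=True, check=True,
    )
    for run_id in parse_run_ids(listing.stdout):
        subprocess.run(delete_command(user, repo, run_id), check=True)
```

Calling delete_all_runs("USER", "REPO") with the real username and repository name would then remove every run, so it is worth printing the parsed IDs first before deleting anything.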

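Main Code

The real difficulty is getting hold of the elements inside the Cloudflare widget, which ordinary locators cannot reach. Inspired by https://github.com/sarperavci/CloudflareBypassForScraping.git, I found a way to locate them. I finished the code and had GPT tidy up the structure and naming. The code is: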

import time, os
import logging
from DrissionPage import ChromiumPage
from bypass_captcha import CaptchaBypasser
from DrissionPage import ChromiumOptions

# Configure logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

# Global settings
LOGIN_URL = "https://2dfan.com/users/421136/recheckin"
MAX_RETRIES = 3     # maximum number of attempts to locate an element
MAX_LOGIN_ATTEMPTS = 3  # maximum number of login attempts

def locate_button(ele, tag="tag:svg", retries=MAX_RETRIES):
    """
    Try to locate the button, making at most `retries` attempts.
    """
    for attempt in range(retries):
        try:
            button = ele.parent().shadow_root.child()("tag:body").shadow_root(tag)
            if button:
                logging.info(f"Button located (attempt {attempt + 1})")
                return button
            else:
                logging.warning(f"Button is empty, retrying (attempt {attempt + 1})")
        except Exception as e:
            logging.error(f"Error while locating the button: {e} (attempt {attempt + 1})")
        time.sleep(1)
    raise RuntimeError("Failed to locate the button: maximum number of retries reached")

def process_captcha(tab, eles, tag="tag:circle"):
    """
    Locate the requested element inside the captcha widget and return it.
    """
    for ele in eles:
        if "name" in ele.attrs and "type" in ele.attrs:
            if "turnstile" in ele.attrs["name"] and ele.attrs["type"] == "hidden":
                button = locate_button(ele, tag=tag)
                logging.info(f"Captcha element: {button}")
                tab.wait(1)
                return button
    raise RuntimeError("Captcha element not found")

def login_process(tab):
    """
    Run the login flow: enter the account and password, then bypass the captcha.
    """
    # Enter the account
    user_email = os.getenv("USER_EMAIL", "")
    if not user_email:
        raise ValueError("Environment variable USER_EMAIL is not set")
    logging.info(f"Entering account: {user_email}")
    tab.ele('@name=login').input(user_email)

    # Enter the password
    user_password = os.getenv("USER_PASSWORD", "")
    if not user_password:
        raise ValueError("Environment variable USER_PASSWORD is not set")
    logging.info("Entering password")
    tab.ele('@name=password').input(user_password)

    # Captcha: wait until the loading spinner is hidden
    tab.wait.eles_loaded("tag:input")
    eles = tab.eles("tag:input")
    button = process_captcha(tab, eles, tag="tag:svg")
    tab.wait.ele_hidden(button)
    logging.info("Starting verification")
    tab.wait(3)

    # Set up the captcha bypasser
    logging.info("Initializing the captcha bypasser...")
    captcha_bypasser = CaptchaBypasser()
    logging.info("Running the captcha bypasser...")
    captcha_bypasser.run()

    # Check whether the verification succeeded
    button = process_captcha(tab, eles, tag="tag:circle")
    tab.wait.ele_displayed(button)
    logging.info("Verification succeeded")
    tab.wait(2)
    tab.get_screenshot(name='pic1.png', full_page=True)

    # Click the login button
    logging.info("Looking for the login button...")
    login_button = tab.ele('@type=submit')
    if login_button:
        login_button.click()
        logging.info("Login button clicked")
    else:
        raise RuntimeError("Login button not found")
    
def main():
    try:
        # Start the browser
        logging.info("Starting the browser...")
        co = ChromiumOptions()
        # Block all pop-up windows
        # co.set_pref(arg='profile.default_content_settings.popups', value='0')
        # # Hide the save-password prompt
        # co.set_pref('credentials_enable_service', False)

        # Use incognito mode so the save-password prompt does not appear
        co.incognito(True)
        tab = ChromiumPage(co)

        # Go to the login page
        logging.info("Opening the login page...")
        tab.get(LOGIN_URL)
        logging.info("Login page opened")

        login_attempts = 0  # number of login attempts so far
        while login_attempts < MAX_LOGIN_ATTEMPTS:
            login_attempts += 1
            logging.info(f"Running the login flow (attempt {login_attempts})...")

            # Run the login flow
            try:
                login_process(tab)

                # Check the current page URL
                tab.wait.new_tab()
                current_url = tab.url
                logging.info(f"Current page URL: {current_url}")

                if current_url == "https://2dfan.com/users/sign_in":
                    logging.warning("Still on the login page, retrying the login...")
                    tab.refresh()
                    tab.wait.doc_loaded()
                    tab.get_screenshot(name='pic_error.png', full_page=True)
                else:
                    logging.info("Redirected to the home page, continuing...")
                    break  # logged in, leave the loop
            except Exception as e:
                logging.error(f"Login attempt failed: {e}")

        else:
            logging.error("Maximum number of login attempts reached, exiting")
            return

        # Wait for the page to load
        tab.wait.eles_loaded("tag:input")
        eles = tab.eles("tag:input")
        logging.info("Logged in")
        tab.get_screenshot(name='pic2.png', full_page=True)

        # Check the sign-in status; '今日已签到' is the page text for "already signed in today"
        checkin_status = tab.ele('text:今日已签到')
        if checkin_status:
            logging.info("Already signed in today!")
        else:
            logging.info("Not signed in yet, trying to sign in...")

            # Run the captcha bypasser again
            logging.info("Running the captcha bypasser again...")
            captcha_bypasser = CaptchaBypasser()
            captcha_bypasser.run()

            # Check whether the verification succeeded
            button = process_captcha(tab, eles, tag="tag:circle")
            tab.wait.ele_displayed(button)
            logging.info("Verification succeeded")
            tab.wait(2)
            tab.get_screenshot(name='pic3.png', full_page=True)

            # Click the sign-in button
            logging.info("Looking for the sign-in button...")
            checkin_button = tab.ele('@type=submit')
            if checkin_button:
                checkin_button.click()
                logging.info("Sign-in button clicked")
            else:
                raise RuntimeError("Sign-in button not found")

            tab.wait(5)
            tab.refresh()
            tab.wait.doc_loaded()
            tab.wait(3)
            logging.info("Page refreshed")

            # Check the sign-in status again
            checkin_status = tab.ele('text:今日已签到')
            if checkin_status:
                logging.info("Signed in successfully!")
            else:
                logging.info("Sign-in failed!")

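    except Exception as e:
        logging.error(f"An error occurred while running: {e}")
    finally:
        # Close the browser, but only if it was actually started: when
        # ChromiumPage() itself fails, `tab` has never been assigned.
        if "tab" in locals():
            logging.info("Closing the browser...")
            tab.close()
            logging.info("Browser closed")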

if __name__ == "__main__":
    main()

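The main problem to solve is knowing when to move on to the next step. Using sleep to put a fixed delay between steps makes runs unreliable under automated deployment, because a suitable interval is hard to pin down. Instead, the timing is taken from how the Cloudflare widget changes as it loads. When the verification first loads, a spinner keeps turning; the script gets that element with "tag:svg" and waits with tab.wait.ele_hidden(button) until it becomes hidden, at which point verification can start. After the checkbox is clicked, the script gets the element "tag:circle" (the success circle) and waits with tab.wait.ele_displayed(button) until it is displayed, which means the verification succeeded. After a short pause with tab.wait(2), it clicks the login button. The page then changes, and tab.wait.new_tab() is used to wait for that to finish. Next, the script checks whether today's sign-in is already done; if not, it goes straight to clicking the verification, again waiting for "tag:circle" to be displayed to confirm success. Once the verification succeeds it clicks the sign-in button, then refreshes the page to check whether the sign-in worked.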
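
The general idea of waiting for a condition instead of sleeping for a fixed time can be sketched in a few lines of plain Python. The script itself relies on DrissionPage's own wait methods; wait_until below is a hypothetical helper written only to illustrate the pattern, and its parameters are assumptions of this example.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0, interval: float = 0.5,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True   # the awaited state was reached
        if clock() >= deadline:
            return False  # gave up; the caller decides how to react
        sleep(interval)


if __name__ == "__main__":
    start = time.monotonic()
    # Stand-in for "the spinner is hidden": becomes true after about one second.
    print(wait_until(lambda: time.monotonic() - start > 1.0, timeout=5.0, interval=0.1))
```

Compared with a fixed sleep, the step continues as soon as the condition holds, and a slow run still gets the full timeout.
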
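A retry mechanism is also included, because the first attempt may fail to locate the element and several attempts may be needed; that is what locate_button(ele, tag="tag:svg", retries=MAX_RETRIES) is for. Elements inside the Cloudflare widget are located with process_captcha(tab, eles, tag="tag:circle"). It works like this: eles = tab.eles("tag:input") first collects all input tags; each one is checked for both a name and a type attribute, and then for whether the name value contains "turnstile" and the type value is "hidden". An input that meets all of these marks the Cloudflare area, and locate_button() is called with the tag argument set to fetch the element needed. Excerpt: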

def process_captcha(tab, eles, tag="tag:circle"):
    """
    Locate the requested element inside the captcha widget and return it.
    """
    for ele in eles:
        if "name" in ele.attrs and "type" in ele.attrs:
            if "turnstile" in ele.attrs["name"] and ele.attrs["type"] == "hidden":
                button = locate_button(ele, tag=tag)
                logging.info(f"Captcha element: {button}")
                tab.wait(1)
                return button
    raise RuntimeError("Captcha element not found")
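
The attribute test in process_captcha can be tried out without a browser by applying it to plain dictionaries. The sketch below does that; the helper names and the sample attribute values are made up for illustration and do not come from the original script or from the real page.

```python
from typing import Iterable, Mapping, Optional


def is_turnstile_input(attrs: Mapping[str, str]) -> bool:
    """Same test as in process_captcha, applied to a plain attribute mapping."""
    return (
        "name" in attrs
        and "type" in attrs
        and "turnstile" in attrs["name"]
        and attrs["type"] == "hidden"
    )


def first_turnstile_index(inputs: Iterable[Mapping[str, str]]) -> Optional[int]:
    """Return the position of the first input that marks the Cloudflare area."""
    for index, attrs in enumerate(inputs):
        if is_turnstile_input(attrs):
            return index
    return None


if __name__ == "__main__":
    # Invented sample data standing in for the attrs of the page's input elements.
    sample = [
        {"name": "login", "type": "text"},
        {"name": "password", "type": "password"},
        {"name": "cf-turnstile-response", "type": "hidden"},
    ]
    print(first_turnstile_index(sample))
```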

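Conclusion

At this point the code is fairly stable. Under automated deployment every stage runs without errors, performs the expected actions, and completes the sign-in.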

Bug Fix

Someone else got this error when running it:

2025-01-11 22:14:18,172 - ERROR - An error occurred while running: 
The browser connection fails.
Address: 127.0.0.1:44774
Tip: 
1, the user folder does not conflict with the open browser 
2, if no interface system, please add '--headless=new' startup parameter 
3, if the system is Linux, try adding '--no-sandbox' boot parameter 
The port and user folder paths can be set using ChromiumOptions.
Version: 4.1.0.17
2025-01-18 22:14:18,172 - INFO - Closing the browser...
Traceback (most recent call last):
  File "/home/runner/work/2dfan_autosign/2dfan_autosign/2dfan_DrissionPage.py", line 204, in <module>
    main()
  File "/home/runner/work/2dfan_autosign/2dfan_autosign/2dfan_DrissionPage.py", line 200, in main
    tab.close()
UnboundLocalError: local variable 'tab' referenced before assignment

Solution:
Add the following to the main program:

co = ChromiumOptions().set_paths(user_data_path=r'/tmp/chrome_user_data').auto_port()
co.incognito(True)  # enable incognito mode
co.set_argument('--no-sandbox')
co.set_argument('--disable-gpu')
co.set_argument('--disable-dev-shm-usage')

Solved!
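
The traceback above ends in an UnboundLocalError because the browser failed to start, so tab was never assigned by the time the finally block tried to close it. A common way to guard against this is to bind the variable before the try block. The sketch below shows the pattern with stand-in callables; run_with_cleanup, start_browser and task are hypothetical names and not part of the original script.

```python
import logging
from typing import Any, Callable


def run_with_cleanup(start_browser: Callable[[], Any], task: Callable[[Any], None]) -> bool:
    """Run `task` with a browser page and close the page only if it was created."""
    page = None  # bound before the try block, so `finally` can always read it
    try:
        page = start_browser()
        task(page)
        return True
    except Exception as exc:
        logging.error("Run failed: %s", exc)
        return False
    finally:
        if page is not None:
            page.close()
```

With this shape, a failed browser start is logged once and the cleanup step no longer raises a second, unrelated error.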