Back in the days when the zero-COVID policy prevailed, our university introduced a Daily Health Report system. Students and faculty were required to submit a daily online form detailing their health status. Noncompliance resulted in denied campus access and, in more stringent times, forced quarantine. Thanks to the comprehensive lockdown of campuses, our activities were strictly confined. Consequently, our daily data submissions were practically invariant. Filling in the form manually was a colossal waste of effort (with some anecdotes later on), so I opted to automate the process.

As the policies evolved, our school’s reporting platform also underwent changes. I had to update the reporting script multiple times, adding new features to keep pace with those of the reporting platform.

As with my previous article, there’s a significant difference between making something work and making it work elegantly. So in this article, I’ll share the infrastructure behind my automated daily report system and delve into some design options and decisions I made along the way.

The reporting script

Writing the script is the easiest part of the whole system and carries the least technical complexity. Anyone with basic scripting abilities can do it well, so I open-sourced mine. It only takes a few minutes to open the Developer Tools in your browser, identify the request originating from the [Submit] button, and copy its payload into a script, and it’s ready for service. If anything marginally fancy were to be added, it’d be saving certain data to a separate file so that others can adopt the script more easily.
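A minimal sketch of such a script, under stated assumptions: the platform accepts a form-encoded POST, and the per-user values live in a JSON file with hypothetical keys `submit_url`, `payload` and `headers`. The standard-library `urllib` is used here only to keep the sketch dependency-free. None of these names come from the actual open-sourced script.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path


def load_profile(path):
    """Read the per-user values, kept outside the script, from a JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def build_request(profile):
    """Turn the payload copied from the browser into a POST request (not sent yet)."""
    body = urllib.parse.urlencode(profile["payload"]).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Extra headers (for example a session cookie) are an assumption of this sketch.
    headers.update(profile.get("headers", {}))
    return urllib.request.Request(
        profile["submit_url"], data=body, headers=headers, method="POST"
    )


def submit(profile):
    """Send the report and return the HTTP status code."""
    with urllib.request.urlopen(build_request(profile), timeout=30) as response:
        return response.status
```

Keeping the payload in its own file means someone adopting the script only edits that file, not the code.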

The next thing is to run the script every day at a desired time. A common solution is cron, which is simple and easy. Systemd timers are a modern alternative offering more features at the expense of more complex configuration. I chose the latter for its RandomizedDelaySec option, so that the script won’t run at the exact same time every day.
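Systemd applies this delay itself, so the script needs no code for it. Purely as an illustration of the effect, the option delays each activation by an evenly distributed random amount between zero and the configured value, which can be modeled in Python as follows. The function name and the example values are hypothetical.

```python
import random
from datetime import datetime, timedelta


def randomized_run_time(base, max_delay_sec, rng=random):
    """Return base plus a uniformly random delay between 0 and max_delay_sec seconds."""
    return base + timedelta(seconds=rng.uniform(0, max_delay_sec))


# Example: a timer scheduled for 20:00 with up to one hour of random delay.
scheduled = datetime(2023, 2, 24, 20, 0, 0)
actual = randomized_run_time(scheduled, 3600)
```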

At the beginning I also had a sample GitHub Actions workflow file so that others could fork my repository and start automating their reports with minimal effort. However, I scrapped it later on, realizing it was against GitHub’s ToS.

[Figure: First step]

Status report

The next thing is to stay informed of whether the script is working properly. Logging in to the server and reading logs every day is not fun. Assuming that it worked and ending up being denied entry to the school is even worse. So it’d be nice to be notified of everything it does.

A common choice is email, but it’s not very timely. I chose Telegram because I actively use it and it provides a bot API. By adding python-telegram-bot and a few lines of code to the script, I can get a notification on Telegram every time the script runs.

My actual setup differs slightly, with an extra component between the script and the bot: an AWS Lambda serverless function. I did this for two reasons:

  • Minor reason: Telegram servers (api.telegram.org) are not directly accessible from mainland China for well-known reasons.
  • Major reason: I already have a GitHub webhook running on AWS Lambda. It is much less involved to add another URL handler to that function and reuse the existing code, such as credential handling and message formatting. This allows me to simplify the notification to a single requests.post.
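A relay of this kind can be sketched as a small Lambda handler that forwards the posted text to the Telegram Bot API `sendMessage` method. This is an illustration, not the actual function: the environment variable names, the `{"text": ...}` request shape, and the assumption that the event carries the request body as a plain string under `body` are all hypothetical.

```python
import json
import os
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_send_message(token, chat_id, text):
    """Build the Bot API sendMessage request without sending it."""
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{TELEGRAM_API}/bot{token}/sendMessage",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def handler(event, context):
    """Lambda entry point: relay the posted text to a Telegram chat."""
    payload = json.loads(event.get("body") or "{}")
    text = payload.get("text")
    if not text:
        return {"statusCode": 400, "body": "missing text"}
    # Credentials stay on the Lambda side; the variable names are assumptions.
    request = build_send_message(
        os.environ["TELEGRAM_TOKEN"], os.environ["TELEGRAM_CHAT_ID"], text
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return {"statusCode": response.status, "body": "ok"}
```

With something like this in place, the script side reduces to one call along the lines of `requests.post(RELAY_URL, json={"text": message})`, where `RELAY_URL` is a placeholder for the function's endpoint.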

[Figure: Second step]

As a bonus feature, I also send the error message and the line number in case of an exception, so that I can quickly identify the problem before investigating the logs.

[THU Checkin] Success: 2023-02-24 20:42:23
Checkin: Success
Apply: Success

[THU Checkin]Error: 2023-02-25 20:05:46
AttributeError: 'NoneType' object has no attribute 'group'
On checkin.py line 67
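Messages in this shape can be produced with the standard `traceback` module. The sketch below is an assumption about how such formatting might look, not the script's actual code. It reports the innermost frame of the traceback, which is where the exception was raised, and `format_report` is a hypothetical name.

```python
import os
import traceback
from datetime import datetime


def format_report(results, error=None, now=None):
    """Build the notification text for a successful or failed run."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    if error is None:
        lines = [f"[THU Checkin] Success: {stamp}"]
        lines += [f"{step}: {status}" for step, status in results.items()]
        return "\n".join(lines)
    lines = [f"[THU Checkin] Error: {stamp}", f"{type(error).__name__}: {error}"]
    frames = traceback.extract_tb(error.__traceback__)
    if frames:
        last = frames[-1]  # innermost frame: the line that raised
        lines.append(f"On {os.path.basename(last.filename)} line {last.lineno}")
    return "\n".join(lines)
```

Wrapping the whole run in a single `try`/`except Exception as exc` and passing `exc` to this function is enough to get the error type, message and line number into the notification.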

Uploading images

Sometime later, our school began to demand regular uploads of our health QR code. The QR code is generated by a governmental mobile app, and retrieving it is, unfortunately, difficult to automate. Rather than cross the line into producing fake QR codes, I decided to take the screenshots manually and have my script upload them to the reporting platform. The good news is that the platform has no measures to validate the uploaded images, so uploading an outdated screenshot has no consequences most of the time, and I don’t have to constantly update the screenshots for the script.

Image uploading is nothing new to the requests Python library, but I have to deliver the files from my phone somehow. Options for transferring files from an Android phone to a Linux server are abundant, and I found SMB the most convenient. Root Explorer is the file manager that I’ve been using for a decade, so I could just set up Samba on my server to receive the files from it.

[THU Checkin] Success: 2023-02-25 08:33:36
Checkin: Success
Apply: Success
Image 1: Skipped
Image 2: Success
Image 3: Success
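The upload step can be sketched with the `files=` argument of requests. The form field name `file`, the PNG content type, and the rule that a missing file is reported as "Skipped" are assumptions made for this illustration. The platform's real upload format and the script's real skip condition are not shown in this article.

```python
from pathlib import Path


def upload_images(session, upload_url, image_paths):
    """Upload each image that exists on disk; return a status label per slot."""
    statuses = {}
    for index, path in enumerate(map(Path, image_paths), start=1):
        label = f"Image {index}"
        if not path.is_file():
            # Assumption: a slot with no screenshot on the server is skipped.
            statuses[label] = "Skipped"
            continue
        with path.open("rb") as handle:
            response = session.post(
                upload_url,
                files={"file": (path.name, handle, "image/png")},
                timeout=30,
            )
        statuses[label] = "Success" if response.ok else f"Failed ({response.status_code})"
    return statuses
```

Here `session` would be a `requests.Session()` that is already logged in, and `image_paths` would point into the directory shared over Samba.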

[Figure: Third step]

Alternatively, I could have my Telegram bot accept the images and forward them to the server. This would be more convenient to use, but much less so to implement, as I didn’t have any existing code in my Telegram bot that handles images. Meanwhile, I already had Samba running on my server, so I didn’t actually have to set it up anew.

Securing the server

At this point everything is operational, with one detail missing: the SMB protocol is not known for being secure. Exposing the SMB port to the Internet invites trouble, and connecting to a VPN every time is inconvenient. Luckily, I have Clash for Android running on my phone 24/7, which I can use to proxy Root Explorer. I set up a shadowsocks-libev server and configured Clash to route traffic targeting my server through it, and then closed the SMB port in my server firewall.

There’s a noteworthy thing about Clash: it’s rule-based proxy software driven by a configuration file. My airport[1] service provides its configuration through a subscription URL, but Clash for Android doesn’t support editing a subscribed config. Another background story comes up here: I have another Lambda function serving as my own Clash config subscription. It fetches the airport config, modifies it to my preferences, and then serves it to Clash. It also makes updating the config easier, as I can just update the Lambda function code and the changes will be reflected in Clash.
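The modification step can be sketched as a pure function over the parsed config. It assumes the provider's config uses the `proxies` and `rules` top-level keys and that the YAML has already been parsed into a dict, for example with PyYAML's `yaml.safe_load`. The function name, proxy name and all values below are placeholders, not the real setup.

```python
import copy

# Placeholder Shadowsocks entry; every value here is hypothetical.
MY_PROXY = {
    "name": "home-ss",
    "type": "ss",
    "server": "203.0.113.10",
    "port": 8388,
    "cipher": "chacha20-ietf-poly1305",
    "password": "change-me",
}


def patch_clash_config(config, my_proxy, server_ip):
    """Return a copy of config with my_proxy added and a first-match rule for server_ip."""
    patched = copy.deepcopy(config)
    proxies = patched.setdefault("proxies", [])
    # Drop any stale entry with the same name so repeated patching is harmless.
    proxies[:] = [p for p in proxies if p.get("name") != my_proxy["name"]]
    proxies.append(dict(my_proxy))
    rule = f"IP-CIDR,{server_ip}/32,{my_proxy['name']}"
    rules = patched.setdefault("rules", [])
    if rule not in rules:
        # Clash evaluates rules top to bottom, so the new rule goes first.
        rules.insert(0, rule)
    return patched
```

The serving function would then dump the patched dict back to YAML (for example with `yaml.safe_dump`) and return it as the subscription response.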

Fun fact: My custom subscription is also used with Clash for Windows on my computer, which helped me completely bypass two RCE vulnerabilities (1, 2).

Conclusion

After all this complexity, here’s what I’ve got:

[Figure: Final state]

The script runs every day at a random time within a configured time span, and I get a notification on Telegram regardless of whether it succeeds or fails. If the script fails, I also have the information I need to look into it. The script also uploads the health QR code screenshots to the reporting platform, and I can update the images from my phone through a secured connection.

Of all these tasks, only taking the screenshots and uploading them to the server are manual, denoted in the image by blue arrows. All steps shown with black arrows are automated and require no attention to function.

As the zero-COVID policy came crumbling down in December 2022, our school also put an end to the daily health reporting system. As a result, I can safely share my setup here without fearing repercussions. I hope this article brings you some inspiration for your next automation project.

Anecdote

During the days around the strictest lockdown of campuses, all students’ requests to leave campus were manually reviewed by two levels of authority, with the second level being the dean. Our department has over 2,000 students, who kept submitting requests every day. Needless to say, many staff weren’t happy about this, the dean in particular. We were once asked to stop phoning her as she was already processing the requests from 7 AM to 11 PM every day. To everyone’s relief, the reviewing process was cancelled within a few days and requests were automatically approved thereafter.

  1. Shadowsocks service providers are commonly called “airports” because the icon of Shadowsocks is a paper plane, and every provider has multiple “plane servers” that you can use. 
