This repository contains the CMP3102 Major Project Original Source Code Submission for Temasek Polytechnic (TP).
The We-Fie Hunt: A Telegram Scavenger Adventure Project is a collaboration between the School of Informatics & IT and the School of Humanities & Social Sciences. It is a demonstration of how we can effectively leverage Generative Artificial Intelligence (AI) to enhance our TP Leadership Fundamentals (LEADFUN) subjects.
Through a Telegram Bot, students are able to carry out this group-based activity, collaborating in teams to take group selfies and hunt for specific locations around the school that match the tasks listed by the bot. This application then reduces our lecturers’ administrative workload by automating and streamlining the entire evaluation process.
- Streamlined Group Creation and Management
- Instant Photo Evaluation and Justification System
- Own-Time-Own-Target Task Checklist
- Automated Leaderboard Calculation
- Tutor Interface for Game Session Management
- Simple Housekeeping Buttons
Below are the specialised tools that we curated for our application. To keep the evaluation pipeline flexible and scalable, we use the OpenAI API as our Large Language Model provider.
| Functionality | Technologies | Description |
|---|---|---|
| Image Processing | Colour Space Conversion, Image Rescaling, Rotational Manipulation | Manipulation and conversion of images into a format that will allow the computer to extract meaningful information, enhance visual quality, or prepare images for further use. |
| Facial Recognition & Detection | InsightFace, Model Zoo, FaceAiSharp | Facial Detection is done to determine all the human faces present in order to identify how many members the group contains. Facial Recognition is then conducted, where the facial embeddings of each detected member are taken from the initial photo and saved into the database, to be mathematically compared with each facial embedding present in the submitted photos. |
| Template Matching / Feature Detection | LightGlue + SuperPoint, FLANN + SIFT + Lowe's Ratio Test | Template Matching, carried out here through Feature Detection, checks whether a template image is present in a target image by identifying and matching distinctive features (like points, edges, or textures) between the images. |
| Colour Detection | HSV Range Selection | Colour Detection involves the identification and isolation of different, specified colours in an image through a pixel-level analysis. |
| Voice Recognition | Resemblyzer | Voice Recognition involves the computer’s ability to receive and interpret dictation or to identify unique speakers. |
| Audio Transcription | WhisperModel | Audio Transcription is a speech-to-text function that allows for voice documentation through automatic speech recognition. |
| Video Processing | All of the Above | Video Processing, in our application, allows us to apply all the aforementioned specialised tools to videos as well. |
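
The facial recognition row above describes comparing stored embeddings against those found in a submitted photo. The following is a minimal sketch of that idea, not the project's actual code. It assumes cosine similarity as the comparison metric and a hypothetical threshold of 0.5, and it uses toy 3-dimensional vectors in place of real model embeddings, which have far more dimensions. The function and variable names are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def match_members(registered, submitted, threshold=0.5):
    """Return the registered members who appear in the submitted photo.

    registered: dict of member name -> embedding from the initial photo
    submitted:  list of embeddings detected in the submitted photo
    threshold:  hypothetical cut-off; a real system would tune this value
    """
    found = set()
    for name, reference in registered.items():
        if any(cosine_similarity(reference, emb) >= threshold for emb in submitted):
            found.add(name)
    return found


# Toy data: two registered members, two faces in the submitted photo.
registered = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
submitted = [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
present = match_members(registered, submitted)
print(sorted(present))
```

With this toy data only the first registered member is matched, since the second face in the submitted photo is not close to any stored embedding.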
- Backend Restrictions: AWS Fargate, a serverless compute engine for containers, is easier to use, but it also means that we cannot choose specific hardware. Thus, computations run on a single vCPU and are more time-consuming and costly.
- Database Restrictions: We use the free-tier RDS, which supports only a limited number of DB connections. As such, our backend's connection pool is capped at around 100 connections. If there are more than 100 simultaneous requests, the excess requests are stalled, though the delay is insignificant.
- Time Constraints: The application has yet to support voice processing, which includes functions such as facial tracking, audio transcription, and speaker diarization. However, these functions have been tested in isolation as evidence of their future potential. Potential future enhancements include integrating all these functions, an improved UXID, and more extensive testing to cover edge cases.
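
The connection cap described under Database Restrictions can be illustrated with a small sketch. This is not the backend's actual pooling code. It assumes a semaphore-style limit, uses a hypothetical `CappedPool` class, and sets a demo limit of 3 instead of the roughly 100 connections mentioned above so that the stalling behaviour is easy to observe.

```python
import threading
import time

DEMO_LIMIT = 3  # stand-in for the cap of around 100 described above


class CappedPool:
    """Hypothetical pool that lets at most `limit` callers work at once."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak = 0  # highest number of simultaneous users observed

    def run(self, work):
        # Callers beyond the limit block here until a slot frees up.
        with self._slots:
            with self._lock:
                self.in_use += 1
                self.peak = max(self.peak, self.in_use)
            try:
                return work()
            finally:
                with self._lock:
                    self.in_use -= 1


def fake_query():
    time.sleep(0.01)  # simulate a short database round trip


pool = CappedPool(DEMO_LIMIT)
threads = [threading.Thread(target=pool.run, args=(fake_query,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak simultaneous connections:", pool.peak)
```

All ten simulated requests complete, but never more than `DEMO_LIMIT` of them hold a slot at the same time; the rest wait briefly, which mirrors the stalling described above.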
The original workflow and repository have been handed over to TP; this is simply the forked source code.


