Yufeng Xing (a.k.a. Adam Edelweiss) is a master's student in data science at the University of San Francisco and a master's student in computer science at the Georgia Institute of Technology. He received his bachelor's degree in International Business from Sun Yat-sen University in China, and he spent a semester on exchange at Bocconi University in Italy.
He is currently working as a data engineering intern at a startup called BlueBoard. Before that, he was an analytics intern at Ernst & Young in China. He has participated in various analytics projects on footfall prediction and wildlife conservation. He has also joined the development of Dango-Translator, an open-source project that is currently focused on developing real-time OCR and translation models.
His current interests include artificial intelligence, data processing, machine learning, and deep learning. He is also interested in embedded systems, software development, game development, and information security.
Please contact him through the methods listed below.
The search engine project searches documents by keyword. For linear search, we walk through all the files and check whether each one contains the keywords. For index search, we build an index by traversing all the files once at the beginning; a query then returns the intersection of the document sets for its keywords. This project also implements hash table indexing, which makes lookups even more efficient.
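The difference between the two search strategies can be sketched as follows. This is a minimal illustration, not the project's code: the in-memory `DOCS` corpus, the whitespace tokenizer, and all function names are assumptions, and Python's built-in `dict` (itself a hash table) stands in for the project's own hash table implementation. The sketch also assumes a document matches only if it contains every keyword.

```python
from collections import defaultdict

# Hypothetical in-memory corpus standing in for files on disk.
DOCS = {
    "a.txt": "the quick brown fox",
    "b.txt": "the lazy dog",
    "c.txt": "quick dog tricks",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def linear_search(docs, keywords):
    """Scan every document on every query and keep those containing all keywords."""
    wanted = {k.lower() for k in keywords}
    return sorted(name for name, text in docs.items()
                  if wanted <= set(tokenize(text)))


def build_index(docs):
    """Traverse all documents once and map each word to the documents containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def index_search(index, keywords):
    """Intersect the per-keyword document sets; no document is rescanned."""
    sets = [index.get(k.lower(), set()) for k in keywords]
    if not sets:
        return []
    return sorted(set.intersection(*sets))
```

With the index built once, each query costs a few hash lookups plus a set intersection, whereas the linear version rereads every document each time.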
The recommendation system is based on a variant of word2vec called doc2vec, which converts a whole document into a vector for comparison. After the conversion, it selects the k nearest neighbors of the current article and recommends them. This project also has an online demo, but it can be very slow depending on the circumstances.
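The nearest-neighbor step can be sketched as below, assuming the doc2vec vectors have already been computed. The vectors in `VECTORS`, the article titles, and the choice of cosine similarity as the distance measure are all assumptions made for illustration; the project may use a different metric.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(vectors, current, k):
    """Return the titles of the k articles most similar to the current one."""
    query = vectors[current]
    scored = [(cosine_similarity(query, vec), title)
              for title, vec in vectors.items() if title != current]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]


# Hypothetical 3-dimensional document vectors; real doc2vec vectors are much larger.
VECTORS = {
    "python-intro": [0.9, 0.1, 0.0],
    "python-tips": [0.8, 0.2, 0.1],
    "cooking-pasta": [0.0, 0.1, 0.9],
    "baking-bread": [0.1, 0.0, 0.8],
}
```

For example, `recommend(VECTORS, "python-intro", 1)` picks the other Python article, because its vector points in nearly the same direction.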
GETFILE is a simple, FTP-like protocol used to transfer a file from one computer to another. Both the client and the server follow a boss-worker thread pattern. As a result, the server can handle multiple requests at a time, and the workload generator can download more than one file at a time. The implementation of this project is not publicly released.
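Since the implementation is not public, the following is only a generic sketch of the boss-worker pattern, not the project's code. The names `run_boss_worker` and `fake_fetch` are hypothetical, and the handler merely stands in for serving or downloading a file.

```python
import queue
import threading


def run_boss_worker(requests, handler, num_workers=4):
    """The boss enqueues requests; a fixed pool of workers handles them concurrently."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this worker
                break
            result = handler(item)
            with lock:  # protect the shared results dictionary
                results[item] = result

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for request in requests:  # the boss only hands out work, it never processes it
        tasks.put(request)
    for _ in threads:  # one sentinel per worker so every worker exits
        tasks.put(None)
    for t in threads:
        t.join()
    return results


def fake_fetch(path):
    """Hypothetical handler standing in for an actual file transfer."""
    return f"contents of {path}"
```

Because the boss never blocks on a single request, slow transfers occupy only one worker while the others keep draining the queue.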
The goal of this project is to gain experience with inter-process communication (IPC). The client's proxy is built with CURL's easy API, and the cache process runs on the same machine as the proxy and communicates with the client via shared memory. The data channel is separated from the command channel so that local files can be transferred efficiently. The implementation of this project is not publicly released.
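The idea of separating the command channel from the data channel can be illustrated with the sketch below. It is not the project's code: it uses Python's `multiprocessing.shared_memory` module, runs both sides in a single process for simplicity, and uses an ordinary queue as an assumed stand-in for the command channel. The function names are hypothetical.

```python
import queue
from multiprocessing import shared_memory


def cache_publish(data, commands):
    """Cache side: copy the payload into shared memory, then announce it."""
    # A segment must be at least one byte long, even for an empty payload.
    segment = shared_memory.SharedMemory(create=True, size=max(len(data), 1))
    segment.buf[:len(data)] = data
    # Command channel: only the small (name, length) message travels here.
    commands.put((segment.name, len(data)))
    return segment


def proxy_read(commands):
    """Proxy side: learn where the data lives, then read it from shared memory."""
    name, length = commands.get()
    segment = shared_memory.SharedMemory(name=name)  # attach to existing segment
    try:
        return bytes(segment.buf[:length])
    finally:
        segment.close()


def transfer(data):
    """Run both sides in one process and clean up the segment afterwards."""
    commands = queue.Queue()
    segment = cache_publish(data, commands)
    try:
        return proxy_read(commands)
    finally:
        segment.close()
        segment.unlink()
```

The bulk payload never passes through the command channel; only a short message describing where to find it does, which is what makes the transfer of local files cheap.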
The first part of this project builds a remote procedure call (RPC) service that performs the fetch file, store file, list files, and get information operations. The second part turns that gRPC service into a distributed file system (DFS) that provides coherency and atomicity. The implementation of this project is not publicly released.
PFG is an open-source generator that renders random text on random backgrounds. The font styles, colors, sizes, and positions are all chosen randomly. The tool is used to produce training data for deep learning OCR models for pixel fonts. The current version supports Japanese and English.
Job Compare 6300 is an Android app developed in Java with Android Studio. It can be used to track the details of job offers and to compare them. A SQLite database stores the data in the background, and you can download a demo using the following button. The user manual is available on GitHub.
This project is based on Cisco Packet Tracer. It implements a network using DNS, DHCP, OSPF (with a designated router and a backup designated router), VLAN, and VPN.