Projects

A selection of my projects

Journey of Water @ Walt Disney World

Interactive water attraction inspired by Moana

July, 2019

Conducted computer vision, deep learning, and controls R&D for this attraction. So happy to see it bringing joy to so many people!

CLVR Jaco Play Dataset

Dataset of diverse robot trajectories from CLVR @ USC

April, 2023

Diverse multimodal dataset of 1085 teleoperated robot episodes. Includes multi-view camera observations, cartesian position and velocity data, and natural language annotations. Tasks include picking and placing of various items (of various categories) and receptacles.

Improving Language Understanding via Privileged Multimodal Training

Final Project for CSCI 535 (Multimodal and Probabilistic Learning for Human Communications)

May, 2023

Improving Language Understanding via Privileged Multimodal Training

Novel method to improve the performance of downstream unimodal language understanding tasks by leveraging multimodal data during training; employing optimization towards the task-specific objective while regularized against a multimodal teacher network in latent space.

LCMVAE: Language-conditioned Masked Variational Autoencoder

Final Project for CSCI 566 (Deep Learning Applications)

May, 2022

LCMVAE: Language-conditioned Masked Variational Autoencoder

A simple solution for multimodal representation learning, which takes advantage of modality-specific pre-trained encoders and a flexible, lightweight, latent-mixing network to effectively generate meaningful multimodal representations.

A deep learning approach to the screening of malaria infection: Automated and rapid cell counting, object detection and instance segmentation using Mask R-CNN

Published in Computerized Medical Imaging and Graphics

March, 2021

A deep learning approach to the screening of malaria infection: Automated and rapid cell counting, object detection and instance segmentation using Mask R-CNN

Application of a deep-learning model, Mask R-CNN, trained to identify healthy and Plasmodium-infected red blood cells. Capable of generating segmentation masks on top of bounding box classifications for immediate visualization and stage-specific identification. Potential to reduce errors common in manual counting through subsequent standardization.

Reward-Induced Representation Learning

CLVR implementation challenge

October, 2021

Pre-trained predictive models on various tasks to facilitate RL training on a novel downstream task of the same distribution. The resulting reward-induced representation proved useful in both speeding up downstream PPO training, and improving final rollout performance.

Lilypod

Undergraduate Mechatronics Capstone Project

July, 2020

AI-powered water monitoring system. Measures parameters of its surrounding environment, collects samples of the pollutants, and performs data analytics in the cloud to produce valuable information.

NaviGAIT

Vision for the blind, through positional sound

April, 2019

Final project for 30.007 Engineering Design Innovation course at SUTD. An ultrasonic-based wearable for the blind, using 3D positional audio feedback to assist in obstacle avoidance, and IMU-based fall detection and notification functionality over Twilio API. Won Best Demonstration Award at public project exhibition; awarded by industry guests.

Automatic Watermark Removal

Final Project for CS 484 (Computational Vision)

December, 2019

Given a set of images with the same watermark, our method segments watermark boundaries (employing graph cut) and subsequently estimates the watermark's parameters (color + alpha-blending constant). It then removes the watermark using an inverse watermarking transform.

Portrait AutoColorizer

Final Project for SYDE 522 (Machine Intelligence)

April, 2020

This one started because I wanted to colorize my grandparents' old portraits. Our AutoColorizer fine-tunes a VGG16 network pretrained on ImageNet. It is then trained on a dataset of 200,000 celebrity portraits. We formulate the colorization process as the prediction of the AB channels given the L channel of an image in the LAB color space.

Jullian Yapeter