How We Won Best Google Cloud at HackHarvard: Building ReAlive

hackathon, google-cloud, project

Two months ago, I was eating pani puri on Brigade Road in Bangalore. Now I'm standing in a lecture hall at Harvard University, watching a judge nod while my teammate demos an app that generates soundscapes for old photographs. We won Best Google Cloud. At Harvard. In my first US hackathon.

I need to talk about this.

The Idea

My teammate pulled out his phone and showed us a black-and-white photo of his grandfather standing in a garden. "What if," he said, "you could hear what this photo sounded like?" Not just generic background music. Actually hear the birds in that garden, the wind through those specific trees, the ambient sounds of that era and place.

That's ReAlive. An AI pipeline that takes an old photograph and generates a contextually accurate audio experience for it, bringing the image to life through sound.

The Technical Pipeline

We had 36 hours, so every architectural decision was about speed and feasibility.

Depth Mapping. We used MiDaS (from Intel ISL) to estimate a depth map from the single input photograph. This gave us a rough 3D understanding of the scene: what's in the foreground, what's in the background, and how the space is structured. For spatial audio, this is essential. A bird in the background should sound distant. A stream in the foreground should feel close.
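
To make the foreground/background idea concrete, here is a minimal sketch of turning a depth map into a per-object distance estimate. It assumes the depth map is a 2D grid of relative inverse depth values (larger means closer), which is the kind of output MiDaS produces. The function name, the box format, and the toy values are hypothetical illustrations, not our hackathon code.

```python
from statistics import median

def relative_distance(depth_map, box):
    """Return 0.0 (closest) to 1.0 (farthest) for a region of a depth map.

    depth_map: 2D list of relative inverse depth (larger value = closer).
    box: (left, top, right, bottom) pixel bounds; right and bottom are exclusive.
    """
    left, top, right, bottom = box
    all_values = [v for row in depth_map for v in row]
    lo, hi = min(all_values), max(all_values)
    if hi == lo:
        return 0.5  # flat map carries no depth information
    region = [v for row in depth_map[top:bottom] for v in row[left:right]]
    if not region:
        raise ValueError("box does not overlap the depth map")
    # Median is less sensitive than the mean to a few stray pixels in the box.
    closeness = (median(region) - lo) / (hi - lo)
    return 1.0 - closeness

# Toy 4x4 map: the top rows are far away, the bottom rows are close.
depth = [
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
    [8.0, 8.0, 9.0, 9.0],
    [8.0, 8.0, 9.0, 9.0],
]
far_bird = relative_distance(depth, (0, 0, 2, 2))
near_stream = relative_distance(depth, (2, 2, 4, 4))
```

In this toy map the top-left region comes out as the farthest point and the bottom-right as the closest, which is the ordering a distant bird and a nearby stream would need.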

Scene Recognition. Google Cloud Vision API for object and scene detection. We identify elements in the photo: trees, water, buildings, people, animals, vehicles. Each detected element maps to a sound category. A garden with trees and birds gets wind and birdsong. A street scene with cars gets traffic ambience. We built a mapping layer that translates vision labels to audio categories with confidence weighting.
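
A mapping layer like this can be sketched in a few lines. The version below is an illustration under assumptions, not the table we shipped: the label names, sound categories, and the `min_score` threshold are all hypothetical. It takes `(description, score)` pairs, the two fields a Vision API label annotation carries, and returns normalized weights per sound category.

```python
from collections import defaultdict

# Hypothetical label-to-sound table; real Vision API labels vary by image.
LABEL_TO_SOUND = {
    "tree": "wind_in_leaves",
    "bird": "birdsong",
    "water": "stream",
    "car": "traffic",
    "street": "traffic",
}

def sound_weights(labels, min_score=0.5):
    """Aggregate (description, score) pairs into normalized sound weights.

    Labels below min_score, or with no entry in the table, are ignored.
    Several labels may feed the same sound category; their scores add up.
    """
    totals = defaultdict(float)
    for description, score in labels:
        sound = LABEL_TO_SOUND.get(description.lower())
        if sound is not None and score >= min_score:
            totals[sound] += score
    total = sum(totals.values())
    if total == 0:
        return {}
    return {sound: weight / total for sound, weight in totals.items()}

garden = sound_weights(
    [("Tree", 0.9), ("Bird", 0.6), ("Fence", 0.95), ("Water", 0.3)]
)
```

Here "Fence" is dropped because nothing maps to it, and "Water" is dropped for low confidence, so the garden ends up mostly wind with some birdsong.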

Audio Synthesis. This was the trickiest part. We combined the Google Cloud Text-to-Speech API for narration elements with a curated library of spatial audio samples mapped to scene categories. The depth map drove the spatial positioning: we applied stereo panning and attenuated each sound's volume based on its estimated distance from the camera.
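
For a sense of what panning plus distance attenuation looks like in code, here is a minimal sketch. It assumes each sound source has a horizontal position and a distance, both normalized to the 0 to 1 range, and it uses a constant-power pan with linear attenuation. Those formulas, the function name, and the `min_gain` floor are my assumptions for illustration; the hackathon code may have differed.

```python
import math

def stereo_gains(x_position, distance, min_gain=0.1):
    """Return (left_gain, right_gain) for one sound source.

    x_position: 0.0 (left edge of the photo) to 1.0 (right edge).
    distance: 0.0 (closest to the camera) to 1.0 (farthest).
    min_gain: volume floor so the farthest sounds stay faintly audible.
    """
    x = min(max(x_position, 0.0), 1.0)
    d = min(max(distance, 0.0), 1.0)
    # Linear attenuation: full volume up close, min_gain at the far end.
    volume = 1.0 - (1.0 - min_gain) * d
    # Constant-power pan: left^2 + right^2 stays equal to volume^2.
    angle = x * math.pi / 2
    return volume * math.cos(angle), volume * math.sin(angle)

center_close = stereo_gains(0.5, 0.0)
left_far = stereo_gains(0.0, 1.0)
```

A source in the middle of the frame gets equal gain in both channels, and a far-away source on the left edge plays quietly in the left channel only.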

The Glue: Google Cloud. Cloud Functions for the serverless backend. Cloud Storage for uploaded images and generated audio. Vision API and Text-to-Speech API for the AI heavy lifting. Cloud Run for the Streamlit frontend. All of it tied together with Python and held together with caffeine.

The 36-Hour Timeline

Hours 0-4: Brainstorming, rejecting three other ideas, settling on ReAlive. We were a team of four: two ML people (including me), one backend, one frontend-ish.

Hours 4-12: I built the depth mapping and scene recognition pipeline. Getting MiDaS to run fast enough for a demo was its own battle. We ended up using the smaller MiDaS v2.1 model and pre-computing depth maps.

Hours 12-24: Integration hell. The audio synthesis pipeline kept producing outputs that sounded like a haunted house instead of a peaceful garden. We rewrote the scene-to-sound mapping three times. At hour 18, someone accidentally deleted our Cloud Functions deployment. Good times.

Hours 24-33: Polish, bug fixes, slide deck. The demo came together around hour 30. We ran it on five different photos and it actually worked. The old garden photo produced gentle wind, distant birdsong, and soft rustling. I got chills.

Hours 33-36: Practice the pitch. Sleep was not involved.

The Demo Moment

When we played the generated audio for that garden photo in front of the judges, the room went quiet. One judge took off her headphones and said, "That's really beautiful." In a hackathon full of technically impressive projects, that emotional reaction is what set us apart.

What It Meant

Look, winning a prize at a hackathon is great for the resume. But for me, this was about something bigger. Two months into living in a completely new country, still figuring out how Fahrenheit thermostats work, still missing my mom's cooking, still getting used to the fact that "how are you" is a greeting and not a question. Winning at Harvard felt like proof that I belong here. That the bet of leaving India for grad school is paying off.

We also mentored at HackGT9 and won Best AI/ML at HackUMass with Meta-Identity a few weeks later. October was a good month. I eventually distilled everything I learned into a post about how to win hackathons. But HackHarvard and ReAlive will always be the one I remember first.