Hello there!
I’m Gopika, a Master's student in the Erasmus Mundus Joint Master in Artificial Intelligence (EMAI) at Universitat Pompeu Fabra (UPF).
My research follows two parallel branches. The first is the foundations of AI and representation learning, covering latent-space modeling, probabilistic methods, and deep neural architectures. The second is applications in audio processing and music information retrieval (MIR), where I apply these techniques to analyze rhythm, timbre, and the evolution of musical style.
I’m a lifelong learner, always eager to explore new ideas through my research, science communication, and personal hobbies. In my free time, I enjoy reading, birdwatching, playing stringed instruments, and stargazing.
If you want to reach out, feel free to contact me at gk1656@nyu.edu!
Research and Projects
This project began as part of my Post-graduate Practical Training Program (PPTP) with the Music and Sound Cultures Research Group at NYU Abu Dhabi. I developed a Temporal Convolutional Network (TCN)–based stroke-transcription pipeline for the mridangam and designed complementary konnakol (spoken percussion) alignment methods to handle expressive timing and speed changes. The system addresses key challenges such as sparse onsets and nested rhythmic layers, and the results were presented at the 2025 IEEE ICASSP workshops (SALMA & WIMAGA) as part of ongoing work on rhythm-aware transcription.
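A TCN of this kind typically outputs a frame-wise activation curve, and turning that curve into discrete stroke onsets comes down to peak picking. Below is a minimal sketch of that post-processing step; the threshold, minimum-gap, and toy activation values are illustrative assumptions, not the settings used in the actual pipeline.

```python
def pick_onsets(activations, threshold=0.5, min_gap=3):
    """Select frames that are local maxima above `threshold`,
    enforcing a minimum gap (in frames) between consecutive onsets."""
    onsets = []
    for i in range(1, len(activations) - 1):
        a = activations[i]
        if a < threshold:
            continue
        # local maximum w.r.t. immediate neighbours
        if a >= activations[i - 1] and a >= activations[i + 1]:
            if not onsets or i - onsets[-1] >= min_gap:
                onsets.append(i)
    return onsets

# toy activation curve with two clear peaks
acts = [0.1, 0.2, 0.9, 0.3, 0.1, 0.1, 0.7, 0.8, 0.2, 0.1]
print(pick_onsets(acts))  # → [2, 7]
```

In practice the interesting work is upstream of this step: sparse onsets and expressive tempo changes mean the activation curve itself must already be robust before any peak picker can help.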
In this project, my teammates and I worked with Zurich Insurance to improve the reliability of confidence scores from large language models. We built a Gaussian Process Classifier to calibrate transformer outputs after inference, combining signals such as logits, verbal self-confidence, and generation consistency. I implemented the calibration pipeline and evaluation framework and ran experiments on the Kleister-NDA dataset. Our method outperformed raw logits and verbal confidence on Expected Calibration Error, Negative Log-Likelihood, and Brier Score, delivering trustworthy confidence estimates for legal-document information extraction. Awards: University of Ljubljana Data Science Competition 2025 and UCL Centre for Doctoral Training in AI Research Showcase.
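For context on two of those metrics: Expected Calibration Error bins predictions by confidence and measures the gap between average confidence and empirical accuracy per bin, while the Brier Score is the mean squared error between confidence and the 0/1 outcome. A small illustrative implementation (the bin count and toy data are made up, not from our experiments):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average gap between mean confidence and
    accuracy within equal-width confidence bins."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == 0.0)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += len(idx) / n * abs(avg_conf - accuracy)
    return ece

def brier_score(confidences, correct):
    """Mean squared error between confidence and the 0/1 outcome."""
    return sum((c - y) ** 2 for c, y in zip(confidences, correct)) / len(correct)

# overconfident toy predictions: high confidence, mixed correctness
conf = [0.95, 0.9, 0.85, 0.9]
hit = [1, 0, 1, 0]
print(expected_calibration_error(conf, hit), brier_score(conf, hit))
```

Both numbers are large for this toy model because it reports ~90% confidence while being right only half the time; a post-hoc calibrator's job is to shrink exactly that gap.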
In this project, my teammates and I investigated how to make large language models more reliable when the wording of a prompt changes. We tested inference-time strategies—such as Chain-of-Thought, Self-Consistency, and Self-Refinement—and fine-tuned models using parameter-efficient LoRA on groups of paraphrased prompts. I contributed to the design of the experiments and the evaluation pipeline, using metrics like the Prompt Sensitivity Index (POSIX) and an LLM-as-a-Judge scoring method to compare models including LLaMA-2, Mistral, and Falcon, highlighting how different techniques affect robustness and answer quality.
This project, stemming from a collaboration with Prof. Minsu Park at NYU Abu Dhabi, evolved into my capstone thesis. It uses machine learning techniques to analyze the evolution of popular music in the United States, uncovering continuous change punctuated by occasional radical shifts. The work challenges the role of genre as the sole driver of that change and highlights the multifaceted nature of musical evolution.
In this project, my teammates and I examined how Europe’s high-voltage power grid responds to structural and functional attacks using advanced network-science techniques. We built a role-aware model that separates generation, transmission, and conversion stations, then simulated targeted node removals using degree, betweenness, closeness, and PageRank centrality. I helped design and run the experiments, tracking structural collapse through the largest connected component and functional failure via the percentage of unserved nodes. We also recreated the April 2025 Iberian blackout, showing how cascading failures from generator trips and France–Iberia disconnection led to total regional collapse.
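The core of the structural-attack simulation is a simple loop: repeatedly delete the highest-ranked node and measure the largest connected component that remains. Here is a toy version using degree as the ranking; the actual experiments also used betweenness, closeness, and PageRank on the real grid topology, and the five-node example below is invented for illustration.

```python
from collections import defaultdict

def largest_component(nodes, edges):
    """Size of the largest connected component, found by DFS."""
    adj = defaultdict(set)
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

def degree_attack(nodes, edges):
    """Remove nodes by current degree; record LCC size after each removal."""
    nodes = set(nodes)
    sizes = []
    while nodes:
        degree = {n: 0 for n in nodes}
        for u, v in edges:
            if u in nodes and v in nodes:
                degree[u] += 1
                degree[v] += 1
        target = max(nodes, key=lambda n: degree[n])
        nodes.discard(target)
        sizes.append(largest_component(nodes, edges))
    return sizes

# toy grid: a hub connected to four stations, plus one short chain
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("d", "e")]
print(degree_attack({"hub", "a", "b", "c", "d", "e"}, edges))
```

Even this toy run shows the characteristic pattern: removing the hub immediately fragments the network, which is why centrality-targeted attacks collapse grids far faster than random failures.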
In this project, my teammates and I explored how reinforcement learning can improve marketing uplift modeling. We reproduced and extended the R-Lift algorithm, framing the problem as a Markov Decision Process and training a neural policy-gradient model to target users most likely to respond to treatment. I implemented experiments on the Hillstrom and Criteo datasets, analyzed Qini curves and coefficients to measure incremental lift, and addressed challenges such as treatment-control imbalance and large-scale data handling. Our results showed higher Qini coefficients than standard baselines, but also revealed stability issues with fluctuations below the random baseline, highlighting the trade-offs in RL-based uplift methods.
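For readers unfamiliar with the metric: a Qini curve sorts users by predicted uplift and tracks the cumulative incremental responses over the control group, and the Qini coefficient is the area between that curve and the random-targeting line. Below is a minimal sketch assuming binary outcomes and a treatment flag; normalization conventions vary across papers, so this follows one common variant, and the toy data is invented.

```python
def qini_curve(scores, treated, outcome):
    """Cumulative incremental gain, sorting users by predicted uplift.
    Qini(k) = Y_t(k) - Y_c(k) * N_t(k) / N_c(k), one common convention."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    yt = yc = nt = nc = 0
    curve = [0.0]
    for i in order:
        if treated[i]:
            nt += 1
            yt += outcome[i]
        else:
            nc += 1
            yc += outcome[i]
        curve.append(yt - yc * nt / nc if nc else float(yt))
    return curve

def qini_coefficient(scores, treated, outcome):
    """Average gap between the Qini curve and the straight line
    from the origin to the curve's endpoint (random targeting)."""
    curve = qini_curve(scores, treated, outcome)
    n = len(curve) - 1
    random_line = [curve[-1] * k / n for k in range(n + 1)]
    return sum(c - r for c, r in zip(curve, random_line)) / n

# toy data: the model correctly ranks the user who responds under treatment
scores = [0.9, 0.8, 0.2, 0.1]
treated = [1, 0, 1, 0]
outcome = [1, 0, 0, 0]
print(qini_coefficient(scores, treated, outcome))
```

The "fluctuations below the random baseline" we observed correspond to stretches where this curve dips under the straight line, i.e. where targeting by the learned policy was briefly worse than targeting at random.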
At the Modern Microprocessor Architectures Lab at NYUAD, I researched backdoor attacks on facial recognition systems. I built a state-of-the-art facial recognition system and conducted data-poisoning experiments by altering facial attributes. The results of this work were published in IEEE Transactions on Biometrics, Behavior, and Identity Science.
In this paper, my teammate and I evaluate the ability of the XLM-R model to learn and transfer grammatical knowledge from a source language (English) to four similar and dissimilar target languages (German, Hebrew, French, and Russian). Furthermore, we test the model on a low-resource language (Nepali).