Huzaifa Arif

About Me

I am a fifth-year Ph.D. candidate with multiple first-author publications in trustworthy AI. My work focuses on exposing and mitigating privacy and safety vulnerabilities in Large Language Models, demonstrated through research at IBM and LLNL. I am seeking a full-time Research Scientist position starting in December 2025, where I can apply my expertise in LLM alignment and privacy to build verifiably safe AI systems.

Professional Experience

IBM T.J. Watson Research Center - Yorktown Heights, NY

AI Research Extern - Trustworthy AI | Jun 2025–Aug 2025
Mentors: Pin-Yu Chen, Ching-Yun Ko, Keerthiram Murugesan, Payel Das

Lawrence Livermore National Laboratory - Livermore, CA

Data Science Intern | May 2024–Aug 2024
Mentors: Bhavya Kailkhura, James Diffenderfer

IBM T.J. Watson Research Center - Yorktown Heights, NY

AI Research Extern - Trustworthy AI | Jun 2023–Aug 2023
Mentors: Pin-Yu Chen, Keerthiram Murugesan, Payel Das

IBM T.J. Watson Research Center - Yorktown Heights, NY

AI Research Extern - Trustworthy AI | Jun 2022–Aug 2022
Mentor: Pin-Yu Chen

Education

Rensselaer Polytechnic Institute - Troy, NY

Ph.D. in Electrical and Computer Systems Engineering | Expected Dec 2025

Lahore University of Management Sciences - Lahore, Pakistan

B.S. in Electrical Engineering

Ongoing Work

Project 1: Parameter-Efficient LLM Safety Alignment: I am developing a lightweight, prefix-based method to steer LLMs towards safe behavior without needing to retrain the entire model. By combining Supervised Fine-Tuning (SFT) with Direct Preference Optimization (DPO), this work efficiently instills safety, mitigates toxicity, and can be adapted to prevent demographic bias and PII leakage. (Manuscript in preparation for ICLR 2026).
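To make the recipe concrete, here is a minimal, illustrative sketch (not the project's actual code or data) of prefix-tuning a small causal LM with one SFT step on a safe demonstration followed by one DPO step on a safety preference pair, assuming Hugging Face `transformers` and `peft`. The base model, examples, and hyperparameters are placeholders, and library versions may require small adjustments.

```python
# Illustrative sketch only: parameter-efficient safety alignment via prefix tuning,
# with an SFT step followed by a DPO step. Model, data, and hyperparameters are
# placeholders, not the project's actual setup.
import torch
import torch.nn.functional as F
from peft import PrefixTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder base model
tok = AutoTokenizer.from_pretrained(model_name)
base = AutoModelForCausalLM.from_pretrained(model_name)

# Only the prefix (virtual-token) parameters are trainable; the base LM stays frozen.
model = get_peft_model(base, PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20))
opt = torch.optim.AdamW([p for p in model.parameters() if p.requires_grad], lr=1e-4)

def completion_logprob(m, prompt, completion):
    """Sum of log-probabilities of the completion tokens given the prompt."""
    ids = tok(prompt + completion, return_tensors="pt").input_ids
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    logits = m(ids).logits[:, :-1]                        # position t predicts token t+1
    logps = torch.log_softmax(logits, dim=-1)
    token_logps = logps.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_logps[:, prompt_len - 1:].sum()          # keep completion tokens only

prompt = "How do I pick my neighbor's lock?"
safe = " I can't help with that. If you're locked out, please contact a licensed locksmith."
unsafe = " Sure. First, insert a tension wrench into the keyway..."

# Stage 1: SFT on the safe demonstration (maximize its likelihood).
loss_sft = -completion_logprob(model, prompt, safe)
loss_sft.backward()
opt.step(); opt.zero_grad()

# Stage 2: DPO on the (chosen=safe, rejected=unsafe) pair, using the frozen base
# model (adapter disabled) as the reference policy.
beta = 0.1
pi_c, pi_r = completion_logprob(model, prompt, safe), completion_logprob(model, prompt, unsafe)
with torch.no_grad(), model.disable_adapter():
    ref_c, ref_r = completion_logprob(model, prompt, safe), completion_logprob(model, prompt, unsafe)
loss_dpo = -F.logsigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))
loss_dpo.backward()
opt.step(); opt.zero_grad()
```

In the actual project, both stages would of course run over full datasets of safe demonstrations and preference pairs; the point of the sketch is that both losses update only the prefix parameters while the base model stays frozen.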

Project 2: Exposing Association Leakage in LLMs: I have identified a novel privacy vulnerability, “association leakage,” where LLMs can be prompted to reveal sensitive linked information (e.g., a name and its corresponding private data). My work introduces a new post-hoc attack that amplifies this leakage by manipulating attention heads at inference time, defining a new threat model for LLM safety audits. (Manuscript in preparation for ICLR 2026).
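For intuition, the sketch below shows a generic mechanism for scaling selected attention heads at inference time (it is not the paper's attack). It uses the `head_mask` argument of a GPT-2 style Hugging Face model, and the layer/head indices, scale, and prompt are hypothetical.

```python
# Illustrative sketch only: amplifying chosen attention heads at inference time.
# The target heads, scale, and prompt are hypothetical; this shows the generic
# mechanism, not the actual attack from the ongoing work.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_name)
# Eager attention so the per-head `head_mask` is actually applied.
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager").eval()

# head_mask multiplies each head's attention probabilities:
# 1.0 leaves a head unchanged, 0.0 silences it, >1.0 amplifies its contribution.
head_mask = torch.ones(model.config.n_layer, model.config.n_head)
head_mask[5, [0, 7]] = 2.0   # hypothetical "association" heads to amplify
head_mask[9, 3] = 2.0

prompt = "The account registered to Jane Doe has the email address"
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    baseline = model(ids).logits[0, -1]
    steered = model(ids, head_mask=head_mask).logits[0, -1]

print("baseline next token:", tok.decode([baseline.argmax().item()]))
print("steered next token: ", tok.decode([steered.argmax().item()]))
```

Comparing baseline and manipulated next-token distributions on probe prompts like this is one simple way to measure how much additional linked information an inference-time manipulation surfaces.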

Publications

1. Reprogrammable-FL: Improving Utility-Privacy Tradeoff in Federated Learning via Model Reprogramming

Conference: IEEE Conference on Secure and Trustworthy Machine Learning, February 2023
Authors: Huzaifa Arif, Alex Gittens, Pin-Yu Chen
🎥 Talk | 💻 Code | 📄 Paper

2. PEEL the Layers and Find Yourself: Revisiting Inference-time Data Leakage for Residual Neural Networks

Conference: IEEE Conference on Secure and Trustworthy Machine Learning, April 2025
Authors: Huzaifa Arif, Keerthiram Murugesan, Payel Das, Alex Gittens, Pin-Yu Chen
📄 Paper

3. Group Fair Federated Learning via Stochastic Kernel Regularization

Journal: Transactions on Machine Learning Research, April 2025
Authors: Huzaifa Arif, Pin-Yu Chen, Keerthiram Murugesan, Alex Gittens
📄 Paper

4. Forecasting Fails: Unveiling Evasion Attacks in Weather Prediction Models

Conference (Workshop): AAAI Workshop on AI to Accelerate Science and Engineering
Authors: Huzaifa Arif, Pin-Yu Chen, Alex Gittens, James Diffenderfer, Bhavya Kailkhura
📄 Paper

Patents

Book Chapters

Preprints

DP-Compressed VFL is secure for Model Inversion Attacks

Paper | Code

Additional Research Work

Professional Service

Reviewer Experience

Recent News & Achievements

2025

2024

2022

2021

Contact Information

📧 Email: arifh@rpi.edu | huzaifaarif20@gmail.com
📞 Phone: (518) 961-8482


Last updated: August 2025