Yifan Wu

Hi there! I'm a Research Scientist at Meta GenAI, working on multimodal foundation models for Meta AI.

I recently earned my PhD from the University of Pennsylvania, where I was affiliated with the UPenn GRASP Lab and the Penn Image Computing and Science Laboratory. I am grateful to have been advised by Prof. James C. Gee and to have worked closely with Prof. Jianbo Shi and Prof. Mark Yatskar.

Email: yfwu@seas.upenn.edu  /  Google Scholar  /  LinkedIn

Selected Publications
A Concept-based Interpretable Model for the Diagnosis of Choroid Neoplasias using Multimodal Data
Yifan Wu*, Yang Liu*, Yue Yang, Michael S. Yao, Wenli Yang, Xuehui Shi, Lihong Yang, Dongjun Li, Yueming Liu, James C. Gee, Xuan Yang, Wenbin Wei, Shi Gu
arXiv, 2024, under review.
Paper  /  Demo

We demonstrated how to encode the expertise of specialized clinicians into AI to build an interpretable machine learning model that produces outputs understandable by humans.

The Role of Chain-of-Thought in Complex Vision-Language Reasoning Task
Yifan Wu, Pengchuan Zhang, Wenhan Xiong, Barlas Oguz, James C. Gee, Yixin Nie
arXiv, 2023
Paper

We found that GPT-4V benefits significantly from Chain-of-Thought prompting. We present the "Description then Decision" strategy, which improves Winoground task performance by 50%.

Towards Establishing Dense Correspondence on Multiview Coronary Angiography: From Point-to-Point to Curve-to-Curve Query Matching
Yifan Wu*, Rohit Jena*, Mehmet Gulsun, Vivek Singh, Puneet Sharma, James C. Gee
arXiv, 2023, under review.
Paper

We established dense correspondence in multi-view angiography by formulating it as a query matching problem and extending point matching to curve matching for enhanced topological awareness.

NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration
Yifan Wu*, Tom Z Jiahao*, Jiancong Wang, Paul A Yushkevich, M Ani Hsieh, James C. Gee
CVPR, 2022  
Project Page  /  Paper  /  Supplementary

We model each voxel as a moving particle and consider the set of all voxels in a 3D image as a high-dimensional dynamical system whose trajectory determines the targeted deformation field.

Interpretable Identification of Interstitial Lung Disease (ILD) Associated Findings from CT
Yifan Wu, Jiancong Wang, William D. Lindsay, Tarmily Wen, Jianbo Shi, and James C. Gee
MICCAI, 2020  
Paper

We formulated the identification of radiologic ILD findings as a multi-class classification problem on raw thoracic CT data.

From Image to Video Face Inpainting: Spatial-Temporal Nested GAN (STN-GAN) for Usability Recovery
Yifan Wu, Vivek Singh, Ankur Kapoor
WACV, 2020  
Paper  /  Video Result

We propose constrained inpainting methods to recover the usability of corrupted images: the images are masked for privacy protection, but complete images are required for further algorithm development.

Towards Generating Personalized Volumetric Phantom from Patient's Surface Geometry
Yifan Wu, Vivek Singh, Brian Teixeira, Kai Ma, Birgi Tamersoy, Andreas Krauss, and Terrence Chen
MICCAI, 2019  
Paper

This paper presents a method to generate a volumetric phantom with internal anatomical structures from the patient's skin surface geometry.

Privacy-Protective-GAN for Face De-identification
Yifan Wu, Fan Yang, and Haibin Ling
arXiv, 2018
Paper

We defined the face de-identification task by establishing an effective de-identification measure: achieving privacy protection while preserving data utility. We proposed an end-to-end trainable framework to synthesize de-identified facial images.

Experiences
ibm Research Scientist Intern, May, 2023 - Nov, 2023, Menlo Park, CA, USA
ibm Research Scientist Intern, May, 2017 - Dec, 2018, Princeton, NJ, USA

Teaching
upenn Fall 2022: CIS537 Biomedical Image Analysis, Teaching Assistant.
upenn Fall 2021: CIS581 Computer Vision and Computational Photography, Teaching Assistant.

Website template from Jon Barron.