• Hi! I'm Lv Zhen

    I'm currently a senior algorithm expert in computer science at Alibaba.

    My research interests lie in computer vision and deep learning, with a focus on 3D vision and multimodal models.

Education

Ph.D., Wuhan University
2011 - 2018
Photogrammetry and Remote Sensing

B.S., Wuhan University
2007 - 2011
Remote Sensing Science and Technology

Publications

An Adaptive Multifeature Sparsity-Based Model for Semiautomatic Road Extraction From High-Resolution Satellite Images in Urban Areas

IEEE Geoscience and Remote Sensing Letters (GRSL), 2017

[Paper]

Joint image registration and point spread function estimation for the super-resolution of satellite images

Signal Processing: Image Communication (SPIC), 2017

[Paper]

A New Change Detection Method of Remote Sensing Image

Geomatics and Information Science of Wuhan University, 2016

[Paper]

Work Experience

  • Led the development of speech-driven facial animation, similar to Lipsync.
  • Drove and launched a project that delivered in-the-wild 3D hand tracking and virtual scene interaction with Unreal Engine on mobile.
  • Built an AI music creation application covering lyric generation, singing voice synthesis, and singing voice conversion.
  • Built a dance motion creation application based on a diffusion model.
  • Led a 3D photo/video project similar to the spatial video feature on Apple's iPhone 15 Pro.
  • Developed video processing algorithms and optimized mobile applications.
  • Reimplemented the After Effects plugin LockDown, including 2D mesh tracking (ARAP) and mesh rendering on mobile.
  • Implemented a face-swapping algorithm based on DeepFace.

Engineering Projects

4K Spatial Video with SOTA Performance

Realtime LIPSYNC

AIGC-Based Song and Dance Animation Generation

FACE SWAPPING

AR 3D Hand Interaction in Real-Time on Mobile