Tianma Shen 沈天马

PhD Candidate

Santa Clara University

Biography

As a PhD student specializing in computer vision, I am deeply passionate about advancing the field of image and video compression. My research focuses on developing innovative algorithms and techniques to enhance the efficiency and quality of visual data compression.

With hands-on experience in object detection and segmentation, I have honed my skills in creating robust computer vision models that can accurately interpret and analyze visual information. Recently, I have extended my expertise to practical applications by developing an AI Deck detection app, which leverages state-of-the-art computer vision technologies to provide effective and user-friendly solutions.

I am driven by the challenge of solving complex problems and am always eager to collaborate with professionals who share a passion for pushing the boundaries of technology. Feel free to connect with me to discuss potential opportunities, share insights, or explore collaborations.

Interests
  • Computer Vision
  • Deep Learning
  • Image and Video Compression
Education
  • PhD Candidate in Computer Science, 2019 - 2025 (expected)

    Santa Clara University

  • MEng in Computer Science, 2017 - 2019

    University of Shanghai for Science and Technology

  • BSc in Mathematics, 2013 - 2017

    Shanghai Maritime University

Skills

Python

PyTorch, TensorFlow, NumPy, Pandas, OpenCV, PIL, etc.

Matlab

Statistics Tools, Signal Processing Tools, PDE solvers.

Xcode/Flutter

iOS/Android development, UI controller design.

Deep Learning Models

CNNs, Swin Transformer, GANs, diffusion models.

Computer Vision

Image Generation, Object Detection, Image and Video Compression, Pose Estimation, Segmentation.

Photography

Photoshop, Premiere Pro, After Effects, Final Cut Pro, Unity.

Projects

AI Mahjong
Trains a deep learning model to detect mahjong tiles automatically and choose the best tile to play on each turn.
Fish2Mesh
A fisheye-aware transformer-based model designed for 3D egocentric human mesh recovery. We propose an egocentric position embedding block to generate an ego-specific position table for the Swin Transformer to reduce fisheye image distortion.
Home Design
Applies a GAN to generate furniture layouts for bedrooms, living rooms, dining rooms, and bathrooms.
Image Compression for classification, object detection and segmentation
Image compression for machine vision tasks, built on a proposed transformer-based context entropy model.
Image Compression with Swin Transformer
A transformer-based image compression method that proposes a 2D zigzag entropy model. The paper from this project won the Best Student Paper Award.
MAGAN
We use solutions of the Monge–Ampère partial differential equation (MAPDE) as a new loss function for WGAN to make the training process more stable.

Publications

(2023). Learned Image Compression with Transformers. In Big Data V.

(2023). RISAT, real-time instance segmentation with adversarial training. Multimedia Tools and Applications.

(2019). 3DACN, 3D Augmented Convolutional Network for Time Series Data. Information Sciences.

(2019). Da-bert, Enhancing part-of-speech tagging of aspect sentiment analysis using bert. In “APPT”.

Contact