<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Projects | Tianma's Website</title><link>https://tianma.netlify.app/project/</link><atom:link href="https://tianma.netlify.app/project/index.xml" rel="self" type="application/rss+xml"/><description>Projects</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Thu, 17 Aug 2023 00:00:00 +0000</lastBuildDate><image><url>https://tianma.netlify.app/media/icon_hu0b7a4cb9992c9ac0e91bd28ffd38dd00_9727_512x512_fill_lanczos_center_3.png</url><title>Projects</title><link>https://tianma.netlify.app/project/</link></image><item><title>AI Mahjong</title><link>https://tianma.netlify.app/project/mahjong/</link><pubDate>Thu, 17 Aug 2023 00:00:00 +0000</pubDate><guid>https://tianma.netlify.app/project/mahjong/</guid><description/></item><item><title>Fish2Mesh</title><link>https://tianma.netlify.app/project/fish2mesh/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tianma.netlify.app/project/fish2mesh/</guid><description/></item><item><title>Home Design</title><link>https://tianma.netlify.app/project/floorplan/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tianma.netlify.app/project/floorplan/</guid><description>&lt;p>This project was carried out while I worked at B&amp;amp;Q.&lt;/p>
&lt;p>At B&amp;amp;Q, where I worked as an Applied Scientist, I built a recommendation system that generated architectural layout suggestions from user preferences.
The system transformed floor plans into 2D matrices and used similarity detection algorithms to match each space against a catalogue of design cases, improving the customer experience.
This work strengthened my skills in ML model development and in delivering impactful AI-driven solutions.&lt;/p>
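As a minimal illustration of the 2D-matrix representation and similarity matching mentioned above (the helper names and the rectangle-only rasterizer are simplifying assumptions, not the production system):

```python
# Illustrative sketch only: rasterize a room footprint into a binary
# occupancy matrix, then rank stored design cases by overlap similarity.

def rasterize(rect, size=8):
    """Rasterize an axis-aligned rectangle ((x0, y0), (x1, y1)) into a
    size x size binary matrix. A real floor plan needs a general polygon
    fill; a rectangle keeps the sketch minimal."""
    (x0, y0), (x1, y1) = rect
    return [[1 if x0 <= x < x1 and y0 <= y < y1 else 0
             for x in range(size)]
            for y in range(size)]

def similarity(a, b):
    """Intersection-over-union of two binary matrices."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0

query = rasterize(((1, 1), (5, 4)))          # the consumer's space
cases = {
    "case_a": rasterize(((1, 1), (5, 5))),   # near-identical footprint
    "case_b": rasterize(((0, 0), (8, 8))),   # much larger room
    "case_c": rasterize(((6, 6), (8, 8))),   # no overlap at all
}
# Rank stored cases by similarity to the query space, best match first.
ranked = sorted(cases, key=lambda k: similarity(query, cases[k]), reverse=True)
print(ranked)
```

The same ranking step, applied to a real case library, would yield the top-5 recommendations described in Step 2.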
&lt;h3 id="step-1-floorplan-detection">Step 1: Floorplan Detection&lt;/h3>
&lt;p>The initial phase focuses on detecting key elements in the floorplan, including spaces, scale marks, walls, doors, and windows.&lt;/p>
&lt;p>Space and scale-mark detection: a YOLO-based detection model.
Wall, door, and window detection: a regression network.
Examples of floorplan detection results:
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./example2.png" alt="FloorPlan Detection Result" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./example1.jpg" alt="FloorPlan Detection Result" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
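The post-processing of Step 1 can be sketched as follows. The detection models themselves are not reproduced here; detections are mocked as (label, box, confidence) tuples, as a generic detector might return them, and all names are illustrative:

```python
# Illustrative post-processing of Step 1 detector outputs.
from collections import defaultdict

def group_detections(detections, min_conf=0.5):
    """Drop low-confidence boxes and bucket the rest by element type."""
    groups = defaultdict(list)
    for label, box, conf in detections:
        if conf >= min_conf:
            groups[label].append(box)
    return dict(groups)

def pixels_per_metre(scale_box, real_length_m):
    """Derive a pixel/metre ratio from a detected horizontal scale mark
    whose real-world length is printed on the plan."""
    x0, _, x1, _ = scale_box
    return (x1 - x0) / real_length_m

dets = [
    ("space", (10, 10, 120, 90), 0.92),
    ("wall",  (10, 10, 120, 14), 0.88),
    ("door",  (60, 86, 80, 90), 0.35),    # below threshold, discarded
    ("scale", (10, 100, 110, 104), 0.97),
]
groups = group_detections(dets)
scale = pixels_per_metre((10, 100, 110, 104), real_length_m=5.0)
print(groups, scale)
```

With the scale factor known, detected boxes can be converted to real-world dimensions before matching.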
&lt;h3 id="step-2-space-matching-and-case-recommendation">Step 2: Space Matching and Case Recommendation&lt;/h3>
&lt;p>For each detected space, we recommend the top five design cases tailored to the consumer&amp;rsquo;s preferences.&lt;/p>
&lt;p>Filtering options: users can filter recommendations by the style and color labels included in our dataset.
Space matching: to determine the best match, we build a heatmap of polygon patterns for the floorplan, which lets us compute the similarity between each space in the floorplan and the spaces in our dataset.
Examples of space-matching visualizations:
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./cat.png" alt="FloorPlan Detection Result" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="./heatmap.png" alt="FloorPlan Detection Result" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p></description></item><item><title>Image Compression for classification, object detection and segmentation</title><link>https://tianma.netlify.app/project/icm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tianma.netlify.app/project/icm/</guid><description>&lt;p>Recent years have witnessed great advances in deep learning-based image compression, also known as learned image compression. An accurate entropy model is essential in learned image compression, since it can compress high-quality images with a lower bit rate.&lt;/p>
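A toy calculation shows why an accurate entropy model lowers the bit rate: the ideal code length of a symbol stream is the sum of -log2 p(s), so a model whose probabilities match the true symbol statistics spends fewer bits than a mismatched one. The distributions below are illustrative only, not from any real codec:

```python
import math

def code_length_bits(symbols, model):
    """Ideal total code length of `symbols` under probability `model`."""
    return -sum(math.log2(model[s]) for s in symbols)

stream = ["a"] * 90 + ["b"] * 10            # skewed source: 90% 'a'
accurate = {"a": 0.9, "b": 0.1}             # matches the source statistics
uniform  = {"a": 0.5, "b": 0.5}             # mismatched: assumes no skew

print(code_length_bits(stream, accurate))   # about 46.9 bits
print(code_length_bits(stream, uniform))    # exactly 100.0 bits
```

The gap between the two totals is the rate penalty paid for a poor entropy model, which is what the context model and hyperprior are designed to shrink.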
&lt;p>Current learned image compression schemes developed entropy models using context models and hyperpriors. Context models utilize local correlations within latent representations for better probability distribution approximation, while hyperpriors provide side information to estimate distribution parameters. Most recently, several transformer-based learned image compression algorithms have emerged and achieved state-of-the-art rate distortion performances, surpassing existing convolutional neural network (CNN)-based learned image compression and traditional image compression. Transformers are better at modeling long-distance dependencies and extracting global features than CNNs.&lt;/p>
&lt;p>However, research on transformer-based image compression is still at an early stage. In this work, we propose a novel transformer-based learned image compression model. It adopts transformer structures in the main image encoder and decoder and in the context model. In
particular, we propose a transformer-based spatial-channel auto-regressive context model. Encoded latent-space features are split into spatial-channel chunks, which are entropy encoded sequentially in a channel-first order, followed by a 2D zigzag spatial order, conditioned on previously decoded feature chunks. To reduce the computational complexity, we also adopt a sliding window to restrict the number of chunks participating in the entropy model. Experimental studies on public image compression datasets demonstrate that our proposed transformer-based learned image codec outperforms traditional image compression and existing learned image compression models visually and quantitatively.&lt;/p>
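The decoding order described above can be sketched in a few lines. The grid sizes, the exact zigzag variant, and the assumption that the channel index varies fastest are all illustrative choices, not the paper's precise interleaving:

```python
# Sketch of the chunk visit order: a 2D zigzag over spatial positions,
# with channel chunks ordered channel-first at each position, and a
# sliding window restricting the entropy model's conditioning context.

def zigzag(h, w):
    """Zigzag traversal over an h x w spatial grid."""
    order = []
    for s in range(h + w - 1):
        diag = [(i, s - i) for i in range(h) if 0 <= s - i < w]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def chunk_order(channels, h, w):
    """Channel index varies fastest at each spatial position."""
    return [(c, y, x) for (y, x) in zigzag(h, w) for c in range(channels)]

def context(order, idx, window=3):
    """Chunks the entropy model may condition on when coding order[idx]."""
    return order[max(0, idx - window):idx]

order = chunk_order(channels=2, h=2, w=2)
print(order)
print(context(order, 5))
```

Restricting the context to a fixed window keeps the per-chunk conditioning cost constant instead of growing with the number of previously decoded chunks.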
&lt;p>More details are available at the GitHub link: &lt;a href="https://github.com/stm233/image-compression-with-swin-transformer" target="_blank" rel="noopener">https://github.com/stm233/image-compression-with-swin-transformer&lt;/a>.&lt;/p></description></item><item><title>Image Compression with Swin Transformer</title><link>https://tianma.netlify.app/project/image-compression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tianma.netlify.app/project/image-compression/</guid><description>&lt;p>Recent years have witnessed great advances in deep learning-based image compression, also known as learned image compression. An accurate entropy model is essential in learned image compression, since it can compress high-quality images with a lower bit rate.&lt;/p>
&lt;p>Current learned image compression schemes developed entropy models using context models and hyperpriors. Context models utilize local correlations within latent representations for better probability distribution approximation, while hyperpriors provide side information to estimate distribution parameters. Most recently, several transformer-based learned image compression algorithms have emerged and achieved state-of-the-art rate distortion performances, surpassing existing convolutional neural network (CNN)-based learned image compression and traditional image compression. Transformers are better at modeling long-distance dependencies and extracting global features than CNNs.&lt;/p>
&lt;p>However, research on transformer-based image compression is still at an early stage. In this work, we propose a novel transformer-based learned image compression model. It adopts transformer structures in the main image encoder and decoder and in the context model. In
particular, we propose a transformer-based spatial-channel auto-regressive context model. Encoded latent-space features are split into spatial-channel chunks, which are entropy encoded sequentially in a channel-first order, followed by a 2D zigzag spatial order, conditioned on previously decoded feature chunks. To reduce the computational complexity, we also adopt a sliding window to restrict the number of chunks participating in the entropy model. Experimental studies on public image compression datasets demonstrate that our proposed transformer-based learned image codec outperforms traditional image compression and existing learned image compression models visually and quantitatively.&lt;/p>
&lt;p>More details are available at the GitHub link: &lt;a href="https://github.com/stm233/image-compression-with-swin-transformer" target="_blank" rel="noopener">https://github.com/stm233/image-compression-with-swin-transformer&lt;/a>.&lt;/p></description></item><item><title>MAGAN</title><link>https://tianma.netlify.app/project/magan/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tianma.netlify.app/project/magan/</guid><description>&lt;p>Generative Adversarial Networks (GANs) suffer from training instability rooted in the underlying Optimal Transportation (OT) problem.
Based on Brenier&amp;rsquo;s theorem, we converted the OT problem into solving the elliptic Monge–Ampère partial differential equation (MAPDE) with the finite difference method.
To solve the n-dimensional MAPDE (n &amp;gt; 3), we improved the Neumann boundary conditions and extended the discretization of the MAPDE so that the numerical solution yields the optimal map between generator and discriminator.
The solution of the MAPDE then serves as a new divergence, replacing the Wasserstein distance used in WGAN.&lt;/p>
&lt;p>We provided several computational examples demonstrating that precision increased by 5.3%.
Moreover, MAGAN stabilized training with almost no hyperparameter tuning and converged 317.2% faster than WGAN-GP on the LSUN Bedrooms dataset.
MAGAN also achieved an Inception Score (IS) of 8.7 on CIFAR-10.&lt;/p></description></item></channel></rss>