Shahinur Alam, PhD

AI, HCI, and VR Researcher

With over eight years of research experience, I specialize in human-computer interaction, virtual reality, AI, and medical imaging. I lead SAIL, an NSF-funded project developing an immersive VR system for ASL education. I am passionate about driving technological advancements and improving user experiences.

Research Publications

Stats

Research Experience

With over eight years of research experience, I have authored 20+ published articles, reviewed 140+ scholarly works, and accumulated 150+ citations. These achievements reflect my commitment to advancing knowledge, proficiency in peer review, and substantial impact on the academic community.

ASL Champ!

The ASL Champ! project uses Virtual Reality (VR) and deep learning to create an immersive platform for learning American Sign Language (ASL). To address challenges such as the lack of real-time feedback in traditional ASL education, the platform combines motion capture with a deep learning model built from Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) layers. Users interact with a signing avatar in a virtual coffee shop and receive real-time feedback to improve sign accuracy. The recognition model achieved 90.12% training accuracy and 86.66% test accuracy. ASL Champ! offers an engaging learning experience that makes ASL education more accessible and interactive.

The development of ASL Champ! involved multidisciplinary collaboration, integrating 3D avatar design, motion capture, AI model development, and sign language data collection. The game aims to provide users with a more intuitive and accurate way to learn ASL, making it particularly beneficial for those with limited access to traditional ASL instruction.

ASL Champ! details

Air-Writing

This work develops a trajectory-based air-writing recognition system using deep learning and a depth sensor. Users write in free space, and an Intel RealSense SR300 camera captures their finger movements as 3D trajectories. The trajectories are normalized to reduce noise and smooth the strokes. Two deep learning models, a Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN), were used to recognize the air-written digits. The system was trained on a self-collected dataset of 21,000 trajectories and the 6D Motion Gesture (6DMG) dataset, achieving an accuracy of 99.32% with the LSTM model and 99.26% with the CNN. By offering a touchless, efficient input method, this research addresses limitations of traditional writing systems and gesture-based interfaces in applications such as augmented reality (AR), virtual reality (VR), and healthcare. The project’s key contributions include a large public dataset and the integration of advanced neural network architectures to enhance recognition accuracy.
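As a rough illustration of what trajectory normalization can look like, the sketch below centers a 3D trajectory on its centroid, scales it to a unit range, and applies a moving average to damp jitter. This is a generic preprocessing recipe, not the procedure from the published work; the function name `normalize_trajectory` and the `window` parameter are hypothetical.

```python
from statistics import mean


def normalize_trajectory(points, window=3):
    """Center, scale, and smooth a 3D trajectory given as (x, y, z) tuples.

    Hypothetical helper for illustration only; the steps are assumptions,
    not the normalization used in the original system.
    """
    if not points:
        return []

    # Translate the trajectory so its centroid sits at the origin.
    cx, cy, cz = (mean(p[i] for p in points) for i in range(3))
    centered = [(x - cx, y - cy, z - cz) for x, y, z in points]

    # Scale so the largest absolute coordinate becomes 1.
    # Fall back to 1.0 when every point coincides (scale would be 0).
    scale = max(abs(c) for p in centered for c in p) or 1.0
    scaled = [tuple(c / scale for c in p) for p in centered]

    # Moving average over a centered window; the window shrinks at the ends.
    half = window // 2
    smoothed = []
    for i in range(len(scaled)):
        lo, hi = max(0, i - half), min(len(scaled), i + half + 1)
        neighbors = scaled[lo:hi]
        smoothed.append(tuple(mean(p[k] for p in neighbors) for k in range(3)))
    return smoothed
```

Calling `normalize_trajectory(raw_points, window=5)` on a list of sensor samples would return a trajectory of the same length, which could then be resampled or padded before being fed to a sequence model.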

Air-writing details

Image Super-resolution

This project focuses on enhancing the resolution of images produced by integral imaging microscopy (IIM) using a deep learning-based Generative Adversarial Network (GAN). IIM provides 3D visualization of microscopic objects, but its images suffer from low resolution due to limitations in the micro-lens array (MLA) and poor lighting conditions. To address this, we developed a GAN-based super-resolution algorithm that improves the clarity and detail of IIM images by up to 8×, outperforming traditional methods. The GAN model consists of two parts: a generator that reconstructs high-resolution images from low-resolution inputs and a discriminator that distinguishes between real and generated images. By applying this method to various microscopic specimens, including biological samples and electronic components, we demonstrated significant improvements in image quality. The enhanced images retain fine details, depth, and edges, making them more suitable for scientific analysis. The model was trained using the PyTorch library on a high-performance computing system and tested with various metrics like PSNR, SSIM, and PSD, showing superior results compared to existing resolution enhancement techniques. This project provides an efficient, scalable solution for real-time image enhancement in IIM, with potential applications in biomedical science, nanophysics, and other fields requiring precise 3D imaging.
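Of the metrics mentioned above, PSNR is the simplest to state: it is ten times the base-10 logarithm of the squared peak intensity divided by the mean squared error between a reference image and its reconstruction. The sketch below computes it for images flattened into lists of pixel intensities. The function name `psnr` and the default 8-bit peak of 255 are assumptions made for illustration, not details taken from the project.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Illustrative helper; max_value=255 assumes 8-bit intensities.
    """
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference, so a super-resolution output would be compared against a ground-truth high-resolution image of the same size.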

Image super-resolution details

Gesture Recognition

This project presents the implementation of a character recognition system based on finger-joint tracking using a 3-D depth camera, focusing on improving human-computer interaction (HCI). The system allows users to write characters, digits, and symbols in mid-air using simple hand gestures. The research aims to overcome the limitations of traditional gesture recognition systems, which often rely on wearable devices or support only limited character sets. Utilizing an Intel RealSense SR300 camera, the system tracks 22 finger joints to recognize 124 characters, including digits, letters, symbols, and special keys, offering a full keyboard experience. It combines Euclidean distance thresholding and geometric slope techniques to achieve high accuracy in both single-hand and double-hand recognition modes. The system achieved an accuracy of over 91% and a recognition time of less than 60 milliseconds per character, making it suitable for real-time applications. Unlike many conventional systems, it operates effectively in both light and dark environments. The technology’s simplicity makes it user-friendly, with minimal training required, as confirmed by user studies. The system can be applied across various industries, including virtual reality, healthcare, and accessibility, offering a solution for hands-free input.
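To make the two geometric ideas concrete, here is a minimal sketch of a Euclidean distance threshold and a slope computation between two tracked joints. It is not the published recognition logic: the function names, the 0.02 threshold, and the assumption that joint coordinates are in meters are all hypothetical choices for illustration.

```python
import math


def is_touching(joint_a, joint_b, threshold=0.02):
    """Return True when two joints are closer than the threshold.

    Joints are (x, y, z) tuples; the 0.02 threshold (assumed meters)
    is an illustrative value, not one taken from the original system.
    """
    return math.dist(joint_a, joint_b) < threshold


def slope_degrees(joint_a, joint_b):
    """Angle in degrees of the line from joint_a to joint_b in the x-y plane."""
    dx = joint_b[0] - joint_a[0]
    dy = joint_b[1] - joint_a[1]
    # atan2 handles vertical lines (dx == 0) without a division by zero.
    return math.degrees(math.atan2(dy, dx))
```

A rule-based recognizer could, for example, treat a thumb tip touching a particular finger joint as a key press and use the slope of a finger to tell apart otherwise similar poses.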

Gesture recognition details

Top 5 High-Performing Research

M. S. Alam et al., “ASL champ!: a virtual reality game with deep-learning driven sign recognition,” Computers & Education: X Reality, vol. 4, p. 100059, 2024, doi: https://doi.org/10.1016/j.cexr.2024.100059
M. S. Alam, K.-C. Kwon, and N. Kim, “TARNet: An Efficient and Lightweight Trajectory-Based Air-Writing Recognition Model Using a CNN and LSTM Network,” vol. 2022, doi: https://doi.org/10.1155/2022/6063779
M. S. Alam, K.-C. Kwon, and N. Kim, “Implementation of a Character Recognition System Based on Finger-Joint Tracking Using a Depth Camera,” in IEEE Transactions on Human-Machine Systems, vol. 51, no. 3, pp. 229-241, June 2021, doi: https://doi.org/10.1109/THMS.2021.3066854
M. S. Alam, K.-C. Kwon, M.-U. Erdenebat, M. Y. Abbass, M. A. Alam, and N. Kim, “Super-Resolution Enhancement Method Based on Generative Adversarial Network for Integral Imaging Microscopy,” in Sensors 2021, 21, 2164, doi: https://doi.org/10.3390/s21062164
M. S. Alam, K.-C. Kwon, M. A. Alam, M. Y. Abbass, S. M. Imtiaz, and N. Kim, “Trajectory-Based Air-Writing Recognition Using Deep Neural Network and Depth Sensor,” in Sensors 2020, 20, 376, doi: https://doi.org/10.3390/s20020376

Recent from Blogs

AI-Powered Content Summarizer

by Shahinur | Oct 20, 2024 | Artificial Intelligence, Deep Learning

This project contains a Flask-based web application that integrates various Natural Language Processing (NLP) models to generate text summaries. The app allows users to enter text and receive summaries in different formats such as paragraph or bullet points, utilizing...

Exploration and Exploitation

by Shahinur | Jun 12, 2024 | Reinforcement Learning

Exploration and Exploitation are two fundamental processes in reinforcement learning. In short, exploration is about learning something new, and exploitation is about making decisions based on known information. Balancing these two is crucial. Exploration Through this...

YOLOv5 & YOLOv8 Object Detection

by Shahinur | May 2, 2024 | Computer Vision, Deep Learning, Mathematics, OpenCV, Python

This repository demonstrates object detection using two versions of the YOLO model: YOLOv5 and YOLOv8 from Ultralytics. The scripts provided allow you to detect objects in images and videos using pre-trained models, and display or save the annotated outputs using...

Avoiding Conflicts on GitHub: The Power of Locking

by Shahinur | Oct 12, 2023 | Miscellaneous

GitHub is a hub of collaboration where developers come together to work on projects, share ideas, and resolve issues. While this collaborative environment is one of GitHub's strengths, it can also give rise to commit conflicts. GitHub provides a powerful tool called...

Unreal Engine Basic Terminology

by Shahinur | Jun 3, 2023 | Basic Terminology, Game Development, Unreal Engine, Virtual Reality

Unreal Engine is a powerful and versatile game development engine created by Epic Games. With its real-time rendering capabilities, visual scripting system (Blueprints), and robust physics engine, Unreal Engine empowers developers to bring their creative visions to...

Confusion Matrices with Heatmaps: Implementation in Python with seaborn and scikit-learn

by Shahinur | Feb 16, 2022 | Miscellaneous

Understanding and effectively utilizing a confusion matrix and heatmap is crucial for anyone dealing with classification problems in machine learning. In this blog post, we'll dive into the intricacies of these powerful tools, providing clear explanations and...

Create a Progressbar using tqdm and Python

by Shahinur | Sep 3, 2021 | Deep Learning, Machine Learning, Miscellaneous, Python

tqdm is a must-have Python package for all data scientists and artificial intelligence researchers/developers. Sometimes, we need to visualize what is going on in the background. Some programs may require hours to several days to finish. In that case, a progress bar...

Basic Anaconda Commands

by Shahinur | Sep 1, 2021 | Deep Learning, Machine Learning, Python

Anaconda is a fantastic tool for all deep learning, machine learning, and computer vision researchers. It saves tons of extra work when setting up environments and tools. Personally, I love it so much. Anaconda Navigator is a great UI for setting up environments and...

Hand joint detection using OpenCV and MediaPipe

by Shahinur | Aug 27, 2021 | MediaPipe, OpenCV

MediaPipe is a cross-platform framework for building multimodal applied machine learning pipelines. The MediaPipe Python package is available on PyPI for Linux, macOS, and Windows. Today we will write some simple code for hand joint detection using OpenCV. At...
