Nicholas Huang

Full-Stack Developer

About

I build intelligent systems at the intersection of software engineering and artificial intelligence. Passionate about creating scalable applications and advancing machine learning research that bridges theory and real-world impact.

Languages

TypeScript
JavaScript
Python
Java
C#
C/C++
Ruby
SQL
R
HTML/CSS

Frameworks

React
Next.js
Node.js
Express.js
Angular 2+
Flask
FastAPI
Ruby on Rails

AI / ML

PyTorch
TensorFlow
scikit-learn
Hugging Face
OpenAI API
pandas
NumPy
AWS Bedrock
GCP

Tools & Infrastructure

AWS (IoT, S3, DynamoDB, Athena, EC2)
Azure Cosmos DB
Docker
Kubernetes
Redis
Git

Experience

Research

2025

Inference-Time Chain-of-Thought Pruning with Latent Informativeness Signals

Large language models (LLMs) improve reasoning accuracy when generating multiple candidate solutions at test time, but standard methods like Best-of-N (BoN) incur high computational cost by fully generating all branches. Self-Truncation Best-of-N (ST-BoN) mitigates this by truncating unpromising paths early, but its reliance on consistency-based heuristics does not directly evaluate branch quality, which can limit efficiency on heterogeneous tasks. We present the KL-Adjusted Pruned Path Algorithm (KAPPA), an inference-time method that combines Kullback-Leibler divergence, confidence, and entropy into a principled scoring function to guide progressive pruning. By promoting diversity during exploration and selectively eliminating low-scoring branches, KAPPA maintains accuracy while substantially reducing memory and token usage. Experiments on GSM8K and MATH500 with DeepSeek-R1-Distill-Qwen-1.5B and Qwen2.5-7B-Instruct demonstrate that KAPPA stabilizes performance in smaller models and achieves up to ~60% reduction in peak memory and ~90% reduction in total token generation relative to BoN, with minimal impact on accuracy.

Model Compression
Transformers
Edge AI
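
As a rough illustration of the idea in the abstract above, the sketch below scores each candidate branch from its next-token distribution and keeps only the top-scoring ones. It is not the paper's implementation: the function names, the equal weights, the use of the mean distribution across branches as the KL reference, and the sign given to each term are all assumptions made for the example.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0)

def score_branch(p, reference, w_kl=1.0, w_conf=1.0, w_ent=1.0):
    """Hypothetical score: reward divergence from the reference and
    confidence (top probability), penalize entropy. Weights are assumed."""
    return w_kl * kl_divergence(p, reference) + w_conf * max(p) - w_ent * entropy(p)

def prune(branches, keep):
    """Return the names of the `keep` highest-scoring branches.

    `branches` maps a branch name to its next-token distribution.
    The reference is the element-wise mean over all branches (an assumption).
    """
    dists = list(branches.values())
    vocab = len(dists[0])
    reference = [sum(p[i] for p in dists) / len(dists) for i in range(vocab)]
    ranked = sorted(
        branches,
        key=lambda name: score_branch(branches[name], reference),
        reverse=True,
    )
    return ranked[:keep]

# Toy distributions over a 3-token vocabulary (made-up numbers).
branches = {
    "a": [0.90, 0.05, 0.05],
    "b": [0.34, 0.33, 0.33],
    "c": [0.80, 0.10, 0.10],
}
survivors = prune(branches, keep=2)
```

In a progressive setting, a step like `prune` would presumably be repeated at intervals during generation with a shrinking `keep`, so that low-scoring branches stop consuming memory and tokens early.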

2024

Revisiting Absence with Symptoms that *T* Show up Decades Later to Recover Empty Categories

This paper explores null elements in the English, Chinese, and Korean Penn treebanks. Null elements contain important syntactic and semantic information, yet they have typically been treated as entities to be removed during language processing tasks, particularly in constituency parsing. Thus, we work towards the removal and, in particular, the restoration of null elements in parse trees. We focus on expanding a rule-based approach that uses linguistic context information to Chinese, as rule-based approaches have historically been applied only to English. We also conduct neural experiments with a language-agnostic sequence-to-sequence model to recover null elements for English (PTB), Chinese (CTB), and Korean (KTB). To the best of the authors' knowledge, null elements in three different languages have been explored and compared for the first time. In expanding a rule-based approach to Chinese, we achieve an overall F1 score of 80.00, which is comparable to past results on the CTB. In our neural experiments, we achieve F1 scores of up to 90.94, 85.38, and 88.79 for English, Chinese, and Korean, respectively, with functional labels.

NLP
Transfer Learning
Multilingual Models

Projects

AutoFBM

Automated bot that generates accurate prices and descriptions for Facebook Marketplace listings based on similar listings in the market. Streamlines market research and listing creation through a React-based interface.

TypeScript
Python
React
Vite
Flask
OpenAI API

BrainChip

An educational app that personalizes study material generation. Users input a topic, available study time, and educational level, then the app generates curated notes, slide decks, and additional resources using Amazon Bedrock and Claude AI.

Python
Streamlit
Amazon Bedrock

DripCheck

AI-powered outfit recommendation app for unpredictable weather. Combines real-time weather and location data with personal style preferences to generate outfit suggestions and images via GPT-3.5 and DALL-E 3.

JavaScript
React
Node.js
Express.js
OpenWeather API
OpenAI API

Fix-it Felix

A camera-powered repair guide app. Uses your computer's camera and Gemini to identify objects and provide step-by-step repair and troubleshooting guidance through an interactive chat interface.

JavaScript
Python
React
Flask
GCP
WebRTC

Product Finder AI Agent

A full-stack AI agent for natural language product discovery. Gemini converts conversational queries into structured filter trees applied against a product catalog, with results rendered as a card grid.

TypeScript
Python
React
Vite
FastAPI
GCP
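
To make the "structured filter tree" idea above concrete, here is a minimal sketch of how such a tree might be evaluated against a catalog. The node shape (`and` / `or` branches, leaves with `field`, `op`, `value`), the operator names, and the sample catalog are all hypothetical and are not taken from the project itself.

```python
import operator

# Comparison operators a leaf node may name (an assumed vocabulary).
OPS = {
    "eq": operator.eq,
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}

def matches(node, product):
    """Recursively evaluate a filter tree node against one product dict."""
    if "and" in node:
        return all(matches(child, product) for child in node["and"])
    if "or" in node:
        return any(matches(child, product) for child in node["or"])
    # Leaf: compare one product field to a value; a missing field never matches.
    value = product.get(node["field"])
    return value is not None and OPS[node["op"]](value, node["value"])

def apply_filters(tree, catalog):
    """Return the products in `catalog` that satisfy `tree`."""
    return [product for product in catalog if matches(tree, product)]

# Made-up catalog and the kind of tree a query such as
# "shoes under $100" might be converted into.
catalog = [
    {"name": "Trail Runner", "category": "shoes", "price": 89.0},
    {"name": "City Sneaker", "category": "shoes", "price": 129.0},
    {"name": "Rain Jacket", "category": "outerwear", "price": 75.0},
]
tree = {
    "and": [
        {"field": "category", "op": "eq", "value": "shoes"},
        {"field": "price", "op": "lt", "value": 100},
    ]
}
results = apply_filters(tree, catalog)
```

Keeping the tree as plain nested data means the language model only has to emit JSON, while the filtering logic stays deterministic and easy to test.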

Get in Touch

I'm always open to discussing new opportunities, research collaborations, or interesting projects. Feel free to reach out.

nhuangra@gmail.com