Skip to content
View BlazeWild's full-sized avatar
🎯
Focusing
🎯
Focusing

Highlights

  • Pro

Block or report BlazeWild

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don't include any personal information such as legal names or email addresses. Markdown supported. This note will be visible to only you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
BlazeWild/README.md
Banner

Typing SVG

🔬 AI Researcher from Nepal | Computer Vision & Multimodal Learning

Profile views followers

AI Coding

🔬 Research Focus

  • 🎥 Multimodal Video Captioning - Audio-Visual understanding
  • 👁️ Computer Vision - 3D Reconstruction, Pose Estimation
  • 🤖 Vision Transformers - Attention mechanisms for visual tasks
  • 📊 Deep Learning Research - PyTorch implementations
  • 🌐 Portfolio | 📧 [email protected]

🌟 Research Projects

Hav-Cocap Real-Time Motion Transfer

🛠️ Research Stack

PyTorch Transformers OpenCV CUDA TensorFlow Python W&B Jupyter MediaPipe NumPy C++

📊 GitHub Statistics

GitHub Stats GitHub Streak
Top Languages

🏆 GitHub Trophies

GitHub Trophies

📈 Contribution Graph

Contribution Graph

🤝 Connect With Me

LinkedIn Twitter Portfolio <🤝 Connect
"Researching multimodal AI systems for real-world applications"

Pinned Loading

  1. Real-Time-Motion-Transfer-to-a-3D-Avatar Real-Time-Motion-Transfer-to-a-3D-Avatar Public

    Real-time human pose detection and motion transfer to 3D avatars using MediaPipe, DNN, and Three.js — supports webcam and video inputs with custom avatar integration.

    Python 17 7

  2. Custom_LLM_DataGen_Template Custom_LLM_DataGen_Template Public

    🔧 Modular pipeline for generating high-quality, domain-specific datasets for LLM fine-tuning — from PDFs and web scraping to synthetic Q&A generation, quality filtering, and training-ready formatting.

    Python 2 1

  3. TrekNepal-3B__Finetuned-Llama3.2-3B TrekNepal-3B__Finetuned-Llama3.2-3B Public

    Fine-tuning pipeline for LLaMA 3.2-3B on Nepal trekking using custom synthetic Q&A data, LLM-based filtering, and QLoRA optimization.

    Python

  4. Hav-Cocap Hav-Cocap Public

    Hav-Cocap: Hybrid Audio-Visual Compressed Video Captioning framework. Extends CoCap with an Audio Encoder and evaluated on the AVCaps dataset.

    Jupyter Notebook