1.Core Responsibilities: Implemented the end-to-end pipeline of text preprocessing, vector embedding, retrieval, and generation, successfully constructing a highly scalable knowledge base Q&A system. 2.Technical Challenges: Confronted critical RAG bottlenecks, including low retrieval precision, poor recall rates, short/ambiguous user queries, and the limitations of standard embeddings in capturing complex knowledge structures. 3.Solutions & Optimization: * Developed advanced query expansion and rewriting strategies by leveraging Multi-Query to generate diverse question formulations. 4.Utilized Step-Back Prompting to abstract and ground broader, higher-level concepts. 5.Integrated HyDE (Hypothetical Document Embeddings) to generate pseudo-documents, effectively bridging the semantic gap before retrieving actual text. 6.Key Results & Impact: Built efficient retrieval indices using ChromaDB to achieve deep semantic-level information recall. Successfully elevated the final answer accuracy from 50% to approximately 80%, validated by a rigorous human evaluation of 350 test queries on a standardized document knowledge base.
About
an AI researcher and developer currently pursuing my graduate studies at Chongqing University of Technology. With a dedicated focus on Large Language Models (LLMs), Knowledge Graphs, and Retrieval-Augmented Generation (RAG) frameworks, I specialize in building and optimizing scalable, production-ready AI systems. My hands-on expertise spans deep model fine-tuning using advanced tools like LLaMA-Factory and Unsloth, orchestrating sophisticated agentic workflows such as Corrective RAG (CRAG) via LangGraph, and managing high-performance Linux server environments with dual-GPU acceleration. Passionate about transforming complex technical theories into practical, vertical business applications, I am eager to bring my robust engineering skills and research insights to a cutting-edge technology team where I can drive impactful AI innovation.
Competitions
-
Mar.2025
The 22nd "Huawei Cup" China Post-Graduate Mathematical Modeling Competition
-
Jun.2025
The 7th "Zhongqing Cup" National Mathematical Modeling Competition for College Students
-
Jan. 2025
The 6th MathorCup Mathematical Application Challenge — Big Data Competition
-
Jan.2025
The 7th "Huawei Cup" China Graduate AI Innovation Competition
-
Jun.2025
Chongqing AI Large Model Competition
Projects
1.Project Description & Core Framework: Developed an innovative recommendation framework that integrates LLM-driven interest generation (via a local Qwen2-7B), granular ball computing, and a simplified Graph Neural Network (GNN). This architecture significantly enhances user profiling capabilities and recommendation precision by capturing multi-grained behavioral patterns and structured semantic features. 2.Functional Modules & Technical Implementation:LLM-Based User Interest Generation: Leveraged a locally deployed Qwen2-7B-Instruct model to deeply extract and profile explicit and implicit user interests from historical data. Granular Ball Granularity Clustering: Implemented granular ball computing to perform adaptive-granularity interest clustering, multi-scale capturing user preference evolution. Knowledge Graph Item Enhancement: Enriched item representations by incorporating external semantic knowledge and structural relational features from Knowledge Graphs. Simplified GNN Representation Learning: Adopted a lightweight Graph Convolutional Network (GCN) architecture to perform efficient, low-latency graph-based node embedding and propagation. Dynamic Mask Denoising: Designed a dynamic masking mechanism to filter out data noise and redundant interactions, heavily improving the model's structural robustness. Experimental Results & Impact: Conducted extensive benchmarking evaluations across three public datasets. The proposed framework demonstrated superior scalability and accuracy, delivering a maximum improvement of 7.59% in the NDCG@100 metric compared to the strongest state-of-the-art baseline.
1.Project Description & Computational Infrastructure: Directed an instruction-based Supervised Fine-Tuning (SFT) pipeline on the Llama-3-8B model to automate the structural processing of unstructured, colloquial meeting transcripts. Managed the entire training lifecycle within a Linux server environment powered by dual NVIDIA RTX 4090 GPUs. By integrating the Unsloth acceleration framework with QLoRA (4-bit quantization), successfully minimized VRAM footprint and boosted training throughput by over 2x. 2.Technical Challenges & Targeted Solutions: Addressed high-stakes production bottlenecks including unstable output formatting, severe model hallucinations, and few-shot overfitting.Format Enforcement: Integrated a custom Data Collator alongside a strict EOS Token truncation strategy to forcibly constrain model generation boundaries during inference, completely eliminating unstructured conversational filler and ensuring output alignment with a standardized JSON Schema template.Overfitting & Repetition Control: Mitigated the model's text-looping (repetition) tendencies and overfitting issues by strategically scaling the lora_dropout hyperparameter and executing a rigorous max_steps early-stopping strategy. 3.Experimental Results & Performance Metrics: Evaluated and verified model performance against a proprietary, custom-built validation dataset. The fine-tuned model achieved an extraordinary breakthrough, elevating the extraction accuracy of key entities from a baseline of 40% (pre-fine-tuning) to an industry-grade 98%.