Applied Scientist | Amazon
• Developed a robust pipeline for multi-node batch training and inference on SageMaker. Evaluated fine-tuning approaches (SFT, LoRA) for Alexa LLMs (7B, 13B, 30B), Llama 2, and Flan-T5 on a conversational query rewriting (CQR) task.
• Researched efficient methods for tailoring large language models to text generation using RL-based policy optimization.
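The LoRA fine-tuning mentioned above can be sketched in a few lines: freeze the pretrained weight and learn only a low-rank additive update. This is an illustrative toy in NumPy, not the production pipeline; all names and dimensions are hypothetical.

```python
import numpy as np

class LoRALinear:
    """Frozen pretrained weight W plus a trainable low-rank update B @ A (LoRA)."""
    def __init__(self, d_in, d_out, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
        self.A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
        self.B = np.zeros((d_out, r))                # trainable up-projection, zero-init
        self.scale = alpha / r                       # standard LoRA scaling

    def __call__(self, x):
        # Base forward pass plus the scaled low-rank adapter path
        return self.W @ x + self.scale * (self.B @ (self.A @ x))

layer = LoRALinear(d_in=16, d_out=8)
x = np.ones(16)
# With B zero-initialized, the adapter contributes nothing before training
assert np.allclose(layer(x), layer.W @ x)
```

Only `A` and `B` (rank-4 here) would receive gradients, which is why LoRA is cheap enough to fine-tune multi-billion-parameter models.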
Graduate Research Assistant | Pathology Dynamics Lab
• Curated a new text dataset of 10K records for comprehensive data analysis and model development.
• Optimized multi-label text classifiers using RoBERTa and active learning, achieving a 60% F1 score with limited labeled data.
• Developed a PubMedBERT-based relation extraction model for the new dataset and benchmarked the results.
• Worked on information retrieval for meta-analysis using LLMs via OpenAI APIs (ChatGPT, GPT-3).
• Developing a multi-label hierarchical contrastive learning approach for biomedical entity linking.
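The active-learning loop behind the classifier work above typically selects the unlabeled examples the model is least sure about. A minimal sketch of one common acquisition rule (least-confidence sampling); the function name and toy probabilities are hypothetical:

```python
import numpy as np

def least_confident(probs, k):
    """Pick the k unlabeled examples whose top predicted probability is lowest
    (least-confidence sampling, a common active-learning acquisition rule)."""
    confidence = probs.max(axis=1)      # model's probability for its top class
    return np.argsort(confidence)[:k]   # indices to send out for labeling

probs = np.array([[0.90, 0.10],   # confident prediction
                  [0.55, 0.45],   # very uncertain
                  [0.60, 0.40]])  # somewhat uncertain
print(least_confident(probs, 2))  # → [1 2]
```

Labeling only the selected rows each round is what lets a RoBERTa-style classifier reach a usable F1 with a small annotation budget.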
Artificial Intelligence Engineer | RadicalX AI
• Led a 5-member team in developing an anti-cheat and anti-fraud system combining SVM and BERT models.
• Built a robust zero-shot intent classifier based on the BLINK architecture for a GPT-4-powered career-coach chatbot.
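BLINK-style zero-shot classification reduces to a bi-encoder similarity search: embed the query and each intent description, then pick the closest intent. A toy sketch with hand-made 2-D "embeddings" (labels, vectors, and function names are all hypothetical):

```python
import numpy as np

def zero_shot_intent(query_vec, label_vecs, labels):
    """Bi-encoder-style zero-shot classification: score each intent by cosine
    similarity between the query embedding and the intent-description embedding."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    scores = [cos(query_vec, v) for v in label_vecs]
    return labels[int(np.argmax(scores))]

labels = ["resume_help", "interview_prep"]
label_vecs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # toy intent embeddings
q = np.array([0.9, 0.2])  # toy query embedding, closer to resume_help
assert zero_shot_intent(q, label_vecs, labels) == "resume_help"
```

Because intents are matched by embedding similarity rather than a trained softmax head, new intents can be added by writing a description, with no retraining.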
NLP Research Assistant | Janus Lab
• Conducted exploratory data analysis (EDA) on 1K+ text files using regex, pandas, and stemming.
• Implemented transfer learning with transformer models such as BERT, alongside spaCy, to detect racial bias in each document.
• Enhanced a BART-based text summarization model to surface insights from interview transcripts.
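The regex-driven EDA pass in the first bullet might look like the following stdlib-only sketch (the function and sample sentence are hypothetical, not the lab's actual code):

```python
import re
from collections import Counter

def token_counts(text):
    """Lowercase, keep alphabetic runs, and count word frequencies (a simple EDA pass)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)

counts = token_counts("The data, the whole data, and nothing but the data.")
assert counts["data"] == 3 and counts["the"] == 3
```

Running this over each file and aggregating the counters gives the corpus-level term frequencies that guide later modeling choices.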
Software Engineer | Infosys
• Reduced CRUD extraction time by designing an automation framework and implementing a development tool in Python and SQL; earned an appreciation award for saving over 4 days of manual effort.
• Constructed Python scripts for data migration and cleaning, particularly for Teradata and IBM DB2 transfers.
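The migrate-and-clean pattern in the last bullet can be sketched with the stdlib `sqlite3` module standing in for the Teradata/DB2 connectors used in practice; table and column names are hypothetical:

```python
import sqlite3

# Source and destination databases (in-memory stand-ins for real warehouses)
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE users (id INTEGER, name TEXT)")
src.executemany("INSERT INTO users VALUES (?, ?)", [(1, "a"), (2, " b ")])
dst.execute("CREATE TABLE users (id INTEGER, name TEXT)")

# Clean (trim stray whitespace) while copying rows across connections
rows = [(i, n.strip()) for i, n in src.execute("SELECT id, name FROM users")]
dst.executemany("INSERT INTO users VALUES (?, ?)", rows)
dst.commit()
print(dst.execute("SELECT name FROM users WHERE id = 2").fetchone()[0])  # → b
```

Doing the cleaning inside the copy loop avoids a second pass over the destination tables, which matters when the transfer spans millions of rows.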