My Natural Language Processing (NLP) Journey till 2023
Abstract: NLP
Here are the seven paradigms for NLP, as defined by the literature, myself, and ChatGPT. The following list also highlights some of my research contributions corresponding to each paradigm.
- Rule-based systems: This approach involves manually defining rules and patterns to identify and extract information from text. This paradigm is often used for simple tasks such as text normalization, entity recognition, and text classification.
- Statistical methods with feature engineering: This approach involves using statistical models and machine learning algorithms along with handcrafted features to analyze and understand natural language. This paradigm is widely used in NLP applications and can be effective for a wide range of tasks.
- Deep learning with architecture engineering: This approach involves using deep neural networks to model complex relationships in language data, which often requires designing and tuning the architecture of the network. This paradigm has gained popularity in recent years due to its ability to handle large and complex datasets.
- Transformer-based models with objective engineering: This approach involves using transformer-based models, such as BERT or GPT, to learn representations of language, and then fine-tuning the model on specific language tasks with carefully chosen objectives. This paradigm has led to significant advances in NLP in recent years.
- [CIKM'21] QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction
- [ACL'21] Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data
- [NAACL'21] Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning
- Prompt-based learning with prompt engineering: This approach reformulates a task as a text prompt, optionally including a few example inputs and their corresponding outputs, so that a pretrained language model can solve the task with little or no fine-tuning. Adapting to a target task mainly means modifying the prompt rather than the model. This paradigm is useful for generating structured text and is commonly used in the context of question answering and dialogue systems.
- [NAACL'22 (Findings)] SEQZERO: Few-shot Compositional Semantic Parsing with Sequential Prompts and Zero-shot Models
- Reborn Generative AI: The generative AI paradigm of NLP involves developing models that can generate new and original text based on some input or prompt. These models are typically based on neural networks, such as language models or dialogue systems, and are trained on large amounts of text data to generate high-quality and diverse text.
- GPT: The first version of GPT (GPT-1) was released by OpenAI in June 2018. Since then, several updated versions of the GPT model have been released, including GPT-2 in February 2019 and GPT-3 in June 2020. The GPT-3 model is available for use through OpenAI's API, which allows developers to access the model's capabilities via a cloud-based service. However, the model is currently not available for download or for direct use outside of the API.
- [arXiv'20] On Data Augmentation for Extreme Multi-label Classification
- InstructGPT with RLHF: This approach fine-tunes a pretrained language model with reinforcement learning from human feedback (RLHF). Human annotators rank model outputs, a reward model is trained on those rankings, and the language model is then optimized against the reward model. This paradigm is focused on making the model follow user instructions rather than merely continue text.
- ChatGPT: ChatGPT is a language model developed by OpenAI that is designed to generate human-like responses to natural language inputs, with a focus on conversational applications such as chatbots and dialogue systems. As a language model, ChatGPT is capable of understanding the meaning and context of text inputs and generating coherent, fluent text in response.
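
To make the first two paradigms in the list more concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the price pattern, the function and class names (`extract_prices`, `featurize`, `NaiveBayes`), and the toy sentences are all hypothetical and are not taken from any of the papers listed above. The first part applies a hand-written rule, while the second part learns a classifier from handcrafted word-count features.

```python
import math
import re
from collections import Counter, defaultdict

# Paradigm 1: rule-based system.
# A hand-written pattern (an illustrative assumption) that extracts
# dollar amounts such as "$15" or "$9.99" from text.
PRICE_RULE = re.compile(r"\$\d+(?:\.\d{2})?")


def extract_prices(text):
    """Return every substring of `text` that matches the price rule."""
    return PRICE_RULE.findall(text)


# Paradigm 2: statistical method with feature engineering.
def featurize(text):
    """Handcrafted features: lowercase word unigrams."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(featurize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(n / total)
            for word in featurize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

With the toy training set `["great phone, love it", "terrible battery, hate it", "love the screen", "hate the camera"]` labeled `["pos", "neg", "pos", "neg"]`, the classifier labels `"love it"` as `pos`, and `extract_prices("The cable costs $9.99 and the case costs $15")` returns `['$9.99', '$15']`. The contrast is the point: the rule never changes unless someone edits it, whereas the classifier's behavior comes from the data and the choice of features.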