THESIS
2024
1 online resource (xv, 141 pages) : illustrations (chiefly color)
Abstract
Meta-learning aims to extract shared knowledge (meta-knowledge) from historical tasks to accelerate learning on new tasks. It has achieved promising performance in various applications, and many meta-learning algorithms have been developed to learn a meta-model that contains meta-knowledge (e.g., meta-initialization or meta-regularization) for task-specific learning procedures. In this thesis, we focus on meta-learning with complex tasks, where task-specific knowledge is diverse and several kinds of meta-knowledge are required.
First, we extend the learning of an efficient meta-regularization from linear models to nonlinear models by kernelized proximal regularization, allowing more powerful models such as deep networks to deal with complex tasks. Second, we model the task-specific parameters as a mixture of subspaces and propose a model-agnostic meta-learning algorithm to learn the subspace bases. Each subspace represents one type of meta-knowledge, and this structured meta-knowledge accelerates learning on complex tasks more effectively than a simple meta-model. Third, we propose an effective and parameter-efficient meta-learning algorithm for prompt tuning on natural language processing tasks. The proposed algorithm learns a pool of multiple meta-prompts to extract meta-knowledge from meta-training tasks and then constructs instance-dependent prompts as attention-weighted combinations of all the meta-prompts. Instance-dependent prompts are flexible and powerful for prompting complex tasks.
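The attention-weighted combination of meta-prompts described above can be illustrated with a minimal sketch. It assumes scaled dot-product attention between an instance embedding (`query`) and one key vector per meta-prompt; the function names, the use of separate keys, and the scaling are illustrative assumptions, not details taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def instance_prompt(query, keys, prompts):
    """Build one instance-dependent prompt from a pool of meta-prompts.

    query   -- instance embedding, list of d floats (assumed input)
    keys    -- K key vectors, one per meta-prompt (hypothetical design)
    prompts -- K meta-prompts, each a list of L token vectors of size d
    """
    scale = math.sqrt(len(query))
    # Attention score of the instance against each meta-prompt key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    n_tokens, dim = len(prompts[0]), len(prompts[0][0])
    # Weighted sum of all meta-prompts, token by token.
    combined = [
        [sum(w * p[t][j] for w, p in zip(weights, prompts)) for j in range(dim)]
        for t in range(n_tokens)
    ]
    return weights, combined
```

Because every meta-prompt contributes to the result, the pool is shared across all instances while the weights vary per instance.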
Next, we study mathematical reasoning tasks using large language models (LLMs). To verify the candidate answers generated by LLMs, we propose combining the meta-knowledge of forward and backward reasoning. Lastly, we propose question augmentation, which enlarges the set of training questions to strengthen the LLMs’ mathematical reasoning meta-knowledge. The original questions are augmented in two directions: in the forward direction, we rephrase the questions by few-shot prompting; in the backward direction, we mask a number in the question and create a backward question that asks for the masked number given the answer.
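The backward direction of the augmentation can be sketched as follows. This is an illustration under stated assumptions: the placeholder `x`, the wording of the appended sentence, and the helper name `make_backward_question` are hypothetical, and the regular expression only handles plain non-negative integers and decimals.

```python
import re

# Matches plain integers and decimals such as "5" or "3.75".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def make_backward_question(question, answer, index=0):
    """Mask the index-th number in `question` and ask for it given `answer`.

    Returns (backward_question, masked_number). The template wording below
    is an assumption for illustration, not the thesis's exact prompt.
    """
    matches = list(NUMBER.finditer(question))
    if not matches:
        raise ValueError("question contains no number to mask")
    target = matches[index]
    # Replace only the chosen occurrence, leaving other numbers intact.
    masked = question[:target.start()] + "x" + question[target.end():]
    backward = (
        f"{masked} If we know the answer to the above question is {answer}, "
        "what is the value of the unknown variable x?"
    )
    return backward, target.group()


if __name__ == "__main__":
    q = "Tom has 3 apples and buys 5 more. How many apples does he have?"
    backward, masked_number = make_backward_question(q, "8", index=1)
    print(backward)
    print(masked_number)
```

Each number in a question yields a different backward question, so a single original question can produce several augmented training examples.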