Prompt Tuning Learning (IV)
OpenPrompt
https://github.com/thunlp/OpenPrompt — a unified framework that ….
API design
Modularity, Flexibility, Uniformity
DEMO
Integrating several tutorials into one Colab notebook:
https://colab.research.google.com/drive/10syott1zXaQkjnlxOiSXKDFGy68SWR0y?usp=sharing
Prompt Tuning Learning (III)
Delta-Tuning: a small lever moving a large weight. Fine-tuning a small subset of parameters can drive optimization of the whole model, achieving performance close to full-parameter fine-tuning. Delta-Tuning asks: how do we adapt large-scale PLMs?
An efficient approach: Delta Tuning
• Update only a small number of the PLM's parameters
• Keep the rest of the PLM's parameters frozen
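The two bullets above can be sketched in a few lines: freeze a pretrained weight matrix and train only a small low-rank delta on top of it. This is a toy numpy illustration in the spirit of LoRA-style delta tuning, not any particular library's API; all shapes and names are made up.

```python
import numpy as np

# Illustrative sketch: a frozen "PLM" weight matrix plus a small trainable
# low-rank delta. In real delta tuning only A and B would receive gradients.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))        # pretrained weight, kept frozen
A = np.zeros((64, 4))                # trainable low-rank factors (the "delta")
B = rng.normal(size=(4, 64)) * 0.01

def forward(x):
    # Effective weight is W + A @ B; the PLM part W never changes.
    return x @ (W + A @ B)

frozen = W.size
trainable = A.size + B.size
print(trainable, frozen)  # 512 vs 4096: only ~12.5% of parameters are tuned
```

Because `A` starts at zero, the delta contributes nothing at initialization, so the model begins exactly at the pretrained solution.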
Why Does Parameter-Efficient Learning Work?
In the Past Era
Parameter-efficient learning was not feasible in the past, because all parameters were randomly initialized and therefore all had to be trained.
With Pre-training
Pre-training can learn universal knowledge
Adaptation to downstream tasks
Applying universal knowledge to specific tasks
Delta Tuning: Param ...
Prompt Tuning Learning (II)
Overview: this post organizes an understanding of Prompt Learning along two directions:
Task and data. Use Prompt Learning to improve few-shot ability (by adding extra context to the model's input) and to bridge the gap between model tuning and pre-training, enhancing the model's capability.
Optimization. Use delta learning, where updates to a small set of parameters drive the update of the whole, to stimulate billion-parameter-scale models while optimizing only a small fraction of their parameters.
Prompt-Learning: let us start by comparing against the basic fine-tuning paradigm. Roughly speaking, pre-training masks out a word, feeds the sequence into the encoder, and predicts what the masked word is. Fine-tuning instead feeds a sentence into the encoder and predicts the sentence's label through a task head, as in the figure below. Since we use PLMs as base encoders, add additional neural layers for specific tasks, and tune all the parameters, there is a GAP between pre-training and ...
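The gap described above is what prompt-learning closes: instead of training a new task head, the input is wrapped in a cloze template so the task looks like masked-word prediction again. A minimal sketch in plain Python; the template text and label words are made up for illustration.

```python
# Hypothetical template and verbalizer, for illustration only.
template = "{sentence} Overall, it was [MASK]."
label_words = {"great": "positive", "terrible": "negative"}  # verbalizer

def wrap(sentence: str) -> str:
    """Recast a classification input as a cloze (masked-word) prompt."""
    return template.format(sentence=sentence)

prompt = wrap("The movie was a delight.")
print(prompt)
# The PLM fills [MASK] with a word; the verbalizer maps that word to a label,
# so the pre-training objective is reused and no new task head is trained.
```

This is the core idea behind frameworks like OpenPrompt, which package the template and verbalizer as first-class, swappable modules.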
Prompt Tuning Learning (I)
Before Prompt-Based Learning:
A brief look at pretrained LMs: use a pre-trained LM as the initialization, then tune its parameters for the downstream task. In general, the larger the model and the more it is trained, the more knowledge it retains and the better its performance. Meanwhile, between 1990 and 2020 the number of published papers on pre-trained LMs grew from about 100 to over 3,400. (For a slightly more detailed introduction, see: TBD)
Problems with pretrained LMs: pre-trained LMs come with several problems and requirements. • 【Data Scarcity】 Fine-tuning requires annotated data. Although far less is needed than for pre-training, fine-tuning easily overfits the downstream task when the data set is too small. A reasonable amount of annotated data for a downstream task ranges from about 2.5K to 391K examples (e.g., MNLI: 391K; CoLA: 8.5K; MRPC: ...
CMU Coding Notes I
Algorithm Sketch for NN App Code
Create a model
For each example:
• create a graph that represents the computation you want
• calculate the result of that computation
If training:
• perform back propagation
• update parameters
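The sketch above can be made concrete with a tiny hand-written example: a one-parameter linear model trained by gradient descent in plain Python. The model, data, and learning rate are all illustrative, not from the course code.

```python
# Sketch of the loop above with a one-parameter linear model.
w = 0.5                               # create a model
data = [(1.0, 2.0), (2.0, 4.0)]       # toy (x, y) pairs with y = 2x

lr = 0.1
for _ in range(100):                  # training loop
    for x, y in data:                 # for each example
        pred = w * x                  # build the computation
        loss = (pred - y) ** 2        # calculate its result
        grad = 2 * (pred - y) * x     # back propagation (by hand here)
        w -= lr * grad                # update parameters

print(round(w, 3))  # prints 2.0
```

In a real library the graph construction and back propagation steps are handled by autograd; only the structure of the loop stays the same.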
Tensors and Numerical Computation
Numerical Computation Backend
• Most neural network libraries use a backend for numerical computation
• PyTorch/TensorFlow: MKL, CUDNN, custom-written kernels
• minnn: numpy/CuPy
import numpy as np
a = [[1, 0], [0 ...
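The snippet above is cut off; a complete minimal example of the same flavor (my own reconstruction, not the original course code) shows the numerical backend doing a matrix product:

```python
import numpy as np

# Build small matrices and multiply them on the numerical backend.
a = np.array([[1, 0], [0, 2]])
b = np.array([[3], [4]])
c = a @ b  # matrix multiplication, dispatched to the backend (MKL, etc.)
print(c)   # c.tolist() == [[3], [8]]
```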
(Encryption) Code Module I
Hey, password is required here.
Hello World
Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.
Quick Start
Create a new post
$ hexo new "My New Post"
More info: Writing
Run server
$ hexo server
More info: Server
Generate static files
$ hexo generate
More info: Generating
Deploy to remote sites
$ hexo deploy
More info: Deployment