Loading [MathJax]/extensions/TeX/AMSmath.js

中文分词词性和序列标注之CRF-4-查找最大概率序列

4 linear chain CRF Viterbi

超找给定 x 序列最大概率输出序列 y^\ast

HMM/MEMM viterbi类似的搜索方法,但在CRF中并不用计算规范化概率。
\begin{align} y^\ast &= \arg\max_y p(y|x) \\ &= \arg\max_y \frac{ \exp( w \cdot F(y,x) ) }{Z_w(x)} \\ &= \arg\max_y \exp( w \cdot F(y,x) ) \\ &= \arg\max_y w \cdot F(y,x) \\ &= \arg\max_y \sum_k w_k f_k(y, x) \\ &= \arg\max_y \sum_k w_k \sum_i f_k(y_{i-1}, y_i, x_i) \\ &= \arg\max_y \sum_i \sum_k w_k f_k(y_{i-1}, y_i, x_i) \\ &= \arg\max_y \sum_i f^\prime(y_{i-1}, y_i, x_i) \end{align}

f^\prime(y_{i-1}, y_i, x_i)=\sum_k w_k f_k(y_{i-1}, y_i, x_i)

也就是说只需计算连续最大特征函数求和路径。

可看到线行链中只有前后两个 y_{i-1}, y_i是互相依赖的,因此这里和HMM类似,只是把求积换成求和。

  1. 初始化:
    \begin{align} \delta_1(s_j) &= f^\prime(y_0=start, y_1=s_j, x_1) \\ \psi_1(s_j) &= 0 \end{align}

  2. 递归:
    \begin{align} \delta_i(s_l) &= \max_{s_j} \delta_{i-1}(s_j) + f^\prime(y_{i-1}=s_j, y_i=s_l, x_i) \\ \psi_i(s_l) &= \arg\max_{s_j} \delta_{i-1}(s_j) + f^\prime(y_{i-1}=s_j, y_i=s_l, x_i) \end{align}

  3. 结束:

\begin{align} \delta_{n+1}(s_l) &= \max_{s_j} \delta_{n}(s_j) + f^\prime(y_{n}=s_j, y_{n+1}=stop, x_{n+1}) \\ \psi_{n+1}(s_l) &= \arg\max_{s_j} \delta_{n}(s_j) + f^\prime(y_{n}=s_j, y_{n+1}=stop, x_{n+1}) \\ p^\ast &= \max_{s_j} \delta_{n+1}(s_j) \\ y_n^\ast &= \max_{s_j} \delta_{n+1}(s_j) \\ y_i^\ast &= \psi_{i+1}(y_{i+1}^\ast) \quad, i = n-1, n-2, \cdots, 1 \end{align}

参考资料:
[1] 李航; 统计机器学习
[2] Ethem Alpaydın: Introduction to Machine Learning Third Edition 15 Hidden Markov Models
[3] Lafferty, J., McCallum, A., Pereira, F. (2001). “Conditional random fields: Probabilistic models for segmenting and labeling sequence data”. Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann. pp. 282–289.
[4] An Introduction to Conditional Random Fields Charles Sutton1 and Andrew McCallum2

Leave a Reply

Your email address will not be published. Required fields are marked *

Time limit is exhausted. Please reload the CAPTCHA.

Proudly powered by WordPress   Premium Style Theme by www.gopiplus.com