## Friday, February 4, 2011

### Ordered Values and Mechanical Turk

\[
\begin{aligned}
P (L_{ij} = l > 0 | \beta_j, \tau_i) &= \frac{\exp (\sum_{k=1}^l (\beta_j - \tau_{ik}))}{1 + \sum_{x=1}^{|K|} \exp (\sum_{k=1}^x (\beta_j - \tau_{ik}))}, \\
P (L_{ij} = l = 0 | \beta_j, \tau_i) &= \frac{1}{1 + \sum_{x=1}^{|K|} \exp (\sum_{k=1}^x (\beta_j - \tau_{ik}))}.
\end{aligned}
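\]Here $\beta_j$ is a scalar latent value associated with the image, and $\tau_i$ is a vector of latent values associated with each worker. When $\beta_j = \tau_{ik}$ for some $k$, the worker is equally likely to assign labels $(k-1)$ and $k$ (while still having some probability of assigning other labels). Although the model does not force the $\tau_{ik}$ to be monotonically increasing, unordered thresholds are a sign of worker incoherence. This could be used, for instance, to identify adversarial workers and reject their work.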
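
A minimal sketch of this likelihood in Python may make the threshold behavior concrete. It assumes labels are the integers $0, \ldots, |K|$ and that `tau` lists one worker's thresholds $\tau_{i1}, \ldots, \tau_{i|K|}$; the function name `rasch_label_probs` is hypothetical, chosen only for illustration.

```python
import math

def rasch_label_probs(beta, tau):
    """Return [P(L=0), ..., P(L=K)] for a scalar beta and thresholds tau (length K)."""
    # logits[l] = sum_{k=1}^{l} (beta - tau_k); the empty sum makes logits[0] = 0,
    # which corresponds to the "1" in the numerator of the l = 0 case.
    logits = [0.0]
    for t in tau:
        logits.append(logits[-1] + (beta - t))
    # Subtract the maximum before exponentiating, for numerical stability.
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative values only: beta sits exactly on the second threshold,
# so labels 1 and 2 receive the same probability.
probs = rasch_label_probs(beta=1.5, tau=[0.5, 1.5, 2.5])
print(probs)
```

With `beta` equal to the second threshold, the entries for labels 1 and 2 coincide, which is the "equally likely" property described above.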

Polytomous Rasch would be a great choice when the latent space is fundamentally unobservable. For instance, if I were asking Mechanical Turk to rate people's attractiveness, I wouldn't care much about the magnitudes of the latent variables $\beta_j$, only their relative order, deciles, etc. After all, there is no objective sense in which someone is actually "a 7". However, in my case there is an actual true age associated with the subject of each photo, and using polytomous Rasch directly would leave me with the problem of relating the scalar latent value $\beta_j$ to the true age bucket $Z_j$ (which so far does not appear in the likelihood term at all). To circumvent this problem I'll force the relationship between the two, $\beta_j = \alpha_j Z_j$, where $\alpha_j > 0$ is a per-image scaling parameter. I'll scale the $\tau$ by the same $\alpha_j$ to ease the prior specification, in which case $\alpha_j$ is essentially an image difficulty parameter. Now my label likelihood is given by \[
\begin{aligned}
P (L_{ij} = l > 0 | Z_j, \alpha_j, \tau_i) &= \frac{\exp \left( \sum_{k=1}^l \alpha_j (Z_j - \tau_{ik}) \right)}{1 + \sum_{x=1}^{|K|} \exp \left( \sum_{k=1}^x \alpha_j (Z_j - \tau_{ik}) \right)}, \\
P (L_{ij} = l = 0 | Z_j, \alpha_j, \tau_i) &= \frac{1}{1 + \sum_{x=1}^{|K|} \exp \left( \sum_{k=1}^x \alpha_j (Z_j - \tau_{ik}) \right)}.
\end{aligned}
\]Now I can reuse the same strategy as in nominal label extraction, optimizing $Z_j$ in the E step and the other parameters in the M step. I'll also introduce a hyperprior over $\tau$ and $\alpha$, for reasons analogous to the nominal case. The complete model looks like this: \[
\ begin {aligned}
\ gamma_k＆\ sim N（k-\ frac {1} {2}，1），\\
\ tau_ {ik}＆\ sim N（\ gamma_k，1），\\
\ kappa＆\ sim N（1，1），\\
\ log \ alpha_j＆\ sim N（\ kappa，1），\\
P（L_ {ij} = l | Z_j，\ alpha_j，\ tau_i）＆\ propto \ exp \ left（\ sum_ {k = 1} ^ l \ alpha_j（Z_j-\ tau_ {ik}）\ right）。
\ end {aligned}
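\]The $1/2$ in the prior mean of $\gamma_k$ is there because the thresholds are the points at which the probability of emitting label $(k-1)$ equals that of emitting label $k$.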
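
The scaled likelihood and the update for $Z_j$ can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation used for the experiments: it computes a normalized posterior over $Z_j$ with $\alpha_j$ and the $\tau_i$ held fixed, it assumes a uniform prior over $Z_j \in \{0, \ldots, |K|\}$ unless one is supplied, and the names `label_probs` and `posterior_over_z` are hypothetical.

```python
import math

def label_probs(z, alpha, tau):
    """Return [P(L=0), ..., P(L=K)] given true bucket z, image scale alpha, thresholds tau."""
    # logits[l] = sum_{k=1}^{l} alpha * (z - tau_k); logits[0] = 0 (empty sum).
    logits = [0.0]
    for t in tau:
        logits.append(logits[-1] + alpha * (z - t))
    m = max(logits)  # stabilize the softmax
    weights = [math.exp(x - m) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]

def posterior_over_z(labels, alpha, taus, log_prior=None):
    """Posterior over Z_j for one image.

    labels: dict mapping worker id -> observed label for this image
    taus:   dict mapping worker id -> that worker's threshold list
    """
    num_values = len(next(iter(taus.values()))) + 1
    if log_prior is None:
        log_prior = [0.0] * num_values  # assumption: uniform prior over Z_j
    log_post = []
    for z in range(num_values):
        lp = log_prior[z]
        for worker, label in labels.items():
            lp += math.log(label_probs(z, alpha, taus[worker])[label])
        log_post.append(lp)
    m = max(log_post)
    weights = [math.exp(x - m) for x in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative values only: three workers whose thresholds sit at the
# hyperprior means k - 1/2, and alpha fixed at 1.
taus = {w: [0.5, 1.5, 2.5] for w in ("a", "b", "c")}
labels = {"a": 2, "b": 2, "c": 1}
print(posterior_over_z(labels, alpha=1.0, taus=taus))
```

Placing the thresholds at $k - \frac{1}{2}$ means an integer $Z_j$ falls midway between two adjacent thresholds, so the label equal to $Z_j$ is the most probable emission. Larger values of `alpha` sharpen the label distribution around $Z_j$, which is why $\alpha_j$ behaves like an (inverse) difficulty for the image.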