Saturday, December 29, 2012

Paraphrasing for Automatic Evaluation, Regina Barzilay: types of paraphrasing

http://acl.ldc.upenn.edu/N/N06/N06-1058.pdf

The most notable examples in this category include measures such as BLEU and ROUGE.


In this paper, we explore the use of paraphrasing
methods for refinement of automatic evaluation
techniques. Given a reference sentence and a
machine-generated sentence, we seek to find a paraphrase
of the reference sentence that is closer in
wording to the machine output than the original reference.
For instance, given the pair of sentences in
Table 1, we automatically transform the reference
sentence (1a.) into
"However, Israel's answer failed to completely remove the U.S. suspicions."

Candidate paraphrase

Next, the algorithm tests whether
the candidate paraphrase is admissible in the context
of the reference sentence. Since even synonyms
cannot be substituted in any context (Edmonds and
Hirst, 2002), this filtering step is necessary. We predict
whether a word is appropriate in a new context
by analyzing its distributional properties in a large
body of text. Finally, paraphrases that pass the filtering
stage are used to rewrite the reference sentence.
We apply our paraphrasing method in the context
of machine translation evaluation. Using this strategy,
we generate a new sentence for every pair of
human and machine translated sentences. This synthetic
reference then replaces the original human reference
in automatic evaluation.
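
The steps described above (propose candidate paraphrases, filter them by context, rewrite the reference) can be sketched roughly as follows. This is only a minimal illustration, not the paper's implementation: the function name `synthetic_reference`, the `paraphrases` dictionary, and the `is_admissible` callback are all hypothetical names introduced here, and the contextual filter is left as a stub that the caller supplies.

```python
def synthetic_reference(reference, candidate, paraphrases, is_admissible):
    """Rewrite reference words toward the wording of the machine output.

    paraphrases:   dict mapping a word to the set of words treated as its
                   paraphrases (hypothetical resource, e.g. synonyms).
    is_admissible: callable(new_word, position, reference_tokens) -> bool,
                   a stand-in for the contextual filtering step.
    """
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    ref_words = set(ref_tokens)
    cand_words = set(cand_tokens)

    # Candidate words that the reference does not already contain.
    unmatched = [w for w in cand_tokens if w not in ref_words]

    result = list(ref_tokens)
    for i, ref_word in enumerate(ref_tokens):
        if ref_word in cand_words:
            continue  # this reference word already matches the machine output
        for cand_word in unmatched:
            if (cand_word in paraphrases.get(ref_word, set())
                    and is_admissible(cand_word, i, ref_tokens)):
                result[i] = cand_word       # substitute the paraphrase
                unmatched.remove(cand_word)  # each candidate word is used once
                break
    return " ".join(result)


# Toy data, invented for illustration only.
toy_paraphrases = {"reply": {"answer"}, "clear": {"remove"}}
accept_all = lambda word, position, tokens: True  # placeholder filter
print(synthetic_reference("the reply failed to clear the doubts",
                          "the answer unable to remove the doubts",
                          toy_paraphrases, accept_all))
```

With a filter that rejects every substitution, the function returns the reference unchanged, which is the conservative behaviour one would expect from the filtering stage.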



Score for determining the quality of the paraphrasing

BLEU
BLEU is the basic evaluation measure that we use
in our experiments. It is the geometric average of
the n-gram precisions of candidate sentences with
respect to the corresponding reference sentences,
times a brevity penalty. The BLEU score is computed
as follows:
$$\mathrm{BLEU} = \mathrm{BP} \cdot \sqrt[4]{\prod_{n=1}^{4} p_n}$$
$$\mathrm{BP} = \min\left(1, e^{1 - r/c}\right),$$
where $p_n$ is the n-gram precision, $c$ is the cardinality
of the set of candidate sentences and $r$ is the size of
the smallest set of reference sentences.
To augment BLEU evaluation with paraphrasing
information, we substitute each reference with the
corresponding synthetic reference.
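
As a rough sketch of the formula above, the following computes BLEU over a list of candidate sentences with one reference each. Several choices here are assumptions not stated in the excerpt: tokens are obtained by whitespace splitting, n-gram counts are clipped against the reference as in the standard BLEU formulation, and c and r are read as the total candidate and reference lengths in tokens. The function names `ngrams` and `bleu` are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidates, references, max_n=4):
    """Corpus-level BLEU with a single reference per candidate (sketch)."""
    matched = [0] * max_n  # clipped n-gram matches, per n
    total = [0] * max_n    # candidate n-grams, per n
    c = r = 0              # assumed: total candidate / reference length in tokens
    for cand, ref in zip(candidates, references):
        cand_tokens, ref_tokens = cand.split(), ref.split()
        c += len(cand_tokens)
        r += len(ref_tokens)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            matched[n - 1] += sum(min(count, ref_counts[gram])
                                  for gram, count in cand_counts.items())
            total[n - 1] += sum(cand_counts.values())
    if min(matched) == 0:
        return 0.0  # some p_n is zero, so the geometric mean is zero
    # Geometric mean of p_1..p_max_n, computed in log space.
    log_mean = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    brevity_penalty = min(1.0, math.exp(1 - r / c))
    return brevity_penalty * math.exp(log_mean)
```

To score against synthetic references under this sketch, one would pass the rewritten references as the `references` argument in place of the original human ones.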




ALGORITHMS THAT PERFORM PARAPHRASING. NOTE THE BROWN ONE; THE LSA ONE I HAVE ALREADY INVESTIGATED EXTENSIVELY

Latent Semantic Analysis
(LSA), and Brown clustering (Brown et al., 1992).
These techniques are widely used in NLP applications,
including language modeling, information extraction,
and dialogue processing (Haghighi et al.,
2005; Serafin and Eugenio, 2004; Miller et al.,
2004). Both techniques are based on distributional
similarity. The Brown clustering is computed by
considering mutual information between adjacent
words. LSA is a dimensionality reduction technique
that projects a word co-occurrence matrix to lower
dimensions. This lower dimensional representation
is then used with standard similarity measures to
cluster the data. Two words are considered to be a
paraphrase pair if they appear in the same cluster.
We construct 1000 clusters employing the Brown
method on 112 million words from the North American
New York Times corpus. We keep the top 20
most frequent words for each cluster as paraphrases.
To generate LSA paraphrases, we used the Infomap
software on a 34 million word collection of articles
from the American News Text corpus. We used
the default parameter settings: a 20,000 word vocabulary,
the 1000 most frequent words (minus a stoplist)
for features, a 15 word context window on either
side of a word, a 100 feature reduced representation,
and the 20 most similar words as paraphrases.
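
The two lookup rules in this passage (same cluster means paraphrase pair, keeping the 20 most frequent words per cluster; and the 20 most similar words in the reduced space) can be illustrated with a small sketch. It does not perform Brown clustering or LSA itself: it assumes a word-to-cluster assignment and reduced word vectors have already been produced by some external tool, and it assumes cosine as the similarity measure. All function and parameter names are hypothetical.

```python
import math
from collections import defaultdict


def cluster_paraphrases(word_to_cluster, word_counts, top_k=20):
    """Map each kept word to the other kept words of its cluster."""
    members = defaultdict(list)
    for word, cluster_id in word_to_cluster.items():
        members[cluster_id].append(word)
    table = {}
    for words in members.values():
        # Keep the top_k most frequent words; ties are broken alphabetically.
        kept = sorted(words, key=lambda w: (-word_counts.get(w, 0), w))[:top_k]
        for word in kept:
            table[word] = set(kept) - {word}
    return table


def is_paraphrase_pair(a, b, table):
    """Two words form a paraphrase pair if they share a (pruned) cluster."""
    return b in table.get(a, set())


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, top_k=20):
    """Return the top_k words closest to `word` in the reduced space."""
    scores = [(cosine(vectors[word], vec), other)
              for other, vec in vectors.items() if other != word]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scores[:top_k]]
```

For example, with toy clusters `{"reply": 0, "answer": 0, "clear": 1}`, the words "reply" and "answer" would count as a paraphrase pair while "reply" and "clear" would not.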
