HLT Phonetic Scorer

This package provides a utility that computes four phonetic scores (rhyme, alliteration, plosive, and homogeneity) for tokenized sentences.

Requirement

Java version >= 1.8

Usage

java -jar eu.fbk.hlt.phonetics.PhoneticScorer.jar [args]

where args is one of:

i) -f <inputFile> <outputFile>

For each line of the input file, it computes the phonetic scores and writes them to the output file as one tab-separated record per line (Text\tRhyme Score\tAlliteration Score\tPlosive Score\tHomogeneity Score\n).
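The output records described above can be read back with a few lines of Java. The following is a minimal sketch, not part of the package: the class name ScoreLine is hypothetical, and it assumes that the four score fields are plain decimal numbers and that the output file has no header row, neither of which is stated in this README.

```java
public class ScoreLine {
    public final String text;
    public final double rhyme;
    public final double alliteration;
    public final double plosive;
    public final double homogeneity;

    private ScoreLine(String text, double rhyme, double alliteration,
                      double plosive, double homogeneity) {
        this.text = text;
        this.rhyme = rhyme;
        this.alliteration = alliteration;
        this.plosive = plosive;
        this.homogeneity = homogeneity;
    }

    // Parses one record: Text\tRhyme\tAlliteration\tPlosive\tHomogeneity.
    // Assumption: the score fields are parseable by Double.parseDouble.
    public static ScoreLine parse(String line) {
        // A limit of -1 keeps trailing empty fields, so malformed lines are detected.
        String[] fields = line.split("\t", -1);
        if (fields.length != 5) {
            throw new IllegalArgumentException(
                "Expected 5 tab-separated fields, got " + fields.length);
        }
        return new ScoreLine(
            fields[0],
            Double.parseDouble(fields[1]),
            Double.parseDouble(fields[2]),
            Double.parseDouble(fields[3]),
            Double.parseDouble(fields[4]));
    }

    public static void main(String[] args) {
        // Made-up record for illustration only; the values are not real scorer output.
        ScoreLine s = parse("she sells sea shells\t0.5\t0.75\t0.25\t1.0");
        System.out.println(s.text + " -> rhyme score " + s.rhyme);
    }
}
```

Rejecting lines that do not have exactly five fields makes it easy to spot input text that itself contained a tab character.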

ii) -s <input string (with quotations)>

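It outputs the phonetic scores for the input string. Enclose the string in quotation marks so that the shell passes it to the program as a single argument.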
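When the scorer is launched from another Java program rather than from a shell, the -s mode can be driven with ProcessBuilder. The sketch below is an illustration only: the class ScorerLauncher and its method commandForString are hypothetical names, and the jar path is whatever location the jar has on your machine. Because each list element is passed to the program as one argument, no extra quotation marks are needed around the sentence.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class ScorerLauncher {
    // Builds the argument list for: java -jar <jarPath> -s <sentence>
    public static List<String> commandForString(String jarPath, String sentence) {
        return Arrays.asList("java", "-jar", jarPath, "-s", sentence);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length < 2) {
            System.err.println("usage: ScorerLauncher <path to jar> <tokenized sentence>");
            return;
        }
        List<String> command = commandForString(args[0], args[1]);
        // inheritIO() forwards the scorer's output to this process's console.
        Process process = new ProcessBuilder(command).inheritIO().start();
        System.exit(process.waitFor());
    }
}
```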

iii) -i

Interactive mode. It reads standard input one line at a time and outputs the phonetic scores for each line.

Please note that in all three cases the program expects already tokenized text in which the tokens are separated by spaces. If your text is not tokenized, please tokenize it before passing it to the phonetic scorer.
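If no tokenizer is at hand, a rough one can be sketched in Java as below. This is an assumption-laden illustration and not the tokenizer used by the authors: the class SimpleTokenizer is hypothetical, it only splits punctuation from words, and it keeps contractions such as "didn't" as one token, which may or may not match the entries of the pronunciation dictionary. A proper NLP tokenizer is likely a better choice for real data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleTokenizer {
    // A token is either a run of letters/digits (optionally with internal
    // apostrophes, e.g. "didn't") or a single non-space, non-alphanumeric symbol.
    private static final Pattern TOKEN =
        Pattern.compile("[\\p{L}\\p{N}]+(?:'\\p{L}+)*|[^\\s\\p{L}\\p{N}]");

    // Returns the tokens of the input joined by single spaces.
    public static String tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        // prints: Peter Piper picked a peck , didn't he ?
        System.out.println(tokenize("Peter Piper picked a peck, didn't he?"));
    }
}
```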

Please refer to the examples directory and the Javadoc for usage examples.

Terms of Use

The package includes two external resources, the CMU pronunciation dictionary and the Variant Conversion Info (VarCon) lexicon, both of which are free to use for non-commercial applications. The respective licences are available in the src/resources/lexical directory.

The phonetic scorers are described and used in the following paper:

@InProceedings{ozbal-pighin-strapparava:2013:ACL2013,
  author    = {\"{O}zbal, G\"{o}zde and Pighin, Daniele and Strapparava, Carlo},
  title     = {BRAINSUP: Brainstorming Support for Creative Sentence Generation},
  booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  month     = {August},
  year      = {2013},
  address   = {Sofia, Bulgaria},
  publisher = {Association for Computational Linguistics},
  pages     = {1446--1455},
  url       = {http://www.aclweb.org/anthology/P13-1142}
}

If you use this package in your research, please cite this paper.

The package is free to use for non-commercial purposes. It is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Download: Phonetic-scorer