Simple Incremental Update Model (SIUM)

SIUM has been the focus of part of my PhD and is explained in some of our papers. I have a Java implementation of it, with example code showing how to use it. The model isn't hard to implement yourself, but if your project is already in Java, my implementation can be incorporated easily. If you want a copy, please contact me directly.


InproTK is a toolkit for incremental processing. It is chiefly used in dialogue research, where speech recognition, syntactic parsing, semantic representation, natural language understanding, and speech synthesis are done incrementally, usually word by word, instead of end-pointing on sentences or higher-level fragments.
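The difference between end-pointed and word-by-word processing can be illustrated with a minimal Python sketch. This is not InproTK's API (the toolkit is written in Java); the function names and the toy keyword spotter below are hypothetical and only show the general idea of producing an updated output after every word.

```python
def endpointed(words, process):
    """Wait for the complete utterance, then process it once."""
    return [process(list(words))]


def incremental(words, process):
    """Process the growing prefix again after every new word."""
    prefix = []
    outputs = []
    for word in words:
        prefix.append(word)
        # Each step sees only the words heard so far.
        outputs.append(process(list(prefix)))
    return outputs


def toy_nlu(prefix):
    """Hypothetical stand-in for an understanding module: keep known keywords."""
    known = {"red", "block"}
    return [w for w in prefix if w in known]


utterance = ["take", "the", "red", "block"]
print(endpointed(utterance, toy_nlu))   # one result, after the last word
print(incremental(utterance, toy_nlu))  # one result per word
```

In the incremental case a downstream module can already react to "red" before the utterance is finished, which is the motivation for processing word by word.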

The theory is based on this article, which defines an incremental framework for dialogue processing. More reading on the toolkit itself can be found here. The toolkit was written primarily by Timo Baumann, but I include it here because I use it a lot in my research. It is written in Java, and I found it very intuitive: the demos are easy to set up and run, and it took me no more than 25 minutes to write my own modules and see them work together.

InproTK (hosted at SourceForge)


Suffix Tree Language Model (STLM)

STLM is written in C++. It requires xmlrpc++, protobuf, zlib, and cpptest, all of which are included in the tarball. It still has some minor bugs, but it works overall as long as you have enough memory. I also have a Java version in jar form; contact me if you are interested.

Please refer to this paper if you use it. As far as licensing goes, I make no guarantees whatsoever and accept no blame for anything.


This is the version of Moses that can run STLM directly. When running configure, pass --with-stlm=/path/to/stlm and then run make as usual. The language model ID is 10. Everything should be set up for use in the experiment management system.


Troubleshooting: If you get a segfault as soon as you invoke the ./bin/stlm binary, comment out line 25 in Text.cpp (vocba[ROOT] = index++).


Multiple-Infer Alchemy

This is a version of Alchemy (Markov logic networks) that can be invoked once and then queried multiple times via an XML-RPC service. After downloading, run make in the src folder, as usual. Then invoke infer with the normal arguments, including the query parameters and the evidence file. The server always reads from the same evidence file, so that file will need to be updated each time you run inference. You can set the port with the -port flag.

Below is a sample Python client that tells the server to run inference. The service returns the results to the client, which prints them out. The results contain simple markup to help with parsing.

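import xmlrpc.client  # Python 3; in Python 2 this module was named xmlrpclib
# Connect to the running infer server (3001 is the port used in this example).
proxy = xmlrpc.client.ServerProxy("http://localhost:3001")
# Ask the server to run inference and print the marked-up results it returns.
print(proxy.infer([True]))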

The [True] parameter tells the server to reset the evidence to the original trained MLN file (as specified when infer was invoked). False, then, keeps all evidence accumulated up to that point (this is not really well tested). Note that the server takes quite a bit of memory. I had a fairly large trained MLN file (it took about 30 seconds to load) and was able to run inference on it about one thousand times before the server ran out of memory. That is enough for me, so I'm not going to spend any more time on it.
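Since the point of the service is to query one running server repeatedly, a small helper along the following lines may be convenient. It is only a sketch: the helper name run_inference and its parameters are hypothetical, and the only server call it relies on is the infer method with the one-element list argument shown in the sample client. The results are returned unparsed because the markup format is not described here.

```python
def run_inference(proxy, rounds, reset_evidence=True):
    """Call the server's infer method `rounds` times and collect the raw results.

    `proxy` is any object exposing infer(), e.g. an XML-RPC ServerProxy.
    """
    results = []
    for _ in range(rounds):
        # The argument is a one-element list holding the reset flag,
        # mirroring the [True] parameter of the sample client.
        results.append(proxy.infer([reset_evidence]))
    return results


# Example usage (assumes an infer server is already listening on port 3001):
#   import xmlrpc.client
#   proxy = xmlrpc.client.ServerProxy("http://localhost:3001")
#   for raw in run_inference(proxy, rounds=3):
#       print(raw)
```

Given the memory growth noted above, it is probably wise to keep the number of rounds per server process well below the point where the server ran out of memory, and to restart the server between long batches.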