Saturday, April 14, 2007

Integer-Based Representation

I'm working with integer-based (numerical) representations for triples and quadruples. Basically, this approach uses bitfields instead of URIs. I designed it to be computationally faster than string-based formats, and the bitfields allow differentiation between tense negation, logical negation, set complementation, and other operations on entities and relation types. I'm also working on some tools to convert XML-based data to and from this format.
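To make the bitfield idea concrete, here is a minimal sketch in Python. The layout is purely an assumption for illustration (three low-order flag bits, with the entity or relation ID stored in the remaining high bits), and the names `Op`, `encode`, and `decode` are hypothetical, not the actual format.

```python
from enum import IntFlag
from typing import Tuple


class Op(IntFlag):
    """Hypothetical operation flags stored in the low bits of a term."""
    NONE = 0
    LOGICAL_NOT = 1   # logical negation
    TENSE_NOT = 2     # tense negation
    COMPLEMENT = 4    # set complement


FLAG_BITS = 3                       # assumed width of the flag field
FLAG_MASK = (1 << FLAG_BITS) - 1    # 0b111


def encode(term_id: int, ops: Op = Op.NONE) -> int:
    """Pack a non-negative ID and its operation flags into one integer."""
    if term_id < 0:
        raise ValueError("term_id must be non-negative")
    return (term_id << FLAG_BITS) | int(ops)


def decode(code: int) -> Tuple[int, Op]:
    """Split a packed integer back into (ID, flags)."""
    return code >> FLAG_BITS, Op(code & FLAG_MASK)


# A triple is then just three integers; the IDs here are made up.
subject = encode(17)
relation = encode(42, Op.LOGICAL_NOT)
obj = encode(99, Op.COMPLEMENT)
triple = (subject, relation, obj)
```

With a layout like this, comparing two terms while ignoring their flags is a single shift, and testing for a particular kind of negation is a single mask, which is where the speed advantage over string comparison would come from.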

Also, n-ary predicates can be converted to a set of triples (or quadruples) by combinatorially relating the n arguments pairwise, using binary subpredicates. This could be of use in converting sentence predicates into a triples-based language.
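The pairwise conversion can be sketched as follows. This is only an illustration: the function name `nary_to_triples`, the role labels, and the `predicate:roleA-roleB` naming scheme for subpredicates are all assumptions, and strings stand in for the integer codes to keep the example readable.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def nary_to_triples(predicate: str, args: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Relate every pair of arguments with a binary subpredicate.

    `args` maps a role label to its filler. An n-ary predicate
    yields n * (n - 1) / 2 triples, one per unordered pair of roles.
    """
    triples = []
    for (role_a, val_a), (role_b, val_b) in combinations(args.items(), 2):
        # Hypothetical naming scheme for the binary subpredicate.
        subpredicate = f"{predicate}:{role_a}-{role_b}"
        triples.append((val_a, subpredicate, val_b))
    return triples


# "John gave Mary a book" as a 3-ary predicate.
triples = nary_to_triples(
    "give",
    {"agent": "John", "theme": "book", "recipient": "Mary"},
)
for t in triples:
    print(t)
```

For the three-argument example this produces three triples: agent-theme, agent-recipient, and theme-recipient. The count grows quadratically with the number of arguments, which is worth keeping in mind for large frames.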

A rule system will probably be required to capture overlap between the subpredicates of different predicates, which is unfortunate, as there are upwards of 4,527 frames in FrameNet and 3,635 in PropBank. Relating the syntactic elements of n-ary predicates or frames pairwise in this way should result in more useful structured knowledge. This is a first attempt at a post-SRL/pre-NLG format, that is, one that sits between semantic role labeling and natural language generation.
