Tuesday, May 1, 2007

Foundations of Machine Reading/Writing

[Leibniz][took up][the question][in][his baccalaureate thesis], and [argued][in][the true scholastic style][for][a principle of individuation] which [would preserve][the independence of universals] [with respect to][ephemeral sensations], and [yet][embodied][universal ideas][in][the eternal natures of individuals].

Let us label each bracketed chunk with a letter, A through S.
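
As a minimal sketch of this labeling step, assuming the chunker has already marked chunks with square brackets as in the sentence above, the letters can be assigned in order of appearance. The names `SENTENCE` and `label_chunks` are made up for illustration.

```python
import re
import string

# The chunked sentence from above, split across lines only for readability.
SENTENCE = (
    "[Leibniz][took up][the question][in][his baccalaureate thesis], and "
    "[argued][in][the true scholastic style][for][a principle of individuation] "
    "which [would preserve][the independence of universals] [with respect to]"
    "[ephemeral sensations], and [yet][embodied][universal ideas][in]"
    "[the eternal natures of individuals]."
)

def label_chunks(text):
    """Map each bracketed chunk, in order of appearance, to an uppercase letter."""
    chunks = re.findall(r"\[([^\]]+)\]", text)
    if len(chunks) > len(string.ascii_uppercase):
        raise ValueError("more chunks than single-letter labels")
    # zip stops at the shorter sequence, so only as many letters as chunks are used.
    return dict(zip(string.ascii_uppercase, chunks))

labels = label_chunks(SENTENCE)
for letter, chunk in labels.items():
    print(letter, chunk)
```

Single letters only cover 26 chunks; a longer sentence would need a different labeling scheme.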

With the [yet] (O), we would have:

D(B(A,C),E)
{

G(F+I(A,J),H)
K(J,M(L,N))

}
O
{

P(A,R(Q,S))

}

Theoretically resembling:

D(B(A,C),E)
G(F+I(A,J),H)
K(J,M(L,N))
P(A,R(Q,S))
O(G(F+I(A,J),H),P(A,R(Q,S)))
O(K(J,M(L,N)),P(A,R(Q,S)))
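
One possible reading of the braces, sketched here as an assumption rather than a settled rule, is that O pairs every predicate in the group before it with every predicate in the group after it. The function name `distribute` and the use of plain strings for predicates are illustrative choices.

```python
from itertools import product

# Predicates on either side of O, written as plain strings for simplicity.
left_group = ["G(F+I(A,J),H)", "K(J,M(L,N))"]
right_group = ["P(A,R(Q,S))"]

def distribute(op, left, right):
    """Apply op to every (left, right) pair of predicates."""
    return [f"{op}({l},{r})" for l, r in product(left, right)]

expanded = distribute("O", left_group, right_group)
for predicate in expanded:
    print(predicate)
```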

But let us set O aside, calling its processing a metaoperation, and look at the following four predicates.

D(B(A,C),E)
G(F+I(A,J),H)
K(J,M(L,N))
P(A,R(Q,S))

So, in this paradigm of machine reading, the goal is to turn a sequence of chunks into a predicate structure, keeping whatever internal state is necessary to manage the noun chunks during the process. Transitions between states may be triggered by both chunked and non-chunked tokens.
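
To make the distinction between chunked and non-chunked tokens concrete, here is a rough sketch of a tokenizer that keeps both kinds in reading order. It assumes square brackets mark chunks; the tags "chunk" and "bare" and the name `tokenize` are hypothetical.

```python
import re

# A token is either a bracketed chunk or a run of non-space, non-bracket characters.
TOKEN = re.compile(r"\[([^\]]+)\]|([^\[\]\s]+)")

def tokenize(text):
    """Return (kind, text) pairs, where kind is 'chunk' or 'bare'."""
    tokens = []
    for match in TOKEN.finditer(text):
        if match.group(1) is not None:
            tokens.append(("chunk", match.group(1)))
        else:
            tokens.append(("bare", match.group(2)))
    return tokens

tokens = tokenize(
    "[Leibniz][took up][the question], and [argued] which [would preserve]."
)
bare = [text for kind, text in tokens if kind == "bare"]
chunked = [text for kind, text in tokens if kind == "chunk"]
print(bare)
```

A state machine reading this stream could then switch state on a bare token such as "and" or "which" just as it does on a chunk.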

How would we describe the process of building these four predicates?

Start a new predicate with A as the left argument.
Name that predicate B.
Place C in the right argument of that predicate.
Move that predicate into the left position of a new predicate.
Name that predicate D.
Place E in the right argument of that predicate.
Start a new predicate with A as the left argument.
Name that predicate F.
Move that predicate into the left position of a new predicate.
Name that predicate G.
Place H in the right argument of that predicate.
Now back to the last predicate that we just moved into the left position of this predicate.
Add I to that predicate's label.
Place J in the right argument of that predicate.
Start a new predicate with J as the left argument.
Name that predicate K.
The next statement is set in the right argument of that predicate.
Start a new predicate with L as the left argument.
Name that predicate M.
Place N in the right argument of that predicate.
Start a new predicate with A as the left argument.
Name that predicate P.
The next statement is set in the right argument of that predicate.
Start a new predicate with Q as the left argument.
Name that predicate R.
Place S in the right argument of that predicate.
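
The instructions above can be sketched as a tiny interpreter. This is only one way to model them: the operation names (`start`, `name`, `place`, `wrap`, `back`, `add_label`, `nest`) and the classes `Pred` and `Builder` are invented here, each standing for one of the English instruction forms.

```python
class Pred:
    """A two-place predicate: label(left, right)."""
    def __init__(self, left=None):
        self.label = ""
        self.left = left
        self.right = None

    def __str__(self):
        return f"{self.label}({self.left},{self.right})"


class Builder:
    def __init__(self):
        self.roots = []      # top-level predicates, in order of creation
        self.current = None  # "that predicate" in the instructions
        self.inner = None    # last predicate moved into a left position
        self.pending = None  # predicate whose right argument is the next statement

    def start(self, arg):    # Start a new predicate with arg as the left argument.
        pred = Pred(arg)
        if self.pending is not None:
            self.pending.right = pred
            self.pending = None
        else:
            self.roots.append(pred)
        self.current = pred

    def name(self, label):   # Name that predicate.
        self.current.label = label

    def place(self, arg):    # Place arg in the right argument of that predicate.
        self.current.right = arg

    def wrap(self):          # Move that predicate into the left position of a new one.
        outer = Pred(self.current)
        # Assumes the wrapped predicate is top-level, as it is in this sentence.
        self.roots[self.roots.index(self.current)] = outer
        self.inner, self.current = self.current, outer

    def back(self):          # Go back to the predicate just moved to the left.
        self.current = self.inner

    def add_label(self, label):  # Add label to that predicate's label.
        self.current.label += "+" + label

    def nest(self):          # The next statement is set in the right argument.
        self.pending = self.current


PROGRAM = [
    ("start", "A"), ("name", "B"), ("place", "C"), ("wrap",), ("name", "D"), ("place", "E"),
    ("start", "A"), ("name", "F"), ("wrap",), ("name", "G"), ("place", "H"),
    ("back",), ("add_label", "I"), ("place", "J"),
    ("start", "J"), ("name", "K"), ("nest",),
    ("start", "L"), ("name", "M"), ("place", "N"),
    ("start", "A"), ("name", "P"), ("nest",),
    ("start", "Q"), ("name", "R"), ("place", "S"),
]

def run(program):
    builder = Builder()
    for op, *args in program:
        getattr(builder, op)(*args)
    return [str(pred) for pred in builder.roots]

result = run(PROGRAM)
print(result)
```

Running the 26 instructions yields the four predicates in order. The builder needs only three pieces of internal state beyond the output list, which hints at how small the state of such a reader might be.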

Patterns:

(×4)
Start a new predicate with <X> as the left argument.
Name that predicate <X+1>.

(×3)
Start a new predicate with <X> as the left argument.
Name that predicate <X+1>.
Place <X+2> in the right argument of that predicate.

(×2)
Name that predicate <X>.
The next statement is set in the right argument of that predicate.
Start a new predicate with <X+1> as the left argument.
Name that predicate <X+2>.
Place <X+3> in the right argument of that predicate.
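
The pattern counts can be checked mechanically. The sketch below encodes the 26 instructions as tuples (the operation names are invented shorthand for the English instructions) and counts windows in which the letters sit at the stated offsets from a common base X. `None` marks an instruction that carries no letter.

```python
PROGRAM = [
    ("start", "A"), ("name", "B"), ("place", "C"), ("wrap",), ("name", "D"), ("place", "E"),
    ("start", "A"), ("name", "F"), ("wrap",), ("name", "G"), ("place", "H"),
    ("back",), ("add_label", "I"), ("place", "J"),
    ("start", "J"), ("name", "K"), ("nest",),
    ("start", "L"), ("name", "M"), ("place", "N"),
    ("start", "A"), ("name", "P"), ("nest",),
    ("start", "Q"), ("name", "R"), ("place", "S"),
]

def count_pattern(program, template):
    """Count windows whose ops match the template and whose letters are X+offset."""
    hits = 0
    size = len(template)
    for i in range(len(program) - size + 1):
        base = None
        matched = True
        for (op, *args), (t_op, offset) in zip(program[i:i + size], template):
            if op != t_op:
                matched = False
                break
            if offset is None:
                continue
            candidate = ord(args[0]) - offset  # the X this letter implies
            if base is None:
                base = candidate
            elif candidate != base:
                matched = False
                break
        hits += matched
    return hits

START_NAME = [("start", 0), ("name", 1)]
START_NAME_PLACE = [("start", 0), ("name", 1), ("place", 2)]
NAME_NEST_START = [("name", 0), ("nest", None), ("start", 1), ("name", 2), ("place", 3)]

counts = [count_pattern(PROGRAM, t)
          for t in (START_NAME, START_NAME_PLACE, NAME_NEST_START)]
print(counts)
```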

"A" is the subject of the sentence. The instruction "Start a new predicate with A as the left argument" occurs three times.

Of the three occurrences of "Start a new predicate with A as the left argument" (that is, "Start a new predicate with <SUBJ> as the left argument"), two are preceded by "Place <X> in the right argument of that predicate." Since the sentence also ends with that instruction, it is possible that in a sequence of sentences all three occurrences would be preceded by it.
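
A quick way to test this observation is to list the instruction that comes right before each "start with A". In the sketch below the instructions are written in an invented shorthand (`start:A`, `place:E`, and so on), and the sentence is simply repeated to stand in for a sequence of sentences, which is an assumption made only for illustration.

```python
# The 26 instructions for the sentence, in a compact shorthand.
SENTENCE_OPS = (
    "start:A name:B place:C wrap name:D place:E "
    "start:A name:F wrap name:G place:H back add_label:I place:J "
    "start:J name:K nest start:L name:M place:N "
    "start:A name:P nest start:Q name:R place:S"
).split()

def ops_before(ops, target):
    """Return the instruction preceding each occurrence of target (None at the start)."""
    return [ops[i - 1] if i > 0 else None
            for i, op in enumerate(ops) if op == target]

single = ops_before(SENTENCE_OPS, "start:A")
doubled = ops_before(SENTENCE_OPS * 2, "start:A")
print(single)
print(doubled)
```

In the doubled sequence, every occurrence except the very first is preceded by a "place" instruction, because the first sentence ends with one.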

I theorize that in a processed document, a continuous sequence of these instructions (across sentence boundaries) would show robust and complex patterns indicative of natural writing style. Furthermore, I theorize that this methodology will be able to explain why people read the active voice faster than the passive, and how the two are processed differently in this paradigm.
