Tuesday, 3 January 2017

Expressions and their orders

We have a world made up of strings of characters, and the purpose of this post is to define expressions, and a partial order on them, which will be needed in posts to come. This is the second post of the series started at reference 1.

Let our characters be the letters, upper and lower case, the numerals (plus period to allow real numbers, plus hyphen to allow negative numbers), underscore (to allow strings to be segmented) and four punctuation characters (open and close round brackets, equals sign and space). For completeness, one might add single quotes, which can be used to bracket otherwise non-conformant strings, but we do not bother with that here.

Let our tokens be composed of letters, numerals and underscore, with the rule that a token which is not a number must start with a letter, must not contain punctuation characters, must not contain period or hyphen and must not end with an underscore. So ‘AbcDE12’ and ‘cat_123’ are tokens while ‘Abc=def’ and ‘123cat’ are not. Space is our main separator.
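The token rule can be sketched as a pair of regular expressions. This is a minimal sketch, not a definition: the names NUMBER, WORD and is_token are ours, and we have assumed that a number is an optional hyphen, some numerals and an optional fractional part, which the rule above does not spell out.

```python
import re

# Assumption: a number is an optional hyphen, numerals, and an optional
# fractional part introduced by a period.
NUMBER = re.compile(r"-?[0-9]+(\.[0-9]+)?")

# A token which is not a number starts with a letter, continues with letters,
# numerals or underscores, and does not end with an underscore.
WORD = re.compile(r"[A-Za-z]([A-Za-z0-9_]*[A-Za-z0-9])?")


def is_number(s):
    """True if the whole string is a number in the assumed sense."""
    return NUMBER.fullmatch(s) is not None


def is_word(s):
    """True if the whole string is a token which is not a number."""
    return WORD.fullmatch(s) is not None


def is_token(s):
    """True if the string is a token of either kind."""
    return is_number(s) or is_word(s)
```

With these definitions the examples in the text come out as expected: is_token accepts ‘AbcDE12’ and ‘cat_123’ and rejects ‘Abc=def’ and ‘123cat’.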

Then our expressions are things which look like ‘sat(cat on=mat with=mouse123)’. In this, neither the head word (in this case, sat) nor labels (in this case, on and with) may be numbers. But expressions may be nested as in ‘sat(cat_blue on=mat with=mouse(colour=white reference=123))’.

The elements of the expression governed by the head word are called its phrases. The elements together are called its body.
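To make the shape of an expression concrete, here is a small recursive parser. It is a sketch under our own assumptions: an expression is held as a pair (head, phrases), each phrase is a pair (label, expression) with label None when the phrase is unlabelled, and a bare token is an expression with an empty body. The names lex, parse and parse_expr are hypothetical, and the sketch does not check that head words and labels are not numbers.

```python
import re

PUNCT = ("(", ")", "=")
# A lexeme is either a run of token characters or one punctuation character.
LEXEME = re.compile(r"\s*([A-Za-z0-9_.\-]+|[()=])")


def lex(text):
    """Split a string into token and punctuation lexemes, skipping spaces."""
    text = text.rstrip()
    pos, out = 0, []
    while pos < len(text):
        m = LEXEME.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character near position {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out


def parse_expr(lx, pos):
    """Parse one expression starting at lx[pos]; return (expression, next position)."""
    if pos >= len(lx) or lx[pos] in PUNCT:
        raise ValueError("expected a token")
    head = lx[pos]
    pos += 1
    phrases = []
    if pos < len(lx) and lx[pos] == "(":
        pos += 1
        while pos < len(lx) and lx[pos] != ")":
            if lx[pos] in PUNCT:
                raise ValueError("expected a phrase")
            label = None
            if pos + 1 < len(lx) and lx[pos + 1] == "=":
                label = lx[pos]  # a labelled phrase: label=expression
                pos += 2
            child, pos = parse_expr(lx, pos)
            phrases.append((label, child))
        if pos >= len(lx):
            raise ValueError("missing close bracket")
        pos += 1  # step over the close bracket
    return (head, phrases), pos


def parse(text):
    """Parse a whole string as a single expression."""
    lx = lex(text)
    expr, pos = parse_expr(lx, 0)
    if pos != len(lx):
        raise ValueError("unexpected trailing input")
    return expr
```

Phrases are kept in a list rather than a dictionary, so that the written order survives and a label may occur more than once at the same level.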

We do not expect, but do not forbid, a label to be used more than once at any one level.

The order of phrases which are labelled is not significant; the order of those which are not labelled is. So we treat ‘sat(cat on=mat with=mouse123)’ as the same as ‘sat(cat with=mouse123 on=mat)’. We could put this more formally, by defining an equivalence relation on the set of expressions, but things are usually clear enough without going to those lengths.
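For those who do want to go to those lengths, one way to get the equivalence is to reduce each expression to a canonical form and compare the results. The sketch below assumes, for illustration only, that an expression is held as a pair (head, phrases), where each phrase is a pair (label, expression) and the label is None for an unlabelled phrase. It also assumes that only the order of unlabelled phrases among themselves matters, so the canonical form lists them first, in written order, followed by the labelled phrases in sorted order. The names canonical and equivalent are ours.

```python
def canonical(expr):
    """Return a canonical, hashable form of an expression."""
    head, phrases = expr
    # Use "" in place of None so that labels can always be compared as strings.
    canon = [(label or "", canonical(child)) for label, child in phrases]
    unlabelled = [p for p in canon if p[0] == ""]   # written order kept
    labelled = sorted(p for p in canon if p[0] != "")  # order not significant
    return (head, tuple(unlabelled + labelled))


def equivalent(a, b):
    """True if two expressions differ only in the order of labelled phrases."""
    return canonical(a) == canonical(b)


# The two expressions from the text, written out by hand.
first = ("sat", [(None, ("cat", [])), ("on", ("mat", [])), ("with", ("mouse123", []))])
second = ("sat", [(None, ("cat", [])), ("with", ("mouse123", [])), ("on", ("mat", []))])
```

Here equivalent(first, second) holds, while swapping two unlabelled phrases gives an expression which is not equivalent to the original.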

Such expressions might be regarded as formalised containers for natural language, certainly by those IT people holding out for a universal language, and are quite like expressions in HTML, the lingua franca of the internet.

We define a simple version of a relation which we call instantiation. It is possible to define much more complicated versions of this relation, versions which do a great deal of work, which might, for example, make use of some or all of the logical connectors, but we do not expect to need such here.

Roughly speaking, one expression A is an instance of another expression B, if B can be obtained from A by deleting suitable elements, by which we mean deleting phrases, deleting labels, deleting bodies or deleting trailing elements of segmented tokens. Intuitively, but inaccurately:

A is more detailed than B
B describes a larger class of objects than A
A describes a member of the set B.
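The rough definition above can be turned into a test by trying to match every phrase of B against a distinct phrase of A. What follows is a sketch under stated assumptions, not the definition itself. An expression is held as a pair (head, phrases), each phrase being a pair (label, expression) with label None for an unlabelled phrase. The head of B must consist of the leading underscore-separated segments of the head of A, a labelled phrase of B must match a phrase of A with the same label, and the unlabelled phrases of B must match phrases of A in their written order. The names is_instance and _match are ours.

```python
def is_instance(a, b):
    """True if b can be obtained from a by the deletions described in the text."""
    (a_head, a_phrases), (b_head, b_phrases) = a, b
    a_seg, b_seg = a_head.split("_"), b_head.split("_")
    # Deleting trailing segments: b's head must be a leading part of a's head.
    if a_seg[:len(b_seg)] != b_seg:
        return False
    return _match(a_phrases, b_phrases, 0, frozenset(), -1)


def _match(a_phrases, b_phrases, i, used, last):
    """Match b_phrases[i:] against unused phrases of a, backtracking as needed."""
    if i == len(b_phrases):
        return True  # anything left unmatched in a counts as deleted
    b_label, b_child = b_phrases[i]
    for j, (a_label, a_child) in enumerate(a_phrases):
        if j in used:
            continue
        if b_label is None:
            # An unlabelled phrase of b may come from any phrase of a (its
            # label, if any, having been deleted), but written order is kept.
            if j < last:
                continue
            next_last = j
        else:
            if a_label != b_label:
                continue
            next_last = last
        if is_instance(a_child, b_child) and _match(
            a_phrases, b_phrases, i + 1, used | {j}, next_last
        ):
            return True
    return False


# sat(cat_blue on=mat with=mouse(colour=white reference=123)), written by hand.
detailed = ("sat", [
    (None, ("cat_blue", [])),
    ("on", ("mat", [])),
    ("with", ("mouse", [("colour", ("white", [])), ("reference", ("123", []))])),
])
```

Under these assumptions, the detailed expression is an instance of ‘sat(cat on=mat)’, of ‘sat(cat mat)’ and of plain ‘sat’, but not of ‘sat(mat cat)’, where the unlabelled phrases are in the wrong order.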

Instantiation is transitive, that is to say, if A is an instance of B and B is an instance of C, then A is an instance of C. Which brings us to orders.

Orders

We like to order things and ordering seems to be built into our natures – with some evidence for this being the way that quite small children can come to like to count things. For older people who want to play more complicated games, alphabetic ordering has a lot going for it, not least that one can order any pair of strings (of characters). One string always comes before or after the other. It is the ordering usually used in dictionaries and encyclopedias. But such orders can be a bit destructive: they have no regard to any inner structure that the strings in question might have. One only has to think of the way that an alphabetic ordering treats strings like ‘rat-123’, ‘rat-1’ and ‘rat-23’, which do not come out in the order one wants at all, and to get it right one has to go to the bother of padding the numbers with leading zeroes to give ‘rat-123’, ‘rat-001’ and ‘rat-023’.
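The point about alphabetic ordering is easily checked. A small sketch, in which the helper pad and its default width of three are our own choices for illustration:

```python
names = ["rat-123", "rat-1", "rat-23"]

# Plain alphabetic (character by character) ordering.
alphabetic = sorted(names)
# alphabetic is ['rat-1', 'rat-123', 'rat-23'], not the numeric order wanted


def pad(name, width=3):
    """Pad the number after the last hyphen with leading zeroes."""
    stem, _, number = name.rpartition("-")
    return f"{stem}-{number.zfill(width)}"


padded = sorted(pad(name) for name in names)
# padded is ['rat-001', 'rat-023', 'rat-123'], which is the order wanted
```

The alphabetic order only comes right once the structure of the strings, a stem followed by a number, has been made visible to it by the padding.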

In the present case, we can use the instantiation relation developed above to generate a less destructive order, albeit a weaker order, in that it is only a partial order (for which see reference 2). Given two expressions, under this order, one cannot always say that one expression comes before or after another. Sometimes there is nothing to be said on the matter.
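The partial nature of the order shows up even in the simplest case, that of single segmented tokens, where one token is an instance of another if the second consists of the leading segments of the first. The sketch below is restricted to that case, and the names token_is_instance and compare are ours.

```python
def token_is_instance(a, b):
    """True if token b consists of the leading underscore-separated segments of a."""
    a_seg, b_seg = a.split("_"), b.split("_")
    return a_seg[:len(b_seg)] == b_seg


def compare(a, b):
    """Say how two tokens stand to each other under the order."""
    if a == b:
        return "same"
    if token_is_instance(a, b):
        return "instance of"
    if token_is_instance(b, a):
        return "generalises"
    return "unrelated"
```

So compare('cat_blue', 'cat') gives 'instance of' and compare('cat', 'cat_blue') gives 'generalises', while compare('cat_blue', 'cat_red') gives 'unrelated': neither token comes before the other, although both are instances of ‘cat’.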

But what you are buying is that when two expressions are related by the order, that is to say, when one expression is an instance of the other, then they are indeed related. They have stuff in common.

Reference 1: http://psmv3.blogspot.co.uk/2016/12/from-grids-to-objects.html.

Reference 2: https://en.wikipedia.org/wiki/Partially_ordered_set. For the illustration.

Group search key: sra.
