Wednesday, 1 March 2017

On seeing rectangles

Task

In this next post in the ‘sra’ series, we think about how the data and activation processes which we have alluded to at reference 1 and elsewhere might be used to describe, to conjure up, some elementary visual scenes. A speculation.

Motivated in part by the thought that the good eyes which we share with most vertebrates (and some invertebrates) must have powered the evolution of quite a lot of our neural machinery. Put another way, our neural machinery is good at the two dimensional spaces (more or less) that we see. A thought given some further support by, for example, reference 2.

We start with a space of rectangles. We then specify that space a little more formally. Lastly we move on to the activation of that space which we postulate as being consciousness.

Our space of rectangles

This rather small world is made up of rectangles. All our rectangles have position and are properly orientated with sides vertical and horizontal. Following Powerpoint (from Microsoft and from which the illustrations are taken), a rectangle has a fill of some nominated colour, possibly null, and a line – tracing the periphery of the rectangle – of some nominated colour and weight, possibly null. The weight here is the prominence, the thickness of the line.

We can specify any such rectangle with seven numbers as follows: (up, down, height, width, line-colour, line-weight, fill-colour), where up and down are real numbers, height and width are positive real numbers, line-colour, line-weight and fill-colour are integers. Up and down give the position of the centre of the rectangle (with respect to some origin, corresponding roughly to the centre of attention), while the other five numbers specify its size and appearance.

We use ∞ to denote the high value or infinity and Ø to denote the low value or the null value.
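By way of illustration, the seven numbers might be held in a small Python record. This is a sketch of our own rather than anything taken from the series: the names `Rectangle`, `HIGH` and `NULL` are assumptions, as is the choice of `math.inf` to stand for ∞ and `None` to stand for Ø, and the colour code for red is made up.

```python
import math
from dataclasses import dataclass
from typing import Optional

HIGH = math.inf   # stands in for the high value, ∞ (our choice)
NULL = None       # stands in for the null value, Ø (our choice)


@dataclass(frozen=True)
class Rectangle:
    up: float                    # first coordinate of the centre
    down: float                  # second coordinate of the centre
    height: float                # positive, or HIGH
    width: float                 # positive, or HIGH
    line_colour: Optional[int]   # integer colour code, or NULL for no line
    line_weight: Optional[int]   # integer thickness, or NULL for no line
    fill_colour: Optional[int]   # integer colour code, or NULL for no fill

    def __post_init__(self):
        # Height and width are positive real numbers in the text above.
        if not (self.height > 0 and self.width > 0):
            raise ValueError("height and width must be positive")


RED = 1  # hypothetical colour code, for illustration only
patch = Rectangle(0.0, 0.0, 5.0, 10.0, NULL, NULL, RED)
```

The record is frozen, in keeping with data which was conceived as read-only.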

Following the ‘bring object to the front’ and ‘send object to the back’ features of Powerpoint, we also allow depth, possibly by assigning each rectangle in our world a unique depth number, with the low value being at the back and high value being at the front. Things at the front may occlude things at the back and we do not here allow the transparency of Powerpoint rectangles. Nor do we allow things like ‘A is in front of B and B is in front of C and C is in front of A’ – which we can draw easily enough, but which can only represent a scene in the real world with a bit of fiddling about.
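A minimal sketch of how unique depth numbers settle occlusion: at any point of the scene, what is seen is the covering rectangle with the highest depth. Because the depths are distinct numbers, circular arrangements of the ‘A in front of B in front of C in front of A’ kind cannot arise. The dictionary keys, the function names and the pairing of `up` with height and `down` with width are all assumptions made for the sake of the example.

```python
def covers(rect, u, d):
    """True if the point (u, d) lies within the rectangle's extent."""
    return (abs(u - rect["up"]) <= rect["height"] / 2
            and abs(d - rect["down"]) <= rect["width"] / 2)


def visible_at(scene, u, d):
    """Name of the front-most rectangle covering the point, or None."""
    hits = [r for r in scene if covers(r, u, d)]
    if not hits:
        return None
    # Highest depth number is at the front; no transparency is allowed.
    return max(hits, key=lambda r: r["depth"])["name"]


# Hypothetical scene: red in front of, and partly occluding, green.
scene = [
    {"name": "green", "up": 0, "down": 0, "height": 4, "width": 6, "depth": 1},
    {"name": "red", "up": 1, "down": 2, "height": 4, "width": 6, "depth": 2},
]
```

Here `visible_at(scene, 0, 0)` picks out the red rectangle, where the two overlap, while a point covered only by the green one returns green.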

We allow a universal rectangle which fills the whole scene and which has no line. And which, as it will obscure anything that might be behind it, may as well be assigned a depth of low value. The background, assuming here that there is stuff in the foreground. Conventionally, up and down for this rectangle are zero, height and width are high value and line is null. We only allow one such rectangle and we do not allow any other rectangle to use this special high value. Every such other rectangle is properly inside and in front of the universal rectangle.

In the illustration above, on the left, the red rectangle is in front of and partially occludes the green rectangle. Neither rectangle has line. In the middle, the red rectangle is properly inside the blue lined rectangle which has no fill. And on the right the blue rectangle has light blue fill and dark blue line.

We restrict ourselves to stuff which is or which appears to be in front of me, rather than in my head or on my periphery. Where by in my head we mean in my mind, my impression being that, generally speaking, when I think of things it is as if they were behind the eyes and between the ears. They have a position of sorts, but are most definitely inside and not outside. And where by the periphery we mean the surface of the head. Stuff happens there, even if, quite often, I cannot see it. A periphery which might grow to include the rest of the body and its position in the rest of the world.

In all this, we think it right to resist getting too mathematical, either topologically or geometrically. Strict mathematical rules would hinder rather than help in this more statistical and noisy world of neurons.

Exclusions

Fill patterns are on the margins of what we allow.

While, for the time being, we do not allow rectangles:
  • to have edges or parts of edges in common
  • where both fill and line are null
  • where individual edges can be lines or not. Line is all or nothing
  • to be partly off-scene. Apart from the background, rectangles have to be entirely on-scene.
And, our data was conceived as read-only, although we do give a little space to the possibility of update.
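Some of these exclusions are easy enough to check mechanically. The sketch below covers three of them for a foreground rectangle: fill and line both null, use of the high value reserved for the background, and being partly off-scene. It does not attempt the rule about shared edges. We assume, for illustration only, a finite scene of given height and width centred on the origin; the function and key names are ours.

```python
import math


def problems(rect, scene_height, scene_width):
    """List the exclusions which a foreground rectangle violates."""
    found = []
    if rect["line_colour"] is None and rect["fill_colour"] is None:
        found.append("both fill and line are null")
    if math.isinf(rect["height"]) or math.isinf(rect["width"]):
        found.append("uses the high value reserved for the background")
    elif (abs(rect["up"]) + rect["height"] / 2 > scene_height / 2
          or abs(rect["down"]) + rect["width"] / 2 > scene_width / 2):
        found.append("partly off-scene")
    return found


# Hypothetical rectangles in a 20 by 20 scene.
ok = {"up": 0, "down": 0, "height": 5, "width": 10,
      "line_colour": None, "fill_colour": 1}
ghost = dict(ok, fill_colour=None)   # neither fill nor line
stray = dict(ok, up=9)               # pokes out of the scene
```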

More examples

We suppose that our task is to code, in our data structure, the three scenes illustrated above.

In the left hand scene we are in a red world. Everything is a uniform red and there is no structure to be seen anywhere – in which connection I remember that I have been to at least one conceptual art exhibition where the artist has tried to create such a thing, although I think he chose a green or a blue, not as strident as the red above. Red seems to be a popular colour in the present context, with Nicholas Humphrey with his book ‘Seeing Red’ coming to mind.

In the middle scene, there is a bit of structure with the red patch only being part of an otherwise empty space. The surrounding line should not be there, but is included for clarity.

While the right hand scene differs in that there is a background rather than just empty space around the red patch. Also suggesting the possibility of pattern of fill in addition to colour of fill.

One might debate whether the middle scene is possible, whether there always has to be a background. But my sense is that it is possible, that it is possible to have something, to be aware of something suspended in an otherwise empty space, an empty space which would need to figure in the activation process outlined below.

We are not interested here in whether we are seeing something which is really there or not. But we do suppose that we are seeing something and that it does appear to be outside rather than inside.

Left scene

We talked above of a tuple which specified a rectangle: (up, down, height, width, line-colour, line-weight, fill-colour)

An economical approach would be to code the scene as a single number, standing for the colour, with it being understood that that was what a single number stood for. That the single number ‘red’ was a shorthand for (0, 0, ∞, ∞, Ø, Ø, ‘red’), where by ‘red’ we mean the number that codes for red. In which connection see reference 3.

A slightly less economical approach would be to code the scene with the phrase ‘(colour=red)’, with the other phrases being defaulted as above.

Or we could drop the defaults and have ‘(0, 0, ∞, ∞, Ø, Ø, ‘red’)’ in full.
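The three codings of the left scene are equivalent once the defaults are agreed, which a few lines of Python can make concrete. This is a sketch only: the function `expand`, the field names and the colour code are our assumptions, with `math.inf` standing for ∞ and `None` for Ø.

```python
import math

# Defaults are those of the universal rectangle: centred, unbounded, no line.
DEFAULTS = {
    "up": 0, "down": 0, "height": math.inf, "width": math.inf,
    "line_colour": None, "line_weight": None, "fill_colour": None,
}
FIELDS = tuple(DEFAULTS)


def expand(spec):
    """Expand a bare colour number or a partial phrase into the full 7-tuple."""
    if isinstance(spec, int):
        spec = {"fill_colour": spec}   # a single number stands for the colour
    unknown = set(spec) - set(FIELDS)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    full = {**DEFAULTS, **spec}
    return tuple(full[name] for name in FIELDS)


RED = 1  # hypothetical colour code
full_form = (0, 0, math.inf, math.inf, None, None, RED)
```

With this in place, `expand(RED)` and `expand({"fill_colour": RED})` both give `full_form`.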

Or we might go for pixels and enumerate however many thousand red pixels made up the scene – remembering that in order to do this we would need to have some convention for dealing with the rectangle which was the entire scene, the universal rectangle. An enumeration which would be very bulky indeed when compared with the simple specification with which we started – although it is possible that the brain goes in for the sort of trickery described in the JPEG and MPEG standards for moving images about.

Middle scene

Here we need a bit more specification, but we don’t have the background and the boundary problems of the left scene. So we might have ‘(height=5 width=10 fill-colour=‘red’)’, ‘(up=0 down=0 height=5 width=10 line-colour=Ø line-weight=Ø fill-colour=‘red’)’ or ‘(0, 0, 5, 10, Ø, Ø, ‘red’)’.

And in this case too we could go for pixels, having a straightforward, finite patch to cover.

From where I associate to the granularity of advertising posters when seen from up close, and the tendency to see patterns, perhaps just rows and columns or perhaps something more complicated, like more or less fantastic faces built around two spots which chance to be in a more or less eye-like configuration, which are not part of what you are intended to see: the brain, it seems, abhors a vacuum and latches onto whatever perturbations it can find or invent. We do not pursue these vagaries here.
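Going back to the pixel option for the middle scene, a rough sketch gives a sense of the bulk involved. The resolution of ten pixels per unit is an arbitrary assumption of ours, as are the names.

```python
def rasterise(height, width, colour, pixels_per_unit=10):
    """Spell a filled rectangle out as rows of pixels, one colour code each."""
    rows = round(height * pixels_per_unit)
    cols = round(width * pixels_per_unit)
    return [[colour] * cols for _ in range(rows)]


RED = 1  # hypothetical colour code
grid = rasterise(5, 10, RED)
pixel_count = sum(len(row) for row in grid)   # 50 rows of 100 pixels
compact = (0, 0, 5, 10, None, None, RED)      # the seven-number form
```

Five thousand numbers against seven, and that at a fairly coarse resolution.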

Right scene

Here we have the background and a foreground object to specify. With the difference that in this case as well as having a colour we have a pattern to code, taking our rectangle specification vector from seven elements to eight.

A more formal specification

We now define our space of rectangles a little more formally – allowing scenes containing many elements, at least more than the two of the last example. And in the jargon of reference 1, we have also moved onto the compiled version, during which compilation we have sorted out the business of occlusion.

<rectangle>=(position=<number> [salience=<number>] [up=<number>] [down=<number>] [height=<number>] [width=<number>] [line-colour=<number>] [line-weight=<number>] [fill-colour=<number>] [occlusion_data=(…)])

<visual_field>=(metadata=(…) [background=<rectangle>] [foreground_element=<rectangle>]…)

More information about expressions of this form can be found at reference 4, but in the meantime ‘(…)’ means some unspecified expression. 

Metadata is data about the visual field as a whole. Perhaps, for example, the emotional state arising from it. Sadness, joy or anger. Absent in our simple examples.

‘[…]’ means an optional element.

‘position’ means position in depth, as opposed to position in the up-down plane. No two objects have the same position and background has the low value. This is a slightly stronger condition than is needed, but it will serve. Position governs the occlusion of objects, with an object with a high position possibly occluding an object with a low position. 

‘salience’ says how important an object is, how strongly it features in consciousness. The property which can make objects seem larger than they really are, perhaps accounting for some of the difference between what we see and what the camera sees. A property which is computed by the compilation process.

‘[foreground_element=<rectangle>]…’ means zero, one or more rectangles. Square brackets more generally indicating an optional element, while angle brackets mark something which is further defined elsewhere, usually above.

Occlusion data says what part of the object in question can actually be seen. Data which is generated by the compilation process, using the various position data. In the case that there are other objects, there will be an occlusion phrase in the background object. But there will be no occlusion phrase in the top object, as there will be no occlusion, by definition.
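To give a flavour of this part of the compilation, the sketch below attaches an occlusion phrase to each object which has something in front of it. It is a considerable simplification: instead of working out which part of an object can be seen, it just lists the positions of the objects in front which overlap it. The names are ours, position 0 stands in for the low value of the background and `math.inf` for its high value of height and width.

```python
import math


def overlaps(a, b):
    """True if the two extents overlap in area (not merely touch)."""
    return (abs(a["up"] - b["up"]) < (a["height"] + b["height"]) / 2
            and abs(a["down"] - b["down"]) < (a["width"] + b["width"]) / 2)


def compile_occlusion(rectangles):
    """Copy the rectangles, adding 'occlusion_data' where something is in front."""
    compiled = []
    for rect in rectangles:
        in_front = sorted(other["position"] for other in rectangles
                          if other["position"] > rect["position"]
                          and overlaps(rect, other))
        out = dict(rect)
        if in_front:   # the top object never gets an occlusion phrase
            out["occlusion_data"] = in_front
        compiled.append(out)
    return compiled


# Hypothetical scene: background, then green, then red at the front.
scene = [
    {"name": "background", "position": 0, "up": 0, "down": 0,
     "height": math.inf, "width": math.inf},
    {"name": "green", "position": 1, "up": 0, "down": 0, "height": 4, "width": 6},
    {"name": "red", "position": 2, "up": 1, "down": 2, "height": 4, "width": 6},
]
compiled = compile_occlusion(scene)
```

The background comes out occluded by both foreground objects, green by red, and red, being on top, by nothing.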

Note that
  • we are more concerned here with the content of the visual field, what has to be there to support that which we see, rather than with the details of how exactly this data is expressed in neural or layer terms
  • we do not think that this data will be held in pixel form, certainly not without support taking some other form
  • if data were actually to be held in the form suggested above, update would be easy. Changing the value of a field very easy, while taking away an existing element or adding a new element does not look difficult.

Visual field activation

Consciousness exists in time; it is a consequence of activating the data, data which of itself is dead, inert.

In the simplest case, in the case that our field just consists of a marker which says, for example, red, the activation process just keeps repeating red over and over, for the duration of the frame of consciousness. There is nothing else for it to do. We might imagine these activations as being sprayed out randomly on a disc, with decreasing probability as one gets further away from the centre of the disc.

With this probability distribution getting more complicated as the visual scene gets more complicated.

Specifying this probability distribution is part of the compilation process.
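For the simplest case, one way of spraying activations over a disc so that they thin out away from the centre is to draw from a two dimensional Gaussian and throw away the few draws which land outside the disc. The Gaussian is our choice, for illustration; nothing above says what the distribution actually is, and the names and parameter values are assumptions.

```python
import math
import random


def spray(n, radius=1.0, spread=0.4, rng=None):
    """Draw n points on a disc, denser near the centre than near the rim."""
    rng = rng or random.Random()
    sigma = spread * radius
    points = []
    while len(points) < n:
        x, y = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        if math.hypot(x, y) <= radius:   # reject draws which miss the disc
            points.append((x, y))
    return points


points = spray(2000, rng=random.Random(0))
# Points falling in the inner quarter of the disc's area.
inner = sum(1 for x, y in points if math.hypot(x, y) <= 0.5)
```

The inner half of the radius is only a quarter of the area, but it collects well over a quarter of the points.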

We now move onto the activation process proper.

The process works in chunks of time. It does what it can in a chunk, then pauses, with the host system deciding whether to allow another chunk, to start again at the beginning, or whether to stop the activation process and move onto a new frame or onto something else altogether.

Within a chunk, it takes objects on a random basis and scans them, with the scan leaving a trail of high frequency neural activation behind it. The probability of taking any particular object and the time allowed for scanning being qualified by the salience, where there is one, with high salience making it more likely that an object will be selected.

Except in the case of the background, it first scans any visible border, again on a random basis, for a certain period of time. It then scans the interior, again on a random basis for a certain period of time.

It then moves on to the next object. There might be a preference for an adjacent object.
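The steps of a single chunk can be sketched in Python as follows. Very much a toy, with plenty of assumptions: salience defaults to 1, the scan time is taken to be three samples per unit of salience, the background is given a finite extent so that points can be drawn from it, occlusion is ignored and there is no preference for adjacent objects. All the names are ours.

```python
import random


def interior_point(obj, rng):
    """A random point inside the rectangle's extent."""
    return (obj["up"] + rng.uniform(-0.5, 0.5) * obj["height"],
            obj["down"] + rng.uniform(-0.5, 0.5) * obj["width"])


def border_point(obj, rng):
    """A random point on an edge: pick an edge pair, then snap onto one edge."""
    u, d = interior_point(obj, rng)
    if rng.random() < 0.5:
        u = obj["up"] + rng.choice((-0.5, 0.5)) * obj["height"]
    else:
        d = obj["down"] + rng.choice((-0.5, 0.5)) * obj["width"]
    return (u, d)


def run_chunk(objects, visits, rng=None):
    """Activate one chunk, returning a trail of (name, part, point) records."""
    rng = rng or random.Random()
    weights = [obj.get("salience", 1.0) for obj in objects]
    trail = []
    for _ in range(visits):
        # High salience makes selection more likely and the scan longer.
        obj = rng.choices(objects, weights=weights, k=1)[0]
        dwell = max(1, round(3 * obj.get("salience", 1.0)))
        has_border = (not obj.get("is_background", False)
                      and obj.get("line_colour") is not None)
        if has_border:   # border first, except for the background
            trail += [(obj["name"], "border", border_point(obj, rng))
                      for _ in range(dwell)]
        trail += [(obj["name"], "interior", interior_point(obj, rng))
                  for _ in range(dwell)]
    return trail


# Hypothetical two-object field.
objects = [
    {"name": "background", "is_background": True, "up": 0, "down": 0,
     "height": 20, "width": 20, "line_colour": None},
    {"name": "box", "up": 1, "down": 2, "height": 4, "width": 6,
     "line_colour": 2, "salience": 3.0},
]
trail = run_chunk(objects, 200, random.Random(1))
```

The host system's decision about whether to allow another chunk would sit outside `run_chunk`, perhaps as a loop which calls it repeatedly.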

We suggest that this strikes the right sort of balance between structure and appearance.

With the power needed provided partly by attention and arousal, partly by input from the eyes. Much easier to see rectangles when they are in front of one, rather than when they have to be dreamed up.

Conclusions

Consciousness exists in both time and space, it is not simply a state of being, outside of time or space, and we have outlined a process which we hypothesise might result in consciousness of our small world of rectangles. A process which might be thought of as a torch light roaming, rather quickly, about the visual scene. So quickly that it all blurs into a single experience. Or perhaps of a blind person running his finger over a tactile version of our rectangular scene, our scene of rectangles.

The hard bit is still missing, how you get from the electrical fields arising from all this activation to subjective experience, but the hypothesis does narrow the search field, even if the electrical fields are very faint and very deep inside the brain, making them rather difficult to see from the outside, however vivid they might be from the inside. 

We think that there has to be some sort of continuity: things which seem, subjectively, to be close together, should be close together, at least in some sense, in these electrical fields. That is to say, for example, if two things appear to be close together in distance, texture or colour, then they should be close together in these electrical fields. The stuff which makes up what we experience is not scattered all over their physical expression; we have, so to speak, something in the way of retinotopy.

There is also the thought, again following reference 1, that our data is held in the low frequencies while our activation processes are in the high frequencies. A bit of leakage from high to low, in effect updating the data, having consciousness actually doing something rather than just being there, does not look out of the question. Although mapping that update from the working copy held in our data structure to longer term memory might be another matter.

References


Reference 2: The Geometry of Meaning: Semantics Based on Conceptual Spaces - Peter Gärdenfors – 2014.



Group search key: sra.
