The other day I talked (at reference 1) about meeting
in the middle in the context of bridges.
Now, this is also something that one
does in the context of brains. One looks at the insides, all the billions of
cells in the brain and tries to work up and out. Or one can look at the
outside, perhaps from right outside with a scanner, or from the inside, as a
person with subjective experience might, and try to work down and in. To try to
figure out what would be necessary to generate what one sees; a process
sometimes called reverse engineering, something that far eastern engineers –
Chinese, Japanese or Korean – are said to be very good at – they want the
product, while we are vain and want the glory. With the hope that the two ways
of doing things are going to meet in the middle.
A more tractable version of this problem is to try and
figure out how an image from the internet which is displayed on your telephone
is put together. This problem is illustrated above, where I have gone to the BBC
website and right-clicked something to bring up the HTML code, with the result that the stuff on the right hand side describes
most of what I see on the left. Which, as can be seen, is a reasonably complex
business.
A more tractable version still is to try and figure
out how the Powerpoint slide that you see in the conference room or lecture
theatre is put together. What might you find if you inspected the pptx file which underlay what you are
seeing? What would have to be there, at least in some sense, for you to be
seeing what you are?
In the present post, I focus on just one feature of
Powerpoint, the jump within a document.
We consider the Powerpoint file as a stream of characters.
We suppose that this file can be broken into a series of segments. A header
segment, followed by a series of body segments and finished off with a trailer
segment, where a body segment might either be a slide segment or an information
segment. With the distinction being that while both slide segments and
information segments are necessary, what you actually see is mainly specified
in the slide segment, with a one to one correspondence between slides that you
see and segments. We are vague about what the information segments might do and
about how many of them there might be. But I feel sure that they are there,
rather like all the padding in DNA.
The header segment will contain data about the
presentation as a whole, for example the name of the font to be used by
default, in the absence of any further specification. The trailer segment might
contain some statistics, like the number of slides, which can be checked
against the rest of the file. Integrity checks, just to make sure that
something has not gone wrong during construction. And if you look very hard at
a slide segment you might be able to find some of the words that you see when
the slide is displayed, just as if you look very hard at the right hand side of
the illustration above, you might be able to see some of the words which appear
in the left hand side. It will probably help if you click to enlarge.
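To make the supposed layout a little more concrete, here is a minimal sketch in Python. Everything in it is my own invention for the purposes of illustration: the Segment class, the kind labels, the slide_count field and the check_slide_count function are assumptions, not anything taken from a real pptx file.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    kind: str  # "header", "slide", "info" or "trailer" (hypothetical labels)
    data: dict = field(default_factory=dict)


def check_slide_count(segments):
    """Return True if the trailer's slide count matches the slide segments present."""
    slides = [s for s in segments if s.kind == "slide"]
    trailer = segments[-1]
    return trailer.kind == "trailer" and trailer.data.get("slide_count") == len(slides)


# A toy presentation: header, two slides, one information segment, trailer.
presentation = [
    Segment("header", {"default_font": "Calibri"}),
    Segment("slide", {"text": "Meeting in the middle"}),
    Segment("info"),
    Segment("slide", {"text": "Brains and bridges"}),
    Segment("trailer", {"slide_count": 2}),
]
```

The check is of the simple sort suggested above: if a slide segment goes missing during construction, the count in the trailer no longer agrees with the body and check_slide_count returns False.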
On a Powerpoint slide, there will often be words as
well as pictures. Some of the words and phrases will be in blue underline, with
the convention being that such a word or phrase marks a hyperlink: if you – or
the presenter – click on it, you will be taken through hyperspace and
deposited somewhere else, in the case with which we are here concerned,
somewhere else in the same Powerpoint presentation.
So what we do now is speculate about how this might be
done.
From errors in such links that I have turned up in the
past, one possibility is that the link comes in the form of an integer, either
positive or negative but not zero. In the case of a positive number, jump
forward so many slides, in the case of a negative number jump back so many
slides. In the code, this might be expressed as something like ‘… {anthropomorphic:-432}
…’, where anthropomorphic is the bit which is to be displayed in blue underline
and minus 432 is the jump number to take you to the definition of this
interesting sounding word, first seen some hours previously. Curly brackets and
colon are then special characters which are either otherwise forbidden or subject
to special treatment, special treatment which I shall not go into here, beyond
saying that you could ask Google about escape characters.
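A minimal sketch of how such a link might be picked out of the text and followed, assuming the curly bracket and colon syntax guessed at above. The pattern and the function names find_links and target_slide are mine, for illustration only, and escape characters are ignored.

```python
import re

# Hypothetical link syntax from the text: {display text:jump},
# where jump is a non-zero integer, positive or negative.
LINK = re.compile(r"\{([^{}:]+):([+-]?\d+)\}")


def find_links(text):
    """Return (display_text, jump) pairs for every link found in text."""
    links = []
    for match in LINK.finditer(text):
        jump = int(match.group(2))
        if jump != 0:  # zero is not a valid jump in this scheme
            links.append((match.group(1), jump))
    return links


def target_slide(current_slide, jump):
    """Slide number reached by following a relative jump from current_slide."""
    return current_slide + jump
```

So a link of minus 432 on slide 500 would land you on slide 68.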
A problem with this implementation is that while it is
easy to insert such a link while in edit mode, such links need to be
maintained. Every time you insert a new slide or delete an old slide,
Powerpoint needs to check for any such link which that insertion or deletion
affects and adjust it accordingly, adding one or subtracting one. If you delete
a whole bunch of slides it needs to do something slightly more complicated. My
suspicion was that Powerpoint was sometimes getting this wrong, a supposition
which would have accounted for the errors in links that I was getting in my
rather large Powerpoint, then running at more than 1,000 slides. Maybe the
problem was that Powerpoint, for some reason that one can only guess at, only
allowed such numbers to have at most three digits, with strange things (which
standards people sometimes rather kindly call implementation-defined) happening if
you went to four. Such things do happen in the world of bits and bytes, for
what at the time seem like perfectly good reasons, however silly they might
look when they go wrong.
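The maintenance chore can be sketched as follows, on the assumption that each link is held as a pair of source slide number and jump. The functions after_insert and after_delete are hypothetical; the trick, such as it is, is to turn each jump into a target slide number, move source and target as needed, then turn the result back into a jump.

```python
def after_insert(links, position):
    """Adjust (source, jump) links after a new slide is inserted at `position`.

    Slides numbered `position` or higher move up by one.
    """
    def shift(n):
        return n + 1 if n >= position else n

    adjusted = []
    for source, jump in links:
        new_source = shift(source)
        new_target = shift(source + jump)
        adjusted.append((new_source, new_target - new_source))
    return adjusted


def after_delete(links, position):
    """Adjust (source, jump) links after the slide at `position` is deleted.

    Links on the deleted slide vanish with it. Links pointing at it are
    simply dropped here; a real program would have to decide what to do.
    """
    def shift(n):
        return n - 1 if n > position else n

    adjusted = []
    for source, jump in links:
        target = source + jump
        if source == position or target == position:
            continue
        new_source = shift(source)
        adjusted.append((new_source, shift(target) - new_source))
    return adjusted
```

Only an insertion or deletion which falls between the source and the target changes the jump. One which falls before both or after both leaves the jump alone, which is perhaps where an implementation might slip up.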
Another possibility was that Powerpoint assigned every
slide, as it was created, a unique reference number, a number which would not
be reused (allowing reuse is apt to cause complications) and which would be
included in every slide segment. Your link could then be in the form of a
reliable, absolute address rather than an unreliable relative address. The pain
would come at execution time: instead of just nipping backwards or forwards through
so many slides, Powerpoint would, potentially, have to search through the whole
file to find the address in question. Which, although this seems a bit unlikely
these days, might take a while in the case of a very big presentation.
So one makes an index. One includes a table in the
header segment which maps one from absolute address to slide number. One
includes another table which maps one from slide number to absolute position in
the file, the serial number of the character in the character stream which
makes up the file. In the olden days, this might have been the actual address
on a disc unit, the track and sector numbers, if I have remembered the jargon
aright. Then, given the reference number of the slide you want to jump to, you
get the slide number from a search of the first table, a rather quicker
business than a search of the whole file, and you get the absolute position
from a search of the second table. And off you go.
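A minimal sketch of the two tables, using Python dictionaries so that the search of each table becomes a simple lookup. The names ref_to_slide, slide_to_offset and resolve, and all the numbers, are made up for illustration.

```python
# Hypothetical header tables. Reference numbers are never reused, so they
# need not be consecutive; slide numbers and offsets change as slides move.
ref_to_slide = {1001: 1, 1007: 2, 1003: 3}
slide_to_offset = {1: 512, 2: 4096, 3: 9000}


def resolve(reference):
    """Map a slide's reference number to (slide number, offset in the file)."""
    slide_number = ref_to_slide[reference]
    return slide_number, slide_to_offset[slide_number]
```

The link itself only ever holds the reference number, so it survives insertions and deletions untouched. It is the two tables which have to be rebuilt when slides move about.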
Leaving someone else to remember to generate new
reference numbers when you copy and paste a slide in presentation edit mode,
rather than just copying the old reference, along with the rest of the old
slide.
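The copy and paste trap might be avoided with something like the following, a sketch only, with slides held as dictionaries and a hypothetical counter handing out the reference numbers.

```python
import copy
import itertools

# Hypothetical source of reference numbers which are never reused.
_next_ref = itertools.count(1)


def new_slide(text):
    """Create a slide with a fresh reference number."""
    return {"ref": next(_next_ref), "text": text}


def duplicate(slide):
    """Copy a slide's content but give the copy its own reference number."""
    clone = copy.deepcopy(slide)
    clone["ref"] = next(_next_ref)
    return clone
```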
Which is all fine and dandy, but you now have a much
more complicated piece of machinery and there is a lot more to go wrong. There
is a lot more code for Microsoft to look after, and to test, every time a
significant change is made in any other part of the system, just in case there
was some unexpected & untoward interaction. Or side effect as they say in
Big Pharma.
Yet another possibility is that rather than a relative
slide number, the link is itself an absolute position in the file. Hopefully,
Microsoft did not select this one, as any edit which changed the length of
anything earlier in the file would put the link out.
An even more exotic possibility is that the link contains
a search term rather than an address. So when you go to jump, Powerpoint
executes the search term – for example find the slide which contains the words
‘red’ and ‘apple’ – and takes you to the first such slide that it turns up.
Hopefully there is only one such. The up-side of this one is that you avoid the
need for obscure reference numbers, reference numbers which will almost
certainly confuse some poor sap of a maintenance programmer, some years down
the line.
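A sketch of the search term idea, assuming each slide can be boiled down to a string of its text. The function find_by_terms is hypothetical, and the matching is crude: it looks for substrings, so 'red' would also turn up in 'bored'.

```python
def find_by_terms(slides, terms):
    """Return the 1-based number of the first slide containing every term, else None."""
    wanted = [term.lower() for term in terms]
    for number, text in enumerate(slides, start=1):
        lowered = text.lower()
        if all(term in lowered for term in wanted):
            return number
    return None


slides = ["A red bus", "A green apple", "A red apple on a plate"]
```

Here a search for 'red' and 'apple' passes over the first two slides, each of which contains only one of the words, and stops at the third.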
And no doubt, if I gave it a bit more time, I could
come up with other possibilities.
The lesson which I draw from all this is that even in
this seemingly simple example, there is plenty of scope for complication under
the covers. It is hard to work out from what one sees from the safety &
comfort of the lecture theatre what is going on under them. So people who want
to work out brains, beware!
The good news is that there are usually errors, and
from the errors one can sometimes get a handle on the machinery; errors are
often a lot more revealing about what is going on inside than normal workings,
with normal workings often, by design, doing a very thorough job of hiding what
is going on inside. Good news which people who look at brains exploit to the
full.
PS: it is, of course, always possible that the file
which underlies a Powerpoint presentation is not organised by slide at all,
rather by feature. First it keeps all the letters from A thru M, then it keeps
all the letters N thru Z, then the numbers, then the special characters, then
the rectangles, then the ellipses, then the text boxes and so on. And then
assembles a slide from all these bits and pieces when it is needed. Just in
time, as the supply chain jargon goes. Which, I believe, is roughly the way
that the brain does things.
Reference 1: http://psmv3.blogspot.co.uk/2016/10/meeting-in-middle.html.