Psychophone Demo


Introduction

Dogs bark. Chickens cluck. Birds chirp. Cats miaow. Dolphins click. Whales sigh and moan. But are noises like this communication or just audible behaviour? It is sometimes said they cannot form communication (as we know it) because they lack the compositional properties required for genuine language. But is this really the case? The psychophone simulation examines the issue. It demonstrates that even primitive chirping sequences, provided they are decoded in the right way, can have complex, compositional properties. To get a feel for how the simulation works, try running an example. Pressing the Encode button (in the window below) will generate a simulated birdsong encoding of the conceptual structure shown in the window on the left. Pressing the Decode button will then decode the birdsong to produce an interpretation on the right. (You may need to stretch the window out to see this.) This is done by identifying and interpreting the compositional structure in the original sounds. Use the `next' option (via a right-click in the left window) to cycle through the examples.


Things to try

When you've mastered editing (see the instructions in the lower, middle window), try creating a conceptual structure which produces one of the following text encodings. You'll need to set 'encodingType' to 'text' so that the encoding is text rather than chirping. You may also need to add entries to the dictionary for some of these examples (see the tutorial at the end).

`The man gives chocolate to the dog'

`The station is in the middle of the town'

`People drive on the left in Britain'

`Revenge is a dish best served cold'

`Ocean racing is like standing in a shower tearing up ten-pound notes'


The conceptual structures

To see what's going on in this demo it may help to know something about the encoding approach used and the way conceptual structures are represented. A conceptual structure is taken to be a concept whose constituents are themselves concepts. Concept construction involves combining existing concepts. This can either take the form where constituents are merged to form a single entity (otherwise known as categorization) or it can take the less extreme form where constituents are combined into a related ensemble (also known as composition or `part-of' construction). So concepts in a structure can be categorical or compositional. In the simulation, all concepts are represented as circles and constituents are connected using arcs. But arcs to categorical concepts come together at a point, whereas arcs for compositional concepts join a line labelled with the relevant relation. Using this information, you should be able to `read' the structure shown in the left window. If it doesn't make sense, try changing the setting of `encodingType' to `text'. Pressing Encode will then generate an encoding in simplified English. This may help with interpreting the conceptual structure.

Conceptual structures in both windows can be edited using the pointer. Check out the information in the lower/middle window to see how this works.


The encoding

To get the simulation to work, the hierarchical conceptual structure shown in the left window has to be encoded as a linear sequence and then translated into sounds. The approach taken is based on mechanisms from English. When we want to communicate a compositional concept, the convention (in English) is to place a reference to the relation in between references to the constituents. So if we want to communicate the idea of there being a `biting' relation between a dog and a man, we would say `dog bites man' rather than `bites dog man' or `dog man bites'. Essentially this is the convention for sequencing phrases in a subject-verb-object pattern (which is what makes English an `SVO' language rather than a `VSO' or `SOV' one). But for present purposes, we can view it as just a useful convention for producing a linear encoding of a compositional structure.

Another useful principle from English affects the handling of categorical concepts. If we want to communicate that a certain entity is in a certain category, we can put a reference to the category before the reference to the entity. So to express the idea that the dog is in the category of brown things, we say `brown dog'. The mechanism allows repeats of course. To capture the idea of the dog being brown and furry, we say `furry brown dog'. To capture the idea of the dog being a definite dog rather than just any old dog, we use `the' as in `the furry brown dog' and so forth. The simulation uses this idea of pre-pending category (and article) references in much the same way.

Using these two principles, we can encode a conceptual structure that comprises one compositional construct and any number of categorical constructs. For example, to encode a compositional `biting' concept (connecting a particular dog and a particular man) where the dog is within the category of brown things and the category of furry things, and the man is in the category of angry things and the category of tall things, we could use `the brown furry dog bites the tall angry man'. The problem, of course, is what happens if we want to encode a hierarchy of such constructs.
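
As a rough illustration, the two principles can be sketched in a few lines of Python. This is not the simulation's own code: the class names `Entity' and `Composition', the `QUALIFIERS' table and the `encode' function are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Entity:
    """A basic concept, plus the categories it has been placed in."""
    label: str
    # Categories in the order they should be spoken, e.g. ["definite", "brown"].
    categories: List[str] = field(default_factory=list)


@dataclass
class Composition:
    """A compositional concept: a relation joining two constituents."""
    relation: str
    subject: "Node"
    obj: "Node"


Node = Union[Entity, Composition]

# Hypothetical dictionary: how a category is rendered as a qualifier.
# Categories without an entry are rendered by their own label.
QUALIFIERS = {"definite": "the"}


def encode(node: Node) -> List[str]:
    """Flatten a structure into a sequence of word-like references."""
    if isinstance(node, Entity):
        # Category references are placed before the entity reference.
        qualifiers = [QUALIFIERS.get(c, c) for c in node.categories]
        return qualifiers + [node.label]
    # The relation reference is placed between the constituents (SVO).
    return encode(node.subject) + [node.relation] + encode(node.obj)


dog = Entity("dog", ["definite", "brown", "furry"])
man = Entity("man", ["definite", "tall", "angry"])
print(" ".join(encode(Composition("bites", dog, man))))
# the brown furry dog bites the tall angry man
```

Because `encode' calls itself on each constituent, the same sketch also produces embedded encodings when a constituent is itself a `Composition', which is where the ambiguity discussed next comes from.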

The compositional `biting' concept might be a constituent in a higher-order `shocks' concept, whose other constituent might be `village' (cf. `the biting of the man has shocked the village'). How should this be encoded in a linear form? The simplest thing is to use embedding, i.e., embed the encodings for the sub-concepts within the encoding of the super concept. This leads to encodings like `dog bites man shocks village.' Unfortunately, because of the way the structure is flattened out, the encoding can be ambiguous. The sequence `dog bites man shocks village' might encode the idea that the man shocks the village. Or it might encode the idea that it is the biting of the man which shocks the village.

Though it relies on embedding, English has a whole raft of mechanisms for sorting out the kind of ambiguities which then arise. The simulation just uses one, simple idea. It assumes that conceptual structures are put together so that embedding works on a `left to right' basis, i.e., embedded compositional concepts feature as `objects' rather than `subjects'. Encoding then uses ordinary SVO-embedding. But by ensuring that we group constructs on a right-to-left basis during decoding, we can then retrieve the original hierarchical structure. It's not very general. But it does work reasonably well in most cases.
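
The right-to-left grouping can be sketched as follows. This is a minimal illustration rather than the simulation's decoder: the `RELATIONS' set and the `decode' function are assumptions, and the decoder is simply told which words are relations.

```python
# Hypothetical: the words the decoder treats as references to relations.
RELATIONS = {"bites", "shocks"}


def decode(tokens):
    """Rebuild nested (subject, relation, object) triples, grouping from the right."""
    entities, relations, words = [], [], []
    for tok in tokens:
        if tok in RELATIONS:
            # A relation closes the entity reference collected so far.
            entities.append(" ".join(words))
            relations.append(tok)
            words = []
        else:
            words.append(tok)
    entities.append(" ".join(words))

    # Start from the rightmost entity and work leftwards, so each
    # earlier relation takes everything to its right as its object.
    structure = entities[-1]
    for subject, relation in zip(reversed(entities[:-1]), reversed(relations)):
        structure = (subject, relation, structure)
    return structure


print(decode("dog bites man shocks village".split()))
# ('dog', 'bites', ('man', 'shocks', 'village'))
```

Under this convention the sequence is always read with the embedded construct as the object, so `dog bites man shocks village' comes out as the dog biting something, namely the man shocking the village. The other reading is simply never produced, which is what removes the ambiguity.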

The other mechanism used by the simulation is inflection. Again, this is a simplified version of something from English. An alternative way of indicating that a compositional concept is within a higher-level category is to `inflect' the name of the relation. For example, to communicate the fact that the `biting' concept is within the category of past events (i.e., it happened in the past), we can change `bites' to `bit'. This implicitly flags the relevant categorical super-structure, i.e., the existence of the 2nd-order categorical concept.
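
Inflection can be pictured as a dictionary lookup in both directions. The sketch below is an assumption about how such a table might be held in memory; the names `INFLECTIONS', `inflect' and `uninflect' do not come from the simulation.

```python
# Hypothetical table: relation -> {category: inflected form}.
INFLECTIONS = {"bites": {"past": "bit"}, "hits": {"past": "hit"}}


def inflect(relation, categories):
    """Return the inflected form for the first category with a dictionary entry."""
    forms = INFLECTIONS.get(relation, {})
    for category in categories:
        if category in forms:
            return forms[category]
    return relation  # no applicable entry: use the plain relation name


def uninflect(word):
    """Recover (relation, implied category) from a possibly inflected word."""
    for relation, forms in INFLECTIONS.items():
        for category, form in forms.items():
            if form == word:
                return relation, category
    return word, None  # not an inflected form: no category is implied


print(inflect("bites", ["past"]))  # bit
print(uninflect("bit"))            # ('bites', 'past')
```

The point of the second function is that the decoder can rebuild the categorical concept (here `past') from the inflected word alone, without a separate reference to the category appearing in the sequence.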

The idea of word inflection can also be used to retrieve a valid left-to-right grouping in some cases. For example, consider a compositional construct based on `at' which captures the idea that a certain event happened at a certain time, e.g., `the train leaves at 9PM'. The original concept may place the embedded structure in the subject role, contravening the left-to-right rule. But we can recover the left-to-right pattern by inflecting the relation from `at' to `is-when', and then reversing the order of the constituents. This essentially encodes `the train leaves at 9PM' as `9PM is-when the train leaves'. The simulation includes an example which illustrates this effect.
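
The reversal step might look something like this. It is a sketch under stated assumptions: structures are written as plain tuples, and the `REVERSED_FORMS' table and both function names are invented for the example.

```python
# Hypothetical table: relation -> form to use once the constituents are swapped.
REVERSED_FORMS = {"at": "is-when"}


def normalise(first, relation, second):
    """Swap the constituents when an embedded structure comes first."""
    if isinstance(first, tuple) and relation in REVERSED_FORMS:
        # Move the embedded structure to the end and inflect the relation.
        return (second, REVERSED_FORMS[relation], first)
    return (first, relation, second)


def flatten(node):
    """Read a nested tuple structure off as a flat list of references."""
    if isinstance(node, str):
        return [node]
    return [word for part in node for word in flatten(part)]


event = ("the train", "leaves")  # the embedded construct
print(" ".join(flatten(normalise(event, "at", "9PM"))))
# 9PM is-when the train leaves
```

After the swap the embedded construct is the last constituent, so right-to-left grouping during decoding recovers it intact.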

In many cases, these simple ideas are sufficient to produce unambiguous linear encodings of hierarchical conceptual structures. They're all borrowed from English. So, if language-like labels are used for concepts and relations, and if the sequence is represented using those labels, the encoding can be surprisingly sentence-like. (You can check this out by setting `encodingType' to `text' in the simulation. This causes the system to display the internal, sequence encoding rather than generate any sounds.) This raises the question of whether conventions like SVO-embedding and relational-inflection are more to do with language and linguistics, or more to do with production of unambiguous, linear encodings of hierarchical structure.


Where the sounds come from

Having encoded the conceptual structure in a linear form it is then straightforward to translate the sequence into a series of modulations for some sonic template. The simulation takes some sound template (you can change this in the Settings dialog) and modulates tone and tone-durations of successive chunks so as to encode the successive elements of the sequence. The decoding just reverses the process to recover the original sequence. The encoding and decoding share the same assumptions about how specific references are encoded as modulations. They are also assumed to have the same capabilities for generating and detecting tone modulations. Otherwise, the only information the decoder has about the original structure comes via the encoding.
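
To make the idea concrete, here is one way a sequence could be mapped to tone and duration values and back again. The numbers and names below are illustrative assumptions only; the simulation's actual sound template and modulation scheme are not described here.

```python
# Hypothetical parameters, chosen only to make the example concrete.
BASE_HZ = 2000.0  # pitch of the unmodulated template
STEP_HZ = 50.0    # pitch offset per tone step
BASE_MS = 80      # shortest chunk duration
STEP_MS = 20      # extra duration per duration step
PAGE = 8          # number of distinct tones before duration is varied


def to_chunks(sequence, vocabulary):
    """Encode each reference as a (frequency in Hz, duration in ms) chunk."""
    chunks = []
    for word in sequence:
        index = vocabulary.index(word)
        tone, length = index % PAGE, index // PAGE
        chunks.append((BASE_HZ + tone * STEP_HZ, BASE_MS + length * STEP_MS))
    return chunks


def from_chunks(chunks, vocabulary):
    """Reverse the process: recover the references from the modulations."""
    words = []
    for hz, ms in chunks:
        tone = round((hz - BASE_HZ) / STEP_HZ)
        length = round((ms - BASE_MS) / STEP_MS)
        words.append(vocabulary[length * PAGE + tone])
    return words


# The shared vocabulary stands in for the assumptions that the encoder
# and decoder have in common.
vocabulary = ["the", "brown", "furry", "dog", "bites",
              "tall", "angry", "man", "shocks", "village"]
sequence = "the dog bites the man".split()
assert from_chunks(to_chunks(sequence, vocabulary), vocabulary) == sequence
```

Nothing about the structure itself is passed to `from_chunks'. It sees only the chunks and the shared vocabulary, which mirrors the constraint that the decoder's information about the original structure comes via the encoding.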

The general aim is to demonstrate how relatively simple mechanisms for producing linear encodings of hierarchical conceptual structures (borrowed from English) can be used to take us from complex conceptualisations to encodings that sound like animal noises. This shows that sequences of such noises can represent compositional structure and might therefore mediate real communication. There may be good reasons for assuming that bird chirping, whale song etc. is not genuine communication. But the inability of the underlying sonic medium to capture compositionality cannot be one of them.


Tutorial

Let's say we want to create a conceptual structure which represents the idea of a UFO crashing into the clock of Big-Ben, i.e., one which generates the text encoding `the UFO hit the clock of Big-Ben'.

To create a new example in the simulation, start by selecting `clear' (right-click in the left window). This gets rid of any existing structure.

Now we need to work out what the relevant entities and relations are. In this case, it seems there are three entities: UFO, Big-Ben and clock. To set things up, use the pointer (drag across empty space) to create three new concepts. Then right-click on each to set its label. You should end up with your left window looking something like this.

The clock is in an `of' relation with Big-Ben. So create a new concept and then fix its relation to be `of'. Then make clock and Big-Ben constituents of the new concept. You should then have something like this in your left window.

Now we need to create another compositional concept to capture the fact that it is the UFO which `hits' the clock.

Pressing Encode will now produce the encoding `UFO hits clock of Big-Ben'. One problem with this is that it doesn't capture the idea that it's a particular clock we're thinking of. To overcome this, create a new categorical concept with label `definite' and with clock as its constituent. The simulation's default dictionary contains the information that being in the category `definite' can be encoded using the qualifier `the'. So now we get the slightly improved `UFO hits the clock of Big-Ben'.

To capture the fact that it's a particular UFO we're talking about, we can make UFO a constituent of `definite' too.

A remaining problem is the fact that we wanted the text-encoding to be `the UFO hit the clock of Big-Ben', i.e., we wanted to conceptualise the event as happening in the past. To achieve this we need to put an entry in the dictionary which shows that `hit' is the inflection of `hits' which implicitly places the event into the `past' category. So using the right-click menu, select editDictionary and type `hits' in the newWord box and `[past=hit]' against newWordDefinition. Pressing Encode should now generate the desired text encoding: `the UFO hit the clock of Big-Ben'.


Page created on: Tue Sep 30 14:15:15 BST 2008
Feedback to Chris Thornton