In SignPuddle Markup Language (SPML), there are three main parts of information:
terms, text, and source. SignWriting can be used in each. The voice
language items are defined the same as sign language items.
However, by convention, I will encode voice language items differently
from sign language items.
The voice language items will use UTF-8. This will be straight
character data, so I'm wrapping the entries as a CDATA block to avoid
problems with markup characters such as < and &.
The sign language items will use BSW as hexadecimal. I still need to
decide if terms can be more than one sign. This will determine if terms
are edited with SignMaker or SignText. I need to decide the same for
the source: one sign only, or more than one sign.
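To illustrate the CDATA wrapping for voice language items mentioned above, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. The sample entry and its contents are made up for illustration; only the element names come from the SPML definition.

```python
import xml.etree.ElementTree as ET

# A made-up SPML fragment: the voice language items are wrapped in CDATA,
# so characters like & and < do not need to be escaped.
spml = """<spml>
  <entry>
    <item>
      <term><![CDATA[AT&T <brand>]]></term>
      <text><![CDATA[if a < b & b > c]]></text>
    </item>
  </entry>
</spml>"""

root = ET.fromstring(spml)
term_text = root.find("entry/item/term").text
body_text = root.find("entry/item/text").text

# The same term without CDATA (and without escaping) is not well-formed XML.
try:
    ET.fromstring("<term>AT&T <brand></term>")
    unwrapped_parses = True
except ET.ParseError:
    unwrapped_parses = False

print(term_text)
print(body_text)
print(unwrapped_parses)
```

The parser hands back the CDATA content as ordinary character data, so the application never sees the wrapper itself.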
For the ultimate in flexibility, I could have the sign language items
use UTF-8; the same as the voice language sections. I would need to
encode the Binary SignWriting using the UTF-8 I propose with the plane 4
solution. This way, we could mix sign language with HTML markup and
other spoken languages. However, this encoding is not approved by the
Unicode Consortium, so it may be considered bad manners to start using
plane 4 without their approval.
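As a rough sketch of what a plane 4 encoding could look like, the Python below maps small integer codes onto code points starting at U+40000 and serializes them as UTF-8. The offset scheme, the function names, and the sample code values are all assumptions made for illustration; they are not the actual mapping from the proposal.

```python
PLANE4_BASE = 0x40000  # first code point of Unicode plane 4


def to_plane4(codes):
    """Map integer codes onto plane 4 code points (hypothetical offset scheme)."""
    for code in codes:
        if not 0 <= code <= 0xFFFF:
            raise ValueError(f"code {code:#x} does not fit in one plane")
    return "".join(chr(PLANE4_BASE + code) for code in codes)


def from_plane4(text):
    """Reverse the mapping: recover the integer codes from plane 4 characters."""
    return [ord(ch) - PLANE4_BASE for ch in text]


# Made-up code values, not real Binary SignWriting data.
sample_codes = [0x0FB, 0x130, 0x38C]

encoded = to_plane4(sample_codes)
utf8_bytes = encoded.encode("utf-8")

# Because the result is ordinary UTF-8 text, it can sit next to markup
# and spoken language text in the same string.
mixed = "<b>hello</b> " + encoded

print(len(utf8_bytes))
print(from_plane4(encoded) == sample_codes)
```

Every plane 4 code point takes four bytes in UTF-8, so this is less compact than hexadecimal strings would suggest, but the text survives any UTF-8 aware tool unchanged.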
Either way I go, I will not need to update the SPML DTD. You
can see that I am not limiting the content of the terms, text, or source.
Here's an abbreviated definition:
<!ELEMENT spml (entry+)>
<!ELEMENT entry (item+)>
<!ELEMENT item (term*,text?,src?)>
<!ELEMENT term (#PCDATA)>
<!ELEMENT text (#PCDATA)>
<!ELEMENT src (#PCDATA)>
+ one or more
* zero or more
? zero or one
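To make the occurrence rules concrete, here is a small sketch that checks a parsed document against the abbreviated definition above. Python's standard library does not validate against a DTD, so the content models are rewritten by hand as regular expressions over child element names; the conforms function and the CONTENT_MODELS table are illustrative names, not part of SPML.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models from the abbreviated DTD.
# Each child is written as "name," so the patterns match whole names.
CONTENT_MODELS = {
    "spml": r"(entry,)+",                   # one or more entries
    "entry": r"(item,)+",                   # one or more items
    "item": r"(term,)*(text,)?(src,)?",     # terms, then optional text and src
    "term": r"",                            # character data only
    "text": r"",
    "src": r"",
}


def conforms(element):
    """Return True if element and all its descendants follow the content models."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False
    children = "".join(child.tag + "," for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(conforms(child) for child in element)


good = ET.fromstring(
    "<spml><entry><item><term>a</term><term>b</term>"
    "<text>t</text></item></entry></spml>"
)
wrong_order = ET.fromstring(
    "<spml><entry><item><src>s</src><term>a</term></item></entry></spml>"
)
no_entries = ET.fromstring("<spml/>")

print(conforms(good), conforms(wrong_order), conforms(no_entries))
```

Note that an item with no children at all still conforms, since term, text, and src are each optional.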