SW-L Archives

SW-L@LISTSERV.VALENCIACOLLEGE.EDU


SW-L  June 2010

Subject:

Re: Fixing a fundamental flaw in Binary SignWriting

From:

Stefan Wöhrmann <[log in to unmask]>

Reply-To:

SignWriting List: Read and Write Sign Languages

Date:

Tue, 1 Jun 2010 23:10:57 +0200

Content-Type:

text/plain

Hi Steve, Valerie and everybody,

I do not understand these software discussions, so please excuse this question.
Will this change affect the work we have already invested in the SignPuddle
online dictionary? Do I have to rewrite entries?

Thanks
Stefan ;-)

-----Original Message-----
From: SignWriting List: Read and Write Sign Languages
[mailto:[log in to unmask]] On behalf of Steve Slevinski
Sent: Tuesday, 1 June 2010 22:49
To: [log in to unmask]
Subject: Fixing a fundamental flaw in Binary SignWriting

Hi List,

This is a technical discussion. Nothing is going to change regarding
the writing system. The change is only data related.

Back in 2008, I made a poor design choice for Binary SignWriting. I
needed to define what counts as a character in the encoding model. I decided
that each symbol should be a character. Some others (Stuart Thiessen,
Michael Everson, members of the WLDC, ...) thought that each BaseSymbol
should be a character, with an individual symbol defined as a
BaseSymbol character followed by one or two modifying characters.

Encoding with symbol characters seemed the better choice, rather than
using 3 times the amount of data to say the same thing. I was wrong.
My choice made searching by BaseSymbol much more difficult. I was
forced to pre-process the data before I could search. This was wasted
effort. I realized the error of my ways when I was reading a discussion
of searching with Unicode.
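
To illustrate the difference, here is a minimal sketch in Python. It is not the actual BSW encoding: the code values, the offsets, and the assumption of 6 fills and 16 rotations per BaseSymbol are hypothetical and chosen only to show why searching by BaseSymbol needs a decoding step when every symbol is its own character, and why it becomes a plain lookup when the BaseSymbol is a character followed by modifiers.

```python
# Illustrative only: the code values, offsets, and layout below are
# hypothetical and are not the actual BSW or ISWA 2010 values.
FILLS = 6            # assumed number of fill modifiers
ROTATIONS = 16       # assumed number of rotation modifiers
FILL_START = 0x10    # hypothetical first fill-modifier code
ROT_START = 0x20     # hypothetical first rotation-modifier code
BASE_START = 0x100   # hypothetical first BaseSymbol code


def encode_per_symbol(symbols):
    """Old model: one character per (base, fill, rotation) symbol."""
    return [(base * FILLS + fill) * ROTATIONS + rot
            for base, fill, rot in symbols]


def encode_base_plus_modifiers(symbols):
    """New model: a BaseSymbol character followed by two modifiers."""
    out = []
    for base, fill, rot in symbols:
        out.extend([BASE_START + base, FILL_START + fill, ROT_START + rot])
    return out


def contains_base_per_symbol(encoded, base):
    # Each character must be decoded back to its base before comparing.
    return any(code // (FILLS * ROTATIONS) == base for code in encoded)


def contains_base_with_modifiers(encoded, base):
    # The BaseSymbol appears literally in the data, so a direct
    # membership test is enough (modifier ranges do not overlap it).
    return BASE_START + base in encoded


# A made-up sign with two symbols: (base, fill, rotation).
sign = [(12, 0, 3), (40, 2, 15)]
old = encode_per_symbol(sign)
new = encode_base_plus_modifiers(sign)
print(len(old), len(new))
print(contains_base_per_symbol(old, 40), contains_base_with_modifiers(new, 40))
```

The second form uses three values per symbol instead of one, which is the "3 times the amount of data" trade-off, but the search no longer needs any pre-processing.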

I need to fix my poor design choice and re-encode the ISWA 2010 with
BaseSymbol characters and modifiers. I then need to refactor the
character encoding model. This should be a quick fix that I'll have ready by
Friday, but it changes BSW once again. Hopefully for the last time.

On the bright side, this makes it easier for inclusion in Unicode. With
my previous encoding, I required an entire Unicode plane of 65,000
characters. With the new encoding, I only need 1,280 characters. This
is a much better number.

Years ago, Michael Everson worked with Unicode for the tentative
acceptance of SignWriting into the standard. If you look at the Unicode
roadmap for the Supplementary Multilingual Plane, you'll see that Sutton
SignWriting has 4 rows set aside awaiting a proposal. These 4 rows
represent 1024 characters. With the new encoding, I can create a
proposal that requires 5 rows. Much more reasonable than an entire plane.
http://www.unicode.org/roadmaps/smp/
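
The row arithmetic can be checked with a short sketch. It assumes a roadmap row is 256 code points (consistent with 4 rows being 1024 characters) and that a full plane is 65,536 code points, which the figure of 65,000 above rounds down; the function name is hypothetical.

```python
ROW_SIZE = 256        # code points per roadmap row (4 rows = 1024)
PLANE_SIZE = 65536    # code points in one full Unicode plane


def rows_needed(characters):
    """Number of 256-code-point rows needed to hold `characters`."""
    return -(-characters // ROW_SIZE)  # ceiling division


reserved_rows = 4
new_encoding_rows = rows_needed(1280)
print(new_encoding_rows)                   # rows for the new encoding
print(new_encoding_rows - reserved_rows)   # rows beyond those set aside
print(rows_needed(PLANE_SIZE))             # rows in an entire plane
```

Under these assumptions the new encoding needs one row more than the four currently set aside, compared with the 256 rows of a whole plane.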

Sorry to any and all programmers / users this will inconvenience, but it
is a needed change.

Regards,
-Steve
