SW-L@LISTSERV.VALENCIACOLLEGE.EDU
SW-L  October 2011
Subject: Signbox size and coordinate strings
From: Steve Slevinski
Reply-To: SignWriting List: Read and Write Sign Languages
Date: Thu, 6 Oct 2011 10:58:47 -0500

Hi list,

Here is my current design and a technical discussion. Any feedback is
appreciated. Please ignore if you don't want to peek under the hood.

Background material:
=============
1) Regular Expressions
http://en.wikipedia.org/wiki/Regular_expression

2) Cartesian Coordinates.
http://en.wikipedia.org/wiki/Cartesian_coordinates

=============

I use Cartesian Coordinates for the SignPuddle data. We start with a
2-dimensional canvas. Both the width and the height are divided into
specific points to create a grid. The center of the grid is point
(0,0). The horizontal position is called the X value. The vertical
position is called the Y value.

          -y|
            |
            |
            |
-x          |          +x
------------+------------
            |
            |
            |
            |
          +y|



In my current design, the x and y values are unlimited. Negative values
run toward the top-left. Positive values run toward the bottom-right.

In general, the challenge I face is to create a string that represents a
specific coordinate. My current string has the form "n100x100" for the
coordinate (-100,100). Simply replace the "-" minus sign with an "n"
and replace the "," comma with an "x". The purpose of these
replacements is to enable double-click selection: the "n" and the "x"
continue the string without a character that creates a gap.
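
The replacement rule above can be sketched in a few lines of Python. This is
only an illustration, not the SignPuddle code; the function names
encode_current and decode_current are made up for this example.

```python
from typing import Tuple


def encode_current(x: int, y: int) -> str:
    """Write (x, y) as "x,y", then swap "-" for "n" and "," for "x"."""
    return f"{x},{y}".replace("-", "n").replace(",", "x")


def decode_current(s: str) -> Tuple[int, int]:
    """Reverse the two replacements and parse the integers."""
    xs, ys = s.replace("x", ",").replace("n", "-").split(",")
    return int(xs), int(ys)


print(encode_current(-100, 100))   # n100x100
print(decode_current("n100x100"))  # (-100, 100)
```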

Regular Expressions allow for efficient searching and pattern matching.
Regular expressions are simple and powerful when used correctly. They
can easily become overly complex and difficult to understand.

The current coordinate characters can be described with the regular
expression pattern:
"n?[0-9]+xn?[0-9]+"

This can be understood in parts.

n? , an optional "n" (zero or one occurrence)

[0-9] , match one digit from 0 to 9

[0-9]+ , match one or more digits

x , match the character "x"
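
As a quick check of the pattern, here is a minimal Python sketch using the
standard re module. The sample text and the helper name find_coords are
assumptions for the example, not real SignPuddle data.

```python
import re

# The pattern from above: optional "n", digits, "x", optional "n", digits.
COORD = re.compile(r"n?[0-9]+xn?[0-9]+")


def find_coords(text: str) -> list:
    """Return every substring of text that matches the coordinate pattern."""
    return COORD.findall(text)


print(find_coords("abc n100x100 def 5xn7"))  # ['n100x100', '5xn7']
print(COORD.fullmatch("100") is None)        # True, there is no "x" part
```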

I've run into a problem: general searching is inefficient or slow.
This is due to Unicode and the current form of the coordinate value.
More accurate searching is forcing me to use overly complex regular
expression features, like negative lookahead.

I think I need to change the form of my coordinates so that searching is
efficient and accurate. I am considering a new form of coordinate
string that is a simple value 6 digits long.

The pattern can be described as "[0-9]{6}". Understood in parts as:

[0-9] , match one digit from 0 to 9.
[0-9]{6} , match exactly six digits in a row.
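
A small Python sketch of searching with the fixed-width pattern. The sample
string is invented for the example and is not the real SignPuddle data format.

```python
import re

# Six digits in a row; no optional parts, so there is nothing to backtrack over.
FIXED = re.compile(r"[0-9]{6}")

sample = "a500500 b485480"    # hypothetical text holding two coordinate strings
print(FIXED.findall(sample))  # ['500500', '485480']
```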

I will limit both the X and Y axis to the values -500 to +499. The
center is still (0,0).

Here is the coordinate string for (0,0): "500500". The string is
divided in half. The first 3 digits are for the X value and the last 3
digits are used for the Y value. Simply subtract 500 from each 3-digit
value in the string. To go in the reverse, simply add 500 to each value
and combine the X and Y values. For example, the coordinate (111,111)
would have a string of "611611" and the coordinate (-15,-20) would have
the string "485480".
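
The add-500 and subtract-500 steps can be sketched in Python as below. This is
an illustration under the stated limits (-500 to +499), not the final
implementation; the names encode_fixed, decode_fixed and OFFSET are made up
for the example.

```python
from typing import Tuple

OFFSET = 500  # shifts the range -500..499 onto 000..999


def encode_fixed(x: int, y: int) -> str:
    """Add 500 to each value and write each as exactly 3 digits."""
    for v in (x, y):
        if not -500 <= v <= 499:
            raise ValueError(f"value out of range: {v}")
    return f"{x + OFFSET:03d}{y + OFFSET:03d}"


def decode_fixed(s: str) -> Tuple[int, int]:
    """Split the 6 digits in half and subtract 500 from each half."""
    if not (len(s) == 6 and s.isascii() and s.isdigit()):
        raise ValueError(f"not a 6-digit coordinate string: {s!r}")
    return int(s[:3]) - OFFSET, int(s[3:]) - OFFSET


print(encode_fixed(0, 0))      # 500500
print(encode_fixed(111, 111))  # 611611
print(encode_fixed(-15, -20))  # 485480
print(decode_fixed("485480"))  # (-15, -20)
```

Because every value is padded to 3 digits, the string is always 6 characters
long, which is what lets the simple "[0-9]{6}" pattern find it.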

Depending on speed experiments, I may duplicate the SignPuddle XML files
with ASCII rather than the Preliminary Unicode. Large files have a lot
of wasted overhead processing UTF-8 and Unicode values.

Thoughts? Opinions?
-Steve
