Linux
Spreadsheets
This document discusses the
spreadsheet software that is available under Linux.
Spreadsheet and word processor software is of
particular interest to those that wish to see Linux
more widely adopted for use in office
applications.
There exists an assortment of
"free" spreadsheet packages, most not
considered to be terribly "credible" as they do not read or write
the data formats used by Microsoft Excel or Lotus 123, which are considered
"industry
standard."
There are a number of
commercial packages that do a good job of
"understanding"
various proprietary spreadsheet
formats.
This document also includes
some historical background about
spreadsheets.
The folks that maintain lists
of Scientific Applications for Linux have a Spreadsheet Page. It largely
duplicates my list below, as does the spreadsheet
list in the
Linux Applications and Utilities Page.
Based on the history of
personal computers, spreadsheets may be argued to be the
most
important application area on personal
computers.
What isn't questionable is that
personal computers first entered
business as a specific result of the advent of
spreadsheet software.
Businesses bought Apple
II computers because
they wanted to use VisiCalc. (Which,
interestingly to operating system aficionados, was
originally developed on the Multics
platform, according to VisiCalc coauthor Bob Frankston. Yet another argument for
the notion that virtually all important modern computing
innovations took place 20 years ago on the Multics
platform...) More details may be found at VisiCalc: Information from its creators, Dan
Bricklin and Bob Frankston.
Thanks are in order to
Bob Frankston for some corrections he has
provided to this history.
When Lotus 123 became available for IBM PCs, the cycle
continued, and IBM PC sales took off. Then Microsoft got into
the picture, and things have unfortunately degraded
considerably...
Here is another
Brief History of Spreadsheets. The
straight facts are pretty accurate; I'll just take a bit of
issue with a couple of points that represent "editorial opinion" that we can probably
agree to disagree over...
-
The authors suggested that
Lotus 123's "A1"
referencing system was "more
intuitive" than the "R1C1" system used by various other
spreadsheets, notably including Microsoft's
MultiPlan.
I suggest instead that
neither is particularly more
intuitive than the other; "A1" simply happens to be
shorter.
The A1 notation originated in VisiCalc not 1-2-3.
The reasoning for it was simple: it was simple to type.
Reversing standard notation, we used the letters for the
columns because there were fewer columns than rows.
-- Bob Frankston
-
They stated that:
The
spreadsheet instantly became easier to use than the
archaic interface of PC-DOS
products...
The text-based user
interfaces were hardly "archaic" at the time; they were as
up to date at the time as anything could be. It is
indeed fairly convenient to select
"blocks" using a
mouse, and it is fair to claim this is easier
to learn than the (quite functional)
keyboard-based ways of doing this, which were,
and likely still are, faster. (Except with
Excel, where the
keyboard interface appears to have been made
deliberately arcane, but I
digress...)
Spreadsheets have
provided for (without most of their users being consciously
aware of it) the large-scale deployment of Cellular
Automata. One [FAQ on Cellular
Automata] defined Cellular Automata thus:
A cellular automaton is a discrete
dynamical system. Space, time, and the states of the system
are discrete. Each point in a regular spatial lattice,
called a cell, can have any one of a finite number of
states. The states of the cells in the lattice are updated
according to a local rule. That is, the state of a cell at
a given time depends only on its own state one time step
previously, and the states of its nearby neighbors at the
previous time step. All cells on the lattice are updated
synchronously. Thus the state of the entire lattice
advances in discrete time steps.
Spreadsheets satisfy
these requirements, with a few "bits of fuzziness," notably
that:
-
"Nearby neighbors" don't
have to be terribly nearby
with a spreadsheet.
-
There are some questions of
synchronicity.
Spreadsheet packages have
often provided some configurability to indicate
update policies that are not
synchronous, but which rather define some ordering
of updates.
-
Pure CA systems tend to
apply a single update rule to many cells in the
lattice; in contrast, the spreadsheet software in
common use attaches an individual formula to each and
every cell.
Improv and related
packages would apply rules to different regions of
cells; it seems to me that a more intelligent use
of spreadsheets should involve some sort of
"pattern matching" to help discover
such rules for the user.
ZigZag seeks to
transform things from the "globally Euclidean"
space used by spreadsheets to that of "locally
euclidean" directions that more resemble the
hyperlinking of Nelson's
Xanadu.
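To make the quoted definition concrete, here is a minimal Python sketch of a "pure" cellular automaton. It assumes a one-dimensional lattice with wrap-around edges and an arbitrary illustrative rule (XOR of the two neighbours); the names step, xor_rule, and run are my own inventions and do not come from the FAQ. The point to notice is the synchronous update: every new cell is computed from the previous generation only, much like a spreadsheet column whose formulae refer only to the column before it.

```python
def step(cells, rule):
    """Return the next generation; all cells are updated synchronously."""
    n = len(cells)
    # Build a fresh list so every cell sees only the previous time step.
    return [rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
            for i in range(n)]


def xor_rule(left, centre, right):
    """Illustrative local rule: the new state is the XOR of the two neighbours."""
    return left ^ right


def run(cells, rule, generations):
    """Collect successive generations, one row per time step."""
    history = [list(cells)]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    for row in run([0, 0, 0, 1, 0, 0, 0], xor_rule, 3):
        print("".join("#" if c else "." for c in row))
```

A spreadsheet relaxes each of these constraints in the ways listed above: its "neighbours" may be anywhere on the sheet, its update order is configurable, and each cell carries its own rule.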
-
Cellular
Automata FAQ
I've been using spreadsheets of various
sorts on a wide variety of platforms since the mid-'80s. The
first one which I used any substantial amount was SuperCalc, running under ZR-DOS (an enhanced
CP/M
"clone" that grew into ZCPR).
Others have included (in rough chronological
order):
-
Lotus 123
-
Lotus Symphony, which integrated a
wordprocessor of dubious functionality, a simple
"form" processor, and a
telecom module into the mix;
-
A 6502 Atari
8-bit based spreadsheet called SpeedCalc that was published in
Compute! Magazine;
-
LDW
Power, a Lotus 123 "clone" for the Atari
ST;
-
Borland's Quattro Pro, used in grad school
primarily for its linear programming
capabilities;
-
The spreadsheet built into the
TRS-80 Model 100 portable computer (a very small, early
version of Microsoft Multiplan that eventually grew into
the monster now known as Excel)
-
As Easy
As (guess who they were
cloning?);
MS-DOS-based and available as
shareware, this is still a featureful spreadsheet. It
doesn't allow heavy-duty GUI
formatting of spreadsheets, but only people with far
too much time on their hands do so...
-
Lucid
3D;
The first version of
Lucid 3D ran on TRS-80
Model 100 laptop computers, and the necessity for
frugality on that platform resulted in a design that
was sparse, frugal, and indeed, extremely
lucid.
The MS-DOS version was my
favorite MS-DOS program of any sort; its user interface
is a wonderful model of integration of powerful use of
both keyboard and mouse; the program was still frugal
in its use of disk/RAM and yet provided excellent
overall functionality.
-
SC
("Spreadsheet Calculator");
-
VC
(Enhanced, "more visual" version of SC);
-
The much despised memory hog,
Microsoft Excel;
-
The spreadsheet built into the
Psion 3 handheld computer;
-
SmartWare
-
Xess;
-
Wingz
-
Teapot!
with bits of playing around with others
such as Lotus
Improv, the Microsoft Works
spreadsheet for MS-DOS, the FSF's
Oleo, an entirely-custom one that I wrote in
LISP
(mostly just as a programming exercise), and sundry fiddling
around with a pretty wide variety of MS-DOS, MS-Windows, and
Unix-based "integrated packages."
The most notable spreadsheet of which I have never made
significant use is
VisiCalc, the program that popularized the whole idea
of software that allows interactive entry of numbers, text,
and formulae arranged in rows and columns.
Tasks I've done with spreadsheets have
included:
-
Preparation of accounting
working papers
-
Preparation of financial
statements
-
Economic simulations via
difference equations/"cellular automata"
-
Mathematical modelling, solving
linear and nonlinear programs
-
Loan modelling
-
Database conversions
-
Statistical analysis
In short, I've done enough work of
enough various types with a large enough variety of spreadsheet
packages that I figure I'm entitled to rant a little bit about
their proper use.
Here are some further useful resources
on the ancient and more modern history of
spreadsheets:
"Enhancements" of spreadsheets over the last few
years have not involved any substantive improvements in
functionality, but have primarily just involved enhancing their
"typesetting" capabilities, that is,
the ability to change fonts, insert special formatting, and to
otherwise make tables look "pretty."
I put "enhancements" in quotes because I am skeptical
that this actually represents a true improvement of either the
quality of the information or user efficiency in finding and
using information.
These so-called
improvements gloss over the continuing problems that plague
spreadsheet users:
-
Spreadsheet models
encourage the use of "spaghetti" logic, where cells point
to cells that point to cells, and can grow into
random networks of calculation logic;
-
They permit lots of easy
off-by-one errors;
-
They generally are
difficult to verify/audit;
-
They do not provide good
tools for managing data either in terms of
consolidation or searching for specific
detail;
-
Perhaps most importantly,
despite their convenience, spreadsheets are
not a robust repository for
information.
I have seen one multinational
enterprise that (believe it or not) built a budgeting system
atop sets of dozens of departmental spreadsheets that they
would roll up into a master budget; while it's a neat extension
of the technology, only a fool would try to use this to run a
large enterprise. One bad link in one subsheet, and the whole
house of cards could fall down. (And the "top" vendor these days, Microsoft, isn't noted for building
products that are of industrial
grade robustness.)
The last few points point towards where
I would like to see spreadsheets go. They have been, and are,
very good at producing ad-hoc, one-off reports. This is a
proper use of spreadsheets.
They are often being used instead as
repositories for information that really ought to be managed by
a database management system of some sort.
What spreadsheets should do is to
allow, nay encourage, the use of data extracts from external
sources, notably relational
databases. The use of named ranges (which are a venerable
feature from at least as early as Lotus
123 v2.01) is of assistance; Lotus
Improv was a rather complex-to-use test platform for
improved "modelling" whose functionality included database
extraction.
Using external repositories permits the
benefits of:
-
A single repository that can be
kept correct, rather than a multitude of mutually
incompatible data stores;
-
Data synchronization (a
restatement of the last);
-
All the good RDBMS "stuff"
like:
and perhaps even more
sophisticated things such as
In effect, the real point I would
propose is that the task of building a spreadsheet should
involve some data modelling, with thought not
just about the report at hand, but also about where the data
comes from and perhaps should go to.
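As a rough illustration of the "data extract" idea, the following Python sketch pulls a query result into a named range. It assumes Python's standard sqlite3 module as a stand-in for whatever relational database holds the real data, and a plain dictionary as a stand-in for the sheet; the sales table, the SalesByCountry range name, and the load_named_range helper are all hypothetical.

```python
import sqlite3


def load_named_range(conn, query, params=()):
    """Run a query and return (column_names, rows) to store as a named range."""
    cur = conn.execute(query, params)
    names = [d[0] for d in cur.description]
    return names, cur.fetchall()


def demo():
    # An in-memory database stands in for the shared repository.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (country TEXT, month TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("Canada", "Jan", 100.0), ("Canada", "Feb", 150.0),
         ("France", "Jan", 80.0)],
    )
    # The "sheet" holds only an extract; the database remains the repository.
    sheet = {}
    sheet["SalesByCountry"] = load_named_range(
        conn,
        "SELECT country, SUM(amount) AS total FROM sales "
        "GROUP BY country ORDER BY country",
    )
    conn.close()
    return sheet
```

The sheet never becomes the place where the data lives; refreshing the report is a matter of re-running the query rather than re-keying numbers.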
Microsoft Excel does have a capability to connect to
databases, as does
OpenOffice.org; the support is
still a bit on the primitive side.
I would suggest that what
happened in the history of the developments is that, for
"political" reasons, the
developments attributable to Lotus Improv (originally developed using
NeXTstep) were lost, and with them its better model of spreadsheet
construction/management.
Improv provided an interface that actively
encouraged, nay required, the user to add additional
structure to spreadsheet models.
It provided the ability
to define a variety of "categories"
to provide multidimensional analysis, as well as
"groups" to allow the grouping
of data that is not so readily decomposed.
- Items
-
Every time some form of
categorization is defined, this defines a sequence
of "items," whether that
be a list of months ("January,
February, March, ..."), a list of countries
("Canada, United States, United
Kingdom, France, Germany, ..."), continents
("North America, South America,
Asia, Europe, Africa, Australia,
Antarctica"), or whatever.
Each "item" represents a row or column,
and thus may contain many cells.
- Categories
-
In a company selling things
internationally, it would be unambiguously valuable
to set up Country as a category, as you
would certainly need to analyze
data based on that. Currency exchange rates are
based on countries' currencies, the set of laws
that apply depends on the country, and so
forth.
Periods of time, such as
months, quarters, and years, also tend to
unambiguously reflect a "dimension," in this case that of
time.
- Groups
-
On the other hand, reports
might need to group countries together into
"regions" or
"continents,"
depending on who is looking at the data.
Those groupings are likely to be fuzzier,
whether we're talking about grouping several
countries together to represent a Continent
or Sales Region, or if there is need to have
smaller regions (such as states, provinces,
or counties, or shires) to decompose the
activities within a country.
In both cases, it would be
fairly appropriate to define a less-structured
"group" that does not
add an extra dimension to the hierarchy, and
thereby to the complexity of the data model. Thus,
a set of related items are collected together to
represent a "Group."
- Formulae
-
The behaviour of formulae
in Improv is
exceedingly different
from that of traditional
spreadsheets.
In a traditional
spreadsheet, a formula is associated with a cell,
and in order to have a particular formula apply to
many cells, you must copy the formula into that
range of cells.
In Improv, on the other hand,
formulae are not associated with cells, but are
"first class" objects associated with the
spreadsheet, and rather than representing a mere
single cell, are applied to an entire range/group
of cells, and may thereby operate on items, groups,
and categories.
-
Instead of
operating on cryptic "cell ranges," they operate
on named ranges, and thereby tend to be
more readable than the traditional
spreadsheet alternative.
-
Improv formulae almost
always represent vector operations,
providing values for multiple cells at
once. Thus, a formula that computes monthly
totals across category "Fruit" might look like
Total =
SUM(Fruit). The crosscheck formula,
computing annual totals for each variety of
fruit, might be Annual =
SUM(Months).
This example
displays the "overlap" issue; the pair of
formulae overlap in the cell that contains
the total for all fruit for the whole year.
Improv discloses
this overlap, and allows formulae to be
placed into a priority order where
"later" formulae
override "earlier"
formulae.
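The Fruit/Months example above can be sketched in a few lines of Python. This is only my own toy model of the behaviour described, not Improv's actual syntax or implementation: the data, the resolve and value helpers, and the tuple representation of a formula are all assumptions made for illustration. Each formula claims a whole row or column of cells, the formulae are held in a priority-ordered list, and a later formula overrides an earlier one wherever they overlap.

```python
FRUIT = ["Apples", "Pears"]
MONTHS = ["Jan", "Feb"]

# Raw data, keyed by (fruit item, month item).
DATA = {("Apples", "Jan"): 10, ("Apples", "Feb"): 12,
        ("Pears", "Jan"): 5, ("Pears", "Feb"): 7}


def sum_fruit(get, cell):
    """Total = SUM(Fruit): add up every fruit in this cell's column."""
    _, column = cell
    return sum(get((fruit, column)) for fruit in FRUIT)


def sum_months(get, cell):
    """Annual = SUM(Months): add up every month in this cell's row."""
    row, _ = cell
    return sum(get((row, month)) for month in MONTHS)


# Formulae in priority order; each one covers a whole row or column of cells.
FORMULAE = [
    ("Total = SUM(Fruit)",
     [("Total", c) for c in MONTHS + ["Annual"]], sum_fruit),
    ("Annual = SUM(Months)",
     [(r, "Annual") for r in FRUIT + ["Total"]], sum_months),
]


def resolve(formulae):
    """Decide which formula owns each cell, recording any overlaps."""
    owner, overlaps = {}, []
    for name, targets, compute in formulae:
        for cell in targets:
            if cell in owner:
                overlaps.append((cell, owner[cell][0], name))
            owner[cell] = (name, compute)  # the later formula wins
    return owner, overlaps


def value(cell, data, owner):
    """Evaluate one cell, recursing into whatever cells its formula needs."""
    if cell in data:
        return data[cell]
    _, compute = owner[cell]
    return compute(lambda c: value(c, data, owner), cell)
```

Calling resolve(FORMULAE) reports exactly one overlap, the ("Total", "Annual") cell, which is the grand-total cell discussed above; reordering the list changes which formula fills it in.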
Here are links to
historical information about Improv
as well as about other packages that might be considered to be
"successors."
-
Story of
Improv
-
The Story of Improv versus
PowerStep (another
NeXTstep spreadsheet)
-
Quantrix
This was created as
something of a successor to Improv; it might conceivably be
ported to Linux, if they get suitable
interest.
-
Advance Planning Solutions
Advance had a
spreadsheet-like system rather like
Improv.
The company was acquired by
PeopleSoft in 2000, so parts of this may have been
integrated into their applications, but it is not
likely still available as a separate
product.
-
The following observation
came to me via email:
... It's worth noting that Improv flattered
an earlier program, Javelin (by a
company of the same name located in
Cambridge MA).
-- Bob Frankston
I was aware of the release
of Javelin; it was an
MS-DOS-based software package that had an
unfortunately-brief flash of fame.
See also the web page of
one of Javelin's
authors, John R. Levine, as well as
Probert Encyclopaedia on Javelin.
It appears that
Javelin may have been
an early victim of Microsoft
Predatory Marketing; an InfoWorld article indicates
that when Javelin won
the InfoWorld "Product of
the Year" award, beating out Excel, Bill Gates
"got up and stomped out
of the room in front of everybody in a
spectacularly rude manner."
Another report suggests
that they got overambitious, planning to try to
dominate Lotus 123 when
they really needed to grow their
niche, and then were "done in" by bad timing on an
IPO, scheduled just a week after the market crash
of October 1987.
Other comments suggest that
the failure had to do with the software being
difficult to use. It was powerful, but business GUI
software was in its infancy at the time, and the
implication is that Javelin did not have a
sufficiently "user-friendly" interface to permit
widespread adoption.
These may all have been
contributing factors.
You may still be able to
get copies of Javelin;
I'm told that it was bought out by a "venture
capital" group; they integrated it into some data
retrieval tools, and you may be able to get a copy
of World Bank Indicators - World*Data
1995 which included a copy of Javelin.
-
Lotus Support
File Library for Improv
Jocelyn Paine has
presented a number of papers on a system called Model Master. I have excerpted the
following:
Spreadsheet models
can be difficult to read and maintain.
Spreadsheets provide few facilities for
documentation, and although the structure
of a spreadsheet program is implicit in the
cell equations, it cannot be made explicit
as it would if programmed in a conventional
programming language. To make spreadsheets
easier to use, we are developing
Model Master
(MM), a compiler that generates
spreadsheet equations from textual
specifications of models.
An MM program
consists of one or more object
specifications. To specify single objects,
the user describes their attributes or
properties, together with equations stating
how these depend on one another's present
and past values. To specify a complete
model, the user describes how these objects
are to be connected together, by writing
extra equations that say how their
attributes are interrelated. MM compiles
these specifications into cell equations.
It automatically allocates attributes to
columns and time points to rows: the user
can override these allocations, but will
not usually need to do so.
MM is based on a
new programming paradigm, System Limit
Programming, also used in the development
of the Web authoring tool Web-O-Matic.
Further information
as well as a on Model Master: an
object-oriented spreadsheet front
end
-- Jocelyn Paine
Furthermore, the paper
references some of the preliminary research that has been done
on the issue of the correctness of spreadsheets. Because they are
highly dependent on user input, spreadsheet tools suffer from
several serious vulnerabilities. The use of
a tool like Model Master to
construct a spreadsheet allows conscious validation of more of
the spreadsheet model, which can't but be helpful.
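To give a feel for what "compiling a textual specification into cell equations" means, here is a toy Python sketch. It is emphatically not Model Master's language or algorithm: the name[t] / name[t-1] notation, the compile_model function, and the account example are all invented for illustration. It only mimics the allocation described in the excerpt, with attributes assigned to columns and time points to rows.

```python
import re

# Matches references such as balance[t] or balance[t-1] (hypothetical notation).
REF = re.compile(r"(\w+)\[t(?:-(\d+))?\]")


def compile_model(attributes, equations, initial, periods):
    """Return {cell_address: value_or_formula} for `periods` time steps.

    Attributes become columns A, B, ... (at most 26 in this sketch);
    row 1 is a header and time point t lives in row t + 2.
    """
    col = {name: chr(ord("A") + i) for i, name in enumerate(attributes)}
    cells = {f"{col[a]}1": a for a in attributes}
    for a in attributes:
        cells[f"{col[a]}2"] = initial[a]          # values at t = 0
    for t in range(1, periods):
        row = t + 2

        def to_cell(match, row=row):
            lag = int(match.group(2) or 0)        # name[t-1] looks one row up
            return f"{col[match.group(1)]}{row - lag}"

        for a in attributes:
            cells[f"{col[a]}{row}"] = "=" + REF.sub(to_cell, equations[a])
    return cells


if __name__ == "__main__":
    sheet = compile_model(
        attributes=["balance", "interest"],
        equations={"balance": "balance[t-1] + interest[t]",
                   "interest": "balance[t-1] * 0.05"},
        initial={"balance": 1000, "interest": 0},
        periods=3,
    )
    for address in sorted(sheet):
        print(address, sheet[address])
```

The two equations are written once and the repetition down the rows is generated mechanically, which is where hand-built spreadsheets tend to go wrong.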
There have been other
presentations on Model Master, and
it has been augmented to include a decompiler so that a
spreadsheet may be turned into a concise set of equation
specifications. Several interesting things pop out of
that:
-
The Model Master program
may be a more attractive interchange format than
raw spreadsheets themselves.
This has the various merits
that:
-
What is transferred
is effectively a description of the
spreadsheet model; that may be more
usefully readable than the spreadsheet
itself;
-
It is likely to be
more compact than the "binary dumps" that
commercial spreadsheets
generate.
-
The model program
can't contain the
"macro viruses" that MS Office has been
plagued with of late.
-
Any "nefarious"
calculations will be visibly described in
the model's text.
For instance,
suppose I were to have a special formula
for the line calculating my payroll amounts,
that would show up.
-
The decompiler can readily
pick up on which calculations are being run
"hard-coded," and which
are using formulae.
-
The "tough part," at this point, which
warrants considerable additional research, is the
notion of doing some searching to find repeated
patterns of formulae.
For instance, it is very
common for there to be a column of cells that
computes some sort of "cost," by multiplying a quantity
cell by a price cell, perhaps adding in taxes or
other costs, coming up with a "total cost."
It would be very valuable
to recognize the repetition of that formula, and
essentially present the formula only once in the
model. Note that this is exactly
the way Improv treated
formulae; they were not defined merely for a cell,
but rather for a whole range of
cells.
In a traditional
spreadsheet, the repetition is done by hand, which
is one of the major areas where modelling errors
creep in. By "pattern
searching," such errors may both be found
(when decompiling) and avoided altogether (when
compiling).
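One simple way to search for repeated formulae, sketched below in Python, is to rewrite every "A1"-style reference as an offset from the cell holding the formula (in effect the relative "R1C1" form mentioned earlier) and then group cells whose rewritten formulae are identical. This is my own illustration, not the Model Master decompiler; relative_form and find_patterns are hypothetical names, and the sketch ignores absolute references, ranges, sheet names, and function names that happen to end in digits.

```python
import re
from collections import defaultdict

CELL = re.compile(r"\b([A-Z]+)([0-9]+)\b")


def col_number(letters):
    """Convert column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def relative_form(address, formula):
    """Rewrite A1-style references as offsets from the formula's own cell."""
    home = CELL.fullmatch(address)
    home_col, home_row = col_number(home.group(1)), int(home.group(2))

    def rel(ref):
        dc = col_number(ref.group(1)) - home_col
        dr = int(ref.group(2)) - home_row
        return f"R[{dr}]C[{dc}]"

    return CELL.sub(rel, formula)


def find_patterns(formulae):
    """Group cell addresses by relative formula; keep groups of two or more."""
    groups = defaultdict(list)
    for address, formula in formulae.items():
        groups[relative_form(address, formula)].append(address)
    return {p: sorted(cells) for p, cells in groups.items() if len(cells) > 1}


if __name__ == "__main__":
    sheet = {"C2": "=A2*B2", "C3": "=A3*B3", "C4": "=A4*B4", "C5": "=A5+B5"}
    print(find_patterns(sheet))
```

In the example sheet, C2 through C4 collapse into one quantity-times-price pattern, while C5 stands out as the odd one, which is exactly the sort of cell an auditor would want flagged.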
Further development has built a
web-based front end for Model
Master, The Spreadsheet Autopublisher.
Spreadsheet algebra describes a notation
for manipulating Model Master spreadsheet
models.
More recently, Jocelyn has
"ditched" Java in
favor of Prolog
for the latest versions of Model
Master.
One point of view is to
treat spreadsheets as a form of "database," and, in regarding
them from that perspective, to use database-oriented tools to
work with data.
-
Excel-versus-Access
Spreadsheet-vs-DB test. Well worth looking
at.
-
www.DigDB.com - DIG
DIG is billed as a "Personal Data
Warehouse" or "End User Query Tool." It may be used
to extract data from spreadsheets, pushing it into
a database, to be analyzed using a set of database
tools, in this case, written in Java.
By having a "real" database
there (albeit a somewhat weak one), they can
certainly overcome the common deficiencies where a
spreadsheet "chokes" if presented with too much
data, or if you try to have too many elements
involved in a
Pivot
Table.
-
EUSPRIG -
European Spreadsheet Risks Interest
Group
EuSpRIG was founded in
March 1999 when researchers from ISACA (Northern UK
Chapter), University of Wales Institute Cardiff and
the University of Greenwich came together to
discuss the ever increasing problem of business
risk associated with spreadsheet errors.
-
Stop the
Subversive Spreadsheet!
-
wikiCalc
Dan Bricklin's Wiki tool
that will store web-updatable
spreadsheet-structured data
A number of "dead ends"
have been encountered in the ongoing development of
spreadsheets.
-
The "traditional"
spreadsheet systems went through a process of
"racing for successive refinements" through the
late 1980s and early 1990s, largely seeking to add
"feature points" to win the contests for "most
features counted in the reviews."
Unfortunately, this means
that the current code bases are not terribly
amenable to more radical evolution, and many of the
"improvements" are merely cosmetic, in providing
more and more sophisticated tools for managing the
appearance of the
data.
The major
problem with the
"traditional" spreadsheet system is that it
does not provide much in the way of
"structuring tools" to recognize and
enforce the structure of the data
model, as described in
Problems with Modern Spreadsheet
Developments.
-
Improv proposed better ways of building models,
and essentially mandated constructing
spreadsheets as a process of constructing a system
model.
This had the unfortunate,
and, as it turns out, unacceptable effect of
preventing the "free form" spreadsheet construction
that traditional spreadsheets
encouraged.
In a traditional
spreadsheet, you have a set of rows and columns in
which you are free to put anything.
It is easy to prototype something
up, throwing values here and there wherever it
seems convenient to throw them.
In Improv, elements could not be
added to the matrix without defining the nature of
the row and column. This eliminates "doodling
around." On the one hand, there is good to it, in
that spreadsheets commonly turn into undefined
spaghetti. On the other hand, if you have to be
able to "theoretically" justify every change you
make to the shape of the spreadsheet, then the only
people that will be using the tool will be
"theorists," and to a great extent, that's what
happened with Improv.
-
Model Master provides considerable power in
defining models, provides all sorts of "strong
typing" options, and provides the logical extension
of having the language specify access to robust
data sources like relational
databases, but has two substantial
demerits:
-
It altogether rules
out "free form" construction of data into
sheets
Although this
is changing, as
construction of a "decompiler" is
underway.
-
It
mandates using
a declarative programming
language to describe the
model.
The implicit
"programmability" of the cellular automata
means that the average user doesn't
need to know about
programming; unfortunately, Model Master pushes
programming in their face.
-
It seems to me that
a "step forward" is to try to take the
merits of each of these approaches, whilst
seeking to avoid
the demerits.
Robert Monfera
<monfera@fisec.com>
pointed out to me what he described as
"uniform structural unification;" basically
the notion of taking spreadsheets in the
traditional "free form" pioneered with
VisiCalc, and
then, rather than starting by trying to
enforce structure (as was the case with
Improv), instead
searching for
structure.
It's not a
mechanism of infinite analytical power; it
likely will only be helpful to find some
limited bits of structure. Of course,
"limited" may still be sufficient to
actually provide some useful added
functionality to relatively unsophisticated
users, and forcing people to start from a
data modelling perspective, as with
Improv, hasn't
proven terribly popular.
The strength of the
traditional spreadsheet is in providing a "free form" medium
where users may construct models without directly having to
program.
So, we start with a front
end that is a very "traditional" sort of spreadsheet. Rows,
columns, cells, formulae.
It would be nice to get
the benefits of Model
Master, in providing the ability to attach fairly strong
"rules" to portions of the spreadsheet, whether to enforce the
use of common formulae or to enforce "strong typing" of the
data types used in those regions. For instance, a region that
represents "dates" should contain nothing other than legitimate
dates.
The route to this is to
use some "artificial intelligence-like" techniques to search
for patterns in the data, and to write up
rules to propose to the user. I will call these "Pattern
Wizards."
This could include
patterns such as the following:
-
Detecting sequences of cell
contents that look, for instance, like
dates.
The proposal would then
offer to:
-
Give the region a
name indicating that
it is a "date" region;
-
Attach "type"
information to the region, requiring that
all cells contain dates;
-
If a clear sequence
is indicated, offer a formula that would
compute the contents
of the cells.
-
Detecting that a region
contains a Price/Quantity formula.
For instance, there may be
a column that multiplies the contents of a cell in
another column of the current row (indicating
quantity?) by either a specific value, or by the
contents of one cell somewhere in the sheet
(perhaps containing a constant price?), or by the
contents of a cell in another column (indicating a
price for the current row?).
Such a pattern is
suggestive of a price/quantity relationship, and
the system could offer to:
-
Establish all
three regions (price, quantity, total cost)
as having names;
-
Attach "type"
information to all three
regions;
-
Name and attach
the single formula to all
of the formula cells.
-
Detecting a "running
balance" formula.
If a sequence of cells
each add the "cell
above" to "values to the left," this looks like a
running balance.
A similar set of proposals
could be generated, to attach "type" information,
to attach names, and to attach the single formula
across all the "balance" cells.
-
In order to provide goodly
flexibility in offering "abstractions" that make
for convenient formulae, the system would need to
allow construction of "user-defined"
functions.
My preference would be for
this to be a dynamic language such as Lisp;
a critical factor is for the
language to be quite functional,
where cells receive one
value that is solely based on the input
parameters. Other interesting alternatives would
include:
-
ML, which makes major use of the notion
of strong static typing; this would address
the issue of the way cells can contain
different "types" of data, and prevent many
classes of runtime errors.
-
Haskell, characterized by being purely
functional, and providing
lazy evaluation.
Lazy evaluation is a particularly
useful notion for efficient implementation
of spreadsheet calculation, as it allows
deferring calculations until they are
actually needed. Mind you, this only
particularly applies to order of
calculations, which isn't really a good
enough reason for picking an implementation
language.
-
It would be
valuable for the "extension language" to be
the language in which the
"pattern
wizards" described above are
constructed so that it is possible to
augment the patterns without a need
to deploy a whole new
system.
The point here is
not to expect users
to write their own patterns, as most won't
be able to cope with this. The
"average
user" is not going to be
writing "Pattern
Wizards," but will rather use
"pre-canned"
ones.
However, an
organization might hire a programmer that
looks around to find the organization's
"favorite
patterns," and create Wizards to
detect them.
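As an indication of how small a "pre-canned" Pattern Wizard could be, here is a Python sketch of the date-sequence wizard described earlier in this list. It assumes cells hold ISO-style text dates (YYYY-MM-DD) and that a proposal is just a dictionary for the user interface to present; propose_date_region and the shape of the proposal are hypothetical.

```python
from datetime import date


def parse_iso(text):
    """Return a date for 'YYYY-MM-DD' text, or None if it is not a valid date."""
    try:
        return date.fromisoformat(text)
    except (TypeError, ValueError):
        return None


def propose_date_region(column):
    """Inspect a column of cell strings; return a proposal dict or None."""
    dates = [parse_iso(c) for c in column]
    if len(dates) < 2 or any(d is None for d in dates):
        return None                      # not a date region; propose nothing
    proposal = {"name": "Dates", "type": "date", "formula": None}
    # If every step between neighbouring cells is identical, offer a formula.
    steps = {b - a for a, b in zip(dates, dates[1:])}
    if len(steps) == 1:
        step = steps.pop()
        proposal["formula"] = f"previous cell + {step.days} day(s)"
    return proposal


if __name__ == "__main__":
    print(propose_date_region(["2002-01-07", "2002-01-14", "2002-01-21"]))
    print(propose_date_region(["2002-01-07", "hello"]))
```

The wizard only proposes; whether the name, the type restriction, or the formula is actually attached to the region stays the user's decision.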
Note that the
"pattern wizards" are given
the ability to decide that certain regions of the
spreadsheet are to have certain names, and are to use
common types/formats/formulae.
A logical extension to
this would be to allow attaching database tables
to regions, so that you might have columns that look up
database information based on either static SQL SELECTs, or
look up data based on other cells.
Indeed, if a good
database schema can be arrived at for the data that does
not
represent computed information, it would be very nice to offer
the ability to push the data conformant to the schema into a
relational
database. That provides the merits of:
-
Eliminating arbitrary size
limits, as databases are commonly able to quite
readily cope with managing tables containing
millions of tuples.
-
Improving robustness, as
good database systems support transactional
updates, and generally don't crash very
much.
-
Having the database manage
data validation.
For instance, a relational
database will reject attempts to add data with
invalid dates, where, for instance, an attempt to
do INSERT INTO TRANS (ACCT,
AMOUNT, DATE) VALUES (101, 255.71,
'2002-02-29') will elicit an error message
such as ERROR: Bad date -
external representation
'2002-02-29'.
ZigZag, an invention of
Ted Nelson, is a new type of data structure. For
mathematicians, the key words would be discrete,
multidimensional, locally euclidean, with global directions
(coordinate axes). Nonmathematicians can find an explanation in
the FAQ at the GZigZag project website, but as
a short and very inadequate summary, it is a spreadsheet on
steroids. Or a database on acid. Or a filesystem on ...
whatever.
There was a free
implementation, GZigZag, written in Java using
Swing.
Major properties of the
system are thus:
- Discrete
-
the information is stored
in cells, kind of like a spreadsheet.
- Multidimensional
-
instead of two dimensions,
X and Y, that a spreadsheet has, a ZigZag space can
have any number of dimensions which are
distinguished by strings.
- Locally
Euclidean
-
a spreadsheet is globally
euclidean, i.e. it is a lattice. ZigZag
is only locally euclidean, so the neighborhood of
(i.e. the cells next to) a
given cell looks euclidean: if you go up and come
down, you are back where you were
before.
But if you start from
location 1, and go up, left, down and right, you
might not get back to where you were; let's say you
are at location 2. But if you then retrace your steps, going left, up,
right and down from location 2, you get back to
location 1.
The connections in ZigZag
are user-alterable so you can connect any two cells
along any given dimension, but because of the local
euclidean constraint, each cell can be connected to
only one cell in the positive and one cell in the
negative direction on each dimension.
Computer scientists might
note that ZigZag is an interesting special case of
graphs.
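The properties above fit in a short Python sketch. This is my own minimal model and not code from GZigZag or Gzz: the Cell class, the connect and walk helpers, and the dimension names "d.1" (positive means right) and "d.2" (positive means down) are assumptions for illustration. Each cell keeps at most one neighbour per direction per dimension, which is the "locally Euclidean" constraint.

```python
class Cell:
    """A cell with at most one neighbour each way along every dimension."""

    def __init__(self, content):
        self.content = content
        self.pos = {}  # dimension name -> neighbour in the positive direction
        self.neg = {}  # dimension name -> neighbour in the negative direction


def connect(a, b, dim):
    """Make b the positive neighbour of a along dim."""
    if dim in a.pos or dim in b.neg:
        raise ValueError(f"cell already connected along {dim!r}")
    a.pos[dim] = b
    b.neg[dim] = a


def walk(cell, moves):
    """Follow (dimension, +1 or -1) moves; return None on running off an end."""
    for dim, direction in moves:
        links = cell.pos if direction > 0 else cell.neg
        cell = links.get(dim)
        if cell is None:
            return None
    return cell


UP, DOWN = ("d.2", -1), ("d.2", +1)
LEFT, RIGHT = ("d.1", -1), ("d.1", +1)


def build_demo():
    """Wire five cells so that a 'square' walk does not close."""
    loc1, loc2 = Cell("location 1"), Cell("location 2")
    above, corner, below = Cell("above"), Cell("corner"), Cell("below")
    connect(above, loc1, "d.2")    # 'above' sits over location 1
    connect(corner, above, "d.1")  # 'corner' sits left of 'above'
    connect(corner, below, "d.2")  # 'below' sits under 'corner'
    connect(below, loc2, "d.1")    # location 2 sits right of 'below'
    return loc1, loc2
```

With this wiring, walking up, left, down, right from location 1 lands on location 2, while up followed by down always returns to the starting cell; retracing the square in reverse from location 2 leads back to location 1.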
-
ZigZag
Structure
-
Gzz
Replacing GZigZag, since ZigZag was a registered
trademark, this is a "free software" implementation of the
ZigZag
Zzstructure cell-oriented interconnection
scheme.
Cancelled because of
patents on underlying technology...