This article describes XML-GL, a visual query and transformation language for XML. The use of XML-GL does not rely on the availability of schema information. If some kind of schema is available, creating and editing queries should be easier, but the language's evaluation model does not rely on a schema.

On the other hand, it is possible to express schema information with XML-GL, with more expressiveness than in a DTD, but without a type system for textual content like the one in XML Schema.

An XML-GL program is a set of one or more pairs of XML-GL graphs arranged horizontally, the left graph representing the query pattern and the right graph the construction pattern. An XML-GL schema is an XML-GL graph enriched with some extra constructs for conditions that are not mentioned in the context of query programs.

A graph consists of labelled boxes representing elements. Their hierarchical structure is modelled through directed edges. If the structure modelled this way is a tree, the meaning is straightforward for schema graphs as well as for patterns. To denote that the order of child elements is relevant, the first edge is crossed by a short stroke. If the structure is an acyclic graph but no longer a tree, and it is used as a query pattern, it can be interpreted as a join. This is also the case if two trees share nodes. In a schema graph the edges are labelled with multiplicity information, like the edges of ER diagrams. A special visualisation is used for nodes containing only textual content: they are drawn as hollow circles. Filled circles depict attributes. The edges to those circles are labelled with the element name or the attribute name.

A feature used solely in schemas is an xor arc crossing two edges. To model mixed content of plain text and elements, the elements and a text node circle are xor-ed. The edge to the text node circle is labelled with "content" to indicate that it is not a text-only element. Therefore, in these circumstances it is not possible to construct a text-only element called "content".

The basic idea behind the use of XML-GL graphs for querying has already been explained in the section about [comai00graphical]. Further constructs are explained now. There are three kinds of construction primitives for aggregation purposes:

Plain boxes : They construct one element, named after the box label, for each matched element on the query side that is attached to them.
Triangles : They simply list all the matched elements from the query part to which they are attached, either explicitly or by name.
List icons : They act as a kind of combination of both, aggregating the matched elements according to a further grouping condition that is explicitly linked by an edge.

Further constructs are available for nesting and unnesting of content, which is a powerful tool for avoiding recursive queries. Nevertheless, it does not seem possible to use them to express queries about the structure of the queried data, which is usually simple with recursion.

Graphical Query Languages for Semi-Structured Information

Sara Comai.

Sara Comai's PhD thesis presents two graphical query languages called XML-GL and WG-Log. The XML-GL presented here is not very similar to the language XML-GL presented in [ceri99xmlgl], but it is likely to be an evolutionary step due to the ongoing research on this language. Furthermore, this article is quite condensed and therefore may cover fewer details than the older paper [ceri99xmlgl]. Both languages are based on graphs.

WG-Log is based on G-Log, a database language intended to query complex objects through the use of graphs and pattern matching. The patterns are explicitly based on schemas. A query consists of one or more graphs, each representing a rule. In contrast to other approaches like Xing [erwig00visual] and also visXcerpt, the query part and the construction part are integrated in one graph: the querying graph structure is drawn with thin or red lines, while the construction structure is drawn with green or thick lines. This clearly shows the flow of information between the query part and the construction part: they share the same nodes, making variables obsolete. A possible drawback is that complicated graphs may become cluttered with many edges, but this is speculation and would need to be investigated.

The property of being explicitly schema-based is used to provide very informative graph structures, using not only named nodes but also named relations. On the other hand, WG-Log is only applicable to schema-based data, whereas semistructured data encourages the use of schema-free data. This domain is covered by XML-GL.

XML-GL uses pairs of graphs, arranged horizontally and separated by a line, to represent rules. Graphs consist of boxes with names inside, representing element names. Complex programs may consist of several rules. The graphs are used as query and construction patterns. Because XML data is tree-structured, simple queries can be stated with trees as query patterns. Graphs are used to model joins. Information is carried from the query side to the construction side through tag name matching, meaning that an element with a certain name in the query pattern is carried to the corresponding position in the construction pattern where the same name occurs. Various constructs exist for using unnamed data and for resolving ambiguity. If, for example, a name occurs more than once in a query graph (due to a join, for instance), a line connecting the relevant query node and construction node is used.
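The tag name correspondence between the two sides of a rule can be imitated textually. The following is only a minimal sketch in Python, not XML-GL itself: the function apply_rule, the sample document and the use of an ElementTree path as "query pattern" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the article.
SOURCE = """
<library>
  <book><title>Data on the Web</title><year>1999</year></book>
  <book><title>Visual Languages</title><year>2001</year></book>
</library>
"""

def apply_rule(source_root, query_path, result_name):
    """Imitate one rule: match elements on the query side and place them
    under a newly constructed element on the construction side."""
    result = ET.Element(result_name)
    for matched in source_root.findall(query_path):
        # The matched element reappears under the same tag name, which
        # stands in for the tag name match between the two graphs.
        # Only the text is copied here; child elements are ignored.
        copy = ET.SubElement(result, matched.tag)
        copy.text = matched.text
    return result

root = ET.fromstring(SOURCE)
titles = apply_rule(root, "./book/title", "titles")
print(ET.tostring(titles, encoding="unicode"))
```

The left graph corresponds roughly to the path "./book/title", the right graph to the new "titles" element containing a "title" box.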

XML-GL is explained in more detail in [ceri99xmlgl].

XML Queries and Transformations for End Users

Martin Erwig.

Xing is a form-based interface for expressing XML queries and transformations by so-called "document patterns". The main idea is to let the user draw samples of the data she intends to query from the data source. Optionally, a pattern for the structure of the resulting document may be provided. These two document patterns together are called a document rule. To transport information from the querying pattern to the result-generating pattern, variables (so-called aliases) are used.

The definition of this system was based on the following design goals:

Do not define a(nother) textual language
Avoid XML syntax
Use a simple and intuitive visualisation of XML
Employ pattern matching
Facilitate restructuring
Keep the system simple and intuitive

Based on these constraints it became necessary to define a visualisation of XML. This is called the document metaphor, which is strongly influenced by documents like product descriptions and forms. XML documents are visualized as follows:

XML elements are represented by boxes (with rounded edges)
Tag names are written in bold font above the corresponding element on the left
Elements containing only plain text are shown as the (bold) tag name followed by the text on the same line, the two separated by a colon.
Attributes are displayed like the abbreviated elements above, except that the attribute name is not bold.
Elements containing text mixed with other elements (as in HTML/XHTML documents) are not supported.
XML namespaces are not supported either.

Document patterns are built as instances of the document metaphor. Further elements are:

Regular expressions : Regular expressions (and therefore also simple wildcards) may be used instead of tag names
Aliases : Matched patterns may be enclosed in curly brackets prefixed by a name. It is possible to refer to this name in the result pattern.
Or-patterns : Or-patterns are shown like or-statements in regular expressions: the components are arranged horizontally and joined by a vertical bar.
Deep queries : Deep queries may match at arbitrary depth in the document tree below their occurrence. They are depicted as an ellipsis prefixed to the pattern used for the deep query.
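The pattern elements listed above can be approximated in a textual setting. The sketch below is an assumption-laden illustration in Python and not part of Xing: the function match, the sample document and the dictionary used as alias binding are hypothetical names chosen here.

```python
import re
import xml.etree.ElementTree as ET

def match(root, tag_regex, deep=False):
    """Return the elements below root whose tag fully matches tag_regex.

    deep=False inspects only the direct children of root; deep=True
    inspects all descendants, similar to a pattern prefixed by an ellipsis.
    """
    pattern = re.compile(tag_regex)
    candidates = root.iter() if deep else list(root)
    return [e for e in candidates if e is not root and pattern.fullmatch(e.tag)]

# Hypothetical sample document.
DOC = ET.fromstring(
    "<catalog><item><name>Lamp</name><price>20</price></item>"
    "<section><item><name>Desk</name></item></section></catalog>"
)

shallow_items = match(DOC, "item")            # direct children only
deep_items = match(DOC, "item", deep=True)    # any depth

# An or-pattern written as a regular expression; the dictionary key plays
# the role of an alias that a result pattern could refer to.
bindings = {"names": [e.text for e in match(DOC, "name|title", deep=True)]}
```

A result pattern would then refer to the alias, here simply by looking up bindings["names"].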

Kaleidoquery: A Visual Query Language for Object Databases

N. Murray, N. Paton and C. Goble.

Kaleidoquery is a visual query language for object databases. It is a visual syntax backed by the textual query language OQL. The main philosophy behind Kaleidoquery is a filter flow metaphor, which is somewhat comparable to a functional paradigm. Kaleidoquery is geared towards the needs of casual users who want to create ad hoc queries. Its iconic nature is useful for users who are not aware of the database schema: they can deduce it from the visualisation.

A query can be understood as a chain of filters, with all database extents passing through this chain and being successively reduced until the intended result is left over. The first element of this filter chain is therefore an icon representing a certain class in the database. The database author is urged to provide icons for the database classes. The chosen icon is the lowest member of the filter structure. For a projection of some attributes, a vertical arrow to an oval containing the attributes to project is used. A filter network without a projection as its last filter step has no output. This is useful for joins, which are explained later.

Further simple constraints (such as less-than or equality) can be placed in textual notation on this upward filter path. To express a conjunction of constraints, all constraints are attached vertically by arrows. For disjunctions, the filter flow is split into several parallel vertical filter flows, each strand representing one disjunct. This roughly horizontal arrangement of disjuncts and vertical arrangement of conjuncts is quite common in visual languages. Negation is expressed by inverting the colours of the corresponding simple constraint.
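The filter flow reading of conjunction, disjunction and negation can be made concrete with a small sketch. It is written in Python rather than OQL, and the class Person, the sample extent and the helper names are assumptions introduced only for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    name: str
    age: int
    city: str

# Hypothetical class extent that enters the filter flow at the bottom.
EXTENT = [
    Person("Ann", 34, "Leeds"),
    Person("Bob", 17, "York"),
    Person("Cy", 52, "York"),
    Person("Di", 15, "Leeds"),
]

def conjunction(*constraints):
    """Constraints stacked on one strand: each one narrows the flow further."""
    def flow(objects):
        for constraint in constraints:
            objects = [o for o in objects if constraint(o)]
        return objects
    return flow

def disjunction(*strands):
    """Parallel strands: an object passes if any single strand lets it through."""
    def flow(objects):
        results = [strand(objects) for strand in strands]
        return [o for o in objects if any(o in r for r in results)]
    return flow

def negation(constraint):
    """Counterpart of drawing a constraint in inverted colours."""
    return lambda o: not constraint(o)

def is_adult(p):
    return p.age >= 18

def lives_in_york(p):
    return p.city == "York"

query = disjunction(
    conjunction(is_adult, lives_in_york),
    conjunction(negation(is_adult), negation(lives_in_york)),
)

# The projection is the last filter step; without it there is no output.
names = [p.name for p in query(EXTENT)]
```

Each call to conjunction corresponds to one vertical strand, and disjunction to the split into parallel strands.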

Joins are expressed by parallel filter expressions sharing a simple binary constraint, like equality. One side of the equality is then located in one sub filter, the other side in the other.

Object oriented databases may store complex structures where objects may contain other objects as members. Refining a query based on certain members of selected objects along the filter flow is a common operation. Shifting the selection to the members of an object is called navigation. Navigation is represented by a horizontal arrow with the member's name written above it.

In object databases a common operation is checking for membership of objects. For this purpose a curly horizontal arrow with the word member is used. The same curly arrow is also used for the for all and the exists quantifiers found in OQL.

Further expressions like functions, grouping and ordering constructs are expressed by wrapping the ovals with further ovals and the corresponding operation included in the wrapping structure. This corresponds to the nested term structure of the textual OQL counterparts.

VXT: Visual XML Transformer

E. Pietriga and J.-Y. Vion-Dury.

VXT is a visual language integrated into an interactive runtime and editing environment. Its main purpose is transformation rather than querying. It is a pattern-based language. Because the emphasis is on transformation rather than on querying, some commonly used query language features like aggregation are not covered.

The roots of VXT can be found in XSLT. Several rules provide a template mechanism. A rule consists of a query part and a construction part. Information matched on the query side is collected through links to slots on the construction side, and the links can be typed as copy, text extraction or template application relations.

The XML visualisation is a so-called visual treemap representation: elements are displayed as horizontally aligned nested boxes. Above each box a label indicates the element name. Text nodes are represented as diamond-shaped boxes, attributes as triangles.

VXT: A Visual Approach to XML Transformations

E. Pietriga, V. Quint and J.-Y. Vion-Dury.

This article presents VXT in more detail than [pietriga01xvt2]. The further aspects covered here relate to document type definition, further language constructs and the translation of VXT programs into XSLT.

A very interesting aspect mentioned is that very little extension is necessary to use this visualisation mechanism as a document type definition formalism. The proposed extensions are:

conditions or choices: Choices are modelled by stacking the different alternatives vertically.
zero or one occurrence: Those elements are surrounded by a dashed box.
zero or many occurrences: As for zero or one occurrence, those elements are surrounded by a dashed rectangle, but a second, empty dashed rectangle is attached to its right through a short dashed line.
one or many occurrences: The same notation as for zero or many occurrences is used, except that the left box is drawn with a solid line instead of a dashed one.

The document type definition presented this way can be used as translucent overlay or template mechanism inside the editing window to support easier transformation writing.
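The occurrence and choice notations listed above have direct counterparts in textual DTD content models (?, *, + and |). The sketch below shows that correspondence in Python. It is not taken from the article: the function element_decl, the occurrence names and the sample element are assumptions for illustration.

```python
# Mapping from the occurrence notations to DTD occurrence indicators.
OCCURRENCE_SUFFIX = {
    "one": "",
    "zero_or_one": "?",
    "zero_or_many": "*",
    "one_or_many": "+",
}

def element_decl(name, children):
    """Build a DTD element declaration.

    children is a list of (child, occurrence) pairs, where child is either
    an element name or a tuple of alternative names (a choice).
    """
    parts = []
    for child, occurrence in children:
        if isinstance(child, (list, tuple)):
            text = "(" + " | ".join(child) + ")"
        else:
            text = child
        parts.append(text + OCCURRENCE_SUFFIX[occurrence])
    return "<!ELEMENT {} ({})>".format(name, ", ".join(parts))

# Hypothetical document type.
decl = element_decl(
    "article",
    [
        ("title", "one"),
        ("abstract", "zero_or_one"),
        (("section", "appendix"), "one_or_many"),
        ("note", "zero_or_many"),
    ],
)
print(decl)
```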

Aesthetic Programming

Paul A. Fishwick.

This work presents a new concept called aesthetic programming, a combination of traditional methods of computer programming with art and artistic concepts. The visual approaches presented are applicable to modelling as well as to programming. No concrete visual language is proposed, but rather a methodology for developing visual models or programs. The main idea is to provide artistic constructs containing sufficient symbolism and a concise metaphoric mapping to be executable on a computer.

The proposed methodology is called rube. It is intended to help modellers construct aesthetic models of physical phenomena as well as design programs. This methodology is based on the following steps:

Choose what is to be modelled
Select the type of model (e.g. automata, data flow network, petri net, graphs, ...)
Choose an aesthetic that is suitable for mapping elements from the previous steps via metaphors to elements of the aesthetic style (e.g. states may be mapped to rooms and transitions to pathways, if this representation is intuitive for the concrete automaton to be modelled; this particular style provides an internal view of the automaton as opposed to an external view)
Define a mapping between stylistic components and the formal model type components
Create the model or program

From Queries to Answers in Visual Logic Programming

Jordi Puigsegur, W. Marco Schorlemmer and Jaume Agusti.

This work presents a visual declarative programming language based on two graphical constructs: directed acyclic graphs and graphical set inclusion. The acyclic graphs are used to represent predicate application, while set inclusion represents logical inclusion. In this way, functional and relational programming styles are combined in one visual language. Furthermore, the operational semantics of this language is defined by visual inference rules. This means that, instead of translating the visual language into a textual logic-based language on which inference rules can be expressed in the usual way, the rules are expressed as tuples of diagrams.

The language consists of visual sets, elements and diagrams: elements are the equivalent of terms in logic-based languages, and sets correspond to predicates. Diagrams combine different predicates and terms. A diagram is the smallest complete unit of description in this language.

A Visual Approach to XML Document Design and Transformation

Kang Zhang, Da-Qian Zhang and Yi Deng.

This paper presents a visual approach to automatic validation and transformation of XML documents. The approach uses a formal graph transformation mechanism based on the reserved graph grammar (RGG), which has been used for the automatic generation of domain-oriented visual programming languages. RGGs are collections of graph rewriting rules that are themselves based on graphs.

The transformation idea is in fact similar to XSLT transformations: each rule matches some elements, and for further transformations the templates are applied recursively to the child nodes. In contrast to XSLT, a single rule can work on structures more complicated than individual nodes. The relevant nodes are referenced through numerical IDs.

The hierarchical XML document structure is represented through trees of rectangular nodes with the labels inside their bounds. Those rectangles are connected through edges to indicate child/parent relationships. At the top of each box there is a smaller box which indicates the type of the parent node. This box is the slot of the parent/child edge. At the bottom of the node box are the slots indicating the possible child elements.

The document type system modelled through the slot mechanism is useful to see document instances at the same time as the document type. It is likely to be less expressive than the DTD formalism, because alternative node types are not indicated on the node instances.

QURSED: Querying and Reporting Semistructured Data

Yannis Papakonstantinou, Michalis Petropoulos and Vasilis Vassalos.

The QURSED project provides a tool for authoring web-based query forms and reports. To create the query forms, the developer may use any web page editor. These forms are backed by a query language called TQL (tree query language). An editor for tying query forms to TQL programs (the QURSED editor) is presented. At this level the developer has to cope with the tag structure of the query forms in order to provide bindings to constructs of the TQL program. By filling in the form, the user ties concrete values to variables in the TQL program, and the query can be executed.

A TQL program is a tree consisting of constant nodes (labels that match elements in the queried XML tree), node name variables, literal variables (at leaf level), and AND and OR constructs that make it possible to build conditions based on the comparison of variables (to literals or to other variables). In the end a TQL program is translated into an XQuery program that is executed to produce the result.
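How a condition tree built from AND, OR and variable comparisons might be evaluated once the form has supplied values can be sketched as follows. This is not TQL syntax and not the QURSED implementation: the tuple encoding, the "$" prefix for variables and the function names are assumptions made for this Python illustration.

```python
import operator

# Comparison operators available at the leaves (an assumed selection).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def value_of(operand, bindings):
    """Resolve a variable (a string starting with '$') or return a literal."""
    if isinstance(operand, str) and operand.startswith("$"):
        return bindings[operand]
    return operand

def holds(node, bindings):
    """Evaluate a condition tree against the values bound by the form."""
    kind = node[0]
    if kind == "AND":
        return all(holds(child, bindings) for child in node[1:])
    if kind == "OR":
        return any(holds(child, bindings) for child in node[1:])
    op, left, right = node  # comparison leaf
    return OPS[op](value_of(left, bindings), value_of(right, bindings))

# Hypothetical condition: price below a user-supplied maximum and one of
# two brands.
CONDITION = (
    "AND",
    ("<", "$price", "$max_price"),
    ("OR", ("=", "$brand", "Acme"), ("=", "$brand", "Bolt")),
)

accepted = holds(CONDITION, {"$price": 40, "$max_price": 50, "$brand": "Bolt"})
rejected = holds(CONDITION, {"$price": 60, "$max_price": 50, "$brand": "Bolt"})
```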

The TQL program editor is a classical tree view (the tree view used in this project is Swing's JTree, found in Java) with nodes representing the language components. The editor model is the default behaviour provided by the windowing toolkit used to implement the TQL program editor.

Compared to Xcerpt, TQL is much simpler to implement, but the language is also much more restricted. Programs written in TQL tend to be longer and less readable than Xcerpt programs.

The Unified Modeling Language Reference Manual

J. Rumbaugh, I. Jacobson and G. Booch.

This book describes UML, a visual modelling language used to specify object-oriented software. The language provides a very extensive visual vocabulary for modelling different aspects of object-oriented software. A diagrammatic way of visual modelling is used, with the vocabulary divided among several diagram types. UML does not define a single diagram type combining all aspects of object-oriented modelling. These are the different aspects of UML:

static design : This can be described through class and object diagrams. They mainly consist of boxes and typed edges forming a graph. Boxes describe objects or classes. They contain the class or object name, the atomic members (in textual representation) and the members' multiplicity. Atomic members are objects not further described in the formalism, or simple values like numbers and strings. Other members are described by other boxes. The different edges describe association and inheritance, both basic concepts of object-oriented formalisms. The association edges are used to associate non-atomic members with classes.
dynamic design : For dynamic design three different types of diagrams are provided: sequence diagrams, state transition systems and use case diagrams.
State transition systems are directed graphs with labelled edges and nodes. Nodes represent the states of an object, edges the transitions between different states. A state is displayed similarly to objects or classes in the static diagrams, but only atomic members are shown and they have values assigned. The transitions are labelled with names (corresponding to the names of method members in the static diagrams).

Sequence diagrams are used to show the relationship of time and messages passed between objects for certain situations. Therefore different objects are represented through vertical bars indicating the activity of the object. The objects (arranged horizontally) pass messages through arrows to each other. Arrows are labelled with labels indicating the messages (corresponding to methods in the static design combined with concrete values). Different types of arrows indicate blocking or non blocking calls and results.

Use case diagrams are a very vague formalism for showing how actors interact with the system and how system components interact with each other. Components could be actions or transactions. Components can be grouped together by boxes to indicate system boundaries. Use case diagrams are graphs with object and actor nodes. Components are ovals with names, actors are depicted through named stickmen. Edges indicate a relationship between actor and component, inheritance between components and inclusion of components.

UML is intended not as a programming language but as a design and modelling language. Therefore the visual constructs describe vague terms, and no operational semantics is available for UML. Nevertheless, UML can be used to build skeletons of software projects, providing a basis for further implementation.

- comparison of class boxes with xmlVis and Xing's document metaphor

Query-by-Example: a data base language

M. M. Zloof.

QBE (short for Query-by-Example) is a database management language. The base idea is, that the user fills in examples of the data he/she intends to modify or query. The underlying data model for QBE is the relational data model. Because relational databases can be seen as a set of tables, the example queries edited by the user are tables.

QBE is a visual query language because it relies on tables, which provide a way of arranging information in two dimensions. Except for underlined text (which indicates variables), the further constructs of QBE are purely textual.

In QBE a query starts with an empty table skeleton based on the schema of the data. The user may now fill in fixed values in rows as constraints on the result data. For alternative constraints, further (implicitly OR-ed) rows are filled in. The result of such a query is a table containing only those tuples in the database that fulfil the constraints. The result schema is the same. To return data in a different schema, the user may use variables and the keyword P. (for print) in the columns relevant for the result. The same variables may be used in other (temporary result) tables. Temporary result tables may contain variables from different tables occurring in the same query. The same variable can also be used in different queries; this pattern is called a link. In this way joins over different relations can be formulated.
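The reading of a filled-in skeleton, with constraints within one row combined by AND and several rows combined by OR, can be sketched in a few lines. The sketch is in Python and is not QBE syntax: the table EMPLOYEES, its columns and the function qbe_select are hypothetical.

```python
# Hypothetical relation, represented as a list of tuples with named columns.
EMPLOYEES = [
    {"name": "Jones", "dept": "Toy", "salary": 8000},
    {"name": "Smith", "dept": "Toy", "salary": 12000},
    {"name": "Lee", "dept": "Shoe", "salary": 9000},
    {"name": "Moro", "dept": "Book", "salary": 7000},
]

def qbe_select(table, example_rows):
    """Return the tuples matching at least one example row.

    Each example row maps column names to fixed values. Columns left out
    correspond to empty cells of the skeleton and are unconstrained.
    """
    def matches(tup, example):
        return all(tup[col] == val for col, val in example.items())

    return [t for t in table if any(matches(t, ex) for ex in example_rows)]

# Two filled-in rows: (dept = Toy AND salary = 8000) OR (dept = Shoe).
result = qbe_select(EMPLOYEES, [{"dept": "Toy", "salary": 8000}, {"dept": "Shoe"}])
result_names = [t["name"] for t in result]
```

The result keeps the schema of the queried table, as described above; the projection onto the name column is added here only to keep the output short.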

For further qualification of variables or constant values, inequality operators and negation can be used, as well as arithmetic expressions. There is also a so-called Conditions box: expressions over variables, arithmetic expressions, equations and expressions combined with AND and OR can be placed here. It is intended to be used when conditions are difficult to express in table cells.

For grouping and aggregation further keywords are provided: ALL. prevents the omission of duplicate elements in the result, CNT. counts the matching tuples, SUM. adds all matched numeric values together, MIN. and MAX. return the minimum and maximum of the matched values, and AVG. calculates the average of the matched numeric values.
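The effect of these keywords can be illustrated on a single column. The following Python sketch uses invented sample rows and a hypothetical helper column; it only mirrors the behaviour described above and is not QBE syntax.

```python
# Hypothetical matched tuples.
ROWS = [
    {"dept": "Toy", "salary": 8000},
    {"dept": "Toy", "salary": 12000},
    {"dept": "Shoe", "salary": 8000},
]

def column(table, name, keep_duplicates=False):
    """Extract one column; duplicates are dropped unless keep_duplicates
    is set, which plays the role of the ALL. keyword."""
    values = [row[name] for row in table]
    if keep_duplicates:
        return values
    # dict.fromkeys keeps the first occurrence of each value, in order.
    return list(dict.fromkeys(values))

distinct_salaries = column(ROWS, "salary")
all_salaries = column(ROWS, "salary", keep_duplicates=True)

aggregates = {
    "CNT": len(all_salaries),
    "SUM": sum(all_salaries),
    "MIN": min(all_salaries),
    "MAX": max(all_salaries),
    "AVG": sum(all_salaries) / len(all_salaries),
}
```

Whether duplicates are kept changes the aggregates: with the distinct values only, the sum in this example would be 20000 instead of 28000.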

To be able to fully manage a database, not only querying but also inserting, deleting and updating are necessary. The keywords for these operations are D. for delete, U. for update and I. for insert. I. can also be used to create new tables, in conjunction with keywords describing the valid database types. These operations are used similarly to the print operation.

QBE has served as the conceptual framework for many successful commercial database systems such as MS Access. To assess feature completeness and for the sake of comparability, the expressiveness of query languages for relational databases is compared to the relational algebra or to the tuple calculus (which provide the same expressiveness): QBE can be translated directly into the tuple calculus and is therefore as expressive as, for example, SQL. On the other hand, it is much easier to use, especially for casual users: knowledge of how to fill in forms is more widespread than knowledge of textual database languages.

Visual Haskell: A First Attempt

H. John Reekie.

Visual Haskell is a visual functional programming language based on the textual functional language Haskell. It is very tightly bound to Haskell. In this way Haskell programs can be viewed or edited in the textual as well as in the visual representation. The goal is to provide a visual representation for all constructs found in Haskell.

The main concept behind Visual Haskell is the dataflow diagram. This seems to be the most widespread visual view of functional programming languages in general. A function is defined as a box labelled with its name and optionally an icon or a visual representation. These icons are used to provide new graphical elements for application-specific datatypes and functions. This is important to give programs instantly recognizable meaning in particular application domains. On one of the outer sides of the box's boundary (usually the right side) there are slots for the input arguments; on the opposite side (usually the left side) a slot for the output is available. On the inner side of the input argument slots, visual patterns can be attached, with further slots to pass the pattern variables to concrete function bodies. This is done through arrows. Different patterns and their corresponding bodies can be stacked (separated by a line) inside the function definition box.

An icon representing a function may have a special grey box representing a visual variable for the function's first argument. The special focus on the function's first argument is due to a widespread functional technique called currying: with currying, an n-ary function may be applied to m (<n) arguments, resulting in an (n-m)-ary function. (In this way compilers or virtual machines can break down every function application into a series of unary function applications.) Functions in functional libraries are often designed so that applying them to the first argument alone results in a very handy specialized function. This can be done very conveniently in Visual Haskell because of the visual variables inside the iconic function representation.
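Partial application and currying can be shown in a few lines. The sketch below uses Python instead of Haskell, so the currying has to be spelled out by hand; the functions scale and curry2 are hypothetical examples, while functools.partial is part of the Python standard library.

```python
from functools import partial

def scale(factor, value):
    """A binary function whose first argument is a natural one to fix."""
    return factor * value

# Applying the binary function to its first argument only yields a unary,
# specialized function.
double = partial(scale, 2)

def curry2(f):
    """Turn a binary function into a chain of unary functions."""
    return lambda a: lambda b: f(a, b)

triple = curry2(scale)(3)

doubled = list(map(double, [1, 2, 3]))
fifteen = triple(5)
```

In Haskell itself every function is already curried, so the equivalent of double is obtained simply by supplying fewer arguments than the function takes.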

The design of a completely visual OOP language

W. Citrin, M. Doherty and B. Zorn.

This publication presents a visual imperative object-oriented programming language called VIPR. VIPR's semantics is inspired by C++, but it may be understood in terms of a small number of graphical rewriting rules. Features like continuations and dynamic dispatch, which exist implicitly in C++, can be made explicit for the sake of easier program comprehension. The static view of programs as well as the visualisation of dynamic execution can be presented with the same visual metaphors.

In VIPR, programs are represented as nested circles. The notation can be seen as viewing a flowchart from the point of view of a bead moving along the flow lines. The nested circles can also be interpreted as segments of a pipeline, seen from a point of view inside this pipeline. VIPR defines objects and classes by visually aggregating definitions of data and procedures. Interfaces are defined by hiding all the private methods and data. Dynamically dispatched methods contain branches to all the appropriate implementations of the method, and the correct branch is selected dynamically based on the type of the object being called.

Procedures and functions are sets of nested rings outside the main procedure's rings. A procedure call is an arrow pointing to the procedure, with nodes attached to it representing the arguments. Nodes attached to the outermost ring of the procedure represent parameters. Each procedure must have at least one parameter, representing the continuation or the return address. The assignment of arguments to parameters is done through positional matching.
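The idea that every procedure receives its continuation as an explicit parameter corresponds to continuation-passing style in textual languages. The following is a minimal sketch in Python, not VIPR and not C++; the procedures add_cps and square_cps are hypothetical.

```python
def add_cps(a, b, k):
    """Add two numbers and hand the result to the continuation k
    instead of returning it."""
    k(a + b)

def square_cps(x, k):
    """Square a number and hand the result to the continuation k."""
    k(x * x)

results = []

# Arguments are matched to parameters by position; the last argument is
# always the continuation, which says where execution goes next.
add_cps(2, 3, lambda s: square_cps(s, results.append))
```

After the call, results holds the single value 25: the sum 5 is passed on to square_cps, whose continuation stores the square.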

Classes and objects are viewed as aggregations of methods. In essence, a VIPR class is a grouping of individual methods, each of which looks like a strand of fiber being viewed head-on. Subclasses are enclosed inside the aggregate representing the superclass. This is the opposite of the usual way of modelling class hierarchies, where subclasses indicate their superclass. This implies that the superclass is modified when a subclass is defined, but the system environment can automatically deduce this for the user. This way only single inheritance is supported.

The use of arrows in a visual environment usually leads to scalability problems concerning readability. Therefore VIPR programs may be enriched by text where it simplifies the representation. Nevertheless, the program's semantics is fully determined by its non-textual components. Therefore the textual elements may be seen as documentation or syntactically restricted comments (the textual representation of a visual element must be its syntactic C++ equivalent).

Short Descriptions of Relevant Articles

XML-GL: A Graphical Language for Querying and Restructuring XML Documents

Stefano Ceri, Sara Comai, Ernesto Damiani, Piero Fraternali, Stefano Paraboschi and Letizia Tanca.