GROG : Geographical Queries using Graphs

MAINGUENAUD Michel

Institut National des Télécommunications

9 rue Charles FOURIER

91011 EVRY

33 1 60 76 40 40

email : MAINGUENAUD@FRINT51.BITNET

1. Introduction :

In current research toward the design of more powerful Data Manipulation Language (DML), different research groups are simultaneously concentrating their work on logical-based query language. Because of the diversity of the Geographical Information System (GIS) applications [SMSE87] and users, there is a need to design a user-friendly query language.

Part of GIS applications is network management such as roads, telecommunications, railways, etc. These data can be easily represented using graphs. Manipulations of graphs are very often defined with a recursive (or logic-based) formalism. Starting with a graphical query language supporting recursion defined in [CMW87] we propose a DML for GIS applications relying on network management. Section 2 presents various queries adressed to such a GIS. Section 3 presents the syntax of semantics of [CMW87]. Section 4 presents the limitations of [CMW87]. Section 5 presents the extensions [M89]. Section 6 has the conclusions and discussion of further work.

2. Network-oriented queries :

This section presents various typical queries addressed for example to a (road) Network Information Management System :

Query 1 : What are the paths from Nice to Paris ? This query can be evaluated with a graph traversal operator.

Query 2 : What are the paths from Nice to Paris with olny large-city stages ? This query can be evaluated with a graph traversal operator and constraints on the nodes of the graph.

Query 3 : What are the paths from Nice to Paris with AIR FRANCE (AF) company ? This query can be evaluated with a graph traversal operator and constraints on the edges of the graph.

Query 4 : What is the shortest path from Nice to Paris using motorway ? This query can be evaluated with a graph traversal operator, aggregates and constraints on the edges of the graph.

These queries can be written with Horn clauses [CGT88]. But Horn clauses are not well adapted as a user-interface : 1) Horn clauses are not a very user-friendly language for the one who is not used to recursive concepts, 2) Horn clauses can't modelise some typical queries such as :

Query 5 : What are the common parts between a path from Nice to Geneva (which costs less than 10 units) and a path from Paris to Vienna ?

Query 6 : What are the paths without any cycle from Nice to Paris ?

Therefore Network Information Management Systems need a user-friendly Data Manipulation Language which allows recursive queries to be expressed.

3. Starting point of GROG [CMW87]:

Data are modelised by a directed graph. A directed graph G is represented by G(N, E). N is a set of nodes, E is a set of edges between two elements of N. Nodes and edges are labeled. All the graphs which are used here will be multi-graphs : two edges with different labels can occur between two given nodes (to simply multi-graphs will be called graphs).

Definitions :

A graph G is defined by G (NG, EG, [[psi]]G, [[nu]]G, [[epsilon]]G) :

. NG is a set of nodes NG = {n1, ... , np}

. EG is a set of edges EG = {e1, ... , eq}

. [[psi]]G is an incident function

[[psi]]G : EG -> NG x NG

. [[nu]]G is a node labeling function :

[[nu]]G : NG -> In with In = Dn0 x ... x Dnn

i.e. : Name x Population x Museum.

. [[epsilon]]G is an edge labeling function :

[[epsilon]]G : EG -> Ie with Ie = De0 x ... x Dee

i.e. : Company x Departure_time x Arrival_time

A set X {x1, ... } of variables and sets Dij {dij1, ... } (i = e, n) of constants for all the domains of the labeling functions are defined. A hyphen (-) will be interpreted in the labeling function as any value of domain Dij.

Properties : A graph G is such as :

. [[universal]] n [[propersubset]] NG [[existential]] e [[propersubset]] EG / [[psi]]G (e) = (n, a) [[logicalor]] [[psi]]G (e) = (a, n) / a [[propersubset]] NG

(There is no isolated nodes).

. [[universal]] ei [[propersubset]] EG, ej [[propersubset]] EG [[psi]] (ei) = [[psi]] (ej) => [[epsilon]] (ei) != [[epsilon]] (ej)

(two edges with the same extremities have different labels)

Queries :

A graphical query Q defined on a graph G is a set of labeled and oriented graphs {Q1, ... , Qp}. The labels of the nodes can be variables, or constants.

Graph Q (NQ, EQ, [[psi]]Q, [[nu]]Q, [[epsilon]]Q) is such as :

[[universal]] n [[propersubset]] NQ, [[nu]]Q (n) [[propersubset]] Dn0 [[union]] X x Dn1x...x Dnn (Dn0 : set of node-constants)

[[universal]] e [[propersubset]] EQ, [[epsilon]]Q (e) [[propersubset]] De0 [[union]] X x De1 x...x Dee (De0 : set of edge-constants)

Edge-labeling :

A simple label is a tuple of constants ([[propersubset]] Dij), variables or hyphens. (CMW87) generalises labels with a regular expression. These expressions are recursively defined as :

. A label l [[propersubset]] Ie, is such that l is a regular expression

. If S1 and S2 are regular expressions then S1 | S2 (S1 or S2) is a regular expression.

. If S1 and S2 are regular expressions then <S1, S2> (S1 then S2) is a regular expression

. If S is a regular expression S+ (Transitive Closure) is a regular expression.

Labeling examples : AF AF+ <AI, AF>+ <AI | AF>+ _+ ....

Query 3 can be expressed (with Ie = De0) by :

4. Limitations of [CMW87] :

Part of the limitations can be found in [CMW87]. To sum up the main limitations are : 1) type of recursive queries which can be expressed, 2) path management (aggregates, cycles, intersections, etc) 3) conditional expressions, 4) definition of the results of a query. These limitations are studied in detail in [M89].

This section olny presents the limitations of the intersection-typed queries. Query 5 cannot be expressed by the following graph (query) :

The reasons are :

1) It does not allow the whole path Nice-Geneva (resp. Paris-Vienna) to be part of the Paris-Vienna (resp. Nice-Geneva) path.

2) It does not allow to define any aggregate on the path Nice-Geneva.

3) Defining the result graph as the union of differents graphs leads to wrong results in case of aggregate in a multi-graph approach.

4) Defining the result graph to be an edge-independent subgraph homeomorphism between a query-graph Q and a data-graph G gives partial answers for variables x and y (out of the edge- matching between graph G and graph Q)

5. Extensions :

Limitations presented in section 4 lead us to define some extensions. The main extensions presented here are : 1) the manipulation of graph hierarchies, 2) the definition of the variables for path management 3) the definition of new edges 4) the definition of a "global query".

Let D be the set of convinient attributes on graphs, nodes, edges, let dt (sub-set of D) be the set of attributes defined on type t (graph, node, edge).

Let J be an alphabet to define names of variables.

5.1 Hierarchies of graphs :

To manipulate a graph G (N, E), [[Gamma]] a graph labeling function is defined :

[[Gamma]] : G (N, E) -> Ig with Ig = Dg0 x ... x Dg

Let G1 (NG1, EG1, [[psi]]G1, [[nu]]G1, [[epsilon]]G1) and G2 (NG2, EG2, [[psi]]G2, [[nu]]G2, [[epsilon]]G2) be two different graphs :

a) [[Pi]]L : N x N -> E0

[[Pi]]L (n1, n2) = {eLk / [[Psi]]L (eLk) = (n1, n2) n1 [[propersubset]] NG1 [[logicaland]] n2 [[propersubset]] NG2}

E0 = {eLk / [[Psi]]L(eLk) = (ni, nj) ni [[propersubset]] Ni, nj [[propersubset]] Nj [[logicaland]] Ni <> Nj}

b) [[Psi]]L : E0 -> N x N

[[Psi]]L (eLk) = (n1, n2) such as n1 [[propersubset]] NG1 [[logicaland]] n2 [[propersubset]] NG2

c) [[epsilon]]L : E0 -> IL such as IL = DL0 x ... x Ds

[[epsilon]]L (eLk) = (iL0, ... , iLs)

After extension of node ni (ni [[propersubset]] N1) with the graph G2 (N2, E2), the result graph G' is defined by G' (NG', EG', [[psi]]G', [[nu]]G', [[epsilon]]G') :

EL = { ek [[propersubset]] E0 / [[existential]] n1, n2 , n1 [[propersubset]] N1 [[logicaland]] n2 [[propersubset]] N2 [[logicaland]]

ek [[propersubset]] [[Pi]]L (n1, n2) [[union]] [[Pi]]L (n2, n1)}

NG' : N1 [[union]] N2

EG' : E1 [[union]] E2 [[union]] EL

[[Psi]]G' : [[Psi]]G' (ek) = [[Psi]]1 (ek) if ek [[propersubset]] E1

[[Psi]]G' (ek) = [[Psi]]2 (ek) if ek [[propersubset]] E2

[[Psi]]G' (ek) = [[Psi]]L (ek) if ek [[propersubset]] EL

[[nu]]G' : [[nu]]G' (nj) = [[nu]]1 (nj) if nj [[propersubset]] N1

[[nu]]G' (nj) = [[nu]]2 (nj) if nj [[propersubset]] N2

[[epsilon]]G' : [[epsilon]]G' (ek) = [[epsilon]]1 (ek) if ek [[propersubset]] E1

[[epsilon]]G' (ek) = [[epsilon]]2 (ek) if ek [[propersubset]] E2

[[epsilon]]G' (ek) = [[epsilon]]L (ek) if ek [[propersubset]] EL

5.2 Extension of variables :

Graph-typed variable (called gi):

[[gamma]] : J -> G (N, E, [[Psi]]G , [[nu]]G , [[epsilon]]G)

gi -> Gi (NGi , EGi , [[Psi]]Gi , [[nu]]Gi , [[epsilon]]Gi)

A graph can be represented by a set of [dg, {ni, nj, de}]. dg is a set of data on the graph (history, ...). {ni, nj, de} represents the set of edges.

Derived variables :

Node-typed variable (called vi) :

[[nu]]' : J -> Dg x N [[union]] X in fact Dg x N [[union]] X x ø x ø

vk -> {dg, nq} {dg, {(n, ^, ^)} }

where ^ represents the undefined symbol (representation of an "node-edge")

A node-edge is an edge with a unique known extremity (initial one). It does not exist any data about this edge. Function [[epsilon]] applied to this edge will respond the undefined symbol ^. A node-typed variable is a set of graphs. These graphs can be represented by : {dg, {(n, ^, ^)} }.

Path-typed variable (called Vi) :

[[epsilon]]' : J -> Dg x N [[union]] X x N [[union]] X x De

Vij -> (dg, {(n1, nq, de1), ... , (nm, n2, dem)} )

Vi = {Vij}

A user-hidden variable Vij represents a unique path. Data are defined on this variable such as length or aggregates on the edges of the path (a path from n1 to n2 is a set defined as : {(n1, nq, de1), ... , (nm, n2, dem)}). A path-typed variable is a set of variables Vij.

These variables allow to extend the definition of the language allowing :

1) to take into account not olny [[epsilon]] (ei) where ei is an edge of a path but the whole path ([[epsilon]]').

2) to define aggregate functions (such as Min, Length, Count, etc) defined on path-typed variables or/and recursively on these functions

(i.e. : Min (Length (V1))), and classical aggregate-typed variables to memorise the value of an aggregate.

5.3 New edges :

The homeomorphism defined in [CMW87] has to be changed for the language to be able to take into account aggregates. The constraint on the paths such as they have to be disjoint node-to-node has to desappear. An answer is now a set of graphs {Ri} and each element of this set is a result graph (a possible answer to the query). This graph has not to be connexe. The final answer is a logical OR of all the result graphs. But the union of all the result graphs is not a result graph (out of the aggregate functions).

Result graph :

A result graph R (graph Ri) is defined as a graph G (see section 3)

R (NR, ER, [[psi]]R, [[nu]]R, [[epsilon]]R)

but the constraint :

. [[universal]] n [[propersubset]] NR [[existential]] e [[propersubset]] ER / [[psi]]R (e) = (n, a) [[logicalor]] [[psi]]R (e) = (a, n) où a [[propersubset]] NR

(there is no isolated node).

desappears. An answer can be an intersection such as a node-edge (n, ^).

The constraint becomes :

. [[universal]] n [[propersubset]] NR [[existential]] e [[propersubset]] ER / [[psi]]R (e) = (n, a) [[logicalor]] [[psi]]R (e) = (b, n) with a [[propersubset]] NR [[union]] {^},

b [[propersubset]] NR

Edges :

Three types of edges are available :

General properties :

These edges follow the same rules : 1) these edges are oriented, 2) these edges represent the results of sub-queries, 3) these edges are a binary operator (one initial point, one end point), 4) these edges represent paths without cycle.

General definitions :

An edge is defined by [[Pi]]T such as : [[Pi]]T : [[alpha]] x [[beta]] -> [[gamma]]

Let [[alpha]], [[beta]], [[gamma]] be the type of the initial and end points of an edge.

Let T represents the type of an edge : T [[propersubset]] {D, T, Is_e, Ic_e, Ic_ne, Is_n, Ic_n}

Manipulation of edges :

Link : 1)- Direct Link [[Pi]]D

2)- Transitive Link [[Pi]]T

3)- Intersection of paths (edges) [[Pi]]Is_e

4)- Inclusion of paths (edges) [[Pi]]Ic_e

Manipulation of nodes and edges :

5)- Inclusion [[Pi]]Ic_ne

Manipulation of nodes :

6)- Intersection of nodes [[Pi]]Is_n

7)- Inclusion of nodes [[Pi]]Ic_n

Definition of the operators :

To define the operators of GROG, a formal system is thus presented : F (G, M, D0) such as :

Let G be a graph (with the previous definitions)

Let M be a set of manipulations M = {M1, ..., M7}

Function [[Gamma]] is defined by : [[Gamma]] : G -> Dg0 x ... x Dg

Function [[epsilon]]G is defined by : [[epsilon]]G : E -> De0 x ... x De

Function [[nu]]G is defined by : [[nu]]G : N -> Dn0 x ... x Dn

Definitions :

Let OID[[epsilon]], OIDn, OID[[Gamma]] be bijective applications such as :

OID[[epsilon]] : E -> De0 (therefore OID[[epsilon]]-1 exists)

OID[[nu]] : N -> Dn0 (therefore OID[[nu]]-1 exists)

with De0 = {e1, ... , ee} and Dn0 = {n1, ... , nn}

Let PRi be a set of graphs {g1, ... , gg} obtained by the decomposition of a graph Ri (path of a result graph Ri). Edges of graph gi belong to a sub-set of ERi (path or non-connexe edges). gi is a sub-graph of graph Ri. Each gi represents an instanciation of an edge of a query-graph Q.

OID[[Gamma]] : PRi -> Dg0 (therefore OID[[Gamma]]-1 exists)

with Dg0 = {g1, ... , gg}

From now let D0 be De0 [[union]] Dn0 [[union]] Dg0

Axiom 1 : Edge identifier

OID[[epsilon]] allows to associate to each edge of graph G (ei [[propersubset]] EG), a symbol (ei [[propersubset]] D0). OID[[epsilon]](ei) = ei

Axiom 2 : Node identifier

OID[[nu]] allows to associate to each node of graph G (nj [[propersubset]] NG), a symbol (nj [[propersubset]] D0). OID[[nu]](nj) = nj

Axiom 3 : Path identifier

To each path, p, of a sub-query (an edge of graph Q) a symbol (gk [[propersubset]] D0) and a word, M(p), are defined. M(p) is obtained by concaténation (+ or [[Sigma]]) of the symbols (ei [[propersubset]] D0) of the edges respecting the order within the path.

Example : Let p = [e1 e2 e3 e4] be a path

4

M (p) = e1 + e2 + e3 + e4 = [[Sigma]] ek = e1e2e3e4

k=1

Axiom 4 : Node-of-a-path identifier

To each node identified by (nj [[propersubset]] D0) belonging to a path identified by gk (gk [[propersubset]] D0) is defined a couple of symbols (gk, nj)

OID[[Gamma]][[nu]] : PRi x NRi -> Dg0 x Dn0

Axiom 5 : Definition of a word and sub-word

Let a and b be two words of the paths p1, p2 and ek be the symbols of the edges of the paths.

a = M (p1) = ei + ... + ej

b = M (p2) = e1 + ... + ep

a is a sub-word of b <=> b = e1 + ... + ei + ... + ej + ... + ep

(written a - b)

All the definitions here will refer to a unique result-graph Ri. PRi = {gi} represents the instanciations of each edge of the graph Q. Without lost of generality, in the following we olny consider a graph gi (the instanciation of an edge of the graph Q). Pi will refer to an edge, NPi will refer to a set of nodes for a path (the elements of NPi look like (gk, nj) and all gk are equal).

Definitions for edge-manipulations :

To each edge definition [[alpha]], [[beta]], and [[gamma]] will take their value in {[[nu]]', [[epsilon]]'}. They represent the starting-point, the end-point and the result of a query. An example of a query will be given for each definitions.

Direct Link edge : Edge between two nodes n1 and n2

[[Pi]]D and [[Pi]]T are obtained using [[alpha]] = [[beta]] = [[nu]]' and [[gamma]] = [[epsilon]]'

Direct :

The goal is to find the nodes j such as j is a successor of a node i. ({i and j / j [[propersubset]] [[Gamma]]+(i)})

[[Pi]]D : (Dg x (N [[union]] J)) x (Dg x (N [[union]] J)) -> Dg x N x N x De

Example :

Give all the one-edge paths (direct links) between Nice and Paris

Manipulation 1 :

After [[Pi]]D(ni, nj) :

[[Pi]]D (ni, nj) <- {p / M(p) = ek [[logicaland]] [[Psi]]G (OID[[epsilon]]-1 (ek)) = (ni, nj)}

(p is a one-edge-path)

Transitive Closure form :

The goal is to find all nodes j such as there exists at least one path between i and j ({i, j / j [[propersubset]] [[Gamma]]+(i) [[logicalor]] [[existential]] a path from i to j})

[[Pi]]T : (Dg x (N [[union]] J)) x (Dg x (N [[union]] J)) -> Dg x N x N x De

Example : Query 1 is defined by

Manipulation 2 :

After [[Pi]]T(ni, nj) :

[[Pi]]T (ni, nj) <- {g / M(g) = e1 + ... + ek [[logicaland]] [[Psi]]G (OID[[epsilon]]-1 (e1)) = (ni, *) [[logicaland]]

[[Psi]]G (OID[[epsilon]]-1 (ek)) = (*, nj)}

(g is a one-path-graph)

Intersection edge : The goal is to find the common parts between two paths

[[Pi]]Is_e is obtained using [[alpha]] = [[beta]] = [[gamma]] = [[epsilon]]'

[[Pi]]Is_e : (Dg x N x N x De) x (Dg x N x N x De) -> Dg x N x N x De

Even if the intersection is a symetric manipulation, the edge is still oriented to be as close as possible to the natural language (which requires an order in a sentence).

Example : Query 5 (without aggregates) is defined by

Manipulation 3: :

After [[Pi]]is_e (P1, P2)

1/ P1 is unchanged

2/ P2 is unchanged

3/ [[Pi]]is_e (P1, P2) <- P3

P3 = { p / M(p) - M(P1) [[logicaland]] M(p) - M(P2)}

In this example, P1 represents for a result graph, Ri, a path from Paris to Vienna, P2 a path from Nice to Geneva and P3 the set of sub-paths (p) which belong to the path P1 (M(p) - M(P1)) and to the path P2 (M(p) - M(P2)). P3 is a set of graph (sub-path). Each graph may not be connexe and represents the intersection of two given paths. The notion of set is defined for P3 out of the fact that the answer may not be connexe.

Inclusion Edge : The goal is to find a path which is contained in an other path.

[[Pi]]Ic_e is obtained using [[alpha]] = [[beta]] = [[gamma]] = [[epsilon]]'

[[Pi]]Ic_e : (Dg x N x N x De) x (Dg x N x N x De) -> Dg x N x N x De

Example : What are the paths from Paris to Nice and from Lyon to Marseille such as the path Lyon-Marseille is included in the path from Paris to Nice ?

Manipulation 4 :

After [[Pi]]ic_e (P1, P2) :

1/ P1 <- {p1 / p1 path (of P1), p2 path (of P2) M(p1) - M (p2)}

2/ P2 <- {p2 / p2 path (of P2), p1 path (of P1) M(p1) - M (p2)}

3/ [[Pi]]ic_e (P1, P2) <- P1

Let PN be the word of a path from Paris to Nice and LM be the word of the path from Lyon to Marseille :

(1) The edge defined with the nodes Lyon and Marseille symbolises a path p1 from Lyon to Marseille (M(p1) = LM). This path is included in PN (M(p2) = PN)(with the definition of the inclusion M(p1) - M(p2)).

(2) The edge defined with the nodes Nice and Paris (p2) symbolises a path p2 from Paris to Nice such as the word (path) containts a sub-word (path) LM (with the definition of the inclusion M(p1) - M(p2)) after application of point (1).

(3) The edge defined with point (2) and point (1) symbolises the same path as in point (1).

Cycles :

In the previous query the cycles are forbidden (by definition of the edges). But the query defined here which may seem similar authorizes cycles to appear :

Each edge is a path without cycle (by definition), but there could exist a node in common between a path Paris-Lyon and Marseille-Nice.

Definitions for edges and nodes manipulations :

Inclusion Edge : The goal is to find a node (or a set of node) which belongs to a path.

[[Pi]]Ic_ne is obtained using [[alpha]] = [[gamma]] = [[nu]]' and [[beta]] = [[epsilon]]'

[[Pi]]Ic_ne : (Dg x (N [[union]] J)) x (Dg x N x N x De) -> Dg x N

Example : What are the stage-cities of a path Paris-Nice ?

Manipulation 5:

After [[Pi]]ic_ne (N1, P1) :

1/ N1 <- {ni / OID[[nu]](ni) is the origin or the extremity of an edge ek such as ek - M(P1)

([[Psi]] (OID[[epsilon]]-1 (ek) ) = (ni, *) [[logicalor]] [[Psi]] (OID[[epsilon]]-1 (ek) ) = (*, ni) )}

2/ P1 <- {p1} if N1 != [[emptyset]]

[[emptyset]] otherwise

3/ [[Pi]]is_ne (N1, P1) <- N1

Definitions for node-manipulations :

Intersection edge : The goal is to find nodes which belong to two paths

[[Pi]]Is_n is obtained using [[alpha]] = [[beta]] = [[gamma]] = [[nu]]'

[[Pi]]Is_n : (Dg x (N [[union]] J)) x (Dg x (N [[union]] J)) -> Dg x N

Example : What are the common stage-cities of two paths Nice- Paris using tool-roads (a) and using major-roads (RN) ?

Manipulation 6:

After [[Pi]]is_n (NP1, NP2) :

1/ NP1 is unchanged

2/ NP2 is unchanged

3/ [[Pi]]is_n (NP1, NP2) <- {ni / ni [[propersubset]] NP1 [[logicaland]] ni [[propersubset]] NP2}

In this example, intersection edge represents the set of nodes (ni) which are stage-cities of two given paths Paris-Nice.

Inclusion Edge : The goal is to find the set of nodes included in an other set of nodes.

[[Pi]]Ic_n is obtained using [[alpha]] = [[beta]] = [[gamma]] = [[nu]]'

[[Pi]]Ic_n : (Dg x J) x (Dg x J) -> Dg x N

Example :

What is the set of stage-cities of a path Lyon-Marseille such as each of them is also a stage-citie of a path Paris-Nice ?

Manipulation 7:

After [[Pi]]ic_n (NP1, NP2) :

1/ NP1 <- {ni [[propersubset]] NP1 / ni [[propersubset]] NP2}

2/ NP2 is unchanged

3/ [[Pi]]ic_n (NP1, NP2) <- NP1

The difference between the intersection and inclusion notions is such that the inclusion leads to a reduction of the cardinality of set v1 because set v1 has to be included in set v2.

A node which is a stage-city in a path between Lyon and Marseille but not a stage-city between Paris and Nice will not be considered (even if it is a stage-city of a path Lyon-Marseille).

By the end query 5 can be expressed like :

5.4 Global queries :

The juxtaposition operator // does not generate a graph but a global query (by definition). This allows to defined queries with a logical OR such as i.e. What are the direct paths (without any stages) from Nice to Paris with AF company or if we have to stop, we must have a stop at Marseille whatever the company is. A global query Q is a set of sub-queries Q1, ... , Qp :

. Qi (Ni, Ei, [[Psi]]i, [[nu]]i, [[epsilon]]i) and Ni, Ei, [[Psi]]i, [[nu]]i, [[epsilon]]i are defined as previous.

Q = Q1 (N1, E1) // ... // Qp (NP, EP)

The juxtaposition operator // allows queries to be defined in parallel.

Properties :

The juxtaposition (//) and the fusion (&&) are associative and fusion (&&) is distributive on the juxtaposition (//). The results in case 2 and 3 are global queries.

1) G1 (N1, E1) && (G2 (N2, E2) && G3 (N3, E3) ) =

(G1 (N1, E1) && G2 (N2, E2)) && G3 (N3, E3)

2) G1 (N1, E1) // (G2 (N2, E2) // G3 (N3, E3)) =

(G1 (N1, E1) // G2 (N2, E2)) // G3 (N3, E3)

3) G1 (N1, E1) && (G2 (N2, E2) // G3 (N3, E3)) =

(G1 (N1, E1) && G2 (N2, E2)) // (G1 (N1, E1) && G3 (N3, E3))

The juxtaposition operator (//) can be seen as a logical "OR" between two queries :

Example :

is by definition different from the fusion (&&), which can be seen as a logical "AND". Therefore this graph is different from the previous graph :

An other expression for the fusion (&&) is :

Results :

A global query is such as :

Q = Q1 // ... // Qp = // Qi (i = 1, ... , p)

Q has no solution if :

[[universal]] i [[propersubset]] {1, ... , p}, Qi has no solution

Qi has no solution if :

. An edge of graph Qi has no instanciation or

. A variable has no instanciation (value equal to [[emptyset]]) unless if it is required. Asking a value to be equal to [[emptyset]] is a way to express the negation concept.

6. Conclusion :

Part of Geographical Information Systems (GIS) is network-oriented information management applications. This paper presents a Data Manipulation Language based on [CMW87]. This language allows the user to define easily graph-manipulation-oriented queries. The implementation of this language is realistic [P87, M89]. Graphical screens and Extended Data Base Mangement System are very convinient tools. Transitive closure algorithms in a Data Base context have been widely studied [BR86].

The partial implementation (recursive queries without variable) was done two years ago. It was based on a weak couplage between a non-first normal form RDBMS [S88] and Prolog. Transitive Closure operators with aggregates were taken into account by Prolog and paths were stored in the RDBMS. A macintosh-like interface allowed the user to define a simple recursive query (are very complex queries realistic?) and to get information from an edge of the result graph.

Future work will be oriented toward the definition of a thematic Data Manipulation Language and a study of the links between the network component approach and the thematic component approach to define a complete Geographical Data Manipulation Language.

Acknowledgments :

The author would like to thank R. Jeansoulin for many helpful comments on this work.

REFERENCES

BR 86 F. Bancilhon R. Ramakrishnan

An amateur's Introduction to Recursive Query Processing Strategies

Proceedings of the ACM SIGMOD conference

Washington May 1986

CGT88 S. Ceri G. Gottlob L. Tanca

Logic Programming and Databases

Springer Verlag 1988

CMW 87 IF Cruz AO Mendelzon PT Wood

A graphical Query Language supporting recursion

Proceedings of the SIGMOD conference

San Fransisco May 1987

M89 M. Mainguenaud

Un langage visuel de manipulation de graphes :

Application aux Bases de Données Géographiques

Doctorate Thesis - Université Paris VI 6/28/89

P87 P. Pfeffer

LGV : un langage graphique de manipulation de réseaux

DESS Report

Université de Paris-Sud Orsay July 1987

S88 M. Scholl and all

VERSO : a DBMS based on Nested Relations

Nested Relations and Complex Objects Workshop - Darmstadt

Lecture Notes in Computer Science

Springer Verlag December 1988

SMSE 87 TR Smith S. Menon JL Star JE Ester

Requirements and principles for implementation and construction of large scale GIS

International Journal of GIS Vol 1 ndeg. 1 1987

GROG : Geographical Queries using Graphs

MAINGUENAUD Michel

Institut National des Télécommunications

9 rue Charles FOURIER

91011 EVRY

33 1 60 76 40 40

email : MAINGUENAUD@FRINT51.BITNET

Abstract :

This paper describes a graphical Data Manipulation Language (DML), GROG. The central notion of GROG is that it allows recursive queries to be expressed very easily. GROG is a DML for a Geographical Information System (GIS).

This language is said to be graph-oriented by the way data are modelised, queries and the results of a query are expressed .

Keywords : Geographical Information System, Data Manipulation Language, Recursive Queries.