Advanced C++ User Guide

Chapter 4

Data Integrity

The first section in this chapter introduces three data integrity facilities: inverse data members, illegal pointer detection, and protected references. The rest of the chapter covers inverse data members in detail.

Data Integrity Facilities

Many design applications create and manipulate large amounts of complex persistent data. Frequently, this data is jointly accessed by a set of cooperative applications, each of which carries the data through some well-defined transformation. Because the data is shared, and because it is more permanent and more valuable than any particular run of an application, maintaining the data's integrity becomes a major concern and requires special database support.

ObjectStore provides facilities to help deal with three of the most common integrity maintenance problems:

Inverse Members

One integrity control problem concerns pairs of data members that are used to model binary relationships. ObjectStore allows you to declare two data members as inverses of one another, so they stay synchronized with each other according to the semantics of binary relationships. This works for pairs of data members that represent one-to-one, one-to-many, and many-to-many relationships. See Inverse Data Members.

Illegal Pointers

Another integrity control problem concerns illegal pointers. ObjectStore can detect two kinds of illegal pointers:

ObjectStore provides facilities that automatically detect such pointers upon transaction commit. You can control the way ObjectStore responds when illegal pointers are encountered; ObjectStore can either raise an exception or change the illegal pointers to 0 (null). See objectstore::set_null_illegal_pointers() in the C++ API Reference.

References to Deleted Objects

Use an instance of os_Reference_protected<T> instead of a pointer when you want ObjectStore to detect references to deleted objects. os_Reference_protected<T> has an interface like os_soft_pointer<T>. See Using ObjectStore Soft Pointers.

After the object referred to by an os_Reference_protected is deleted, resolution of the os_Reference_protected causes an err_reference_not_found exception to be signaled. If the referent database has been deleted, err_database_not_found is signaled.

See os_Reference_protected in the C++ API Reference.
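
The behavior described above can be pictured with a small handle class that checks whether its referent still exists each time it is resolved. This is only a sketch in standard C++, not the ObjectStore implementation: object_table, protected_ref, and reference_not_found are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the err_reference_not_found exception.
struct reference_not_found : std::runtime_error {
    reference_not_found() : std::runtime_error("referent was deleted") {}
};

// Hypothetical table that owns objects and hands out stable ids.
template <class T>
class object_table {
public:
    int add(std::unique_ptr<T> object) {
        int id = next_id_++;
        objects_[id] = std::move(object);
        return id;
    }
    void destroy(int id) { objects_.erase(id); }
    T* find(int id) const {
        auto it = objects_.find(id);
        return it == objects_.end() ? nullptr : it->second.get();
    }
private:
    std::map<int, std::unique_ptr<T>> objects_;
    int next_id_ = 0;
};

// A reference that verifies its referent on every resolution.
template <class T>
class protected_ref {
public:
    protected_ref(const object_table<T>& table, int id)
        : table_(&table), id_(id) {}
    T& resolve() const {
        T* target = table_->find(id_);
        if (!target) throw reference_not_found();  // never returns a dangling pointer
        return *target;
    }
private:
    const object_table<T>* table_;
    int id_;
};

// Returns true if resolving after deletion raised the exception.
bool resolve_after_delete_throws() {
    object_table<int> table;
    int id = table.add(std::unique_ptr<int>(new int(42)));
    protected_ref<int> ref(table, id);
    assert(ref.resolve() == 42);   // referent is alive
    table.destroy(id);             // referent is deleted
    try {
        ref.resolve();
    } catch (const reference_not_found&) {
        return true;
    }
    return false;
}
```

The point of the design is that the check happens at resolution time, so a stale reference fails loudly instead of silently reading freed storage.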

Inverse Data Members

ObjectStore allows you to model binary relationships with pointer-valued (or collection-of-pointer-valued) data members that maintain the referential integrity of their inverse data members. You implement this inverse maintenance by defining an embedded relationship class, which encapsulates the pointer (or collection-of-pointers) so that it can intercept updates to the encapsulated value and perform the necessary inverse maintenance tasks.

The ObjectStore class library contains the necessary relationship and collection classes, as well as a set of macros to simplify the use of these classes. In general, when you use a class that has inverse members, you can access these members as if they were simple data members. The code that manipulates the instances need not be aware of the inverse maintenance that is occurring, because this is entirely hidden by the relationship class implementation.
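
To make the idea of an embedded relationship class concrete, the following sketch wraps a pointer in a small class whose setvalue() also updates the inverse member. It is written in standard C++ under simplifying assumptions (one-to-one only, no persistence) and does not reproduce the ObjectStore classes; rel_link and its members are hypothetical names.

```cpp
#include <cassert>

struct node;

// Hypothetical embedded relationship object: wraps a node* and, on every
// update, also updates the inverse link held by the other node.
class rel_link {
public:
    rel_link(node* owner, rel_link node::* inverse)
        : owner_(owner), inverse_(inverse), value_(nullptr) {}

    node* getvalue() const { return value_; }
    void setvalue(node* target);

private:
    node* owner_;                 // object that contains this link
    rel_link node::* inverse_;    // which member of the other node is the inverse
    node* value_;                 // the encapsulated pointer
};

struct node {
    rel_link next;
    rel_link previous;
    node() : next(this, &node::previous), previous(this, &node::next) {}
    node(const node&) = delete;
    node& operator=(const node&) = delete;
};

void rel_link::setvalue(node* target) {
    if (value_ == target) return;
    // Release the old partner so it no longer points back at the owner.
    if (value_) (value_->*inverse_).value_ = nullptr;
    if (target) {
        rel_link& back = target->*inverse_;
        // Release whatever the target was previously paired with.
        if (back.value_) (back.value_->*back.inverse_).value_ = nullptr;
        back.value_ = owner_;
    }
    value_ = target;
}

// Returns true if both sides stay synchronized across two updates.
bool inverse_is_maintained() {
    node a, b, c;
    a.next.setvalue(&b);                       // also sets b.previous to &a
    bool linked = b.previous.getvalue() == &a;
    c.next.setvalue(&b);                       // b gets a new predecessor
    return linked
        && b.previous.getvalue() == &c
        && a.next.getvalue() == nullptr;       // a.next was cleared
}
```

Code that calls setvalue() never touches the inverse member directly; the wrapper performs that work.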

To use ObjectStore's relationship facility, you must include the file <ostore/relat.hh> along with <ostore/ostore.hh> and <ostore/coll.hh>. The #include directive for <ostore/relat.hh> must come after the other two, in the following order:

#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/relat.hh>

Inverse Member End-User Interface

As a relationship definer (that is, the definer of the class that contains relationships), you have a number of options for presenting the relationship to that class's users. Suppose, for example, that the class part has a single-valued relationship container that points to the part that contains this one. Then the end user of the part class can be presented with any of the following interfaces for getting and setting this relationship:

Getting relationships
otherpart = somepart->container; /* simple data member */
otherpart = somepart->container.getvalue(); /* relationship */
otherpart = somepart->get_container(); /* functional interface */

Setting relationships
somepart->container = otherpart; /* simple data member */
somepart->container.setvalue(otherpart); /* relationship */
somepart->set_container(otherpart); /* functional interface */

Simple data member interface

The first style of interface is called simple data member because the end user interacts with the relationship exactly as if it were a simple data member of type part*. The end user need not be aware that special inverse-update processing is occurring.

Relationship interface

The second style of interface is called relationship because it treats the container data member as an object in its own right (that is, a relationship object). In other words, if somepart refers to a part, then somepart->container refers to a relationship instance and somepart->container.getvalue() returns the value of the relationship.

Functional interface

The third style of interface is called functional because it encapsulates all access to the relationship inside functions defined on the class part.

Note that it is completely up to the class definer to decide which of these interfaces to export to the class's end users. The underlying ObjectStore library interface to relationships supports all of them and, in fact, a class definer could choose to export more than one. It might do so, for example, so that the end user could do either of the following:

p->set_container(q) 

or

p->container.setvalue(q) 

Similarly, for the many-valued relationship contents, which lists a part's subparts, any of the following interfaces could be presented to the end user:

Getting contents
os_collection* subparts;
subparts = somepart->contents; /* simple data member */
subparts = somepart->contents.getvalue(); /* relationship */
subparts = somepart->get_contents(); /* functional interface */

Setting contents
somepart->contents.insert(otherpart); /* simple data member */
somepart->contents.getvalue().insert(otherpart); /* relationship */
somepart->insert_contents(otherpart); /* functional interface */

Again, deciding which of these interfaces to export to the end user is under the control of the class definer. The ObjectStore library interface to relationships supports all three.
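
The three interface styles for a single-valued member can coexist on one class, as the following sketch suggests. It is standard C++ with hypothetical names (container_rel is not an ObjectStore class), and inverse maintenance is deliberately left out so that only the interfaces are visible.

```cpp
#include <cassert>

class part;

// Hypothetical relationship object; it stores the pointer only.
class container_rel {
public:
    container_rel() : value_(nullptr) {}

    // Relationship style.
    part* getvalue() const { return value_; }
    void setvalue(part* p) { value_ = p; }

    // Simple data member style: assignment sets, conversion gets.
    container_rel& operator=(part* p) { setvalue(p); return *this; }
    operator part*() const { return value_; }

private:
    part* value_;
};

class part {
public:
    container_rel container;

    // Functional style: access wrapped in member functions of part.
    part* get_container() const { return container.getvalue(); }
    void set_container(part* p) { container.setvalue(p); }
};

// Exercises each style once and checks that they see the same value.
bool all_three_styles_agree() {
    part wheel, axle, car;

    wheel.container = &axle;                  // simple data member
    part* a = wheel.container;

    wheel.container.setvalue(&car);           // relationship
    part* b = wheel.container.getvalue();

    wheel.set_container(&axle);               // functional interface
    part* c = wheel.get_container();

    return a == &axle && b == &car && c == &axle;
}
```

A class definer who wants to export only the functional style would declare container private, as a later example in this chapter does.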

About the m side of relationships

The size of an os_relationship m data member is 8 bytes: 4 bytes for the pointer to the os_collection and 4 bytes for the vtbl pointer.

The collection for an m side of an os_relationship data member is created upon the first insertion into the collection.

You control the size and placement of the collection by calling os_relationship::create_coll() in the constructor of the class that contains the os_relationship m data member.

Presizing the collection yields the best performance in terms of eliminating mutations as the collection grows and in terms of clustering.
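
The benefit of presizing can be illustrated by analogy with std::vector, which reallocates as it grows unless capacity is reserved in advance. This is only an analogy in standard C++: it does not use os_relationship::create_coll(), and vector reallocation merely stands in for the representation changes that an ObjectStore collection goes through. The function name count_regrowths is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the container changes capacity while n elements
// are appended, with or without presizing.
std::size_t count_regrowths(std::size_t n, bool presize) {
    std::vector<int> items;
    if (presize) items.reserve(n);            // analogous to presizing the collection

    std::size_t regrowths = 0;
    std::size_t capacity = items.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        items.push_back(static_cast<int>(i));
        if (items.capacity() != capacity) {   // storage was replaced
            ++regrowths;
            capacity = items.capacity();
        }
    }
    return regrowths;
}
```

With presizing, count_regrowths(1000, true) reports no regrowth at all, whereas the unreserved case regrows at least once.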

Defining Relationships

To define a class that has relationships, you define a data member by using the appropriate relationship macro. This relationship macro defines the appropriate access functions for getting and setting the relationship. You then instantiate the bodies of these functions by using another macro. Because most of the access functions have inline implementations, they incur negligible run-time overhead.

The relationship macros wrap a class around the data member; this adds no additional storage to the data member. The wrapper simply implements the functions to perform the inverse operations. The m side of a relationship is an embedded collection that is 8 bytes. It mutates to an out-of-line representation automatically upon the insertion of the first element.

Relationship Macros

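There are four relationship member macros to choose from:

os_relationship_1_1()
os_relationship_1_m()
os_relationship_m_1()
os_relationship_m_m()

The corresponding function body macros are

os_rel_1_1_body()
os_rel_1_m_body()
os_rel_m_1_body()
os_rel_m_m_body()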

Descriptions of all of these macros can be found in the C++ Collections Guide and Reference.

Note that these macros always come in fours. Each use of a member macro to define one side of a relationship must be paired with another member macro to define the other side of the relationship, and each member macro must have a corresponding body macro to provide the implementations for the relationship's accessor functions. This means that a one-to-many relationship member must also have a one-to-many relationship body and a many-to-one inverse member, which itself must have a many-to-one relationship body.

Macro Arguments

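The member macros always have five arguments, in this order:

1. The name of the class that defines the member being declared
2. The name of the member being declared
3. The name of the class that defines the inverse member
4. The name of the inverse member
5. The type of the member's value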

By scanning just the last argument and the member name, you can quickly grasp the externally visible interface to the data member. For example:

os_relationship_1_m (person,employer,company,employees, 
    company*) employer;

defines a company* employer data member, which is part of a relationship.

The function body macros have just four arguments. For each function body macro, the arguments are the same as those of the corresponding member macro, but without the last argument, as shown in the examples that follow.

Compiler caution

The first four macro arguments are used (among other things) to concatenate unique names for the embedded relationship class and its accessor functions. The details of macro preprocessing differ from compiler to compiler, and in some cases it is necessary to enter these macro arguments without white space to ensure that the argument concatenation works correctly. There should be no white space in the argument list between the opening parenthesis and the comma separating the fourth and fifth arguments. All the examples that follow adhere to this important convention and should, therefore, work with any C++ compiler.
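
The name concatenation mentioned above relies on the preprocessor's ## operator. The sketch below shows the general technique with two hypothetical macros, DEMO_REL_NAME and DEMO_RELATIONSHIP; they are not the ObjectStore definitions. A standard-conforming preprocessor ignores white space around macro arguments, so the caution applies to compilers that do not; the sketch follows the guide's no-white-space convention anyway.

```cpp
#include <cassert>

// Hypothetical macro: pastes the class and member names together to form
// a unique name for a generated helper type.
#define DEMO_REL_NAME(cls, member) cls##_##member##_rel

// Hypothetical member macro: declares the helper type in place, so that
// the text following the macro call names the data member itself.
#define DEMO_RELATIONSHIP(cls, member, type)            \
    struct DEMO_REL_NAME(cls, member) {                 \
        type value;                                     \
        type getvalue() const { return value; }         \
    }

struct company;

struct person {
    /* expands to: struct person_employer_rel { ... } employer; */
    DEMO_RELATIONSHIP(person,employer,company*) employer;
};

// Uses the pasted type name directly to show that it really exists.
bool generated_name_is_usable() {
    person p;
    p.employer.value = nullptr;
    person::person_employer_rel copy = p.employer;
    return copy.getvalue() == nullptr;
}
```

Because the generated name is built from the first arguments, two relationships with different class or member names can never collide.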

Relationship Examples

Example: Single-Valued Relationships

The following example defines a class node with single-valued inverse relationship members next and previous (as in a node of a doubly linked list). It uses the os_relationship_1_1 and os_rel_1_1_body macros. Both the simple data member and relationship styles of interface are supported automatically.

See the C++ Collections Guide and Reference for descriptions of the os_relationship_1_1() and os_rel_1_1_body() macros.

/* C++ Note Program - Header File */

#include <fstream.h>
#include <string.h>
#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/relat.hh>

class author;

/*  A simple class that records a note entered by the user. */
class note {

  public:

    /* Public Member functions */
    note(const char*, int);
    ~note();
    void display(ostream& = cout);
    static os_typespec* get_os_typespec();

    /* Public Data members */
    os_backptr bkptr;
    char* user_text;
    os_indexable_member(note,priority,int) priority;
    os_relationship_1_m(note,the_author,author,notes,
      author*) the_author;
};

#include <stdio.h>
#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/relat.hh>

class node {

  public:

    os_relationship_1_1(node,next,node,previous,node*) next;
    os_relationship_1_1(node,previous,node,next,
      node*) previous;
    node() {}

};

os_rel_1_1_body(node,next,node,previous);
os_rel_1_1_body(node,previous,node,next);

int main() {
  OS_ESTABLISH_FAULT_HANDLER

  /* show the end user's use of these relationships */

  objectstore::initialize();
  os_collection::initialize();
  node*  n1 = new node();
  node*  n2 = new node();

  n1->next = n2;

  /* this also automatically updates n2->previous */
  printf("n1 (%p) --> (%p)\n",
    (void*)n1, (void*)n1->next.getvalue());

  printf("n2 (%p) --> (%p)\n",
    (void*)n2, (void*)n2->previous.getvalue());

  OS_END_FAULT_HANDLER
}

Compiler caution

The simple data member style of access lets you treat a single-valued relationship as an ordinary pointer-valued data member in most situations. This capability depends on operator=() (to set the value) and on the coercion operators (to get the value). Thus, the following simple assignment,

n1->next = n2->next;

actually is interpreted by the C++ compiler as

n1->next.operator=( n2->next.operator node* () );

Example: incorrect use of the coercion operator

The coercion operator operator node* () is used to get the value of the relationship in the right-hand-side expression, and the assignment operator operator=() is used to set the value of the relationship in the left-hand-side expression. Be aware that the compiler only applies the coercion operator if it knows that the desired type of the expression is a node* pointer. The following does not work correctly:

printf("The value of the relationship is %x \n", n1->next );

This does not work because printf() is a variadic function: its prototype carries no type information for the arguments that follow the format string, so the compiler does not know to apply a coercion. In this case, either of the following is a suitable alternative:

Example: avoiding coercion errors
printf("The value of the relationship is %p \n",
    (void*)n1->next.getvalue() );
printf("The value of the relationship is %p \n",
    (void*)(node*)n1->next );
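
The difference between a typed context and a variadic one can be reproduced with any class that has a conversion operator. The following standard C++ sketch uses a hypothetical wrapper, node_rel, in place of the relationship class; it formats the pointer with snprintf so that the two explicit conversions can be compared.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

struct node {};

// Hypothetical relationship wrapper with a conversion operator.
class node_rel {
public:
    explicit node_rel(node* v) : value_(v) {}
    node* getvalue() const { return value_; }
    operator node*() const { return value_; }
private:
    node* value_;
};

// A typed parameter tells the compiler which conversion to apply.
const void* as_address(node* p) { return p; }

// Returns true if both explicit conversions yield the same pointer text
// and the implicit conversion works where a parameter type is known.
bool conversions_agree() {
    node n;
    node_rel rel(&n);

    // snprintf is variadic, so the value must be converted explicitly.
    char via_getvalue[32];
    char via_cast[32];
    std::snprintf(via_getvalue, sizeof via_getvalue, "%p",
                  static_cast<void*>(rel.getvalue()));
    std::snprintf(via_cast, sizeof via_cast, "%p",
                  static_cast<void*>(static_cast<node*>(rel)));

    // as_address() has a node* parameter, so rel converts implicitly.
    return std::strcmp(via_getvalue, via_cast) == 0
        && as_address(rel) == &n;
}
```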
Example: private declarations of relationships

The next example defines a class node as the previous one did, but presents a functional-style interface to the end user. The relationships themselves are declared private, so the user cannot access them directly through the simple data member or relationship styles of interface. Instead, the class definer writes simple inline member functions that provide the functional-style interface. In this example the two relationship members are defined by the same class, node, but this does not have to be the case. Even if they were defined by different classes (node and arc, for example), they could still be made private because the relationship macros declare the relationship implementation classes as friends.

#include <stdio.h>
#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/relat.hh>

class node {
  private:
    os_relationship_1_1(node,next,node,previous,
      node*) next;
    os_relationship_1_1(node,previous,node,next,
      node*) previous;
  public:
    node* get_next() { return next.getvalue(); }
    void set_next(node* val) { next.setvalue(val); }

    node* get_previous() { return previous.getvalue(); }
    void set_previous(node* val) { previous.setvalue(val); }
    node() {}
};

os_rel_1_1_body(node,next,node,previous);
os_rel_1_1_body(node,previous,node,next);

int main() {
  OS_ESTABLISH_FAULT_HANDLER

  /* show the end user's use of these relationships */

  objectstore::initialize();
  os_collection::initialize();
  node*  n1 = new node();
  node*  n2 = new node();

  n1->set_next(n2);
  /* this also automatically updates n2->previous */

  printf("n1 (%p) --> (%p)\n", (void*)n1, (void*)n1->get_next());
  printf("n2 (%p) --> (%p)\n", (void*)n2, (void*)n2->get_previous());

  OS_END_FAULT_HANDLER
}

Example: Many-Valued Relationships

Do not use the os_rel_m_m_body and os_rel_m_1_body macros in header files that are included in more than one source file of a given application. These macros define the bodies of virtual functions, so using them in a header file that is included in more than one place can result in redundant definitions of the virtual table that the compiler generates to implement virtual function calls.

See the C++ Collections Guide and Reference for descriptions of the os_rel_m_m_body(), os_rel_m_1_body(), and os_relationship_m_m() macros.

Following is an example in which a class node is defined with a pair of many-to-many relationships, ancestors and descendents (as in a node in a graph structure):

#include <stdio.h>
#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/relat.hh>

class node {
  public:
    os_relationship_m_m(node,ancestors,node,descendents,
      os_collection) ancestors;
    os_relationship_m_m(node,descendents,node,ancestors,
      os_collection) descendents;
    node() {}
};

os_rel_m_m_body(node,ancestors,node,descendents);
os_rel_m_m_body(node,descendents,node,ancestors);

int main() {
  OS_ESTABLISH_FAULT_HANDLER

  /* show the end user's use of these relationships */
  objectstore::initialize();
  os_collection::initialize();
  node*  n1 = new node();
  node*  n2 = new node();

  n1->ancestors.insert(n2);
  /* this also updates n2->descendents */

  node* n;

  printf("n1 (%p)\n", (void*)n1);
  printf("     has %d descendents: ", (int)n1->descendents->size());
  {
    os_cursor c(n1->descendents);
    for (n = (node*) c.first(); n; n = (node*) c.next())
      printf("(%p) ", (void*)n);
    printf("\n");
  }

  printf("     and %d ancestors: ", (int)n1->ancestors->size());
  {
    os_cursor c(n1->ancestors);
    for (n = (node*) c.first(); n; n = (node*) c.next())
      printf("(%p) ", (void*)n);
    printf("\n");
  }

  printf("n2 (%p)\n", (void*)n2);
  printf("     has %d descendents: ", (int)n2->descendents->size());
  {
    os_cursor c(n2->descendents);
    for (n = (node*) c.first(); n; n = (node*) c.next())
      printf("(%p) ", (void*)n);
    printf("\n");
  }

  printf("     and %d ancestors: ", (int)n2->ancestors->size());
  {
    os_cursor c(n2->ancestors);
    for (n = (node*) c.first(); n; n = (node*) c.next())
      printf("(%p) ", (void*)n);
    printf("\n");
  }

  OS_END_FAULT_HANDLER
}

Example: One-to-Many and Many-to-One Relationships

Following is an example in which a class node is defined that has a one-to-many relationship, children, and a many-to-one inverse, parent (as in a node in a tree structure).

See C++ Collections Guide and Reference for descriptions of the os_relationship_1_m(), os_relationship_m_1(), os_rel_1_m_body(), and os_rel_m_1_body() macros.

#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/relat.hh>

class node {

  public:
    os_relationship_1_m(node,parent,node,children,
      node*) parent;
    os_relationship_m_1(node,children,node,parent, 
      os_collection) children;
    node() {};
};

os_rel_1_m_body(node,parent,node,children);
os_rel_m_1_body(node,children,node,parent); 

int main() {
  OS_ESTABLISH_FAULT_HANDLER

  /* show the end user's use of these relationships */

  objectstore::initialize();
  os_collection::initialize();

  node*  n1 = new node();
  node*  n2 = new node();

  n1->children.insert(n2);
  /* this also updates n2->parent */
  /* NOTE: "n2->parent = n1;" would have had */
  /* identical effect */

  /* etc */

  OS_END_FAULT_HANDLER
}

Following is an example that illustrates a one-to-many relationship involving two different classes:

#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/relat.hh>

class company;

class person {
  public:
    os_relationship_1_m(person,employer,company,employees,
      company*) employer;
    char* name;
};

class company {
  public:
    os_relationship_m_1(company,employees,person,employer,
      os_collection) employees;
    int gross_revenue;
};

os_rel_1_m_body(person,employer,company,employees);
os_rel_m_1_body(company,employees,person,employer);
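
The effect of the person and company relationship can be approximated in standard C++ to show what the macros automate. This is a sketch with hypothetical helper functions (set_employer and add_employee), not ObjectStore code, and it uses std::set where ObjectStore would use an os_collection.

```cpp
#include <cassert>
#include <set>

struct company;

struct person {
    company* employer = nullptr;      // the "1" side value
};

struct company {
    std::set<person*> employees;      // the "m" side value
};

// Plays the role of "p->employer = c": updates both sides.
void set_employer(person& p, company* c) {
    if (p.employer == c) return;
    if (p.employer) p.employer->employees.erase(&p);  // leave the old employer
    p.employer = c;
    if (c) c->employees.insert(&p);                   // join the new employer
}

// Plays the role of "c->employees.insert(p)": same effect from the other side.
void add_employee(company& c, person& p) { set_employer(p, &c); }

// Returns true if an update made from either side keeps both sides in step.
bool both_sides_stay_in_step() {
    person ann;
    company acme, globex;

    add_employee(acme, ann);                   // update made from the many side
    bool hired = ann.employer == &acme;

    set_employer(ann, &globex);                // update made from the one side
    return hired
        && acme.employees.empty()
        && globex.employees.count(&ann) == 1;
}
```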

Duplicates and Many-Valued Inverse Relationships

For most kinds of ObjectStore relationships, an update to one side of the relationship always triggers a corresponding update to the other side. This is true for the following kinds of relationships:

For other relationships, an update to one side does not always trigger an update to the other side.

The following example shows the way ObjectStore handles one-to-many relationships in which the collection at the many end of the relationship allows duplicates. It also shows the way ObjectStore handles many-to-many relationships in which one of the collections involved allows duplicates and the other does not.

Suppose a complex part keeps track of the primitive parts it uses, as well as the number of times each primitive part is used. (For example, a wheel might be a primitive part and be used four times in a complex part like a car.) Suppose also that each primitive part is used in only one complex part. This can be modeled with the following classes:

Class definitions
class primitive_part;

class complex_part {
  public:
    os_relationship_m_1(complex_part,components,primitive_part,used_by,
      os_Bag<primitive_part*>) components;
};

class primitive_part {
  public:
    os_relationship_1_m(primitive_part,used_by,complex_part,components,
      complex_part*) used_by;
};

Suppose that a certain primitive_part, a_wheel, is used by a particular complex_part, the_car. If you do

a_wheel->used_by = 0;

ObjectStore removes all occurrences of a_wheel from the_car's components because setting used_by to 0 implies that the wheel is not used by the car at all.

Suppose you do

the_car->components.remove(a_wheel);

If the car's components contained four occurrences of a_wheel, they now contain three. a_wheel->used_by still points to the car because the car still uses the wheel at least once.
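
The one-to-many behavior just described can be mimicked with a std::multiset standing in for the bag. This is a standard C++ sketch with hypothetical helper functions, not ObjectStore code. It also assumes, by analogy with the many-to-many case discussed later in this section, that used_by is cleared when the last occurrence is removed.

```cpp
#include <cassert>
#include <set>

struct complex_part;

struct primitive_part {
    complex_part* used_by = nullptr;
};

struct complex_part {
    std::multiset<primitive_part*> components;   // a bag: duplicates allowed
};

// Adds one occurrence and points the part back at its user.
void add_component(complex_part& whole, primitive_part& piece) {
    whole.components.insert(&piece);
    piece.used_by = &whole;
}

// Removes one occurrence; the back pointer is cleared only with the last
// occurrence (an assumption, see the text above).
void remove_one(complex_part& whole, primitive_part& piece) {
    auto it = whole.components.find(&piece);
    if (it == whole.components.end()) return;
    whole.components.erase(it);
    if (whole.components.count(&piece) == 0) piece.used_by = nullptr;
}

// Plays the role of "piece->used_by = 0": drops every occurrence.
void clear_used_by(primitive_part& piece) {
    if (piece.used_by) piece.used_by->components.erase(&piece);
    piece.used_by = nullptr;
}

// Replays the wheel and car scenario from the text.
bool duplicates_behave_as_described() {
    complex_part car;
    primitive_part wheel;
    for (int i = 0; i < 4; ++i) add_component(car, wheel);

    remove_one(car, wheel);                      // four occurrences become three
    bool still_used = car.components.count(&wheel) == 3
                   && wheel.used_by == &car;

    clear_used_by(wheel);                        // all occurrences removed
    return still_used && car.components.empty() && wheel.used_by == nullptr;
}
```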

Now suppose each primitive part can be used by multiple complex parts.

class primitive_part;

class complex_part {
  public:
    os_relationship_m_m(complex_part,components,primitive_part,used_by,
      os_Bag<primitive_part*>) components;
};

class primitive_part {
  public:
    os_relationship_m_m(primitive_part,used_by,complex_part,components,
      os_Set<complex_part*>) used_by;
};

And suppose you do

a_wheel->used_by.remove(the_car);

This causes all occurrences of a_wheel to be removed from the_car's components because it implies that the wheel is not used by the car at all.

If you do

the_car->components.remove(a_wheel);

ObjectStore removes the_car from the wheel's used_by set only if it removes the last occurrence of the wheel from the car's components, that is, only if the car no longer uses the wheel at all.
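
The many-to-many rules can be sketched the same way, with a std::multiset for the side that allows duplicates and a std::set for the side that does not. This is standard C++ with hypothetical helper functions, intended only to restate the rules above in executable form.

```cpp
#include <cassert>
#include <set>

struct complex_part;

struct primitive_part {
    std::set<complex_part*> used_by;             // no duplicates
};

struct complex_part {
    std::multiset<primitive_part*> components;   // duplicates allowed
};

void add_component(complex_part& whole, primitive_part& piece) {
    whole.components.insert(&piece);
    piece.used_by.insert(&whole);                // no effect if already present
}

// Like the_car->components.remove(a_wheel): the inverse changes only when
// the last occurrence goes away.
void remove_component(complex_part& whole, primitive_part& piece) {
    auto it = whole.components.find(&piece);
    if (it == whole.components.end()) return;
    whole.components.erase(it);
    if (whole.components.count(&piece) == 0) piece.used_by.erase(&whole);
}

// Like a_wheel->used_by.remove(the_car): every occurrence is removed.
void remove_user(primitive_part& piece, complex_part& whole) {
    if (piece.used_by.erase(&whole) != 0) whole.components.erase(&piece);
}

bool many_to_many_behaves_as_described() {
    complex_part car;
    primitive_part wheel;
    add_component(car, wheel);
    add_component(car, wheel);

    remove_component(car, wheel);                // one occurrence left
    bool kept = wheel.used_by.count(&car) == 1;
    remove_component(car, wheel);                // last occurrence removed
    bool dropped = wheel.used_by.empty();

    for (int i = 0; i < 3; ++i) add_component(car, wheel);
    remove_user(wheel, car);                     // removes all three occurrences
    return kept && dropped
        && car.components.empty() && wheel.used_by.empty();
}
```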

Use of Parameterized Types

Relationships can be used either with or without a compiler that supports parameterized types. All the previous examples were written without the use of parameterization. In the case of many-valued relationships, you can obtain a greater degree of type safety by using a parameterized collection type. This is accomplished by changing the last parameter to the relationship member macro (recall that the last parameter always indicates the type of the value). For example:

Example
class node {
  public:
    os_relationship_m_m(node,ancestors,node,descendents,
      os_Collection<node*>) ancestors;
    os_relationship_m_m(node,descendents,node,ancestors,
      os_Collection<node*>) descendents;
    node() {}
};

os_rel_m_m_body(node,ancestors,node,descendents);
os_rel_m_m_body(node,descendents,node,ancestors);

In this case, getvalue() and the coercion operator return an os_Collection<node*>& rather than an os_collection&.
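
The type-safety gain from a parameterized collection can be seen in a small standard C++ sketch. The classes untyped_collection and typed_collection are hypothetical stand-ins for os_collection and os_Collection<T>; they share nothing with the ObjectStore implementation beyond the general idea.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical untyped collection: elements come back as void*.
class untyped_collection {
public:
    void insert(void* element) { elements_.push_back(element); }
    void* at(std::size_t i) const { return elements_[i]; }
    std::size_t size() const { return elements_.size(); }
private:
    std::vector<void*> elements_;
};

// Hypothetical typed view: the cast lives in one place, and the compiler
// rejects elements of the wrong type.
template <class T> class typed_collection;

template <class T>
class typed_collection<T*> : private untyped_collection {
public:
    void insert(T* element) { untyped_collection::insert(element); }
    T* at(std::size_t i) const {
        return static_cast<T*>(untyped_collection::at(i));
    }
    using untyped_collection::size;
};

struct node { int id; };

// Stores a node and reads it back without a cast at the call site.
int first_id() {
    node n = {7};
    typed_collection<node*> nodes;
    nodes.insert(&n);            // inserting an int* here would not compile
    return nodes.at(0)->id;
}
```

With the untyped collection, the same read would need an explicit (node*) cast, as in the cursor loops of the earlier examples.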

Deletion Propagation and Required Relationships

By default, deleting an object that participates in a relationship automatically updates the other side of the relationship so that there are no dangling pointers to the deleted object. In some cases, however, the desired behavior is actually to delete the object on the other side of the relationship (for example, for subsidiary component objects). You can obtain this behavior by using the options forms of the relationship body macros:

os_rel_1_1_body_options()
os_rel_1_m_body_options()
os_rel_m_1_body_options()
os_rel_m_m_body_options()

(Descriptions of all these macros can be found in the C++ Collections Guide and Reference.)

These macros are like the body macros already discussed, except that they have three extra arguments, used for specifying various options. The fifth argument (the first extra argument) can be either os_rel_propagate_delete or os_rel_dont_propagate_delete, as in

Example
os_rel_m_1_body_options(part,subparts,part,container,
    os_rel_propagate_delete, os_auto_index, os_no_index)

The last two arguments are used to indicate whether the current member and its inverse are indexable. These are described in the next section.
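
The two deletion behaviors can be contrasted in a standard C++ sketch. The enum delete_policy and the class part below are hypothetical; they only imitate the effect of os_rel_propagate_delete and os_rel_dont_propagate_delete and do not use the ObjectStore macros.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical counterpart of the two option values.
enum delete_policy { propagate_delete, dont_propagate_delete };

struct part {
    part* container = nullptr;
    std::vector<part*> subparts;
    delete_policy policy;
    static int live_count;

    explicit part(delete_policy p) : policy(p) { ++live_count; }
    ~part();

    void add_subpart(part* child) {
        subparts.push_back(child);
        child->container = this;
    }
};

int part::live_count = 0;

part::~part() {
    --live_count;
    // Default behavior: remove the pointer the other side holds to this object.
    if (container) {
        std::vector<part*>& siblings = container->subparts;
        siblings.erase(std::remove(siblings.begin(), siblings.end(), this),
                       siblings.end());
    }
    std::vector<part*> children;
    children.swap(subparts);
    for (part* child : children) {
        child->container = nullptr;                     // no dangling pointer
        if (policy == propagate_delete) delete child;   // optional propagation
    }
}

// Deletes a container that has one subpart and reports how many parts remain.
int live_parts_after_delete(delete_policy policy) {
    part::live_count = 0;
    part* car = new part(policy);
    part* wheel = new part(policy);
    car->add_subpart(wheel);

    delete car;
    int remaining = part::live_count;
    if (policy == dont_propagate_delete) {
        assert(wheel->container == nullptr);   // pointer was cleared
        delete wheel;
    }
    return remaining;
}
```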

Indexable Inverse Members

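If you want automatic index maintenance enabled for an inverse data member, you must use one of the options body macros:

os_rel_1_1_body_options()
os_rel_1_m_body_options()
os_rel_m_1_body_options()
os_rel_m_m_body_options()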

(Descriptions of all these macros can be found in the C++ Collections Guide and Reference.)

These macros are like the body macros discussed earlier, except that they have three extra arguments used for specifying various options.

The sixth and seventh arguments (the second and third extra arguments) are used to specify whether the current member and its inverse, respectively, are indexable. For nonindexable members, use os_no_index. For indexable members, use a call to the macro os_index(), indicating the name of the defining class's os_backptr member. Such macro calls have the form

Form of the call
os_index(class,member)

where class is the name of the class defining the indexable member and member is the name of the os_backptr-valued data member appearing before indexable members of the class. Following is an example:

Example
os_rel_m_1_body_options(part,subparts,part,container,
  os_rel_propagate_delete,
  os_auto_index, os_index(part,b))

Many-valued members that have an inverse need not be indexable to be used in a path. For an indexable many-valued relationship, specify os_auto_index.



Copyright © 2003 Progress Software Corporation. All rights reserved.

Updated: 04/24/03 17:04:02