Tapping the ER/Repo Reservoir to Get the Metadata You Want

By: Gregory Keller

Abstract: Power user SQL Querying of the ER/Repo

Expert Advice Topic 1: Communicating Business Metadata: Tapping the ER/Repo Reservoir to Get the Metadata You Want.

Author: Greg Keller

Problem Space:

Of the myriad of duties performed by DAs (data architects) and Data Stewards, striving for conformity of data via approved data standards they create ranks amongst the highest of priorities and lays the groundwork for much of their peripheral activities. While conceptually modeling complex business data problems, these standards are constantly levied in their designs to ensure data consistency is permeated through the design, and hopefully carried on through to implementation. The problem is though: “How can we communicate to and ensure the folks who implement these designs, leverage these same standards in an efficient, simplistic and repeated manner? There are a myriad of solutions to solve this problem: off the shelf metadata Repositories (i.e. ‘Big R’ repositories), data modeling product web documentation facilities, custom/static metadata reports and more. The stumbling blocks are many though due to astronomical implementation and management costs of “Big R” Repositories (usually price themselves out of most Fortune 5000 shops) and modeling product reporting engines which do not easily allow for the level of metadata ‘searching’ the developers and various distributed audiences require to find the metadata definitions they are looking for.

Solution:

Tapping into and mining data from “small R” repositories (where “small R” = data modeling collaboration repositories) is a cost-effective and efficient approach at getting the required metadata directly from the ‘source’. Using the Embarcadero ER/Studio Repository as an example, we’ll discuss some approaches with examples of leveraging traditional SQL on how to tap and query the Repository data and achieve desired results sets.

  • Step 1: Setting the stage: A Primer on ER/Studio and ER/Studio Repository

ER/Studio and its optional add-on collaboration server, “ERStudio Repository”, establish a system of real time collaboration between teams of ‘modelers’ of varying roles on common designs…e.g. teams of DA’s and DBAs/Database Developers can work in tandem, safely and securely on the same designs at the same time. The products work in a classic Client/Server mode where ER/Studio is the “client” and ER/Repository acts as the “server.” ER/Studio Repository, as the server, is installed on an RDBMS (Oracle, SQL Server, DB2 UDB, Sybase).

Once installed, the process is simple: An ER/Studio DM1 file (the local copy of a model file on the client) is initially added to ER/Repository typically by an ER/Repository Administrator. This happens directly from the ER/Studio (client) interface. Behind the scenes, the process involves the ER/Studio turning the local “*.DM1” file (an ASCII file) into data values which is then inserted via SQL into the corresponding tables of the ER/Repository database. With this complete, the ‘diagram’ becomes available for others to come get and work on (security settings applied of course). This add/get process is very similar to that of typical source code control systems.

For more information, see: http://www.embarcadero.com/products/erstudio/requickfacts.html

  • Step 2: Discussion of ER/Repository’s Meta Model

So, with an ER/Studio diagram added to ER/Repository, and the diagram’s ‘metadata’ now inserted into the ER/Repository’s tables, how can you understand more about where your metadata is? There is a very helpful solution which ships with ER/Studio: The “Repository Meta Model”.

Using ER/Studio, browse to (using the default path here in this example) “C:\Program Files\Embarcadero\ERStudio6.0\SystemModels” and open up the DM1 file called “Repository250MetaModel”.

As a helpful ‘blueprint’, the ER/Repository’s meta model ships with ER/Studio so you can see all of the data structures and their associations. This will help you understand the path to accessing the data you require and the dependencies between the tables. Note that in the ER/Studio diagram, all of the entities/tables are inter-related with relationships. This is for design purposes only to document the relationships between tables the application enforces. The tables, as they exist on the database do not leverage primary/foreign key constraints.

Study the Submodels you see in the ER/Studio data model to understand key areas of the meta model such as the Data Dictionary system, Model Objects (e.g. where all of an Entities data is stored in Repository) etc.

  • Step 3: Writing the query:

Leveraging the ER/Repository Meta Model and a great SQL editing/query building tool such as DBArtisan, it’s time to get down to writing a query. In this example, we’ll be writing against an ER/Repository hosted on Oracle 8i, but the approach is similar regardless of the DBMS (with the exception of the JOIN syntax you will see here).

Assume you have the problem of weeding through metadata to look for specific things across all Diagram sin the ER/Repository. For the purpose of this query, assume you would like to search for…

  • A particular Physical Table Name,
  • Its corresponding Entity Name,
  • the definition/Comment of the Physical Table,
  • the Submodel the Entity is used in,
  • finally the name of the DM1 file the occurrences are found in.

…all by a specific and desired string.

The query to achieve this would look like what you see below. Leverage the ER/Studio Repo Metad Model to understand the associations between the tables reference below with respect to the placement on the ‘query chain’. Note ERREPO is the Owner used in this Oracle instance but would be replaced by any owner you will have set up in your ER/Repository installation.

SELECT ev.table_name, ev.NAME  Entity_Name, ev.Definition,

sv.name AS SubModel_Name,

           Mv.NAME AS Model_Name, dv.file_name AS DM1_File_Name

    FROM ERREPO.entity e, ERREPO.entity_ver ev, ERREPO.model m,

         ERREPO.model_ver mv, ERREPO.entity_display_ver edv,

         ERREPO.entity_display ed, ERREPO.submodel s,         

         ERREPO.submodel_ver sv,

         ERREPO.diagram_ver dv, ERREPO.diagram d

    WHERE ev.entity_id = e.entity_id

      and  e.latest_version_id = ev.entity_ver_id

      and  ev.is_deleted <> '1'

      and ev.table_name LIKE 'PR%' –Enter the desired search string here to replace the PR in the like clause.

      and e.entity_id = ed.entity_id

      and ed.latest_version_id = edv.entity_display_ver_id

      and edv.is_deleted <> '1' –Note that “is deleted” settings in Repo refer to the product’s ability to ‘keep’ deleted data in the Repo Database. The bit settings obscure this from ER/Studio and treat it like truly deleted data.

      and ed.submodel_id = s.submodel_id(+)

      and s.submodel_id = sv.submodel_id(+)

      and s.latest_version_id = sv.submodel_ver_id(+)

      and s.model_id = m.model_id(+)

      and m.latest_version_id = mv.model_ver_id(+)

      and m.model_id = mv.model_id(+)

      and nvl(mv.is_deleted,'0') <> '1'

      and m.diagram_id = d.diagram_id(+)

      and d.diagram_id = dv.diagram_id(+)

      and d.latest_version_id = dv.diagram_ver_id(+)

      and nvl(dv.is_deleted,'0') <> '1'

      and nvl(model_type,'2') = '2'

   ORDER BY 1, 4, 3

The results of the query would return values much like you see here:

Hide image
Click to see full-sized image

  • Step 4: What to do with the result sets.

With the initial knowledge of ‘mining’ ER/Repo data learned here, the opportunities are wide open with respect to communicating this valuable metadata. A myriad of ER/Studio Repository users have heavily leveraged this SQL query process to establish a veritable metadata ‘mart’ whose audiences are disparate in role and remotely located. Common uses of Repository querying are:

  • Writing HTML/Perl/CGI scripts to interface with ER/Repository to allow input of search strings to populate web-enabled Data Dictionary pages.
  • Querying the ER/Repository for re-purposing the design metadata for home-grown ETL transactions
  • Cross populating Repositories…e.g. querying/pulling metadata from the ER/Repository to populate “master” metadata Repositories.

Conclusion:

By virtue of ER/Studio populating its design metadata to a relational database it’s collaborative Repository is base don, users can literally mine this data for valuable “re-purposing” exercises to achieve cost-effective means to ensure a wide audience of people can benefit from seeing/leveraging the same design standards as set by data architects in their data models.


Server Response from: ETNASC03