Year 2000 Conversions

with

ADAPT/2000

 

A Methodology for Millennium Project Implementation

Prepared By:

George Luntz

President - Allegiant Legacy Solutions, Inc.

 

Edited December 1997 by:

Stanley Mackintosh

Information Systems Architect

 

Background
As the new millennium approaches, corporations large and small face one of the most significant challenges ever to confront the information processing industry - enhancing existing applications to handle the expanded date formats of the 21st century. ADAPT/2000 is designed to help companies address all the major phases of what has been billed as the largest code maintenance project in history - source code analysis, data conversion, and source code modification.

For a variety of reasons, many legacy application systems developed in the 1960’s, 1970’s, 1980’s and some even in the 1990’s used a 2-digit field in which to store the year of a date. The Gartner Group estimates that 90% of all applications will be affected by the Year 2000 problem. They further estimate that major corporations will each spend between $50 - $100 million trying to solve the problem. Indeed, according to some estimates, even with the use of tools such as ADAPT/2000, an additional 200,000 COBOL programmers will need to be added to the existing pool of 900,000 if the deadline is to be met worldwide.

Those companies that made an early decision to solve the problem could do so in an orderly, planned and timely manner. Implementing a millennium project is, after all, nothing more than a carefully planned code maintenance project, if the correct tools and approach are used. Time is now short, but even later-starting projects must be done in an orderly, planned and timely manner.

Allegiant Legacy Solutions, Inc., and its business partners, have invested heavily in producing a combination of tools, services and methodologies to help corporations address the millennium issue. This document outlines an approach utilizing ADAPT/2000 as the core enabling technology, and Allegiant and its partners as a resource, to solve the problem for COBOL based systems.

Editor’s Note:- Some late-breaking developments are included as Editor’s notes.

Defining the Problem
In technical terms, a Year 2000 problem exists wherever date data is stored in fields that do not allow for a 4-digit year, or the calculation of one from a satisfactory internal storage definition (for example, a date stored as the number of days since a base date such as January 1, 1900). In such cases, whenever a date in the new century is required to be stored, it will be evaluated as being less than the year 19XX, resulting in sorting and computational errors of huge importance.
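The sorting and computational errors described above can be illustrated with a minimal Python sketch. The field values and the helper name elapsed_years are assumptions chosen for illustration; they are not taken from any particular application.

```python
# Dates held as YYMMDD text, as in many legacy record layouts.
last_day_of_1999 = "991231"
first_day_of_2000 = "000101"


def elapsed_years(start_yy: int, end_yy: int) -> int:
    """Year arithmetic as a program limited to 2-digit years would do it."""
    return end_yy - start_yy


# Sorting on the stored value puts 1 January 2000 before 31 December 1999.
two_digit_order = sorted([last_day_of_1999, first_day_of_2000])

# The same dates with a 4-digit year sort correctly.
four_digit_order = sorted(["19991231", "20000101"])

# An account opened in 1965 and checked in 2000 appears to be -65 years old.
apparent_age = elapsed_years(65, 0)
```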

Such date problems can exist in a number of key areas, not just in programming source code:

• Source Code (such as COBOL, PL/1, Assembler, BASIC, even ‘C’)

• Data files

• Job Control Scripts (JCL, DOS BAT Files, VMS DCL files, UNIX Shell Scripts, etc.)

• External sort definitions

• Operating systems and system clocks that do not support dates greater than 1999

Each of these areas needs to be addressed as an integral part of the millennium project. The application related aspects of a millennium project can generally be broken down into 6 phases, according to Software Management Measures (Special Year 2000 Edition, April 1996):

1. Inventory Analysis
Scoping the problem by identifying all components and defining the boundaries of applications. Making strategic choices for each application. This will help to determine which systems will be replaced and when, and which will be retained. The general size of the problem will then be known.

2. Impact Analysis
The goal is to estimate the budgets for the complete solution. Many projects use scanning methods only for this phase (i.e. they do not use full source parsing and associated field identification). However, they can then only give a very rough estimate for the whole project, and cannot support strategic decision making per system.

The first step of Impact Analysis is a high-level analysis. The aim of this step is to identify all programs infected, and to count the date fields per program that have to be changed. To get a more reliable estimate, a detailed level analysis is needed to identify the actual infections. Once these analyses have been done, a project estimation can be made, based on size and complexity metrics.

3. Detailed Analysis
The goal of this phase is to identify the exact locations of date fields and their infections (i.e. how they relate to other fields). To make a productive detailed analysis that is able to identify all of the date fields, you must trace the infections across program boundaries. Detailed analysis involves recursive analysis via interfaces such as common/linkage areas, data file storage, working data storage, and procedure division usage.

4. Partitioning/Planning
A detailed plan of technical solutions for each system or program is made. First, the selection of application clusters (sub-projects) is performed, then work lists per sub-project are drawn up, and finally the project can be planned.

5. Conversion
Automatic conversion should be used as much as possible during this phase of the project, not only for source code but for the data as well.

6. Testing
The test phase is often the most expensive and people-intensive phase. You must check all components for conformance to date standards (review), calling architecture standards, and the control structure.

Selecting a Date Format Standard
The International Organization for Standardization (ISO) standard for date storage is the ISO 8601 specification. Under ISO 8601, dates should be stored in the format YYYY-MM-DD, requiring 8 digits, with 4 digits for the year. This is also represented by CCYY-MM-DD, where CC is the century portion of the 4-digit year.

It will not always be practical or desirable to alter all non-conforming dates to this standard. For example, dates stored as Julian dates with a two digit year (YY-DDD) may be best altered to a millennium compliant Julian format (YYYY-DDD) due to the way the field is used in computations. Similarly, dates stored as MM-DD-YY might best be changed to MM-DD-YYYY, again depending on usage and redefinition. However, wherever possible, the ISO standard should be used, even if this results in more complex code conversion. International standards for file design, user interface design and communications protocols, for instance, have increasingly adopted this format.

Conversion tools used should support this standard, as well as other date formats, and conversion from non-conforming formats to these.
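As a rough illustration of converting non-conforming formats, the following Python sketch expands an MM-DD-YY date to the ISO form and a YY-DDD Julian date to YYYY-DDD. The function names are hypothetical, and the century is passed in explicitly by the caller; how the century is decided for a given field is a separate question that this sketch does not address.

```python
import calendar
from datetime import date


def mmddyy_to_iso(text: str, century: int) -> str:
    """Convert 'MM-DD-YY' to ISO 8601 'CCYY-MM-DD' for a caller-supplied century (e.g. 19)."""
    month, day, yy = (int(part) for part in text.split("-"))
    # date() raises ValueError for impossible dates such as 02-30.
    return date(century * 100 + yy, month, day).isoformat()


def yyddd_to_ccyyddd(text: str, century: int) -> str:
    """Expand a Julian 'YY-DDD' date to 'CCYY-DDD', checking the day number."""
    yy, ddd = text.split("-")
    year = century * 100 + int(yy)
    days_in_year = 366 if calendar.isleap(year) else 365
    if not 1 <= int(ddd) <= days_in_year:
        raise ValueError(f"day {ddd} does not exist in {year}")
    return f"{year:04d}-{int(ddd):03d}"
```

For example, mmddyy_to_iso("12-31-97", 19) yields "1997-12-31", and yyddd_to_ccyyddd("00-366", 20) is accepted because 2000 is a leap year, whereas the same value with century 19 is rejected.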

Editor’s Note:- The Mackintosh Date Convention, published in December 1997, provides a method of achieving unequivocal century definition within existing 6-digit date field formats on existing files without making any file changes at all. This convention makes use of the fact that only about 3.7% of all possible 6-digit combinations are used for all possible dates in one century, ergo several thousands of years of dates can be stored unambiguously within those 6 digits. Even 5-digit Julian Date formats can accommodate another century of dates - or many more centuries if (as is normally the case) they are 3 byte packed fields. The major significance of this new convention is that the sheer workload and risk of file conversions can be avoided by those who have not yet taken that route. This leaves a smaller (but still major) project of procedural code re-engineering, of course, for which ADAPT/2000 is specifically designed.
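The counting argument in the Editor's Note can be checked with a short Python calculation. This is only the capacity bound; it does not reproduce the encoding used by the Mackintosh Date Convention, which is not described in this document.

```python
from datetime import date


def days_in_century(first_year: int) -> int:
    """Number of calendar days in the 100 years starting at first_year."""
    return (date(first_year + 100, 1, 1) - date(first_year, 1, 1)).days


SIX_DIGIT_CODES = 10 ** 6    # 000000 through 999999
FIVE_DIGIT_CODES = 10 ** 5   # 00000 through 99999

# Share of 6-digit combinations needed for the dates of 1900-1999.
share_percent = 100 * days_in_century(1900) / SIX_DIGIT_CODES

# Upper bound on whole centuries that fit, using the longer 36525-day century.
centuries_in_six_digits = SIX_DIGIT_CODES // days_in_century(2000)
centuries_in_five_digits = FIVE_DIGIT_CODES // days_in_century(2000)
```

The share works out at roughly 3.65%, which rounds to the 3.7% quoted above. On this bound, 6 digits have room for at most 27 centuries and 5 digits for 2, which is consistent with "several thousands of years" and "another century" respectively.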

ADAPT/2000 Approach
ADAPT/2000 provides an integrated, repository based approach that addresses the core phases of the millennium project. The ADAPT/2000 integrated approach means that each activity feeds the next, and conversion can be automated across source code, copy member and actual data files.

ADAPT/2000 incorporates code scanning, parsing and generating technologies, and the concept of a central data dictionary and repository to provide an all-encompassing approach to millennium projects. These technologies include:

Code Scanning - using user defined rules for performing the high level impact analysis, identifying fields by name or storage characteristics as candidates for date fields.

Code Parsing - breaking down source statements to identify sentence structure, with VERBS and NOUNS, and establishing date infections based on associations.

Data Dictionary - building a comprehensive and accurate depiction of every data file, its records and fields, including multiple-record files, redefines, occurs arrays, keys and alternate keys, and identifying all date fields and their representation (internal format).

Data De-aliasing - understanding the implications of shared data areas and overlapping data areas (such as redefines), so that date-impacted data areas, even if not directly named or addressed, are taken into account

Data Flow Analysis - tracing the movement of date information from one field to the next, and how this is propagated across the system.

Central Repository - building and maintaining a complete cross-reference of data element usage, program element infections, and step by step project phase completion by element.

Date Field Expansion - ADAPT/2000 supports expansion of date fields to support century information, with significant automation of the source code changes needed to support expansion, as well as the generation of data conversion programs to convert existing data if the alternative route of the Mackintosh Date Convention is not adopted.

Date Windowing - since not all fields should be expanded to 4-digit years, depending on their usage and the ramifications of field expansion, ADAPT/2000 provides a called subroutine library that facilitates a fixed or sliding window for date conversion and offsetting, with user defined cut-off.

In-built Program Generation - to support the testing phase through report generation and data file manipulation. This includes generation of aged data from a test baseline; data bridging programs; and system date interception
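The fixed and sliding windows mentioned under Date Windowing can be sketched as follows. This is a minimal Python illustration of the general technique, not the interface of the ADAPT/2000 subroutine library; the function names, the default cut-off of 50 and the 20-year look-ahead are all assumptions.

```python
from datetime import date


def fixed_window(yy: int, cutoff: int = 50) -> int:
    """Fixed window: years below the cut-off become 20YY, the rest 19YY."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year")
    return (2000 if yy < cutoff else 1900) + yy


def sliding_window(yy: int, today: date, years_ahead: int = 20) -> int:
    """Sliding window: map YY into the 100 years ending years_ahead after today."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year")
    latest = today.year + years_ahead          # last year inside the window
    candidate = latest - (latest % 100) + yy   # try the century of `latest`
    if candidate > latest:
        candidate -= 100                       # otherwise fall back one century
    return candidate
```

With a cut-off of 50, fixed_window(49) gives 2049 and fixed_window(50) gives 1950. The sliding version moves with the processing date: run in 1998 with a 20-year look-ahead, the window covers 1919 through 2018.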

No one tool could possibly address every single aspect of a millennium project. ADAPT/2000 is designed to help implement the core phases requiring necessary integration. It supports the use of 3rd party tools for automated regression testing, data analysis and source code control.

Choice of Platform
Choosing the processing platform upon which to run ADAPT/2000 will depend to some extent upon the source code origin. The following matrix is a guide to choosing the platform. Note that these are recommendations only, and that almost any platform can be used for any of the supported COBOL dialects.

ADAPT/2000 Project Steps
Implementing a millennium project involves the following main steps, with involvement from each party as indicated below. Again, this is a suggested approach to enhance the chances of success. Alternative project structures can be implemented, ranging from complete hands-off by Allegiant and its partners (except for basic tools training), to a complete turnkey solution by Allegiant and its partners.

Editor’s Note:- With immutable deadlines approaching fast, the following necessary steps cannot be taken at a leisurely pace. Normal project cycle timescales now need to be accelerated and concentrated. Y2000 compliance should now be the first or preferably the only priority for systems development resources until completed. The Devil shall surely take the hindmost!

1. Inventory Analysis
While this is primarily a user responsibility, and requires intimate knowledge of the user’s systems and their usage and life cycle, ALS and its partners can provide significant experience and expertise in helping to assess existing systems, the alternatives available for millennium implementation, and a recommended approach. The output of this phase, if not already done, is a jointly devised report including:

• A high level inventory of all systems at the organization

• Likelihood of year 2000 problems in each system or application

• A recommendation for replacement, rewrite, or analyze and convert, based on the application’s priority, uniqueness to the organization, life cycle, desirability for ongoing maintenance and propagation, and the availability of productivity tools, amongst other considerations

• A plan and budget for moving to the next 2 phases - impact and detailed analysis

2. Impact / Detailed Analysis
This is an important early step in defining the extent of the problem, and to indicate resource requirements. ADAPT/2000 includes both high level analysis and detailed code parsing to perform the following major steps:

• Configure ADAPT/2000 to establish overall defaults, and application-specific defaults. These include items such as default directories for source, copy elements, JCL and data; element naming conventions (e.g. CPY for copy elements) etc.

• Define rules (naming conventions) for identifying date fields. ADAPT/2000 supports powerful and flexible wildcard and pattern matching to facilitate comprehensive name matching.

• Define rules for excluding fields that might have been identified as dates. For instance, while one might want to identify all field names that include the letters "DATE" in them as dates, the field "UPDATE-FLAG" is obviously not a date field.

• Define field storage characteristics to be used to identify date fields when name matching might fail. Depending on programming standards employed, or lack thereof, in developing systems, name matching might not identify all or even the majority of fields. ADAPT/2000 enables the definition of storage characteristics to identify potential matches. For example, all fields stored as PIC 9(6), or PIC 9(6) COMP-3 etc., might be flagged.

• ADAPT/2000 scans directories and builds a database (an inventory) of all application elements (programs, copy elements, data files etc.) to be used in tracking statistics and project status.

• Run the high level analysis tools to scan for possible date matches. ADAPT/2000 builds an interim dictionary of all fields, flagging those that are potential date fields. A report or inquiry can be run to identify these elements, along with the rule(s) that caused the match. Where a name match was made, ADAPT/2000 also inspects the field storage characteristics to suggest in what format the date is stored. ADAPT/2000 scans all copy elements and source programs in this phase, examining all code up to the Procedure Division.

• Use ADAPT/2000’s interactive utility to reject false positives, to confirm correct matches, to identify false negatives as dates, and to confirm the internal storage format for each field.

• Run the second phase of high level impact analysis to build a cross-reference of date fields, data files and program elements. This helps to narrow down the list of infected items for project sizing.
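The name rules, exclusion rules and storage-characteristic rules in the steps above might be sketched as below. This is a hedged Python illustration only: the patterns, the PICTURE list and the function name classify_field are assumptions, and ADAPT/2000's own rule syntax is not shown in this document.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical rule sets; a real project would tune these per application.
INCLUDE_PATTERNS = ["*DATE*", "*-DT", "*YEAR*"]
EXCLUDE_PATTERNS = ["UPDATE-*", "*VALIDATE*"]
DATE_LIKE_PICTURES = {"9(6)", "9(6) COMP-3"}


def classify_field(name: str, picture: str) -> Optional[str]:
    """Return the rule that flagged the field as a date candidate, or None."""
    upper = name.upper()
    excluded = any(fnmatchcase(upper, p) for p in EXCLUDE_PATTERNS)
    if not excluded:
        for pattern in INCLUDE_PATTERNS:
            if fnmatchcase(upper, pattern):
                return f"name:{pattern}"
    # Fall back to storage characteristics when the name gives no match.
    normalized = " ".join(picture.upper().split())
    if normalized in DATE_LIKE_PICTURES:
        return f"storage:{normalized}"
    return None
```

Recording which rule caused each match keeps the later interactive review manageable, since a reviewer can reject or confirm candidates rule by rule. A field flagged only by storage is still just a candidate, not a confirmed date.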

As already pointed out, high level analysis is not comprehensive enough to be used for final budgeting and planning. Many tools stop here, but ADAPT/2000 extends its scope to include complete code parsing in order to identify all date infections. The full power of ADAPT/2000 begins to show itself here, since ADAPT/2000’s macro language facilitates:

• Parsing each statement into VERBS and NOUNS

• User definition of which VERBS (MOVE, ADD, COMPUTE, CALL, EXEC CICS etc.) to examine, and which to ignore. This is an important feature, and one which enables ADAPT/2000 to address a wide range of COBOL dialects, with previously unseen VERBS

• Context interpretation to minimize false positives. For instance, in examining a MOVE involving a date field, one could surmise that all other fields in the MOVE statements are date fields too. However, a MOVE ZEROES TO a list of field names is simply an initializer for multiple numeric fields, and in no way indicates that all fields are associated with the date field. ADAPT/2000 supports the interpretation of VERBS to make such decisions.

Using the above features, ADAPT/2000 will recursively analyze all application elements to identify date infections by association. These fields are added to the interim dictionary, along with an indication of how they were found, and a cross reference of usage is built. Again using the interactive tools, false positives can be rejected, and then the next recursive pass is run, until no further dates are found.
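The recursive identification of infections by association can be sketched as a repeat-until-stable loop. The Python below is a simplified illustration that handles only single-line MOVE statements and treats figurative constants and literals as carrying no date information; the names are hypothetical, and a real parser would cover many more VERBS as well as qualified or subscripted names.

```python
import re

FIGURATIVE = {"ZERO", "ZEROS", "ZEROES", "SPACE", "SPACES",
              "HIGH-VALUES", "LOW-VALUES"}
MOVE_RE = re.compile(r"^\s*MOVE\s+(\S+)\s+TO\s+(.+?)\.?\s*$", re.IGNORECASE)


def move_operands(statement):
    """Return (source, targets) for a simple MOVE, or None if it implies no association."""
    match = MOVE_RE.match(statement)
    if not match:
        return None
    source = match.group(1).upper()
    # MOVE ZEROES TO A B C is an initializer, not a link between A, B and C.
    if source in FIGURATIVE or source[0].isdigit() or source[0] in "'\"":
        return None
    return source, match.group(2).upper().split()


def propagate(statements, seed_fields):
    """Repeat passes over the statements until no new date fields are found."""
    infected = {name.upper() for name in seed_fields}
    changed = True
    while changed:
        changed = False
        for statement in statements:
            operands = move_operands(statement)
            if operands is None:
                continue
            group = {operands[0], *operands[1]}
            if group & infected and not group <= infected:
                infected |= group
                changed = True
    return infected
```

In practice each pass would be followed by the interactive review described above, so that a false positive is rejected before it can spread to further fields.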

At this point, ADAPT/2000 will have identified all date fields, how they are used, how they are passed from field to field and program to program, and a wide variety of statistics by program, including:

• All data files with dates
• All program elements with dates
• All date fields found, and how
• Statistics by program element (size, complexity, date hits etc.)
• Total lines of source code
• Total data elements
• Total procedure statements
• Number of copy statements
• Number of data files
• Number of procedure date related statements (move, compute, add etc.)
• Project segmentation reports, such as data files by source, source by data file etc.

By applying metrics based on these statistics, a much more accurate picture can be established, and much more appropriate project estimation and planning can be performed.

3. Partitioning and Planning
Based on the Detailed Analysis reports, the project can then be partitioned into sub-projects, and elements assigned to individual analyst/programmers. ADAPT/2000 provides utilities and reports to facilitate this, with final output providing a comprehensive report of project scope and itemized detail by programmer analyst.

ADAPT/2000 also tracks project status by element, keeping track of each phase completed for each element, with supporting reports and inquiries. At this stage, implementation of the conversion can begin.

4. Source Code Conversion
Source code conversion could never be fully automated, even with an outstanding tool like ADAPT/2000. Every line of code originally written bears the mark of the original developer - in style, intent and logic. No automated conversion tool would be able to modify the code flawlessly without some manual intervention, and some degree of human decision making.

ADAPT/2000 combines the power of code parsing with the experienced COBOL development backgrounds of its developers to produce a well-rounded conversion offering that will:

• Use a rules-based paradigm to make conversion decisions, such as deciding which non-compliant fields NOT to convert, and how to modify the source code to convert to and from this non-compliant form (e.g. to minimize screen layout changes, you might elect not to convert a data field to 8 digits, yet will want to convert to and from an 8-digit ISO date field)

• Place intelligent markers in the source code to identify where a change should be made, with suggested changes to be made, or where an automated change has been made

• Automatically implement the A2K date library calls for date conversion from one form to another in the application, if desired

• Output new copy elements for SELECTS, FD’s and WS record layouts, based on the conversion rules established.

ADAPT/2000 also incorporates a powerful integrated source editor, which enables users to advance sequentially through the code based on the markers inserted, and allowing on-line editing of the source code to make necessary changes. Through this combination of automated code conversion and source marking and editing, a complete and effective source conversion is implemented.
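The idea of placing markers in the source for later review can be sketched as follows. This Python illustration inserts a COBOL comment line (asterisk in column 7) ahead of each statement line that names a known date field. The marker text Y2K-MARK and the function name are hypothetical and do not represent the marker format ADAPT/2000 actually writes.

```python
import re


def mark_source(lines, date_fields):
    """Return a copy of `lines` with a comment marker before each line naming a date field."""
    # COBOL names may contain hyphens, so a match must not be part of a longer name.
    patterns = {
        field: re.compile(
            rf"(?<![A-Z0-9-]){re.escape(field)}(?![A-Z0-9-])", re.IGNORECASE
        )
        for field in date_fields
    }
    marked = []
    for line in lines:
        hits = sorted(f for f, p in patterns.items() if p.search(line))
        is_comment = len(line) > 6 and line[6] == "*"
        if hits and not is_comment:
            marked.append("      *Y2K-MARK review: " + ", ".join(hits))
        marked.append(line)
    return marked
```

An editor can then step from marker to marker, which is the workflow described above for the integrated source editor.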

5. Data Conversion
This section applies when file conversion, rather than the new Mackintosh Date Convention, is to be used.

In order to achieve the maximum benefit of the work performed in doing the impact analysis, the year 2000 conversion tool should address not only the source code conversion, but data conversion as well. If the tool is repository and dictionary based, then the work done in earlier steps can be leveraged to automate the data conversion process.

ADAPT/2000 provides the most complete, integrated data conversion capability of the year 2000 conversion tools on the market. As a by-product of the impact analysis, a central data dictionary is built describing all date impacted files and records. This includes information such as:

• File and record relationships

• Multiple record definitions

• Redefines and Occurs clauses

• Every date field, with internal storage characteristics

ADAPT/2000’s rules library enables users to define which date formats to convert; the chosen new format for each one (you are not restricted to one date format only); and which fields not to convert (e.g. screen or report fields may be left unmodified to reduce layout problems).

As a result, ADAPT/2000 will:

• Produce a new data dictionary definition for each file and record, paying full attention to redefinitions and multiple record types

• Generate a data conversion program with full source code for each file

The data conversion programs generated will run in the following environments:

As a result, ADAPT/2000 will leverage the effort and knowledge base from the impact analysis and source conversion choices to generate a complete set of data conversion programs.
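To give a sense of what a generated data conversion program does, here is a minimal Python sketch that expands 6-digit YYMMDD fields in a fixed-width record to 8-digit CCYYMMDD. It is not output from ADAPT/2000. The record layout, the function name, the cut-off used to choose the century, and the handling of zero or blank dates are all assumptions made for illustration.

```python
from typing import Sequence


def expand_record(record: str, date_offsets: Sequence[int], cutoff: int = 50) -> str:
    """Expand each YYMMDD field to CCYYMMDD; offsets refer to the OLD record layout."""
    pieces = []
    cursor = 0
    for offset in sorted(date_offsets):
        pieces.append(record[cursor:offset])       # copy data before the date
        field = record[offset:offset + 6]
        if field == "000000":
            pieces.append("00000000")              # keep "no date" as zeros
        elif field.isdigit():
            century = "20" if int(field[:2]) < cutoff else "19"
            pieces.append(century + field)
        else:
            pieces.append(field.ljust(8))          # keep blanks or bad data, padded
        cursor = offset + 6
    pieces.append(record[cursor:])                 # copy the rest of the record
    return "".join(pieces)
```

Because each date field grows by two characters, every field after it shifts in the new layout, which is why the record definitions and the conversion program need to be produced from the same dictionary.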

6. Testing
Final unit and integration testing is a resource-intensive task. A wide range of tools is available for helping to reduce the manpower required to complete the task, some of which are included in ADAPT/2000, and others of which are generally available and compatible with ADAPT/2000.

Tools that will prove useful in this phase include:

Data generating tools for test scenarios

Data editing and manipulation tools to support multiple runs through the data over multiple periods

Regression testing tools to automate processes such as data input, test runs and results comparison

Source code debuggers to step you through the code for problem resolution

ADAPT/2000 does not extend its built-in resources to regression testing tools. However, it does include significant aids for the testing phase, including:

An SQL-like language for data generation, resetting and manipulation

Automated generation of data aging programs to move baseline test data forward to assist in Year 2000 compliance and boundary testing

Automated generation of data bridging programs to assist in testing and implementing segmented projects, where not all modules are upgraded simultaneously

Replacement of standard COBOL syntax to "ACCEPT" the current system date by a called routine (supplied with ADAPT/2000), to enable future date testing to be conducted without interfering with current production applications

Integrated report writer and query tools for results analysis and comparisons, using the same data dictionary created during the analysis and implementation phases of the project

Integrated, forms-based data editing for one-time data fixes, or data entry of new test data
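Two of the testing aids listed above, data aging and system date interception, can be illustrated with a short Python sketch. It shows the general techniques only: the function names and the Y2K_TEST_DATE environment variable are hypothetical and are not part of ADAPT/2000.

```python
import os
from datetime import date


def age_date(value: date, years: int) -> date:
    """Move a baseline date forward by whole years for boundary testing."""
    try:
        return value.replace(year=value.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return value.replace(year=value.year + years, day=28)


def current_date(environ=os.environ) -> date:
    """Return an overriding test date if one is set, else the real system date.

    Applications call this routine instead of reading the clock directly,
    so a test run can pretend to be in the future without touching the
    system clock used by production work.
    """
    override = environ.get("Y2K_TEST_DATE")  # hypothetical variable, ISO format
    if override:
        return date.fromisoformat(override)
    return date.today()
```

For example, aging a baseline record dated 31 December 1997 by two years produces 31 December 1999, and a test run started with the override set to 2000-01-01 exercises the rollover logic.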

Summary
The Year 2000 problem is now seen as a very real problem that may severely impact and even cripple the operations of organizations dependent on data processing and computing technology, and innocent parties who rely upon them. Because of the sheer size of the remedial task, and the time-based nature of many applications, it is not something that can be left until 1999 to be addressed.

The case for immediate action is made stronger by:

• A dire shortage of COBOL programmers to carry the manual load

• The likelihood of applications failing long before 2000

• The duration and manpower requirements of a complex year 2000 project, often requiring the hiring or contracting of new resources

• The likelihood that programming costs will rise significantly over the next few years as demand for COBOL programmers grows to hysterical proportions, and outstrips supply

Companies that do not achieve substantial Y2000 compliance in 1998 (and, of course, also achieve full 1999 compliance) may well face the prospect of a critical and possibly fatal stoppage of their data processing systems. Act now!


© 1999 Rx Computer Ltd - Registered Trademarks and Names are the property of their respective owners.