Mapping and the GEMs
Translating Between the ICD-9 and ICD-10 Diagnosis Code Sets
Mappings between ICD-9 and ICD-10 attempt to find corresponding diagnosis codes between the two code sets, insofar as this is possible. In some areas of the classification the correlation between codes is fairly close, and since the two code sets share the conventions of organization and formatting common to both revisions of the International Classification of Diseases, translating between them is straightforward. Many infectious disease, neoplasm, eye, and ear codes are examples of fairly straightforward correspondence between the two code sets. In other areas (obstetrics, for example) whole chapters are organized along a different axis of classification. In such cases, a translation can usually offer only a series of possible compromises rather than a mirror image of one code in the other code set.
ICD-10 Description | Correlation | ICD-9 Description | Unequal Axis of classification |
A02.21 Salmonella meningitis | = | 003.21 Salmonella meningitis | None |
C92.01 Acute myeloid leukemia, in remission | = | 205.01 Myeloid leukemia, acute, in remission | None |
ICD-10 Description | Correlation | ICD-9 Description | Unequal Axis of classification |
O26.851 Spotting complicating pregnancy, first trimester; O26.852 Spotting complicating pregnancy, second trimester; O26.853 Spotting complicating pregnancy, third trimester; O26.859 Spotting complicating pregnancy, unspecified trimester | ≠ | 649.50 Spotting complicating pregnancy, unspecified episode of care; 649.51 Spotting complicating pregnancy, delivered; 649.53 Spotting complicating pregnancy, antepartum | Stage of pregnancy (ICD-10) vs. Episode of care (ICD-9) |
A sentence translated from English to Chinese may not be able to capture the full meaning of the original because of fundamental differences in the structure of the language. Likewise, a code set may not be able to seamlessly link the codes in one set to identical counterparts in the other code set. For these two diagnosis code sets, it is often difficult to find two corresponding descriptions that are identical in level of specificity and terminology used. This is understandable. Indeed, there would be little point in changing from the old system to the new system if the differences between the two, and the benefits available in the new system, were not significant.
There is no simple “crosswalk from ICD-9 to ICD-10” in the GEM files. A mapping that forces a simple correspondence—each ICD-9 code mapped only once—from the smaller, less detailed ICD-9 to the larger, more detailed ICD-10 defeats the purpose of upgrading to ICD-10. It obscures the differences between the two code sets and eliminates any possibility of benefiting from the improvement in data quality that ICD-10 offers. Instead of a simple crosswalk, the GEM files attempt to organize those differences in a meaningful way, by linking a code to all valid alternatives in the other code set from which choices can be made depending on the use to which the code is put.
It is important to understand the kinds of differences that need to be reconciled in linking coded data. The method used to reconcile those differences may vary, depending on whether the data is used for research, claims adjudication, or analyzing coding patterns between the two code sets; whether the desired outcome is to present an all-embracing look at the possibilities (one-to-many mapping) or to offer the one “best” compromise for the application (one-to-one mapping); whether the desired outcome is to translate existing coded data to their counterparts in the new code set (“forward mapping”) or to track newly coded data back to what they may have been in the previous code set (“backward mapping”), or any number of other factors. The scope of the differences varies, is complex, and cannot be overlooked if quality mapping and useful coded data are the desired outcomes. Several common types of differences between the code sets will be examined here in detail to give the reader a sense of the scope.
Diagnosis Codes and Differences in Classification
ICD-10-CM has been updated to reflect the current clinical understanding and technological advancements of medicine, and the code descriptions are designed to provide a more consistent level of detail. It contains a more extensive vocabulary of clinical concepts, body part specificity, patient encounter information, and other components from which codes are built.
For example, an ICD-9 code description containing the words “complicated open wound” does not have a simple one-to-one correspondent in ICD-10. The ICD-9 description identifies the clinical concept “complicated,” but according to the note at the beginning of the section, that one concept includes any of the following: delayed healing, delayed treatment, foreign body or infection. ICD-10 does not classify open wound codes based on the general concept “complicated.” It categorizes open wounds by wound type (laceration or puncture wound, for example) and then further classifies each type of open wound according to whether a foreign body is present. ICD-10 open wound codes do not mention delayed healing or delayed treatment, and instructional notes advise the coder to code any associated infection separately. Therefore, depending on the documentation in the record, the correct correspondence between an ICD-9 code and an ICD-10 code could be one of several.
Diagnosis Codes and Levels of Specificity
ICD-9 and ICD-10 Code Sets Compared:
Code Length and Set Size
Comparison | ICD-9-CM | ICD-10-CM |
# of Characters | 3-5 Numeric (+V and E codes) | 3-7 Alphanumeric |
# of Codes | ~14,500 | ~70,000 |
As shown in the table above, ICD-10 codes may be longer, and there are about five times as many of them. Consequently, in an unabridged ICD-9 to ICD-10 mapping, each ICD-9 code is typically linked to more than one ICD-10 code, because each ICD-10 code is more specific.
ICD-10 is much more specific than ICD-9, and, just as important for purposes of mapping, the level of precision in an ICD-10 code is more consistent within clinically pertinent ranges of codes. In ICD-9, on the other hand, the level of detail among code categories varies greatly. For example, category 733, Other disorders of bone and cartilage, contains the codes:
- 733.93 Stress fracture of tibia or fibula
- 733.94 Stress fracture of the metatarsals
- 733.95 Stress fracture of other bone
- 733.96 Stress fracture of femoral neck
- 733.97 Stress fracture of shaft of femur
- 733.98 Stress fracture of pelvis
Five of the six codes specify the site of the fracture. The third code is an “umbrella” code for all other bones in the body. In practical terms this means that the general ICD-9 code 733.95 must represent a whole host of disparate fracture sites. Diagnoses that are identified by umbrella codes lose their uniqueness as coded data. When only the coded ICD-9 data is available, it is impossible to tell which bone was fractured. On the other hand, in many instances ICD-10 provides specific codes for all likely sites of a stress fracture, including more specificity for the bones of the extremities, the pelvis and the vertebrae. Stress fracture data coded in ICD-10 possesses a consistent level of specificity.
One might expect an ICD-10 to ICD-9 mapping never to contain one-to-many mappings, since ICD-10 is so much larger and more specific. However, there are cases where ICD-9 contains more detail than ICD-10, especially where a clinical concept or axis of classification is no longer deemed essential information. Aspects of some individual ICD-9 code descriptions, such as information about how a diagnosis was confirmed, were intentionally not included in ICD-10. This means a single ICD-10 code could be linked to more than one ICD-9 code option, depending on the purpose of the mapping and the specific documentation in the medical record.
Below are two examples where a distinction made in ICD-9 is not made in ICD-10. The result is that the ICD-10 code could be linked to more than one ICD-9 code, because a particular area of the ICD-9 classification contains detail purposely left out of ICD-10.
Specificity in ICD-9 and not in ICD-10:
Method of Detection
ICD-9 contains | ICD-10 contains |
010.90 Primary tuberculous infection, unspecified examination; 010.91 Primary tuberculous infection, bacteriological/histological exam not done; 010.92 Primary tuberculous infection, bacteriological/histological exam unknown (at present); 010.93 Primary tuberculous infection, tubercle bacilli found by microscopy; 010.94 Primary tuberculous infection, tubercle bacilli found by bacterial culture; 010.95 Primary tuberculous infection, tubercle bacilli confirmed histologically; 010.96 Primary tuberculous infection, tubercle bacilli confirmed by other methods | A15.7 Primary respiratory tuberculosis |
Specificity in ICD-9 and not in ICD-10:
Legal Status and completeness of procedure
ICD-9 contains | ICD-10 contains |
635.50 Legally induced abortion, complicated by shock, unspecified; 635.51 Legally induced abortion, complicated by shock, incomplete; 635.52 Legally induced abortion, complicated by shock, complete; 636.50 Illegal abortion, complicated by shock, unspecified; 636.51 Illegal abortion, complicated by shock, incomplete; 636.52 Illegal abortion, complicated by shock, complete | O04.81 Shock following (induced) termination of pregnancy |
Diagnosis Codes in Combination
One ICD-9 or ICD-10 code can contain more than one diagnosis. For purposes of mapping, such codes are called combination codes. For example, a combination code can consist of a chronic condition with a current acute manifestation, as in ICD-9 code 250.21 Diabetes with hyperosmolarity, type I (juvenile type), not stated as uncontrolled. Or a combination code can consist of two acute conditions found together, as in ICD-10 code R65.21 Severe sepsis with septic shock. Or a combination code can consist of an acute condition and its external cause, as in ICD-10 code T58.01 Toxic effect of carbon monoxide from motor vehicle exhaust, accidental (unintentional).
If a combination code in one code set has a corresponding combination code in the other code set, then the two entries are linked in the usual way. It is only when a combination code in one set is broken into discrete diagnosis codes in the other set that another method of mapping is needed.
Mapping in cases where a combination code in one set corresponds to two or more discrete diagnosis codes in the other set requires that the combination code be linked as a unit to two or more codes in the other code set. Each discrete diagnosis code is a partial expression of the information contained in the combination code and must be linked together as one GEM entry to fully describe the same conditions specified in the combination code. Entries of this type are linked using a special mapping flag that indicates the allowable A+B+C choices.
ICD-9 to ICD-10 mapping, combination entry:
Histoplasma duboisii meningitis
ICD-9-CM Source | to | ICD-10-CM Target |
115.11 Histoplasma duboisii meningitis | ≈ | B39.5 Histoplasmosis duboisii AND G02 Meningitis in other infectious and parasitic diseases classified elsewhere |
ICD-10 to ICD-9 mapping, combination entry:
Atherosclerosis of autologous vein coronary artery bypass graft(s) with unstable angina pectoris
ICD-10-CM Source | to | ICD-9-CM Target |
I25.710 Atherosclerosis of autologous vein coronary artery bypass graft(s) with unstable angina pectoris | ≈ | 414.02 Coronary atherosclerosis of autologous vein bypass graft AND 411.1 Intermediate coronary syndrome |
Introduction to the GEMs
The ICD-10 and ICD-9 GEMs are used to facilitate linking between the diagnosis codes in ICD-9 and the new ICD-10 code set. The GEMs are the raw material from which providers, health information vendors and payers can derive specific applied mappings to meet their needs. This is covered in more detail in section 2.
The ICD-9 to ICD-10 GEM contains an entry for every ICD-9 code. Not all ICD-10 codes are contained in the ICD-9 to ICD-10 GEM; the ICD-9 to ICD-10 GEM contains only those ICD-10 codes which are plausible translations of the ICD-9 codes. As with a bi-directional translation dictionary, the translations given are based on the code looked up, called the source system code.
The ICD-9 to ICD-10 GEM can be used to migrate ICD-9 historical data to an ICD-10 based representation for comparable longitudinal analysis between ICD-9 coded data and ICD-10 coded data. It can be used to create ICD-10 based test records from a repository of ICD-9 based test records. The ICD-9 to ICD-10 GEM can also be used for general reference.
The ICD-10 to ICD-9 GEM contains an entry for every ICD-10 code. Not all ICD-9 codes are contained in the ICD-10 to ICD-9 GEM; the ICD-10 to ICD-9 GEM contains only those ICD-9 codes which are plausible translations of the ICD-10 codes. The translations given are based on the ICD-10 code looked up, the source system code in the ICD-10 to ICD-9 GEM.
The ICD-10 to ICD-9 GEM can be used to convert ICD-9 based systems or applications to ICD-10 based applications, or create one-to-one backwards mappings (also known as a crosswalk) from incoming ICD-10 based records to ICD-9 based legacy systems. This is accomplished by using the ICD-10 to ICD-9 GEM, but looking up the target system code (ICD-9) to see all the source system possibilities (ICD-10). This is called reverse lookup. For more information on converting ICD-9 based systems and applications to ICD-10, see the MS-DRG conversion project report at: Link.
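The reverse lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular tool: the code pairs are assumed for the example rather than read from an actual GEM file, and the function name `build_reverse_index` is hypothetical.

```python
from collections import defaultdict

# Illustrative (source ICD-10, target ICD-9) pairs. These are assumed for
# this sketch and were not read from an actual GEM file.
icd10_to_icd9_rows = [
    ("R31.1", "599.72"),
    ("R31.2", "599.72"),
    ("A02.21", "003.21"),
]


def build_reverse_index(rows):
    """Index GEM rows by target code so sources can be found from a target."""
    index = defaultdict(list)
    for source, target in rows:
        if source not in index[target]:
            index[target].append(source)
    return dict(index)


reverse_index = build_reverse_index(icd10_to_icd9_rows)

# Look up the target system code (ICD-9) to see the source system
# possibilities (ICD-10).
print(reverse_index["599.72"])
```

The index is built once and can then be queried for any target code; a code that never appears as a target is simply absent from the index.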
The word “crosswalk” is often used to refer to mappings between annual code updates of ICD-9. Crosswalk carries with it a comfortable image: clean white lines mark the boundary on either side; the way across the street is the same in either direction; a traffic signal, or perhaps even a crossing guard, aids you from one side to the other. Please be advised: GEMs are not crosswalks. They are reference mappings, to help the user navigate the complexity of translating meaning from one code set to the other. They are tools to help the user understand, analyze, and make distinctions that manage the complexity, and to derive their own applied mappings if that is the goal. The GEMs are more complex than a simple one-to-one crosswalk, but ultimately more useful. They reflect the relative complexity of the code sets clearly so that it can be managed effectively, rather than masking it in an oversimplified way.
One entry in a GEM identifies relationships between one code in the source system and its possible equivalents in the target system. If a mapping is described as having a direction, the source is the code one is mapping from, and the target is the code being mapped to.
Source | Target | a.k.a. |
From ICD-9-CM | To ICD-10-CM | “forward mapping” |
From ICD-10-CM | To ICD-9-CM | “backward mapping” |
The correspondence between codes in the source and target systems is approximate in most cases. As with translating between languages, translating between coding systems does not necessarily yield an exact match. Context is everything, and the specific purpose of an applied mapping must be identified before the most appropriate option can be selected.
The GEMs together provide a general (many to many) reference mapping that can be refined to fit the requirements of an applied mapping. For a particular code entry, a GEM may contain several possible translations, each on a separate row. The code in the source system is listed on a new row as many times as there are alternatives in the target system. Each correspondence is formatted as a code pair. The user must choose from among the alternatives a single code in the target system if a one-to-one mapping is desired.
The word “entry,” as used to describe the format of a GEM, refers to all rows in a GEM file having the same first listed code, the code in the source system. The word “row” refers to a single line in the file, containing a code pair—one code from the source system and one code from the target system—along with its associated attributes. An entry typically encompasses multiple rows.
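The distinction between rows and entries can be made concrete with a short Python sketch. It assumes each row has already been reduced to a (source, target) code pair, and the helper name `group_into_entries` is hypothetical.

```python
from collections import defaultdict

# (source, target) code pairs drawn from the correspondences shown in this
# document; attributes such as flags are omitted for brevity.
rows = [
    ("599.72", "R31.1"),
    ("599.72", "R31.2"),
    ("003.21", "A02.21"),
]


def group_into_entries(rows):
    """Collect all rows sharing the same source code into one entry."""
    entries = defaultdict(list)
    for source, target in rows:
        entries[source].append(target)
    return dict(entries)


entries = group_into_entries(rows)
for source, targets in entries.items():
    print(source, "->", targets)
```

Here three rows yield two entries: the entry for 599.72 spans two rows, while the entry for 003.21 consists of a single row.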
There are two basic types of entries in the GEM. They are “single entry” and “combination entry.” In special cases, a code in the source system may be mapped using both types of entries.
- Single entry — an entry in a GEM for which a code in the source system linked to one code option in the target system is a valid entry
An entry of the single type is characterized by a single correspondence: code A in the source system corresponds to code A or code B or code C in the target system. Each row in the entry can be one of several valid correspondences, and each is an option for a “one to one” applied mapping. An entry may consist of one row, if there is a close correspondence between the two codes in the code pair.
An entry of the single type is not the same as a one-to-one mapping. A code in the source system may be used multiple times in a GEM, each time linked to a different code in the target system. This is because a GEM contains alternatives from which the appropriate applied mapping can be selected. Taken together, all rows containing the same source system code linked to single code alternatives are considered one entry of the single type.
Here is an entry of the single type, consisting of two rows. The rows can be thought of as rows A or B. Each row of the entry is considered a valid applied mapping option if a one-to-one mapping is desired.
ICD-9 to ICD-10 GEM:
Single type entry for ICD-9-CM code 599.72
ICD-9-CM Source | to | ICD-10-CM Target |
599.72 Microscopic hematuria | ≈ | R31.1 Benign essential microscopic hematuria |
599.72 Microscopic hematuria | ≈ | R31.2 Other microscopic hematuria |
Because ICD-10 codes are for the most part more specific than ICD-9 codes, an entry of the single type in the ICD-9 to ICD-10 GEM is typically linked to multiple ICD-10 codes. The user must know, or must model, the level of detail contained in the original medical record to be able to choose one of the ICD-10 codes. The ICD-9 code itself cannot contain the answer; it cannot be made to describe detail it does not have. The same is occasionally true for the ICD-10 to ICD-9 GEM as well. An ICD-10 code may be linked to more than one ICD-9 code because detail in ICD-9 was purposely left out of ICD-10, as discussed earlier.
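One way to derive a one-to-one applied mapping from an entry of the single type is sketched below in Python. The `preferred` table is a hypothetical, user-supplied record of which alternative the application has settled on; the choice of R31.2 is arbitrary and shown only for illustration, since the GEM itself does not say which alternative is correct.

```python
# Single type entry for 599.72, as shown in the example above.
single_entries = {"599.72": ["R31.1", "R31.2"]}


def apply_one_to_one(entries, preferred):
    """Reduce each single-type entry to exactly one target code.

    `preferred` is a hypothetical lookup supplied by the user; the GEM
    only lists the alternatives and does not make this choice.
    """
    applied = {}
    for source, targets in entries.items():
        if len(targets) == 1:
            applied[source] = targets[0]
        elif preferred.get(source) in targets:
            applied[source] = preferred[source]
        else:
            raise ValueError(f"no valid choice recorded for {source}: {targets}")
    return applied


# Arbitrary illustrative choice, not a coding recommendation.
applied = apply_one_to_one(single_entries, {"599.72": "R31.2"})
print(applied)
```

Raising an error when no choice has been recorded, rather than silently taking the first row, keeps the decision visible to the person building the applied mapping.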
Both ICD-9 and ICD-10 contain what we refer to as “combination codes.” These are codes that contain more than one diagnosis in the code description. An example is ICD-10 code R65.21 Severe sepsis with septic shock. In this case, ICD-9 does not have an equivalent combination code, so in order to link the ICD-10 code to its ICD-9 equivalent, a combination entry must be used in the GEM.
- Combination entry—an entry in a GEM for which a code in the source system must be linked to more than one code option in the target system to be a valid entry
Stated another way, it takes more than one code in the target system to satisfy all of the meaning contained in one code in the source system. As discussed in this section, the situation occurs both when ICD-9 is the source system and when ICD-10 is the source system.
Here is an entry of the combination type, consisting of two rows in the format of a GEM file. The rows can be thought of as rows A and B. The rows of the entry combined are considered one complete translation.
ICD-10 to ICD-9 GEM:
Combination type entry for ICD-10-CM code R65.21
ICD-10-CM Source | to | ICD-9-CM Target |
R65.21 Severe sepsis with septic shock | ≈ | 995.92 Severe sepsis AND 785.52 Septic shock |
Linking a code in the source system to a combination of codes in the target system is accomplished by using conventions in the GEMs called scenarios and choice lists.
- Scenario — in a combination entry, a collection of codes from the target system containing the necessary codes that combined as directed will satisfy the equivalent meaning of a code in the source system
- Choice list—in a combination entry, a list of one or more codes in the target system from which one code must be chosen to satisfy the equivalent meaning of a code in the source system
R6521 99592 101 1 1
R6521 78552 101 1 2
There are two rows in the ICD-10 to ICD-9 GEM for combination code R65.21. The entry is of the combination type, meaning that the two rows together (code R65.21 linked to both of the two ICD-9 codes) are considered one valid entry. The combination flag is the third flag attribute in a GEM file row. The scenario number is 1, because there is only one variation of the diagnoses specified in the combination code. There are two choice lists in this entry, and only one code in each choice list.
ICD-10 Code | ICD-10 Description | ICD-9 Code | ICD-9 Description | Approximate [FLAG] | No Map [FLAG] | Combination [FLAG] | Scenario | Choice list |
R65.21 | Severe sepsis with septic shock | 995.92 | Severe sepsis | 1 | 0 | 1 | 1 | 1 |
R65.21 | Severe sepsis with septic shock | 785.52 | Septic shock | 1 | 0 | 1 | 1 | 2 |
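The attributes in the table above can be read out of a raw row with a small parser. The following Python sketch assumes a whitespace-separated layout of source code, target code, and five flag digits (approximate, no map, combination, scenario, choice list), matching the R65.21 rows shown in this section; the names `GemRow` and `parse_gem_row` are hypothetical.

```python
from typing import NamedTuple


class GemRow(NamedTuple):
    source: str        # source system code, without the decimal point
    target: str        # target system code, without the decimal point
    approximate: bool  # Approximate flag
    no_map: bool       # No Map flag
    combination: bool  # Combination flag
    scenario: int      # scenario number within the entry
    choice_list: int   # choice list number within the scenario


def parse_gem_row(line: str) -> GemRow:
    """Parse one row, assuming 'SOURCE TARGET FLAGS' with five flag digits."""
    parts = line.split()
    if len(parts) < 3:
        raise ValueError(f"expected source, target and flags, got {line!r}")
    source, target = parts[0], parts[1]
    # Join the remaining tokens so '101 1 1' and '10111' parse identically.
    flags = "".join(parts[2:])
    if len(flags) != 5 or not flags.isdigit():
        raise ValueError(f"expected five flag digits, got {flags!r}")
    return GemRow(
        source,
        target,
        flags[0] == "1",
        flags[1] == "1",
        flags[2] == "1",
        int(flags[3]),
        int(flags[4]),
    )


row = parse_gem_row("R6521 78552 101 1 2")
print(row)
```

Codes are kept as strings without the decimal point, as they appear in the raw rows, so that leading zeros in ICD-9 codes are not lost.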
It is important to make the distinction between a single row in a combination entry and an entry of the single type. An entry of the single type is one code in the source system linked to multiple one-code alternatives in the target system. It presents the option of linking one code in the source system to code A or B or C in the target system. Each code correspondence is considered a viable option. Each row of the source system code entry linked with target code A or B or C is one valid entry in an applied map.
An entry of the combination type is one code in the source system linked to a multiple-code alternative in the target system. If the source system is ICD-10, for example, the user must include ICD-9 codes A and B and C in order to cover all the diagnoses identified in the ICD-10 code. Further, there may be more than one multiple-code alternative. If a GEM contains a range of ICD-9 code alternatives for each partial expression of the ICD-10 code, then the number of solutions increases. Each instance of the ICD-10 combination code paired with one code of the allowed range A and one code of the allowed range B and one code of the allowed range C is sometimes referred to as a “cluster,” and is considered a valid entry. The combination flag in a GEM will clearly signal an entry of the combination type.
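The way clusters multiply when a choice list offers several alternatives can be illustrated with a Cartesian product. The Python sketch below assumes rows have been reduced to (target, scenario, choice list) triples; the function name `clusters` is hypothetical, and the second data set uses made-up placeholder codes purely to show the expansion.

```python
from collections import defaultdict
from itertools import product

# (target code, scenario, choice list) for the R65.21 combination entry,
# taken from the table in this section.
r6521_rows = [("995.92", 1, 1), ("785.52", 1, 2)]


def clusters(rows):
    """Yield every cluster: one code from each choice list of a scenario."""
    scenarios = defaultdict(lambda: defaultdict(list))
    for target, scenario, choice_list in rows:
        scenarios[scenario][choice_list].append(target)
    for scenario in sorted(scenarios):
        choice_lists = scenarios[scenario]
        ordered = [choice_lists[n] for n in sorted(choice_lists)]
        # One code from list 1 AND one from list 2 AND so on.
        yield from product(*ordered)


print(list(clusters(r6521_rows)))

# Hypothetical entry with placeholder codes: two alternatives in choice
# list 1 and one code in choice list 2 give two clusters.
hypothetical_rows = [("A1", 1, 1), ("A2", 1, 1), ("B1", 1, 2)]
print(list(clusters(hypothetical_rows)))
```

Scenarios are kept separate, so codes from different scenarios are never combined into the same cluster.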
The two entry types and their main features are summarized in the table below.
Entry Type | Summary Description | Approximate [FLAG] | No Map [FLAG] | Combination [FLAG] | Scenario | Choice list |
Single | Source system code has one or more single target code alternatives | On or Off | N/A | Off | 0 | 0 |
Combination | Source system code has one or more multiple target code alternatives | On | N/A | On | 1-9 | 1-9 |
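The summary table can be turned into a simple consistency check on a row's attributes. The Python sketch below encodes only what the table states (single rows carry scenario 0 and choice list 0; combination rows have the approximate flag on and scenario and choice list values from 1 to 9); the function name `classify_row` is hypothetical.

```python
def classify_row(approximate: bool, combination: bool,
                 scenario: int, choice_list: int) -> str:
    """Return 'single' or 'combination', validating against the summary table."""
    if combination:
        if not approximate:
            raise ValueError("combination rows are expected to be approximate")
        if not (1 <= scenario <= 9 and 1 <= choice_list <= 9):
            raise ValueError("combination rows need scenario and choice list 1-9")
        return "combination"
    if scenario != 0 or choice_list != 0:
        raise ValueError("single rows are expected to use scenario 0, choice list 0")
    return "single"


# The second R65.21 row: approximate on, combination on, scenario 1, choice list 2.
print(classify_row(True, True, 1, 2))
# A single-type row with an exact correspondence.
print(classify_row(False, False, 0, 0))
```

Because one source code may be mapped using both entry types in special cases, the check is applied per row rather than per source code.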