Population for Validation

1.    Series needed in preprocessor. Population (POP). Age-sex, fertility and mortality distributions. The preprocessor fills holes in total fertility rate (TFR), life expectancy (LIFEXP), contraception use (CONTRUSE), literacy rate (LIT), calories per capita (CLPC), the percentage of malnourished children (MALNCHP), the labor force participation rate (LAPOPR), crude birth rate (CBR), and crude death rate (CDR). It also fills null values in migration (MIGRATE) with zeros, estimates annual rates from foreign born percentages, and normalizes immigration and emigration. (LABOR, ILLITERACY, URBAN, URBAN GR are present in the input file but are not used; dropped from historic data load.)

 

2.    Series needed in model initialization. Population (POP); population growth rate (POPR), crude birth rate (CBR), crude death rate (CDR); cohort-specific population (age-sex), mortality and fertility data. From these are computed the total fertility rate (TFR) and life expectancy (LIFEXP). Contraception use (CONTRUSE), literacy rate (LIT), income share of the lowest 20 percent (INCSHR).

 

3.    Country population (POP)

 

Used World Bank data from its 1999 Development Indicators CD because those data are very nearly complete. Plugged a few country series with CIA data (e.g. Taiwan). The only country then still missing was French Guiana; entered a guess for its 1960 value based on values from the 1990s.

 

4. Population growth related data

 

Used TFR (1962) from historic series (World Bank CD). Very few missing countries. Should estimate GDPPCP relationship for 1960 to fill holes in rebuilding of base.

 

Used LifExp (1962) from historic series (World Bank CD). Very few missing countries. Should estimate GDPPCP relationship for 1960 to fill holes in rebuilding of base.
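The GDPPCP-based hole filling suggested for TFR and LIFEXP might look roughly like the following. This is only a sketch: the log-linear form, the function names, and every number are assumptions for illustration, since the notes do not say which functional form would be estimated.

```python
import math

def fit_log_linear(pairs):
    """Ordinary least squares for y = a + b * ln(x) over (x, y) pairs."""
    xs = [math.log(x) for x, _ in pairs]
    ys = [y for _, y in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def fill_holes(series, gdppcp):
    """Return a copy of series with None values replaced by fitted estimates."""
    known = [(gdppcp[c], v) for c, v in series.items()
             if v is not None and c in gdppcp]
    a, b = fit_log_linear(known)
    filled = {}
    for country, value in series.items():
        if value is None and country in gdppcp:
            value = a + b * math.log(gdppcp[country])
        filled[country] = value
    return filled

# Made-up illustrative values, not data from the World Bank CD.
gdppcp_1960 = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 8.0}
tfr_1962 = {"A": 7.0, "B": 6.0, "C": 5.0, "D": None}
tfr_filled = fill_holes(tfr_1962, gdppcp_1960)
```

Country "D" has no TFR value and receives the estimate implied by the fit on the other three; countries with data are left untouched.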

 

Used CBR, CDR series from World Bank CD. Again used 1962 as 1960 because the Bank data for 1962 are much more complete than those for 1960.

 

Historic CBR and CDR, in conjunction with year 2000 age-sex, fertility, and mortality distributions, gave rise to a computed TFR in China for 1960 that was much lower than the historic value (4.4 vs. 7.5). This is because the age-sex pyramid from 2000 has much heavier representation in the child-bearing years than is appropriate for historic use. So we introduced the actual 1960 age-sex distributions from the UN 1998 revision (the only holes were Greenland, French Guiana, and Taiwan, which we plugged with distributions from Denmark, Gabon, and China, respectively). The 1998 revision does not have fertility or mortality distributions prior to 1990, but this will not be quite as important.
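The direction of this effect can be reproduced with a small calculation. The sketch below assumes TFR is obtained by scaling a relative age-specific fertility shape until the implied births match the births given by CBR; that is an assumption about the method, not the model's documented formula, and all figures are invented.

```python
def computed_tfr(cbr, pop, women_by_cohort, fert_shape, cohort_width=5):
    """TFR implied by a crude birth rate and an age distribution of women.

    cbr: births per 1,000 population
    pop: total population
    women_by_cohort: women in each child-bearing cohort (same units as pop)
    fert_shape: relative fertility rate for each of those cohorts
    """
    births = cbr * pop / 1000.0
    # Births the unscaled shape would produce with this many women.
    shape_births = sum(w * f for w, f in zip(women_by_cohort, fert_shape))
    scale = births / shape_births
    # TFR sums the age-specific rates over the years spent in each cohort.
    return cohort_width * scale * sum(fert_shape)

fert_shape = [0.5, 1.0, 1.0, 0.8, 0.5, 0.2, 0.1]  # cohorts 15-19 ... 45-49

# Same CBR and population; only the number of women of child-bearing age differs.
young_pyramid = [4.0, 3.5, 3.0, 2.5, 2.0, 1.5, 1.0]  # hypothetical 1960-like
older_pyramid = [w * 1.5 for w in young_pyramid]     # hypothetical 2000-like

tfr_young = computed_tfr(40.0, 100.0, young_pyramid, fert_shape)
tfr_older = computed_tfr(40.0, 100.0, older_pyramid, fert_shape)
```

With the same CBR, the pyramid holding 50 percent more women in the child-bearing cohorts yields a TFR lower by the same factor (about 4.6 instead of about 6.9 here), which is the kind of gap seen for China.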

 

5.    Population data added to historical data tables

 

Population, 1960-1997

Urban Population, 1960-98 (WDI)

Total fertility rate, 1960-1998 (WDI – irregular years, skipping those with little data)

Crude birth rate, 1960-1998 (many holes in early years)

Crude death rate, 1960-1998 (many holes in early years)

Life expectancy, 1962-1997 (many holes in early years)

Infant Mortality, 1960-1998 (WDI – irregular years, skipping those with little data)

Contraception use, 1975, 1980, 1985, 1990 (skimpy data, especially early; basically have taken very sparse data set based on surveys and clustered values around 5-year marks)

Labor force, 1960-98 (WDI)

Labor force portion female, 1970-95 (10-year intervals plus 1995)

Labor percentage in agriculture, 1970-90 (10-year intervals)

Labor percentage in industry, 1970-90 (10-year intervals)

Labor percentage in services, 1970-90 (10-year intervals)

Illiteracy percentage, 1963-95 (missing years)

Illiteracy percentage, female, 1985, 1990

Calories per capita, 1961-94

Malnourished children percentage, 1980, 1991, 1996 (WDI has low weight children with very sparse data set from 1970-98, but with some data even for U.S.; should clean this up; very similar data set for low height children)

Labor force, % foreign, 1988-97, few values (WDI)

Population, % foreign, 1988-97, few values (WDI)

Population, foreign, 1988-97, few values (WDI)

Population, foreign inflows, 1988-97, few values (WDI)

Population, asylum seekers, 1989-98, few values (WDI)

AIDS deaths in millions, 1999 – UN web site

HIV infection rates, 1997 and 1999 – 1997 from Population Reference Bureau annual population data sheet; 1999 from UN web site

 

6.    Population data added to historical data load for 1960-2000

 

Population (POP), 1960

Total fertility rate (TFR), 1960

Crude birth rate (CBR), 1960

Crude death rate (CDR), 1960

Life expectancy (LIFEXP), 1962

Infant Mortality (INFMOR), 1960

Contraception use (CONTRUSE), guesses based on data from 1975-1990

Literacy (LIT), 1963, calculated as 100% minus illiteracy, but missing US, UK, etc. (WDI 2000 CD has illiteracy data only back to 1970 and does not have literacy rates)

Calories per capita (CLPC), 1961

Malnourished children percentage (MALNCHP), zeroed out to let function compute for all

Migration rate (MIGRATE) – still have data from 1995 load

HIV infection rates and AIDS deaths. Zeroed out for 1960 data load. No data and low levels.

 

Have put in 1960 data for age-sex distributions

 

7.    Specialized global information

 

Age-specific patterns of AIDS deaths obviously vary by country, but there is inadequate information to allow country-specific data entries or functional estimation across countries. Instead, we decided to enter a global “prototype” of age-specificity, using data from the U.S. Census Bureau HIV/AIDS Surveillance Data Base, which draws on a vast number of studies of infections from around the world (www.census.gov/ipc/hivaidsw.html). We expected a priori that HIV infections and AIDS deaths would be concentrated in the sexually-active age cohorts and probably somewhat nearer the bottom of those than the top of them. UN data also indicate that only 4% of AIDS deaths are outside of the 15-49 year-old cohorts (although this seems rather low considering the age-specific data provided in the studies of the Surveillance Data Base; one study from Brazil indicated that the percentage of total infections across those tested was not much less in the entire 50-69 year-old category than in the much narrower 20-29 year-old cohort). Using data from studies done primarily in Brazil and Botswana (relying more on those in Brazil) we created a rough age distribution pattern and entered it into the PopCohortGlobal table. The studies suggest a more stable pattern of infection across the 15 to 49 year-olds than we anticipated. The data in the Surveillance Data Base suggested both limited gender-specific variation in age distributions of HIV infection for countries with high rates of infections and relatively balanced infection rates by gender in the most heavily afflicted countries.

 

[The UN database has enough data on gender-specific infections by country to allow us to do some estimation here. Have Nyema pull out the numbers and do it.]

 

It is similarly difficult to get age-specific patterns of migration by country. Again the US Census Bureau has a potentially useful and large data set, the International Data Base (www.census.gov/ipc/www/idbinst.html). Although other studies suggest that data from Sweden and Switzerland are particularly well-maintained relative to other countries, we found these data to be too limited to be of much use. More migration originates in less-developed countries, and data from Mexico and Pakistan were extensive and appeared relatively clean (with the concentration of migrants among economically-active age groups, as we expected a priori). We ultimately chose to use the data from Mexico in 1980 (latest year available in the data set) as the global prototype for the age-distribution of migrants. Those data (like the data from Pakistan) suggest that the 20-29 year-old cohorts are especially likely to migrate, and that migration drops sharply above 34 years of age. Men outnumber women among Mexican migrants by 1.27 to 1 and among Pakistani migrants (in 1973) by 2.59 to 1. We decided to use 2 to 1 as a rough global gender pattern.
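As an illustration of how a global prototype and the 2 to 1 gender pattern could be applied to a country's migrant total, here is a minimal sketch. The function name and the cohort weights are hypothetical placeholders; they are not the Mexico 1980 values and not the model's actual code.

```python
def allocate_migrants(total_migrants, age_shares, male_to_female=2.0):
    """Split a migrant total across age cohorts and sexes.

    age_shares: mapping of cohort label -> relative weight (need not sum to 1)
    male_to_female: males per female among migrants (2.0 = rough global pattern)
    """
    weight_sum = sum(age_shares.values())
    male_share = male_to_female / (male_to_female + 1.0)
    result = {}
    for cohort, weight in age_shares.items():
        cohort_total = total_migrants * weight / weight_sum
        result[cohort] = {
            "male": cohort_total * male_share,
            "female": cohort_total * (1.0 - male_share),
        }
    return result

# Placeholder weights peaking at ages 20-29 and dropping above 34.
prototype = {"15-19": 2.0, "20-24": 4.0, "25-29": 4.0, "30-34": 2.0, "35-39": 1.0}
migrants = allocate_migrants(1300.0, prototype)
```

Because the weights are normalized inside the function, the prototype can be stored as raw counts or as shares; the cohort totals always add back to the country's migrant total.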

 

The U.S. Census Bureau generously provided their Ultimate Life (survival rates) and Ultimate Fertility tables (we used the latter). Unfortunately, the former only extended to 80+, because the U.S. Census Bureau is still not representing 5-year cohorts through 100+, as does the United Nations. The UN did not respond to a request for their Ultimate Life table. Examination of data from their 1998 Population Revision showed that their forecast of survival rates by cohort for Japan in 2050 (with Japan having the highest life expectancy early in the new century) was remarkably similar through the 1975-79 cohort to the survivor tables from the Census Bureau. Therefore we decided to use the UN forecast for Japan in 2050 (through cohort 100+) as the Ultimate Life table for the world.

 

8.    Next on Population

 

a.    WDI 2000 has labor force data from 1960-98 (complete). Pull out and use to compute labor force participation rate. Try to make sure that participation rate is computed on base of 15-65 year olds (can also get from WDI 2000, but should have available in model given cohort structure).

 

b.    Need immigration data/modeling improvement to improve U.S. historic performance. WDI 2000 does not have immigration/migration data, but has data on foreign labor force as % of total labor force 1988-98 (few countries, with only 1 value for U.S.; mostly OECD); also on foreign population 1988-98 (mostly OECD, several values for U.S.) and for foreign population as % of total population. Most useful from WDI 2000 might be “inflows of foreign population” (1988-98) and “inflows of asylum seekers” (1989-98), mostly OECD.

 

c.     Work on historic fit
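
For item (a), the participation-rate computation on a working-age base could be sketched as below. The cohort labels, the function name, and the figures are assumptions for illustration; the model's own cohort structure would supply the real population values.

```python
def participation_rate(labor_force, pop_by_cohort, working_cohorts):
    """Labor force as a percentage of the working-age population."""
    working_age_pop = sum(pop_by_cohort[c] for c in working_cohorts)
    if working_age_pop <= 0:
        raise ValueError("working-age population must be positive")
    return 100.0 * labor_force / working_age_pop

# Five-year cohorts spanning ages 15 through 64; whether the base should also
# include age 65 depends on how "15-65" in the notes is read.
WORKING_COHORTS = [f"{lo}-{lo + 4}" for lo in range(15, 65, 5)]

pop = {c: 1.0 for c in WORKING_COHORTS}  # invented: 1 million per cohort
pop["0-4"] = 1.5                         # cohorts outside the base are ignored
pop["65-69"] = 0.5
lapopr = participation_rate(6.5, pop, WORKING_COHORTS)
```

Dividing by the working-age cohorts rather than by total population keeps the rate comparable across countries with very different age structures.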