Health and Demographic Surveillance System in the Western and Coastal Areas of Kenya: An Infrastructure for Epidemiologic Studies in Africa

Background: The Health and Demographic Surveillance System (HDSS) is a longitudinal data collection process that systematically and continuously monitors population dynamics for a specified population in a geographically defined area that lacks an effective system for registering demographic information and vital events.
Methods: HDSS programs have been run in 2 regions in Kenya: Mbita district in Nyanza Province and Kwale district in Coast Province. The 2 areas have different disease burdens and cultures. Vital events were recorded using personal digital assistants (PDAs) and global positioning system devices. Additional health-related surveys have been conducted bimonthly using various PDA-assisted survey software.
Results: The Mbita HDSS covers 55 929 individuals, and the Kwale HDSS covers 42 585 individuals. In the Mbita HDSS, the life expectancy was 61.0 years for females and 57.5 years for males. Under-5 mortality was 91.5 per 1000 live births, and infant mortality was 47.0 per 1000 live births. The total fertility rate was 3.7 per woman. Data from the Kwale HDSS were not available because it had been running for less than 1 year at the time of this report.
Conclusions: Our HDSS programs are based on a computer-assisted survey system that provides a rapid and flexible data collection platform in areas that lack an effective basic resident registration system. Although the HDSS areas are not representative of the entire country, they provide a base for several epidemiologic and social study programs, and for practical community support programs that seek to improve the health of the people in these areas.


INTRODUCTION
A Health and Demographic Surveillance System (HDSS) is a longitudinal data collection process that systematically and continuously monitors population dynamics in a specified population in a geographically defined area that lacks an effective system for registering demographic information and vital events. 1-3 The simplest HDSS consists of prospective data collection of vital events such as births, deaths, and migrations among the population, with periodic updates made via visits to all households in the defined area (Figure 1). A more advanced HDSS adds various surveys during follow-up periods, to assess other variables (such as health-related factors and socioeconomic factors), to investigate risks of diseases or health conditions, or to identify high-risk groups among communities in the area. The data collected by the HDSS can be used not only for descriptive and analytic epidemiologic studies, but also for community-based interventional studies like a cluster-randomized trial, 4,5 which allocates treatment arms randomly to groups of individuals referred to as "clusters," eg, a community, village, or area in the HDSS area. The HDSS can be used as a base or infrastructure to test a new methodology of disease control in an area with no civil registration system. 6,7

The concept of the HDSS is not new. 8 Since the early 20th century, a study concept has existed in which a community is prospectively and systematically observed to collect public health, epidemiologic, and demographic information. 8,9 This type of study was called a population laboratory or population observatory at that time, because it observed a whole community or population for the purpose of research. 10 In 1997, Garenne et al renamed this type of study a prospective community study. Initially, the term demographic surveillance system (DSS) was used to refer to a system for managing demographic data in a prospective community study program. 8 In 1998, some prospective community study groups gathered to share information on DSSs and organized a new association, 3 which was named the INDEPTH Network (International Network for the Demographic Evaluation of Populations and Their Health in Developing Countries). At that time, the term DSS replaced prospective community study. In 2009, INDEPTH added the word "Health" before DSS, and the term became Health and Demographic Surveillance System (HDSS). However, the main purpose of the HDSS remained the same, that is, observation of population dynamics in a specific geographic area to support epidemiologic and interventional studies. 11 The value of the HDSS as a stable and reliable source of health and demographic data has been increasing in areas and regions that lack data collection systems for vital statistics. 2,3,12,13

The Institute of Tropical Medicine at Nagasaki University is a research institute for tropical medicine and public health in low- and middle-income countries in Asia and Africa. It launched a new program in Kenya in 2005 that uses an HDSS as an infrastructure. The aim of this program was to conduct studies of infectious and parasitic diseases, other health conditions, and socioeconomic and environmental factors. 11,14 As of 2011, we have set up and implemented 2 HDSS programs in the western and coastal areas of Kenya, which are areas that have different disease burdens and cultures. In this article, we describe the profile of our HDSS programs in Kenya and share some basic findings from the HDSS dataset.

Study area
Our HDSS programs are run in 2 regions of Kenya, ie, the Mbita district in Nyanza Province and the Kwale district in Coast Province (Figure 2). In Mbita, the HDSS program (hereafter "Mbita HDSS") follows 11 182 households and 55 929 inhabitants as of 1 July 2011 in an area of 163.28 km² between latitudes 0°21′S and 0°32′S and longitudes 34°04′E and 34°24′E, which includes Rusinga West, Rusinga East, Gembe West, and Gembe East. In Kwale, the HDSS program (hereafter "Kwale HDSS") covers 7617 households and 42 585 inhabitants as of 1 July 2011 in an area of 384.9 km² between latitudes 4°17′S and 4°5′S and longitudes 39°15′E and 39°29′E, which includes 3 locations of the Kwale district: Mwaluphamba, Kinango South, and Golini. The Golini location (58.3 km²), which includes the eastern side of the Kwale-Kinango HDSS, was added in March 2011 to expand the area of coverage.
We selected these 2 HDSS areas with different cultural and environmental backgrounds to compare disease patterns and factors affecting health status. Additionally, we considered the following characteristics in the selection process: (1) suitable population density for an HDSS, (2) presence of diseases currently targeted by our researchers (malaria, schistosomiasis, filariasis, human immunodeficiency virus [HIV], tuberculosis, and diarrhea), (3) absence of other organizations operating in the study area, and (4) practicality of establishing a field station. After considering the above characteristics and the activities to date of researchers from Nagasaki University, 2 local areas in Mbita and Kwale were selected as HDSS sites. Malaria, schistosomiasis, filariasis, HIV, and tuberculosis are endemic in the region around the Mbita and Kwale HDSS areas. 15 Also, the areas have different cultural backgrounds, including lifestyle, environment, and religion. In the Mbita HDSS area, Christianity is the main religion, and people make their living from fishing in Lake Victoria and peasant farming. In contrast, in the Kwale HDSS area, the main religion is Islam, and the people are peasant farmers who rely on subsistence agriculture for food.

Figure 1. Diagram labels: census; dynamic cohort (updated through regular cycles); enter (birth, in-migration); exit (death, out-migration).

The field station in Mbita is located in the International Centre for Insect Physiology and Ecology (ICIPE) research compound. In Kwale, the field station is set up in the community resource centre of Kwale district. Both stations have electricity with a backup generator and internet connection and are equipped with a data server, personal computers, scanner, printers, and motorcycles to manage the HDSS.
All the individuals recognized by our field interviewers during the baseline census and follow-up rounds (described below) are registered under the HDSS programs, except for visitors to the areas.

Baseline census
Using software for personal digital assistants (PDAs) that we developed for this project, together with a global positioning system (GPS) receiver, our local field interviewers registered all households and all household members in the HDSS areas. A PDA is a small device resembling a computer that uses a pencil-like stick (stylus) instead of a keyboard to enter data. In the Mbita HDSS, we conducted the baseline census between August and December 2006 without GPS information; however, we re-registered the population, including the residents who had temporarily migrated out of the area, using the GPS between October and December 2008. In the Kwale HDSS, the baseline census was conducted between July 2010 and December 2010. From April 2011, the Kwale HDSS was expanded to the Golini location, a location adjacent to the original area, and the baseline survey finished in July 2011.

Routine follow-up at fixed intervals
During routine follow-up rounds, vital events such as in-migration, out-migration, pregnancy, death, and delivery (including the number of newborns) are recorded using the PDA. Data are updated at 1- to 2-month intervals. Because all records have already been stored in a Structured Query Language (SQL) database on the PDA, the field interviewers only register or update the events according to the PDA program instructions. The HDSS program updates the registered information on households at least once a year. When new families or residents migrate into the HDSS area or babies are born, they are registered in the system.

Extra-survey program (pop-up program)
During every follow-up round, we add different types of questions to the routine HDSS event update; the new questions appear (pop up) on the PDA as part of the routine updating program (thus we call it a "pop-up program"). For example, we added surveys on water sources, school attendance, bednet usage, health-seeking behavior, immunization history, breastfeeding and child care, and dental hygiene, among others. The pop-up program also provides a sampling function for random sampling surveys, such as sampling 10% of all residents in the HDSS area or using selected surveys only for children younger than 5 years.
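The sampling function of the pop-up program could look roughly like the following Python sketch. It is illustrative only: the PDA software itself was written for the .NET platform, and the function name, record fields ("id", "age_years"), and the use of simple random sampling without replacement are assumptions made here, not details taken from the program.

```python
import random


def select_for_survey(residents, fraction=None, max_age_years=None, seed=None):
    """Return the residents who should be shown a pop-up survey.

    residents: list of dicts with the hypothetical keys "id" and "age_years".
    fraction: if given, draw a simple random sample of this share.
    max_age_years: if given, keep only residents younger than this age.
    seed: optional seed so that a sample can be reproduced.
    """
    # Apply the age restriction first (eg, children younger than 5 years).
    eligible = [
        r for r in residents
        if max_age_years is None or r["age_years"] < max_age_years
    ]
    if fraction is None:
        return eligible
    # Sample without replacement from the eligible residents.
    rng = random.Random(seed)
    k = round(len(eligible) * fraction)
    return rng.sample(eligible, k)
```

For instance, `select_for_survey(residents, fraction=0.10)` would pick 10% of all registered residents, and `select_for_survey(residents, max_age_years=5)` would restrict a survey to children younger than 5 years.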

Verbal autopsy
The verbal autopsy (VA) is a process that attempts to determine the cause of death in areas with incomplete or no vital registration systems. In many less-developed areas, most deaths occur at home without a diagnosis, which makes health planning, priority setting, monitoring, and evaluation difficult. 16,17 Field interviewer managers, who are the first level of supervisors, visit houses where deaths were registered by field interviewers, and conduct a VA using VA forms. After the VA forms are completed, they are given to physicians or clinical officers, who read the history of illness and assign a cause of death.
System for data collection

System development for PDA devices and GPS
As part of the HDSS registration system, we developed a software program for the PDA that enables us to conduct paperless registration and follow-up for households and individuals in the HDSS areas. By using a paperless system, we save the paper normally used for registration and follow-up data, do not need space to store filled-in forms, reduce the personnel costs of entering data from registration and follow-up forms, and save time in entering and analyzing data. Furthermore, the PDA program assists field interviewers with surveys and data entry by giving instructions and performing on-site validation checks of entered data. It also generates house and individual identification numbers (IDs) for new registrants, so that no person needs to assign and re-register IDs, as would be required in a paper-based registration system. Our PDA system was developed using the Microsoft .NET Framework and SQL Server Compact 3.5. The database consists of 15 tables: 10 tables for registration and follow-up and 5 tables for administration and management of the PDA program.

Grid geographical address system
In both HDSS areas, not all houses have addresses to show dwelling places, which means we need a system to identify house structures in our HDSS program. To provide a house identification that enables us to recognize geographic areas and locations easily, we developed a grid geographical address system that uses the longitude and latitude given by the GPS mounted on the PDA. As shown in Figure 3, the HDSS areas were divided into uniquely numbered squares (grids) with 700-meter sides. Each 700 m × 700 m grid is further subdivided into 49 square subgrids with sides measuring 100 meters. Each subgrid has a number from 1 to 49. Furthermore, we assign a sequential number to each house structure within the same subgrid. As a result, each house structure has an ID number, eg, 357-32-12, and its geographical position is automatically recorded in the database at registration using the longitude and latitude obtained by the GPS receiver.
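The grid addressing described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PDA implementation: the north-west origin corner, the row-by-row numbering of grids, the `grids_per_row` parameter, the flat-earth conversion from degrees to meters, and all function names are hypothetical. Only the 700-m and 100-m side lengths, the 1 to 49 subgrid numbering from upper left to lower right, and the ID format come from the text and Figure 3.

```python
import math

GRID_SIDE_M = 700       # side of a grid square (from the text)
SUBGRID_SIDE_M = 100    # side of a subgrid square (from the text)
SUBGRIDS_PER_ROW = GRID_SIDE_M // SUBGRID_SIDE_M  # 7 per row, 49 per grid
METERS_PER_DEGREE = 111_320  # approximate length of 1 degree of latitude


def offsets_from_origin(lat, lon, origin_lat, origin_lon):
    """Approximate meters east and south of an assumed north-west origin."""
    east = (lon - origin_lon) * METERS_PER_DEGREE * math.cos(math.radians(origin_lat))
    south = (origin_lat - lat) * METERS_PER_DEGREE
    return east, south


def grid_address(east_m, south_m, grids_per_row):
    """Return (grid, subgrid) for a point given in meters from the origin.

    Grid numbering (row by row, starting at 1) is an assumption. Subgrids
    run from 1 (upper left) to 49 (lower right), as in the Figure 3 caption.
    """
    if east_m < 0 or south_m < 0:
        raise ValueError("point lies outside the gridded area")
    col, x_in = divmod(east_m, GRID_SIDE_M)
    row, y_in = divmod(south_m, GRID_SIDE_M)
    if col >= grids_per_row:
        raise ValueError("point lies outside the gridded area")
    grid = int(row) * grids_per_row + int(col) + 1
    subgrid = (int(y_in // SUBGRID_SIDE_M) * SUBGRIDS_PER_ROW
               + int(x_in // SUBGRID_SIDE_M) + 1)
    return grid, subgrid


def house_id(grid, subgrid, sequence):
    """Format an ID such as 357-32-12 (grid-subgrid-sequence)."""
    return f"{grid}-{subgrid}-{sequence}"
```

With an assumed layout of 30 grids per row, a point 18 550 m east and 8150 m south of the origin falls in grid 357, subgrid 32, so the twelfth structure registered there would receive the ID 357-32-12.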
Quality control and quality assurance
An essential issue in a data collection system like the HDSS, which deploys field workers in wide areas, is control of data quality. We have implemented several systems to prevent errors and misconduct by field interviewers, such as intentionally or unintentionally skipping questions and entering or updating data without actually visiting houses. To avoid such behaviors, the PDA program checks for missing and incompatible data on the spot, during the interview. In addition, when the field interviewer starts updating data, the program calculates the distance between the place where the field interviewer is standing and the location where the house is registered. The data updating program does not start if the field interviewer is standing more than 20 meters away from the target house. We are also able to monitor the working hours of field interviewers using the time recorded in the PDA database whenever a field interviewer updates any data. We also validate data collected by the field interviewers through random spot checks, which are done by second-level and third-level managers.
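A distance check of this kind can be sketched in Python as shown below. The text does not state which distance formula the PDA program uses; the haversine great-circle formula is assumed here for illustration, and the function names are hypothetical. Only the 20-meter threshold comes from the text.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters
MAX_DISTANCE_M = 20         # threshold stated in the text


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between 2 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def may_start_update(interviewer_pos, house_pos, max_distance_m=MAX_DISTANCE_M):
    """Allow the update only if the interviewer is within the threshold.

    Both positions are (latitude, longitude) tuples in decimal degrees.
    """
    return haversine_m(*interviewer_pos, *house_pos) <= max_distance_m
```

In practice the usable threshold also depends on the accuracy of the GPS receiver, which this sketch does not model.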

Organizing HDSS management
To run our HDSS programs, we assign a field manager for each HDSS site to manage the overall program in the area. Under the manager, 4 field-interviewer managers (FIMs) are deployed to manage routine data collection conducted by field interviewers (FIs).

Figure 3. Grid geographical address system (GGAS). Each grid is a square (700 m × 700 m) that is divided into 49 subgrids of 100 m × 100 m (eg, left square). The subgrids are numbered from 1 to 49, from the left upper edge to the right lower edge. Each residential structure is numbered using the grid, subgrid, and sequence number within the grid-subgrid area, which is written on the door of the structure. The geographical position of each house ID is automatically recorded in the database at registration, using the longitude and latitude obtained by the GPS receiver.
Then, a 90-minute examination was given. Candidates with higher examination scores were then interviewed as part of further selection. After selection, training for the HDSS job was given to FIMs for only 3 weeks (1 week for lectures, 1 week for basic PDA training, and 1 week for field training). Then, trained FIMs gave the same training to FIs, under our supervision. We recruited local individuals for FI and FIM positions because they understand the local language and communities. Currently, 16 FIs in Mbita and 10 FIs in Kwale are working in their assigned areas to register and update information on vital events and to complete additional surveys provided by the PDA pop-up program. FIMs visit areas twice a week to replace PDA batteries and collect data from the PDAs. Collected data are transferred to an SQL database (MySQL 5.0) on the local data server to monitor and evaluate the progress of data collection for the HDSS program in each round at the local level.
Data transfer from the field station
After PDA data are transferred to the local server in the field station, they are sent via the internet, using a secure channel, to a Nairobi server that is used to manage both HDSS programs. Internet service in Kenya is not always stable. To address this instability, we accumulate daily datasets and send any remaining datasets when the internet becomes available. To avoid traffic congestion on the internet during the day, which might result in transfer errors, we send data at night, when internet traffic is relatively light. The transferred PDA data are automatically stored in the SQL database on the server; auto-analysis programs then monitor the progress of the HDSS follow-up rounds, the working hours of field interviewers, and new deaths for VA, among other functions. For data management and the auto-analyses, STATA version 10 (Stata Corporation, TX, USA) is used, with data loaded directly from the SQL database by the STATA ODBC command.
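The accumulate-and-retry transfer described above can be sketched in Python as follows. This is an assumption-laden illustration rather than the actual transfer program: the night window (22:00 to 05:00), the `send` callable, and the function names are hypothetical, and the real system's secure channel is not modeled.

```python
from datetime import time

# Hypothetical overnight transfer window; the text says only "at night".
NIGHT_START = time(22, 0)
NIGHT_END = time(5, 0)


def in_night_window(now):
    """True when `now` (a datetime.time) falls in the overnight window."""
    return now >= NIGHT_START or now < NIGHT_END


def flush_pending(pending, send, now):
    """Try to send queued daily datasets; return those still waiting.

    pending: list of datasets in the order they were accumulated.
    send: hypothetical callable that returns True on a successful transfer.
    now: current time of day as a datetime.time.
    """
    if not in_night_window(now):
        return list(pending)  # daytime: keep accumulating
    for i, dataset in enumerate(pending):
        try:
            ok = send(dataset)
        except OSError:
            ok = False  # treat a network error as a failed transfer
        if not ok:
            # Keep this dataset and all later ones, in order, for the next try.
            return list(pending[i:])
    return []
```

Stopping at the first failure keeps the datasets in chronological order, so the central server never receives a later day before an earlier one; whether the real system enforces such ordering is not stated in the text.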

Ethical considerations
The head of the household must give informed consent for HDSS registration and follow-up. To maintain a good relationship between our HDSS program and communities, we have routine meetings for community sensitization, which improve consent rates for participation in the registration. For households that reject registration at the first visit by an FIM, our field manager and village elders visit the household to explain our program and persuade them to join our HDSS program. Most such cases are households that have recently migrated into the HDSS area, and they usually agree to be registered and followed after they understand the program.

RESULTS
Only basic statistics are shown in this paper because much of the data are currently being analyzed or are still being organized for analysis. Figure 4 shows the population pyramids of the 2 HDSS areas. Populations in the HDSS programs are usually calculated using person-years, which are based on the person-time contributed by each registered resident. In calculating person-time in our program, we defined residents as those who stayed in the HDSS area for at least 60 days (2 months) a year; thus, registered individuals who stayed less than 60 days a year were excluded from the calculation of total person-years.

Mortality rates by age group in 2010 are shown in Table 2. Infant mortality for girls (58.3/1000 person-years) was slightly higher than for boys (39.8/1000 person-years). Mortality rates for adults aged 20 to 29 years, which is child-bearing age in this area, were higher among women than among men. We counted 31 deaths in 4500 person-years among women aged 20 to 29 years as compared with 13 deaths in 3523.2 person-years among men of the same age. From age 30 years, mortality rates among men were higher than those among women in this area.

Table 3 shows house structures and house properties that were registered in the Mbita HDSS and Kwale HDSS after updating data and basic registration for new households. These data can be used as variables of socioeconomic status in epidemiologic studies. In brief, most households in Mbita (89%) used a lake as their main source of drinking water, and the proportion of households with toilets was lower than 40%. In Kwale, in contrast, piped water was the main source of drinking water in 51.6% of households, although the proportion of households with toilets was almost the same as that for Mbita.
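The person-year figures quoted above for adults aged 20 to 29 years can be turned into rates with a short Python sketch. The rate formula is standard; the conversion of days to person-years by dividing by 365.25 and the function names are assumptions made for illustration, because the text specifies only the 60-day residency threshold.

```python
MIN_RESIDENT_DAYS = 60  # residency threshold stated in the text
DAYS_PER_YEAR = 365.25  # assumed conversion factor


def person_years(days_in_area):
    """Sum person-years, excluding people below the residency threshold.

    days_in_area: iterable with the number of days each registered
    individual stayed in the HDSS area during the year.
    """
    return sum(d for d in days_in_area if d >= MIN_RESIDENT_DAYS) / DAYS_PER_YEAR


def rate_per_1000(deaths, pyears):
    """Mortality rate per 1000 person-years."""
    return 1000 * deaths / pyears


women = rate_per_1000(31, 4500)    # about 6.9 per 1000 person-years
men = rate_per_1000(13, 3523.2)    # about 3.7 per 1000 person-years
```

On these counts, mortality among women aged 20 to 29 years was a little under twice that among men of the same age.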

DISCUSSION
The presence of an HDSS in an area with no proper residential and vital registration system provides basic statistical infrastructure for demographics and vital statistics without any additional investment by researchers, local officials, or communities. As a result, in addition to our HDSS program, we have several research programs and grassroots projects that support communities in the HDSS area, as shown in the Appendix. 19,20 Although HDSS programs have advantages, most HDSS programs in other areas require additional time for entering data from paper-based record forms into databases and for data cleansing. This process can take several months and can greatly prolong the period between data collection and analysis. Our paperless HDSS program, which uses a PDA, GPS, and the internet, enables us to analyze data on the day they are collected. Moreover, this system provides on-the-spot logical checks of data entered by field interviewers, which also saves time and reduces human error and the need for additional resources. It also permits checking of when and where data were updated for each household, thereby ensuring data quality.

Among HDSS centers participating in the INDEPTH Network, one goal is to share information more widely to provide common data to analyze global health problems. 21,22 The 2 HDSS programs detailed here will be able to provide information to other centers and contribute information to the international community. Furthermore, our HDSS program has additional educational value for young scientists. It provides good training and research opportunities for undergraduate and graduate students. Moreover, we have developed several technological improvements to better manage the HDSS and its data collection.
Another advantage is the ability to use the HDSS system in other areas. Our system, originally developed for the Mbita HDSS program, was easily transplanted to the Kwale HDSS. Outside Kenya, our system was translated from English to Laotian and transplanted in 2010 to the Lahanam HDSS program in the Lao People's Democratic Republic, where it is operated collaboratively by the National Institute of Public Health, Laos, and the Research Institute for Humanity and Nature, Kyoto, Japan. The Laotian HDSS program covers about 7500 inhabitants, but plans are underway to expand to surrounding areas.
Although an HDSS can yield a variety of demographic and health-related data, there has been some criticism of HDSS data for not being representative of the population and the area to which the data are extrapolated. 23 This loss of representativeness or generalizability is unavoidable and intrinsic to HDSS programs because these programs change the communities and areas by providing benefits from the HDSS program itself or from programs secondarily introduced to the communities in the HDSS area. Such programs make the areas different from areas without HDSS programs. For example, using analysis and data provided by the HDSS program, a district government can plan a community health strategy in an area. 24,25 Also, research teams can design a more detailed plan for health research because the basic statistical data are provided by the HDSS program, which improves community health as an outcome of research. 23,26-28 The potential loss of generalizability in HDSS programs has been discussed for some time. However, we believe that the numerous advantages of the current program outweigh the disadvantages because there is a lack of reliable information in the developing world. These advantages include the benefits of having precise, timely data and opportunities to systematically evaluate public health interventions, as well as the usefulness of HDSS data for understanding areas of the developing world where no information is available. 21 Additionally, with regard to the Millennium Development Goals, the research agenda is shifting toward large-scale, multicenter trials and accelerated efforts at disease control. 22,29

In summary, our HDSS programs provide a rapid and flexible data collection platform in less-developed areas that lack an effective basic resident registration system. 22 This enables us to conduct several epidemiologic and social study programs and provide practical programs for community support to sustain healthy lives in such areas.