Wednesday, 12 Aban 1389 (3 November 2010)

A Survey of the Spread of Common Cancers in the City of Tehran Using a Geographical Information System

I had announced that I was prepared to provide GIS (Geographical Information System) services, that is, information management services, voluntarily to the Society for the Support of Children Suffering from Cancer. The Center of Cancer Research, under the supervision of Dr. Moosavi Jarahi, then asked me to participate. I have therefore been cooperating voluntarily with this center for the past 5 years on a survey of the spread of common cancers in Tehran province using a geographical information system.
In general, a Geographical Information System (GIS), or location information system, is, as the name suggests, a system that edits and analyzes all the information about a specific location or geographic zone.
It is a computerized system for managing and analyzing geographic information, with the capability to gather, store, analyze and display it. The ultimate aim of a geographic information system is to support decisions that are made on the basis of geographic data, and its basic function is to derive information by combining different layers of data, using different procedures and from various viewpoints.
About the word epidemiology: it is the study of the conditions and factors behind the prevalence of diseases, or of any other factors related to health. The term was once used to mean "the science of the spread of infectious diseases". Today, with the progress of the sciences and the control of most epidemic diseases, its meaning has been extended to "the science of the factors and conditions of disease prevalence". "Epi" literally means upon, "demos" means people and "logos" means study. So the word literally means "the study of what comes upon people", and although the medical and health state of society is its main concern, it is closely connected to economics, sociology, culture, religion and so on. While clinical medicine is concerned mostly with the individual and his disease, epidemiology focuses mostly on groups and communities, and clinical epidemiology is concerned with both.
According to Mr. Bastani, the former head of Iran's codifying office, the method described in this research has been used here for the first time, and an internet search found no equivalent. The first reason could be that this kind of codifying is already used in developed countries: because those countries were industrialized nearly 200 years ago, geographic and descriptive maps of the different parts of their territory have been prepared and used ever since.
Second, development protects those societies from the spread of epidemic diseases, so they have had little reason to use this method or the other methods mentioned below. We found only one instance, a suggestion from Oxford University to study health GIS on the basis of zip codes. That article mentioned the same method and recommended it for health management, but it gave no details or specific supporting theory.
A month later I found another instance in the International Journal of Health Geographics: a 2003 study of heart failure patients in Calgary, Canada, which used zip codes to find the distribution of the disease in the city.

Methods and Materials
As a first step, the most important issue for The Cancer Institute was a methodology that would cover all of the existing information. In general, the information gathering system of The Center of Cancer relied on the information in patients' files (self-declared) at hospitals and medical centers, which was then entered on special patient forms by the staff and students of The Center of Cancer. The patient's address is one of the most important items on these forms, and it had several shortcomings. First, the addresses were not precise. A standard postal address has three parts: the last passageway, the penultimate passageway and the sector. Most people do not know this, so they do not follow the standard.
Second, most patients did not give any other address, such as where they work or spend most of their time. This was especially true of patients from other cities. Clues to the origin of a cancer may lie in the place where the patient works. All of these shortcomings were reported to The Center of Cancer Research, but because The Center had no administrative authority to require hospitals to register precise patient information, the existing data had to be accepted as valid.
Thus, during the first two years we worked on preparing a map of the distribution of the disease and of the resident population in the different zones, based on these addresses. To this end we had several sessions with the experts of The Center of Geographic Information of Tehran. The president of that center, Mr. Moini, supported us and we could use their data. The most important item for The Center of Cancer was the resident population of each zone, but The Center of Geographic Information had no suitable data on it. It should be said that many epidemiological studies have been done by The Center of Cancer and other centers, but for the reason just mentioned none of them is valid. Nonetheless, I will describe some of these efforts in the next part.
Characteristics of Zip Code
The custodian of information about the residents of each zone is the center of statistics, so we turned to that center, but unfortunately they had no map showing the distribution of the information they had gathered. We then contacted the central post office, because it had recently codified all zip codes in the country. They sent us to the Postal Geographic Codifying Center of The Country, and we then asked for a meeting with members of the board of the ministry of post, telegraph and telephone. The session was held and Mr. Hossain Abadi and Mr. Bastani attended. Its outcome was a thorough explanation of the properties of the zip code, together with all its documentation. Briefly, they are as follows.
The city of Tehran is divided into 8 postal centers. The digit 2 is not used, because in Farsi it can be mistaken for the digit 3, and zero is not used in codes because it resembles a dot (.). Every digit of the zip code has a specific meaning. For instance, all codes in the city of Tehran start with 1, codes in Khorasan province start with 5, and codes in the outskirts of Tehran start with 3. The 8 regions of the city of Tehran are postal zones 11 to 19, except 12. As we move from main roads to side streets, the codes become finer:
1- The country was codified by a Belgian contractor 35 years ago.
2- Every man-made structure receives a 10-digit code, even phone boxes.
3- Each code follows a progressive algorithm and each digit has one specific meaning.
4- The whole country is divided into 11000 postal zones or, in postal terminology, "postal patrols".
5- About 22 million 10-digit codes have been registered for places, of which 1 million are residential and about 2 million are nonresidential.
6- Inside the cities, the zones are divided according to population density and area; outside the cities, the division depends on roads and natural features.
7- The boundaries of zones in the cities are passageways and streets, so that no house lies between two zones. This method is used for all zones in the country.
8- The 22 regions of Tehran are covered by 1900 zones.
9- The first 6 digits of a zip code carry the geographic information.
10- Zip codes are never changed under any condition.
11- Each 10-digit zip code carries 29 fields of information, such as the type of use classified into 29 separate categories (residential, administrative, commercial, governmental), telephone numbers and the name of the owner.
12- This information is updated continuously. Every day 2000 agents throughout the country, 200 of them in Tehran, walk their assigned zones on foot to check for changes of use and reconstruction of buildings. These agents have no contact with the residents; they observe, analyze and register the changes themselves.
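The prefix rules listed above can be illustrated with a minimal sketch. It assumes, as an interpretation not stated in the source, that the rule about unused digits (0 and 2) applies to the 5-digit zone prefix; the function name and the returned field names are hypothetical.

```python
# Sketch only: checks a zip code against the prefix rules described above.
# Assumption: the "0 and 2 are not used" rule is applied to the 5-digit zone prefix.
TEHRAN_CITY_REGIONS = {"11", "13", "14", "15", "16", "17", "18", "19"}
UNUSED_DIGITS = {"0", "2"}


def describe_prefix(code: str) -> dict:
    """Return basic facts about a 5-digit zone code or a full 10-digit zip code."""
    digits = code.strip()
    if not (digits.isascii() and digits.isdigit()) or len(digits) not in (5, 10):
        raise ValueError("expected a 5-digit zone code or a 10-digit zip code")
    prefix = digits[:5]
    return {
        "zone_prefix": prefix,                     # the part used for the zone maps
        "region": prefix[:2],                      # e.g. "17" for postal zone 17
        "in_tehran_city": prefix[:2] in TEHRAN_CITY_REGIONS,
        "uses_unused_digit": any(d in UNUSED_DIGITS for d in prefix),
    }
```

A full 10-digit code and a bare 5-digit zone code are treated the same way, because only the leading digits are inspected.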

Because zip codes separate residential use from other uses, the experts of the ministry of health support this method (using zip codes) as the best way to reach their aims. Since the 10-digit zip codes carry information about the residential units of each zone, the population of a zone can be calculated from these codes and the household coefficient, which The Center of Statistics of The Country has determined for most cities and even villages.
For example, this coefficient is 4.1 for Tehran. This system also gives the most reliable estimate of the number of residents and houses, because unlike The Center of Statistics of Iran and the office of registrations of Iran, it requires no document to prove the existence of a property or man-made structure. The physical existence of the structure is itself the proof.
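The population estimate described here is a simple multiplication, sketched below. The function name and the sample zone count used in any call are hypothetical; only the coefficient 4.1 comes from the text.

```python
from typing import Dict

# Household coefficient quoted in the text for Tehran (persons per residential unit).
HOUSEHOLD_COEFFICIENT_TEHRAN = 4.1


def estimate_zone_population(
    residential_units: Dict[str, int],
    coefficient: float = HOUSEHOLD_COEFFICIENT_TEHRAN,
) -> Dict[str, float]:
    """Estimate residents per zone as (residential codes in the zone) x coefficient."""
    return {zone: units * coefficient for zone, units in residential_units.items()}
```

For a zone with 1000 residential codes this gives roughly 4100 residents. Other cities would pass their own coefficient.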
With the help of Mr. Bastani, I was introduced to Mrs. Kazemi, the chief of the department of Informatics in the Office of registrations of Iran. She explained that zip codes are registered on national identification cards, but the registration relies on the applicant's self-declaration and is not checked against the information of the office of codifying. There is as yet no scheme for registering changes of residence, which in practice change the zip code on the national identification card.
Because most postal zone maps are drawn from the first 5 digits of the zip code, these maps were taken as the base maps. The maps could be drawn in more detail using all 10 digits, but since that level of accuracy is not needed, only the first 5 digits are used.
One of the maps drawn by the Office of Codifying is enclosed (fig 1).

Digitizing Plans of the city of Tehran
(The term "digitizing" conveys that the maps are not just drawings but contain information.)
The author then began digitizing the 18 map sheets of Tehran that had been drawn by The Office of Codifying, and Mrs. Tavakoli voluntarily finished the task in 1386. The procedure was as follows. First, the sheets were scanned into computer files. Then the zone boundaries were drawn in AutoCAD. After that, the 18 sheets were joined together and, with the help of satellite images, brought to true scale, with an error of 70 meters over a diagonal of 11 kilometers, which was acceptable (Fig 3).
Then the maps of each postal center were plotted and sent back to that center so that the codes and unknown boundaries could be corrected. Finally, with the confirmation of The Office, we arrived at the final maps. The final map of the city of Tehran is shown in fig 2.
In 1387, through the kindness of Mr. Nasiri, the president of The Codifying Office, and Mr. Javadi, all of the maps and rough sketches were sent to the Center of Cancer so that the system could be applied to the whole province. Most of the maps, however, were no more than rough sketches and contained no geographic reference marks, and some were too large, so that only 8 map sheets of Karaj and Damavand were usable. The maps of Karaj were finalized in 1388 and the maps of Damavand in 1389 (fig 4).
For other zones of the country no map was prepared, because they were outside the scope of work of The Center of Cancer.
Some provinces, however, prepared maps independently with different methods, such as the maps of Ghazvin in MicroStation and the maps of Tabriz in AutoCAD. This is the layout of the postal zone maps of Tehran, which was drawn for correction.
These are some of the particular advantages of this method:
1- The method conforms completely with all the maps and information in the country, which enables us to combine all geographic maps on this basis. In GIS this is called "overlay". For instance, with this method we can overlay ethnographic maps, mobile phone antenna maps, power transmission lines and gas station locations.

2- The codes are stable and do not change.
3- Descriptive changes are registered in the data bank of The Office of Codifying.
4- All addresses are codified, even incorrect or incomplete ones. This plays an important role in the goals of the institute, because no address is discarded, however incorrect or incomplete. As mentioned, all codes in Tehran begin with 1, so every record in The Institute of Cancer begins with 1, and the clearer and more complete the address is, the more digits we can add to the code, up to 5 digits. In other words, the better the street names in the address match the algorithm and the more precise their sequence, the more digits we can add. As an example, Roodaki Street is situated in postal zone No 17, so the record initially receives the code 17. This street lies at the junction of 7 postal zones, all of which begin with 175, so the code becomes 175. The record also names another street, Esfand, which lies on the edge of 3 postal zones: 17569, 17568 and 17567. All of them have 6 as their fourth digit, so we add 6 as well and the code becomes 1756. The record gives no further detail, so the code is narrowed to only 4 digits, and we therefore attribute one third of the record to each of the zones 17569, 17568 and 17567. If a record has no address at all, then, since all the records belong to the city of Tehran, we attribute 1/1900 of it (1900 being the number of zones in Tehran) to each zone, so that in practice no record is discarded.
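The fractional attribution in item 4 can be sketched as follows. This is a minimal illustration, not the program used in the study: the function name and the short zone list are hypothetical, and exact fractions are used so that the shares of one record always add up to one.

```python
from fractions import Fraction
from typing import Dict, Iterable


def attribute_record(partial_code: str, zones: Iterable[str]) -> Dict[str, Fraction]:
    """Split one record equally among all 5-digit zones that start with partial_code.

    An empty partial_code (no usable address) matches every zone, which
    reproduces the 1/1900 rule when all 1900 Tehran zones are supplied.
    """
    matches = [zone for zone in zones if zone.startswith(partial_code)]
    if not matches:
        return {}
    share = Fraction(1, len(matches))
    return {zone: share for zone in matches}


# Hypothetical subset of zones for illustration only.
SAMPLE_ZONES = ["17567", "17568", "17569", "17511", "11369"]
shares = attribute_record("1756", SAMPLE_ZONES)
```

With the sample list, the 4-digit code 1756 gives each of 17567, 17568 and 17569 a share of one third, as in the Roodaki and Esfand example. Summing these shares over all records gives a weighted case count per zone.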
Clarification of Information Methods
Various methods of processing the existing information, comprising 700 thousand records (500 thousand records of people who died of different kinds of cancer and were buried in Behesht Zahra in Tehran, and 200 thousand records of patients under medical care), were analyzed and tested with different software such as ArcGIS, ArcView, Idrisi, Ilwis and AutoCAD. These methods include rechecking of the information by operators and locating the information on the postal and geographical maps with the software mentioned. AutoCAD cannot attach descriptive information to points, so it is not a good choice. ArcGIS and ArcView, the most important geographic information system packages, offer many kinds of analysis, but they are hard for an ordinary operator to work with, so we chose Ilwis and Idrisi, which are both hydrologic software packages and easy to work with.
We received permission to use one of the proprietary programs of The Office of Geography and Postal Codifying, and in this matter Mr. Javadi and Mr. Shahmohammadi were very cooperative.
This software takes the three parts of the address as input (sector, penultimate passageway, last passageway) and gives a 5-digit code as output. We can also give it just one word of the address and get all the probable addresses back, and the operator can then choose the closest one.
Mrs. Tebyanian voluntarily processed 2000 records in 4 months. Although The Center of Cancer is a governmental center, no operator was available.
In the end, given the volume of information and the impossibility of having operators codify it, I wrote a program in FoxPro to codify the addresses, which gave good results on 23000 selected records: 52 percent of addresses received 5-digit codes, 41 percent received 4 digits, 6.3 percent received 3 digits and only 0.7 percent received no code.
Program Algorithm
This is the algorithm of the program:
Because the address field in the records of the data bank of The Center of Cancer had already been edited and finalized, the different parts of an address could not be separated reliably, so the data files of the codifying program of The Codifying Office were taken as the base of the operation. The program first reads one record from this file and then searches all the patient addresses in the data bank for that address. The addresses in the files of The Office of Codifying are stored in three parts (sector, penultimate passageway, last passageway). The program first checks all three fields against the cancer data bank and attributes the 5-digit code to the patient's address when all three match. Next, if two fields of a record match, the 5-digit zip code is likewise attributed to the patient's address. Finally, all records that have received a code are set aside and removed from the comparison loop. At this stage the data bank of The Office of Codifying is the base of the operation, but each of its records is compared against the data bank of the center of cancer before the final decision is made.
At the next stage an intermediate data bank is produced and edited. First, all the remaining addresses are checked against the "last passageway" field of the data bank of the office of Codifying, and for each match the first 4 digits of the 5-digit zip code are registered in the intermediate data bank. Then the addresses are checked against the "penultimate passageway" and sector fields, and for each match one record is registered in the intermediate data bank with the first 4 digits of the 5-digit zip code. Finally, the matched 4-digit codes are counted separately and the code with the largest number of matches is registered as the final code in the data bank of the center of cancer. If the number of matches found is less than 10, only 3 digits are attributed.
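The two stages described above can be sketched roughly as follows. This is a simplified illustration in Python, not the original FoxPro program. It assumes plain substring matching of each reference field inside the free-text patient address, and it loops over one address at a time instead of over the reference file. The names RefEntry, fields_found and codify are hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple


class RefEntry(NamedTuple):
    """One record of the reference file: three address parts and a 5-digit code."""
    sector: str
    penultimate: str
    last: str
    code5: str


def fields_found(address: str, entry: RefEntry) -> int:
    """Count how many of the three reference fields occur in the address text."""
    parts = (entry.sector, entry.penultimate, entry.last)
    return sum(1 for part in parts if part and part in address)


def codify(address: str, reference: List[RefEntry], min_votes: int = 10) -> str:
    """Return a 5-, 4- or 3-digit code for the address, or "" if nothing matches."""
    # Stage 1: a three-field match wins outright; otherwise keep the first two-field match.
    two_field_hit = ""
    for entry in reference:
        hits = fields_found(address, entry)
        if hits == 3:
            return entry.code5
        if hits == 2 and not two_field_hit:
            two_field_hit = entry.code5
    if two_field_hit:
        return two_field_hit

    # Stage 2: every single-field match votes for the first 4 digits of its code.
    votes: Counter = Counter()
    for entry in reference:
        for part in (entry.last, entry.penultimate, entry.sector):
            if part and part in address:
                votes[entry.code5[:4]] += 1
    if not votes:
        return ""
    code4, count = votes.most_common(1)[0]
    # Fewer than min_votes matches: keep only 3 digits, as described in the text.
    return code4 if count >= min_votes else code4[:3]
```

The threshold of 10 matches comes from the text. Matching on normalized Farsi spellings, which a real implementation would need, is left out of this sketch.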
Other algorithms were also tried, but this one seems the most logical and reasonable. Unlike the algorithms used with other data banks, especially financial data banks, it has no static output (a single certain result). Instead it searches for the closest items, which is called Dynamic Out Crop (dynamic output).
The program needs hours to compare the records and assign codes to them, because the addresses declared by patients were not standard. One of the goals of the cancer institute was an online service for finding addresses, so the author obtained a valid IP address and put the program online. Anyone with a username and password can connect to the system and upload address files in text format to a folder. The program then runs automatically and puts the results in a specific folder. The running time (96 hours) was not satisfactory, however, so the institute appointed Mr. Mohammadi to implement the algorithm in other software. He was not proficient in Dynamic Out Crop (dynamic output) systems, so the task produced no result.
In addition, the informatics expert of The Cancer Institute, Mr. Golmahi, was instructed to give the data files to the author, or to any other organization, without names or personal information, because releasing them would be against the information security policy. The files received from that center therefore contain only identification codes and the addresses registered on the forms.
Now that zip codes have been made public, residents have in effect been told the geographic coordinates of every sector. This opens up various new uses, which are mentioned below.
In addition, the 10-digit zip codes link to different kinds of information, some of which are:
- Water, electricity, telephone and gas bills can be issued separately and delivered to consumers. Because these bills are financial documents, they have to be very precise.
- Registration at the Vehicle Registration Office of the police department, which is regarded as a valuable data bank.
- Registration in passport office of police department.
- Registration in registry office.
- Bank accounts.
This body of information, which is produced and updated continuously, creates a remarkable capability that is explained briefly in the next part.
Finally, the result of this survey was a set of figures and maps. The initial maps based on the available information are presented in this part.
Figure 5 shows the distribution of cancer cases for one part of the data bank. This map shows only the total number of cases in each zone.
In figure 6 the distribution of cancer has been compiled with ArcView, and the zones with the highest concentration are marked with arrows. This distribution is based only on the number of cases. In figure 7 the distribution has been compiled as a percentage of the population, and as you can see, other spots now appear as the foci of cancer. The differing spots are marked with black arrows, and there is no focus in the center of the city.
In the last part we explained why the other procedures have no epidemiologic value: they cannot provide a percentage of the population for each zone. An epidemiologist who works with map No 6 is forced to look for cancer foci in the northern and southern parts of the city. If he works with map No 7, the foci lie in other spots and the study is not led astray. Also, as mentioned before, the distribution of automobiles can be registered on this map and used as an index of each family's wealth, in order to understand the connection between level of income and cancer. We can add the consumption of water, electricity, gas and telephone to this map and draw new conclusions. Other information that is linked to zip codes can be registered directly, and information that has clear geographic coordinates can be overlaid by the software.
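The difference between the two maps, raw counts against counts relative to population, can be shown with a small sketch. The zone names and figures in any call are invented for illustration and are not data from the study; the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def rank_zones(
    cases: Dict[str, float], population: Dict[str, float]
) -> Tuple[List[str], List[str]]:
    """Rank zones by raw case count and by cases per resident.

    Zones with no known (or zero) population are left out of the rate ranking,
    since a rate cannot be computed for them.
    """
    by_count = sorted(cases, key=lambda zone: cases[zone], reverse=True)
    rates = {
        zone: cases[zone] / population[zone]
        for zone in cases
        if population.get(zone)
    }
    by_rate = sorted(rates, key=lambda zone: rates[zone], reverse=True)
    return by_count, by_rate
```

A zone that leads on raw counts can fall to the bottom once its large population is taken into account, which is the effect described for figures 6 and 7.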
The information above is aggregated over all types of cancer.
Figure 8 was prepared by The Organization of Geology of Iran with the help of The Center of Cancer before the author began cooperating with them. In it, the finest level at which the disease is separated is the city boundary, and some cities have no information at all. The population of each city can be obtained from The Center of Statistics, but there are two problems. First, some of these cities (provinces) cover more than a thousand square kilometers, so the location of the disease is not resolved accurately enough. Second, because the data bank is keyed only to the names of cities, the only data on this map is population and no other information can be checked against it.
In figure 9, only one type of cancer has been digitized. This task took a great deal of time and effort, but as you can see, the points carry no reference to population or services. They can only be analyzed individually and have no epidemiologic value.
The maps show that the zone with the highest concentration of cancer is the Bazar, and that the south west has the second highest concentration (red part).
There are not many residential areas in the Bazar, so the large number of addresses in this zone refers to where the cancer patients work. What is the most important characteristic of the Bazar in Iran? "Stress", which results from the nature of the jobs there. The reasons for this concentration need further research by cancer experts and epidemiologists; this survey only identifies the centers with a high rate of cancer.
After the Bazar, the south west of Tehran has the highest concentration. Most of the big industrial centers of Tehran are situated on the west side of the city and workers have settled close to where they work, so poverty and poor nutrition could be the reason for the high rate of cancer in this zone.
Another conclusion can be drawn from these maps, although it has no clear connection to cancer. Because economic conditions are unsatisfactory, the industries are not profitable and cannot provide proper services to their workers. The detailed maps show a high rate of cancer in industrial and commercial zones. We do not have technological industry, so no technicians or skilled workers are needed and wages are low. This is another useful application of the cancer distribution maps when they are combined with residential and nonresidential use.
The algorithm explained in this article was submitted to the Research Section of Iran University of Medical Sciences, which used it to analyze the spread of Fibroma. The experts of The Office of Codifying and Geography and of the Research Section of Iran University of Medical Sciences held a session and decided that the informatics unit of The Office of Codifying would choose 2500 telephone numbers from the zones according to the density of residential use and hand these numbers to the Research Section. For example, 5 numbers were chosen from some postal zones and 1 to 3 from others, and there were even zones from which no number was chosen, because their use is mostly nonresidential. A number of doctors who work in the center then called these telephone numbers, checked the information with the residents, registered on special forms any new information that contradicted the records of The Office of Codifying, and received new telephone numbers from the residents. They are now analyzing the statistics and waiting for results.
Zip codes can be obtained from many registry documents and urban service bills, so they can be used in different fields. The author strongly recommends this procedure, because it has given positive results for the distribution of cancer information. These are some suggestions:
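The source does not state the exact rule used to spread the telephone sample over the zones. One plausible reading, sketched below purely as an assumption, is that 2500 is the total sample size and that each zone receives a share proportional to its number of residential units. The function name and inputs are hypothetical.

```python
from typing import Dict


def allocate_sample(residential_units: Dict[str, int], total: int) -> Dict[str, int]:
    """Split `total` sample slots across zones in proportion to residential units.

    Uses largest-remainder rounding with integer arithmetic, so the
    allocations always add up to `total` when any zone has residential units.
    """
    grand_total = sum(residential_units.values())
    if grand_total == 0:
        return {zone: 0 for zone in residential_units}
    allocation: Dict[str, int] = {}
    remainders: Dict[str, int] = {}
    for zone, units in residential_units.items():
        allocation[zone], remainders[zone] = divmod(total * units, grand_total)
    leftover = total - sum(allocation.values())
    # Hand the remaining slots to the zones with the largest remainders.
    for zone in sorted(remainders, key=lambda z: remainders[z], reverse=True)[:leftover]:
        allocation[zone] += 1
    return allocation
```

Under this rule a zone with no residential units receives no telephone numbers, which matches the zones described as mostly nonresidential.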
1- Because the method separates residential from nonresidential use, applying it in crisis management (EOC GIS) enables us to estimate precisely the probable number of victims of natural disasters in each zone. Since the exact number of existing buildings is known, we can also estimate the approximate amount of collapse and plan the machinery required. And since zip codes are registered at the banks, the national codes of the people who need relief aid after a disaster are available.
2- In education, since zip codes are registered on national identification cards, we can get a clear picture of the number and sex of the people covered by the education system in each zone and plan accordingly.
3- In sustainable development, since these codes are registered on water, electricity, telephone and gas bills, on automobile registration documents and on visas, we can assess development and, as a result, judge the level of income distribution in each zone. This information can also be used in sociology and economics. During the last two years several sessions were arranged, attended by two experts of UNDP (United Nations Development Programme), Mr. Farzin and Mr. Heydar Nadim. They were interested in this method, especially in the link between the code and automobile registration numbers, which plays an important role in assessing family economy. But because decisions about new projects are in practice made at the UN center, the proposal had no final result.
4- The method was submitted to UNAIDS (the Joint United Nations Programme on HIV/AIDS). The experts of that center found it interesting and useful, but for the same reason it had no result. We presented the method to UNICEF and WHO as well, again without result.
5- In traffic, by considering the number of automobiles in each zone and the concentration of administrative, educational and commercial (nonresidential) centers, which indicates the paths of urban trips, we can get a precise picture of traffic at different hours of the day and night, which can be analyzed and used for traffic management.
6- In security, registering the locations of crimes and the places of residence of victims and criminals can be useful for estimating the probability of violence, predicting crime and managing police patrols. For instance, we can add the annual information to the maps and obtain a general picture of the probability of different kinds of crime (murder, street fights, robbery) in each zone. In this way more preventive patrols can be sent to risky zones and fewer to zones that are not risky, which makes it a good criterion for police management. A mathematical model that combines parameters such as the number of cars, the buying and selling of automobiles in each city or town, and bank transactions can be built. Such a model lets us find the coefficient of correlation between crime and these parameters, which are among the most important economic and social indexes, and the model can then be updated by the same procedure.
7- Another advantage of this method is that information can be obtained from different social classes by self-declaration through the short message service. The mobile network of the country, with 35 million subscribers, covers 90 percent of the population and has many advantages over the internet, which has 15 million subscribers, covers 60 percent and so has a lower penetration rate. Moreover, each internet connection costs 1 million rials, while a mobile connection costs only 350 thousand rials.
As you can see, health is only one of the secondary uses of this method. Its most important use is in the top-level political management of the country. For instance, observing the volume of car and real estate trading, health and medical services, bank transactions, and entries to and exits from the country gives us a clear picture of current events all over the country, and with a mathematical model we can draw many different conclusions. We can also compute various coefficients of correlation to follow the changes in each of these elements (the most important economic criteria), and so estimate the distribution of wealth and the level of security, health and education without analyzing each element directly and separately.