Effectiveness of Repeated Examination to Diagnose Enterobiasis in Nursery School Groups

Article information

Korean J Parasitol. 2009;47(3):235-241
Publication date (electronic) : 2009 August 28
doi : https://doi.org/10.3347/kjp.2009.47.3.235
1Tartu Health Care College, Tartu, Estonia.
2Institute of Ecology and Earth Sciences, University of Tartu, Estonia.
Corresponding author (mareremm@nooruse.ee)
Received 2009 February 23; Revised 2009 April 04; Accepted 2009 April 24.

Abstract

The aim of this study was to estimate the benefit from repeated examinations in the diagnosis of enterobiasis in nursery school groups, and to test the effectiveness of individual-based risk predictions using different methods. A total of 604 children were examined using double, and 96 using triple, anal swab examinations. The questionnaires for parents, structured observations, and interviews with supervisors were used to identify factors of possible infection risk. In order to model the risk of enterobiasis at individual level, a similarity-based machine learning and prediction software Constud was compared with data mining methods in the Statistica 8 Data Miner software package. Prevalence according to a single examination was 22.5%; the increase as a result of double examinations was 8.2%. Single swabs resulted in an estimated prevalence of 20.1% among children examined 3 times; double swabs increased this by 10.1%, and triple swabs by 7.3%. Random forest classification, boosting classification trees, and Constud correctly predicted about 2/3 of the results of the second examination. Constud estimated a mean prevalence of 31.5% in groups. Constud was able to yield the highest overall fit of individual-based predictions while boosting classification tree and random forest models were more effective in recognizing Enterobius positive persons. As a rule, the actual prevalence of enterobiasis is higher than indicated by a single examination. We suggest using either the values of the mean increase in prevalence after double examinations compared to single examinations or group estimations deduced from individual-level modelled risk predictions.

INTRODUCTION

Medical laboratories in Estonia commonly employ the anal swab method to diagnose Enterobius vermicularis. The prevalence of E. vermicularis detected using a single anal swab among nursery school children is found to be greater than 20% [1,2]. However, the single anal swab method cannot accurately estimate the true prevalence of E. vermicularis infection in a community [3]. Repeated examinations on separate days result in higher and more reliable estimations of the prevalence of the E. vermicularis infection [4,5].

Although repeated examinations may be considered too laborious for routine diagnosis of E. vermicularis, which is a low-level pathogenic helminth most common in children, accurate diagnosis of enterobiasis is essential to identify and treat infected individuals. Moreover, correct diagnosis is an important preventative measure for children, especially in nurseries, as communication companions are a major risk factor of enterobiasis [2]. It is essential to develop potential simple and cheap methods for identifying nursery groups with a high risk of enterobiasis and potentially high prevalence, and to draw attention to the need for prevention and cure.

The ratio of prevalence from single swab examinations to actual prevalence in the population is unclear, due in part to the nature of the life cycle of E. vermicularis. Though the life cycle takes place within the lumen of the gastrointestinal tract, microscopic examination of fecal samples is not recommended for diagnosis. Fecal samples give a positive diagnosis in only a few cases: 5-15% [5], or 3% [6]. Anal swab examination is the recommended method [4], although a positive result from an anal swab should not always be interpreted as a diagnostic criterion of a present infection, since positive cases may not have intestinal E. vermicularis, whereas negative cases may [7]. The anal swab test, the routine test for enterobiasis in Estonia, is simple, quick and inexpensive, but it is a rather poor test, as is the alternative cellophane tape method that is widely used all over the world. Both tests detect eggs after the death of the worms, so the parasitism detected has already terminated. Autoinfection is frequent, so different life stages of the worms commonly co-occur in the same child. The anal swab detects only worms that have already laid eggs, not the younger stages.

The proportion of positive results from a single anal swab can be far less than the actual prevalence of E. vermicularis infections. Cho and Kang [4] found that 81.3% of negative cases of single anal swabs had various development stages of E. vermicularis in the intestine. The repeated anal swab technique offers the chance to obtain a sample when the worms have just laid their eggs. Incorrect sampling or microscopy can also result in false positives and negatives, although, if the examination is done by an experienced specialist, a false positive result is unlikely because the eggs of E. vermicularis are easy to recognize.

The ratio between anal swab results and actual Enterobius infection is affected by several factors, including the size of the parasite's brood, time intervals between reinfection, and the distribution of the E. vermicularis burden in the surveyed community [7]. A single perianal tape test used for pinworm detection in owl monkeys yielded a 1-in-4 chance that no pinworm eggs will be detected in an infected animal [8]. Sadun and Melvin [3] detected 60% positives using a single examination in a heavily-infected group, but only 37% positives in groups with a lower prevalence. A small number of egg-positive children in consecutive examinations may suggest that the worm burden in the positive children is low [9].

The increase in prevalence from repeated examinations has varied. Fan and Chan [10] found that the prevalence among nursery and kindergarten children increased from 17.3% and 34.6% with a single swab to 44.4% and 70.2% with 8 consecutive swabs; the prevalence among primary school students increased from 59.9% to 77.3% with 4 consecutive swabs. Kim et al. [11] found that prevalence increased from 50.0-59.2% for single anal swabs to 70.8% for 3 anal swabs repeated at 4-5 day intervals. Two consecutive examinations may increase the egg detection rate by 4.2-4.8% for low (around 10%) enterobiasis prevalence [9]. The triple anal swab examinations presumably detect nearly 90% of infected individuals [3,5].

The present study was carried out in order to estimate the benefit of repeated examinations for the detection of enterobiasis in nursery school groups in Estonia, and to test the effectiveness of individual-based risk predictions using different methods. The issues addressed were as follows. What is the actual prevalence of enterobiasis in nursery school groups in Estonia? How large is the relative gain in prevalence (efficiency) from repeated examinations compared to a single examination? Is it possible to predict the estimated prevalence for multiple examinations from the results of a single anal swab examination in nursery school groups? Which are the best methods for individual-based predictions of the prevalence of infection in groups?

MATERIALS AND METHODS

The investigation of enterobiasis was conducted among nursery school children from 3 counties in Estonia during 2005-2007. The double swab examination in Põlva County was conducted in spring 2005, and in Hiiu County in autumn 2006. The triple swab examination in Valga County was conducted in spring and autumn 2007. The double swab examination involved 604 children from 57 groups in 23 nursery schools. There were 336 children (29 groups) from Põlva County, 140 (16 groups) from Hiiu and 128 (12 groups) from Valga County. Ninety-six children from 6 nursery schools in Valga County were investigated 3 times (Table 1). The number of examined children in a group varied between 2 and 23. Although some data is connected to individual children, the main outcomes of the study are at the nursery group level.

The data was obtained from 3 sources: 1) repeated anal swabs (at 1 or 2 day intervals) from children; 2) closed-ended questionnaires for children's parents; 3) observations of nursery schools and structured interviews with school staff. The aim of the interviews and observation of rooms was to identify possible infection risk factors for the groups. The observations concerned the number, purpose, sanitary conditions and state of repair of the rooms. The interviews concerned sanitation and children's hygiene, habits and cleanliness.

Infections of E. vermicularis were examined using the anal swab technique. Swabs were taken after breakfast just before the children went outside. One slide was prepared from each swab in the laboratory. Only the presence or absence of E. vermicularis eggs was determined; the eggs were not counted.

The aim of the questionnaire was to identify possible factors for infection associated with children's homes, since enterobiasis spreads mainly in the indoor environment. The questions addressed 6 topics (the number of questions in brackets): personal data (5), household members and pet animals (3), previous occurrence of helminthiases in the family (2), socioeconomic status of the household (2), living conditions at home (8), and the child's habits and personal hygiene (3).

The second and third examinations, in addition to the first one, were considered separate single examinations with which to compare the effectiveness of repeated examinations and single examinations because the temporal order of the observations is not significant. The mean of single examinations served as the expected single-examination prevalence of enterobiasis in groups. Averaging the prevalence was used to reduce randomness in the results of single examinations and to diminish the number of zero-infected groups, and therefore, division by zero errors in the calculation of the relative gain. The average prevalence stands for the expected value of the estimated prevalence. The relative gain in the estimated prevalence (efficiency) for repeated examinations was calculated as:

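$$U\% = \frac{DSR - \overline{SR}}{\overline{SR}} \times 100, \qquad \overline{SR} = \frac{SR_1 + SR_2}{2} \ \text{or} \ \frac{SR_1 + SR_2 + SR_3}{3}$$

where U% = the efficiency of the repeated (double or triple) examination in percentages, DSR = the prevalence from double or triple examination, and SR1, SR2, and SR3 = the prevalence from single examinations, whose mean is $\overline{SR}$.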
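As a rough illustration of the efficiency calculation, the sketch below assumes that efficiency is the relative gain of the repeated-examination prevalence over the mean of the single-examination prevalences, which is how the text describes it. The function name `efficiency` and the handling of zero-prevalence groups are illustrative choices, not part of the original analysis.

```python
def efficiency(repeated_prevalence, single_prevalences):
    """Relative gain (%) of a repeated examination over the mean
    single-examination prevalence. Returns None when the mean single
    prevalence is zero, because the ratio is then undefined."""
    mean_single = sum(single_prevalences) / len(single_prevalences)
    if mean_single == 0:
        return None
    return (repeated_prevalence - mean_single) / mean_single * 100.0


# Rounded totals from Table 1: mean single prevalence 22.5%, double 30.7%.
total_efficiency = efficiency(30.7, [22.5])
print(round(total_efficiency, 1))
```

Because the inputs here are the rounded percentages from Table 1, the result is close to, but not identical with, the reported 36.7%, which was presumably computed from unrounded data.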

In order to model the risk of enterobiasis at individual level, a machine learning and prediction software Constud (http://www.geo.ut.ee/CONSTUD) [1] was compared to the data mining methods that fit a binomial dependent variable and a large number of nominal and numerical explanatory variables, and are not sensitive to missing values in explanatory variables. The following data mining methods in the Statistica 8 Data Miner (Statsoft) software package were compared: k-nearest neighbours, boosting classification trees (BCT), random forest classification (RF), support vector machine, Naive Bayes classifier, advanced classification trees, and automated neural network search. RF and BCT models were trained in both equal prior probabilities and estimated prior probabilities modes. The other methods do not accommodate prior probabilities. The missing values for variables were not altered because the assumptive values derived would be deceptively effective predictors if the prevalence in a set of observations that had missing values of any feature was biased from the overall mean.

Only the results of the first examination and information on the children's homes and nursery groups were used as training data. The total number of explanatory variables (features) for all methods formalized from questionnaires, interviews, nursery room observation and the child's personal data (age, gender) was 78. Not all characteristics were known for all children because some questions remained unanswered by parents and by supervisors. All available characteristics were used as training data for all modelling methods, without giving any preference to the main risk factors known from previous investigations [1,2] where risk factors were compared one-by-one. The indicator value of single characteristics can differ greatly if taken alone or if in combination with other explanatory variables since many characteristics are intercorrelated or extensively duplicate each other.

The ultimate aim of the individual-based models was not to predict the enterobius positive/negative status in a child, but to estimate the enterobiasis prevalence in groups. We assumed that random biases in risk estimations should at least partially smooth out according to the law of large numbers.
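The aggregation step can be sketched as follows. This is a minimal illustration that assumes the group prevalence is estimated as the mean of the individual predicted risks in the group; the text does not specify the exact aggregation rule, and the group labels and risk values below are hypothetical.

```python
from collections import defaultdict


def group_prevalence(predictions):
    """predictions: iterable of (group_id, risk) pairs with risk in [0, 1].
    Returns {group_id: mean predicted risk}, used here as the estimated
    prevalence of the group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group_id, risk in predictions:
        sums[group_id] += risk
        counts[group_id] += 1
    return {g: sums[g] / counts[g] for g in sums}


# Hypothetical individual risk predictions for two nursery groups.
predictions = [
    ("A", 0.5), ("A", 0.25), ("A", 0.0), ("A", 0.25),
    ("B", 1.0), ("B", 0.5),
]
estimates = group_prevalence(predictions)
print(estimates)
```

Averaging over more children dampens the random error of any single prediction, which is the smoothing effect the paragraph above refers to.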

Estimations derived from the results of the first examination were compared with the results of the following examinations. The fit of predictions is expressed as the proportion of correctly classified cases and as the true skill statistic (TSS: the proportion of true positives plus the proportion of true negatives minus one, i.e., sensitivity plus specificity minus one). TSS was preferred to the commonly used positive predictive value (PPV) since TSS is not dependent on prevalence [12]. The Ethics Review Committee on Human Research, University of Tartu has approved this investigation (records: 136/2, 21.03.2005).
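The two fit measures can be illustrated with a short sketch. It assumes the usual definition of TSS as sensitivity plus specificity minus one [12]; the function name and the confusion-matrix counts are hypothetical and are not taken from the study data.

```python
def fit_measures(tp, fn, tn, fp):
    """Return (proportion correctly classified, TSS) from the counts of
    true positives, false negatives, true negatives and false positives."""
    sensitivity = tp / (tp + fn)  # proportion of observed positives predicted positive
    specificity = tn / (tn + fp)  # proportion of observed negatives predicted negative
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    tss = sensitivity + specificity - 1.0
    return accuracy, tss


# Hypothetical counts: 40 observed positives, 60 observed negatives.
accuracy, tss = fit_measures(tp=30, fn=10, tn=45, fp=15)
print(accuracy, tss)
```

A TSS of 0 corresponds to predictions no better than chance and 1 to perfect agreement, regardless of how common the positive class is.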

RESULTS

The overall mean prevalence of enterobiasis from single anal swabs for children examined twice was 22.5%; double anal swabs increased the estimated prevalence to 30.7%. The average prevalence from single swabs for triple-swabbed children was 20.1%. The prevalence for double swabs was 30.2%, and the prevalence for triple swabs was 37.5%. The average efficiency of double examination for the whole data set was 36.7% (31.2% in Põlva, 37.7% in Hiiu, and 49.6% in Valga County). The efficiency of triple examination was 86.2% with respect to single examinations and 24.1% with respect to double examinations (Table 1).

The average prevalence of enterobiasis among nursery groups from 2 single examinations varied from 0 to 59.1%; 9 groups of 57 were non-infected. The mean group prevalence was 20.4%. The second examination failed to increase the prevalence in 12 groups, of which 9 were already identified as non-infected by the single examination. The prevalence increased by 100% in 10 groups. The prevalence was quite low in all these groups, 13.6% in 1 group and < 10% in the others. The overall mean increase in estimated prevalence resulting from the double examination was 7.0% (range 0-25%) in groups. The increase in the estimated prevalence was a function of the prevalence by the mean single examination: less in groups with a lower prevalence and more in groups with a higher prevalence, except for those groups with the highest preliminary prevalence (Table 2). The efficiency of double examination varied from 0 to 100%, the mean value in groups was 38.6%.

Children from 12 nursery groups in Valga County were examined 3 times. Two groups revealed no infections after all 3 examinations. The efficiency of triple examination relative to double examination was 0-50% in the remaining 10 infected groups and 50-200% relative to the single examination (Table 3). The estimated prevalence increased by 0-20.8% after triple examination compared to double examination.

The relative effectiveness at group level for repeat examinations was found to be statistically related to the prevalence in the group, when the 0-prevalence groups (according to both single examinations) were excluded to avoid division by zero (Fig. 1). The repeat-examination is more effective in groups in which the prevalence of enterobiasis is lower (linear regression, R² = 0.41, n = 57, P < 0.0001). Detection of Enterobius positive children is less likely when the overall infection burden is low. The added proportion of estimated prevalence expressed as a difference in percentages is in general somewhat greater in groups in which the prevalence estimation is higher, except for those groups whose proportion of Enterobius positive children was the highest according to the first examination. No significant statistical relationships of efficiency or added prevalence with other characteristics of nursery school groups were found.

Fig. 1

Prevalence of enterobiasis in non-zero-prevalence groups according to a single examination, the efficiency of double examinations and the linear regression model.

y, efficiency; x, prevalence.

The best TSS results for the 3 methods for predicting individual risk were RF using equal prior probabilities (63.6% correctly classified, TSS = 0.332), BCT using estimated prior probabilities (64.9%, TSS = 0.338), and machine learning using Constud (73.8%, TSS = 0.388). Constud was superior in recognizing true negative cases, while BCT and RF succeeded better in predicting true positive cases (Table 4).

The explanatory features selected by these 3 methods as the most useful for distinguishing positive and negative cases were not the same (Table 5). The classification tree methods relied more on child and family characteristics, whereas Constud selected characteristics of nursery groups. Region, child age, and range of ages in nursery school groups were among the 13 most indicative attributes in all 3 models.

The estimated overall prevalence calculated using the 3 methods was 31.1% according to Constud, 43.7% according to BCT, and 45.7% according to RF. The mean TSS-fit of Constud was lower in groups with more than 10 children (TSS = 0.295, versus TSS = 0.587, n = 31 and 28, P = 0.002, Mann-Whitney U-test). Many small groups were not infected and the few children in these groups were commonly identified as not infected. The prevalence estimations in groups calculated from individual level predictions varied between 0.17 and 1.0, but these still correlated strongly with the results of single (R = 0.666), double (R = 0.613) and triple examination (R = 0.583). Constud estimated 31.5% detection (SD = 0.268) for the mean prevalence in groups, which is greater than the single swab examination of 20.4% (SD = 0.162), 27.4% (SD = 0.201) for double examination, and 29.1% (SD = 0.216) for triple examination.

DISCUSSION

A single examination for enterobiasis for prophylactic purposes in nurseries is faster and cheaper, but previous research has recommended multiple examinations, since a single examination may fail to detect infected children [3-5,7-9,11].

Yoon et al. [9] found a 4.2% increase in prevalence after double examinations. They concluded that when a single examination indicates around 10% of infections, the result of a double examination may be about 4% higher. We observed a higher prevalence from single anal swabs for the entire sample population (22.5%) and a greater increase in repeated examinations. The prevalence increased by 8.2% as a result of double examinations, an increase of more than one-third from the single examination. Cho et al. [7] noted a 13.2% higher prevalence from double examinations compared to single examinations (first exam 73.5%, second 83.7%, cumulative 91.8%).

The prevalence in groups obtained from a single examination (0-59.1%) and the increase of prevalence using double examination (0-25%) varied greatly in this study. We divided the nursery groups into classes according to the single-estimated prevalence (Table 2). Although the increase of prevalence estimated using repeated examinations varied within groups, in general, the increase was greater in groups with a higher estimated prevalence from the first examination (maximum at 30-45%). Among the lowest prevalence group (up to 15%), the increase of prevalence (4.1%) is similar to the results from Yoon et al. [9], 5.5%. The increase is 9.5% at the next class and reaches a maximum of 11.3% at the 30-45% class. This class had an increase similar to that found by Cho et al. [7].

Both the absolute increase in estimated prevalence and the gain in relative figures characterize the added value of a double examination. The results of a double examination depend on the proportion of infected children and other characteristics of the group. The absolute increase in estimated prevalence compared to a single examination, as was calculated by Yoon et al. [9], appears to be a more stable predictor. The application of effectiveness as a relative indicator is justified if the gain from double investigation needs to be set forth in the conditions of a relatively low prevalence.

Cho and Kang [4] calculated a 62.9% prevalence combining double anal swabs and Neyman's best asymptotically normal estimate, but this was much lower than their reported pinworm occurrence (89.3%) from the thorough study of 3-day feces. Kim et al. [11] found a prevalence of 70.8% with a triple examination, an increase of 11.6-20.8% over a single examination. The estimated prevalence of enterobiasis in our study increased to 37.5% as a result of triple examinations; the difference from the rate of single examinations was 17.4%. Assuming triple examinations discovered 90% of actual prevalence as concluded in [3,5], the corresponding expected actual prevalence in our data would be 41.7%.
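The arithmetic behind the expected actual prevalence can be checked with a short sketch. It assumes, as the paragraph does, that a triple examination detects 90% of infected individuals, so the observed prevalence is divided by the detection rate; the function name is illustrative.

```python
def expected_actual_prevalence(observed_prevalence, detection_rate):
    """Scale an observed prevalence (%) by the assumed proportion of
    infections that the examination scheme detects."""
    if not 0 < detection_rate <= 1:
        raise ValueError("detection_rate must be in (0, 1]")
    return observed_prevalence / detection_rate


# Triple-examination prevalence of 37.5% and an assumed 90% detection rate.
estimate = expected_actual_prevalence(37.5, 0.90)
print(round(estimate, 1))  # 41.7
```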

The 3 best methods for estimating individual risk levels (RF, BCT, and Constud) were able to predict the actual results of the second examination, with about two-thirds of cases correctly classified (TSS ≈ 0.35). The predicted overall prevalence from individual-based risk estimations varied considerably among the 3 best methods. The prevalence predicted by Constud (31.1%) is lower than that of BCT (43.7%) and RF (45.7%), but is closer to the observed value. Group level predictions of the mean prevalence fit relatively well into the sequence: 20.4% from single anal swab, 27.4% from double, 29.1% from triple, and 31.5% from Constud.

The significant merits of the 3 best methods are the tolerance of missing values of explanatory variables, the lack of restrictions due to the statistical distribution of explanatory variables and the lack of a predefined theoretical model. The test for individual-based predictions of the prevalence of enterobiasis in independent groups remains for future investigations, since all double examined individuals were included in the comparison of the prediction methods. Reducing the number of observations in the training sample by setting some aside as an independent test sample would result in overfitting the models. The individual-based estimations were presented here to demonstrate a possible novel approach to prevalence estimation.

We suggest using the values of the mean increase of prevalence after double examination compared to a single exam (Table 2). Individual-level modelled risk predictions can support or be an alternative for the estimation of the expected prevalence of enterobiasis. Similarity-based machine learning and prediction in Constud was able to yield the highest overall fit of individual-based predictions while BCT and RF models were more effective in recognizing Enterobius positive persons.

ACKNOWLEDGEMENTS

The investigation was supported by the Doctoral School of Ecology and Environmental Science and by the Estonian Ministry of Education (SF0180052s07). The authors express their gratitude to Helena Virt and Marina Kala from Tartu Health Care College for participating in data collection and laboratory analyses and to Robert Szava-Kovats and Michael Haagensen for linguistic corrections.

References

1. Remm M. Distribution of enterobiasis among nursery school children in SE Estonia and of other helminthiases in Estonia. Parasitol Res 2006;99:729–736. 16752158.
2. Remm M, Remm K. Case-based estimation of the risk of enterobiasis. Artif Intell Med 2008;43:167–177. 18502624.
3. Sadun E, Melvin D. The value of repeated examination in diagnosis of infection with Enterobius vermicularis. J Parasitol 1955;41(suppl):41.
4. Cho SY, Kang SY. Significance of Scotch-tape anal swab technique in diagnosis of Enterobius vermicularis infection. Korean J Parasitol 1975;13:102–114.
5. Cook G. Enterobius vermicularis infection. Gut 1994;35:1159–1162. 7959218.
6. Remm M. Helmintiaaside esinemine Tartu ja selle ümbruse lastepäevakodude lastel ning seda mõjutavad tegurid. Eesti Arst 2004;83:148–153.
7. Cho SY, Kang SY, Ryang YS, Seo BS. Relationships between the results of repeated anal swab examinations and worm burden of Enterobius vermicularis. Korean J Parasitol 1976;14:109–116.
8. Felt S, White C. Evaluation of timed and repeated perianal tape test for the detection of pinworms (Trypanoxyuris microon) in owl monkeys (Aotus nancymae). J Med Primatol 2005;34:209–214. 16053499.
9. Yoon HJ, Choi YJ, Lee SU, Park HY, Huh S, Yang YS. Enterobius vermicularis egg positive rate of pre-school children in Chunchon, Korea (1999). Korean J Parasitol 2000;38:279–281. 11138323.
10. Fan P, Chan C. Consecutive examination by scotch-tape perianal swab in diagnosis of enterobiasis. Gaoxiong Yi Xue Ke Xue Za Zhi 1990;6:647–652. 2266569.
11. Kim J, Lee H, Ahn Y. Prevalence of Enterobius vermicularis infection and preventive effects of mass treatment among children in rural and urban areas, and children in orphanages. Korean J Parasitol 1991;29:235–243.
12. Allouche O, Tsoar A, Kadmon R. Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). J Appl Ecol 2006;43:1223–1232.

Table 1.

The number of children, estimated prevalence of enterobiasis, and its absolute and relative increase

| Characteristic | Double investigation: Põlva county | Double investigation: Hiiu county | Double investigation: Valga county | Double investigation: Total in 3 counties | Triple investigation: Valga county |
|---|---|---|---|---|---|
| Number of children | 336 | 140 | 128 | 604 | 96 |
| Average prevalence from single examinations (%)ᵃ | 23.3 | 21.8 | 21.1 | 22.5 | 20.1 |
| Prevalence from double examination (%) | 30.6 | 30.0 | 31.7 | 30.7 | 30.2 |
| Increase in estimated prevalence compared to a single examination | 7.3 | 8.2 | 10.6 | 8.2 | 10.1 |
| Efficiency of double examination relative to a single examination (%) | 31.2 | 37.7 | 49.9 | 36.7 | 50.0 |
| Prevalence from triple examination (%) | - | - | - | - | 37.5 |
| Increase in estimated prevalence compared to double examination | - | - | - | - | 7.3 |
| Increase in estimated prevalence compared to a single examination | - | - | - | - | 17.4 |
| Efficiency of triple examination relative to a single examination (%) | - | - | - | - | 86.2 |
| Efficiency of triple examination relative to double examination (%) | - | - | - | - | 24.1 |

ᵃ Single examinations were components of double and triple examinations.

Table 2.

The increase of prevalence after double examinations, and mean absolute and relative increase (efficiency) of double examination in nursery groups according to preliminary prevalence

| Mean prevalence from single exam (%)ᵃ | No. of groups | No. of children | Mean increase of prevalence after double exam. compared to single (%) | Mean efficiency of double exam. relative to single (%) |
|---|---|---|---|---|
| 0-15 | 16 | 172 | 5.5 | 77.8 |
| 15-30 | 15 | 179 | 10.0 | 43.2 |
| 30-45 | 13 | 150 | 13.0 | 33.9 |
| 45-59.1 | 4 | 47 | 6.3 | 12.4 |

ᵃ Single examinations were components of double and triple examinations.

Non-infected groups are excluded.

Table 3.

The increase in estimated prevalence in Valga County

| Mean prevalence from single exams (%) | Prevalence from double exam. (%) | Prevalence from triple exam. (%) | IP after triple exam. compared to single (%) | Efficiency of triple exam. relative to single (%) | IP after triple exam. compared to double (%) | Efficiency of triple exam. relative to double (%) | No. of children |
|---|---|---|---|---|---|---|---|
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2 |
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6 |
| 5.6 | 11.1 | 16.7 | 11.1 | 200 | 5.6 | 50 | 6 |
| 6.7 | 10.0 | 10.0 | 3.3 | 50.0 | 0.0 | 0.0 | 10 |
| 6.7 | 13.3 | 20.0 | 13.3 | 200.0 | 6.7 | 50.0 | 10 |
| 13.3 | 26.7 | 40.0 | 26.7 | 200 | 13.3 | 50 | 5 |
| 23.3 | 40.0 | 50.0 | 26.7 | 114.3 | 10.0 | 25.0 | 10 |
| 24.2 | 30.3 | 36.4 | 12.2 | 50.0 | 6.1 | 20.0 | 11 |
| 29.2 | 41.7 | 50.0 | 21 | 71.4 | 8.3 | 20.0 | 8 |
| 30.3 | 42.4 | 45.5 | 15.2 | 50 | 3.1 | 7 | 11 |
| 33.3 | 54.2 | 75.0 | 41.7 | 125 | 20.8 | 39 | 8 |
| 40.7 | 55.6 | 66.7 | 26 | 64 | 11.1 | 20 | 9 |

IP, increase of prevalence.

Table 4.

Number of children according to the results of double examination and model-based estimations of the best 3 individual-based methods

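| Double examination | Estimation | No. of children: RF | No. of children: BCT | No. of children: Constud |
|---|---|---|---|---|
| Positive | Positive | 127 | 129 | 124 |
| Positive | Negative | 54 | 52 | 57 |
| Negative | Negative | 274 | 288 | 359 |
| Negative | Positive | 149 | 135 | 64 |

RF, random forest; BCT, boosting classification trees.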

Table 5.

Explanatory features in descending rank of importance (maximum 100) or weights (w) in the best predictive set of features in Constud (characteristics of the group are in italics)

| RF: Feature | RF: Rank | BCT: Feature | BCT: Rank | Constud: Feature | Constud: w |
|---|---|---|---|---|---|
| Child’s hygiene-related habits | 100 | Mean age of children | 100 | Region | 1.680 |
| Occupation of other children in family | 69 | Child’s age | 78 | Range of children age | 1.527 |
| Child’s age | 68 | Pets at home | 74 | Family income | 1.421 |
| Pets at home | 49 | Child’s hygiene-related habits | 70 | Child’s age | 1.312 |
| Hand washing before meals | 45 | Hand washing after toilet use | 67 | Number of rooms | 0.893 |
| Separate room for children | 43 | Number of children | 66 | No soft toys | 0.547 |
| Mean age of children | 41 | Occupation of other children in family | 66 | Finger sucking above 3-year-old children | 0.403 |
| Cleaning frequency | 41 | Cleaning method | 65 | Mother's education | 0.217 |
| Region | 40 | Region | 56 | | |
| Range of children’s age | 37 | Mother’s education | 56 | | |
| Hand washing after toilet use | 37 | Hand washing before meals | 53 | | |
| Number of children | 34 | Carpets at home | 53 | | |
| One-age group | 34 | Range of children’s age | 52 | | |

RF, random forest; BCT, boosting classification trees.