Journal of Occupational Health
Online ISSN : 1348-9585
Print ISSN : 1341-9145
ISSN-L : 1341-9145
Originals
Reliability of smartphone-based gait measurements for quantification of physical activity/inactivity levels
Takeshi Ebara, Ryohei Azuma, Naoto Shoji, Tsuyoshi Matsukawa, Yasuyuki Yamada, Tomohiro Akiyama, Takahiro Kurihara, Shota Yamada

2017 Volume 59 Issue 6 Pages 506-512

Abstract

Objectives: Objective measurements using built-in smartphone sensors that can measure physical activity/inactivity in daily working life have the potential to provide a new approach to assessing workers' health effects. The aim of this study was to elucidate the characteristics and reliability of built-in step counting sensors on smartphones for development of an easy-to-use objective measurement tool that can be applied in ergonomics or epidemiological research. Methods: To evaluate the reliability of step counting sensors embedded in seven major smartphone models, the 6-minute walk test was conducted and the following analyses of sensor precision and accuracy were performed: 1) relationship between actual step count and step count detected by sensors, 2) reliability between smartphones of the same model, and 3) false detection rates when sitting during office work, while riding the subway, and while driving. Results: On five of the seven models, the intraclass correlation coefficient (ICC (3,1)) showed high reliability, with a range of 0.956-0.993. The other two models, however, had ICCs of 0.443-0.504, and the relative error ratios of the sensor-detected step count to the actual step count were ±48.7%-49.4%. The level of agreement between phones of the same model was ICC (3,1): 0.992-0.998. The false detection rates differed between the sitting conditions. Conclusions: These results suggest the need for appropriate regulation of step counts measured by sensors, through means such as correction or calibration with a predictive model formula, in order to obtain the highly reliable measurement results that are sought in scientific investigation.

Introduction

In recent years, the health effects of physical inactivity, which exists as a variable independent of physical activity (exercise habit, intensity, and amount) after adjusting for potentially confounding factors, have attracted broad interest1,2). Physical inactivity is generally taken to be lack of routine physical activity or exercise, but physical inactivity as dealt with in occupational epidemiology research refers mainly to sedentary behaviors3-5). Recent systematic reviews, including meta-analyses, show that such physical inactivity raises the risks of cardiovascular disease, cancer, type 2 diabetes, and other diseases6,7).

Physical inactivity as an exposure factor, however, is in most cases determined by subjective responses about sitting time on self-administered questionnaires. A limited number of studies4,8) exist based on objective measurements using acceleration sensors or other devices with small subject samples, but methods that enable large-scale, objective, and simple measures on a population basis, such as frequency and interval for physical activity or cumulative sitting time, have not been established. Widely used smartphones that include standard internal 3-axis accelerometers, gyroscopes, or step counting sensors hold potential as one means of overcoming this problem. With regard to the step counting sensors used in determining levels of physical activity and inactivity, however, there is to our knowledge no information on the reliability of walking measurements that are needed and can be used in academic research.

The aim of this study was to elucidate the characteristics and reliability of built-in smartphone step counting sensors for development of an easy-to-use objective measurement tool that can be applied in ergonomics or epidemiological research.

Methods

The Institutional Review Board for medical research at the Nagoya City University, Japan approved this study design and procedures.

Apparatus and smartphone application development

Seven major smartphone models that use the Android OS and had a domestic Japanese share of more than 70% of Android smartphone shipments were used in this study (Table 1). A lifelog application for academic research, Motion Logger ver. 1.6, was developed (available from http://www.med.nagoya-cu.ac.jp/hygiene.dir/MotionLogger/). This application extracts time-series log data from the various sensor information defined by the Android OS on these smartphones. Android applications cannot directly access and control the hardware sensors embedded in a smartphone, but raw data can be obtained from the SensorManager that is provided as a function of the Android OS. Data for the 10 main types of sensor information available through the SensorManager on Android OS ver. 4.4 and later, such as the step counting sensor, 3-axis accelerometer, and magnetic field sensor, can be recorded continuously for one week at a sampling interval of 500 ms (in the case of GPS, 0.5-60 sec). In this study we used the step counting sensor data from among this sensor information to distinguish levels of physical activity/physical inactivity, and tested the reliability of the information.

Table 1. List of smartphones tested in this study
Product name | Model | Manufacturer | Date released | Android OS version
AQUOS ZETA | SH-01G | Sharp Corporation | November 2014 | v.4.4
Galaxy S7 edge | SCV33 | Samsung | February 2016 | v.6.0
Xperia Z3 | SO-02G | Sony Corporation | May 2015 | v.5.0.1
Xperia Z4 | SOV31 | Sony Corporation | June 2015 | v.5.0.1
Xperia Z5 | SO-01H | Sony Corporation | September 2015 | v.5.1.1
X performance | 502SO | Sony Corporation | February 2016 | v.6.0.1
X Compact | SO-02J | Sony Corporation | September 2016 | v.6.0.1
Major common sensor information extracted in the smartphones [a]: ST: Step counting sensor; LI: Illumination sensor; AC: Accelerometer sensor; GR: Gravity sensor; GY: Gyroscope sensor; MG: Magnetic field sensor; PX: Proximity sensor; RO: Rotation vector sensor; OR: Orientation sensor; GE: Geomagnetic rotation vector sensor
[a]: Sensor information extracted via the SensorManager provided by the Android OS. Sensor information is basically extracted from built-in hardware sensors, but the step counting sensor defined by the Android OS is calculated using a specific algorithm applied to the accelerometer data. The algorithm depends on each developer and is generally not disclosed.

Subjects and Procedures

The test subjects were five healthy male volunteers (age = 31.2 ± 8.5 years, height = 172.0 ± 5.5 cm, Body Mass Index = 26.0 ± 4.0). Prior to the reliability tests, it was necessary to determine whether there were differences in the step counting sensor detection level due to differences in the subjects' gait characteristics. To verify the accuracy of the step counting sensor, the 100-step walking test that was used in a previous study9) was implemented with repeated measures a total of five times for each subject.

The following three tests were performed to elucidate the characteristics and reliability of the step counting sensors of the various smartphone models.

1) Relationship between actual step count and step count detected by sensors: precision and accuracy

The 6-minute walk test (6MWT) was used. The 6MWT measures the distance an individual can walk in 6 minutes and his/her gait characteristics, and is used mainly in the field of rehabilitation to assess not only patients with cardiopulmonary diseases but also patients with locomotive/motor dysfunction10). We plan to add a function to the next MotionLogger version that can evaluate gait characteristics obtained from sensor information during the 6MWT. For that reason, with a view to implementing future test protocols in the MotionLogger smartphone application currently under development, the 6MWT was used to verify the reliability of each sensor. Using a bust band, the 7 smartphone models were attached to the chest of each of the five subjects, and the 6MWT was conducted a total of three times. The wearing positions of the seven smartphones were randomly assigned to eliminate possible bias caused by the directions of the sensors' axes. The actual number of steps walked in 6 minutes was recorded by the experimenters' direct observation. At the same time, the number of steps detected by the sensors was recorded.

2) Reliability between smartphones of the same model: Test of individual smartphone differences

To compare the reliability between smartphones of the same model, pairs of phones were obtained for 4 of the 7 models and the 6MWT was performed following the same protocol described above.

3) Test of the false detection rate during sitting

When sitting, posture is not always restrained; people change postures, stretch, reposition themselves, twist, cough, and move in various other ways. There are also effects from swaying or acceleration when they are seated while driving a car or riding on a subway. It is necessary to investigate the degree to which false detection occurs in step counting sensors in these situations. We therefore had two subjects carry different smartphone models in their chest pockets and sampled sensor data independently in various sedentary settings as they performed office work, rode the subway (sitting posture, Meijo Line of Nagoya Municipal Subway), and drove in their actual daily activities (traveling on local road and Tomei Expressway, the car models: Prius and Alphard, Toyota Corporation).

Statistical Analysis

The step data obtained in the 100-step walking test were tested for main effects and interactions by three-way ANOVA with subject, smartphone model, and measurement as factors. The ratio of the number of steps detected by the sensors to the actual number of steps was obtained to assess the precision of the step counting sensor. As an indicator of accuracy, R² values of a linear regression model were obtained. Furthermore, the intraclass correlation coefficients (ICC case 3) showing the relationship between the actual number of steps and the number of steps detected by the sensor were calculated. The ICC criteria, following the Landis-Koch scale11), were applied with ≥0.8 as almost perfect. In testing the reliability between smartphones of the same model, ICC case 3 was obtained as an indicator of the level of agreement between the two phones. Inter-coefficients of variation (inter-CVs) were calculated to evaluate random error. Unstandardized beta coefficients from Bland-Altman analysis were shown as indicators of proportional bias, and the mean differences between the two phones, with 95% confidence intervals, were estimated as an indicator of fixed bias. With respect to tests of the false detection rate during sitting, sedentary behavior has been defined as "Any waking behavior characterized by an energy expenditure ≤1.5 METs while sitting or reclining" 5) in studies related to exercise epidemiology. Since the exercise intensity in walking is about 2 METs, it is theoretically possible, when discriminating between physical activity and physical inactivity, to differentiate at the 0/1 step level using step count. We obtained the false detection rate (steps/hour) during the sampling time under each of the three conditions of office work, riding the subway, and driving, and calculated the mean false detection rate as the arithmetic mean weighted by sampling time. This was taken as an indicator of the false positive rate. All statistical analyses were performed using the statistical software package SPSS 22.0 (SPSS, Chicago, IL, USA).
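As an illustration of the agreement statistic named above, the following is a minimal Python sketch of ICC (3,1) in the Shrout-Fleiss scheme (two-way mixed model, consistency, single measurement), computed from two-way ANOVA mean squares. It is not the SPSS procedure used in the study; the function name `icc_3_1` and the data layout (rows = trials, columns = the measurements being compared, e.g. actual count and sensor count) are assumptions made for illustration.

```python
def icc_3_1(ratings):
    """ICC(3,1) for a table with n rows (targets/trials) and k columns (raters/devices).

    Formula: (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error).
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with at least 2 rows and 2 columns")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


if __name__ == "__main__":
    # Hypothetical step counts: column 0 = observed steps, column 1 = sensor steps.
    trials = [[640, 642], [655, 650], [700, 703], [612, 615], [668, 664]]
    print(round(icc_3_1(trials), 3))
```

Because ICC (3,1) measures consistency rather than absolute agreement, a constant offset between the two columns does not lower it, so it is usually read alongside a bias measure such as the mean difference.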

Results

In the results of the three-way ANOVA of step data obtained in the 100-step walking test, a significant difference was seen in the factor of smartphone model only (F [6, 16.7] = 434.2, p<0.0001). No main effect was seen for subject or number of measurement factors. An interaction was seen for model × subject only (F [24, 96] = 1.86, p = 0.02); no significant differences were seen for other combinations.

Fig. 1 shows the relationship between the actual number of steps and the number of steps detected by the sensors. R² values were obtained from linear regression models for each smartphone model. For five smartphone models (AQUOS ZETA, Galaxy S7 edge, Xperia Z3, Xperia X Performance, and Xperia X Compact), the ICC (3,1) ranged from 0.956 to 0.993, all meeting the ≥0.8 criterion for almost perfect agreement. The relative error ratios of the sensor-detected step number to the actual step number were ±0.1%-0.7%, and the R² values were ≥0.91, so the goodness of fit of the model was also high. With two smartphone models, however, the Xperia Z4 and Z5, the reliability between the actual number of steps and the sensor-detected number of steps was low, with an ICC (3,1) of 0.504 for the Xperia Z4 and 0.443 for the Xperia Z5. The relative error ratios of the sensor-detected number of steps to the actual number of steps were ±48.7%-49.4%, showing a tendency for steps to be approximately double-counted. The R² values for these two models were 0.827 to 0.844, so the goodness of fit to the regression line was nevertheless high.

Fig. 1. Relationship between actual step count and sensor step count.

The results for reliability between smartphones of the same model are shown in Table 2. The level of agreement between phones of the same model was ICC (3,1): 0.992-0.998, all meeting the ≥0.8 criterion for almost perfect agreement. All of the intra-CVs, which show the error rate for the mean value of three repeated measurements, were less than 10%. For proportional bias, the Xperia Z4 model showed a slight decreasing trend of beta = -0.05 in the regression line slope (p = 0.03), but no proportional bias was seen between phones of the same model for the other models. For fixed bias, the mean difference between phones of the same model was not significant for any model, and no fixed bias was observed.

Table 2. Results of assessment of reliability between smartphones of the same model
Random error is shown by the CVs; systematic bias (proportional and fixed) is from Bland-Altman analysis.
Smartphone models | ICC (3,1) [a] (95%CI) | CV [b], model (a) | CV [b], model (b) | Proportional bias: beta [c] | p | Fixed bias: mean difference [d] (95%CI) | p | Pooled mean [e]
AQUOS (a)/(b) | 0.992 (0.983, 0.997) | 7.45% | 7.42% | 0.01 (-0.05, 0.06) | 0.80 | 1.88 (-25.76, 29.52) | 0.89 | 653.7
Xperia Z3 (a)/Z3 (b) | 0.998 (0.997, 0.999) | 9.57% | 9.69% | -0.01 (-0.03, 0.01) | 0.38 | 1.56 (-34.26, 37.38) | 0.93 | 654.0
Xperia Z4 (a)/Z4 (b) | 0.994 (0.986, 0.997) | 6.24% | 6.58% | -0.05 (-0.09, -0.06) | 0.03 | 4.68 (-43.18, 52.54) | 0.84 | 1311.8
Xperia Z5 (a)/Z5 (b) | 0.994 (0.986, 0.997) | 5.02% | 9.81% | 0.03 (-0.02, 0.07) | 0.22 | 7.36 (-67.85, 82.57) | 0.84 | 1332.6
[a]: Intraclass correlation coefficient (ICC) between smartphones of the same model, (a) and (b)
[b]: Coefficient of variation (CV), calculated as the ratio of the standard deviation to the mean and shown as the precision of the smartphone sensors
[c]: Unstandardized beta coefficients of regression lines in Bland-Altman plots
[d]: Mean differences indicate whether fixed bias exists between two measurements. If the mean difference differs significantly from 0, fixed bias exists. p: Welch's t-test with adjustment of degrees of freedom
[e]: Pooled mean is the merged mean number of steps of models (a) and (b) measured by the smartphone sensors in the 6-min test
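The bias indicators in Table 2 can be illustrated with a minimal Python sketch of a Bland-Altman style analysis for two phones of the same model. It is a sketch under stated assumptions, not the study's SPSS analysis: the function names, the paired data, and the use of 1.96 SD limits of agreement (instead of the confidence intervals and Welch's t-test reported in the table) are choices made here for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values) * 100.0


def bland_altman(a, b):
    """Return (mean_diff, slope, lower_limit, upper_limit) for paired readings.

    mean_diff: fixed-bias indicator (mean of a - b).
    slope: proportional-bias indicator, the least-squares slope of the
           differences regressed on the pairwise averages.
    lower/upper_limit: mean_diff -/+ 1.96 * SD of the differences.
    """
    if len(a) != len(b) or len(a) < 3:
        raise ValueError("need at least 3 paired readings")
    diffs = [x - y for x, y in zip(a, b)]
    avgs = [(x + y) / 2 for x, y in zip(a, b)]
    mean_diff = mean(diffs)
    avg_bar = mean(avgs)
    sxx = sum((m - avg_bar) ** 2 for m in avgs)
    if sxx == 0:
        raise ValueError("pairwise averages are all identical; slope undefined")
    sxy = sum((m - avg_bar) * (d - mean_diff) for m, d in zip(avgs, diffs))
    slope = sxy / sxx
    sd = stdev(diffs)
    return mean_diff, slope, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical 6MWT step counts from two phones of the same model.
    phone_a = [640, 655, 700, 612, 668, 690]
    phone_b = [642, 650, 703, 615, 664, 688]
    print(bland_altman(phone_a, phone_b))
    print(coefficient_of_variation(phone_a))
```

A slope near 0 suggests that the difference between the phones does not grow with the step count, and a mean difference near 0 suggests no constant offset.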

Table 3 shows the false detection rate of the step counter under the conditions of sitting during office work, on the subway, and while driving. During office work, it was 0.00-1.53 (steps/hour), showing stability with almost no false detection. In contrast, the false detection rate was 0.00-15.84 (steps/hour) on the subway and 0.00-201.63 (steps/hour) during driving. The false detection rate was thus found to differ considerably depending on the smartphone model.

Table 3. False detection rate of step counter sensors in various sitting situations
For each sitting condition: mean false detections (steps/hour) [a], SD, total sampling time (min)
Model | Office work: mean [a] | SD | Time (min) | Subway: mean [a] | SD | Time (min) | Driving: mean [a] | SD | Time (min)
AQUOS | 0.00 | 0.00 | 662 | 0.00 | 0.00 | 125 | 5.52 | 1.29 | 1227
Galaxy S7 edge | 0.00 | 0.00 | 662 | 0.00 | 0.00 | 125 | 0.00 | 0.00 | 1129
Xperia Z3 | 0.18 | 0.07 | 662 | 5.76 | 3.33 | 125 | 29.33 | 8.77 | 1262
Xperia Z4 | 0.63 | 0.26 | 662 | 11.52 | 6.65 | 125 | 201.63 | 87.91 | 1129
Xperia Z5 | 1.53 | 0.69 | 470 | 15.84 | 9.15 | 125 | 0.00 | 0.00 | 98
X performance | 0.45 | 0.09 | 662 | 1.92 | 1.11 | 125 | 10.84 | 2.93 | 1129
X Compact | 0.93 | 0.38 | 645 | 2.88 | 1.27 | 125 | 9.04 | 0.07 | 478
[a]: Weighted arithmetic means (the false detection rate in each trial is weighted by that trial's sampling time)
SD: standard deviation
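The time-weighted mean described in the table note can be sketched as follows. The per-trial data are not reported in the article, so the trial values below are hypothetical, as is the function name `weighted_false_detection_rate`.

```python
def weighted_false_detection_rate(trials):
    """Mean false detection rate (steps/hour), weighted by sampling time.

    trials: list of (false_steps, sampling_minutes) tuples, one per trial.
    Each trial's rate is false_steps / hours; the weights are the minutes.
    """
    if not trials or any(minutes <= 0 for _, minutes in trials):
        raise ValueError("every trial needs a positive sampling time")
    total_minutes = sum(minutes for _, minutes in trials)
    weighted_sum = 0.0
    for false_steps, minutes in trials:
        rate_per_hour = false_steps / (minutes / 60.0)
        weighted_sum += rate_per_hour * minutes
    return weighted_sum / total_minutes


if __name__ == "__main__":
    # Hypothetical trials: 0 false steps in 60 min, 6 false steps in 120 min.
    print(weighted_false_detection_rate([(0, 60), (6, 120)]))  # 2.0
```

With sampling time as the weight, the result equals total false steps divided by total hours, so long trials are not outweighed by short ones.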

Discussion

The aim of this study was to explore the reliability of built-in step counting sensors in smartphones, with a special focus on the detection of physical activity/inactivity. Only a limited number of studies have compared gait characteristics obtained with a smartphone against those obtained with a conventional accelerometer12), or validated a smartphone-based measurement for quantification of level walking13). Few studies focusing on inter/intra smartphone reliability in actual walking tests can be found. In this regard, the present study provides knowledge useful for the development of smartphone-based gait measurement tools that can be applied in ergonomics or epidemiological research.

The results of the 100-step walking test in this study showed no significant differences within or between subjects, but significant differences were seen between smartphone models. This suggests that differences in phone model may have a greater effect on measurement reliability than individual differences in users' gait characteristics. However, the fact that an interaction was seen for model × subject suggests that measurement accuracy may vary considerably depending on the combination of each model's sensor characteristics and the user's gait characteristics. The tests in this study were performed with only five healthy males and did not analyze gait characteristic parameters in detail, whereas gait characteristics are known to differ with individual characteristics such as the presence or absence of chronic disease14). Further study with a larger number of subjects will be needed to clarify this association.

Next, the relationship between actual step count and sensor step count demonstrated that the sensor value was double-counted with some smartphone models. This shows that when step counting sensor information from a smartphone is used in assessing physical activity/inactivity, the level of physical activity may be overestimated if sensor information is collected and analyzed uniformly across models. R² values and precision are both high even with the Xperia Z4 and Z5 models, which double-count sensor measurement values. Thus, if a prediction regression model is made, it should be possible to appropriately estimate the actual step count. As shown in Fig. 1, the step count measured with sensors may need to be appropriately adjusted, such as by preparing a prediction regression line for each smartphone model as a profile of that model or by employing a calibration setting during use of the application.
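A per-model calibration of the kind suggested above could look like the following minimal Python sketch, which fits an ordinary least-squares line predicting the actual step count from the sensor count and reports R². The function names and the calibration data are hypothetical; the article does not specify the form of its regression lines beyond a linear model.

```python
def fit_calibration(sensor_counts, actual_counts):
    """Least-squares fit of actual = slope * sensor + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(sensor_counts)
    if n != len(actual_counts) or n < 2:
        raise ValueError("need at least 2 paired observations")
    x_bar = sum(sensor_counts) / n
    y_bar = sum(actual_counts) / n
    sxx = sum((x - x_bar) ** 2 for x in sensor_counts)
    syy = sum((y - y_bar) ** 2 for y in actual_counts)
    if sxx == 0 or syy == 0:
        raise ValueError("counts must vary for the fit and R^2 to be defined")
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(sensor_counts, actual_counts))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(sensor_counts, actual_counts))
    return slope, intercept, 1.0 - ss_res / syy


def apply_calibration(sensor_count, slope, intercept):
    """Estimate the actual step count from a raw sensor count."""
    return slope * sensor_count + intercept


if __name__ == "__main__":
    # Hypothetical calibration walks for a model that roughly double-counts.
    sensor = [1210, 1295, 1402, 1248, 1330]
    actual = [600, 650, 700, 625, 668]
    slope, intercept, r2 = fit_calibration(sensor, actual)
    print(slope, intercept, r2)
    print(apply_calibration(1320, slope, intercept))
```

Storing one `(slope, intercept)` pair per phone model would amount to the "model profile" mentioned in the text; an in-app calibration walk would instead fit the pair for the individual device and user.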

In examining the reliability between smartphones of same model, we assumed that the subjects are specified and that no interaction exists between subjects. Under these assumptions, the results of reliability for single measurements between smartphones of the same model showed high reliability. For the proportional bias, a slight regression line slope of -0.05 was seen with the Z4 model, but this is thought to be a level that has almost no effect on the actual step count estimate. Moreover, since no fixed bias was seen, it is possible that the measurement stability will remain high even with different individuals if the smartphone model is the same. In this study, however, we tested only four smartphone model pairs. Whether similar results will be obtained with other individual smartphones, that is, whether these results can be generalized, will need to be determined in tests with a greater number of subject smartphone models.

False detection by the step counting sensors was negligible when subjects were sitting during office work, but increased, depending on the phone model, when subjects were sitting while riding the subway or driving. This result is in line with the findings of previous research15) that examined the reliability of an acceleration sensor under the condition of riding in a motorized vehicle on paved roads. It raises the possibility that physical activity may be overestimated and physical inactivity underestimated when assessments are made with smartphone step counting sensors only. Smartphones also provide GPS information and have built-in acceleration sensors, so developing a hybrid determination algorithm that combines this information to assess physical activity/inactivity in a smartphone application may help improve the reliability of measurements.
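One possible shape for such a hybrid rule is sketched below in Python: steps counted in a time window are discarded when the GPS speed for that window is faster than plausible walking. This is only an illustration of the idea, not an algorithm from the study; the 3.0 m/s threshold, the window structure, and all names are hypothetical assumptions.

```python
# Hypothetical threshold: speeds above this are treated as vehicle travel.
WALKING_SPEED_LIMIT_MPS = 3.0


def filter_steps_by_speed(windows, speed_limit=WALKING_SPEED_LIMIT_MPS):
    """Split step counts into kept and rejected totals using GPS speed.

    windows: list of (steps_in_window, gps_speed_mps) tuples, where
             gps_speed_mps is None when no GPS fix was available.
    Steps are rejected only when a speed is known and exceeds speed_limit;
    windows without a GPS fix are kept unchanged.
    """
    kept = 0
    rejected = 0
    for steps, speed in windows:
        if speed is not None and speed > speed_limit:
            rejected += steps
        else:
            kept += steps
    return kept, rejected


if __name__ == "__main__":
    # Hypothetical windows: walking, driving with false steps, driving, no fix.
    windows = [(10, 1.2), (4, 15.0), (0, 14.0), (7, None)]
    print(filter_steps_by_speed(windows))  # (17, 4)
```

A speed rule alone would not remove false steps recorded while a vehicle is stopped or moving slowly, so accelerometer features would likely be needed as well.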

Practical Implications and Study Limitations

Reliability and validity have been tested in several studies for the development of applications for gait analysis using smartphones12,16,17), but few studies have examined the effects of inter/intra smartphone reliability in actual walking tests. The need has also been shown for smartphone model profiles or advance calibration settings, but very few studies are seen that have demonstrated error detection characteristics during movement. This study therefore provides valuable information on reliability associated with use in actual environments.

Big data analysis using lifelog data from internal smartphone sensors holds potential as a new research method in the fields of occupational epidemiology and occupational ergonomics. For example, in addition to subjective evaluations based on self-administered questionnaires, information on physical activity and on the physical environment, such as illumination and temperature, can be measured simultaneously on smartphones, and improvements in analysis granularity are expected. Using general smartphones of the types owned by most people, time-series data can be measured inexpensively. This method is promising for application to large-scale epidemiological studies. Moreover, in clarifying the relationship between physical activity and various health outcomes, a research topic of much recent interest, the ability to acquire data on activity over time is a major advantage. Further parameters, such as the frequency or intervals of physical inactivity and the cumulative sitting time, can also be obtained. Such new technologies that can gather information comprehensively, continuously, and over long periods in work and life settings appear promising for application to epidemiological research.
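As a rough illustration of how such parameters might be derived from a step-count time series, the following Python sketch treats each maximal run of consecutive zero-step intervals as one inactivity bout. The input format (steps per fixed-length interval) and the function name are assumptions; note also that zero steps does not by itself distinguish sitting from standing still.

```python
def inactivity_summary(steps_per_interval, interval_sec):
    """Summarize inactivity from per-interval step counts.

    A bout is a maximal run of consecutive intervals with zero steps.
    Returns (number_of_bouts, cumulative_inactive_sec, longest_bout_sec).
    """
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    bouts = []       # length of each finished bout, in intervals
    current = 0      # length of the bout in progress
    for steps in steps_per_interval:
        if steps == 0:
            current += 1
        elif current > 0:
            bouts.append(current)
            current = 0
    if current > 0:  # close a bout that runs to the end of the record
        bouts.append(current)
    total = sum(bouts) * interval_sec
    longest = max(bouts) * interval_sec if bouts else 0
    return len(bouts), total, longest


if __name__ == "__main__":
    # Hypothetical steps per 60-second interval.
    print(inactivity_summary([0, 0, 3, 0, 5, 0, 0, 0], 60))  # (3, 360, 180)
```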

For smartphone makers and developers, this study can also provide useful knowledge for identifying the causes of error and improving the reliability of step counting sensors. The step counting sensor defined by the Android OS19) is calculated using a specific algorithm applied to the accelerometer data. The algorithm itself depends on each developer and is generally not disclosed. In particular, regarding false detection by step counting sensors in mobile environments, this research could suggest directions for optimizing the algorithm.

The limitations of this study are that there were only five subjects and that a limited number of smartphone models were assessed. Also, iPhones, which have the top share in the domestic Japanese market, were not included in the assessments. Furthermore, commercially available pedometers with an embedded 3-axis acceleration sensor apply a mask time (i.e., a certain inactivity time): to reduce false detection, steps are not counted when acceleration is not detected continuously for a certain period of time18). Although this algorithm is reasonable for detecting the number of steps, it can lead to overestimation of the amount of physical inactivity. Since the sampling time of the step counting sensor used in this study was 5 seconds, a similar problem may exist. Careful consideration is therefore needed in generalizing the findings of this study. Finally, socio-cultural aspects should be considered when a smartphone-based survey is conducted. In many cases, women do not carry their smartphones in a chest pocket, which limits smartphone-based investigation of women's physical inactivity.

Conclusion

The step counting sensors built into smartphones have a high level of stability in measurements even with different phones of the same model. This suggests the possibility that differences between phone models have a larger effect on measurement reliability than the individual gait characteristics of users. There is also a possibility that in measurements of physical inactivity during travel, physical activity may be overestimated and physical inactivity underestimated depending on the smartphone model. These results suggest the need for appropriate regulation of step counts measured by sensors through means such as correction or calibration with a predictive model formula in order to obtain the highly reliable measurement results that are sought in scientific investigation.

Acknowledgments: This work was supported by JSPS KAKENHI Grant Numbers JP2656078 and JP16H03137, Japan.

Conflicts of interest: The authors declare that there are no conflicts of interest.

2017 by the Japan Society for Occupational Health