Advance Publication by J-STAGE Journal of Biomechanical Science and Engineering

Model-based three-dimensional(3D)/two-dimensional(2D) image registration methods have been widely applied in measuring 3D kinematics of the knee during dynamic activities. However, the combined effects of bone model compositions (radiodensity vs. homogeneous-density) and the number of fluoroscopic views on the measurement accuracy remained unclear. The current study evaluated experimentally the accuracy of the four model-based 3D/2D image registration configurations on the accuracy of measured knee kinematics, namely homogeneous-density model/single-plane image (HS), radiodensity model/single-plane image (RS), homogeneous-density model/biplane images (HB), and radiodensity model/biplane images (RB). Computed tomography (CT) of the knee and asynchronous biplane fluoroscopic images of the simulated knee motions were collected from a cadaveric knee joint for the evaluation of the registration configurations. The results showed that the use of biplane fluoroscopic images ensured mean absolute errors (MAE) below 0.3 mm and 0.9° in each motion component regardless of the types of bone models. Application of radiodensity model could generate digitally reconstructed radiographs more similar to the fluoroscopic images, diminishing MAE in all motion components and measurement bias. As a result, the RS configuration was capable of reconstructing the 3D knee joint angles with MAE comparable to those obtained using the HB configuration. Among the four tested configurations, the RB configuration was most accurate and least affected by the fast skeleton motions.

The geometric model obtained from the CT preserves the radiodensity of the bone, which allowed further generation of a digitally reconstructed radiograph (DRR) through a ray-tracing technique (Siddon, 1985), a synthetic image showing similar contrast to the real fluoroscopic images. Given the merit of the DRR, the CT-based bone models are commonly used in the 3D/2D image registration tasks (You et al., 2001;Tsai et al., 2010;Postolka et al., 2020;Pitcairn et al., 2018). In order to minimize the radiation exposure on the subjects, on the other hand, MR images may be a potential alternative to the CT images. However, the appearance of the images obtained with standard MR imaging are notably different from those with CT, which inherently limits its use in DRR generation. In situations that the radiodensity of the bones are not available, the bone silhouette generated with flat-shading are used in the shape-matching process for the reconstruction of 3D bone poses (Giphart et al., 2012;Smoger et al., 2017;Moewis et al., 2012;Moro-Oka et al., 2007;Stentz-Olesen et al., 2017). Previous experimental evaluations have shown that while the flat shading is advantageous in computational efficiency, the lack of interior information and non-attenuated bone edges of flat-shaded images negatively affect the accuracy of the 3D/2D image registration (Fregly et al., 2005;Lin et al., 2014). To overcome the limitation, treating voxel intensities interior to the bony surfaces as homogeneous radiodensity and generating synthetic images with the ray-tracing may be a viable alternative to mimic CT-generated DRR.
Single-plane static or mobile fluoroscopy systems (List et al., 2017) provided successive real-time planar images of the joint of interest during the subjects' motions. While the single-plane imaging system allowed a less restricted field of view for motion data acquisition, the 3D/2D image registration with single-plane images were often reported to be less accurate especially in out-of-plane motion components (Postolka et al., 2020;Tsai et al., 2010;Flood and Banks, 2018). Biplane imaging with two x-ray units was demonstrated to topically address the issue with compromises on the effective field of view. Custom-built stereo radiography (Guan et al., 2016;Ivester et al., 2015;You et al., 2001), clinical asynchronous biplane x-ray imaging system (Akbari-Shandiz et al., 2018;Lin et al., 2020; and the configuration of two C-arm fluoroscopes (Barré and Aminian, 2018;Li, Van de Velde and Bingham, 2008) are commonly adopted. To the best knowledge of the authors, the combined effects of the number of image sources (single-plane vs. biplane) and the bone model compositions (homogeneous density vs. radiodensity) on the performance of the modelbased 3D/2D image registration remain unclear.
The current study aimed to evaluate the accuracy of four model-based 3D/2D image registration configurations with different combinations of bone model types (homogeneous-density vs. radiodensity) and fluoroscopic image sources (single-plane vs. biplane) for measuring the 3D kinematics of the knee joint. The same dataset acquired from a validation experiment conducting on a cadaveric knee joint was used for all the evaluations.

Specimen preparation and simulated knee motion
A left knee specimen consisting of the femur, tibia, fibula, patella and surrounding soft tissues was obtained from a male donor who had under-gone above-knee amputation procedures, and was stored at -70°C immediately after harvest for use in the validation experiment. The knee joint was free from any diagnosed abnormalities. Prior to the experiment, the knee specimen was defrosted for 24 hours at 4°C. The proximal end of the femur and distal end of the tibia were attached with plastic-made bases (each with a mass of 150g) via bone cement. The femoral base was attached to a fixed frame and the tibial base was free to move in space. This enabled the flexion/extension of the joint by applying a small contact force to the tibial base with minimal interference to the natural motion of the knee (Wilson et al., 2000) ( Fig. 1A). To record the real-time 3D movement of the femur and tibia, clusters composed of four 7-mm infrared retroreflective markers were rigidly implanted into respective bones. Overall, four clusters were affixed onto the femur and three clusters were attached to the tibia. At a controlled room temperature of 19°C, four trials of isolated knee flexion and extension motions were simulated by manually bending and extending the specimen at four different cadences (ranging from 0.4 to 0.8 cycles/s) guided by a metronome while allowing variations from the specified cadence. This enabled rotation speeds distributed over a speed range commonly found in daily activities. This study was approved by the Institutional Review Board of the China Medical University Hospital (approval number: DMR101-IRB1-139).

Image acquisition
During each trial of the simulated knee motions, asynchronous biplane fluoroscopic images were acquired using a clinical biplane x-ray imaging system (Allura Xper FD20/20, Philips Medical Systems, Amsterdam, The Netherlands) . The two flat-panel detectors were configured orthogonally to each other (Fig. 1A), which allowed the recording of the lateromedial and anteroposterior views of the knee motion (Fig. 1B). The x-ray source-to-detector distances were set at 1.0 m and 1.2 m, respectively. The specimen was placed around the isocenter of the two x-ray views with the x-ray source-to-specimen distances of approximately 930 mm and 740 mm, respectively. The interleaved biplane fluoroscopic images at a resolution of 512 × 512 pixels and a frame rate of 30 fps for each view were collected ( Fig. 2A). Meanwhile, the trajectories of the markers on the specimen were also recorded using a motion capture system consisting of fourteen infrared cameras (VICON 612, Oxford Metrics, Oxford, UK) for the definition of an independent standard reference. The intrinsic and extrinsic parameters describing the reconstructed biplane imaging environment were determined with reported calibration procedures prior to motion data acquisition . The knee joint with marker clusters underwent a CT scan at a peak voltage of 120 kVp and tube current of 180 mA (Optima CT660, GE Healthcare, Chicago, IL, USA) to acquire the image stack of radiodensities (voxel size: 0.7 mm × 0.7 mm × 0.625 mm).

CT-derived volumetric bone models
The CT slices of the cadaveric knee were semi-manually segmented to isolate the regions of the femur and tibia. A sub-volume that covered the complete region of the target bone was extracted from the original data set. Within the sub-volume, the radiodensity information of the bone was retained while the voxel intensities exterior to the bone were set to -1000 ( Fig. 2B). To mimic the situation that the radiodensity information of the bone is not available, a new subvolume was duplicated and the voxel intensities of the bone were set to a constant value obtained by averaging the intensities of the bone (Fig. 2C). As a result, the original image sub-volume was taken as the volumetric bone model containing radiodensity information while the processed sub-volume gave the bone model with homogeneous density. The 3D geometric models of the femur and tibia were also created from the segmented regions with the marching cubes algorithm (Lorensen and Cline, 1987). The geometric models were used to determine the anatomical coordinate system (ACS) of the bone using an automated determination method (Miranda et al., 2010).

Model-based 3D/2D image registration
The pose parameters of the femur and tibia with respect to the fluoroscopy coordinate system (FCS) were determined separately by best-matching the model-projections to the fluoroscopic images using the four model-based 3D/2D image registration configurations. Each of the configurations was established with different bone model types and numbers of fluoroscopic views, and was termed as homogeneous-density model/single-plane image (HS), radiodensity model/single-plane image (RS), homogeneous-density model/biplane images (HB), and radiodensity model/biplane images (RB).
For the radiodensity and homogeneous-density bone models, the original intensity values (i.e., Hounsfield scale) were converted to the linear attenuation coefficients. To generate a DRR, virtual rays were cast from the x-ray source through the volume of the bone model to each pixel of the image plane. The attenuation coefficient of voxels of the bone model went through by virtual rays were accumulated, representing the attenuation of the x-rays as they pass through the bone (Siddon, 1985). The intensity values of the DRR were thereafter determined using the attenuated information. To speed up the process of DRR generation, we implemented a ray-tracing with trilinear interpolation method (Otake et al., 2012). Note that since the radiodensity and homogeneous-density bone models are different in compositions, the appearances of the resulting DRRs were also visibly different (Figs. 2D and 2E).
The 3D/2D image registration was executed in a framework of an optimization procedure where the design variables were six pose parameters of the bone and the objective function was the score representing the similarity between the DRR and the fluoroscopic image. In the current study, if only the lateromedial view fluoroscopic images were used for the image registration, the genetic algorithm (population size = 50; generation size = 90) (Goldberg, 1989) was used to search for the optimal pose parameters of the bone model such that the generated DRR best-matched the fluoroscopic image. In cases when the biplane fluoroscopic images were used, an image registration procedure termed model-based interleaved biplane fluoroscopy image tracking (MIBFT) (Lin et al., 2020) was implemented to determine the pose parameters of the bone models.
In order to evaluate objectively the performance of the four tested approaches (i.e., HS, RS, HB and RB), a consistent similarity metrics Gradient Correlation (GC) (Penney et al., 1998) was used to quantify the image similarity in the current evaluation. For the computation of GC values, the horizontal Sobel operator was applied on the DRR and the fluoroscopic images to create gradient images. The normalized cross correlation (NCC) between the two gradient images was then computed (Penney et al., 1998). In the same fashion, the NCC was also computed from gradient images created using the vertical Sobel operator. The sum of the two NCCs then gave the final value of GC. A particle filter-based 2D/2D template registration (Lin et al., 2020) was employed to update the initial pose parameters prior to the 3D/2D image registration.

Data analysis
The cluster markers were used to determine the standard reference, from which the markers presenting on the CT were extracted and expressed in the corresponding ACS, and the markers obtained from the motion capture system were expressed in FCS using a spatial transformation obtained from a calibration procedure . Co-registration of the marker clusters in ACS ad FCS gave the reference poses of the bone (Söderkvist and Wedin, 1993). Reference knee kinematics were obtained from the reference poses of the femur and tibia, from which the angles were decomposed into flexion(+)/extension(-), abduction(+)/adduction(-) and external(+)/internal(-) rotations following the joint coordinate system proposed by Grood and Suntay (Grood and Suntay, 1983).
The errors of the bone pose parameters were quantified by comparing the standard references and the registered poses obtained from the four tested approaches. The errors in the knee kinematics were computed as the differences between the standard reference kinematics and those derived from the registered bone poses. For each trial (between 41 to 86 frames per trial), mean absolute errors (MAE) and mean errors (ME) were determined for each registration method, where the MAE represents the overall errors of each motion component and the ME represents the measurement bias. The means and standard deviations of the MAE and ME across all trials and bones were obtained. The effects of motion speeds in translation and rotation components on the deviations in the registered bone poses were evaluated.

Bone pose MAE and ME
The four model-based 3D/2D registration configurations were evaluated by quantifying the MAE and ME of the bone six-degree-of-freedom poses (Table 1 and 2). In general, utilization of biplane fluoroscopic images yielded accurate spatial bone positions with both MAE and ME below 0.3 mm (Table 1 and 2). The configuration RS with single-plane image and radiodensity bone model gave out-of-plane translation errors (Tz) up to 4.7 mm in MAE (Table  1) and -4.2 mm in ME ( Table 2). The configuration HS yielded maximum MAE of 8.2 mm and 1.5° in translation and rotation components, respectively (Table 1). When compared to the configurations using homogeneous-density model for image registration, utilization of radiodensity model diminished translation MAE ranging from 0.1 mm to 3.5 mm and rotation MAE ranging from 0.2° to 0.7° (Table 1). Similar tendencies were also found in the ME, and the bias of bone poses was lowered with radiodensity models (Table 2).

Knee joint kinematics
Among the four image registration configurations, the RB gave the most accurate reconstruction of the 3D knee kinematics with MAE below 0.3 mm or 0.6° in all translation and rotation components (Fig. 3). The utilization of biplane fluoroscopic images allowed more accurate determination of knee joint translations (MAE < 0.9 mm) regardless of the types of bone models used (Fig. 3). The utilization of homogeneous density models for the image registration gave an MAE of less than 1.2° in rotations (Fig. 3). The RS configuration was capable of reconstructing the 3D knee joint angles with MAE comparable to those obtained using the HB configuration.

Effects of movement speeds
The scatter plots and the moving averages of the absolute translation and rotation deviations against the movement speeds showed that the RB configuration gave the lowest pose deviations, the HS showed the highest deviations, and the performance of the RS and HB were in-between (Fig. 4). As indicated by curves of moving averages, the translation deviations obtained with four configurations were not associated with translation speeds, but the rotation deviations obtained with HS, RS and HB were increased with translation speeds (Fig. 4). Overall, the RB configuration was the one least affected by the fast skeleton motions (Fig. 4), the deviation values of moving averages being consistently below 0.4 mm and 1° over translation speeds from 20 mm/s to 200 mm/s and rotation speeds from 20°/s to 150°/s.

Discussion
The current study compared the accuracy of four model-based 3D/2D image registration configurations with different bone model types and number of fluoroscopic views in measuring 3D knee kinematics during dynamic activities. The results showed that using single-plane fluoroscopic images for model-based image registration had the greatest errors in out-of-plane translations whether the homogeneous-density or radiodensity model was used (Table 1), in agreement with previous findings (Fregly et al., 2005;Postolka et al., 2020;Tsai et al., 2010). It is difficult to compare directly the current findings with those reported in the literature considering the differences in the algorithms and experiment setup used. For example, in comparison to previous static evaluations using cadaveric specimen (Tsai et al., 2010), the current validation study conducting on fast dynamic motions of the knee definitely increased image artifacts associated with motions and tissue overlapping. The abundant soft tissues covering the current knee specimen have also compromised the contrast of the bone with respect to the neighboring regions on the fluoroscopic images ( Fig. 2A), and thus the effects were varying over different knee joint angles. Not well-distinguishable bone outlines at certain image frames affected the measurement accuracy of the bone poses, especially in the out-of-plane translation component while the HS and RS were used (Tables 1 and 2).
Configuration of stereo x-ray imaging can be achieved via biplane fluoroscopy, from which the pose parameters are determined by simultaneously best-matching the pair of DRRs to the pair of fluoroscopic images. In such a way, the out-of-plane motion components (e.g. Tz and θx) for one fluoroscopic view were practically the in-plane motion components of the other view (e.g. Tx and θz) if two views were configured orthogonally. Given the benefit of the stereo imaging, the bias (Table 2) and thus the MAE (Table 1) in Tz component (i.e., the original out-of-plane translation) were largely compensated regardless of the models used for registration. Both the HB and RB methods were shown to yield registration errors with similar magnitudes among six motion components of the bone (Table 1). These characteristics enabled a more precise description of the knee joint kinematics especially in joint translation components (Fig. 3).
The composition of the model content (homogeneous-density vs. radiodensity) was found to affect the accuracy of the 3D/2D image registration whether single-plane or biplane images were employed for analysis (Tables 1 and 2). Since the homogeneous-density model neglects the differences of attenuation coefficients between the cortical and cancellous bones, the DRRs created from the accumulation of such the attenuation information were thereafter visibly different in details from DRRs generated using the radiodensity model (Figs. 2D and 2E). The radiodensity modelderived DRRs were generally more similar to the fluoroscopic image as indicated by the lower negative gradient correlations (Fig. 5), which were intuitively associative to the reduced errors (Table 1) and bias (Table 2). Comparing the changes of the objective function values at various deviated bone poses, the use of the radiodensity model gave an objective function more sensitive to the bone pose deviations in the x-and y-axis translations and z-axis rotation (Fig.  5). This feature may be beneficial to lowering the number of local minima in optimization, leading to a more reliable image registration outcome. As a result, the use of the radiodensity model (RS) was shown to allow more accurate determination of knee joint angles than those using the homogeneous-density model (HS) and even comparable to those obtained with the HB (Fig. 3).
The homogeneous-density model implemented in the current study was considered an ideal bone model created from a hypothetical scenario that the outer surface of the bone geometry was identical to that of the CT-derived radiodensity model. In fact, in situations when the CT is not available, the models created using MR images (Moewis et al., 2012;Moro-Oka et al., 2007) or statistical shape models (Tsai et al., 2015) possibly presented morphological discrepancies to the CT-derived model. It should be noted that the small shape differences have also been demonstrated to affect notably the registration accuracy especially in the out-of-plane translation component (Moewis et al., 2012).
The effects of the translation speeds and rotation speeds on the registration errors were quantified with the setup of the current validation study as the fluoroscopic images were acquired at various rotation speeds of the knee joint motion. The effects of the motion speeds on the registration accuracy were represented in a twofold aspect. The most direct effect was on any of the image artifacts associated with the bone motions. Since the pulsed fluoroscopy system with a short pulse interval was employed in the current study, the motion artifacts were considered minimal which might merely have subtle effects on the registration errors. Another effect was on the inter-frame displacement of the bone, of which the higher motion speeds would directly enlarge the displacement of the bone between successive image frames. The more the initial pose parameters were away from the true poses, the lower probability of an optimization process might converge to the global minimum. The phenomenon has been shown in previous capture range analysis studies (van de Kraats et al., 2005;Lin et al., 2019;Lin et al., 2014). Although the template matching algorithm was used to diminish the inter-frame displacement, some of the components of inter-frame displacement were not fully eliminated (Lin et al., 2020). In general, increase of translation speeds of the bone did not consistently increase the overall translation errors regardless of the methods used (Fig. 4). However, the HS, RS and HB methods didnot produce consistently low rotation errors as the rotation speed increased (Fig. 4). It appeared that only the radiodensity model combined with the biplane images (RB) allowed diminishing the negative effects associated with the fast bone movements.

Conclusions
The accuracy of four model-based 3D/2D image registration configurations with different bone model types and numbers of fluoroscopic image sources on the determination of 3D kinematics of the knee joint was quantified. The use of biplane fluoroscopic images improves the measurement accuracy of the six motion components of the bone and thus the precise description of the six-degree-of-freedom kinematics of the knee. Application of radiodensity bone models further diminished the measurement errors. The RS configuration enabled the accurate reconstruction of 3D knee joint angles with errors comparable to those obtained using the HB. The current results suggest that of the four configurations a biplane fluoroscopy combined with radiodensity models will produce the most accurate 3D bone poses least influenced by the bone motion speeds, which will be preferred in clinical applications. Further studies with increased sample size may help provide inferential statistics to confirm further the current findings.