2022 Volume 98 Issue 3 Pages 93-111
The cerebral cortex performs its computations with many six-layered fundamental units, collectively spreading along the cortical sheet. What is the local network structure and the operating dynamics of such a fundamental unit? Previous investigations of primary sensory areas revealed a classic “canonical” circuit model, leading to an expectation of similar circuit organization and dynamics throughout the cortex. This review clarifies the different circuit dynamics at play in the higher association cortex of primates that implements computation for high-level cognition such as memory and attention. Instead of feedforward processing of response selectivity through Layers 4 to 2/3 that the classic canonical circuit stipulates, memory recall in primates occurs in Layer 5/6 with local backward projection to Layer 2/3, after which the retrieved information is sent back from Layer 6 to lower-level cortical areas for further retrieval of nested associations of target attributes. In this review, a novel “dynamic multimode module (D3M)” in the primate association cortex is proposed, as a new “canonical” circuit model performing this operation.
The cerebral cortex is organized into six stacked layers (Layers I–VI, L1–L6), an arrangement that increases its computational efficiency. Each cortical layer is endowed with unique combinations of cell types, inter-laminar connections, and long-range input/output with other brain areas.1)–9) Information arriving at the cortex is processed by a local circuit that traverses these six layers. Morphological investigations have revealed some characteristic features of local inter-laminar connections between local neurons in different layers.1),2),5),10)–12) Based on the morphological and some physiological observations accumulated most intensively in primary sensory areas, such as the primary visual area (V1), several local signal-flow models have been proposed that depict a major feedforward path and a series of nested positive and negative feedback loops (i.e., recurrent loops).2),13) A diagram of a local circuit with presumptive signal flows is often called the “canonical” cortical circuit model,4),14) with a similar basic structural and functional organization of neuronal circuits expected throughout the cortex. However, direct physiological support in vivo for the signal flow in this model has been meager outside primary sensory areas, although there is abundant morphological and ultrastructural evidence for cell-type specific connectivity and quantitative synaptic input strength.2),11),15)
The idea that there is a fundamental cortical processing unit is partly connected with the concept of cortical columnar organization16),17) (but see below for the possible difference in the local circuit organization between columnar and non-columnar animal species). Dense intracortical connectivity in horizontal directions (parallel to the layers) is also generally restricted to an order of a few hundred microns.11) Historically, Hubel and Wiesel’s proposal16),18) for circuits that underlie simple and complex cells in V1 was the most influential model of physiological local circuit operation. This proposal highlighted a serial feedforward processing generating the orientation-selective receptive field of simple cells in Layer IV and the spatial invariant receptive field of complex cells in more superficial layers (Layers II and III). Thereafter, laminar differences in neural coding have been physiologically examined extensively in V13),6),10),19)–21) as well as in the primary somatosensory area (S1).22),23) From these studies, the classic pictures of “canonical” circuit models have emerged, which proposed two canonical operations within sensory cortical areas: (i) a feedforward computation of response selectivity (e.g., orientation selectivity) and (ii) an intracortical recurrent computation of response gain (e.g., “normalization” of strong/weak external inputs),14),21),24),25) as detailed in the next section.
It remains elusive, however, whether a similar functional organization of neuronal circuits is also at play in the higher association cortex, or, more specifically, whether and how different layers of the higher association cortex implement the computation for distinct steps of high-level cognition such as memory and attention.26)–35) This question is becoming a more urgent issue because recent deep-learning networks in artificial intelligence have been revealed to have similar network architectures to the primate temporal association cortices.36)–38) The first key biological question is to assess whether different types of cognition-related neurons are situated in distinct layers (laminar module model) or distributed throughout the cortical layers (non-laminar functional model).
Previously, the paucity of techniques for an accurate identification of cortical layers from which neuronal recordings were made in behaving animals was a major obstacle. This precluded carrying out any detailed investigation of neural activity in different layers of higher association cortices. High-level cognition has been predominantly investigated in monkeys, thus warranting a new technique that can be applied to behaving monkeys. Recently, magnetic resonance imaging (MRI)-based and current source density (CSD)-based approaches have been developed for investigating local circuit dynamics in the association cortex of monkeys, particularly memory recall dynamics in the temporal association areas.39),40)
This review briefly introduces the anatomical background of the cortical local circuit organization and provides an example of memory recall process to assess how different layers of the higher association cortex implement the computation for distinct steps of high-level cognition. It will thereby clarify the activation of distinct paths of the morphologically identified local connectivity in a distinct subprocess of memory recall across time sequences, revealing a different model from the feedforward information processing and recurrent control of response gain that were identified in the primary sensory areas. These findings will lay the foundation for the proposal of a new cortical circuit model, named the cortical “dynamic multimode module (D3M)” in the association cortex instead of the classic canonical circuit model.
The six layers of the cerebral cortex are arranged parallel to the cortical surface, numbered from the outer surface of the brain to the white matter (Fig. 1A). These layers are generally classified according to the most prominent cell type in a certain layer. Layer I (L1) is the most superficial layer and termed the molecular layer. It principally harbors axons, dendrites, and axon terminals of neurons, with their cell bodies located in deeper layers. Layers II and III (L2 and L3) predominantly comprise small pyramidal-shaped cells. Neurons located deeper in Layer III are typically larger than those located superficially. The axons of pyramidal neurons in Layers II and III project to other cortical areas as well as locally to other neurons within the same cortical area, thereby mediating intra-/inter-areal communication (Fig. 1B). Layer IV (L4) contains many small spherical neurons and is called the internal granular (G) cell layer. It is most prominent in primary sensory areas and is the major recipient of sensory input from the thalamus. Layer V (L5), the internal pyramidal cell layer, principally contains pyramidally shaped cells that are typically larger than those in Layer III. Pyramidal neurons in this layer give rise to the major output pathways of the cortex, projecting to other cortical areas and subcortical structures (Fig. 1B). The neurons in Layer VI (L6) are heterogeneous in shape; therefore, this layer is termed polymorphic or multiform, and it carries axons to other cortical areas.
Six-layered structure of the cerebral cortex and a “canonical” circuit model. A, Histological sections arranged perpendicular to the cortical surface. Golgi staining identifies neuronal and glial cells and their processes (left), Nissl staining identifies neurons (middle), and Weigert staining identifies fibers (right). I–VI, Layer I–Layer VI. Modified from Heimer (2012)113) with permission. B, Summary of inputs, outputs, and intrinsic excitatory connections of a generic, nonprimary visual area of the primate cerebral cortex. Modified from Shipp (2007).5) C, A “canonical” cortical circuit model proposed by Douglas & Martin (2004)4) on the basis of studies mostly in the primary visual cortex. Thalamic relay cells (Thal) mainly form synapses in Layer IV (L4P). In all layers, neurons form recurrent connections. Layer IV contains a specialist excitatory cell type, the spiny stellate cell, which projects to pyramidal cells in Layer III (L3P) and inhibitory cells in Layer IV and other layers. The superficial layer pyramidal cells connect locally and project to other areas of cortex. The deep layer pyramidal cells also connect recurrently locally and project to subcortical nuclei in the thalamus, midbrain, and spinal cord. The major feedforward information flows highlighted by Hubel & Wiesel (1977)16) and Felleman and Van Essen (1991)9) are shown in red arrows. Modified from Douglas & Martin (2004),4) but directly from the original correct figure kindly provided by Prof. Kevan Martin (personal communication, 2021).
The local circuit connectivity has been previously modelled in various ways. One of the most popular models, by Douglas & Martin (2004), was based on studies of V1 and is often called the “canonical” cortical circuit (Fig. 1C).4) In the original form of this model, Douglas & Martin (2004) emphasized the roles of a series of nested positive and negative feedback loops (“recurrent circuits”) and the balance between the impacts of excitatory and inhibitory neurons.4) The major feedforward path [red arrow from the thalamus to Layer IV (L4), Fig. 1C] has been most intensively examined in the emergence of orientation preference in V1. It generates the core orientation preference of the simple cell in Layer IV by optimally orienting monosynaptic input from thalamic neurons that have non-oriented receptive fields.16),41) Furthermore, the feedforward path generates a spatial invariant complex cell receptive field in the supragranular (SG) layers (Layers II and III) (red arrow from L4 to L3, Fig. 1C).
Recurrent circuits with a balance between excitatory and inhibitory neurons are likely to selectively amplify the selectivity of the major feedforward path, i.e., to lower or raise response gain through inhibition or disinhibition, respectively.42)–44) Recent developments in the identification of cell-type-specific gene markers in mice have enabled histochemical targeting of specific subsets of cortical inhibitory neurons, dividing cortical inhibitory neurons into three non-overlapping categories in mice: those that express parvalbumin (PV), somatostatin (SOM), or vasoactive intestinal peptide (VIP).12),45)–47) In rats, PV, SOM, and calretinin (CR) cells form non-overlapping categories.48) These studies suggested many specific local circuits that may contribute to the recurrent gain control in V1.49)–51) However, a comprehensive review of these studies falls beyond the scope of this article given the possible difference in local circuit organization of V1 between columnar (e.g., primates) and non-columnar (e.g., mice) animal species.
As noted previously, the original canonical circuit model assumes a ‘columnar organization’ of response selectivity (e.g., orientation preference in V1 of cats and primates is invariant across the layers at a given cortical position). However, in rodents, only the spatial position of the receptive field is invariant across the layers (retinotopic map) and properties such as orientation selectivity are uncorrelated from cell to cell along the vertical direction across cortical layers (a “salt-and-pepper” organization).52)–54) Therefore, we remain unsure, at the present stage of conceptualization, whether cell-type-specific local circuits that may contribute to the recurrent gain control in mouse V1 also function similarly in the monkey association cortex. Discussion of cell-type-specific local circuits will be left open in this review.
The following sections will shed light on a different view of the association cortex from the classic canonical models, demonstrating that some cognitive processes, such as memory recall, are generated by different activation dynamics of the local circuit from those known in V1 during sensory processing. The new view highlights, rather than the major feedforward path (red arrows in Fig. 1C, magenta arrows in Fig. 7), the signal flows within and through deep cortical layers (red arrows in Fig. 7).
A major obstacle in addressing questions about local circuit dynamics in the association cortex is the absence of techniques to reliably localize recorded single neurons at a six-laminar resolution in each microelectrode penetration with task-trained monkeys. It is difficult to apply conventional techniques used for acute experiments on the primary sensory cortices of anesthetized animals, e.g., lesion marking on each electrode penetration, to chronic experiments in the association cortices of task-trained primates, because the lesion marks on individual penetrations do not persist for months to years. Recently, two novel approaches were developed: (i) an MRI-based microelectrode-tip localization procedure in task-trained monkeys,39) and (ii) an application of the CSD analysis with a multi-contact linear-array microelectrode.40)
MRI-based method. The position of the tip of a microelectrode inserted into the monkey cortex (Fig. 2A) is typically invisible on conventional MRI, because of the partial volume effect on a tiny electrode tip even with the smallest MRI voxels for primates. However, it can be detected on high-resolution structural MR images with enhanced detectability39) (Fig. 2B and E depict the electrode image becoming extremely thick through enhancement, with preserved spatial resolution along the electrode). Locations of the recorded neurons were reconstructed in the MRI volume by referring to the locations of the microelectrode tip on each recording track. This procedure is essentially identical to the conventional procedure used in acute experiments, in which the locations of the recorded neurons were reconstructed on histological sections by referring to the location of an electrolytic lesion mark on each recording track.16),18) The spatial resolution of this method was demonstrated by comparing the MR images of the electrode tip with subsequent post-mortem histological sections (see Fig. 2D and G for the electrolytic lesion that marks the tip position). This method identified the tip location with a single voxel accuracy of the MR image (<100 µm).39)
New MRI-based method for localization of the microelectrode tip in behaving monkeys. A, Schematic drawings of a lateral view (left) and a plane of the brain with the microelectrode inserted (right). The microelectrode (blue line) was inserted at a 90° angle to the superior temporal sulcus (STS), and the B0 angle of the MR scanner was maintained at more than 60° during MR scanning. The framed area denotes the position of the panels in B–D. B, MR image of the brain with the inserted microelectrode. C, D, Histological sections stained for myelin (C) and Nissl body (D) corresponding to the MR image in B. (E–G) Enlarged images from the framed areas in B–D. The cortical location of the microelectrode tip on the MR image matches that of the lesion mark (arrowheads) on the histological section. The histological sections were corrected for shrinkage. LS, lateral sulcus. Scale bars, 5 mm (B–D) and 2 mm (E–G). Modified from Matsui et al. (2007).39)
The MRI-based method requires the acquisition of high-resolution MR images (preferably with an MR scanner with a field >4 T). An alternative method using a CSD analysis55),56) and a linear-array multi-contact microelectrode57) has also been developed and refined.40) This method is more easily accessible to numerous laboratories that do not have access to a high-field MR scanner. The CSD reflects the gross transmembrane currents in a local neuronal ensemble and provides a physiological index of the location, direction, and density of synaptic transmembrane current flow at the corresponding depths of the cortex.55),56) The CSD, either at each trial or averaged across trials, is calculated from the depth profiles of the stimulus-evoked local field potentials (LFPs) recorded at different depths in the cortical tissue using a three-point-formula that approximates the second spatial derivative of the voltage recorded at each recording contact.55)–58) The CSD at the n-th contact, Dn, was calculated as follows:
\begin{equation*} \text{Dn} = -[\varphi (\text{n} + 1) + \varphi (\text{n} - 1) - 2 \varphi(\text{n})] / \Delta^{2} \end{equation*}
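The three-point formula above can be illustrated with a minimal sketch. It assumes, based on the surrounding description, that φ(n) is the LFP recorded at the n-th contact at a single time point and that Δ is the spacing between adjacent contacts. The function name `csd_three_point` and the example depth profile are hypothetical and are not taken from the cited studies.

```python
from typing import List


def csd_three_point(lfp: List[float], spacing: float) -> List[float]:
    """Three-point CSD estimate for the interior contacts of a linear array.

    lfp[n] is the potential at contact n for one time point (assumed reading
    of phi(n)); spacing is the inter-contact distance (assumed reading of
    Delta). The returned list covers contacts 1 .. len(lfp) - 2, because the
    outermost contacts lack one neighbour.
    """
    if len(lfp) < 3:
        raise ValueError("need at least three contacts")
    d2 = spacing ** 2
    return [
        -(lfp[n + 1] + lfp[n - 1] - 2.0 * lfp[n]) / d2
        for n in range(1, len(lfp) - 1)
    ]


# Hypothetical depth profile: a negative LFP deflection confined to contact 2.
profile = [0.0, 0.0, -1.0, 0.0, 0.0]
csd = csd_three_point(profile, spacing=1.0)
# csd lists the estimates for contacts 1, 2, and 3; with this formula the
# value at contact 2 is negative and its two neighbours are positive.
```

With the sign of the formula as written, the contact at the trough of the potential receives a negative value flanked by positive values. Reading a negative value as a current sink is a common convention, but it is an assumption here because the text does not state the sign convention explicitly.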
The CSD analysis was carried out to estimate the cortical layers that received afferent inputs56),58) and, in V1, the earliest current sink induced by an optimally oriented visual stimulus provided a good estimate of the location of the geniculo-cortical afferent synapses, the G layer (Layer IV).56) However, it remained unknown whether the location of cortical afferents may be estimated in the association cortex. We provided the first report documenting success in the monkey temporal association cortex,40) as detailed in the next section.
This procedure identified the earliest current sink effectively only when the CSD was calculated from the responses evoked by the optimal stimuli for the recorded cortical patch/column. This requirement was reasonable and feasible for V1. However, optimal stimuli in the association cortex are not always known during the recording, except for some specific experimental designs, such as recording from a face-patch or from a memory-patch in the temporal association cortex. Failing to use optimal stimuli in the depth profile recording may result in the CSD providing an idiosyncratic and uninterpretable pattern, differing from the “canonical CSD pattern” (see below).
We applied the above-mentioned novel methods to analyze memory recall in monkeys using a cued-recall task. A cued-recall task tests memory by presenting a participant with cues, such as words or objects, that help with the recall of previously experienced stimuli. Cued recall also occurs frequently in everyday life. We devised a variant of the cued-recall task, the paired associate task, for analyzing neural dynamics in the temporal association cortex in monkeys (Fig. 3). The original version of the paired associate test is included in the Wechsler Memory Scale-Revised (WMS-R), one of the most widely used neuropsychological batteries for assessing explicit/declarative memory performance in humans. In the verbal version of the test, a group of eight word-pairs is read to the patient (Fig. 3A). The patient is then required to recall the second word of the pair on reading the first word. Similarly, a visual paired associate task was designed for monkeys,59) which involved the preparation of 24 computer-generated pictures for each monkey. The geometrically distinct patterns were sorted into pairs (Fig. 3B). The combination of the paired associates was unpredictable, and the monkeys learned through trial and error. Monkeys obtained fruit juice as a reward for correctly touching the paired associate (Fig. 3C). An analysis of the shortest reaction time and the shortest neural response latency for the paired associate of the cue stimulus suggested that memory retrieval principally occurs during the delay period (for the time course of the neuronal recall signal, see Fig. 6E in which a cue stimulus was presented for 0.3 s),28),60),61) as also confirmed by a reaction time analysis of a variant of this paired associate task, the paired associate with color switch task.62)
Paired associate test for assessing explicit/declarative memory performance in humans or monkeys. A, Verbal version of the paired associate test included in the Wechsler Memory Scale-Revised (WMS-R). A group of eight word pairs (SET I of the WMS-R, left) is read to the patient. The patient is then required to recall the second word of the pair (Recall, right) on reading the first word. B, A set of 24 computer-generated pictures for the visual paired associate task designed for monkeys. The geometrically distinct patterns were created for each monkey and sorted into pairs. The combination of the paired associates was not predictable and the monkeys learned through trial and error. C, In each trial of the monkey task, a cue stimulus was presented on a video monitor for 0.3–1 s following a fixation period for 0.5–1 s. After a 2–4 s delay period, they were presented with a choice of two stimuli (the paired associate of the cue and one from a different pair). Monkeys obtained fruit juice as a reward for correctly touching the paired associate.
Monkeys were trained to perform a visual paired associate task, in which they had to retrieve the learned paired associate in response to the presented cue stimulus (Fig. 4A). We recorded unit activities and LFPs by inserting linear-array multi-contact microelectrodes (16 or 24 contacts with spacings of 150 µm or 100 µm, respectively) vertically into area 36 (A36) of the temporal association cortex (Fig. 4B).40) Subsequently, the CSD was calculated from depth profiles of stimulus-evoked LFPs to physiologically estimate the position of the G layer. A representative CSD profile exhibited the earliest current sink (Fig. 4D and E, asterisks) at the contact corresponding to the histologically verified G layer (Fig. 4C, red). The latency of the earliest current sink was approximately 90 ms following the cue onset. This earliest current sink was followed by sinks at more superficial contacts and by sources at deeper contacts (Fig. 4D). We termed this the “canonical CSD pattern”. Similar CSD profiles were consistently observed for all penetrations in A36,40) as also recently confirmed in recordings from area V4 of the monkey temporal cortex.63) Postmortem histological analyses showed that the earliest current sink evoked by cue stimuli consistently corresponded to the G layer [the distance between the contact with the earliest current sink and the center of the G layer was 79 µm (median), n = 6 penetrations]. Histological verifications, together with consistent “canonical” CSD profiles across penetrations, demonstrated that the CSD profiles could be reliably used to estimate the granular layer (G), the supragranular layer (SG), and the infragranular layer (IG).40)
CSD-based laminar localization procedure assessing the location of Layer IV (granular layer, G) of the temporal cortex. A, Sequence of the paired associate task (see Fig. 3C). B, Lateral (left) and ventral (right) views of the monkey brain. Neuronal activities and local field potentials (LFPs) were recorded across cortical layers in area 36 (blue) using a linear-array multi-contact electrode. C, Electrolytic lesion marks made at two contacts of the electrode (yellow contacts) were identified in a Nissl-stained histological section. D and E, Stimulus-evoked CSDs. The earliest significant current sink appeared at 91 ms after cue onset (asterisks on the timeline data of the red contact). The red contact corresponded to the granular layer in histological section (C). Red and green bars in (E) indicate significant and nonsignificant current sinks, respectively. Scale bars, 10 mm (B) and 200 µm (C). Modified from Takeuchi et al. (2011).40)
This “canonical CSD pattern” is obtained only upon using the optimal stimuli for recording (in the above cued-recall experiment, one of the learned paired associate stimuli for the recording from the memory patch in A36). In the absence of an optimal stimulus, the latency of the earliest current sink would be much longer and the earliest current sink would not be reliably located within Layer IV.
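The idea of locating the G layer from the earliest current sink can be sketched as follows. The source does not specify the statistical test used to declare a sink significant, so this sketch substitutes a fixed amplitude threshold. That threshold, the function name `earliest_sink`, the assumption that sinks are negative CSD values, and the toy data are all hypothetical stand-ins for illustration.

```python
from typing import List, Optional, Tuple


def earliest_sink(
    csd: List[List[float]], times_ms: List[float], threshold: float
) -> Optional[Tuple[int, float]]:
    """Return (contact index, latency) of the earliest threshold crossing.

    csd[c][k] is the CSD at contact c and time times_ms[k]. A sample counts
    as a sink when it falls below -threshold (assumed convention: sinks are
    negative). Returns None when no contact crosses the threshold.
    """
    best: Optional[Tuple[int, float]] = None
    for contact, trace in enumerate(csd):
        for t, value in zip(times_ms, trace):
            if value < -threshold:
                # Keep only the first crossing of this contact.
                if best is None or t < best[1]:
                    best = (contact, t)
                break
    return best


# Toy example with three contacts ordered from superficial to deep.
times_ms = [80, 90, 100, 110]
toy_csd = [
    [0.0, 0.0, 0.0, -2.0],   # later sink at a more superficial contact
    [0.0, -2.0, -3.0, -1.0], # earliest sink: candidate G-layer contact
    [0.0, 0.0, 1.0, 1.0],    # source at a deeper contact
]
g_estimate = earliest_sink(toy_csd, times_ms, threshold=1.0)
```

In this toy profile the middle contact crosses the threshold first, mimicking the pattern described above in which the earliest sink is followed by sinks at more superficial contacts and sources at deeper contacts.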
Signal flow analysis by cross-correlation of spike trains of two neurons across different cortical layers. Recording with a multi-contact linear-array electrode, combined with CSD analysis, enabled simultaneous recording of spike trains from two neurons located in different cortical layers, such as G-SG pairs, G-IG pairs, and SG-IG pairs. Then, a cross-correlogram (CCG) of the two spike trains was calculated to estimate the functional interactions between neurons across cortical layers. The peak value of the CCG reflects the strength of the interaction, and the asymmetry (or peak lag) of CCG reflects the direction of functional connectivity between neurons.64)–69) The direction of functional connectivity during each task period was examined by calculating the asymmetry index (AI), which quantified the asymmetry of the CCG peak area against the zero time lag.69)–71) The AI value for each CCG was calculated as follows:
\begin{equation*} \text{AI} = (\text{R} - \text{L})/(\text{R} + \text{L}) \end{equation*}
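A minimal sketch of the asymmetry index is given below. It assumes that R and L are the CCG areas at positive and negative time lags, respectively, which follows from the description of the AI as the asymmetry of the CCG peak area against the zero time lag. The sketch uses raw coincidence counts and omits any baseline or shift-predictor correction that an actual analysis would likely apply. The function names, lag window, and spike times are hypothetical.

```python
from typing import Dict, List


def cross_correlogram(
    ref: List[float], target: List[float], max_lag_ms: float, bin_ms: float
) -> Dict[int, int]:
    """Count target spikes in each lag bin relative to every reference spike.

    Positive bins mean that the target neuron fired after the reference
    neuron. No baseline correction is applied (simplifying assumption).
    """
    n_bins = int(max_lag_ms / bin_ms)
    counts = {k: 0 for k in range(-n_bins, n_bins + 1)}
    for t_ref in ref:
        for t_tgt in target:
            k = round((t_tgt - t_ref) / bin_ms)
            if -n_bins <= k <= n_bins:
                counts[k] += 1
    return counts


def asymmetry_index(ccg: Dict[int, int]) -> float:
    """AI = (R - L) / (R + L), with R and L taken as the summed counts at
    positive and negative lags (assumed reading of the symbols)."""
    right = sum(c for k, c in ccg.items() if k > 0)
    left = sum(c for k, c in ccg.items() if k < 0)
    if right + left == 0:
        return 0.0
    return (right - left) / (right + left)


# Hypothetical spike times (ms) for a reference and a target neuron.
ref_spikes = [10.0, 50.0, 90.0]
target_spikes = [12.0, 53.0, 91.0, 48.0]
ccg = cross_correlogram(ref_spikes, target_spikes, max_lag_ms=5.0, bin_ms=1.0)
ai = asymmetry_index(ccg)
```

In this toy example three coincidences fall at positive lags and one at a negative lag, so the AI is positive, which under the stated assumptions would suggest a net flow from the reference neuron to the target neuron.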
For G-SG pairs, the distribution of AIs of individual CCGs during the cue period was shifted to the feedforward direction, i.e., from G to SG (Fig. 5B, blue). This directional bias was insignificant during the delay period (Fig. 5B, red). The results were further substantiated by the population-averaged CCGs (Fig. 5A), which displayed a prominent peak on the right side during the cue period (G to SG) (Fig. 5A, blue arrow). In G-IG pairs, no bias was observed in the signal flow directions during any of the task periods.
Reversal of interlaminar signal flow between sensory and memory processing revealed by signal flow analysis using the cross-correlation of spike trains of two neurons in different cortical layers. A to C, Population results of the functional connectivity of G-SG pairs. A, Population-averaged CCGs in the fix (gray, left), cue (blue, middle), and delay (red, right) periods, respectively. Blue arrow shows peak shift to the right. B, Asymmetry index (AI) of individual CCGs (blue, cue period; red, delay period). Asterisk reflects significant bias to either side of the histogram. Filled histogram represents a task period for which significant bias in the directionality was observed. C, Schematic diagram of interlaminar signal flow between G and SG. D to F, Same as in A to C, but for SG-IG pairs. D, Blue arrow shows peak shift to the right during cue period, while red arrow shows peak shift to the opposite direction, i.e., to the left, during delay period. C and F, Signal flow from G to SG and from SG to IG during the cue period were just as predicted from the classic canonical feed-forward model. During the delay period when memory recall occurs, however, the direction of signal flow reversed, i.e., from IG to SG (F). Modified from Takeuchi et al. (2011).40)
For SG-IG pairs (Fig. 5, D–F), the distribution of AIs during the cue period was significantly shifted towards the SG to IG direction (Fig. 5E, blue). However, the AI distribution during the delay period exhibited a bias in the opposite direction, i.e., in the IG towards SG direction (Fig. 5E, red). Population-averaged CCGs confirmed the results (Fig. 5D): During the cue period, a significant peak was observed on the right side (SG to IG, Fig. 5D, blue arrow), whereas a significant peak appeared on the left side during the delay period (IG to SG, Fig. 5D, red arrow).
In summary, these results shed light on the signal flow from G to SG and from SG to IG during the cue period (Fig. 5C and F), as predicted from the canonical feed-forward model.4),10) During the delay period when memory recall occurs, however, the direction of signal flow reversed, thus suggesting the recruitment of a laminar feedback pathway (Fig. 5F). In addition, an outward signal flow (from superficial to deep parts) within the IG was observed during the delay period [for details, see Takeuchi et al. (2011)40)].
This study demonstrated that the feedforward signal flow across cortical layers during sensory coding reverses to the feedback direction during memory retrieval, thus pointing to the flexible recruitment of inter-laminar connectivity, depending on the cognitive demands in monkey association cortices (Fig. 5F) — a novel phenomenon beyond the classic canonical model.
Identification of response-types and laminar locations of the neurons that send/receive these signal flows at the resolution of all six layers (for example, to resolve the IG layer into Layers V and VI) requires an MRI-based method. Indeed, no definitive method has been developed yet to allow the use of the CSD-based method for the resolution of all six layers. A recent study conducted such an MRI-based experiment in monkeys performing a visual paired associate task,60) as summarized hereafter.
Laminar locations of the recorded neurons were determined according to a previously described procedure.39),60) Briefly, the spatial location of each neuron in the reconstructed MR volume was determined from two MR images captured at different points in the same electrode track (Fig. 6A, left panels). The reconstructed MRI volume of the monkey was registered onto the postmortem histological volume of the same monkey. The location of neurons in the cortical layers was identified based on the laminar architecture visible in the histological volume. Reconstruction errors estimated using lesion marks were small enough to identify the cortical layer to which a particular neuron belonged (46 µm, 14 µm, and 54 µm in the anteroposterior, lateromedial, and dorsoventral directions, respectively60)).
Response-types of single cells and their laminar distribution at the resolution of all six layers in memory recall. A, An example of a recording track reconstruction with MRI scan sessions at two different cortical depths (1st and 2nd). Represented are the coronal view of the brain (left column), MR images of the electrode tip at two different depths (column second from the left), the corresponding histological section (top, middle column), and line drawing of laminar positions of recorded neurons (bottom, middle column). Spike density functions (column second from the right) and stimulus coding properties for cue/target [Cue-Holding Index (CHI) and Pair-Recall Index (PRI)] (right column) are depicted for two representative neurons (B1577 and B1582) that are shown in the line drawing panel (light blue circle). The area framed by the dotted rectangle in the MR image and histological section reflects the position of the panel showing the laminar position of the neurons. B, Another example with MRI scan sessions at two different depths for two other representative neurons (O991 and O996). The conventions are the same as in (A). C–E, Distinct dynamics of cue- and target-stimulus coding across layers. C, A Nissl-stained section of A36 showing the six-layered structure. D and E, Time courses of the CHI (D) and PRI (E) for all stimulus-selective neurons in each layer. Each row connotes a single neuron, sorted according to its depth location. Only L5 and L6 unambiguously encoded the to-be-recalled target information during delay period. The PRI onset latency was significantly earlier in L5 than L6, suggesting the emergence of the sought target coding within L5. Modified from Koyano et al. (2016).60)
Figure 6A–B represents the response types and laminar locations of the representative neurons recorded in A36 of the perirhinal cortex (PRC). In addition to a tonic response to the optimal stimulus, the L5 (Layer V) neurons also displayed a tonic response upon presenting the paired associate of the optimal stimulus as a cue (Fig. 6B lower middle panel), thus indicating neuronal coding of learned object-object association memory.28),29),35),61),72),73) In contrast, responses of L6 neurons to the paired associate were increased at the middle of the delay period and were maintained until the presentation of choice stimuli (Fig. 6A upper middle and 6B upper middle panels). Unlike the IG neurons, responses of the L3 neurons were more strongly correlated with cue stimuli than with target stimuli throughout the cue and delay periods (Fig. 6A, lower middle panel). For each neuron, neuronal coding dynamics of memory representation were quantified with the indices that extracted the response components that coded the presented cue [Cue-Holding Index (CHI)] and the to-be-recalled target [Pair-Recall Index (PRI)] (Fig. 6A–B, right column). The CHI and PRI were defined as follows: the instantaneous firing rate of a neuron to the set of 24 stimuli in the paired associate task was denoted as a 24-dimensional vector, F(t): [f1(t), …, f24(t)], where fi(t) is the mean discharge rate at time t from cue onset when stimulus i was presented as a cue. The cue-period responses for the set of 24 stimuli were denoted as a 24-dimensional vector C: [c1, …, c24] and Cp: [cp(1), …, cp(24)], where the i-th and p(i)-th stimuli belong to a pair. CHI and PRI at time point t [CHI(t) and PRI(t), respectively] were defined as the Pearson’s correlation coefficients: CHI(t) = R⟨F(t)|C⟩; PRI(t) = R⟨F(t)|Cp⟩, where R⟨A|B⟩ denotes the correlation coefficient between A and B.60),61)
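The definitions of CHI(t) and PRI(t) can be sketched as follows, assuming that the pairing p(i) is supplied as a list of indices. For brevity the example uses four stimuli in two pairs rather than the 24 stimuli of the actual task. The function names and firing rates are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_a * var_b)


def chi_pri(
    f_t: Sequence[float], c: Sequence[float], pair_of: Sequence[int]
) -> Tuple[float, float]:
    """Return (CHI(t), PRI(t)) for one neuron at one time point.

    f_t[i]: firing rate at time t when stimulus i was the cue.
    c[i]: cue-period response to stimulus i.
    pair_of[i]: index of the paired associate of stimulus i, i.e. p(i).
    """
    cp: List[float] = [c[pair_of[i]] for i in range(len(c))]
    return pearson(f_t, c), pearson(f_t, cp)


# Hypothetical neuron with four stimuli; pairs are (0, 1) and (2, 3).
pair_of = [1, 0, 3, 2]
cue_response = [10.0, 2.0, 4.0, 4.0]   # prefers stimulus 0 as a cue
delay_rate = [2.0, 10.0, 4.0, 4.0]     # fires when the cue is stimulus 1
chi, pri = chi_pri(delay_rate, cue_response, pair_of)
```

Here the delay activity is highest when the cue is the paired associate of the preferred stimulus, so the PRI is high while the CHI is negative, which is the kind of profile the text describes as coding of the to-be-recalled target.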
Memory retrieval signals appeared mostly in Layers V and VI. Laminar differences in memory coding were then assessed quantitatively at the neuronal population level. Differential dynamics for cue- and target-stimulus coding were observed in distinct layers (Fig. 6C–E). In addition to the stimulus-selective visual response in the cue period, neurons in all layers demonstrated significant CHI during the delay period (Fig. 6D), with L5 displaying the highest value. In contrast, the PRIs in L5 and L6 were statistically significant during the delay (Fig. 6E), and higher than those in other layers. This showed some overlapping mnemonic properties in L5 and L6 neurons. Only L5 and L6 unambiguously encoded the to-be-recalled target information during the delay period. Thus, these results supported the laminar module hypothesis, in which the target information in cued-recall is predominantly represented by the IG laminar module.
Differential memory recall dynamics in L5 and L6. Within the IG layers, L5 and L6 have different cytoarchitectonic and cytochemical structures,74),75) pointing to their distinct functional roles. Although target representation was present in both L5 and L6 during the delay period, the target-representing activity was already observed during the cue period in L5 (Fig. 6E), but much less so in L6. A comparison of the PRI onset latency between L5 and L6 indicated that the recall process occurred significantly earlier in L5. In L5, PRI began to increase in the middle of the cue period, with 33.3% of L5 neurons revealing onset latencies <150 ms. This suggested the emergence of the sought target coding within this cortical layer [for details, see Koyano et al. (2016)60); Naya et al. (2003)61)].
A cluster analysis further segregated L6 neurons, but not L5 neurons, into two functionally distinct populations. One group of L6 neurons exhibited a significantly slower increase in the coding of the sought target, with a faster decrease in the coding of the presented cue (“late” group) than the other group (“early” group). Across the three neuronal groups in the IG layer (L6 early, L6 late, and L5), only the L6 late group demonstrated increased phase-locking to the LFP in the theta frequency range (5–8 Hz) during the delay period compared with the fixation period. Because low-frequency coordination can tolerate long conduction delays occurring in neuronal signals traveling a long distance between different cortical areas,76)–79) the low-frequency phase-locked L6 neurons (L6 late neurons) were able to subsequently broadcast the retrieved target to distant cortical areas, possibly to lower-level visual areas, as recently confirmed by simultaneous recording from two cortical areas.80),81) Thus, the late L6 group would contribute to the output of the task-relevant signal, which is consistent with recent anatomical findings that morphologically distinct L6 neurons selectively project to distant cortical areas.10),82),83)
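The text above does not state which phase-locking measure was used. As one common possibility, the sketch below computes the mean resultant length of spike phases, assuming that the theta-band (5–8 Hz) LFP phase at each spike time has already been extracted. The function name and the toy phase values are hypothetical.

```python
import cmath
import math


def mean_resultant_length(spike_phases):
    """Length of the mean unit vector of spike phases (radians), from 0 to 1."""
    if not spike_phases:
        raise ValueError("need at least one spike phase")
    total = sum(cmath.exp(1j * phase) for phase in spike_phases)
    return abs(total) / len(spike_phases)


# Hypothetical theta phases at spike times for one neuron.
fixation_phases = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]  # spread over the cycle
delay_phases = [0.1, -0.2, 0.15, 0.0]                            # clustered near phase 0

locking_fixation = mean_resultant_length(fixation_phases)
locking_delay = mean_resultant_length(delay_phases)
```

A value near 1 means that spikes cluster at one LFP phase, and a value near 0 means that they are spread over the cycle. Comparing the delay-period value with the fixation-period value mirrors the contrast described above, although a real analysis would also need a statistical test and a correction for spike count.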
A novel cortical circuit model in the association cortex: beyond the classic “canonical” circuit model
As discussed in the previous sections, the transformation of representations from a cued visual object to a to-be-recalled object occurs at the stage of the IG layer, but not in the SG layer of the temporal cortex in a visual cued-recall task. These results laid the foundation for a detailed description of the recall dynamics implemented in a local circuit in the temporal cortex35),60): L5 neurons in the circuit implemented the coding of both cue and target information, thus representing the relevant “pair” of objects in the trial. In contrast, a subset of L6 neurons (“late” group) implemented the more exclusive coding of the to-be-recalled target. Therefore, they most unambiguously represented the behaviorally relevant sought target. L6 neurons in the “late” group exhibited cooperative firing with other neurons in the same group. Because the latency of the recall signal was substantially shorter in L5 than in L6, directional information flowed from L5 to L6, carrying the signal of the sought target. The directional flow was also confirmed using a cross-correlation analysis of single units.40)
Neurons coding the presented cue and those coding the to-be-recalled target were colocalized in the IG layer, particularly in L5. Thus, we had conjectured that this colocalization likely provides a local network environment for cell-to-cell interactions from cue-coding cells to target-coding cells, a crucial computational step towards the retrieval of associative memory. Subsequently, this conjecture was directly supported using a cross-correlation analysis and a Granger-causality-based signal flow analysis that identified the local cell-to-cell interactions,84) leading to the modeling of a microcircuit mechanism underlying the cue-to-target conversion35),72) (see Fig. 8 for a schematic representation of the cue-to-target conversion model).
The results of the cued-recall task supported the operating dynamics consistent with the classic canonical cortical circuit during sensory cue processing (magenta, Fig. 7). However, during memory recall, non-canonical signal flows should be added to the operating dynamics. Moreover, underestimated signal flows without a specific meaning in the classic canonical model have acquired a specific functional role (red, Fig. 7). One important component of this architecture added for the recall process is the backward output from its IG layer to the IG layer of the lower-level area (red, Fig. 7). This functional component is consistent with known anatomical cortico-cortical backward projections.5),82) The functional architecture for the recall process predominantly operating in the IG layer is consistent with the “strata” model of S1.85) In summary, I propose a new functional cortical circuit model, as depicted in Fig. 7 (framed in the green ellipse and labelled as the “Higher order cortex”) to be a canonical one operating in the association cortex, termed the cortical “dynamic multimode module (D3M)”.
Cortical “dynamic multimode module (D3M)” (shown in a green ellipse in the “Higher order cortex”) proposed based on the local circuit dynamics of the PRC (A36) in cued-recall. In the D3M model, the operational mode of the local circuit changes depending on cognitive demands (e.g., sensory cue processing or memory recall). During sensory cue processing mode, D3M in the PRC (“Higher order cortex”) receives a high-level representation of the cue stimulus from the area TE (“Lower order cortex”) and operates similarly to the classic canonical cortical circuit, mostly with feedforward processing (magenta arrows). However, during memory recall mode, it predominantly operates with signal flows through Layers V and VI, and sends back the retrieved representation to the area TE (“Lower order cortex”) and other lower-level areas through the backward projection from L6 (red arrows). See text for details.
The D3M model, as a static anatomical circuit model, substantially overlaps with the classic canonical circuit model that was designed based on studies of primary sensory areas,4),13) but it provides a functionally distinct, dynamic view of local circuit operations. In the D3M model, the operational mode and signal flow of the local circuit dynamically change depending on cognitive demands. This flexible change of the operational mode is typified by the connectivity between L3 and L5: in the sensory processing mode, the signal flow from L3 to L5 plays a dominant role (magenta, Fig. 7); in the memory recall mode, however, the opposite signal flow from L5 to L3 plays a dominant role (red, Fig. 7).35),40) The forward signal from Layer II/III through cortico-cortical projections plays a dominant role in the sensory processing mode (magenta), while backward cortico-cortical signaling from Layer V/VI does so in the memory recall mode (red). Local signal processing in Layer II/III is highlighted in the sensory processing mode, while local signal processing in Layer V/VI performs the essential computations in the memory recall mode (for a neuronal network-level model of the recall computation, see the schema in thick magenta arrow in Fig. 8). The D3M model clarifies how the local cortical circuit module works in both feedforward and feedback modes depending on the cognitive demands.
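The mode-dependent switching described in this paragraph can be summarized as a small lookup table. The following Python sketch is only a bookkeeping aid transcribed from the prose; the names `Mode`, `DOMINANT_FLOWS`, and `dominant_flows` are hypothetical, and the table makes no claim about how the circuit itself selects a mode.

```python
from enum import Enum


class Mode(Enum):
    SENSORY_CUE_PROCESSING = "sensory cue processing"
    MEMORY_RECALL = "memory recall"


# Dominant flows per operational mode, transcribed from the description in the text.
DOMINANT_FLOWS = {
    Mode.SENSORY_CUE_PROCESSING: {
        "local_l3_l5_direction": ("L3", "L5"),
        "highlighted_local_processing": "Layer II/III",
        "cortico_cortical_output": ("Layer II/III", "forward"),
    },
    Mode.MEMORY_RECALL: {
        "local_l3_l5_direction": ("L5", "L3"),
        "highlighted_local_processing": "Layer V/VI",
        "cortico_cortical_output": ("Layer V/VI", "backward"),
    },
}


def dominant_flows(mode):
    """Look up the dominant signal flows for a given operational mode."""
    return DOMINANT_FLOWS[mode]


recall = dominant_flows(Mode.MEMORY_RECALL)
sensory = dominant_flows(Mode.SENSORY_CUE_PROCESSING)
```

The same anatomical connection between L3 and L5 appears in both entries with opposite directions, which is the sense in which a single static circuit supports two dynamic modes.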
D3M functioning in the whole-brain semantic memory network. A, The two-hub model of the semantic memory network. Many modality-specific cortical regions represent an aspect of conceptual knowledge (e.g., ‘apple’), an object feature (e.g., color of apple), and/or an object-associated feature (e.g., shape of apple tree). The links between each modality-specific region and a supramodal hub region (PRC hub) are termed “spokes”. In cued-recall, the representation of object identity is activated in the PRC via the structural encoding process along the ventral visual pathway (green arrows) and then triggers retrieval of associated features or attributes of objects represented in the PRC itself, as well as those represented in lower-order cortical areas (red arrows). B, D3M provides essential circuit mechanisms required for driving the retrieval of nested associations of object-associated features in the distributed memory network by serial backward-propagating signaling through L5 and L6 (curved red arrow). In the PRC hub, D3M receives a high-level representation of the cue stimulus from the area TE and converts it to the to-be-recalled object representation [schema in thick magenta arrow; for details, see Hirabayashi et al. (2014)72)]. In brief, the conversion of the representations is performed by signal transfer between two functionally different classes of cells identified in the PRC: “cue-holding neurons” (CH, green), which hold information on the presented cue stimulus, and “pair-recall neurons” (PR, purple), whose delay activity encodes the to-be-recalled paired associate of the cue stimulus. Bars and arrows between neurons depict functional connectivity and directed interactions. This retrieval process starts when a cue stimulus is presented, and CH neurons exhibit strong responses. The CH neurons and PR neurons then interact, and the neuronal representations gradually shift to the sought target. Next, the firing rates of the PR neurons gradually increase, via mutual interactions among the PR neurons. Finally, the information of the to-be-recalled target is sent back through backward projections from L5/6 (curved red arrow). I propose that a similar mechanism for conversion of representations is at work in many areas of the semantic memory network.
A note on the spatial scale of the D3M model may help to clarify it further. The most plausible scale of the D3M model lies at the cortical columnar-scale level, similar to the classic canonical circuit model that was tightly connected with the concept of functional columnar organization by Hubel and Wiesel16) and Mountcastle17) established in the primary sensory cortices of cats and monkeys. It should be noted that the scale of functional units in the association cortex is likely larger than that in the primary sensory cortices. Indeed, the scale of the face patches in monkeys has been reported to range from 0.5 mm to 2 mm31),86) with some antero-posterior gradients within the temporal cortex.87) The scale of the memory patches in the area TE and PRC has also been reported as 1–2 mm.61),88) The neuronal circuit that generates a recall signal in Layer V of the PRC (Fig. 8B) was schematized as a cell-to-cell interaction, and thus at the scale of a neuronal circuit consisting of single neurons. For finer resolution of the D3M model at the single neuron level, further studies with methodological innovations, such as optogenetics and two-photon imaging of Ca2+ signaling, will be required in the macaque association cortex,89)–93) while monkeys are performing complex cognitive tasks, as was recently attempted in the mouse and marmoset cortex.94)–97)
The local circuit module, D3M, functions as a key component in a brain-wide global network for high-level cognition. It modeled the local circuit dynamics of the perirhinal cortex (PRC) in cued-recall (Fig. 7, green ellipse in the “Higher order cortex”). Converging evidence from anatomical, physiological, and neuropsychological studies has identified the PRC as a crucial component of the declarative memory process.35),98)–100) Thus, this section will discuss how D3M works within the global network of declarative memory, specifically, within the two-hub model of the whole-brain semantic memory system for cued-recall (Fig. 8A).
The two-hub model of the semantic memory system has been described previously.35) Briefly, it is built on a general framework of the distributed hierarchical semantic network,101),102) and more specifically on the “hub-and-spoke model” of semantic memory.103) The latter is a model of the structure and neural basis of semantic memory, in which several modality-specific cortical regions each represent a distinct aspect of conceptual knowledge, an object feature, and/or an object-associated feature. The links between each modality-specific region and a supramodal hub region are termed “spokes”.
The PRC hub, along with the temporopolar cortex (TPC) hub, is a supramodal hub located at the apex of a global semantic network. The PRC hub is strongly connected to many lower-level temporal/occipital cortical areas that represent object-associated features and plays a unique role in linking episodic and semantic memory. In contrast, the TPC hub preferentially supports the activation of well-consolidated semantic knowledge stored in all cortical areas via polymodal long-range connections.35) In cued-recall and in many other events of everyday life, the representation of object identity is activated in the PRC via structural encoding along the feedforward ventral visual pathway (Fig. 8A, green arrows) and then triggers the backward retrieval of object-associated features that are represented in the PRC and other lower-level cortical areas (Fig. 8A, red arrows; Fig. 8B, curved red arrow).
D3M provides essential circuit mechanisms required by the PRC hub. It receives a high-level representation of the cue stimulus from the neighboring unimodal visual association cortex, such as area TE (Fig. 7, magenta arrow from L3 of TE to L4 of PRC), and converts it to the to-be-recalled object representation (schema in Fig. 8B). Eventually, it initiates the retrieval of nested associations of object-associated features in the distributed memory networks by serial backward-propagating signaling through L5 and L6 (Fig. 8B, curved red arrow).
D3M was modeled from an analysis of the local circuit dynamics in the PRC at the apex of the semantic network with several lower-level cortical areas representing object-associated features. However, it is conjectured here that each lower-level cortical area may achieve its retrieval of object-associated features with a D3M-like circuit and send further serial backward-propagating cascades to its lower-level neighboring cortical areas. In the following, several lines of indirect evidence supporting the conjecture are briefly examined. At the computational level of brain theories (according to the terminology of David Marr), which characterizes computations and algorithms rather than the underlying circuit, some influential theories have proposed a neural computation module with both feedforward and feedback output in cortical information processing.6),104),105) For example, Heeger (2017) proposed a neural computation module in a recurrent global network [Fig. 2D of Heeger (2017)105)], and stated that, ‘Although I focus on sensation and perception (specifically vision), I hypothesize that the same computational framework applies throughout neocortex’ and specifically noted that ‘This form of memory recall (called visual imagery or mental imagery) generates patterns of activity in visual cortex that are similar to sensory stimulation’ with a state parameter setting in which the high-level neurons dominate the network (λ = 0, Eq. 1, of Heeger (2017)105)). A similar idea that mental imagery is supported by backward (or top-down) cortical activation of V1 28),106),107) has been repeatedly formulated, tested, and confirmed with human functional magnetic resonance imaging (fMRI) experiments.108) Direct neuronal recordings in the primate association cortex with a laminar resolution have only recently begun. The first one was performed in the PRC,40),60) as described in the previous sections. Then, in the field of memory recall, Takeda et al. (2018) recorded from the area TE with a multi-contact linear-array electrode and showed with coherence measures that the recall signal first reached the IG layer of area TE from area 36 and was then sent back to the SG layer of area TE.80),81) This signal routing is the same as that found in area 36 of the PRC40) and in the D3M module. In the border ownership processing of a complex image scene, Franken and Reynolds (2021) recently confirmed the CSD results of Takeuchi et al. (2011)40) with a multi-contact linear-array electrode in the recording from area V4 of monkeys and found columnar processing of complex visual images.63) Further physiological experiments are required to clarify the laminar routing of signal flow, particularly in a ‘generative’ mode of visual processing (λ = 0, Heeger (2017)105)), to test whether the D3M model is applicable to most neocortical association cortices.
In this review, the dynamic signal flow in the monkey temporal cortex was examined, and a novel cortical circuit model in the association cortex is proposed (Fig. 7), termed the cortical “dynamic multimode module” or D3M. The D3M comprises the majority of morphologically identified “canonical” cortical circuits (Douglas & Martin, 2004),4) with some additional cortico-cortical output paths (e.g., output from its IG layer to the IG layer of the lower-level areas). In the D3M model, the operational mode of the local circuit changes depending on cognitive demands. During sensory cue processing mode (or more generally, in the feedforward mode), the D3M operates in a similar manner to the classic canonical cortical circuit (magenta, Fig. 7). However, during memory recall mode (or more generally, in the feedback mode), it operates differently (red, Fig. 7), predominantly with signal flows through the IG layers, namely Layers V and VI. The conversion of the object representation from the cue stimulus to the to-be-recalled target stimulus occurs within Layer V. Ultimately, one group of neurons in Layer VI becomes phase-locked with the LFP in the theta frequency range (5–8 Hz) and reliably outputs information about the to-be-recalled target to lower-level cortical areas. It is proposed here that the D3M may also function in each lower-level cortical area for the retrieval of nested associations in the distributed memory networks by serial backward-propagating signaling through Layers V and VI. Therefore, the D3M is a canonical model for cortical association areas that extensively engage in both feedforward and feedback signaling across cortical areas.
One reservation for this conjecture may be a report that increasingly strong connectivity between excitatory local neurons was seen with a posterior to anterior gradient from primary sensory areas to higher association areas.109) This report also prompted some physiological reports that increasingly slow changes in neuronal activity were seen from the posterior to anterior cortex,110)–112) although it remains unknown whether these morphological and physiological gradients along the cortical posterior-to-anterior axis may affect the D3M function at the level of a neuronal circuit. Future experiments should test the conjecture in monkeys using simultaneous recordings from two areas in the temporal cortex (e.g., V4 and TEO/V4 and TE), just as done in the PRC and area TE.80),81)
Research and writing were supported in part by MEXT and the Japan Society for the Promotion of Science KAKENHI grants 17H06161 and 24220008.
Contributed by Yasushi MIYASHITA, M.J.A.; Edited by Nobutaka HIROKAWA, M.J.A.
Correspondence should be addressed: Y. Miyashita, Department of Physiology, The University of Tokyo School of Medicine, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan; Juntendo University, Graduate School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan (e-mail: miyashita.yasushi@gmail.com).
Yasushi Miyashita was born in Tokyo in 1949 and graduated from the University of Tokyo in 1972. He received his PhD degree in 1981 from the University of Tokyo and was a visiting Lecturer at Oxford University, U.K., in 1984–85 and Professor of Physiology at the University of Tokyo School of Medicine from 1989 to 2015. He served as Director of the Center for Brain Science in RIKEN, President of the Japan Neuroscience Society, and President of Union of Brain Science Association in Japan. He also acted as a member of the Board of Reviewing Editors for Science, Neuron, J. Cog. Neurosci., and other international journals. He has been studying memory and metamemory systems in primates; in particular, he discovered (1) the memory neurons that encode and retrieve associative long-term memory of objects in the temporal cortex, (2) the top-down signal from the prefrontal cortex to the memory neurons for memory retrieval, and (3) the metamemory centers in the frontopolar and anterior dorsolateral prefrontal cortex that interact with the temporal lobe memory centers. These discoveries clarified where and how mnemonic representations are organized in the primate brain, and what mechanism underlies the reactivation of these representations on demand during voluntary recall. Notably, his recent discoveries on the cortical metamemory network enabled understanding of how our metacognition is implemented in the brain and eventually how our retrospection becomes possible. For his accomplishments, he received the Asahi Prize, the Keio Medical Science Prize, the Fujihara Award, the Japan Academy Prize, and several other awards. He is currently a member of the Japan Academy and a fellow of the International Union of Physiological Sciences.