Seokju Han♦, Inayat Ali*, and Jeongseok Ha°

A Joint Sensing and Decoding for Improving the Hard-Decision Lifetime of NAND Flash Memories

Abstract: Soft-decision decoding schemes are used to achieve high data reliability in NAND flash memories. However, they require excessive latency and power consumption compared to hard-decision (HD) decoding schemes. This work proposes a novel joint sensing and decoding scheme that extends the HD decoding lifetime. When HD decoding fails, the proposed scheme re-reads the HD channel outputs and judiciously combines the two consecutive HD readings using reliability information from the failed HD decoding. Since the random telegraph noises impairing the two HD readings are statistically independent, the combining provides a diversity gain. Numerical results show that the proposed scheme significantly improves the HD lifetime.

Keywords: Channel coding, NAND flash memory, random telegraph noise, diversity combining

Ⅰ. Introduction

Digital storage devices using NAND flash memory, such as solid-state drives, have almost replaced magnetic disk-based drives in the consumer storage market. NAND flash memory technology has recently received extensive attention for providing scalability in both storage capacity and data throughput. This was made possible by multi-level-cell (MLC) technology, which uses a single memory cell to store multiple bits[1]. The sequence of storing and retrieving data in and out of NAND flash memory can be modeled as data communication over a noisy channel impaired by various noise sources such as cell-to-cell interference, the retention problem, and stress-induced leakage[2]. For instance, the channel deteriorates as the number of program/erase (P/E) cycles grows and eventually reaches a point called the end-of-life, at which the error-control scheme used in the storage device can no longer correct the errors induced by the channel. Although the NAND flash memory channel is time-varying and continuous in nature, the data are collected in discrete form by instantaneous sensing of the memory cells. NAND flash memory manufacturers reduce the storage cost per bit by storing more bits in a memory cell. Recently, NAND flash memories with four bits per cell, i.e., quad-level cell (QLC), have become commercially available in the storage market. However, as the number of bits per cell increases, the data reliability degrades and the device lifetime shrinks significantly. To overcome these issues, there have been extensive studies[3-6] on developing powerful error-correcting codes (ECCs) for storage devices using NAND flash memories. In particular, soft-decision (SD) error-control schemes, such as low-density parity-check (LDPC) codes with a belief propagation (BP) decoder, provide considerable coding gain by conducting multiple memory sensings and taking soft channel outputs from the NAND flash memory[4-6]. While SD error-control schemes provide high data reliability, the multiple memory sensing operations needed for the soft channel outputs result in excessively long latency and high power consumption, which is not appropriate for energy-constrained mobile storage applications. Sensing and decoding schemes that use the SD error-control scheme only as a post-processor have been proposed[6-9] to reduce the overall latency and complexity required for processing the data. A progressive data sensing and decoding scheme that minimizes the use of high-precision SD sensing is proposed in [6].
To reduce the latency, the data is first decoded using the hard-decision (HD) channel outputs generated with a single memory sensing. When the HD decoder fails to retrieve the data, SD decoding is performed with SD channel outputs obtained by additional memory sensings. The flash controller gradually increases the precision level of the SD channel outputs until the data are successfully decoded. To reduce unnecessary read latency, methods for adaptively selecting the optimal read-level granularity have been proposed[8-9]. The progressive data sensing and decoding scheme can achieve high data reliability. However, the existing works focus on designing SD error-control schemes, which still require high latency and complexity to decode the data.

In this paper, we propose a novel joint sensing and decoding scheme that utilizes the statistical independence of random telegraph noise (RTN) in time[10]. The threshold voltage of a flash memory cell is disturbed by the RTN, which incurs errors in the HD channel outputs from the flash memory. The RTN is a time-varying electronic noise caused by the capture and emission of electrons by the interfacial traps in the oxide layer of a cell. Thus, the RTN components in two HD channel outputs are statistically independent. In this work, we assume that the proposed scheme employs an LDPC code with a low-complexity decoder, i.e., a gradient descent bit-flipping (GDBF) decoder[11]. The proposed scheme first performs HD decoding with a single memory sensing. When the decoder fails, it acquires another set of HD channel outputs by re-sensing the memory cells. The proposed scheme then carefully combines the two sets of HD channel outputs and performs GDBF decoding with the combined set. That is, for each coded bit, the proposed scheme takes the channel output with higher reliability from the two sets, which provides the so-called selection diversity gain. To judge which one is more reliable, we exploit the results of the failed GDBF decoding. Note that the GDBF decoder builds up the reliability of each coded bit using a metric called the inversion function. By taking advantage of the values of the inversion function at the end of decoding, we can estimate the unreliable bit positions in the HD channel outputs from the first round of sensing. The HD decoding is then repeated with a combined set of HD channel outputs obtained by replacing the unreliable channel outputs of the previous decoding round with the newly sensed outputs. It will be demonstrated that the proposed scheme greatly improves the error-rate performance, which in turn significantly extends the HD lifetime of the flash memory, i.e., the lifetime with hard-decision decoding. Since the proposed scheme is utilized before SD decoding, it can be used in conjunction with recent work that optimizes the SD granularity[8-9]. While the proposed scheme requires one additional sensing, it significantly reduces the chance of activating SD decoding, which requires at least three sensings. To verify these claims, we evaluate the error-rate performance, the number of SD activations, and the average number of sensings.

Ⅱ. Preliminaries
2.1 Flash Memory Channel

In flash memories, data is stored in cells consisting of floating-gate transistors, and the threshold voltage required to turn on a transistor is determined by the number of electrons stored in its floating gate. The flash controller reads the data by checking the status of the transistors after applying a read reference voltage to the cells. During the programming operation, the flash controller gradually shifts the threshold voltage of a cell toward a target write voltage level. An MLC NAND flash memory has four write voltage levels per cell, i.e., an erased state ER(11) and programmed states P1(01), P2(00), and P3(10), as shown in Fig. 1. In a flash memory channel, the threshold voltages are affected by different noise components whose strengths depend on the number of program/erase (PE) cycles and the retention time[1]. The initial threshold voltages of the erased cells follow a Gaussian distribution $p_{\text{init}}^{\mathrm{ER}} \sim \mathcal{N}(V_{\min}, \sigma_e^2)$. The programmed states are generated with iterative incremental step pulse programming (ISPP). After applying ISPP, the initial threshold voltages of the programmed cells follow a uniform distribution defined as follows:
$$p_{\text{init}}^{\mathrm{PR}}(x)= \begin{cases} \dfrac{1}{\Delta V_{pp}}, & \text{for } V_p \leq x \leq V_p+\Delta V_{pp} \\ 0, & \text{otherwise} \end{cases}$$

where $V_p \in \{V_1, V_2, V_3\}$ is the target programming voltage level and $\Delta V_{pp}$ is the ISPP step size.

Fig. 1. Threshold voltage distribution of MLC NAND flash memory. Hard sensing $V^H$ requires a single data sensing per bit, and soft sensing $V^S$ requires multiple (i.e., ≥ 2) data sensings per bit.

The threshold voltage variations are mainly due to programming noise, retention noise, random telegraph noise (RTN), and cell-to-cell interference (CCI). The programming noise affects the programmed cells as additive white Gaussian noise with distribution $\mathcal{N}(0, \sigma_p^2)$. Retention noise is caused by charge leakage through the floating gate and depends on the number of PE cycles and the data retention time. The retention noise[1] can be modeled as a Gaussian noise component $\mathcal{N}(\mu_r, \sigma_r^2)$, whose mean $\mu_r$ and standard deviation $\sigma_r$ are given by:
$$\mu_r =\left(V_s-x_0\right)\left[A_t\left(N_p\right)^{\alpha_i}+B_t\left(N_p\right)^{\alpha_0}\right] \log (1+T), \qquad \sigma_r =0.4\left|\mu_r\right|$$

where T is the data retention time, $N_p$ is the number of PE cycles, $V_s \in \{V_{\min}, V_1, V_2, V_3\}$, and $x_0, A_t, B_t, \alpha_i, \alpha_0$ are constants described in [9]. The RTN is caused by the capture and emission of electrons by the interfacial traps in the oxide layer, which causes fluctuations in the threshold voltage of the flash memory cell. Due to the RTN, the threshold voltage changes even between consecutive read operations. The RTN becomes more severe as $N_p$ increases because new traps are created in the oxide layer, making the cell more susceptible to RTN. The RTN is also modeled as a Gaussian noise component $\mathcal{N}(0, \sigma_{\mathrm{RTN}}^2)$[1], where $\sigma_{\mathrm{RTN}}$ varies with the number of PE cycles as $\sigma_{\mathrm{RTN}}=0.00025\left(N_p\right)^{0.62}$. On the other hand, when a flash cell is programmed, the adjacent memory cells are affected by parasitic capacitive coupling, which introduces CCI into the memory cells. The CCI can, however, be removed from the programmed cells by various pre-coding techniques[10]. For erased cells, the threshold voltage distribution with CCI is modeled as:
$$\widetilde{V}_{\min, \mathrm{CCI}}^{\text{even}}=V_{\min }+\Delta V_{\mathrm{avg}}\left(2 \mu_{\gamma_x}+\mu_{\gamma_y}+2 \mu_{\gamma_{x y}}\right)$$
$$\widetilde{V}_{\min, \mathrm{CCI}}^{\text{odd}}=V_{\min }+\Delta V_{\mathrm{avg}}\left(\mu_{\gamma_y}+2 \mu_{\gamma_{x y}}\right)$$

where $\Delta V_{\text{avg}}=\left(V_{\min }+V_3\right) / 2-V_{\min }$, and $\mu_{\gamma_x}, \mu_{\gamma_y}, \mu_{\gamma_{xy}}$ are the capacitive coupling ratios, which depend on the physical architecture of the cells in the flash memory.

2.2 Flash Memory Reading/Sensing

The flash memory controller reads data based on predetermined reference read voltages. As explained in Sec. 2.1, the distribution of the threshold voltages drifts away from the target voltage levels depending on the retention time and the number of PE cycles. This degradation of the threshold voltages leads to a higher raw bit error rate (BER). In [6], an SD error-control scheme is used as a post-processor to enhance the error-correcting performance, as shown in Fig. 2. First, the memory controller generates HD channel outputs based on a single data sensing per bit with the hard-sensing reference read voltage shown in Fig. 2. Then, HD decoding is performed with the HD channel outputs. If the decoder fails, the controller progressively senses additional reference voltage levels per cell to generate soft reliability information in the form of log-likelihood ratios (LLRs) by combining the sensed channel outputs. These LLRs are used for decoding the data with any soft-input/soft-output (SISO) decoder, such as a sum-product or min-sum decoder. With an SD decoder and soft reliability information, the decoding performance improves compared to single HD sensing and decoding. However, using an SD decoder has drawbacks. First, SD sensing requires generating additional SD information, which increases the on-chip sensing and data transfer latency. Second, the SD decoder demands large computational power due to its high decoding complexity; unlike in the HD decoder, the reliability messages used in the SD decoder are LLRs computed over the multiple voltage levels of the MLC. Thus, for practical applications, it is important to extend the HD decoding lifetime of the memory in order to minimize the use of the SD decoder with its higher latency and complexity.
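As a rough illustration of the channel model in Sec. 2.1, the following Python sketch draws threshold voltages under the programming, retention, and RTN noise components and produces hard-sensed level indices. All numerical values below (target voltages, ISPP step, retention constants, and read reference voltages) are placeholders chosen only for illustration; they are not the parameters used in this paper's simulations, and CCI is omitted.

```python
import numpy as np

# Placeholder constants for illustration only (NOT the paper's simulation parameters).
V_MIN = -4.0                                 # mean of the erased-state distribution
V_LEVELS = [1.0, 2.5, 4.0]                   # target write voltages V1, V2, V3
SIGMA_E, SIGMA_P, DV_PP = 0.35, 0.05, 0.2    # erased spread, programming noise, ISPP step
X0, A_T, B_T, ALPHA_I, ALPHA_O = 1.4, 6e-5, 2.3e-4, 0.62, 0.3   # retention constants (placeholders)

def retention_params(v_s, n_pe, t_hours):
    """Retention-noise mean and standard deviation, with sigma_r = 0.4 * |mu_r|."""
    mu_r = (v_s - X0) * (A_T * n_pe ** ALPHA_I + B_T * n_pe ** ALPHA_O) * np.log(1.0 + t_hours)
    return mu_r, 0.4 * abs(mu_r)

def sigma_rtn(n_pe):
    """RTN standard deviation, which grows with the number of PE cycles."""
    return 0.00025 * n_pe ** 0.62

def program_cells(states, n_pe, t_hours, rng):
    """states: array with entries 0..3 for ER, P1, P2, P3. Returns the stored
    threshold voltages after programming/ISPP noise and retention loss."""
    v = np.empty(len(states), dtype=float)
    erased = states == 0
    v[erased] = rng.normal(V_MIN, SIGMA_E, erased.sum())
    for s, v_p in enumerate(V_LEVELS, start=1):
        idx = states == s
        v[idx] = v_p + rng.uniform(0.0, DV_PP, idx.sum()) + rng.normal(0.0, SIGMA_P, idx.sum())
    for s, v_s in enumerate([V_MIN] + V_LEVELS):
        idx = states == s
        mu_r, sig_r = retention_params(v_s, n_pe, t_hours)
        v[idx] += rng.normal(mu_r, sig_r, idx.sum())
    return v

def read_hard(v, n_pe, read_refs, rng):
    """Hard sensing: fresh, independent RTN is added at every read operation."""
    v_read = v + rng.normal(0.0, sigma_rtn(n_pe), len(v))
    return np.digitize(v_read, read_refs)

rng = np.random.default_rng(0)
states = rng.integers(0, 4, size=4096)       # random page content: ER/P1/P2/P3
v = program_cells(states, n_pe=5000, t_hours=100, rng=rng)
refs = [-1.5, 1.8, 3.3]                      # hypothetical HD read reference voltages
first_read = read_hard(v, 5000, refs, rng)   # two consecutive reads differ only
second_read = read_hard(v, 5000, refs, rng)  # through independent RTN draws
```

Because the RTN term is re-drawn at every read while the stored voltage is fixed, two consecutive hard reads of the same page disagree only on cells near the read references, which is exactly the diversity the proposed scheme exploits.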
Ⅲ. Proposed Joint HD Sensing and Decoding

In this section, a novel joint decoding and memory sensing scheme is proposed, which utilizes the diversity provided by the RTN for additional rounds of HD decoding. The random variations in the threshold voltage caused by the RTN provide diversity between consecutively read data. The different reliability values among multiple input signals make it possible to improve the decoding performance through diversity combining, a technique well known in wireless communications[13]. There exist various diversity combining techniques, such as selection diversity, maximal-ratio diversity, and equal-gain diversity. Selection diversity is a simple technique that selects the input signal with the highest instantaneous reliability. Since the HD sensed data is in discrete binary form, however, it is hard to tell reliable bits from unreliable ones. One possible approach is to declare all previously sensed data unreliable when the decoder fails and to regenerate the data by re-sensing the memory for the next round of decoding. However, useful information is then lost because previously sensed data that was not erroneous is also discarded. In this paper, we propose an algorithm that instead utilizes the decoding result obtained from the previously sensed HD data, thereby reducing the sensing and data transfer latency.

3.1 Measuring Data Reliability Using the Inversion Function

In SD decoding of LDPC codes (such as BP decoding), a hard decision on the LLR value of each bit is used for syndrome checking. Similarly, in the GDBF algorithm, a reliability measure called the inversion function is calculated to quantify the confidence in each bit decision[11]. The inversion function $\Delta_k$ represents the reliability of the k-th variable node (VN) and is calculated as follows:
$$\Delta_k = x_k y_k + \sum_{i \in \mathcal{M}(k)} w_i$$

where $y_k$ and $x_k \in \{-1, +1\}$ are the bipolar channel output for the k-th bit and the bit decision of the k-th VN after decoding, respectively, $\mathcal{M}(k)$ is the set of check nodes (CNs) connected to the k-th VN, and $w_i$ is a reliability indicator given by:
$$w_i= \begin{cases} +1, & \text{if the } i\text{-th CN is satisfied} \\ -1, & \text{otherwise} \end{cases}$$

The magnitude of $\Delta_k$ indicates the confidence in the bit decision after decoding and is used for separating the erroneous and reliable bits. For example, if $\Delta_k$ is small, the majority of the CNs connected to the k-th VN are unsatisfied, and consequently the confidence in the bit decision of the k-th VN is low. To show how well $\Delta_k$ separates erroneous and reliable bits, the empirical behavior of the decoder is compared against $\Delta_k$ in Fig. 3. The raw BER of the channel outputs is compared with $\Delta_k$ for a (9216, 8192) quasi-cyclic LDPC (QC-LDPC) code at different $N_p$ and T = 100 hours in Fig. 3a. We set the maximum number of iterations $I_{\max}$ to 50, and the decoder is run for $I_{\max}$ iterations to record these empirical results. We observe that the raw BER is small for a large value of $\Delta_k$ (i.e., $\Delta_k$ = 5), whereas for smaller values of $\Delta_k$ the raw BER is notably high. Using the same simulation setup, the empirical distribution of $\Delta_k$ over the VNs is depicted in Fig. 3b. Most of the bits have large $\Delta_k$ (i.e., the empirical distribution is concentrated at $\Delta_k$ = 5 in Fig. 3b), which means that only a small number of VNs with unreliable channel outputs need to be identified by $\Delta_k$.
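As an illustration, the sketch below computes the per-VN inversion function with a toy parity-check matrix, using the form $\Delta_k = x_k y_k + \sum_{i \in \mathcal{M}(k)} w_i$ implied by the definitions above; the exact inversion function of the GDBF decoder in [11] may include additional weighting, and the matrix, vectors, and threshold here are illustrative values only, not the (9216, 8192) code of the paper.

```python
import numpy as np

def inversion_function(H, y, x):
    """Per-VN inversion function: Delta_k = x_k * y_k + sum of w_i over the CNs
    adjacent to VN k, with w_i = +1 for a satisfied CN and -1 otherwise.
    H: binary parity-check matrix (m x n); y, x: bipolar (+/-1) vectors of
    channel outputs and current bit decisions."""
    hard = (x < 0).astype(int)          # map bipolar decisions back to bits
    syndrome = (H @ hard) % 2           # 0 => CN satisfied, 1 => unsatisfied
    w = 1 - 2 * syndrome                # +1 if satisfied, -1 otherwise
    return x * y + H.T @ w              # Delta_k for every VN

# Toy usage with a tiny (hypothetical) parity-check matrix.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1]])
y = np.array([+1, -1, +1, +1, -1, +1])  # bipolar HD channel outputs
x = y.copy()                            # initial decisions equal the channel outputs
delta = inversion_function(H, y, x)
unreliable = np.flatnonzero(delta < 3)  # VNs below an illustrative threshold
```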
3.2 Reliability-based Sensing and Combining

In this subsection, we propose an efficient joint sensing and decoding scheme in which the HD data is re-sensed using the prior reliability information of the bits calculated from $\Delta_k$. The proposed scheme is illustrated in Fig. 4. First, HD decoding is performed on the HD sensed data $\mathbf{r}^h=\left\{r_0^h, r_1^h, \ldots, r_{N-1}^h\right\}$, and if the decoder fails, it outputs the reliability values of the decoding result, $\boldsymbol{\Delta}=\left\{\Delta_0, \Delta_1, \ldots, \Delta_{N-1}\right\}$. Note that the bits with large $\Delta_k$ already allow a sufficiently reliable decision based on the previously sensed data, as observed in Section 3.1. Due to the time-varying nature of the RTN, new HD data for the unreliable bits are generated by re-sensing the memory cells. Therefore, only the unreliable data is regenerated in the next round of memory sensing, and the reliable information from the previous sensing round is retained. We use $\Delta_k$ to define the set of unreliable bits, called the combining set and denoted by $\mathcal{A}$, whose entries are updated with the new channel outputs. We define $\mathcal{A}$ as follows:

$$\mathcal{A}=\left\{k \mid \Delta_k < \tau\right\}$$

where $\tau$ is a pre-determined reliability threshold. All bits with $\Delta_k$ smaller than $\tau$ are included in $\mathcal{A}$. After determining the combining set, the new data $\mathbf{r}^h$ is sensed from the memory. Using $\mathbf{r}^h$ and $\mathcal{A}$, the HD channel outputs $\tilde{\mathbf{r}}^h$ for the current round of HD decoding are computed as follows:
$$\tilde{r}_i^h= \begin{cases} r_i^h, & \text{if } i \in \mathcal{A} \\ r_i^{h-1}, & \text{otherwise} \end{cases} \qquad (2)$$

The HD decoder then operates on the combined HD channel outputs. Although the proposed algorithm generalizes to an arbitrary maximum number of HD sensing rounds $h_{\max}$, we mainly focus on the case $h_{\max}$ = 2 in this paper. Since the latency and power consumption of data sensing are linearly proportional to the number of sensings, a smaller $h_{\max}$ is more desirable for practical use. A single round of SD sensing requires about twice the latency and power of a single round of the proposed scheme (i.e., with $h_{\max}$ = 2). For applications that require a lower target BER, the flash controller can proceed to SD sensing and decoding when the HD decoder still fails. Furthermore, it will be shown in Section Ⅳ that the proposed scheme improves not only the HD decoding performance but also the latency and power requirements compared to the existing schemes.
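The overall flow with $h_{\max}$ = 2 can be summarized by the following minimal sketch. The helpers sense_hard() and gdbf_decode() are hypothetical placeholders standing in for the memory-sensing interface and the GDBF decoder (assumed to return the decisions, a success flag, and the final inversion-function values); only the combining step follows eq. (2) directly.

```python
import numpy as np

TAU = 3  # reliability threshold for the combining set (Sec. 3.2)

def joint_sense_and_decode(sense_hard, gdbf_decode, tau=TAU):
    """One extra HD round as in Fig. 4: decode, and on failure re-sense only the
    bits whose final inversion-function value falls below tau.
    sense_hard() -> bipolar HD channel outputs (assumed helper);
    gdbf_decode(r) -> (decisions, success, delta) (assumed helper)."""
    r_prev = sense_hard()                          # first HD sensing, r^1
    x, ok, delta = gdbf_decode(r_prev)
    if ok:
        return x, 1                                # decoded with a single sensing
    unreliable = delta < tau                       # combining set A = {k : Delta_k < tau}
    r_new = sense_hard()                           # second, independent HD sensing, r^2
    r_comb = np.where(unreliable, r_new, r_prev)   # eq. (2): keep reliable old outputs
    x, ok, _ = gdbf_decode(r_comb)
    return x, 2                                    # hand off to SD decoding if still unsuccessful
```

The per-bit selection in np.where realizes the selection-diversity combining: reliable positions keep the first reading, while positions flagged by the failed decoding are refreshed by the second, statistically independent reading.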
Ⅳ. Numerical Results

In this section, the performances of the proposed and existing error-control schemes for NAND flash memory are compared in terms of word-error rate (WER) and HD lifetime. For the evaluation, we use a (9216, 8192) QC-LDPC code with a code rate of 0.89. The code has VN and CN degrees of 4 and 36, respectively, and we use the GDBF algorithm[11] for decoding. To show the efficacy of the proposed scheme in terms of both WER performance and complexity, we set $h_{\max}$ = 2. The threshold value $\tau$ for determining $\mathcal{A}$ is fixed to 3. The reference read voltage for HD sensing is set as in [14], and the channel parameters are set according to [1] as listed in Table 1.

Table 1. NAND Flash Memory Channel Parameters

In Fig. 5, the WER performances of the proposed and existing algorithms are shown for different PE cycles with the retention time set to T = 100 hours. For comparison, we plot the performances of single HD sensing and decoding, HD re-sensing with full replacement of the previously sensed data, and SD sensing and decoding. Owing to the diversity provided by the RTN, the WER improves significantly for both the full-replacement and the proposed joint sensing and decoding algorithms compared with the single hard sensing and decoding scheme. In particular, the proposed algorithm shows more than two orders of magnitude improvement in WER for both the least significant bit (LSB) and the most significant bit (MSB) compared to single hard sensing and decoding. To show the generality of the proposed algorithm, we also evaluate its performance with the TLC NAND flash memory model in [8]. In Fig. 6, the WER performances of the proposed and existing algorithms are depicted with the retention time set to T = 40 hours. The proposed joint sensing and decoding algorithm improves the WER over both single HD decoding and full replacement, showing more than one order of magnitude improvement for the LSB, the central significant bit (CSB), and the MSB.

Fig. 5. WER results of different sensing and decoding algorithms used in an MLC NAND flash memory when T = 100 hours.

Fig. 6. WER results of different sensing and decoding algorithms used in a TLC NAND flash memory when T = 40 hours.

In this paper, the efficacy of the proposed algorithm is shown in terms of the increased HD lifetime of the data in the memory cells; hence, in the numerical simulations, we did not perform SD decoding after the proposed algorithm fails. In practice, memory devices perform SD sensing and decoding as the flash memory cells degrade over time. Fig. 7 shows the number of SD decoding activations and the average number of sensings versus retention time when $N_p$ = 5000. We define the number of memory sensings for the k-th data sample under single HD decoding and under the proposed algorithm as
$$U_k= \begin{cases} 1+2P_1, & \text{for single HD decoding} \\ 1+P_1+2P_2, & \text{for joint sensing and decoding} \end{cases}$$

where $P_1$ = 1 if the first HD decoding fails and 0 otherwise, and $P_2$ = 1 if the proposed algorithm fails to decode and 0 otherwise. The number of SD activations and the average number of sensings are evaluated over $N = 10^4$ samples of data read from the memory. The sensing and decoding complexity can be managed by reducing the number of times the controller activates SD sensing and decoding. With the proposed scheme, the number of SD decoding activations is significantly reduced compared to the single HD sensing and decoding scheme. Furthermore, since the read latency of the flash memory is largely proportional to the number of times the memory is sensed, Fig. 7 also shows that the average number of memory sensings is greatly reduced by the proposed scheme. Thus, the proposed scheme significantly improves the storage lifetime with only a few additional HD sensings and low-complexity HD decoding.

Biography

Inayat Ali

Aug. 2009: B.S. in electrical engineering, KAIST
Oct. 2011: M.S. in telecommunication engineering, Hamdard University
Feb. 2019: Ph.D. in electrical and computer engineering, Sungkyunkwan University
2024~current: LG Electronics
[Research Interest] coding theory, information theory, error correcting codes, digital communication, blockchain technology
[ORCID: 0000-0002-0566-6405]

Biography

Jeongseok Ha

Feb. 1992: B.S. in electronics, Kyungpook National University
Feb. 1994: M.S. in electronic and electrical engineering, Pohang University of Science and Technology
2004~current: Professor, School of Electrical Engineering, KAIST
[Research Interest] wireless communication, coding theory, error correction codes
[ORCID: 0000-0003-1262-151X]

References