WO2012149904A1 - Modeling method and system based on context in transform domain of image/video - Google Patents

Publication number
WO2012149904A1
Authority
WO
WIPO (PCT)
Prior art keywords
coefficient
state
image
transform
context
Prior art date
Application number
PCT/CN2012/075052
Other languages
French (fr)
Chinese (zh)
Inventor
武筱林
牛毅
Original Assignee
Wu Xiaolin
Niu Yi
Priority date
Filing date
Publication date
Application filed by Wu Xiaolin, Niu Yi
Priority to CN201280027747.5A (patent CN104094607B)
Publication of WO2012149904A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type

Definitions

  • the invention relates to a method and system in the field of image/video coding, decoding and other processing, and specifically to a two-dimensional Markov random field modeling method, and a system thereof, that characterizes the anisotropic correlation of natural images in a transform domain such as the discrete cosine transform (DCT) domain.
  • images with directional content such as edges and textures exhibit two-dimensional directional correlation among their transform coefficients in a two-dimensional transform domain.
  • this directional correlation is especially pronounced when a two-dimensional discrete transform is applied to image pixel blocks containing edges.
  • for image blocks containing edges or regular textures, the signal energy is concentrated in directional subbands; Figures 1(a) and 1(c) show the discrete cosine transform (DCT) of pixel blocks containing edges of different orientations.
  • the signal energy is concentrated in the low frequency region, while the DCT transform coefficients are attenuated at a radial and nearly identical speed, as shown in Figure 1(b).
  • natural images can thus be approximated in the discrete cosine transform (DCT) domain by a two-dimensional Markov random field model with correlation in both the radial and the anti-radial directions.
  • the zigzag scan sweeps back and forth along the anti-radial direction and completely ignores the statistical correlation of natural images in the radial direction.
  • the MPEG and H.264 standards respectively propose horizontal and vertical scanning modes as a substitute for the Zigzag scanning method.
  • the switching of the multi-scan mode is only a local and inflexible temporary scheme, and it is impossible to model the arbitrary direction correlation of the image/video signal in the transform domain, and the encoding of the scan mode also causes additional code rate overhead.
  • Chinese Patent Document No. CN1741616, published 2006-03-01, describes a context-based adaptive entropy coding method. During encoding, the quantized DCT coefficients of the current transform block are scanned to form a sequence of (level, run) pairs; each pair in the sequence is then entropy coded in reverse scan order, and the values of the pairs already coded in the block are used to construct the context statistical model dynamically and adaptively.
  • a context model weighted fusion technique is proposed to further improve the compression performance of the model; the context statistics model obtained in the previous step is used to drive the entropy coding.
  • the context-based adaptive entropy decoding method is the inverse of the encoding method.
  • this technique has the following drawbacks: although the method uses run-length coding to jointly encode the consecutive zero coefficients between two non-zero coefficients, it does not merge other typical coefficient combinations, and it still does not move beyond the simple zigzag scan.
  • the scan order is generated as a zigzag scan order.
  • this method still fails to resolve the fundamental drawbacks of the zigzag scan, and its adaptive scan of the transform coefficients still operates on individual coefficients without merging coefficient blocks that have similar properties; its coding efficiency is therefore still unsatisfactory.
  • to address these deficiencies of the prior art, the present invention provides a context-based modeling method and system in the transform domain of images/videos (Method and System for Context Modeling of Images and Videos in Transform Domains) based on adaptive block evolution (ABE). Unlike the existing zigzag, horizontal or vertical scans, the method does not use a fixed one-dimensional scan order; it uses an adaptive two-dimensional scan that statistically models the radial and anti-radial correlation of the transform-domain coefficients simultaneously.
  • the present invention can also be used to perform other image/video processing such as denoising, interpolation, classification, visual information retrieval and extraction, digital watermarking, information hiding, image retrieval, steganalysis, and the like.
  • the present invention relates to a context statistical modeling method in a transform domain, which adaptively constructs a two-dimensional Markov model to reflect the directional correlation of an image/video signal in the two-dimensional transform domain; the method specifically includes:
  • the transform coefficients may or may not be quantized
  • the state in the two-dimensional Markov process refers to one or a set of adjacent and related coefficients in the transform domain, the state being defined by a coefficient block of a typical mode;
  • the initial state is a coefficient block formed by a cluster of coefficients that have the same or similar values and similar frequencies, gathered around the lowest-frequency coefficient and/or the highest-frequency coefficient as reference point; the boundary of this coefficient cluster is the generalized radius.
  • the transfer probability $P(S_i \mid S_{i-1})$ refers specifically to the transfer probability in the two-dimensional Markov process and can be computed offline or online, where $S_i$ is the next state, $S_{i-1}$ is the current state, and $i$ is the index of the coefficient block currently being traversed, ranging from 1 to the total number of states in the two-dimensional Markov process.
  • adaptation of the traversal direction means selecting as the next state the state that corresponds to the optimal value among the transfer probabilities between the current state and all of its possible next states; when there are two or more initial states, this adaptive computation is carried out separately for each.
  • the optimal value can be a maximum value, a minimum value, or other value that best meets the criteria requirements.
  • the invention relates to an image/video compression method based on transform domain context statistical modeling, comprising the following steps:
  • the first step is to perform coefficient transformation on the input image
  • the second step is to perform context-based modeling on the transform coefficients and determine an initial state S 0 ;
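  • the third step is to compute and compare, online or offline, all possible transfer probabilities $P(S_i \mid S_{i-1})$ from state $S_{i-1}$ using the two-dimensional Markov process, and to take the state with the optimal value as the next state $S_i$;
  • the fourth step is to drive the entropy encoder with the transfer probability $P(S_i \mid S_{i-1})$ of the next state $S_i$ obtained in the third step and output the code for the transform coefficients corresponding to state $S_i$, then to return to the third step and repeat the computation and comparison from the new state until all transform coefficients have been traversed and the complete code stream consisting of all output codes is obtained.
  • the initial state uses the generalized radius to encode at least one coefficient or coefficient block of value 1 in the lowest-frequency region, and/or at least one coefficient or coefficient block of value 0 in the highest-frequency region, and/or a coefficient block with distinct features in any part of the frequency domain.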
  • the distinct features are: two or more coefficient pairs or coefficient blocks that are adjacent and have the same value or are typically distributed.
  • the invention relates to an image/video compression system based on transform domain context statistical modeling, comprising: a transform module, an adaptive block evolution (ABE) module and an entropy coding module, wherein:
  • the ABE module performs context-based modeling on the transform coefficients output by the transform module, and sequentially outputs the transfer probabilities obtained in each step of the traversal process to the entropy encoding module in an adaptive manner.
  • the entropy coding module entropy encodes and outputs the transform coefficients according to the transfer probability output by the ABE module.
  • the context-based modeling refers to: using one or a group of adjacent and related coefficients in the transform domain as the state of the two-dimensional Markov process, with the lowest frequency coefficient and/or the highest frequency coefficient as the reference point, The coefficient blocks formed by the clusters of coefficients having the same or similar values and close frequencies are gathered as the initial state of the model.
  • the adaptive manner refers to: calculating, from the initial state of the context-based model, a state corresponding to the highest value among the transfer probabilities among all next possible states as the next state.
  • the present invention improves the coding efficiency, especially when the code rate is increased to 0.45 or above, the code stream length of the ABE coding system can be reduced to nearly 90%.
  • FIG. 1 is a schematic diagram of corresponding DCT transform coefficients of 16×16-size pixel blocks of different images in Embodiment 1 (the coefficient amplitude is indicated by the degree of shading).
  • FIG. 3 is a block diagram of a plurality of DCT transform coefficient blocks of 8×8 size in a Markov state, specifically as shown in the gray area.
  • (a) is a radial state change
  • (b) is a horizontal state change.
  • FIG. 5 is a schematic diagram of a cross-sectional scan of a one-dimensional or two-dimensional direction of a transform domain
  • Figure 6 is a collection of images used in a test experiment to verify the coding performance of the present invention.
  • Figure 7 shows the comparison between the coding performance of the present invention and the existing best coding system H.264.
  • block diagrams of a conventional DCT coding system and of the ABE coding system of the present invention are shown in the drawings. Like the conventional DCT coding system, the ABE system first applies DCT transformation and quantization to the input image; unlike the zigzag scan used in the conventional system, however, the ABE system adopts a more flexible adaptive block evolution method, which gives higher coding efficiency.
  • the ABE method uses an ordered two-dimensional Markov model and the position of the adjacent pixel block sequence to decompose 0 and non-zero (for example, generate a weight map).
  • the above-described two-dimensional Markov modeling process predicts the transfer probability P ( S i
  • the state refers to a set of adjacent and related coefficient blocks in the transform domain.
  • a set of all-zero coefficient blocks in the highest frequency domain constitutes one state (as shown in Fig. 4(a)), and another set of all non-zero coefficient blocks in the lowest frequency domain constitutes another state (as shown in Fig. 4(b)).
  • a state is arbitrarily composed of a partial 0 coefficient and a non-zero coefficient (Fig. 3(c)).
  • the ABE method improves the transfer probability P ( S i
  • the initial state $S_0$ during the transformation may be a large coefficient block located in the high frequency domain and containing zeros, or a large coefficient block located in the low frequency domain and containing non-zero values, as shown in Fig. 4(b) and Fig. 4(a), respectively.
  • these all-zero or all-one initial states can be easily encoded by a generalized radius with an integer value, which is more efficient than the end-of-block (EOB) method widely used in existing compression standards.
  • the intermediate state during the transverse scan of the two-dimensional transform domain consists of consecutive zero or non-zero coefficients located at the head of the current scan region.
  • the ABE method can achieve adaptive directional state transitions during traversal scanning or when compressing transform coefficients, as shown in FIG.
  • the propulsion direction for traversing the transform coefficients may be unidirectional: from high frequency to low frequency or from low frequency to high frequency, but may also be bidirectional, that is, the final convergence from the high frequency and the low frequency to the intermediate frequency respectively.
  • Figure 5 is a schematic illustration of these three traversal modes.
  • the ABE method can be combined with a context based matching entropy coder to encode the transform coefficients with a shorter code length.
  • Embodiment 2 Image Compression Application
  • the first step is to perform coefficient transformation on the input image
  • the coefficient transformation may be a DCT transform, a KLT (Karhunen-Loeve transform), or another existing transform method; combined with the traversal described below, these can be verified by experiment to achieve effects similar to those shown in FIG. 7.
  • the coefficient transformation may also add a quantization process
  • the second step is to perform context-based modeling on the transform coefficients and determine an initial state S 0 ;
  • the initial state in this embodiment is encoded by a generalized radius, and the initial state may be any of the following:
  • the third step online or offline, calculates and compares all possible transfer probabilities P ( S i
  • the fourth step for the next state S i obtained in the third step, drives the entropy encoder with the transfer probability P ( S i
  • the third step recalculates and compares according to the new state until all transform coefficients are traversed and a complete code stream of all output codes is obtained.
  • i ranges from 1 to the total number of states in the two-dimensional Markov process; if there is more than one initial state, simultaneous traversal in two or more directions may be implemented as shown in FIG. 5(c) to increase the coding speed.
  • an implementation system related to the above method includes: a transform module, an adaptive block evolution (ABE) module, and an entropy encoding module, where:
  • a transform module that performs coefficient conversion on the image such as DCT or KLT;
  • the ABE module performs context-based modeling on the transform coefficients output by the transform module and, step by step in an adaptive manner, selects the state corresponding to the highest transition probability to perform a jump of the Markov process until all coefficients have been traversed, while outputting to the entropy coding module in sequence to drive the entropy encoder during the traversal.
  • the entropy coding module entropy encodes and outputs the transform coefficients according to the transfer probability of the stepwise output of the ABE module.
  • the transform module may also include a quantization function
  • the context-based modeling refers to: using one or a group of adjacent and related coefficients in the transform domain as the state of the two-dimensional Markov process, with the lowest frequency coefficient and/or the highest frequency coefficient as the reference point, The coefficient blocks formed by the clusters of coefficients having the same or similar values and close frequencies are gathered as the initial state of the model.
  • the adaptive manner refers to: calculating the transfer probability between all the next possible states from the initial state of the context-based model, and selecting the next state based on the highest value among them.
  • This embodiment uses 38 common test images as shown in FIG. 6 to verify the coding performance of the ABE system of the embodiment 1.
  • Each image is first subjected to an 8x8 DCT transform, and the significance map of the DCT coefficients is then encoded at different quantization steps.
  • for comparison, the H.264 encoder, the best-performing existing encoder, is used to encode the same significance map.
  • the entropy coding module uses CABAC, the default context-adaptive binary arithmetic coder of the H.264 encoder.
  • the initial probability of all contexts is set to 0.5. Note that in this example the ABE encoding system uses three times as many contexts as H.264 (378 vs. 126), so the context dilution penalty is more severe for the ABE system. This initial setting therefore actually favors the H.264 encoding system.

Landscapes

  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A context-based modeling method and system in the transform domain of an image/video, in the technical field of image/video coding, decoding and other processing. A natural image is approximated, in the DCT domain or another transform domain, as a two-dimensional Markov random field that has both radial and anti-radial correlation, and this model is used to describe the directional correlation of the image signal in the two-dimensional transform domain, so that moving or still images can be compressed or otherwise processed.

Description

Context-based modeling method and system in the transform domain of images/videos
Technical field
The present invention relates to a method and system in the field of image/video coding, decoding and other processing, and specifically to a two-dimensional Markov random field modeling method, and a system thereof, that characterizes the anisotropic correlation of natural images in a transform domain such as the discrete cosine transform (DCT) domain.
Background art
Most current image/video processing techniques, such as compression, reconstruction, enhancement and analysis, operate in a transform domain. Commonly used transforms include the discrete cosine transform (DCT), the discrete Fourier transform (DFT) and the Hadamard transform. The purpose of the transform is to concentrate the energy of the image/video signal in a small number of transform coefficients, which significantly reduces the statistical redundancy of the signal. The literature describes this process in several ways: energy packing, decorrelation, or sparse representation of the image/video signal. Even in the transform domain, however, natural image/video signals are far from independent and identically distributed (i.i.d.) and are better classified as Markov random processes. Context statistical modeling of image/video signals in the transform domain has therefore become a key component of widely used image/video processing systems. Here, context statistical modeling means a method or process for estimating the conditional probabilities of a Markov or approximately Markov signal.
In transform-based image/video compression systems such as MPEG, JPEG, JPEG2000 and H.264, context statistical modeling of the image/video signal is critical to the rate-distortion performance of the system. Its role is to predict the conditional probabilities of the transform coefficients, which drive the entropy coder (for example, a context-based arithmetic coder). In entropy coding, any error in the predicted conditional probability directly degrades coding performance. More precisely, relative to the theoretical shortest code length, the redundant code length caused by the deviation of the predicted probability equals the relative entropy, or Kullback-Leibler (KL) distance, between the predicted and the true probability distributions. The accuracy of context statistical modeling therefore ultimately determines the compression performance of the system.
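The link between prediction error and redundant code length can be checked numerically. The following is a minimal sketch, not taken from the patent: the two distributions are hypothetical, and the function names are chosen here for illustration. It shows that the extra bits per symbol paid for coding with a mismatched model equal the KL distance between the true and the predicted distributions.

```python
import math

def entropy_bits(p):
    """Shannon entropy of distribution p: the shortest achievable average code length."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy_bits(p, q):
    """Average code length when symbols drawn from p are coded with model q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_bits(p, q):
    """Kullback-Leibler distance D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example: true probability of a zero / non-zero coefficient
# versus an uninformed model that predicts 50/50.
p_true = [0.9, 0.1]
q_model = [0.5, 0.5]

# Redundant code length = actual average length minus the theoretical minimum.
redundancy = cross_entropy_bits(p_true, q_model) - entropy_bits(p_true)
```

Here `redundancy` and `kl_bits(p_true, q_model)` agree up to floating-point rounding, which is the identity the paragraph above relies on.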
Existing context modeling methods often exploit the rapidly decaying power spectrum of natural images (assumed in the literature to decay exponentially). A rapidly decaying power spectrum alone, however, cannot characterize the statistics of an image/video signal in the transform domain, because the signal energy of images and videos has only a one-dimensional distribution over frequency, whereas in the transform domain it is two- or three-dimensional.
In particular, images with directional content such as edges and textures exhibit two-dimensional directional correlation among their transform coefficients in a two-dimensional transform domain. This directional correlation is especially pronounced when a two-dimensional discrete transform is applied to blocks of image pixels that contain edges. For image blocks containing edges or regular textures, the signal energy is concentrated in directional subbands: Figures 1(a) and 1(c) show the discrete cosine transform (DCT) of pixel blocks containing edges of different orientations.
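The concentration of energy in a directional subband can be reproduced on a synthetic block. The sketch below is an illustration only: the 8×8 block with an ideal vertical edge is a made-up input, and the DCT is a direct, unoptimized evaluation of the orthonormal DCT-II written here for self-containment. Because the block varies only horizontally, all of its energy falls in the single coefficient row with zero vertical frequency.

```python
import math

N = 8

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) evaluation)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            acc = 0.0
            for y in range(N):
                for x in range(N):
                    acc += (block[y][x]
                            * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out

# Hypothetical test block: a vertical edge (left half dark, right half bright).
edge_block = [[0.0] * 4 + [255.0] * 4 for _ in range(N)]
coeffs = dct2(edge_block)

total_energy = sum(c * c for row in coeffs for c in row)
row0_energy = sum(c * c for c in coeffs[0])   # coefficients with u = 0
```

For this input `row0_energy` equals `total_energy` up to rounding. A horizontal edge would put the energy in column `v = 0` instead, and an oblique edge would spread it along a slanted band.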
For pixel blocks containing smooth shading, the signal energy is concentrated in the low-frequency region, and the DCT coefficients decay radially at nearly the same rate in all directions, as shown in Figure 1(b). Natural images can therefore be approximated in the DCT domain by a two-dimensional Markov random field model that has correlation in both the radial and the anti-radial directions.
This analysis shows that the zigzag scan of DCT coefficients commonly used in current international image and video compression standards such as JPEG, MPEG and H.264 is inherently flawed. The zigzag scan sweeps back and forth along the anti-radial direction and completely ignores the statistical correlation that natural images have in the radial direction. In fact, to compensate for this defect, the MPEG and H.264 standards introduced horizontal and vertical scan modes as alternatives to the zigzag scan. Switching among multiple scan modes, however, is only a local and inflexible stopgap: it cannot model correlation along arbitrary directions of the image/video signal in the transform domain, and signaling the scan mode adds bit-rate overhead.
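To make the fixed, one-dimensional nature of the zigzag scan concrete, here is a short sketch that generates the classic zigzag visiting order for an n×n block. The function name is chosen here for illustration. Each anti-diagonal (constant `row + col`) is visited in full before the scan moves one step outward, which is the back-and-forth anti-radial sweep described above.

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of the classic zigzag scan of an n x n block."""
    order = []
    for d in range(2 * n - 1):            # d = row + col indexes one anti-diagonal
        cells = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        # Odd diagonals run from top-right to bottom-left (row increasing);
        # even diagonals run the other way, giving the back-and-forth pattern.
        if d % 2 == 0:
            cells.reverse()
        order.extend(cells)
    return order

order = zigzag_order(8)
first_six = order[:6]
```

The order depends only on the block size, never on the coefficient values, so the same path is used for a block with a vertical edge and for a block with smooth shading.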
A search of the prior art found Chinese Patent Document No. CN1741616, published 2006-03-01, which describes a context-based adaptive entropy coding method. During encoding, the quantized DCT coefficients of the current transform block are scanned to form a sequence of (level, run) pairs, and each pair in the sequence is then entropy coded in reverse scan order. During coding, the values of the pairs that have already been coded in the block are used to construct the context statistical model dynamically and adaptively; a weighted fusion of context models is also proposed to further improve the compression performance of the model. The resulting context statistical model drives the entropy coder. The context-based adaptive entropy decoding method is the inverse of the encoding method. This technique has the following shortcomings: although it uses run-length coding to jointly encode the consecutive zero coefficients between two non-zero coefficients, it does not merge other typical coefficient combinations, and it still does not move beyond the simple zigzag scan.
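The (level, run) representation mentioned above can be sketched as follows. This is a generic illustration, not the procedure of CN1741616: the convention that `run` counts the zeros preceding each non-zero `level` is an assumption, and the input sequence is made up.

```python
def run_level_pairs(scanned):
    """Convert a 1-D list of scanned quantized coefficients to (level, run) pairs.

    Assumed convention: run = number of zeros that precede the non-zero level.
    Trailing zeros produce no pair; real codecs signal them separately,
    for example with an end-of-block marker.
    """
    pairs = []
    run = 0
    for coeff in scanned:
        if coeff == 0:
            run += 1
        else:
            pairs.append((coeff, run))
            run = 0
    return pairs

# Hypothetical coefficients after a zigzag scan of a quantized block.
scanned = [12, 0, 0, -3, 1, 0, 0, 0]
pairs = run_level_pairs(scanned)
coding_order = pairs[::-1]   # the cited method codes the pairs in reverse scan order
```

Only runs of zeros are merged in this scheme; runs of other repeated values, or two-dimensional clusters of similar coefficients, are still coded one coefficient at a time.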
Chinese Patent Document No. CN1431828, published 2003-07-23, describes an "optimal scanning method for encoding/decoding an image signal". In this method of encoding an image signal by discrete cosine transform, at least one of a plurality of reference blocks is selected; a scan order for the block to be encoded is generated from the reference blocks, and the block to be encoded is scanned in that order. The at least one selected reference block is temporally or spatially adjacent to the block to be encoded. When the block to be encoded is scanned, the probability that a non-zero coefficient occurs is obtained from the selected reference block or blocks, and the scan order is determined in descending order starting from the highest probability. Where probabilities are equal, the scan order falls back to zigzag order. This method still fails to resolve the fundamental drawbacks of the zigzag scan, and its adaptive scan still operates on individual coefficients without merging coefficient blocks that have similar properties, so its coding efficiency remains unsatisfactory.
Summary of the invention
To address these shortcomings of the prior art, the present invention provides a context-based modeling method and system in the transform domain of images/videos (Method and System for Context Modeling of Images and Videos in Transform Domains) based on adaptive block evolution (ABE). Unlike the existing zigzag, horizontal or vertical scans, the method does not use a fixed one-dimensional scan order; it uses an adaptive two-dimensional scan that statistically models the radial and anti-radial correlation of the transform-domain coefficients simultaneously.
Besides serving as an effective tool for image and video compression, the present invention can also be used in other image/video processing applications such as denoising, interpolation, classification, visual information retrieval and extraction, digital watermarking, information hiding, image retrieval and steganalysis.
The invention is achieved by the following technical solutions:
The present invention relates to a context statistical modeling method in a transform domain, which adaptively constructs a two-dimensional Markov model to reflect the directional correlation of an image/video signal in the two-dimensional transform domain. The method specifically includes:
combining one or more transform coefficients into states of a two-dimensional Markov process;
computing, online or offline, the transfer probability between two adjacent Markov states;
starting from the initial state, adapting one or more traversal directions according to the transfer probabilities.
The transform coefficients may be quantized or unquantized.
A state in the two-dimensional Markov process is one coefficient, or a group of adjacent and correlated coefficients, in the transform domain; the state is defined by a coefficient block with a typical pattern.
The initial state is a coefficient block formed by a cluster of coefficients that have the same or similar values and similar frequencies, gathered around the lowest-frequency coefficient and/or the highest-frequency coefficient as reference point; the boundary of this coefficient cluster is the generalized radius.
The transfer probability $P(S_i \mid S_{i-1})$ refers specifically to the transfer probability in the two-dimensional Markov process and can be computed offline or online, where $S_i$ is the next state, $S_{i-1}$ is the current state, and $i$ is the index of the coefficient block currently being traversed, ranging from 1 to the total number of states in the two-dimensional Markov process.
所述的遍历方向的自适应是指:根据当前状态和其所有下一个可能状态之间的传递概率中的最优值所对应的状态来选择下一个状态;对于两个以上的初始状态则分别进行上述自适应计算。该最优值可以为最大值、最小值或其他最符合判据要求的取值。The adaptation of the traversal direction refers to: selecting a next state according to a state corresponding to an optimal value of a transfer probability between a current state and all of its next possible states; for two or more initial states, respectively Perform the above adaptive calculation. The optimal value can be a maximum value, a minimum value, or other value that best meets the criteria requirements.
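By way of a non-limiting sketch, the online estimation of $P(S_i \mid S_{i-1})$ and the selection of the next state can be illustrated as follows. The class name `TransitionModel`, the add-one smoothing, and the state labels `"radial"` and `"horizontal"` are assumptions made for illustration only; the specification does not prescribe a particular estimator, and the maximum is used here as the optimality criterion.

```python
from collections import defaultdict


class TransitionModel:
    """Count-based estimate of P(next_state | current_state) (illustrative only)."""

    def __init__(self, smoothing=1.0):
        # Additive smoothing is an assumption; it keeps unseen transitions nonzero.
        self.smoothing = smoothing
        self.counts = defaultdict(lambda: defaultdict(float))

    def update(self, current, nxt):
        """Online update after observing the transition current -> nxt."""
        self.counts[current][nxt] += 1.0

    def probability(self, current, nxt, candidates):
        """Smoothed P(nxt | current), normalised over the candidate next states."""
        row = self.counts[current]
        total = sum(row[c] for c in candidates) + self.smoothing * len(candidates)
        return (row[nxt] + self.smoothing) / total

    def best_next(self, current, candidates):
        """Choose the candidate with the largest estimated transition probability."""
        return max(candidates, key=lambda c: self.probability(current, c, candidates))


model = TransitionModel()
for _ in range(3):
    model.update("radial", "radial")
model.update("radial", "horizontal")
chosen = model.best_next("radial", ["radial", "horizontal"])
```

Replacing `max` by `min`, or by another scoring rule, would correspond to the other optimality criteria mentioned above.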
The present invention relates to an image/video compression method based on statistical context modeling in the transform domain, comprising the following steps:
Step 1: apply a coefficient transform to the input image;
Step 2: perform context-based modeling of the transform coefficients and determine the initial state $S_0$;
Step 3: online or offline, use the two-dimensional Markov process to compute and compare all possible transition probabilities $P(S_i \mid S_{i-1})$ from state $S_{i-1}$, and select the next state $S_i$ according to the optimal value among them;
Step 4: for the next state $S_i$ obtained in Step 3, drive the entropy coder with the corresponding transition probability $P(S_i \mid S_{i-1})$ and output the code for the transform coefficients corresponding to state $S_i$; then return to Step 3 and repeat the computation and comparison from the new state, until all transform coefficients have been traversed and the complete code stream made up of all output codes has been obtained.
The initial state uses a generalized radius to code at least one coefficient or coefficient block of value 1 in the lowest-frequency region, and/or at least one coefficient or coefficient block of value 0 in the highest-frequency region, and/or a coefficient block with distinct features in any part of the frequency domain.
A distinct feature here means two or more coefficient pairs or coefficient blocks that are adjacent and either have the same value or follow a typical distribution.
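The way the transition probabilities drive the entropy coder in Step 4 can be sketched, under simplifying assumptions, by accumulating the ideal code length $-\log_2 P(S_i \mid S_{i-1})$ along a given sequence of states. This stands in for a real arithmetic coder and omits the choice of traversal direction; the function name `ideal_code_length`, the add-one smoothing, and the state labels `"Z"` and `"N"` are hypothetical.

```python
import math
from collections import defaultdict


def ideal_code_length(states, alphabet, smoothing=1.0):
    """Bits an ideal arithmetic coder would spend on states[1:], given states[0].

    Transition probabilities are estimated online from counts, so a decoder
    that has seen the same history can reproduce them.
    """
    counts = defaultdict(lambda: defaultdict(float))
    bits = 0.0
    for prev, cur in zip(states, states[1:]):
        row = counts[prev]
        total = sum(row[s] for s in alphabet) + smoothing * len(alphabet)
        p = (row[cur] + smoothing) / total   # estimate of P(S_i | S_{i-1})
        bits += -math.log2(p)                # ideal cost of coding S_i
        row[cur] += 1.0                      # update the model after coding
    return bits


# Four consecutive all-zero states ("Z"); "N" denotes a nonzero state.
cost = ideal_code_length(["Z", "Z", "Z", "Z"], ["Z", "N"])
```

In this toy run the three transitions are coded with probabilities 1/2, 2/3 and 3/4, so repeated, predictable transitions become progressively cheaper to code.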
The present invention relates to an image/video compression system based on statistical context modeling in the transform domain, comprising a transform module, an adaptive block evolution (ABE) module and an entropy coding module, wherein:
the transform module applies a coefficient transform to the image;
the ABE module performs context-based modeling of the transform coefficients output by the transform module and, in an adaptive manner, outputs the transition probability obtained at each step of the traversal to the entropy coding module in sequence, thereby driving it;
the entropy coding module entropy-codes the transform coefficients according to the transition probabilities output by the ABE module and outputs the result.
Context-based modeling means that one coefficient, or a group of adjacent and correlated coefficients, in the transform domain is taken as a state of the two-dimensional Markov process, and that a coefficient block formed by a cluster of coefficients with identical or similar values and similar frequencies, gathered around the lowest-frequency coefficient and/or the highest-frequency coefficient as reference point, is taken as the initial state of the model.
The adaptive manner means that, starting from the initial state of the context-based model, the state corresponding to the highest value among the transition probabilities to all possible next states is computed and taken as the next state.
Compared with the existing H.264 compression scheme, the present invention improves coding efficiency; in particular, when the bit rate rises to 0.45 or above, the code-stream length of the ABE coding system can be reduced to nearly 90%.
Brief Description of the Drawings
Fig. 1 is a schematic of the DCT coefficients of 16x16 pixel blocks taken from different images in Embodiment 1 (coefficient magnitude is indicated by brightness).
Fig. 2 compares a conventional DCT coding system with the ABE coding system of the present invention.
Fig. 3 is a schematic of several 8x8 blocks of DCT coefficients with their Markov states, which are shown as the gray regions in the figure.
Fig. 4 is a schematic of state transitions following the direction of high transition probability, from step i to step iv;
in the figure: (a) radial state transitions; (b) horizontal state transitions.
Fig. 5 is a schematic of scans across the transform domain in one or two directions;
in the figure: (a) traversal with the low-frequency part of the transform domain as the initial state; (b) traversal with the high-frequency part of the transform domain as the initial state; (c) simultaneous traversal from both the low-frequency and the high-frequency parts.
Fig. 6 shows the image set used in the test experiment that verifies the coding performance of the present invention.
Fig. 7 shows the coding performance of the present invention compared with that of H.264, the best existing coding system.
Detailed Description
Embodiments of the present invention are described in detail below. The embodiments are implemented on the basis of the technical solution of the present invention, and detailed implementations and specific operating procedures are given, but the scope of protection of the present invention is not limited to the embodiments described below.
Embodiment 1: Two-dimensional Markov modeling
To further illustrate the novelty of the ABE approach of the present invention, the following takes as an example the core processing step of many image and video compression standards and systems, namely the entropy coding of quantized two-dimensional DCT coefficients. Block diagrams of a conventional DCT coding system and of the ABE coding system of the present invention are shown in Fig. 2. Like a conventional DCT coding system, the ABE system first applies a DCT and quantization to the input image; but unlike the zigzag scan used in conventional systems, the ABE system uses a more flexible adaptive block evolution method and thereby achieves higher coding efficiency.
After quantization, most insignificant DCT coefficients are quantized to 0, while the remaining nonzero (significant) coefficients tend to cluster along a certain direction. The large number of insignificant (zero) coefficients, by contrast, spreads out in a fan shape along the direction from low to high frequency. To improve compression efficiency, the ABE method uses an ordered two-dimensional Markov model together with the indices of adjacent pixel blocks to separate the positions of zero and nonzero coefficients (for example, by generating a weight map). This two-dimensional Markov modeling process predicts the transition probability $P(S_i \mid S_{i-1})$ from the current state $S_{i-1}$ to the next state $S_i$. A state here is a group of adjacent and correlated coefficients (a coefficient block) in the transform domain. For example, a group of all-zero coefficients located in the highest-frequency region forms one state (Fig. 4(a)), while a group of all-nonzero coefficients located in the lowest-frequency region forms another (Fig. 4(b)). More generally, a state may consist of an arbitrary mixture of zero and nonzero coefficients (Fig. 3(c)).
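The three kinds of state described above (all-zero, all-nonzero, mixed) can be illustrated with a short sketch that derives a zero/nonzero map from a quantized block and labels a group of positions. The helper names `significance_map` and `classify_state`, and the 4x4 example values, are hypothetical and not taken from the specification.

```python
def significance_map(block):
    """Return a 0/1 map marking the nonzero (significant) quantized coefficients."""
    return [[1 if c != 0 else 0 for c in row] for row in block]


def classify_state(sig, positions):
    """Label a group of (row, col) positions as 'all_zero', 'all_nonzero' or 'mixed'."""
    values = {sig[r][c] for r, c in positions}
    if values == {0}:
        return "all_zero"
    if values == {1}:
        return "all_nonzero"
    return "mixed"


# Hypothetical 4x4 block of quantized coefficients, low frequencies at top-left.
block = [
    [12, 5, 0, 0],
    [3, -1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
sig = significance_map(block)
low = classify_state(sig, [(0, 0), (0, 1), (1, 0)])    # low-frequency group
high = classify_state(sig, [(2, 2), (3, 3)])           # high-frequency group
edge = classify_state(sig, [(0, 1), (0, 2)])           # group straddling the boundary
```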
To entropy-code the quantized DCT coefficients in the transform domain, the ABE method scans across the transform domain through transitions $S_{i-1} \to S_i$ chosen so as to raise the transition probability $P(S_i \mid S_{i-1})$. The initial state $S_0$ of this process may be a large coefficient block located in the high-frequency region and containing zeros, or a large coefficient block located in the low-frequency region and containing nonzero values, as shown in Fig. 4(b) and Fig. 4(a) respectively. Such all-zero or all-one initial states can be coded conveniently by a single integer, the generalized radius; this is more efficient than the end-of-block (EOB) method that is widely used in existing compression standards.
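One possible reading of the generalized radius is sketched below. It assumes, purely for illustration, that the frequency of position $(u, v)$ is measured by $u + v$ and that the block is square; the specification does not fix this geometry, and the function names `zero_radius` and `nonzero_radius` are hypothetical. Under these assumptions each initial state is described by one integer.

```python
def zero_radius(sig):
    """Smallest r such that every position with u + v >= r is zero."""
    n = len(sig)
    r = 0
    for u in range(n):
        for v in range(n):
            if sig[u][v]:
                r = max(r, u + v + 1)
    return r


def nonzero_radius(sig):
    """Largest r such that every position with u + v < r is nonzero."""
    n = len(sig)
    r = 2 * n - 1
    for u in range(n):
        for v in range(n):
            if not sig[u][v]:
                r = min(r, u + v)
    return r


# Hypothetical 0/1 map of a quantized 4x4 block (1 = nonzero coefficient).
sig = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
r_low = nonzero_radius(sig)   # all-nonzero region: u + v < r_low
r_high = zero_radius(sig)     # all-zero region:    u + v >= r_high
```

Here the positions with $u + v$ between `r_low` and `r_high` form the mixed band that remains to be coded state by state.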
During the scan across the two-dimensional transform domain, an intermediate state consists of the consecutive zero or nonzero coefficients located at the front of the current scan region. By statistically modeling the context of the transitions between adjacent states, the ABE method exploits the correlation along particular directions (mostly radial) of the transform domain.
Because the directionality of the transform coefficients is closely tied to the directional features of the image block, the transition probability $P(S_i \mid S_{i-1})$ is higher when two adjacent states $S_i$ and $S_{i-1}$ are of the same type and have the same direction. Exploiting this regularity, the ABE method can adapt the direction of its state transitions while scanning across, or compressing, the transform coefficients, as shown in Fig. 4.
The traversal of the transform coefficients may advance in a single direction, from high frequency to low frequency or from low frequency to high frequency, or in both directions, advancing from the high and low frequencies toward the middle frequencies until the two fronts meet. Fig. 5 illustrates these three traversal modes.
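The three traversal modes can be sketched as visiting orders over frequency bands. Grouping positions into anti-diagonal bands $u + v = d$ is an assumption made for this illustration, as are the function names and mode strings; in the method itself the order is adapted using the transition probabilities, which this fixed-order sketch does not model.

```python
def diagonal_bands(n):
    """Group the positions of an n x n block by frequency index d = u + v."""
    return [
        [(u, d - u) for u in range(n) if 0 <= d - u < n]
        for d in range(2 * n - 1)
    ]


def traversal(n, mode):
    """Return the band visiting order for 'low_to_high', 'high_to_low' or 'both'."""
    bands = diagonal_bands(n)
    if mode == "low_to_high":
        return bands
    if mode == "high_to_low":
        return bands[::-1]
    if mode == "both":
        # Alternate between the two fronts until they meet in the middle.
        order, lo, hi = [], 0, len(bands) - 1
        while lo <= hi:
            order.append(bands[lo])
            if hi != lo:
                order.append(bands[hi])
            lo, hi = lo + 1, hi - 1
        return order
    raise ValueError(f"unknown mode: {mode}")


both = traversal(2, "both")
```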
By choosing the Markov states appropriately and following high-probability transitions, the ABE method can be combined with a matching context-based entropy coder to code the transform coefficients with a shorter code length.
Embodiment 2: Image compression application
This embodiment includes the following steps:
Step 1: apply a coefficient transform to the input image.
The coefficient transform may be a DCT or a KLT (Karhunen-Loeve transform); other known transforms may also be combined with the traversal described below. Experiments have confirmed that these all achieve results similar to those of this embodiment shown in Fig. 7.
The coefficient transform may additionally be followed by quantization.
Step 2: perform context-based modeling of the transform coefficients and determine the initial state $S_0$.
In this embodiment the initial state is coded by means of a generalized radius, and may be any of the following:
i) a number of coefficient blocks of value 1 clustered in the lowest-frequency region, as shown in Fig. 5(a);
ii) a number of coefficient blocks of value 0 clustered in the highest-frequency region, as shown in Fig. 5(b);
iii) a combination of i) and ii) above, as shown in Fig. 5(c).
Step 3: online or offline, use the two-dimensional Markov process to compute and compare all possible transition probabilities $P(S_i \mid S_{i-1})$ from the current state $S_{i-1}$, and select the next state $S_i$ according to the maximum among them.
Step 4: for the next state $S_i$ obtained in Step 3, drive the entropy coder with the transition probability $P(S_i \mid S_{i-1})$ and output the code for the transform coefficients corresponding to state $S_i$; then return to Step 3 and repeat the computation and comparison from the new state, until all transform coefficients have been traversed and the complete code stream made up of all output codes has been obtained.
In the above steps, $i$ ranges from 1 to the total number of states in the two-dimensional Markov process. If there is more than one initial state, the traversal can proceed simultaneously in two or more directions, as in Fig. 5(c), to increase coding speed.
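Step 1 above, with the optional quantization, can be sketched as a naive orthonormal two-dimensional DCT-II followed by uniform scalar quantization. This is a textbook formulation chosen for clarity rather than speed; the single quantization step size and the function names are assumptions, since the embodiment does not specify a quantizer design.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (naive O(n^4) version)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, step):
    """Uniform scalar quantization with a single step size (an assumption)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A flat 4x4 block has all of its energy in the DC coefficient.
flat = [[10] * 4 for _ in range(4)]
q = quantize(dct_2d(flat), step=8)
```

For the flat block only the DC coefficient survives quantization, which is the extreme case of the low-frequency clustering that the initial states i) to iii) exploit.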
In this application, the system implementing the above method comprises a transform module, an adaptive block evolution (ABE) module and an entropy coding module, wherein:
the transform module applies a coefficient transform such as a DCT or KLT to the image;
the ABE module performs context-based modeling of the transform coefficients output by the transform module and, in an adaptive manner, selects step by step the state with the highest transition probability as the next jump of the Markov process until all coefficients have been traversed, while outputting the transition probabilities in sequence to the entropy coding module during the traversal in order to drive the entropy coder;
the entropy coding module entropy-codes the transform coefficients according to the transition probabilities output step by step by the ABE module and outputs the result.
The transform module may also include a quantization function.
Context-based modeling means that one coefficient, or a group of adjacent and correlated coefficients, in the transform domain is taken as a state of the two-dimensional Markov process, and that a coefficient block formed by a cluster of coefficients with identical or similar values and similar frequencies, gathered around the lowest-frequency coefficient and/or the highest-frequency coefficient as reference point, is taken as the initial state of the model.
The adaptive manner means that, starting from the initial state of the context-based model, the transition probabilities to all possible next states are computed and the next state is selected according to the highest value among them.
This embodiment uses the 38 commonly used test images shown in Fig. 6 to verify the coding performance of the ABE system described in Embodiment 1. Each image first undergoes an 8x8 DCT, and the significance map of the DCT coefficients is then coded at several different quantization step sizes. For comparison, the H.264 coder, currently the best-performing one, is used to code the same significance maps. To compare the two methods objectively, the entropy coding module in both cases is CABAC, the adaptive binary arithmetic coder that is the default in the H.264 coder, and the initial probability of every context is set to 0.5. Note that in this example the ABE coding system uses three times as many contexts as H.264 (378 vs. 126), so the context dilution penalty is more severe for the ABE system. This initialization therefore actually favors the H.264 coding system.
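The context dilution penalty mentioned above can be made concrete with a small sketch. It uses a simple count-based (Laplace) adaptive binary model that starts at probability 0.5; this is a stand-in chosen for illustration and is not the CABAC state machine. The same bit sequence is coded once in a single context and once spread over three contexts with identical statistics.

```python
import math


def adaptive_cost(bits):
    """Bits spent by a count-based adaptive binary model starting at P = 0.5."""
    n0 = n1 = 0
    total = 0.0
    for b in bits:
        p1 = (n1 + 1) / (n0 + n1 + 2)   # Laplace estimate; 0.5 before any data
        total += -math.log2(p1 if b else 1.0 - p1)
        if b:
            n1 += 1
        else:
            n0 += 1
    return total


def cost_with_contexts(bits, contexts):
    """Total cost when each bit is coded by the adaptive model of its own context."""
    groups = {}
    for b, c in zip(bits, contexts):
        groups.setdefault(c, []).append(b)
    return sum(adaptive_cost(g) for g in groups.values())


bits = [0] * 12
single = cost_with_contexts(bits, [0] * 12)                      # one context
split = cost_with_contexts(bits, [i % 3 for i in range(12)])     # three contexts
```

Because each context has to learn the statistics from scratch, the three-context run costs more here; additional contexts only pay off when their statistics genuinely differ, which is why starting every context at 0.5 penalizes the coder with more contexts.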
The two coding methods are compared by the relative difference in the lengths of the code streams they finally produce: $\gamma = (L_{264} - L_{\mathrm{ABE}}) / L_{264}$, where $L_{\mathrm{ABE}}$ is the length of the code stream produced by the ABE coding system and $L_{264}$ is the length of the code stream produced by the H.264 system. The values of $\gamma$ for different images at different bit rates are shown in Fig. 7. As the figure shows, the ABE coding system is clearly more efficient than H.264, and its advantage becomes more pronounced as the bit rate rises. Tests have further shown that the ABE system achieves performance gains similar to those in Fig. 7 with other coefficient transforms, such as the KLT, and with other existing entropy coders, such as the MQ coder used in JPEG2000 or a Huffman coder.
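The measure $\gamma$ is straightforward to compute; a minimal sketch follows. The function name `relative_saving` and the lengths in the usage line are illustrative only and are not measurements from Fig. 7.

```python
def relative_saving(len_h264, len_abe):
    """gamma = (L_264 - L_ABE) / L_264; positive when the ABE stream is shorter."""
    if len_h264 <= 0:
        raise ValueError("len_h264 must be positive")
    return (len_h264 - len_abe) / len_h264


# Made-up lengths, in bits, purely to show the arithmetic.
gamma = relative_saving(1000, 900)
```

With these made-up lengths $\gamma$ is 0.1, meaning that the ABE stream would be one tenth shorter than the H.264 stream.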

Claims (13)

  1. A method for statistical context modeling in the transform domain, characterized in that a two-dimensional Markov model is adaptively constructed to reflect the directional correlation of an image/video signal in the two-dimensional transform domain;
    the construction comprises:
    forming the states of a two-dimensional Markov process from one or more transform coefficients;
    computing, online or offline, the transition probability between two adjacent Markov states;
    starting from an initial state, adapting one or more traversal directions by comparing the estimated transition probabilities.
  2. The method according to claim 1, characterized in that a state in the two-dimensional Markov process is one coefficient, or a group of adjacent and correlated coefficients, in the transform domain, the state being defined by a coefficient block with a typical pattern.
  3. The method according to claim 1, characterized in that the initial state is a coefficient block formed by a cluster of coefficients that have identical or similar values and similar frequencies, gathered around the lowest-frequency coefficient and/or the highest-frequency coefficient as reference point.
  4. The method according to claim 1, characterized in that the transition probability $P(S_i \mid S_{i-1})$ refers specifically to the transition probability in the two-dimensional Markov process and is computed offline or online, where $S_i$ is the next state, $S_{i-1}$ is the current state, and $i$ is the index of the coefficient block currently being traversed, ranging from 1 to the total number of states in the two-dimensional Markov process.
  5. The method according to claim 1, characterized in that adaptation of the traversal direction means taking as the next state the state corresponding to the optimal value among the transition probabilities between the current state and all of its possible next states; when there are two or more initial states, this adaptive computation is carried out separately for each.
  6. An image/video compression method based on statistical context modeling in the transform domain, characterized in that it comprises the following steps:
    Step 1: applying a coefficient transform to the input image;
    Step 2: performing context-based modeling of the transform coefficients and determining the initial state $S_0$;
    Step 3: online or offline, using the two-dimensional Markov process to compute and compare all possible transition probabilities $P(S_i \mid S_{i-1})$ from state $S_{i-1}$, and selecting the next state $S_i$ according to the optimal value among them;
    Step 4: for the next state $S_i$ obtained in Step 3, driving the entropy coder with the corresponding transition probability $P(S_i \mid S_{i-1})$ and outputting the code for the transform coefficients corresponding to state $S_i$, then returning to Step 3 and repeating the computation and comparison from the new state, until all transform coefficients have been traversed and the complete code stream made up of all output codes has been obtained.
  7. The image/video compression method according to claim 6, characterized in that the initial state uses a generalized radius to code at least one coefficient or coefficient block of value 1 in the lowest-frequency region, and/or at least one coefficient or coefficient block of value 0 in the highest-frequency region, and/or a coefficient block with distinct features in any part of the frequency domain.
  8. The image/video compression method according to claim 7, characterized in that a distinct feature means two or more coefficient pairs or coefficient blocks that are adjacent and either have the same value or follow a typical distribution.
  9. The image/video compression method according to claim 6 or 7, characterized in that the initial state is any one of the following:
    i) a number of coefficient blocks of value 1 clustered in the lowest-frequency region;
    ii) a number of coefficient blocks of value 0 clustered in the highest-frequency region;
    iii) a combination of i) and ii) above.
  10. The image/video compression method according to claim 6, characterized in that the optimal value is the maximum, the minimum, or whatever value best satisfies the chosen criterion.
  11. An image/video compression system based on statistical context modeling in the transform domain, characterized in that it comprises a transform module, an adaptive block evolution Markov modeling module and an entropy coding module, wherein:
    the transform module applies a coefficient transform to the image;
    the adaptive block evolution Markov modeling module performs context-based modeling of the transform coefficients output by the transform module and, in an adaptive manner, builds and traverses the states of the Markov process step by step;
    the entropy coding module entropy-codes the transform coefficients according to the transition probabilities output by the adaptive block evolution Markov modeling module and outputs the result.
  12. The image/video compression system according to claim 6 or 11, characterized in that context-based modeling means that one coefficient, or a group of adjacent and correlated coefficients, in the transform domain is taken as a state of the two-dimensional Markov process, and that a coefficient block formed by a cluster of coefficients with identical or similar values and similar frequencies, gathered around the lowest-frequency coefficient and/or the highest-frequency coefficient as reference point, is taken as the initial state of the model.
  13. The image/video compression system according to claim 11, characterized in that the adaptive manner means that, starting from the initial state of the context-based model, the transition probabilities to all possible next states are computed and the next state is selected as the state corresponding to the highest value among them.
PCT/CN2012/075052 2011-05-04 2012-05-03 Modeling method and system based on context in transform domain of image/video WO2012149904A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201280027747.5A CN104094607B (en) 2011-05-04 2012-05-03 Modeling method and system based on context in transform domain of image/video

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161518246P 2011-05-04 2011-05-04
US61/518,246 2011-05-04

Publications (1)

Publication Number Publication Date
WO2012149904A1 true WO2012149904A1 (en) 2012-11-08

Family

ID=47107767

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/075052 WO2012149904A1 (en) 2011-05-04 2012-05-03 Modeling method and system based on context in transform domain of image/video

Country Status (2)

Country Link
CN (1) CN104094607B (en)
WO (1) WO2012149904A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114822025A (en) * 2022-04-20 2022-07-29 合肥工业大学 Traffic flow combined prediction method

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113362252B (en) * 2021-06-30 2024-02-02 深圳万兴软件有限公司 Intelligent picture reconstruction method, device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1741616A (en) * 2005-09-23 2006-03-01 联合信源数字音视频技术(北京)有限公司 Adaptive entropy coding/decoding method based on context
US7123656B1 (en) * 2001-10-01 2006-10-17 Realnetworks, Inc. Systems and methods for video compression

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10218541A1 (en) * 2001-09-14 2003-04-24 Siemens Ag Context-adaptive binary arithmetic video coding, e.g. for prediction error matrix spectral coefficients, uses specifically matched context sets based on previously encoded level values
CN1874509B (en) * 2001-09-14 2014-01-15 诺基亚有限公司 Method and system for context-based adaptive binary arithmetic coding

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7123656B1 (en) * 2001-10-01 2006-10-17 Realnetworks, Inc. Systems and methods for video compression
CN1741616A (en) * 2005-09-23 2006-03-01 联合信源数字音视频技术(北京)有限公司 Adaptive entropy coding/decoding method based on context

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
AL-SHAYKH, O. ET AL.: "Video Sequence Compression.", CRC PRESS LLC., 2000, Retrieved from the Internet <URL:http://www.engnetbase.com> *
DUARTE, M.F. ET AL.: "Wavelet-domain Compressive Signal Reconstruction using a Hidden Markov Tree Model.", ACOUSTICS, SPEECH AND SIGNAL PROCESSING 2008. ICASSP 2008. IEEE INTERNATIONAL CONFERENCE ON, 31 March 2008 (2008-03-31), pages 5137 - 5140, XP031251757 *
GLUHOVSKY, I. ET AL.: "Markov Random Field Modeling in Median Pyramidal Transform Domain for Denoising Applications.", JOURNAL OF MATHEMATICAL IMAGING AND VISION., vol. 16, no. 3, 2002, pages 237 - 249 *
RAO, K.R. ET AL.: "The Transform and Data Compression Handbook.", CRC PRESS, INC., 2000, BOCA RATON FL USA *
WU, XIAOLIN ET AL.: "Context Modeling and Entropy Coding of Wavelet Coefficients for Image Compression.", ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 1997, ICASSP-97., 1997 IEEE INTERNATIONAL CONFERENCE ON, vol. 4, 21 April 1997 (1997-04-21), pages 3097 - 3100 *
ZHANG, LI ET AL.: "Context-based entropy coding in AVS video coding standard.", SIGNAL PROCESSING: IMAGE COMMUNICATION, vol. 24, no. ISS. 4, April 2009 (2009-04-01), pages 263 - 276, XP026091625, DOI: doi:10.1016/j.image.2008.12.001 *

Also Published As

Publication number Publication date
CN104094607B (en) 2017-04-26
CN104094607A (en) 2014-10-08

Similar Documents

Publication Publication Date Title
US11070843B2 (en) Coding of last significant coefficient flags
KR101984826B1 (en) Method and apparatus for entropy Coding and decoding of transformation coefficient
US8126062B2 (en) Per multi-block partition breakpoint determining for hybrid variable length coding
AU2005234613B2 (en) Adaptive coefficient scan order
WO2009113791A2 (en) Image encoding device and image decoding device
JP2013214989A (en) Method and apparatus for encoding and decoding image by using large transformation unit
CN1419787A (en) Quality based image compression
US20240244207A1 (en) Adaptation of scan order for entropy coding
CN101502122B (en) Encoding device and encoding method
US8908985B2 (en) Image processing including encoding information concerning the maximum number of significant digits having largest absolute value of coefficient data in groups
WO2012149904A1 (en) Modeling method and system based on context in transform domain of image/video
KR100254402B1 (en) Line-length encoding method and line-length encoder
EP3989571A1 (en) Embedding information about eob positions
Yuan et al. Learned image compression with channel-wise grouped context modeling
CN1319382C (en) Method for designing architecture of scalable video coder decoder
CN103686176B (en) A kind of code rate estimation method for Video coding
WO2010047492A2 (en) Moving picture encoder, 2d alignment transformation device and method of image signal for the same, and recording medium therefor
KR20110071204A (en) Parallel Processing Method in Wave2000 Based on Wavelet Transform
KR100359813B1 (en) Alternative Double Scan For Video Coding And Decoding
JP5966346B2 (en) Image processing apparatus and method
US20160234529A1 (en) Method and apparatus for entropy encoding and decoding
WO1999005649A1 (en) Adaptive entropy encoding/decoding
Wei An Improved Image Encoding Algorithm Based on EZW and Huffman Joint Encoding
Cuixiang et al. A perfect pass-parallel EBCOT tier-1 coding scheme
Hua et al. A new parallel realization method of spiht for video compression

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12779291

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11/04/2014)

122 Ep: pct application non-entry in european phase

Ref document number: 12779291

Country of ref document: EP

Kind code of ref document: A1