Introduction to the TI C64x+ IMGLIB

The Texas Instruments C64x+ IMGLIB is an optimized Image/Video Processing Functions Library for C programmers using TMS320C64x+ devices. It includes many C-callable, assembly-optimized, general-purpose image/video processing routines. These routines are typically used in computationally intensive real-time applications where optimal execution speed is critical. Using these routines assures execution speeds considerably faster than equivalent code written in standard ANSI C language. In addition, by providing ready-to-use DSP functions, TI IMGLIB can significantly shorten image/video processing application development time.

IMGLIB targets computation-heavy workloads: it runs faster than code written in plain C, and it provides ready-to-use DSP functions.

Features and Benefits

The TI C64x+ IMGLIB contains commonly used image/video processing routines, as well as source code that allows you to modify functions to match your specific needs.
IMGLIB features include:

Optimized assembly code routines
C and linear assembly source code
C-callable routines fully compatible with the TI C6x compiler
Host library to enable PC based development and testing
CCS/VC++ projects to rebuild library
Benchmarks (cycles)
Tested against reference C model
Test bench with reference input and output vectors

IMGLIB handles image and video processing, and its source code can be modified to fit specific needs.

Software Routines

The software routines are organized into three functional categories as follows:

Compression and decompression
Image analysis
Picture filtering/format conversions

In addition, a set of 22 low-level kernels has been included in Appendix A. These functions perform simple image operations such as addition, subtraction, multiplication, etc., and are intended to be used as a starting point for developing more complex kernels.

IMGLIB Image Analysis Functions

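This part mainly covers image preprocessing, such as non-zero pixel detection, pixel distribution, and denoising. The edge-detection functions are also grouped here.

Function names follow the pattern IMG_{operation}_{mask size}_i{input bit width}(s)_c{coefficient bit width}(s). Every name starts with IMG, and a trailing s marks a signed type (no s means unsigned).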

IMG_boundary_8

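This function scans the image for non-zero pixels and stores their positions in the out_coord array.
out_coord is an array of 32-bit ints; each element packs one coordinate pair, with the Y coordinate in the upper 16 bits and the X coordinate in the lower 16 bits.
The out_gray array holds the gray level of the pixel at each position recorded in out_coord.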

IMG_boundary_16s

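This function is much the same as the one above, except that the gray values are stored as short.
In the source, the only difference is *out_coord++ = ((y & 0xFFFF) << 16) | (x & 0xFFFF); versus *o_coord++ = ((y) << 16) | (x);*o_grey++ = p;
In other words, IMG_boundary_8 masks each coordinate to its low 16 bits before packing, while IMG_boundary_16s packs the values unmasked.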
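To make the packed coordinate layout concrete, here is a minimal portable sketch of the scan and of how a caller might unpack the results. The names pack_coord, coord_x, coord_y and scan_nonzero_8 are hypothetical helpers written for this note, not IMGLIB functions, and the real routine's argument list may differ.

```c
#include <stdint.h>

/* Hypothetical helper (not IMGLIB): pack (x, y) into one 32-bit word,
   Y in the upper 16 bits and X in the lower 16 bits. */
uint32_t pack_coord(uint32_t x, uint32_t y)
{
    return ((y & 0xFFFFu) << 16) | (x & 0xFFFFu);
}

/* Recover the two halves of a packed coordinate. */
uint32_t coord_x(uint32_t packed) { return packed & 0xFFFFu; }
uint32_t coord_y(uint32_t packed) { return packed >> 16; }

/* Scan a rows x cols 8-bit image in raster order and record every
   non-zero pixel. Returns the number of entries written. */
int scan_nonzero_8(const uint8_t *img, int rows, int cols,
                   uint32_t *out_coord, uint8_t *out_gray)
{
    int n = 0;
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            uint8_t p = img[y * cols + x];
            if (p != 0) {
                out_coord[n] = pack_coord((uint32_t)x, (uint32_t)y);
                out_gray[n] = p;
                n++;
            }
        }
    }
    return n;
}
```

The output arrays must be large enough for the worst case, in which every pixel is non-zero.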

IMG_clipping_16s

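Clips the values of a matrix to a threshold range: elements above the upper threshold are set to the upper threshold, and elements below the lower threshold are set to the lower threshold.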
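The clipping step amounts to a per-element clamp. A minimal reference sketch follows; clip_16s_sketch and its parameter order are assumptions made for illustration, not the library's signature.

```c
#include <stdint.h>

/* Hypothetical reference routine (not IMGLIB): clamp each of the n
   values to the range [thr_min, thr_max]. */
void clip_16s_sketch(const int16_t *in, int16_t *out, int n,
                     int16_t thr_min, int16_t thr_max)
{
    for (int i = 0; i < n; i++) {
        int16_t v = in[i];
        if (v > thr_max)
            v = thr_max;
        else if (v < thr_min)
            v = thr_min;
        out[i] = v;
    }
}
```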

IMG_dilate_bin

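Binary dilation. In the mask matrix, a negative value (DONT_CARE) means the pixel at that position is ignored; any other value means it is taken into account.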

IMG_erode_bin

Binary erosion. The mask matrix has the same structure as above.

IMG_errdif_bin_8

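Error-diffusion binarization ("errdif" stands for error diffusion, not error detection). You pass in a threshold; each pixel above it becomes white and every other pixel becomes black, and the quantization error is spread to the neighboring pixels that have not been processed yet. The threshold is typically taken relative to the maximum gray level of the image, e.g. 255.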
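The name errdif_bin suggests error diffusion with binary output. As a rough illustration only, the sketch below uses the classic Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16); the function name, the in-place interface, and the int scratch buffer are assumptions of this note and do not reflect the optimized routine's interface.

```c
#include <stdint.h>

/* Hypothetical sketch (not IMGLIB): binarize an 8-bit image in place.
   work is a caller-provided scratch buffer of rows * cols ints. */
void errdif_bin_sketch(uint8_t *img, int rows, int cols,
                       int thresh, int *work)
{
    for (int i = 0; i < rows * cols; i++)
        work[i] = img[i];

    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            int v = work[y * cols + x];
            int q = (v > thresh) ? 255 : 0;   /* binary output level */
            int err = v - q;                  /* quantization error  */
            img[y * cols + x] = (uint8_t)q;

            /* Spread the error to pixels not yet visited. */
            if (x + 1 < cols)
                work[y * cols + x + 1] += (err * 7) / 16;
            if (y + 1 < rows) {
                if (x > 0)
                    work[(y + 1) * cols + x - 1] += (err * 3) / 16;
                work[(y + 1) * cols + x] += (err * 5) / 16;
                if (x + 1 < cols)
                    work[(y + 1) * cols + x + 1] += err / 16;
            }
        }
    }
}
```

Because the error is carried forward, a uniform gray area comes out as a mix of black and white pixels rather than a single flat level.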

IMG_errdif_bin_16

The 16-bit version of the algorithm above.

IMG_histogram_8

IMG_histogram_16

Computes the histogram of the pixel distribution.
There are 8-bit and 16-bit versions; note the size requirements on the hist and t_hist arrays.
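For reference, the 8-bit histogram itself is a simple counting loop. The sketch below is a plain C version with a hypothetical name; it has no scratch array, whereas the optimized routine takes an additional t_hist buffer whose required size should be checked in the library documentation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reference routine (not IMGLIB): count how often each
   of the 256 possible gray levels occurs among n pixels. */
void histogram_8_sketch(const uint8_t *img, int n, uint32_t *hist)
{
    memset(hist, 0, 256 * sizeof(uint32_t));
    for (int i = 0; i < n; i++)
        hist[img[i]]++;
}
```

A 16-bit version follows the same pattern but needs 65536 bins, which is why the array sizes matter.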

IMG_median_3x3_8

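Denoising: each pixel is replaced by the median (not the mean) of the 9 pixels in its 3x3 neighborhood. The sliding-window access pattern is somewhat similar to convolution.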
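A 3x3 median filter can be sketched as follows. The function names are hypothetical, and copying the border pixels unchanged is a choice made for this example; the library routine works on rows supplied by the caller and may treat edges differently.

```c
#include <stdint.h>

/* Return the median of nine values using a small insertion sort. */
uint8_t median9(const uint8_t *v)
{
    uint8_t s[9];
    for (int i = 0; i < 9; i++) {
        int j = i;
        while (j > 0 && s[j - 1] > v[i]) {
            s[j] = s[j - 1];
            j--;
        }
        s[j] = v[i];
    }
    return s[4];
}

/* Hypothetical sketch (not IMGLIB): 3x3 median over interior pixels.
   Border pixels are copied unchanged (an assumption of this example). */
void median_3x3_sketch(const uint8_t *in, uint8_t *out, int rows, int cols)
{
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            if (y == 0 || x == 0 || y == rows - 1 || x == cols - 1) {
                out[y * cols + x] = in[y * cols + x];
                continue;
            }
            uint8_t w[9];
            int k = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    w[k++] = in[(y + dy) * cols + (x + dx)];
            out[y * cols + x] = median9(w);
        }
    }
}
```

A single bright outlier surrounded by dark pixels is removed completely, whereas a mean filter would only smear it across the neighborhood.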

IMG_perimeter_8

IMG_perimeter_16

Used for image segmentation: finds the boundary (perimeter) pixels of the objects in an image.

IMG_pix_expand

Expands an 8-bit image into a 16-bit image.

IMG_pix_sat

Saturates a 16-bit image down to an 8-bit image (values above 255 are set to 255).
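The widen/saturate pair can be illustrated with two short loops. This sketch assumes the narrow type is 8-bit unsigned and the wide type is 16-bit signed, which is consistent with the clamp to 255 mentioned above; the names and parameter order are hypothetical.

```c
#include <stdint.h>

/* Hypothetical sketch (not IMGLIB): widen 8-bit unsigned pixels
   to 16-bit values. */
void pix_expand_sketch(const uint8_t *in, int16_t *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = (int16_t)in[i];
}

/* Hypothetical sketch (not IMGLIB): narrow 16-bit signed values to
   8-bit unsigned pixels, clamping to the range 0..255. */
void pix_sat_sketch(const int16_t *in, uint8_t *out, int n)
{
    for (int i = 0; i < n; i++) {
        int v = in[i];
        if (v < 0)
            v = 0;
        if (v > 255)
            v = 255;
        out[i] = (uint8_t)v;
    }
}
```

A typical use is to expand, run arithmetic that may overflow 8 bits, then saturate back.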

IMG_sobel_3x3_8

IMG_sobel_3x3_16s

IMG_sobel_5x5_16s

IMG_sobel_7x7_16s

Sobel edge detection.

IMG_thr_gt2max_8

IMG_thr_gt2max_16

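Thresholding: pixels above the threshold are set to the maximum value, 255 for the 8-bit version or 65535 for the 16-bit version.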

IMG_thr_gt2thr_8

IMG_thr_gt2thr_16

Thresholding: pixels above the threshold are set to the threshold.

IMG_thr_le2min_8

IMG_thr_le2min_16

Thresholding: pixels less than or equal to the threshold are set to 0.

IMG_thr_le2thr_8

IMG_thr_le2thr_16

Thresholding: pixels less than or equal to the threshold are set to the threshold.

IMG_thr_le2thr

Sets pixels less than or equal to the threshold to the threshold (this one is the unsigned variant).
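The four thresholding behaviors differ only in the comparison and the replacement value. The sketch below folds them into one hypothetical function with a mode argument to make the comparison easy; in the library they are separate routines, and the enum and function name here are inventions of this note.

```c
#include <stdint.h>

/* Hypothetical mode selector, mirroring the four name suffixes. */
typedef enum { GT2MAX, GT2THR, LE2MIN, LE2THR } thr_mode;

/* Hypothetical sketch (not IMGLIB): apply one thresholding mode to
   n 8-bit pixels. Pixels that do not match the test pass through. */
void threshold_8_sketch(const uint8_t *in, uint8_t *out, int n,
                        uint8_t thr, thr_mode mode)
{
    for (int i = 0; i < n; i++) {
        uint8_t p = in[i];
        switch (mode) {
        case GT2MAX: out[i] = (p > thr) ? 255 : p; break;
        case GT2THR: out[i] = (p > thr) ? thr : p; break;
        case LE2MIN: out[i] = (p <= thr) ? 0 : p; break;
        case LE2THR: out[i] = (p <= thr) ? thr : p; break;
        }
    }
}
```

Note that a pixel exactly equal to the threshold is left alone by the two "gt" modes and modified by the two "le" modes.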

IMG_yc_demux_be16_8

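Splits one interleaved video stream into three separate little-endian streams. (A YCbCr source may be either big-endian or little-endian; as the be16 in its name indicates, this variant takes big-endian input.)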
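Demultiplexing an interleaved 4:2:2 stream means copying each sample to its own plane. The sketch below assumes a Cb, Y0, Cr, Y1 byte order purely for illustration; the order the library expects depends on the variant and its endianness, so check the documentation before relying on it. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical sketch (not IMGLIB): split an interleaved 4:2:2 stream
   into Y, Cb and Cr planes. Assumed byte order: Cb, Y0, Cr, Y1.
   n_luma is the number of luma samples and must be even. */
void yc_demux_sketch(const uint8_t *yc, int n_luma,
                     uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    for (int i = 0; i < n_luma / 2; i++) {
        cb[i]        = yc[4 * i + 0];
        y[2 * i]     = yc[4 * i + 1];
        cr[i]        = yc[4 * i + 2];
        y[2 * i + 1] = yc[4 * i + 3];
    }
}
```

In 4:2:2 each pair of luma samples shares one Cb and one Cr sample, so the chroma planes are half the length of the luma plane.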

IMG_ycbcr422p_rgb565

Converts YCbCr video to RGB (RGB565) video; the colors can also be adjusted during the conversion.
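As an illustration of the conversion and of the 5-6-5 packing, here is a per-pixel sketch. It assumes full-range BT.601 coefficients in 8-bit fixed point, which are not necessarily the coefficients the library uses (the library lets the caller supply its own, which is where the color adjustment comes from). Both function names are hypothetical.

```c
#include <stdint.h>

/* Clamp an intermediate result to the 8-bit range. */
int clamp_u8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Hypothetical sketch (not IMGLIB): convert one YCbCr pixel to RGB565.
   Coefficients are full-range BT.601 scaled by 256 (an assumption). */
uint16_t ycbcr_to_rgb565_sketch(uint8_t y, uint8_t cb, uint8_t cr)
{
    int d = cb - 128;
    int e = cr - 128;
    int r = clamp_u8(y + (359 * e) / 256);
    int g = clamp_u8(y - (88 * d + 183 * e) / 256);
    int b = clamp_u8(y + (454 * d) / 256);

    /* 5 bits of red, 6 bits of green, 5 bits of blue. */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

Green keeps one more bit than red and blue because the eye is most sensitive to it.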

IMGLIB2 Picture Filtering Functions

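This part mainly contains two large groups, convolution and template matching, plus two small ones, median filtering and format conversion. The previous part also has median and conversion functions, and it is not clear why they were split out and grouped here.

Function names follow the pattern IMG_{operation}_{mask size}_i{input bit width}(s)_c{coefficient bit width}(s). Every name starts with IMG, and a trailing s marks a signed type (no s means unsigned).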

IMG_conv_3x3_i8_c8s

IMG_conv_3x3_i16s_c16s

IMG_conv_3x3_i16_c16s

IMG_conv_5x5_i8_c8s

IMG_conv_5x5_i16s_c16s

IMG_conv_5x5_i8_c16s

IMG_conv_7x7_i8_c8s

IMG_conv_7x7_i16s_c16s

IMG_conv_7x7_i8_c16s

IMG_conv_11x11_i8_c8s

IMG_conv_11x11_i16s_c16s

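Convolution functions.

Both convolution and template matching process one row per call; to process several rows, call the function several times.
The width passed to each call may need to be reduced by the mask width to avoid reading past the end of the array.
An open question is how to pair this with DMA so that the next row is transferred while the current row is being scanned.
3x3 refers to the size of the coefficient matrix.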
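The row-at-a-time calling pattern can be sketched as follows. This is a plain C stand-in with hypothetical names: the mask is applied without flipping, the result is shifted right and clamped to 0..255, and the caller shrinks the width by the mask width minus one. Whether the library flips the mask, and its exact argument list, should be checked against the documentation.

```c
#include <stdint.h>

/* Hypothetical sketch (not IMGLIB): produce one output row of a 3x3
   filter. in points at the first of three consecutive input rows that
   are pitch bytes apart. Reads out_width + 2 columns from each row. */
void conv3x3_row_sketch(const uint8_t *in, int pitch, uint8_t *out,
                        int out_width, const int8_t *mask, int shift)
{
    for (int x = 0; x < out_width; x++) {
        int sum = 0;
        for (int r = 0; r < 3; r++)
            for (int c = 0; c < 3; c++)
                sum += in[r * pitch + x + c] * mask[r * 3 + c];

        /* Scale down and saturate to the 8-bit output range. */
        if (sum < 0)
            sum = 0;
        else
            sum >>= shift;
        if (sum > 255)
            sum = 255;
        out[x] = (uint8_t)sum;
    }
}

/* Call the row routine once per output row. The output image is
   (rows - 2) x (cols - 2) because the mask must fit inside the input. */
void conv3x3_image_sketch(const uint8_t *in, uint8_t *out,
                          int rows, int cols,
                          const int8_t *mask, int shift)
{
    for (int y = 0; y + 2 < rows; y++)
        conv3x3_row_sketch(in + y * cols, cols,
                           out + y * (cols - 2), cols - 2, mask, shift);
}
```

Because each call only touches three input rows, a double-buffered scheme can in principle fetch the next row while the current call runs.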

IMG_corr_3x3_i8_c16s

IMG_corr_3x3_i16s_c16s

IMG_corr_3x3_i8_c8

IMG_corr_3x3_i16_c16s

IMG_corr_5x5_i16s_c16s

IMG_corr_11x11_i16s_c16s

IMG_corr_11x11_i8_c16s

IMG_corr_gen_i16s_c16s

IMG_corr_gen_iq

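The name translates to "correlation", but these are still template-matching functions. The difference from the convolution functions is that the output is not saturated, and the output is twice as wide as the input. The calling pattern is the same as above.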

IMG_median_3x3_16s

IMG_median_3x3_16

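Computes the median of the 9 pixels in the 3x3 neighborhood (this appeared earlier; these are the 16-bit versions). According to the descriptions, the two functions differ in signedness, but the documentation lists identical parameters for both, so the source code has to be consulted to confirm what each one does.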

IMG_yc_demux_be16_16

IMG_yc_demux_le16_16

Splits one YCbCr video stream into three streams: Y, Cb, and Cr.
The two functions handle big-endian and little-endian source streams respectively.

Compression/Decompression IMGLIB2 Reference

IMG_fdct_8x8

Forward discrete cosine transform (DCT).

IMG_idct_8x8_12q4

Inverse discrete cosine transform (IDCT).

IMG_mad_8x8

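Sliding-window matching with an 8x8 block. It works by subtraction, taking the absolute differences between the pixels and the template, and returns the best-matching position.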

IMG_mad_16x16

The same function with a 16x16 template.

IMG_mpeg2_vld_intra
IMG_mpeg2_vld_inter

MPEG-2 decoding functions: variable-length decoding (VLD) of intra-coded and inter-coded blocks.

IMG_quantize

Matrix quantization.

IMG_sad_8x8

Computes the sum of absolute differences over all pixels of two 8x8 blocks (this is one part of the mad computation).

IMG_sad_16x16

The 16x16 version of the above.
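The relationship between sad and mad can be shown in a few lines: the SAD routine scores one candidate position, and the search repeats that score over a window and keeps the minimum. Both functions below are hypothetical sketches; the library's argument order and the way it returns the best position may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch (not IMGLIB): sum of absolute differences
   between a packed 8x8 block and an 8x8 area of a reference image
   whose rows are ref_pitch bytes apart. */
unsigned sad_8x8_sketch(const uint8_t *src, const uint8_t *ref,
                        int ref_pitch)
{
    unsigned sum = 0;
    for (int r = 0; r < 8; r++)
        for (int c = 0; c < 8; c++)
            sum += (unsigned)abs(src[r * 8 + c] - ref[r * ref_pitch + c]);
    return sum;
}

/* Exhaustive search over sx by sy candidate positions. Stores the
   position with the smallest SAD and returns that SAD. */
unsigned best_match_8x8_sketch(const uint8_t *src, const uint8_t *ref,
                               int ref_pitch, int sx, int sy,
                               int *best_x, int *best_y)
{
    unsigned best = ~0u;
    for (int y = 0; y < sy; y++) {
        for (int x = 0; x < sx; x++) {
            unsigned s = sad_8x8_sketch(src, ref + y * ref_pitch + x,
                                        ref_pitch);
            if (s < best) {
                best = s;
                *best_x = x;
                *best_y = y;
            }
        }
    }
    return best;
}
```

The reference area must cover at least (sx + 7) columns and (sy + 7) rows so that every candidate block stays in bounds.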

IMG_wave_horz
IMG_wave_vert

One-dimensional forward wavelet decomposition in the horizontal or vertical direction.
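To show what a single-level 1-D decomposition produces, the sketch below uses an unnormalized Haar filter pair (sum and difference). This is a deliberately simpler filter swapped in for illustration; the library routines take longer, caller-supplied filter coefficients. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical sketch (not IMGLIB): one level of an unnormalized Haar
   decomposition of a row of n samples (n even). The first n/2 outputs
   are the low-pass band (pair sums), the last n/2 the high-pass band
   (pair differences). */
void haar_row_sketch(const int16_t *in, int16_t *out, int n)
{
    int half = n / 2;
    for (int i = 0; i < half; i++) {
        out[i]        = (int16_t)(in[2 * i] + in[2 * i + 1]);
        out[half + i] = (int16_t)(in[2 * i] - in[2 * i + 1]);
    }
}
```

Applying the same step along each row gives the horizontal pass; applying it along each column gives the vertical pass.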