OpenCV with C++ examples

modules

  • core: a compact module defining basic data structures, including the dense multi-dimensional array Mat, and basic functions used by all other modules
  • imgproc: an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on
  • video: a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms
  • calib3d: basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction
  • features2d: salient feature detectors, descriptors, and descriptor matchers
  • objdetect: detection of objects and instances of the predefined classes (for example: faces, eyes, mugs, people, cars, and so on)
  • ml: various machine learning algorithms such as K-Means, Support Vector Machines, and Neural Networks
  • highgui: an easy-to-use interface for video capturing, image and video codecs, and simple UI capabilities (only a subset available on iOS)
  • cuda: GPU-accelerated algorithms from different OpenCV modules (unavailable on iOS)
  • ocl: common algorithms implemented using OpenCL (unavailable on iOS)
  • a few more helper modules, such as Python bindings and user-contributed algorithms

Load, Modify, and Save an Image


#include <cstdio>
#include <string>
#include <opencv2/opencv.hpp>
using namespace cv;

/*
 * This example shows two windows: one with the original color image
 * and one with its grayscale version.
 *
 * Build and run:
 *   make
 *   ./readimg.out logo.png
 */

int main(int argc, char** argv) {
    // Check the arguments before touching argv[1].
    if (argc != 2) {
        printf("Usage: %s <image name>\n", argv[0]);
        return -1;
    }
    const std::string imageName = argv[1];
    const std::string img_path = "./imgs/";

    // The output file name is the input name prefixed with "gray_".
    const std::string write_file = "gray_" + imageName;

    // cv::imread returns color images in BGR channel order.
    Mat image = imread(img_path + imageName, IMREAD_COLOR);
    if (image.empty()) {
        printf("No image data\n");
        return -1;
    }

    // Convert the image from BGR to grayscale. Details:
    // http://docs.opencv.org/trunk/d7/d1b/group__imgproc__misc.html#ga397ae87e1288a81d2363b61574eb8cab
    Mat gray_image;
    cvtColor(image, gray_image, COLOR_BGR2GRAY);
    imwrite(img_path + write_file, gray_image);

    namedWindow(imageName, WINDOW_AUTOSIZE);
    namedWindow("Gray image", WINDOW_AUTOSIZE);
    imshow(imageName, image);
    imshow("Gray image", gray_image);
    waitKey(0);  // wait for a key press before exiting
    return 0;
}

How to scan images, lookup tables and time measurement with OpenCV


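The tutorial shows four different ways to scan an image: C-style pointer access with operator[], iterators, on-the-fly address calculation with at(), and the built-in cv::LUT() function.

Because scanning an image and remapping its values is such a common operation, OpenCV provides the built-in function cv::LUT(). The example below also times each method; in the tutorial's measurements, LUT gives the best performance.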

#include <cstring>
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/core/utility.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <sstream>

using namespace std;
using namespace cv;

static void help() {
    cout << "\n----------------------------------------------------------------"
            "----------"
         << endl
         << "This program shows how to scan image objects in OpenCV (cv::Mat). "
            "As use case"
         << " we take an input image and divide the native color palette (255) "
            "with the "
         << endl
         << "input. Shows C operator[] method, iterators and at function for "
            "on-the-fly item address calculation."
         << endl
         << "Usage:" << endl
         << "./how_to_scan_images <imageNameToUse> <divideWith> [G]" << endl
         << "if you add a G parameter the image is processed in gray scale"
         << endl
         << "------------------------------------------------------------------"
            "--------"
         << endl
         << endl;
}

Mat& ScanImageAndReduceC(Mat& I, const uchar* table);
Mat& ScanImageAndReduceIterator(Mat& I, const uchar* table);
Mat& ScanImageAndReduceRandomAccess(Mat& I, const uchar* table);

int main(int argc, char* argv[]) {
    help();
    if (argc < 3) {
        cout << "Not enough parameters" << endl;
        return -1;
    }

    Mat I, J;
    if (argc == 4 && !strcmp(argv[3], "G"))
        I = imread(argv[1], IMREAD_GRAYSCALE);
    else
        I = imread(argv[1], IMREAD_COLOR);

    if (I.empty()) {
        cout << "The image " << argv[1] << " could not be loaded." << endl;
        return -1;
    }

    //! [dividewith]
    int divideWith = 0;  // convert our input string to number - C++ style
    stringstream s;
    s << argv[2];
    s >> divideWith;
    if (!s || !divideWith) {
        cout << "Invalid number entered for dividing. " << endl;
        return -1;
    }

    uchar table[256];
    for (int i = 0; i < 256; ++i)
        table[i] = (uchar)(divideWith * (i / divideWith));
    //! [dividewith]

    const int times = 100;
    double t;

    t = (double)getTickCount();

    for (int i = 0; i < times; ++i) {
        cv::Mat clone_i = I.clone();
        J = ScanImageAndReduceC(clone_i, table);
    }

    t = 1000 * ((double)getTickCount() - t) / getTickFrequency();
    t /= times;

    cout << "Time of reducing with the C operator [] (averaged for " << times
         << " runs): " << t << " milliseconds." << endl;

    t = (double)getTickCount();

    for (int i = 0; i < times; ++i) {
        cv::Mat clone_i = I.clone();
        J = ScanImageAndReduceIterator(clone_i, table);
    }

    t = 1000 * ((double)getTickCount() - t) / getTickFrequency();
    t /= times;

    cout << "Time of reducing with the iterator (averaged for " << times
         << " runs): " << t << " milliseconds." << endl;

    t = (double)getTickCount();

    for (int i = 0; i < times; ++i) {
        cv::Mat clone_i = I.clone();
        ScanImageAndReduceRandomAccess(clone_i, table);
    }

    t = 1000 * ((double)getTickCount() - t) / getTickFrequency();
    t /= times;

    cout << "Time of reducing with the on-the-fly address generation - at "
            "function (averaged for "
         << times << " runs): " << t << " milliseconds." << endl;

    //! [table-init]
    Mat lookUpTable(1, 256, CV_8U);
    uchar* p = lookUpTable.ptr();
    for (int i = 0; i < 256; ++i) p[i] = table[i];
    //! [table-init]

    t = (double)getTickCount();

    for (int i = 0; i < times; ++i)
        //! [table-use]
        LUT(I, lookUpTable, J);
    //! [table-use]

    t = 1000 * ((double)getTickCount() - t) / getTickFrequency();
    t /= times;

    cout << "Time of reducing with the LUT function (averaged for " << times
         << " runs): " << t << " milliseconds." << endl;
    return 0;
}

//! [scan-c]
Mat& ScanImageAndReduceC(Mat& I, const uchar* const table) {
    // accept only char type matrices
    CV_Assert(I.depth() == CV_8U);

    int channels = I.channels();

    int nRows = I.rows;
    int nCols = I.cols * channels;

    if (I.isContinuous()) {
        nCols *= nRows;
        nRows = 1;
    }

    int i, j;
    uchar* p;
    for (i = 0; i < nRows; ++i) {
        p = I.ptr<uchar>(i);
        for (j = 0; j < nCols; ++j) {
            p[j] = table[p[j]];
        }
    }
    return I;
}
//! [scan-c]

//! [scan-iterator]
Mat& ScanImageAndReduceIterator(Mat& I, const uchar* const table) {
    // accept only char type matrices
    CV_Assert(I.depth() == CV_8U);

    const int channels = I.channels();
    switch (channels) {
        case 1: {
            MatIterator_<uchar> it, end;
            for (it = I.begin<uchar>(), end = I.end<uchar>(); it != end; ++it)
                *it = table[*it];
            break;
        }
        case 3: {
            MatIterator_<Vec3b> it, end;
            for (it = I.begin<Vec3b>(), end = I.end<Vec3b>(); it != end; ++it) {
                (*it)[0] = table[(*it)[0]];
                (*it)[1] = table[(*it)[1]];
                (*it)[2] = table[(*it)[2]];
            }
        }
    }

    return I;
}
//! [scan-iterator]

//! [scan-random]
Mat& ScanImageAndReduceRandomAccess(Mat& I, const uchar* const table) {
    // accept only char type matrices
    CV_Assert(I.depth() == CV_8U);

    const int channels = I.channels();
    switch (channels) {
        case 1: {
            for (int i = 0; i < I.rows; ++i)
                for (int j = 0; j < I.cols; ++j)
                    I.at<uchar>(i, j) = table[I.at<uchar>(i, j)];
            break;
        }
        case 3: {
            Mat_<Vec3b> _I = I;

            for (int i = 0; i < I.rows; ++i)
                for (int j = 0; j < I.cols; ++j) {
                    _I(i, j)[0] = table[_I(i, j)[0]];
                    _I(i, j)[1] = table[_I(i, j)[1]];
                    _I(i, j)[2] = table[_I(i, j)[2]];
                }
            I = _I;
            break;
        }
    }

    return I;
}
//! [scan-random]
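
The table-based reduction shared by all of the variants above can be tried without OpenCV. The following is a minimal sketch in plain C++, not the tutorial's code: buildReductionTable and applyTable are hypothetical helper names, and a std::vector<unsigned char> stands in for the pixel buffer of a continuous cv::Mat.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Build the same table as in the listing:
// table[i] = divideWith * (i / divideWith), using integer division.
// Hypothetical helper, not part of OpenCV.
std::array<unsigned char, 256> buildReductionTable(int divideWith) {
    assert(divideWith > 0);
    std::array<unsigned char, 256> table{};
    for (int i = 0; i < 256; ++i)
        table[i] = static_cast<unsigned char>(divideWith * (i / divideWith));
    return table;
}

// Replace every value by its table entry, in place.
// This is the per-element work that cv::LUT() performs on a Mat.
void applyTable(std::vector<unsigned char>& pixels,
                const std::array<unsigned char, 256>& table) {
    for (std::size_t k = 0; k < pixels.size(); ++k)
        pixels[k] = table[pixels[k]];
}
```

With divideWith = 10, the values 3, 17, 128 and 255 become 0, 10, 120 and 250: the table is computed once, so the per-pixel cost is a single array lookup instead of a division and a multiplication.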

Mask operations on matrices


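This example enhances (sharpens) a photo by applying a 3x3 mask, first with a hand-written function and then with the built-in filter2D.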

#include <cstring>
#include <iostream>
#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

using namespace std;
using namespace cv;

static void help(char* progName) {
    cout << endl
         << "This program shows how to filter images with a mask: the "
            "write-it-yourself way and the "
         << "filter2D way." << endl
         << "Usage:" << endl
         << progName
         << " [image_name -- default ./imgs/lena.jpg] [G -- grayscale] "
         << endl
         << endl;
}

void Sharpen(const Mat& myImage, Mat& Result);

int main(int argc, char* argv[]) {
    help(argv[0]);
    const char* filename = argc >= 2 ? argv[1] : "./imgs/lena.jpg";

    Mat src, dst0, dst1;

    if (argc >= 3 && !strcmp("G", argv[2]))
        src = imread(filename, IMREAD_GRAYSCALE);
    else
        src = imread(filename, IMREAD_COLOR);

    if (src.empty()) {
        cerr << "Can't open image [" << filename << "]" << endl;
        return -1;
    }

    namedWindow("Input", WINDOW_AUTOSIZE);
    namedWindow("Output", WINDOW_AUTOSIZE);

    imshow("Input", src);
    double t = (double)getTickCount();

    Sharpen(src, dst0);

    t = ((double)getTickCount() - t) / getTickFrequency();
    cout << "Hand written function times passed in seconds: " << t << endl;

    imshow("Output", dst0);
    waitKey();  // keep the hand-written result on screen until a key is pressed

    Mat kernel = (Mat_<char>(3, 3) << 0, -1, 0, -1, 5, -1, 0, -1, 0);
    t = (double)getTickCount();
    filter2D(src, dst1, src.depth(), kernel);
    t = ((double)getTickCount() - t) / getTickFrequency();
    cout << "Built-in filter2D time passed in seconds:      " << t << endl;

    imshow("Output", dst1);

    waitKey();
    return 0;
}
void Sharpen(const Mat& myImage, Mat& Result) {
    CV_Assert(myImage.depth() == CV_8U);  // accept only uchar images

    const int nChannels = myImage.channels();
    Result.create(myImage.size(), myImage.type());

    for (int j = 1; j < myImage.rows - 1; ++j) {
        const uchar* previous = myImage.ptr<uchar>(j - 1);
        const uchar* current = myImage.ptr<uchar>(j);
        const uchar* next = myImage.ptr<uchar>(j + 1);

        uchar* output = Result.ptr<uchar>(j);

        for (int i = nChannels; i < nChannels * (myImage.cols - 1); ++i) {
            // Write at index i so the output stays aligned with the input
            // (the first pixel of each row is a border pixel).
            output[i] = saturate_cast<uchar>(
                5 * current[i] - current[i - nChannels] -
                current[i + nChannels] - previous[i] - next[i]);
        }
    }

    Result.row(0).setTo(Scalar(0));
    Result.row(Result.rows - 1).setTo(Scalar(0));
    Result.col(0).setTo(Scalar(0));
    Result.col(Result.cols - 1).setTo(Scalar(0));
}
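
To see what the mask does to individual pixel values, the same arithmetic can be run on a tiny buffer without OpenCV. This is a minimal sketch under stated assumptions: it handles single-channel images only, the names clampToByte and sharpenGray are hypothetical, and the clamping imitates what saturate_cast<uchar> does for integer inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Clamp an int to [0, 255], like saturate_cast<uchar> does for ints.
unsigned char clampToByte(int v) {
    return static_cast<unsigned char>(std::min(255, std::max(0, v)));
}

// Sharpen a single-channel, row-major image of size rows x cols with the
// mask [0 -1 0; -1 5 -1; 0 -1 0]. Border pixels are left at 0, as in the
// listing. Hypothetical helper, not part of OpenCV.
std::vector<unsigned char> sharpenGray(const std::vector<unsigned char>& src,
                                       int rows, int cols) {
    assert(rows > 0 && cols > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<unsigned char> dst(src.size(), 0);
    for (int y = 1; y < rows - 1; ++y) {
        for (int x = 1; x < cols - 1; ++x) {
            const int c = y * cols + x;  // index of the center pixel
            const int v = 5 * src[c] - src[c - 1] - src[c + 1]
                        - src[c - cols] - src[c + cols];
            dst[c] = clampToByte(v);
        }
    }
    return dst;
}
```

On a 3x3 image filled with 10 whose center is 20, the center becomes 5 * 20 - 4 * 10 = 60: the difference from the neighbors grows from 10 to 50, which is the sharpening effect. A flat region keeps its value, because 5v - 4v = v.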

capture camera frames

  • VideoCapture(int) opens a camera by device index
  • VideoCapture(string) opens a video file or stream

For more detail, see the cv::VideoCapture Class Reference.

#include <iostream>
#include "opencv2/opencv.hpp"
using namespace cv;

int main(int, char**) {
    // VideoCapture cap(0);  // open the default camera

    VideoCapture cap("./videos/sample.mp4");
    if (!cap.isOpened()) {  // check if we succeeded
        std::cout << "Could not open source" << std::endl;
        return -1;
    }

    namedWindow("Video", WINDOW_AUTOSIZE);
    while (true) {
        Mat frame;
        cap >> frame;  // get the next frame from the source
        if (frame.empty()) break;  // end of the video or a read error
        imshow("Video", frame);

        // Press 'c' to exit
        if (waitKey(30) == 'c') break;
    }
    return 0;
}

compile args

compile with OpenCV

g++ source -o output `pkg-config --cflags --libs opencv`

With OpenCV 4, the pkg-config package is typically named opencv4 rather than opencv.

compile with OpenCV and OpenGL

g++ source -o output -lGL -lGLU -lglut `pkg-config --cflags --libs opencv`

compile with OpenCV, OpenGL and CUDA

g++ source -o output -L/usr/local/cuda/lib64/ -lcuda -lcudart -lGL -lGLU -lglut `pkg-config --cflags --libs opencv`

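get property

Reference: Set or Get Property from Camera using OpenCV

Use VideoCapture.set() or VideoCapture.get() with one of the property identifiers below. These are the legacy names; in OpenCV 3 and later the C++ constants drop the CV_ prefix (for example, cv::CAP_PROP_FPS).

  • CV_CAP_PROP_POS_MSEC Current position of the video, in milliseconds
  • CV_CAP_PROP_POS_FRAMES Current position of the video, in frames
  • CV_CAP_PROP_POS_AVI_RATIO Relative position in the video (playback percentage)
  • CV_CAP_PROP_FRAME_WIDTH Frame resolution (width)
  • CV_CAP_PROP_FRAME_HEIGHT Frame resolution (height)
  • CV_CAP_PROP_FPS Frames output per second
  • CV_CAP_PROP_FOURCC Four-character code of the codec
  • CV_CAP_PROP_FRAME_COUNT Total number of frames in the video
  • CV_CAP_PROP_FORMAT Format of the Mat objects returned by retrieve().
  • CV_CAP_PROP_MODE Backend-specific value indicating the current capture mode.
  • CV_CAP_PROP_BRIGHTNESS Image brightness
  • CV_CAP_PROP_CONTRAST Image contrast
  • CV_CAP_PROP_SATURATION Image saturation
  • CV_CAP_PROP_HUE Image hue
  • CV_CAP_PROP_GAIN Image gain
  • CV_CAP_PROP_EXPOSURE Image exposure
  • CV_CAP_PROP_CONVERT_RGB Whether images are converted to RGB
  • CV_CAP_PROP_WHITE_BALANCE White balance (currently not supported)
  • CV_CAP_PROP_RECTIFICATION Rectification flag for stereo cameras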

color

[latex]RGB \leftrightarrow GRAY[/latex]

[latex]\text{RGB[A] to Gray:}\quad Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B[/latex]
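
The weighted sum above can be checked on single pixels with a few lines of plain C++. This is a minimal sketch, not OpenCV's implementation: bgrToGray is a hypothetical helper, it rounds to the nearest integer, and the result of cvtColor may differ by one level because the library uses its own arithmetic internally.

```cpp
#include <cassert>
#include <cmath>

// Luma of one pixel, Y = 0.299 R + 0.587 G + 0.114 B.
// Arguments are taken in OpenCV's default B, G, R channel order.
// Hypothetical helper, not part of OpenCV.
unsigned char bgrToGray(unsigned char b, unsigned char g, unsigned char r) {
    const double y = 0.299 * r + 0.587 * g + 0.114 * b;
    const long rounded = std::lround(y);
    // The weights sum to 1, so the result always fits in a byte.
    assert(rounded >= 0 && rounded <= 255);
    return static_cast<unsigned char>(rounded);
}
```

Pure red, green and blue at full intensity map to 76, 150 and 29 respectively, which shows how strongly green dominates the perceived brightness.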

  • COLOR_BGR2RGB
  • COLOR_RGB2BGR
  • COLOR_BGR2GRAY (convert BGR to grayscale)
  • COLOR_BGR2XYZ (convert BGR to CIE XYZ)
  • COLOR_XYZ2BGR
  • COLOR_BGR2YCrCb (convert BGR to luma-chroma)
  • COLOR_YCrCb2BGR
  • COLOR_BGR2HSV (convert BGR to HSV)
  • COLOR_HSV2BGR
  • COLOR_BGR2Lab (convert BGR to CIE Lab)
  • COLOR_Lab2BGR
  • COLOR_BGR2HLS (convert BGR to HLS)
  • COLOR_HLS2BGR

Haar cascades

  • haarcascade_frontalface_default.xml
  • haarcascade_eye_tree_eyeglasses.xml
  • haarcascade_frontalface_alt_tree.xml
  • haarcascade_frontalface_alt.xml
  • haarcascade_frontalface_alt2.xml
  • haarcascade_fullbody.xml
  • haarcascade_lefteye_2splits.xml
  • haarcascade_lowerbody.xml
  • haarcascade_mcs_eyepair_big.xml
  • haarcascade_mcs_lefteye.xml
  • haarcascade_mcs_mouth.xml
  • haarcascade_mcs_nose.xml
  • haarcascade_mcs_righteye.xml
  • haarcascade_mcs_upperbody.xml
  • haarcascade_profileface.xml
  • haarcascade_righteye_2splits.xml
  • haarcascade_upperbody.xml
  • haarcascade_mcs_eyepair_small.xml

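modules

core. A compact module of basic data structures, including pixel-level image operations and conversion between the 1.0 and 2.0 formats
imgproc. Image processing module, including linear and non-linear image filters and geometric image transformations (scaling and flipping, color space conversion, histogram operations, and so on)
imgcodecs. Image format encoding and decoding, and image reading and writing
videoio. Camera operations
highgui. A simple window interface with keyboard and mouse handling
video. Object tracking, background subtraction, motion direction estimation, and so on
calib3d. Multiple-view geometry algorithms, image calibration, 3D object reconstruction, pose estimation, and so on
features2d. Feature extraction and feature description
objdetect. Object detection, such as face, eye, mouth, nose, and body detection
ml. Machine learning algorithms, such as KNN, Bayes classifiers, and SVM
flann. Implementation of the FLANN algorithms
photo. Image denoising and HDR
stitching. Image stitching techniques
cudaarithm. Operations on CUDA data structures
cudabgsegm. MOG background subtraction using CUDA
cudacodec. Video reading and writing using CUDA
cudafeatures2d. Some feature extraction techniques implemented with CUDA …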
resize

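void cvResize(const CvArr* src, CvArr* dst, int interpolation=CV_INTER_LINEAR )

    src: source image (input image)
    dst: destination image (output image)
    interpolation: interpolation method
        INTER_NEAREST: nearest-neighbor interpolation
        INTER_LINEAR: bilinear interpolation (default)
        INTER_CUBIC: bicubic interpolation
        INTER_AREA: resampling using the pixel area relation (can avoid moiré patterns)

cvResize belongs to the legacy C API; the C++ counterpart is cv::resize().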
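
To make the simplest of these methods concrete, here is a minimal sketch of nearest-neighbor resizing in plain C++. It is an illustration only: resizeNearest is a hypothetical helper working on a single-channel, row-major buffer, and OpenCV's INTER_NEAREST may pick a different source pixel at some scales.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Nearest-neighbor resize of a single-channel, row-major image.
// Each destination pixel copies the source pixel that its top-left corner
// maps to. Hypothetical helper, not part of OpenCV.
std::vector<unsigned char> resizeNearest(const std::vector<unsigned char>& src,
                                         int srcRows, int srcCols,
                                         int dstRows, int dstCols) {
    assert(srcRows > 0 && srcCols > 0 && dstRows > 0 && dstCols > 0);
    assert(src.size() == static_cast<std::size_t>(srcRows) * srcCols);
    std::vector<unsigned char> dst(static_cast<std::size_t>(dstRows) * dstCols);
    for (int y = 0; y < dstRows; ++y) {
        const int sy = y * srcRows / dstRows;  // floor of the scaled row
        for (int x = 0; x < dstCols; ++x) {
            const int sx = x * srcCols / dstCols;  // floor of the scaled column
            dst[static_cast<std::size_t>(y) * dstCols + x] =
                src[static_cast<std::size_t>(sy) * srcCols + sx];
        }
    }
    return dst;
}
```

Enlarging the 2x2 image {1, 2, 3, 4} to 4x4 turns every source pixel into a 2x2 block. No new values are invented, which is why nearest-neighbor output looks blocky, while the linear and cubic methods blend neighboring pixels.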

FaceDetection

Landmark Detection and 3D Face Reconstruction using Modern C++

Face Detection using Haar Cascades

Smooth Face Tracking with OpenCV

Face Detection and Recognition (Theory and Practice)

ComputerVision

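// Legacy C API headers (OpenCV 1.x style)
#include <opencv/cv.h>
#include <opencv/highgui.h>
#include <opencv/ml.h>
#include <assert.h>  // assert
#include <math.h>    // floor, ceil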

void doMosaic(IplImage* in, int x, int y,
              int width, int height, int size);

int main (int argc, char **argv)
{
  int i, c;
  IplImage *src_img = 0, *src_gray = 0;
  const char *cascade_name = "/opt/local/share/opencv/haarcascades/haarcascade_frontalface_default.xml";
  CvHaarClassifierCascade *cascade = 0;
  CvMemStorage *storage = 0;
  CvSeq *faces;
  
  cascade = (CvHaarClassifierCascade *) cvLoad (cascade_name, 0, 0, 0);
  cvNamedWindow ("Capture", CV_WINDOW_AUTOSIZE);
  CvCapture *capture = cvCreateCameraCapture(0);
  assert(capture != NULL);
  
  storage = cvCreateMemStorage (0);  // allocate once and reuse for every frame

  while (1) {
    src_img = cvQueryFrame (capture);
    if (src_img == NULL)
      break;  // no frame available
    src_gray = cvCreateImage (cvGetSize(src_img), IPL_DEPTH_8U, 1);
    
    cvClearMemStorage (storage);
    cvCvtColor (src_img, src_gray, CV_BGR2GRAY);
    cvEqualizeHist (src_gray, src_gray);
    
    faces = cvHaarDetectObjects (src_gray, cascade, storage,
                                 1.11, 4, 0, cvSize (40, 40));
    for (i = 0; i < (faces ? faces->total : 0); i++) {
      CvRect *r = (CvRect *) cvGetSeqElem (faces, i);
      doMosaic(src_img, r->x, r->y, r->width, r->height, 20);
    }
    
    cvShowImage("Capture", src_img);
    cvReleaseImage(&src_gray);
    
    c = cvWaitKey (2);
    if (c == '\x1b')
      break;
  }
  
  cvReleaseMemStorage (&storage);
  cvReleaseCapture (&capture);
  cvDestroyWindow ("Capture");
  
  return 0;
}

void doMosaic(IplImage* in, int x0, int y0,
              int width, int height, int size)
{
  int b, g, r, col, row;
  
  int xMin = size*(int)floor((double)x0/size);
  int yMin = size*(int)floor((double)y0/size);
  int xMax = size*(int)ceil((double)(x0+width)/size);
  int yMax = size*(int)ceil((double)(y0+height)/size);
  
  for(int y=yMin; y<yMax; y+=size){
    for(int x=xMin; x<xMax; x+=size){
      b = g = r = 0;
      for(int i=0; i<size; i++){
        if( y+i >= in->height ){  // valid rows are 0..height-1
          break;
        }
        row = i;
        for(int j=0; j<size; j++){
          if( x+j >= in->width ){  // valid columns are 0..width-1
            break;
          }
          b += (unsigned char)in->imageData[in->widthStep*(y+i)+(x+j)*3];
          g += (unsigned char)in->imageData[in->widthStep*(y+i)+(x+j)*3+1];
          r += (unsigned char)in->imageData[in->widthStep*(y+i)+(x+j)*3+2];
          col = j;
        }
      }
      row++;
      col++;
      for(int i=0;i<row;i++){
        for(int j=0;j<col;j++){
          in->imageData[in->widthStep*(y+i)+(x+j)*3]   = cvRound((double)b/(row*col));
          in->imageData[in->widthStep*(y+i)+(x+j)*3+1] = cvRound((double)g/(row*col));
          in->imageData[in->widthStep*(y+i)+(x+j)*3+2] = cvRound((double)r/(row*col));
        }
      }
    }
  }
}

face-recognizer-gui

threshold for recognize

Cascade Classification

Recommended values for OpenCV detectMultiScale() parameters

OpenCV detectMultiScale() minNeighbors parameter

OpenCV原理解读之HAAR+Adaboost (Understanding OpenCV internals: HAAR + AdaBoost)

Face Recognition with OpenCV

How to do face recognition in OpenCV and how to improve it by aligning faces

CascadeClassifier::detectMultiScale()

Amongst these parameters, you need to pay more attention to four of them:

scaleFactor – Parameter specifying how much the image size is reduced at each image scale.

Basically the scale factor is used to create your scale pyramid. More explanation can be found here. In short, as described here, your model has a fixed size defined during training, which is visible in the xml. This means that this size of face is detected in the image if present. However, by rescaling the input image, you can resize a larger face to a smaller one, making it detectable by the algorithm.

1.05 is a good value to start with. It means the image is shrunk in small steps, about 5% per scale, which increases the chance that one of the scales matches the size the model was trained on. It also makes the algorithm slower, since the search is more thorough. You may increase it to as much as 1.4 for faster detection, at the risk of missing some faces altogether.
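
The speed/thoroughness trade-off can be made concrete by counting how many scales a given scaleFactor produces. The following is a rough sketch in plain C++, not the detector's actual code: pyramidScales is a hypothetical helper, it treats the image and the model window as squares, and the 24-pixel model size used in the note below is an assumption for illustration.

```cpp
#include <cassert>
#include <vector>

// List the scales at which a detector with a fixed modelSize x modelSize
// window would be evaluated on an imageSize x imageSize image.
// Hypothetical helper, not part of OpenCV.
std::vector<double> pyramidScales(int imageSize, int modelSize,
                                  double scaleFactor) {
    assert(imageSize > 0 && modelSize > 0 && scaleFactor > 1.0);
    std::vector<double> scales;
    // At scale s the image is shrunk by s, so the fixed-size model covers
    // modelSize * s pixels of the original image. Stop once that window
    // no longer fits inside the image.
    for (double s = 1.0; modelSize * s <= imageSize; s *= scaleFactor)
        scales.push_back(s);
    return scales;
}
```

For an assumed 24-pixel model on a 240-pixel image, scaleFactor = 1.05 gives 48 scales and scaleFactor = 1.4 gives only 7. The larger factor does roughly a seventh of the work, but it skips many intermediate face sizes.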

minNeighbors – Parameter specifying how many neighbors each candidate rectangle should have to retain it.

This parameter affects the quality of the detected faces. A higher value results in fewer detections, but of higher quality. A value between 3 and 6 is a good choice.

minSize – Minimum possible object size. Objects smaller than that are ignored.

This parameter determines the smallest object you want to detect. You decide it! Usually, [30, 30] is a good start for face detection.

maxSize – Maximum possible object size. Objects bigger than this are ignored.

This parameter determines the largest object you want to detect. Again, you decide it! Usually you don't need to set it manually: the default value assumes you want to detect faces without an upper limit on their size.

another example