Camera-to-World Coordinate Conversion

Some image-processing tasks in engineering require real physical dimensions. This article explains how to obtain real physical dimensions from an image based on the correspondence between the image coordinate system and the real physical coordinate system.

Pinhole camera model

First, review the common 3D geometric transformations (the original figure summarizing them is not reproduced here).

The monocular camera imaging model maps a world point to a pixel through the following steps:

world coordinates → camera coordinates (extrinsics [R|t]) → normalized image plane (perspective division) → pixel coordinates (intrinsics K)
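As a minimal sketch of that chain (with made-up intrinsics and pose, for illustration only):

import numpy as np

# Hypothetical intrinsics: fx = fy = 700, principal point (320, 240)
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
# Hypothetical extrinsics: identity rotation, camera 5 m above the scene
R = np.eye(3)
t = np.array([[0.0], [0.0], [5.0]])

# A world point (in meters), as a column vector
Pw = np.array([[1.0], [0.5], [0.0]])

Pc = R @ Pw + t           # world -> camera coordinates
uv1 = K @ (Pc / Pc[2])    # perspective division, then intrinsics
print(uv1[:2].flatten())  # pixel coordinates (u, v) = (460, 310)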

Solving the transformations

First, state the requirement clearly: given an image, convert it into a top-down (bird's-eye) view.

Because mapping 3D points onto a 2D image plane loses information irreversibly (namely, the information along the depth direction), a single image alone cannot be converted into a top-down view (or any other view). However, if the points all lie on one plane, the image can be rectified into a view perpendicular to that plane, because two images of the same plane are related by a homography.

One approach is to pick four points in the image that must form a rectangle in the top-down view. From the point correspondences you can compute the homography matrix, and then warp the image with it.

Another approach is to recover the camera pose with PnP, choose a rectangle in world coordinates, project it into pixel coordinates using that pose, and then compute the homography from the correspondence between the pixel coordinates and the rectangle's coordinates. The appeal of this approach is that the intrinsics and orientation can be obtained from the camera itself (the world coordinate system can be placed at the camera origin, which removes the translation), so even if the camera moves there is no need to revisit the site to find the rectangle's corresponding points. I have not tried it in practice, though; a rough sketch follows.
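Though the author notes this is untested, a minimal sketch of the idea could look as follows; the object_points, rect_3d and intrinsic values are assumptions for illustration only (the pixel coordinates reuse pts1 from the code further below):

import cv2
import numpy as np

# Assumed intrinsics (same values as the C++ snippet below) and made-up
# 3D-2D correspondences on the target plane (z = 0), in meters / pixels
camera_matrix = np.float32([[700, 0, 320], [0, 700, 240], [0, 0, 1]])
dist_coeffs = np.zeros((1, 4), np.float32)
object_points = np.float32([[0, 0, 0], [2, 0, 0], [2, 1, 0], [0, 1, 0]])
image_points = np.float32([[658, 227], [943, 286], [837, 597], [516, 527]])

# Recover the camera pose relative to the plane
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, dist_coeffs)

# Pick any rectangle on the z = 0 plane and project its corners into the image
rect_3d = np.float32([[0.5, 0.2, 0], [1.5, 0.2, 0], [1.5, 0.8, 0], [0.5, 0.8, 0]])
rect_2d, _ = cv2.projectPoints(rect_3d, rvec, tvec, camera_matrix, dist_coeffs)
rect_2d = np.float32(rect_2d.reshape(-1, 2))

# The homography then maps those pixels onto a rectangle in the top view
top_view = np.float32([[0, 0], [400, 0], [400, 240], [0, 240]])
H = cv2.getPerspectiveTransform(rect_2d, top_view)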

As for converting an image from one viewpoint to another using the intrinsics and extrinsics: first compute the matrix that transforms the current view into the other view, i.e. the matrix mapping points in the current view's camera frame to points in the other view's camera frame. Invert the intrinsic matrix to map pixel coordinates of the current view into its camera frame, multiply by that view-transform matrix to reach the other view's camera frame, then multiply by the intrinsic matrix to land back in pixel coordinates. Reference: https://blog.csdn.net/u013019296/article/details/119259585

// Camera intrinsic matrix
Mat cameraMatrix = (Mat_<double>(3, 3) << 700.0, 0.0, 320.0,
                                          0.0, 700.0, 240.0,
                                          0.0, 0.0, 1.0);
// Extrinsics of the current view
Mat R1 = c1Mo(Range(0, 3), Range(0, 3));
// Extrinsics of the other view
Mat R2 = c2Mo(Range(0, 3), Range(0, 3));
// c1Mo * oMc2
// Matrix taking points from the current view's camera frame to the other view's camera frame;
// for a rotation matrix the transpose equals the inverse, hence R2.t()
Mat R_2to1 = R1 * R2.t(); // relative rotation between the two camera frames
// [compute-homography]
// Multiply the three together to obtain the transform between the two views
Mat H = cameraMatrix * R_2to1 * cameraMatrix.inv(); // homography between the two images of the same plane
H /= H.at<double>(2, 2); // normalize
cout << "H:\n" << H << endl;
  • Informal definition of homography: images of the same planar object taken by a distortion-free camera from different positions are related by a homography, which can be represented as a perspective transform; it yields a matrix that directly describes the mapping between image coordinates.
  • Epipolar constraint: a way to solve for camera motion (rotation + translation) from point correspondences between two views.
  • Triangulation: a method for estimating point depth after applying the epipolar constraint (a sketch combining this with the epipolar constraint appears after this list).
  • PnP (Perspective-n-Point) solves for motion from 3D-to-2D point pairs: given n 3D points and their projected positions, it estimates the camera pose. In stereo or RGB-D visual odometry, PnP can be used directly to estimate camera motion; in monocular visual odometry, initialization must happen first. The 3D-2D approach needs no epipolar constraint and gives good motion estimates from very few matched points, making it the most important pose-estimation method.
  • The PnP problem has many solvers, e.g. P3P (pose from three point pairs), the direct linear transform (DLT), EPnP (Efficient PnP), UPnP, and so on. It can also be solved with nonlinear optimization, building a least-squares problem and iterating, i.e. the all-purpose Bundle Adjustment.
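For the epipolar-constraint and triangulation bullets, here is a minimal sketch under stated assumptions: pts1/pts2 are hypothetical matched pixel coordinates from two views and K is an assumed intrinsic matrix. It illustrates the OpenCV calls and is not code from the original post.

import cv2
import numpy as np

# Assumed intrinsics and hypothetical matched points between two views
K = np.float64([[700, 0, 320], [0, 700, 240], [0, 0, 1]])
pts1 = np.float64([[100, 120], [320, 240], [400, 80], [250, 300],
                   [500, 400], [60, 60], [150, 350], [420, 220]])
pts2 = np.float64([[112, 117], [333, 243], [417, 76], [261, 306],
                   [519, 397], [71, 57], [163, 353], [438, 217]])

# Epipolar constraint: estimate the essential matrix, then decompose it into R, t
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
E = E[:3]  # findEssentialMat may stack several candidate matrices
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K)

# Triangulation: build the two projection matrices and recover depth (up to scale)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
pts3d = (pts4d[:3] / pts4d[3]).T  # homogeneous -> Euclidean
print(pts3d)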

Related code

The pixel coordinates of points in an image can be obtained with the following code:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# Display the image and record four mouse clicks as (x, y) pixel coordinates
im = np.array(Image.open(r"water.jpg"))
plt.imshow(im)
print('Please click 4 points')
x = plt.ginput(4)
print('you clicked:', x)
plt.show()

Then the homography matrix can be computed and used to produce the top view (note: to obtain a true top view, the four mapped points must form a rectangle in the top-down view, because the method simply maps these four points onto specified pixels of that view; this is what makes the approach practical):

import cv2
import numpy as np

img = cv2.imread(r"water.jpg")
h, w, _ = img.shape  # OpenCV images are (rows, cols, channels)

width, height = 250, 350  # desired output size

# pixel coordinates of the four corners of the region of interest
pts1 = np.float32([[658, 227], [943, 286], [516, 527], [837, 597]])
# where those four points should land in the top view
pts2 = np.float32([[0+100, 0+100], [0.2*width+100, 0+100], [0+100, 0.2*height+100], [0.2*width+100, 0.2*height+100]])
matrix_K = cv2.getPerspectiveTransform(pts1, pts2)  # compute the transformation matrix
print(matrix_K)
img_K = cv2.warpPerspective(img, matrix_K, (width, height))  # apply the perspective warp

cv2.imshow("Original Image", img)
cv2.imshow("img K", img_K)

cv2.waitKey(0)

The code above only handles an ideal case: you need to find a rectangle on the water surface and know its actual size, but the water level keeps changing, so finding an ideal rectangle is hard. Could we instead measure the coordinates of four points on some fixed reference plane and still cope with a changing water level? The code below solves this engineering problem: the reference points are re-projected onto the current water surface through the lens position using similar triangles, and the homography is then recomputed for the new water plane.

import numpy as np
import cv2
from shapely.geometry import Polygon, LineString
from shapely.affinity import rotate
import rasterio
import argparse


def get_gcp_mean(dst):
    return np.array(dst).mean(axis=0)


# Build a rectangular bbox from the four user-defined points of interest
def set_bbox_from_corners(corners, resolution, dst):
    """
    Establish a bbox based on a set of camera-perspective corner points and assign the corner
    coordinates to the camera configuration

    Parameters
    ----------
    corners : list of lists (4)
        [column, row] coordinates in camera perspective
    """
    assert (np.array(corners).shape == (4, 2)), \
        f"a list of lists of 4 coordinates must be given, resulting in (4, 2) shape. " \
        f"Current shape is {np.array(corners).shape} "

    # get the homography; this is the only place besides the solvers themselves where get_M is used
    M_gcp = get_M()
    # TODO: make derivation dependent on 3D or 2D point availability
    bbox = get_aoi(M_gcp, corners, resolution=0.05)
    # bbox is offset by gcp_mean (it was subtracted inside get_M), so add it back here
    bbox_xy = np.array(bbox.exterior.coords)
    gcp_mean = np.array(dst).mean(axis=0)
    bbox_xy += gcp_mean[0:2]
    bbox = Polygon(bbox_xy)
    return bbox, bbox_xy[0:4]

# Obtain a rectangular bbox from the points of interest in pixel coordinates and the homography
def get_aoi(M, src_corners, resolution=None):
    """Get rectangular AOI from 4 user-defined points within frames.

    Parameters
    ----------
    M : np.ndarray
        homography matrix from camera perspective to projected coordinates
    src_corners : list of 4 (col, row) tuples
        source corners in camera perspective
    resolution : float, optional
        if set, the bbox bounds are rounded to a multiple of this value

    Returns
    -------
    aoi : shapely.geometry.Polygon
        bounding box of aoi (rotated)
    """
    # prepare a simple temporary np.array of the src_corners
    _src_corners = np.array(src_corners)
    assert (_src_corners.shape == (4, 2)), \
        f"a list of lists of 4 coordinates must be given, resulting in (4, 2) shape. " \
        f"Current shape is {_src_corners.shape} "
    # reproject corner points to the actual space in coordinates
    _dst_corners = cv2.perspectiveTransform(np.float32([_src_corners]), M)[0]
    polygon = Polygon(_dst_corners)
    coords = np.array(polygon.exterior.coords)
    # estimate the angle of the bounding box:
    # retrieve the average line across the AOI
    point1 = (coords[0] + coords[3]) / 2
    point2 = (coords[1] + coords[2]) / 2
    diff = point2 - point1
    angle = np.arctan2(diff[1], diff[0])
    # rotate the polygon over this angle to get a proper bounding box
    polygon_rotate = rotate(
        polygon, -angle, origin=tuple(_dst_corners[0]), use_radians=True
    )
    xmin, ymin, xmax, ymax = polygon_rotate.bounds
    if resolution is not None:
        xmin = round_to_multiple(xmin, resolution)
        xmax = round_to_multiple(xmax, resolution)
        ymin = round_to_multiple(ymin, resolution)
        ymax = round_to_multiple(ymax, resolution)
    # what is returned here is an axis-aligned rectangle
    bbox_coords = [(xmin, ymax), (xmax, ymax), (xmax, ymin), (xmin, ymin), (xmin, ymax)]
    bbox = Polygon(bbox_coords)
    # now rotate back
    bbox = rotate(bbox, angle, origin=tuple(_dst_corners[0]), use_radians=True)
    return bbox


def round_to_multiple(number, multiple):
    return multiple * round(number / multiple)



def get_bbox(bbox, dst, camera=False, h_a=None, redistort=False):
    """
    Parameters
    ----------
    camera : bool, optional
        If set, the bounding box will be returned as row and column coordinates in the camera perspective.
        In this case ``h_a`` may be set to provide the right water level to estimate the bounding box for.
    h_a : float, optional
        If set with ``camera=True``, then the bbox coordinates will be transformed to the camera perspective,
        using h_a as the present water level. In case a video with higher (lower) water levels is used, this
        will result in a different perspective plane than the control video.
    redistort : bool, optional
        If set in combination with ``camera``, the bbox will be redistorted in the camera objective using the
        distortion coefficients and camera matrix. Not used in orthorectification because that occurs by default
        on already undistorted images.

    Returns
    -------
    A bounding box that is perfectly rectangular in the used CRS and aligned in the up/downstream direction.
    It can (and certainly will) be rotated with respect to a typical bbox with xlim, ylim coordinates.
    If the user sets ``camera=True``, the geographical bounding box is converted into the camera perspective,
    using the homography belonging to the available ground control points and the current water level.

    This can then be used to reconstruct the grid for velocimetry calculations.
    """
    if camera:
        coords = np.array(bbox.exterior.coords)
        # convert to perspective row/col coordinates
        M = get_M(reverse=True, h_a=h_a)
        # reduce coords by the control-point mean
        gcp_mean = np.array(dst).mean(axis=0)
        coords -= gcp_mean[0:2]
        # TODO: re-distort if needed
        corners = cv2.perspectiveTransform(np.float32([coords]), M)[0]
        bbox = Polygon(corners)
    return bbox


def get_M(h_a=None, to_bbox_grid=False, reverse=False):
    """Establish a transformation matrix for a certain actual water level `h_a`. This is done by mapping where
    the ground control points, measured at `h_ref`, end up with the new water level `h_a`, given the lens position.

    Note: `dst`, `src`, `bbox`, `resolution`, `gcp_dims`, `camera_matrix` and `dist_coeffs` are module-level
    configuration values set in the `__main__` block below.

    Parameters
    ----------
    h_a : float, optional
        actual water level [m] (Default: None)
    to_bbox_grid : bool, optional
        if set, M will be computed in row, column values of the target bbox, with the set resolution
    reverse : bool, optional
        if True, the reverse matrix is prepared, which can be used to transform projected
        coordinates back to the original camera perspective. (Default: False)

    Returns
    -------
    M : np.ndarray
        3x3 homography matrix
    """
    if to_bbox_grid:
        # look up where the destination points are in row/column space;
        # dst_a holds the destination point locations at the actual water level
        dst_a = transform_to_bbox(
            get_dst_a(h_a),
            bbox,
            resolution
        )
    else:
        # with 2D 4-point GCPs, reproject the reference-plane points onto the water surface;
        # otherwise keep the 3D points
        dst_a = get_dst_a(h_a) if gcp_dims == 2 else dst

    # reduce dst_a by its mean to get a much more accurate projection result when the
    # x and y order of magnitude is very large
    gcp_mean = np.array(dst).mean(axis=0)
    dst_a = np.array(dst_a, dtype=float) - gcp_mean
    if dst_a.shape[-1] == 3:
        # compute the water level in the coordinate system reduced by the mean gcp coordinate
        z_a = get_z_a(h_a) - gcp_mean[-1]
        # treat as a 3D homography
        return get_M_3D(
            src=src,
            dst=dst_a,
            camera_matrix=camera_matrix,
            dist_coeffs=dist_coeffs,
            z=z_a,
            reverse=reverse
        )
    else:
        return get_M_2D(
            src=src,
            dst=dst_a,
            reverse=reverse
        )


# Obtain the homography matrix from four corresponding 2D point pairs
def get_M_2D(src, dst, reverse=False):
    """
    Retrieve the homography matrix between (4) src and (4) dst points with only x, y coordinates (no z)

    Parameters
    ----------
    src : list of lists
        [x, y] with source coordinates, typically cols and rows in image
    dst : list of lists
        [x, y] with target coordinates after reprojection, can e.g. be in crs [m]
    reverse : bool, optional
        If set, the reverse homography to back-project to the camera objective will be retrieved

    Returns
    -------
    M : np.ndarray
        homography matrix (3x3), used in cv2.warpPerspective
    """
    # set points to float32
    _src = np.float32(src)
    _dst = np.float32(dst)
    # define transformation matrix based on GCPs
    if reverse:
        M = cv2.getPerspectiveTransform(_dst, _src)
    else:
        M = cv2.getPerspectiveTransform(_src, _dst)
    return M

# First obtain the camera extrinsics (intrinsics must be known); this yields the mapping from pixels to world
# coordinates (the desired top-view plane is the x-y plane with z = 0), and that mapping is exactly the
# homography we are after
def get_M_3D(src, dst, camera_matrix, dist_coeffs=np.zeros((1, 4)), z=0., reverse=False):
    """
    Retrieve the homography matrix between (6+) 2D src and (6+) 3D dst (x, y, z) points

    Parameters
    ----------
    src : list of lists
        [x, y] with source coordinates, typically cols and rows in image
    dst : list of lists
        [x, y, z] with target coordinates after reprojection, can e.g. be in crs [m]
    camera_matrix : np.ndarray (3x3)
        Camera intrinsic matrix
    dist_coeffs : np.ndarray, optional
        1xN array with distortion coefficients (N = 4, 5 or 8)
    z : float, optional
        Elevation plane (real-world coordinate crs) of the projected image
    reverse : bool, optional
        If set, the reverse homography to back-project to the camera objective will be retrieved

    Returns
    -------
    M : np.ndarray
        homography matrix (3x3), used in cv2.warpPerspective

    Notes
    -----
    See the rectification workflow in OpenCV:
    http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
    Code based on:
    https://www.openearth.nl/flamingo/_modules/flamingo/rectification/rectification.html
    """
    # set points to float32
    _src = np.float32(src)
    _dst = np.float32(dst)
    camera_matrix = np.float32(camera_matrix)
    dist_coeffs = np.float32(dist_coeffs)
    # solve for the extrinsics (pose) from the GCPs
    success, rvec, tvec = cv2.solvePnP(_dst, _src, camera_matrix, dist_coeffs)
    # convert rotation vector to rotation matrix
    R = cv2.Rodrigues(rvec)[0]
    # assume the height of the projection plane: fold z into the third column
    R[:, 2] = R[:, 2] * z
    # add translation vector
    R[:, 2] = R[:, 2] + tvec.flatten()
    # compute homography
    if reverse:
        # from perspective to objective
        M = np.dot(camera_matrix, R)
    else:
        # from objective to perspective
        M = np.linalg.inv(np.dot(camera_matrix, R))
    # normalize homography before returning
    return M / M[-1, -1]


def transform_to_bbox(coords, bbox, resolution):
    """Transforms a set of coordinates defined in the crs of bbox into a set of cv2-compatible pixel coordinates

    Parameters
    ----------
    coords : list of lists
        [x, y] with coordinates
    bbox : shapely Polygon
        Bounding box. The coordinate order is very important and has to be upstream-left, downstream-left,
        downstream-right, upstream-right, upstream-left
    resolution : float
        resolution of target pixels within bbox

    Returns
    -------
    colrows : list
        tuples of columns and rows
    """
    # first assemble x and y coordinates
    xs, ys = zip(*coords)
    transform = _get_transform(bbox, resolution)
    rows, cols = rasterio.transform.rowcol(transform, xs, ys, op=float)
    return list(zip(cols, rows))



def _get_transform(bbox, resolution=0.01):
    """Return a rotated Affine transformation that fits the bounding box and resolution.

    Parameters
    ----------
    bbox : shapely.geometry.Polygon
        polygon of bounding box. The coordinate order is very important and has to be:
        (upstream-left, downstream-left, downstream-right, upstream-right, upstream-left)
    resolution : float, optional
        resolution of target grid in meters (default: 0.01)

    Returns
    -------
    affine : rasterio.transform.Affine
    """
    corners = np.array(bbox.exterior.coords)
    # estimate the angle of the bounding box
    top_left_x, top_left_y = corners[0]
    # retrieve the average line across the AOI
    point1 = corners[0]
    point2 = corners[1]
    diff = point2 - point1
    # compute the angle of the projected bbox area of interest
    angle = np.arctan2(diff[1], diff[0])
    # compute the x and y difference per column
    dx_col, dy_col = np.cos(angle) * resolution, np.sin(angle) * resolution
    # compute the x and y offsets per row
    dx_row, dy_row = (
        np.cos(angle + 1.5 * np.pi) * resolution,
        np.sin(angle + 1.5 * np.pi) * resolution,
    )
    return rasterio.transform.Affine(
        dx_col, dy_col, top_left_x, dx_row, dy_row, top_left_y
    )

# Map the four reference-plane points onto the water surface
def get_dst_a(h_a=0.0, h_ref=0.0, lens_position=None):
    """
    h_a : float, optional
        actual water level measured [m]; if not set, the assumption is that a single video
        is processed and changes in water level are not relevant. (default: None)

    Returns
    -------
    Actual locations of the control points (in case these are only x, y) given the currently set water level
    and the camera location
    """
    # map where the destination points are with the actual water level h_a
    if h_a is None or h_a == h_ref:
        # reference and actual water level coincide: the points are unchanged
        dst_a = dst
    else:
        dst_a = _get_gcps_a(
            lens_position,
            h_a,
            dst,
            z_0,
            h_ref,
        )
    return dst_a

# Compute the real positions of the four points on the water surface using similar triangles
def _get_gcps_a(lensPosition, h_a, coords, z_0=0.0, h_ref=0.0):
    """Get the actual x, y locations of ground control points at the actual water level

    Parameters
    ----------
    lensPosition : list of floats
        x, y, z location of cam in local crs [m]
    h_a : float
        actual water level in local level measuring system [m]
    coords : list of lists
        gcp coordinates [x, y] at the original water level
    z_0 : float, optional
        reference zero plane level, i.e. the crs elevation in meters of the zero level of the staff gauge
        (default: 0.0)
    h_ref : float, optional
        reference water level during taking of gcp coords with ref to staff gauge zero level (default: 0.0)

    Returns
    -------
    coords : list
        rows/cols for use in getPerspectiveTransform
    """
    # get modified gcps based on camera location and elevation values
    cam_x, cam_y, cam_z = lensPosition
    x, y = zip(*coords)
    # compute z at the time the gcp coordinates were taken
    z_ref = h_ref + z_0
    # compute z for the current frame
    z_a = z_0 + h_a
    # compute the water table to camera height difference during field referencing
    cam_height_ref = cam_z - z_ref
    # compute the actual water table to camera height difference
    cam_height_a = cam_z - z_a
    rel_diff = cam_height_a / cam_height_ref
    # apply the difference to all coordinates, in both x and y directions
    _dest_x = list(cam_x + (np.array(x) - cam_x) * rel_diff)
    _dest_y = list(cam_y + (np.array(y) - cam_y) * rel_diff)
    dest_out = list(zip(_dest_x, _dest_y))
    return dest_out


def get_z_a(h_a=None):
    """
    h_a : float, optional
        actual water level measured [m]; if not set, the assumption is that a single video
        is processed and changes in water level are not relevant. (default: None)

    Returns
    -------
    Elevation of the water surface given the currently set water level
    """
    if h_a is None:
        return z_0
    else:
        return z_0 + (h_a - h_ref)


def shape():
    """
    Returns rows and columns in projected frames from ``Frames.project``

    Returns
    -------
    rows : int
        Number of rows in the projected frame
    cols : int
        Number of columns in the projected frame
    """
    cols, rows = _get_shape(
        bbox,
        resolution=resolution,
        round=1
    )
    return rows, cols


def _get_shape(bbox, resolution=0.01, round=1):
    """
    Defines the number of rows and columns needed in a target raster to fit a given bounding box.

    :param bbox: shapely Polygon, bounding box
    :param resolution: resolution of target raster
    :param round: number of pixels to round the intended shape to
    :return: number of columns and rows for the target raster
    """
    coords = bbox.exterior.coords
    box_length = LineString(coords[0:2]).length
    box_width = LineString(coords[1:3]).length
    cols = int(np.ceil((box_length / resolution) / round)) * round
    rows = int(np.ceil((box_width / resolution) / round)) * round
    return cols, rows

if __name__ == '__main__':

    # 4 corresponding pixel coordinates and real-world coordinates are needed
    src = [
        [1421, 1001],
        [1251, 460],
        [421, 432],
        [470, 607]
    ]
    dst = [
        [642735.8076, 8304292.1190],  # lowest right coordinate
        [642737.5823, 8304295.593],   # highest right coordinate
        [642732.7864, 8304298.4250],  # highest left coordinate
        [642732.6705, 8304296.8580]   # lowest left coordinate
    ]
    corners = [
        [292, 817],
        [50, 166],
        [1200, 236],
        [1600, 834]
    ]
    # real-world distance per pixel in the top view, in meters
    resolution = 0.01
    # water level at the reference points
    h_ref = 0.0
    # current water level
    h_a = 0.0
    # camera position in world coordinates
    lens_position = None
    # reference elevation (datum)
    z_0 = 1182.2
    gcp_dims = 2
    camera_matrix = None
    dist_coeffs = None

    flags = cv2.INTER_AREA
    # derive a suitable rectangle in world coordinates from corners
    bbox, bbox_xy = set_bbox_from_corners(corners, resolution, dst)
    # map this rectangular region back into pixel coordinates
    src_v2 = get_bbox(bbox, dst, camera=True, h_a=0.0).exterior.coords[0:4]
    # choose the dst coordinates to map to; the simplest choice is [[0,0],[width,0],[0,height],[width,height]].
    # Here the author deliberately picks coordinates so that one pixel corresponds to 0.01 m.
    # Note: the view conversion works precisely because the bbox is a true rectangle in world coordinates.
    dst_xy = bbox_xy
    dst_v2 = transform_to_bbox(dst_xy, bbox, resolution)
    # obtain the transformation matrix
    M = get_M_2D(src_v2, dst_v2)

    # apply the transformation matrix to obtain the top view
    # img = cv2.imread('ng_test.png')
    video_path = r'ngwerere_20191103.mp4'
    cap = cv2.VideoCapture(video_path)
    frame_list = []
    while True:
        ret, frame = cap.read()
        if ret:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            frame_list.append(frame)
        else:
            break
    rows, cols = shape()
    img1 = frame_list[0]
    img2 = frame_list[1]
    # cv2.warpPerspective expects dsize as (width, height) = (cols, rows)
    final_frame1 = cv2.warpPerspective(img1, M, (cols, rows), flags=flags)
    final_frame2 = cv2.warpPerspective(img2, M, (cols, rows), flags=flags)
    cv2.imwrite("transfer_ng1v2_1.png", final_frame1)
    cv2.imwrite("transfer_ng2v2_2.png", final_frame2)

Once the top view is obtained, quantities such as area and velocity can be computed.
There are two families of methods for estimating velocity from water ripples. One is Space-Time Image Velocimetry (STIV): build a space-time texture image from the ripples, then derive the velocity from the texture angle.
The other is the Particle Image Velocimetry (PIV) family of image-based flow measurement methods, which likewise find distinctive descriptors on the water surface and track them.
The code below is a partial STIV implementation.

import datetime
import cv2
import numpy as np
import math
import time
import os
import m3u8
import requests
import eventlet


def get_alpha_confidence(img, window=3):
    # crop to an even-sized square so it divides evenly into window x window blocks
    length = min(img.shape[0], img.shape[1])
    length = length if length % 2 == 0 else length - 1
    img = img[0:length, 0:length]

    # edge response in the horizontal direction
    scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
    # edge response in the vertical direction
    scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)

    step = int(length / window)

    alpha_list = []
    c_list = []

    for i in range(window):
        for j in range(window):
            Gxx = np.sum(scharrx[i*step: (i+1)*step, j*step: (j+1)*step] * scharrx[i*step: (i+1)*step, j*step: (j+1)*step])
            Gxy = np.sum(scharrx[i*step: (i+1)*step, j*step: (j+1)*step] * scharry[i*step: (i+1)*step, j*step: (j+1)*step])
            Gyy = np.sum(scharry[i*step: (i+1)*step, j*step: (j+1)*step] * scharry[i*step: (i+1)*step, j*step: (j+1)*step])
            # texture angle of this block from the gradient tensor
            alpha = (math.atan((2 * Gxy) / (Gyy - Gxx)) / 2) * 180 / math.pi
            if alpha < 0:
                alpha = 90 + alpha

            # coherence, used as a confidence weight for the angle
            c = math.sqrt((Gxx - Gyy) ** 2 + 4 * (Gxy ** 2)) / (Gxx + Gyy)
            alpha_list.append(alpha)
            c_list.append(c)

    # confidence-weighted average of the per-block angles
    f_alpha = 0
    c_sum = 0
    for i in range(len(alpha_list)):
        f_alpha = f_alpha + alpha_list[i] * c_list[i]
        c_sum = c_sum + c_list[i]
    print(alpha_list)
    print(c_list)
    f_alpha = f_alpha / c_sum

    return f_alpha


def timeout(_time):
    with eventlet.Timeout(_time, False):  # set the timeout
        time.sleep(2)
        return 'str'
    return 'sorry, the function call timed out'


def get_xt_img(url, max_count=15, frames=200, width_start=267, width_end=350, height_start=334):
    size = 0
    time_cur = str(time.strftime('%Y-%m-%d-%H-%M-%S'))
    check = []

    img_path = os.path.join("path-to-save")
    if not os.path.exists(img_path):
        os.makedirs(img_path)

    # video_path = img_path + "\\" + 'stiv_post_ljx_online.mp4'
    video_path = "path-to-video"
    if os.path.exists(video_path):
        # remove any previous download
        try:
            os.remove(video_path)
        except Exception as r:
            print("delete online.mp4 wrong {}".format(r))

    # download up to max_count batches of new m3u8 segments and append them to the video file
    i = 0
    while i < max_count:
        try:
            print("check: {}".format(check))
            playlist = m3u8.load(uri=url, headers=headers, timeout=3)
            for seg in playlist.segments:
                if seg.absolute_uri[seg.absolute_uri.find("?") - 5:seg.absolute_uri.find("?")] not in check:
                    check.append(seg.absolute_uri[seg.absolute_uri.find("?") - 5:seg.absolute_uri.find("?")])
                else:
                    continue
                with open(video_path, "ab") as f:
                    r = requests.get(seg.absolute_uri, headers=headers, timeout=5)
                    data = r.content
                    size += len(data)
                    if data:
                        f.write(data)

            i = i + 1
        except:
            i = i + 1

        timeout(3)

    if os.path.exists(video_path):
        cap = cv2.VideoCapture(video_path)
        c = 0
        frame_list = []
        while True:
            ret, frame = cap.read()

            # rotate the frame so the measurement line is horizontal
            alpha_rotate = math.atan(81 / 171) * 180 / math.pi
            if frame is not None:
                frame = rotate(frame, -alpha_rotate)
                if c == 0:
                    cv2.imwrite("xuanzh.png", frame)
            if ret:
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                frame_list.append(frame)
                c += 1
            else:
                print("all frames have been saved")
                break

        # frame_sample = frame_list[::5]
        # build the x-t (space-time) image: one pixel row per frame,
        # sampled along a fixed line on the water surface
        xt_img = np.zeros((frames, width_end - width_start), np.uint8)
        for k in range(frames):
            cur_frame = cv2.cvtColor(frame_list[k], cv2.COLOR_BGR2GRAY)
            xt_img[k, :] = np.flip(cur_frame[height_start, width_start:width_end])

    return xt_img, time_cur


def rotate(image, angle, center=None, scale=1.0):
    # grab the dimensions of the image
    (h, w) = image.shape[:2]

    # if the center is None, initialize it as the center of the image
    if center is None:
        center = (w // 2, h // 2)

    # perform the rotation
    M = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(image, M, (w, h))

    # return the rotated image
    return rotated


if __name__ == '__main__':

    url = r"url-to-get-video"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36 Edg/86.0.622.56",
        "Connection": "close"
    }

    image, time_cur = get_xt_img(url)
    cv2.imwrite('lyx_test.png', image)
    alpha = get_alpha_confidence(image)
    print(alpha)
    cv2.imshow("xt image", image)

    cv2.waitKey(0)
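The texture angle alone is not yet a velocity. As a hedged follow-up (a sketch, not part of the original code): if each column of the x-t image spans a known real-world distance (e.g. the 0.01 m/pixel resolution of the orthorectified view above) and each row spans one frame interval, the angle converts to a surface speed via its tangent. The exact form depends on the angle convention used in get_alpha_confidence, so treat the axes and signs here as assumptions.

import math

def angle_to_velocity(alpha_deg, pixel_size=0.01, fps=25.0):
    # slope of the ripple traces in the x-t image: (space pixels) per (time pixel);
    # scale by meters-per-pixel and frames-per-second to obtain m/s
    return math.tan(math.radians(alpha_deg)) * pixel_size * fps

# e.g. with the angle computed above:
# velocity = angle_to_velocity(alpha)  # m/s, up to the angle convention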