Camera-to-World Coordinate Conversion

Some image-processing tasks in engineering require real physical dimensions. This article explains how to obtain real physical dimensions from an image based on the correspondence between the image coordinate system and the real physical coordinate system.

Pinhole camera model

First, review the common 3D geometric transformations (the original figure summarizing them is not reproduced here).

The monocular camera imaging model maps a world point to a pixel through the following steps:

world coordinates → camera coordinates (extrinsics [R|t]) → normalized image plane (perspective division) → pixel coordinates (intrinsics K)
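As a minimal sketch of that chain (with made-up intrinsics and pose, for illustration only):

import numpy as np

# Hypothetical intrinsics: fx = fy = 700, principal point (320, 240)
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
# Hypothetical extrinsics: identity rotation, camera 5 m above the scene
R = np.eye(3)
t = np.array([[0.0], [0.0], [5.0]])

# A world point (in meters), as a column vector
Pw = np.array([[1.0], [0.5], [0.0]])

Pc = R @ Pw + t           # world -> camera coordinates
uv1 = K @ (Pc / Pc[2])    # perspective division, then intrinsics
print(uv1[:2].flatten())  # pixel coordinates (u, v) = (460, 310)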

Solving the transformations

First, state the requirement clearly: given an image, convert it into a top-down (bird's-eye) view.

Because mapping 3D points onto a 2D image plane loses information irreversibly (namely, the information along the depth direction), a single image alone cannot be converted into a top-down view (or any other view). However, if the points all lie on one plane, the image can be rectified into a view perpendicular to that plane, because two images of the same plane are related by a homography.

One approach is to pick four points in the image that must form a rectangle in the top-down view. From the point correspondences you can compute the homography matrix, and then warp the image with it.

Another approach is to recover the camera pose with PnP, choose a rectangle in world coordinates, project it into pixel coordinates using that pose, and then compute the homography from the correspondence between the pixel coordinates and the rectangle's coordinates. The appeal of this approach is that the intrinsics and orientation can be obtained from the camera itself (the world coordinate system can be placed at the camera origin, which removes the translation), so even if the camera moves there is no need to revisit the site to find the rectangle's corresponding points. I have not tried it in practice, though; a rough sketch follows.
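Though the author notes this is untested, a minimal sketch of the idea could look as follows; the object_points, rect_3d and intrinsic values are assumptions for illustration only (the pixel coordinates reuse pts1 from the code further below):

import cv2
import numpy as np

# Assumed intrinsics (same values as the C++ snippet below) and made-up
# 3D-2D correspondences on the target plane (z = 0), in meters / pixels
camera_matrix = np.float32([[700, 0, 320], [0, 700, 240], [0, 0, 1]])
dist_coeffs = np.zeros((1, 4), np.float32)
object_points = np.float32([[0, 0, 0], [2, 0, 0], [2, 1, 0], [0, 1, 0]])
image_points = np.float32([[658, 227], [943, 286], [837, 597], [516, 527]])

# Recover the camera pose relative to the plane
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, dist_coeffs)

# Pick any rectangle on the z = 0 plane and project its corners into the image
rect_3d = np.float32([[0.5, 0.2, 0], [1.5, 0.2, 0], [1.5, 0.8, 0], [0.5, 0.8, 0]])
rect_2d, _ = cv2.projectPoints(rect_3d, rvec, tvec, camera_matrix, dist_coeffs)
rect_2d = np.float32(rect_2d.reshape(-1, 2))

# The homography then maps those pixels onto a rectangle in the top view
top_view = np.float32([[0, 0], [400, 0], [400, 240], [0, 240]])
H = cv2.getPerspectiveTransform(rect_2d, top_view)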

As for converting an image from one viewpoint to another using the intrinsics and extrinsics: first compute the matrix that transforms the current view into the other view, i.e. the matrix mapping points in the current view's camera frame to points in the other view's camera frame. Invert the intrinsic matrix to map pixel coordinates of the current view into its camera frame, multiply by that view-transform matrix to reach the other view's camera frame, then multiply by the intrinsic matrix to land back in pixel coordinates. Reference: https://blog.csdn.net/u013019296/article/details/119259585

// Camera intrinsic matrix
Mat cameraMatrix = (Mat_<double>(3, 3) << 700.0, 0.0, 320.0,
                                          0.0, 700.0, 240.0,
                                          0.0, 0.0, 1.0);
// Extrinsics of the current view
Mat R1 = c1Mo(Range(0, 3), Range(0, 3));
// Extrinsics of the other view
Mat R2 = c2Mo(Range(0, 3), Range(0, 3));
// c1Mo * oMc2
// Matrix taking points from the current view's camera frame to the other view's camera frame;
// for a rotation matrix the transpose equals the inverse, hence R2.t()
Mat R_2to1 = R1 * R2.t(); // relative rotation between the two camera frames
// [compute-homography]
// Multiply the three together to obtain the transform between the two views
Mat H = cameraMatrix * R_2to1 * cameraMatrix.inv(); // homography between the two images of the same plane
H /= H.at<double>(2, 2); // normalize
cout << "H:\n" << H << endl;
  • Informal definition of homography: images of the same planar object taken by a distortion-free camera from different positions are related by a homography, which can be represented as a perspective transform; it yields a matrix that directly describes the mapping between image coordinates.
  • Epipolar constraint: a way to solve for camera motion (rotation + translation) from point correspondences between two views.
  • Triangulation: a method for estimating point depth after applying the epipolar constraint (a sketch combining this with the epipolar constraint appears after this list).
  • PnP (Perspective-n-Point) solves for motion from 3D-to-2D point pairs: given n 3D points and their projected positions, it estimates the camera pose. In stereo or RGB-D visual odometry, PnP can be used directly to estimate camera motion; in monocular visual odometry, initialization must happen first. The 3D-2D approach needs no epipolar constraint and gives good motion estimates from very few matched points, making it the most important pose-estimation method.
  • The PnP problem has many solvers, e.g. P3P (pose from three point pairs), the direct linear transform (DLT), EPnP (Efficient PnP), UPnP, and so on. It can also be solved with nonlinear optimization, building a least-squares problem and iterating, i.e. the all-purpose Bundle Adjustment.
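For the epipolar-constraint and triangulation bullets, here is a minimal sketch under stated assumptions: pts1/pts2 are hypothetical matched pixel coordinates from two views and K is an assumed intrinsic matrix. It illustrates the OpenCV calls and is not code from the original post.

import cv2
import numpy as np

# Assumed intrinsics and hypothetical matched points between two views
K = np.float64([[700, 0, 320], [0, 700, 240], [0, 0, 1]])
pts1 = np.float64([[100, 120], [320, 240], [400, 80], [250, 300],
                   [500, 400], [60, 60], [150, 350], [420, 220]])
pts2 = np.float64([[112, 117], [333, 243], [417, 76], [261, 306],
                   [519, 397], [71, 57], [163, 353], [438, 217]])

# Epipolar constraint: estimate the essential matrix, then decompose it into R, t
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
E = E[:3]  # findEssentialMat may stack several candidate matrices
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K)

# Triangulation: build the two projection matrices and recover depth (up to scale)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
pts3d = (pts4d[:3] / pts4d[3]).T  # homogeneous -> Euclidean
print(pts3d)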

Related code

The pixel coordinates of points in an image can be obtained with the following code:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# Display the image and record four mouse clicks as (x, y) pixel coordinates
im = np.array(Image.open(r"water.jpg"))
plt.imshow(im)
print('Please click 4 points')
x = plt.ginput(4)
print('you clicked:', x)
plt.show()

Then the homography matrix can be computed and used to produce the top view (note: to obtain a true top view, the four mapped points must form a rectangle in the top-down view, because the method simply maps these four points onto specified pixels of that view; this is what makes the approach practical):

import cv2
import numpy as np

img = cv2.imread(r"water.jpg")
h, w, _ = img.shape  # OpenCV images are (rows, cols, channels)

width, height = 250, 350  # desired output size

# pixel coordinates of the four corners of the region of interest
pts1 = np.float32([[658, 227], [943, 286], [516, 527], [837, 597]])
# where those four points should land in the top view
pts2 = np.float32([[0+100, 0+100], [0.2*width+100, 0+100], [0+100, 0.2*height+100], [0.2*width+100, 0.2*height+100]])
matrix_K = cv2.getPerspectiveTransform(pts1, pts2)  # compute the transformation matrix
print(matrix_K)
img_K = cv2.warpPerspective(img, matrix_K, (width, height))  # apply the perspective warp

cv2.imshow("Original Image", img)
cv2.imshow("img K", img_K)

cv2.waitKey(0)

The code above only handles an ideal case: you need to find a rectangle on the water surface and know its actual size, but the water level keeps changing, so finding an ideal rectangle is hard. Could we instead measure the coordinates of four points on some fixed reference plane and still cope with a changing water level? The code below solves this engineering problem: the reference points are re-projected onto the current water surface through the lens position using similar triangles, and the homography is then recomputed for the new water plane.

import numpy as np
import cv2
from shapely.geometry import Polygon, LineString
from shapely.affinity import rotate
import rasterio
import argparse


def get_gcp_mean(dst):
    return np.array(dst).mean(axis=0)


# Build a rectangular bbox from the four user-defined points of interest
def set_bbox_from_corners(corners, resolution, dst):
    """
    Establish a bbox based on a set of camera-perspective corner points and assign the corner
    coordinates to the camera configuration

    Parameters
    ----------
    corners : list of lists (4)
        [column, row] coordinates in camera perspective
    """
    assert (np.array(corners).shape == (4, 2)), \
        f"a list of lists of 4 coordinates must be given, resulting in (4, 2) shape. " \
        f"Current shape is {np.array(corners).shape} "

    # get the homography; this is the only place besides the solvers themselves where get_M is used
    M_gcp = get_M()
    # TODO: make derivation dependent on 3D or 2D point availability
    bbox = get_aoi(M_gcp, corners, resolution=0.05)
    # bbox is offset by gcp_mean (it was subtracted inside get_M), so add it back here
    bbox_xy = np.array(bbox.exterior.coords)
    gcp_mean = np.array(dst).mean(axis=0)
    bbox_xy += gcp_mean[0:2]
    bbox = Polygon(bbox_xy)
    return bbox, bbox_xy[0:4]

# Obtain a rectangular bbox from the points of interest in pixel coordinates and the homography
def get_aoi(M, src_corners, resolution=None):
    """Get rectangular AOI from 4 user-defined points within frames.

    Parameters
    ----------
    M : np.ndarray
        homography matrix from camera perspective to projected coordinates
    src_corners : list of 4 (col, row) tuples
        source corners in camera perspective
    resolution : float, optional
        if set, the bbox bounds are rounded to a multiple of this value

    Returns
    -------
    aoi : shapely.geometry.Polygon
        bounding box of aoi (rotated)
    """
    # prepare a simple temporary np.array of the src_corners
    _src_corners = np.array(src_corners)
    assert (_src_corners.shape == (4, 2)), \
        f"a list of lists of 4 coordinates must be given, resulting in (4, 2) shape. " \
        f"Current shape is {_src_corners.shape} "
    # reproject corner points to the actual space in coordinates
    _dst_corners = cv2.perspectiveTransform(np.float32([_src_corners]), M)[0]
    polygon = Polygon(_dst_corners)
    coords = np.array(polygon.exterior.coords)
    # estimate the angle of the bounding box:
    # retrieve the average line across the AOI
    point1 = (coords[0] + coords[3]) / 2
    point2 = (coords[1] + coords[2]) / 2
    diff = point2 - point1
    angle = np.arctan2(diff[1], diff[0])
    # rotate the polygon over this angle to get a proper bounding box
    polygon_rotate = rotate(
        polygon, -angle, origin=tuple(_dst_corners[0]), use_radians=True
    )
    xmin, ymin, xmax, ymax = polygon_rotate.bounds
    if resolution is not None:
        xmin = round_to_multiple(xmin, resolution)
        xmax = round_to_multiple(xmax, resolution)
        ymin = round_to_multiple(ymin, resolution)
        ymax = round_to_multiple(ymax, resolution)
    # what is returned here is an axis-aligned rectangle
    bbox_coords = [(xmin, ymax), (xmax, ymax), (xmax, ymin), (xmin, ymin), (xmin, ymax)]
    bbox = Polygon(bbox_coords)
    # now rotate back
    bbox = rotate(bbox, angle, origin=tuple(_dst_corners[0]), use_radians=True)
    return bbox


def round_to_multiple(number, multiple):
    return multiple * round(number / multiple)



def get_bbox(bbox, dst, camera=False, h_a=None, redistort=False):
    """
    Parameters
    ----------
    camera : bool, optional
        If set, the bounding box will be returned as row and column coordinates in the camera perspective.
        In this case ``h_a`` may be set to provide the right water level to estimate the bounding box for.
    h_a : float, optional
        If set with ``camera=True``, then the bbox coordinates will be transformed to the camera perspective,
        using h_a as the present water level. In case a video with higher (lower) water levels is used, this
        will result in a different perspective plane than the control video.
    redistort : bool, optional
        If set in combination with ``camera``, the bbox will be redistorted in the camera objective using the
        distortion coefficients and camera matrix. Not used in orthorectification because that occurs by default
        on already undistorted images.

    Returns
    -------
    A bounding box that is perfectly rectangular in the used CRS and aligned in the up/downstream direction.
    It can (and certainly will) be rotated with respect to a typical bbox with xlim, ylim coordinates.
    If the user sets ``camera=True``, the geographical bounding box is converted into the camera perspective,
    using the homography belonging to the available ground control points and the current water level.

    This can then be used to reconstruct the grid for velocimetry calculations.
    """
    if camera:
        coords = np.array(bbox.exterior.coords)
        # convert to perspective row/col coordinates
        M = get_M(reverse=True, h_a=h_a)
        # reduce coords by the control-point mean
        gcp_mean = np.array(dst).mean(axis=0)
        coords -= gcp_mean[0:2]
        # TODO: re-distort if needed
        corners = cv2.perspectiveTransform(np.float32([coords]), M)[0]
        bbox = Polygon(corners)
    return bbox


def get_M(h_a=None, to_bbox_grid=False, reverse=False):
    """Establish a transformation matrix for a certain actual water level `h_a`. This is done by mapping where
    the ground control points, measured at `h_ref`, end up with the new water level `h_a`, given the lens position.

    Note: `dst`, `src`, `bbox`, `resolution`, `gcp_dims`, `camera_matrix` and `dist_coeffs` are module-level
    configuration values set in the `__main__` block below.

    Parameters
    ----------
    h_a : float, optional
        actual water level [m] (Default: None)
    to_bbox_grid : bool, optional
        if set, M will be computed in row, column values of the target bbox, with the set resolution
    reverse : bool, optional
        if True, the reverse matrix is prepared, which can be used to transform projected
        coordinates back to the original camera perspective. (Default: False)

    Returns
    -------
    M : np.ndarray
        3x3 homography matrix
    """
    if to_bbox_grid:
        # look up where the destination points are in row/column space;
        # dst_a holds the destination point locations at the actual water level
        dst_a = transform_to_bbox(
            get_dst_a(h_a),
            bbox,
            resolution
        )
    else:
        # with 2D 4-point GCPs, reproject the reference-plane points onto the water surface;
        # otherwise keep the 3D points
        dst_a = get_dst_a(h_a) if gcp_dims == 2 else dst

    # reduce dst_a by its mean to get a much more accurate projection result when the
    # x and y order of magnitude is very large
    gcp_mean = np.array(dst).mean(axis=0)
    dst_a = np.array(dst_a, dtype=float) - gcp_mean
    if dst_a.shape[-1] == 3:
        # compute the water level in the coordinate system reduced by the mean gcp coordinate
        z_a = get_z_a(h_a) - gcp_mean[-1]
        # treat as a 3D homography
        return get_M_3D(
            src=src,
            dst=dst_a,
            camera_matrix=camera_matrix,
            dist_coeffs=dist_coeffs,
            z=z_a,
            reverse=reverse
        )
    else:
        return get_M_2D(
            src=src,
            dst=dst_a,
            reverse=reverse
        )


# Obtain the homography matrix from four corresponding 2D point pairs
def get_M_2D(src, dst, reverse=False):
    """
    Retrieve the homography matrix between (4) src and (4) dst points with only x, y coordinates (no z)

    Parameters
    ----------
    src : list of lists
        [x, y] with source coordinates, typically cols and rows in image
    dst : list of lists
        [x, y] with target coordinates after reprojection, can e.g. be in crs [m]
    reverse : bool, optional
        If set, the reverse homography to back-project to the camera objective will be retrieved

    Returns
    -------
    M : np.ndarray
        homography matrix (3x3), used in cv2.warpPerspective
    """
    # set points to float32
    _src = np.float32(src)
    _dst = np.float32(dst)
    # define transformation matrix based on GCPs
    if reverse:
        M = cv2.getPerspectiveTransform(_dst, _src)
    else:
        M = cv2.getPerspectiveTransform(_src, _dst)
    return M

# First obtain the camera extrinsics (intrinsics must be known); this yields the mapping from pixels to world
# coordinates (the desired top-view plane is the x-y plane with z = 0), and that mapping is exactly the
# homography we are after
def get_M_3D(src, dst, camera_matrix, dist_coeffs=np.zeros((1, 4)), z=0., reverse=False):
    """
    Retrieve the homography matrix between (6+) 2D src and (6+) 3D dst (x, y, z) points

    Parameters
    ----------
    src : list of lists
        [x, y] with source coordinates, typically cols and rows in image
    dst : list of lists
        [x, y, z] with target coordinates after reprojection, can e.g. be in crs [m]
    camera_matrix : np.ndarray (3x3)
        Camera intrinsic matrix
    dist_coeffs : np.ndarray, optional
        1xN array with distortion coefficients (N = 4, 5 or 8)
    z : float, optional
        Elevation plane (real-world coordinate crs) of the projected image
    reverse : bool, optional
        If set, the reverse homography to back-project to the camera objective will be retrieved

    Returns
    -------
    M : np.ndarray
        homography matrix (3x3), used in cv2.warpPerspective

    Notes
    -----
    See the rectification workflow in OpenCV:
    http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
    Code based on:
    https://www.openearth.nl/flamingo/_modules/flamingo/rectification/rectification.html
    """
    # set points to float32
    _src = np.float32(src)
    _dst = np.float32(dst)
    camera_matrix = np.float32(camera_matrix)
    dist_coeffs = np.float32(dist_coeffs)
    # solve for the extrinsics (pose) from the GCPs
    success, rvec, tvec = cv2.solvePnP(_dst, _src, camera_matrix, dist_coeffs)
    # convert rotation vector to rotation matrix
    R = cv2.Rodrigues(rvec)[0]
    # assume the height of the projection plane: fold z into the third column
    R[:, 2] = R[:, 2] * z
    # add translation vector
    R[:, 2] = R[:, 2] + tvec.flatten()
    # compute homography
    if reverse:
        # from perspective to objective
        M = np.dot(camera_matrix, R)
    else:
        # from objective to perspective
        M = np.linalg.inv(np.dot(camera_matrix, R))
    # normalize homography before returning
    return M / M[-1, -1]


def transform_to_bbox(coords, bbox, resolution):
    """Transforms a set of coordinates defined in the crs of bbox into a set of cv2-compatible pixel coordinates

    Parameters
    ----------
    coords : list of lists
        [x, y] with coordinates
    bbox : shapely Polygon
        Bounding box. The coordinate order is very important and has to be upstream-left, downstream-left,
        downstream-right, upstream-right, upstream-left
    resolution : float
        resolution of target pixels within bbox

    Returns
    -------
    colrows : list
        tuples of columns and rows
    """
    # first assemble x and y coordinates
    xs, ys = zip(*coords)
    transform = _get_transform(bbox, resolution)
    rows, cols = rasterio.transform.rowcol(transform, xs, ys, op=float)
    return list(zip(cols, rows))



def _get_transform(bbox, resolution=0.01):
    """Return a rotated Affine transformation that fits the bounding box and resolution.

    Parameters
    ----------
    bbox : shapely.geometry.Polygon
        polygon of bounding box. The coordinate order is very important and has to be:
        (upstream-left, downstream-left, downstream-right, upstream-right, upstream-left)
    resolution : float, optional
        resolution of target grid in meters (default: 0.01)

    Returns
    -------
    affine : rasterio.transform.Affine
    """
    corners = np.array(bbox.exterior.coords)
    # estimate the angle of the bounding box
    top_left_x, top_left_y = corners[0]
    # retrieve the average line across the AOI
    point1 = corners[0]
    point2 = corners[1]
    diff = point2 - point1
    # compute the angle of the projected bbox area of interest
    angle = np.arctan2(diff[1], diff[0])
    # compute the x and y difference per column
    dx_col, dy_col = np.cos(angle) * resolution, np.sin(angle) * resolution
    # compute the x and y offsets per row
    dx_row, dy_row = (
        np.cos(angle + 1.5 * np.pi) * resolution,
        np.sin(angle + 1.5 * np.pi) * resolution,
    )
    return rasterio.transform.Affine(
        dx_col, dy_col, top_left_x, dx_row, dy_row, top_left_y
    )

# Map the four reference-plane points onto the water surface
def get_dst_a(h_a=0.0, h_ref=0.0, lens_position=None):
    """
    h_a : float, optional
        actual water level measured [m]; if not set, the assumption is that a single video
        is processed and changes in water level are not relevant. (default: None)

    Returns
    -------
    Actual locations of the control points (in case these are only x, y) given the currently set water level
    and the camera location
    """
    # map where the destination points are with the actual water level h_a
    if h_a is None or h_a == h_ref:
        # reference and actual water level coincide: the points are unchanged
        dst_a = dst
    else:
        dst_a = _get_gcps_a(
            lens_position,
            h_a,
            dst,
            z_0,
            h_ref,
        )
    return dst_a

# Compute the real positions of the four points on the water surface using similar triangles
def _get_gcps_a(lensPosition, h_a, coords, z_0=0.0, h_ref=0.0):
    """Get the actual x, y locations of ground control points at the actual water level

    Parameters
    ----------
    lensPosition : list of floats
        x, y, z location of cam in local crs [m]
    h_a : float
        actual water level in local level measuring system [m]
    coords : list of lists
        gcp coordinates [x, y] at the original water level
    z_0 : float, optional
        reference zero plane level, i.e. the crs elevation in meters of the zero level of the staff gauge
        (default: 0.0)
    h_ref : float, optional
        reference water level during taking of gcp coords with ref to staff gauge zero level (default: 0.0)

    Returns
    -------
    coords : list
        rows/cols for use in getPerspectiveTransform
    """
    # get modified gcps based on camera location and elevation values
    cam_x, cam_y, cam_z = lensPosition
    x, y = zip(*coords)
    # compute z at the time the gcp coordinates were taken
    z_ref = h_ref + z_0
    # compute z for the current frame
    z_a = z_0 + h_a
    # compute the water table to camera height difference during field referencing
    cam_height_ref = cam_z - z_ref
    # compute the actual water table to camera height difference
    cam_height_a = cam_z - z_a
    rel_diff = cam_height_a / cam_height_ref
    # apply the difference to all coordinates, in both x and y directions
    _dest_x = list(cam_x + (np.array(x) - cam_x) * rel_diff)
    _dest_y = list(cam_y + (np.array(y) - cam_y) * rel_diff)
    dest_out = list(zip(_dest_x, _dest_y))
    return dest_out


def get_z_a(h_a=None):
    """
    h_a : float, optional
        actual water level measured [m]; if not set, the assumption is that a single video
        is processed and changes in water level are not relevant. (default: None)

    Returns
    -------
    Elevation of the water surface given the currently set water level
    """
    if h_a is None:
        return z_0
    else:
        return z_0 + (h_a - h_ref)


def shape():
    """
    Returns rows and columns in projected frames from ``Frames.project``

    Returns
    -------
    rows : int
        Number of rows in the projected frame
    cols : int
        Number of columns in the projected frame
    """
    cols, rows = _get_shape(
        bbox,
        resolution=resolution,
        round=1
    )
    return rows, cols


def _get_shape(bbox, resolution=0.01, round=1):
    """
    Defines the number of rows and columns needed in a target raster to fit a given bounding box.

    :param bbox: shapely Polygon, bounding box
    :param resolution: resolution of target raster
    :param round: number of pixels to round the intended shape to
    :return: number of columns and rows for the target raster
    """
    coords = bbox.exterior.coords
    box_length = LineString(coords[0:2]).length
    box_width = LineString(coords[1:3]).length
    cols = int(np.ceil((box_length / resolution) / round)) * round
    rows = int(np.ceil((box_width / resolution) / round)) * round
    return cols, rows

if __name__ == '__main__':

    # 4 corresponding pixel coordinates and real-world coordinates are needed
    src = [
        [1421, 1001],
        [1251, 460],
        [421, 432],
        [470, 607]
    ]
    dst = [
        [642735.8076, 8304292.1190],  # lowest right coordinate
        [642737.5823, 8304295.593],   # highest right coordinate
        [642732.7864, 8304298.4250],  # highest left coordinate
        [642732.6705, 8304296.8580]   # lowest left coordinate
    ]
    corners = [
        [292, 817],
        [50, 166],
        [1200, 236],
        [1600, 834]
    ]
    # real-world distance per pixel in the top view, in meters
    resolution = 0.01
    # water level at the reference points
    h_ref = 0.0
    # current water level
    h_a = 0.0
    # camera position in world coordinates
    lens_position = None
    # reference elevation (datum)
    z_0 = 1182.2
    gcp_dims = 2
    camera_matrix = None
    dist_coeffs = None

    flags = cv2.INTER_AREA
    # derive a suitable rectangle in world coordinates from corners
    bbox, bbox_xy = set_bbox_from_corners(corners, resolution, dst)
    # map this rectangular region back into pixel coordinates
    src_v2 = get_bbox(bbox, dst, camera=True, h_a=0.0).exterior.coords[0:4]
    # choose the dst coordinates to map to; the simplest choice is [[0,0],[width,0],[0,height],[width,height]].
    # Here the author deliberately picks coordinates so that one pixel corresponds to 0.01 m.
    # Note: the view conversion works precisely because the bbox is a true rectangle in world coordinates.
    dst_xy = bbox_xy
    dst_v2 = transform_to_bbox(dst_xy, bbox, resolution)
    # obtain the transformation matrix
    M = get_M_2D(src_v2, dst_v2)

    # apply the transformation matrix to obtain the top view
    # img = cv2.imread('ng_test.png')
    video_path = r'ngwerere_20191103.mp4'
    cap = cv2.VideoCapture(video_path)
    frame_list = []
    while True:
        ret, frame = cap.read()
        if ret:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            frame_list.append(frame)
        else:
            break
    rows, cols = shape()
    img1 = frame_list[0]
    img2 = frame_list[1]
    # cv2.warpPerspective expects dsize as (width, height) = (cols, rows)
    final_frame1 = cv2.warpPerspective(img1, M, (cols, rows), flags=flags)
    final_frame2 = cv2.warpPerspective(img2, M, (cols, rows), flags=flags)
    cv2.imwrite("transfer_ng1v2_1.png", final_frame1)
    cv2.imwrite("transfer_ng2v2_2.png", final_frame2)

Once the top view is obtained, quantities such as area and velocity can be computed.
There are two families of methods for estimating velocity from water ripples. One is Space-Time Image Velocimetry (STIV): build a space-time texture image from the ripples, then derive the velocity from the texture angle.
The other is the Particle Image Velocimetry (PIV) family of image-based flow measurement methods, which likewise find distinctive descriptors on the water surface and track them.
The code below is a partial STIV implementation.

import datetime
import cv2
import numpy as np
import math
import time
import os
import m3u8
import requests
import eventlet


def get_alpha_confidence(img, window=3):
    # crop to an even-sized square so it divides evenly into window x window blocks
    length = min(img.shape[0], img.shape[1])
    length = length if length % 2 == 0 else length - 1
    img = img[0:length, 0:length]

    # edge response in the horizontal direction
    scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
    # edge response in the vertical direction
    scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)

    step = int(length / window)

    alpha_list = []
    c_list = []

    for i in range(window):
        for j in range(window):
            Gxx = np.sum(scharrx[i*step: (i+1)*step, j*step: (j+1)*step] * scharrx[i*step: (i+1)*step, j*step: (j+1)*step])
            Gxy = np.sum(scharrx[i*step: (i+1)*step, j*step: (j+1)*step] * scharry[i*step: (i+1)*step, j*step: (j+1)*step])
            Gyy = np.sum(scharry[i*step: (i+1)*step, j*step: (j+1)*step] * scharry[i*step: (i+1)*step, j*step: (j+1)*step])
            # texture angle of this block from the gradient tensor
            alpha = (math.atan((2 * Gxy) / (Gyy - Gxx)) / 2) * 180 / math.pi
            if alpha < 0:
                alpha = 90 + alpha

            # coherence, used as a confidence weight for the angle
            c = math.sqrt((Gxx - Gyy) ** 2 + 4 * (Gxy ** 2)) / (Gxx + Gyy)
            alpha_list.append(alpha)
            c_list.append(c)

    # confidence-weighted average of the per-block angles
    f_alpha = 0
    c_sum = 0
    for i in range(len(alpha_list)):
        f_alpha = f_alpha + alpha_list[i] * c_list[i]
        c_sum = c_sum + c_list[i]
    print(alpha_list)
    print(c_list)
    f_alpha = f_alpha / c_sum

    return f_alpha


def timeout(_time):
    with eventlet.Timeout(_time, False):  # set the timeout
        time.sleep(2)
        return 'str'
    return 'sorry, the function call timed out'


def get_xt_img(url, max_count=15, frames=200, width_start=267, width_end=350, height_start=334):
    size = 0
    time_cur = str(time.strftime('%Y-%m-%d-%H-%M-%S'))
    check = []

    img_path = os.path.join("path-to-save")
    if not os.path.exists(img_path):
        os.makedirs(img_path)

    # video_path = img_path + "\\" + 'stiv_post_ljx_online.mp4'
    video_path = "path-to-video"
    if os.path.exists(video_path):
        # remove any previous download
        try:
            os.remove(video_path)
        except Exception as r:
            print("delete online.mp4 wrong {}".format(r))

    # download up to max_count batches of new m3u8 segments and append them to the video file
    i = 0
    while i < max_count:
        try:
            print("check: {}".format(check))
            playlist = m3u8.load(uri=url, headers=headers, timeout=3)
            for seg in playlist.segments:
                if seg.absolute_uri[seg.absolute_uri.find("?") - 5:seg.absolute_uri.find("?")] not in check:
                    check.append(seg.absolute_uri[seg.absolute_uri.find("?") - 5:seg.absolute_uri.find("?")])
                else:
                    continue
                with open(video_path, "ab") as f:
                    r = requests.get(seg.absolute_uri, headers=headers, timeout=5)
                    data = r.content
                    size += len(data)
                    if data:
                        f.write(data)

            i = i + 1
        except:
            i = i + 1

        timeout(3)

    if os.path.exists(video_path):
        cap = cv2.VideoCapture(video_path)
        c = 0
        frame_list = []
        while True:
            ret, frame = cap.read()

            # rotate the frame so the measurement line is horizontal
            alpha_rotate = math.atan(81 / 171) * 180 / math.pi
            if frame is not None:
                frame = rotate(frame, -alpha_rotate)
                if c == 0:
                    cv2.imwrite("xuanzh.png", frame)
            if ret:
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                frame_list.append(frame)
                c += 1
            else:
                print("all frames have been saved")
                break

        # frame_sample = frame_list[::5]
        # build the x-t (space-time) image: one pixel row per frame,
        # sampled along a fixed line on the water surface
        xt_img = np.zeros((frames, width_end - width_start), np.uint8)
        for k in range(frames):
            cur_frame = cv2.cvtColor(frame_list[k], cv2.COLOR_BGR2GRAY)
            xt_img[k, :] = np.flip(cur_frame[height_start, width_start:width_end])

    return xt_img, time_cur


def rotate(image, angle, center=None, scale=1.0):
    # grab the dimensions of the image
    (h, w) = image.shape[:2]

    # if the center is None, initialize it as the center of the image
    if center is None:
        center = (w // 2, h // 2)

    # perform the rotation
    M = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(image, M, (w, h))

    # return the rotated image
    return rotated


if __name__ == '__main__':

    url = r"url-to-get-video"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36 Edg/86.0.622.56",
        "Connection": "close"
    }

    image, time_cur = get_xt_img(url)
    cv2.imwrite('lyx_test.png', image)
    alpha = get_alpha_confidence(image)
    print(alpha)
    cv2.imshow("xt image", image)

    cv2.waitKey(0)
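The texture angle alone is not yet a velocity. As a hedged follow-up (a sketch, not part of the original code): if each column of the x-t image spans a known real-world distance (e.g. the 0.01 m/pixel resolution of the orthorectified view above) and each row spans one frame interval, the angle converts to a surface speed via its tangent. The exact form depends on the angle convention used in get_alpha_confidence, so treat the axes and signs here as assumptions.

import math

def angle_to_velocity(alpha_deg, pixel_size=0.01, fps=25.0):
    # slope of the ripple traces in the x-t image: (space pixels) per (time pixel);
    # scale by meters-per-pixel and frames-per-second to obtain m/s
    return math.tan(math.radians(alpha_deg)) * pixel_size * fps

# e.g. with the angle computed above:
# velocity = angle_to_velocity(alpha)  # m/s, up to the angle convention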