fast clicks anywhere on that fish in those two images, a significant reduction in image processing and analysis time is expected. By reducing analysis times, more images can be processed, thereby increasing the amount of data available for environmental reporting and decision making.
minimal, with researchers also requiring ways to measure and track fish (Bradley et al., 2019; Lopez-Marcano et al., 2021).

Assessing the health of fish populations depends on determining the average length of fish in sample population subsets and inferring health in conjunction with other key ecosystem markers. Methods applying the length-based measurement of fish for assessing the health of fisheries have been around for decades (Pauly and Morgan, 1987), with few technological advancements until recently. Manual measurement remains the principal tool in collecting essential management information on board fishing vessels. However, this method is documented as highly time consuming and involves considerable, and potentially harmful, handling of fish to gain accurate measurements (Upton and Riley, 2013). Traditionally, evaluating stock levels has relied on manually measuring fish length, as it is frequently the only possibility where monitoring is limited and collecting length measurements is easier than quantifying a total catch (Rudd and Thorson, 2018). However, this method does not consider the fluctuations in fish recruitment and death rates over time, which is crucial for comprehending the indirect impacts of fishing on predator–prey dynamics and for identifying the factors that influence the structure of fish communities on a larger scale (Jennings and Polunin, 1997). Average length is also considered an operational indicator of fishing impact; whereas indices on the composition of species assemblages are difficult to interpret, average length is well understood and reference points can be set (Rochet and Trenkel, 2003). As well as causing impacts on targeted species, commercial fishing affects bycatch, including by-product and discarded/released species, and sometimes habitats, when fishing gear (e.g. demersal trawling) interacts with the sea floor or benthic zones (Little and Hill, 2021). An increasing range of mechanisms and technical tools is being used to reduce interactions with seabirds, marine mammals, reptiles and other vulnerable species. Such bycatch-reduction measures include tori lines, sprayers, and seal and turtle excluder devices (Cresswell et al., 2022). In Australia, as around the world, guidelines and rules on fish measurement methodology and length quotas are enacted and overseen by governments¹.

¹ https://www.daf.qld.gov.au/business-priorities/fisheries/recreational/recreational-fishing-rules/measuring

1.2 The move toward automation

Monitoring devices and advances in data processing and analysis techniques can, and should, form part of an effective monitoring approach. However, data or capacity limitation is widespread in global fisheries, resulting in ineffective or non-existent management due to this lack of data and/or an inability to generate statistical estimates of stock status. Significant improvements in management outcomes, leading to conservation and livelihood benefits, could be achieved through cost-effective analytical approaches; these exist, but are hampered by a range of challenges, including data availability and requirements; resources for processing and analysis; and a lack of understanding of costs and advantages (Dowling et al., 2016; Cresswell et al., 2022). Deep learning (DL) can address these challenges by replacing the manual, labour-intensive task of precisely locating the heads and tails of fish with computer-vision-based algorithms (e.g. Marrable et al., 2022). White et al. (2006) were the first to test this method with computer vision on a fishing vessel. Measurement using digital imagery is a growing field and has been successfully implemented with both single images (e.g. Lezama-Cervantes et al., 2017; Monkman et al., 2019; Andrialovanirina et al., 2020; Wibisono et al., 2022) and stereo images (e.g. Johansson et al., 2008; Shafait et al., 2017; Suo et al., 2020; Connolly et al., 2021; Lopez-Marcano et al., 2021; Marrable et al., 2022). Datasets now also exist to explicitly support the development of DL algorithms; for instance, segmentation, classification and size estimation (e.g. DeepFish, Garcia-d’Urso et al., 2022).

Automated fish detection has been demonstrated using a range of computer vision methods of measurement targeting single species for aquaculture (Atienza-Vanacloig et al., 2016; Shi et al., 2020; Yang et al., 2021). Some invasive methods of measurement involve channelling fish past stationary cameras (Miranda and Romero, 2017; Shafait et al., 2017), or methods which use active sources of light, such as sonar (Uranga et al., 2017), which are potentially stressful to the fish. Furthermore, removing fish from the water (White et al., 2006) or measurement on board trawlers (Monkman et al., 2019) adds to fish mortality. These challenges highlight the importance of developing automated methods for non-invasive means of measurement, such as BRUVS.

Although there have been advances in using DL for image analysis, video imagery presents additional complexities and requirements, particularly with regard to curated and structured data (e.g. Marrable et al., 2022).

Recent reviews of machine learning in aquaculture found that there is a need for DL and neural networks to optimise current approaches but have also identified certain pitfalls in the process, including noise, occlusions and dynamic viewing spaces (Yang et al., 2021; Zhao et al., 2021).

Stereo baited remote underwater video systems (stereo-BRUVS) are widely, and increasingly, used as a non-invasive, stressless method for counting and measuring fish in aquaculture, fisheries and conservation management (Harvey and Shortis, 1995; Harvey et al., 2021). Recently, Marrable et al. (2022) demonstrated the application of DL to stereo-BRUVS imagery for the semi-automation of fish identification and early success with species identification. Extending the application of DL to automate fish length measurement would greatly enhance and advance marine environment monitoring, speeding up data collation on localised fish populations and increasing the amount of data that can be processed and used for environmental reporting and decision making. The current limitation of BRUVS is that the data processing is a highly time-consuming manual exercise, prone to human error and costly, delaying the production of length data and limiting how much BRUVS imagery can be processed (Connolly et al., 2021; Marrable et al., 2022). However, as with species identification, mean length data are highly valuable for determining frequency distributions of fish populations and the spatial and temporal changes required for environmental assessment and reporting. In addition to cost and processing time, BRUVS is limited by the MaxN ecological abundance metric (Whitmarsh et al., 2017), creating an opportunity for a much larger use of the data held within a video, such as including fishery-independent assessments of fishing pressure. Recent use of open-source image processing software to measure fisheries catch has also been successful for a wide range of fish sizes (Andrialovanirina et al., 2020).

(RMS) value >20 mm. The RMS value is calculated by the photogrammetry library in EventMeasure and is an indicator of how close two corresponding points in each image are to the epipolar line calculated by the opposite point. An RMS value greater than 20 mm is considered by SeaGIS (outlined in the EventMeasure software manual) as an imprecise measurement or error in calibration and, therefore, was discarded in this study. This reduced the number of images for training to 15558 stereo pairs of cropped fish images.
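The RMS cut-off described above is straightforward to apply in practice. The following is a minimal sketch, not the published pipeline, of discarding stereo annotations whose EventMeasure RMS value exceeds 20 mm; the record structure and field names are assumptions made for illustration.

```python
# Minimal sketch (not the authors' pipeline): drop stereo annotations whose
# EventMeasure RMS epipolar error exceeds 20 mm before building the training set.
# The record structure and field names are assumed for illustration.

RMS_THRESHOLD_MM = 20.0

def filter_by_rms(annotations):
    """Keep only stereo fish annotations with RMS <= 20 mm."""
    kept = [a for a in annotations if a["rms_mm"] <= RMS_THRESHOLD_MM]
    print(f"Kept {len(kept)} of {len(annotations)} stereo pairs after RMS filtering")
    return kept

# Example usage with hypothetical records exported from EventMeasure:
if __name__ == "__main__":
    sample = [
        {"op_code": "BRUVS_001", "length_mm": 312.4, "rms_mm": 4.2},
        {"op_code": "BRUVS_001", "length_mm": 150.9, "rms_mm": 27.8},  # discarded
    ]
    training_pairs = filter_by_rms(sample)
```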
FIGURE 1
Illustrates the workflow for data preparation, model training and model evaluation.
ability to handle various sizes, numbers of classes, and computational requirements. The variant used in this study was the ‘YOLOv5 small’ model. To adapt the model for head and tail detection, transfer learning was employed, which built on knowledge gained from the pre-trained model while reducing the amount of training data and time needed. A subset of the in-sample dataset was used to retrain the model according to the standard procedure outlined on the YOLOv5 website³.

³ https://github.com/ultralytics/yolov5 (accessed Nov 22, 2022).

The YOLOv5 model needs to be trained by defining the extent of each object of interest (heads and tails in this case) with a bounding box. Therefore, the head and tail points in the training data were converted to bounding boxes by defining a box of 25 × 25 pixels around the head and tail points, respectively. Finally, the in-sample training and validation fish crop images with head and tail labels were used to train the YOLOv5 small model. The early-stopping method was also implemented in this study to avoid overfitting the model.
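To illustrate the label conversion described above, the sketch below shows one way a head or tail point could be turned into a 25 × 25 pixel box written in the normalised centre/width/height format used by YOLOv5. The class numbering and helper function are assumptions for illustration, not the authors’ code.

```python
# Sketch: convert point labels (head/tail) into 25x25-pixel bounding boxes in
# YOLO label format (class cx cy w h, all normalised to the image size).
BOX_SIZE_PX = 25
CLASS_IDS = {"head": 0, "tail": 1}   # assumed class numbering

def point_to_yolo_label(x_px, y_px, label, img_w, img_h, box_px=BOX_SIZE_PX):
    """Return a YOLO-format label line for a single head/tail point annotation."""
    cx = min(max(x_px / img_w, 0.0), 1.0)   # normalised box centre
    cy = min(max(y_px / img_h, 0.0), 1.0)
    w = box_px / img_w                      # normalised box width and height
    h = box_px / img_h
    return f"{CLASS_IDS[label]} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# Example: a head point at pixel (410, 223) in a 1920 x 1080 cropped frame
print(point_to_yolo_label(410, 223, "head", 1920, 1080))
```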
FIGURE 2
Presents four out-of-sample examples of automated fish length measurements using the method described in this study. The examples present fish of different sizes, habitats and distances from the camera.
2.4 Model prediction

The head and tail predictions from the object-detection model were converted to overall fish length by first taking the bounding box predictions from the trained DL model and converting them to points by using the centre location of the box in stereo image pairs. On occasions when the DL model failed to find one or two of either a head or a tail in both images, the location of the missing feature was estimated by taking the reflection of one of the classifier feature locations in the bounding box of the fish. On occasions when the model returned more than one candidate for a head or tail, the one with the highest confidence score was chosen. On occasions when predicted head and tail points were inconsistent between the left and right cropped fish images (for example, if head or tail points were swapped), the predicted result was discarded as an incorrect measurement. Once the four required points were returned by the model, the camera calibration files were used along with EventMeasure’s photogrammetry library to calculate the length of the fish.

Precision is the proportion of true positive (TP) predictions out of all positive predictions. False negatives (FN) represent the number of predictions the model missed and false positives (FP) are incorrectly predicted results. The F1 score is calculated by taking the harmonic mean of recall and precision. The recall, precision and F1 score for fish head and tail detection are presented in Table 1.

Recall = TP / (TP + FN)    (1)

Precision = TP / (TP + FP)    (2)

F1 = 2 × (Recall × Precision) / (Recall + Precision)    (3)
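Equations (1)–(3) can be restated directly in code from the detection counts. The short example below is a worked illustration only; the function names and counts are ours, not part of the published pipeline.

```python
# Worked restatement of equations (1)-(3) from true positive, false positive
# and false negative counts; illustrative only.
def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def f1_score(r, p):
    return 2 * r * p / (r + p) if (r + p) else 0.0

# Example with hypothetical counts of head detections
tp, fp, fn = 90, 5, 10
r, p = recall(tp, fn), precision(tp, fp)
print(f"recall={r:.2f}, precision={p:.2f}, F1={f1_score(r, p):.2f}")
```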
2.5.2 Human–machine comparison

The Pearson correlation coefficient used for the human–machine comparison was calculated by:

r = Σ(xi − x̄)(yi − ȳ) / √[ Σ(xi − x̄)² × Σ(yi − ȳ)² ]    (4)

Where:
xi are the individual DL inference length results
x̄ is the average DL length
yi are the individual human annotated length results
ȳ is the average human annotated length
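Equation (4) can be reproduced with standard numerical tools. The example below computes r for a small set of made-up paired DL and human length values; only the formula itself comes from the text above.

```python
# Illustrative computation of equation (4): Pearson correlation between
# DL-inferred and human-annotated lengths (the values below are made up).
import numpy as np

dl_lengths_mm = np.array([212.0, 305.5, 150.2, 420.8])      # x_i
human_lengths_mm = np.array([215.3, 301.0, 148.9, 418.5])   # y_i

x_bar, y_bar = dl_lengths_mm.mean(), human_lengths_mm.mean()
num = np.sum((dl_lengths_mm - x_bar) * (human_lengths_mm - y_bar))
den = np.sqrt(np.sum((dl_lengths_mm - x_bar) ** 2) * np.sum((human_lengths_mm - y_bar) ** 2))
r = num / den
print(f"Pearson r = {r:.3f}")  # equivalently: np.corrcoef(dl_lengths_mm, human_lengths_mm)[0, 1]
```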
3 Results

The following section presents the results of the human–machine comparison by comparing the machine learning and photogrammetry-derived length measurements with the ecologists’ manual measurements (Figure 3) and the density of the length measurements (Figure 4).

TABLE 1 Deep learning precision (P), recall (R) and F1 for classification.

FIGURE 4
Histogram of the human versus DL comparison demonstrating the density of the number of length measurements. A higher density of points indicates that the measurements aggregate to close agreement.

4 Discussion

The semi-automated method presented in this paper demonstrates the potential to rapidly increase analysis speed and decrease reporting time for assessing fish biomass. Challenges remain for a completely autonomous solution, some of which are discussed below.

4.1 Semi-automation of length measurement

The challenge of applying this model in real-world scenarios is that the model cannot currently match the fish in the corresponding left and right images. This was not a problem when building and testing the model, as the data were already analysed by experienced ecologists who had matched the stereo image pairs. To address this challenge, the DL model was adapted to communicate with EventMeasure, wherein the DL model requires an ecologist to click anywhere on the body of the fish in both images. Inference on the length is conducted after the ecologist has solved the image correspondence problem by identifying the same fish in each of the left and right images. The fish is then precisely cropped from the stereo-BRUVS image using the DL method described in Marrable et al. (2022), which places a bounding box over the fish; the crop is then parsed by the head and tail DL model. Without isolating the fish first, the model returns all of the heads and tails of all the fish it finds, with no correspondence data to match them. The head and tail locations are returned to EventMeasure, which automatically calculates the length of the fish using its photogrammetry library. This reduces the number of mouse clicks on the screen from four precise clicks (i.e. left head, left tail, right head, right tail) to two. Additionally, placing clicks anywhere on the body is significantly faster and requires much less precision. This semi-automated method of length measurement has the potential to significantly increase analysis speed.

Furthermore, by requiring ecologists to choose the corresponding fish individuals, users can draw on their contextual knowledge to wait for a moment when a fish is in the best pose for measurement and not occluded by other fish, seagrass, the BRUVS bait bag or other objects. This reduces false positive detection. Context is something that is not currently possible by using computer vision alone.
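As a summary of the semi-automated workflow described in Sections 2.4 and 4.1, the sketch below outlines the per-fish steps: selecting the highest-confidence head and tail candidates within the crop and estimating a missing feature by reflection. The detection dictionaries are assumed, and the final photogrammetry call to EventMeasure is only referenced in a comment; none of this is the authors’ actual code or API.

```python
# Hypothetical sketch of the per-fish correspondence and measurement step.
def best_point(detections, cls):
    """Return the highest-confidence detection centre for a class, or None."""
    cands = [d for d in detections if d["cls"] == cls]
    return max(cands, key=lambda d: d["conf"])["xy"] if cands else None

def reflect(point, box):
    """Reflect a point through the centre of the fish bounding box (x0, y0, x1, y1)."""
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    return (2 * cx - point[0], 2 * cy - point[1])

def head_and_tail(detections, box):
    """Pick head/tail points for one cropped fish; estimate a missing feature by reflection."""
    head, tail = best_point(detections, "head"), best_point(detections, "tail")
    if head is None and tail is not None:
        head = reflect(tail, box)
    if tail is None and head is not None:
        tail = reflect(head, box)
    return head, tail

# Made-up detections for the same fish clicked in the left and right images
left_dets = [{"cls": "head", "conf": 0.91, "xy": (120, 80)},
             {"cls": "tail", "conf": 0.88, "xy": (300, 95)}]
right_dets = [{"cls": "head", "conf": 0.93, "xy": (118, 82)}]   # tail not detected
fish_box = (100, 60, 320, 120)
left_pts = head_and_tail(left_dets, fish_box)
right_pts = head_and_tail(right_dets, fish_box)
# The four points (left/right head and tail) would then be passed, together with the
# camera calibration, to EventMeasure's photogrammetry library to return the length.
print(left_pts, right_pts)
```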
The DL model cannot correspond the head and tail of a given fish and, therefore, the largest source of error is incorrect correspondence; that is, when a head and tail pair are matched to two different fish. This is because the model searches within the bounding box for features that look like heads and tails and returns the match with the highest confidence. This works well when there is only one matching pair; however, there are occasions when there are heads and tails belonging to many fish. The model has no knowledge of correspondence and so matches them based on the highest confidence level, and sometimes pairs them incorrectly. An example of this is seen in Figure 5. This results in either the incorrect length being calculated from the photogrammetry, or the RMS value returned from EventMeasure being >20 mm, so no length is reported.

Figure 5A shows an example where two fish tails fall within the bounding box and the model identifies the wrong tail. This false positive is seen most commonly where fish are schooling and swimming between 30° and 45° to the plane of the camera. Angles within this span produce a large bounding box with more likelihood that tails from other fish will be captured. One way to reduce this effect is to automate a rotation of the bounding box (Figure 5B), or of the image, in sympathy with the orientation of the fish to reduce the empty space in the bounding box. Automating this process remains a challenge, as even establishing that a false positive detection has occurred would require logic and processing beyond the capability of the current model. There are published detection models that use rotated labels (Li et al., 2018) for ship identification in satellite images; but, as yet, YOLOv5 does not have the ability to train using rotated bounding boxes. Addressing these false positive cases remains the subject of ongoing research.

FIGURE 5
Example of a false positive detection of a tail leading to an incorrect length measurement: (A) two fish tails fall within the bounding box and the model identifies the wrong tail; (B) the yellow box demonstrates that rotating the object-detection bounding box would eliminate the second tail from the area and correct the false tail label.

4.3 Stereo calibration

Harvey and Shortis (1998) highlight the importance of precise measurement systems for accurate length. This was also the objective of this approach by using the OzFish dataset for model training and validation. The OzFish data were calibrated using the calibration cube method (Shortis, 2015), which is more accurate and precise than using 2D calibration patterns, as reported by Boutros et al. (2015) in their comparison study.

Previous published models capable of automating the length measurement of fish have either used a single camera out of water (Monkman et al., 2019); been limited to a single species (White et al., 2006); or used less accurate stereophotogrammetry calibration methods (Tonachella et al., 2022). The model presented in this study was trained and tested on 319 unique species of fish, making it much more generalisable than any other previously published model. The data used to train this model were restricted to the species in the OzFish dataset, which includes those mostly found along the coast of Western Australia. However, the species richness and diversity show evidence that the model generalises across different species with varying colour, texture and morphometrics. An effort to separate in-sample and out-of-sample data was made to give some indication of model generalisability by training and testing on data collected at different dates and locations. How well the model works with species outside the OzFish data will be the focus of future work. For applications in marine environments with species not included in the OzFish data, the method described in this study should be repeated with a new training corpus that includes the species that users wish to measure.

4.5 Challenges with data quality

One reason for choosing the OzFish dataset for DL training was because the data were annotated by expert fish ecologists. However, when auditing the DL data there were still errors in the labelling. Some errors included head and tail points that seemed to be systematically shifted a few pixels away from the head and tail of the fish, which may have been caused by incorrect synchronisation of the stereo-BRUVS. There were also some instances where labels were randomly out of place, such as labels placed on a rock.

One issue that continues to be a challenge for computer-vision-based DL is that it is so far incapable of using context in the way fish ecologists do to help them label fish. For example, in the OzFish dataset, where a fish was partially occluded by an object, labels were placed where heads or tails would logically be expected, estimated by ecologists from experience and numerous previous observations of similar fish. When such an example is viewed by a computer-vision algorithm which, unlike an ecologist, cannot extrapolate from the context, the algorithm may see a label on a rock and interpret that
rock as a fish head or tail. In such cases, those data must be removed as they would incorrectly train the DL model to detect some rocks as fish heads and tails. Additionally, there were many instances of seemingly very small fish labelled with heads and tails which were very hard to distinguish between in static images. However, upon viewing the moving video, swimming behaviours clearly indicated the direction fish were swimming in, which made head and tail identification easy to the human eye. Although there are published DL tracking algorithms (Bertinetto et al., 2016; Hu et al., 2022), YOLO-based methods only consider static images for training or inference. Combining tracking with head and tail detection will be the focus of future work so that numerous length measurements of the same fish can be made to calculate the average size, a method that is shown to be more statistically robust and less prone to measurement error (Harvey et al., 2001b). Validation experiments of measurements from stereo-BRUVS (Harvey and Shortis, 1995; Harvey et al., 2003; Harvey et al., 2010) have been conducted using three or more repeat measurements of fish. However, this is seldom done when conducting field surveys due to the extra labour required.

4.6 Combining optical and acoustic sampling methods

In recent years, size-spectrum models derived from acoustic surveys have emerged as essential tools for fish stock assessment and ecosystem-based fisheries management. Acoustic surveys possess the advantage of rapidly and efficiently covering vast spatial scales. However, stationary video platforms, such as stereo-BRUVS, are constrained by a limited field of view and can only monitor a small area around the camera. Acoustic surveys also face challenges, including difficulties in discriminating between fish species and detecting fish close to the seabed or within dense schools.

Size and shape information of fish targets is extracted from echo data by adjusting model parameters, such as growth rates, mortality rates, and species-specific traits, to match observed data (Edwards et al., 2017; Froese et al., 2019). Calibration and validation of these models often necessitate biological samples, which are invasive due to the physical capture and potential harm to fish during the process. Assessing fishery resources in reef ecosystems, where obtaining biological samples is sometimes prohibited, remains challenging. To address these limitations, optic-acoustic methods combine video footage and acoustic measurements (Ryan and Kloser, 2016; Demer et al., 2020). Underwater cameras or video systems, either mounted on a research vessel, towed platform, or remotely operated vehicle (ROV), capture images or footage of fish, providing high-resolution information on size, shape, colour, and behaviour, which aids in species identification and refining size distribution estimates without the need for biological samples.

The automated length measurement of fish in stereo-videos using the method described in this study could be integrated with the optic-acoustic approach to capitalise on the strengths of both methods. Combining acoustic surveys with stereo-BRUVS, such as the preliminary work by Landero-Figueroa et al. (2016), or other sampling techniques can help overcome the limitations of each method and provide more accurate and comprehensive information on fish populations for stock assessment and ecosystem-based fisheries management. This non-invasive approach enables continuous monitoring of fish populations without harming the organisms or their habitats, offering a promising alternative for sustainable fishery management.

5 Conclusion

The semi-automated length measurement method presented here builds on and advances previously published DL-based fish detection from stereo-BRUVS imagery (Marrable et al., 2022). This new method combines that fish detection approach, which isolates and crops individual fish from a busy scene, with a new DL model for detecting the head and tail, and applies photogrammetry to determine fish length measurements.

Although not completely autonomous, the machine-assisted, semi-automated labelling approach solves the object correspondence challenge and allows expert contextual knowledge to choose which fish (and in which pose) are sent for analysis using DL. This is expected to significantly reduce labour and analysis time: the manual process of precisely locating the head and tail of the fish in both images with four carefully placed mouse clicks is replaced by two fast clicks anywhere on a fish, while still using expert knowledge to truth and validate the result. By accelerating stereo-BRUVS analysis, more imagery can be processed, thereby increasing the amount of data available for environmental reporting and decision making.

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here: https://github.com/open-AIMS/ozfish.

Author contributions

DM, MW, ST, and SB contributed to the development of the study design. DM, KB, ST, EH, MW, MS, and SLB contributed to the writing of the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Andrialovanirina, N., Ponton, D., Behivoke, F., Mahafina, J., and Léopold, M. (2020). A powerful method for measuring fish size of small-scale fishery catches using ImageJ. Fish. Res. 223, 105425. doi: 10.1016/j.fishres.2019.105425

Atienza-Vanacloig, V., Andreu-García, G., López-García, F., Valiente-González, J. M., and Puig-Pons, V. (2016). Vision-based discrimination of tuna individuals in grow-out cages through a fish bending model. Comput. Electron. Agric. 130, 142–150. doi: 10.1016/j.compag.2016.10.009

Australian Institute of Marine Science (2020). OzFish dataset – machine learning dataset for baited remote underwater video stations. doi: 10.25845/5E28F062C5097

Bertinetto, L., Valmadre, J., Henriques, J. F., Vedaldi, A., and Torr, P. H. S. (2016). “Fully-convolutional siamese networks for object tracking,” in Computer Vision – ECCV 2016 Workshops (Amsterdam, The Netherlands: Springer International Publishing), 850–865.

Boutros, N., Shortis, M. R., and Harvey, E. S. (2015). A comparison of calibration methods and system configurations of underwater stereo-video systems for applications in marine ecology. Limnol. Oceanogr. Methods 13, 224–236. doi: 10.1002/lom3.10020

Bradley, D., Merrifield, M., Miller, K. M., Lomonico, S., Wilson, J. R., and Gleason, M. G. (2019). Opportunities to improve fisheries management through innovative technology and advanced data systems. Fish Fish. 20, 564–583. doi: 10.1111/faf.12361

Brock, V. E. (1954). A preliminary report on a method of estimating reef fish populations. J. Wildl. Manage. 18, 297–308. doi: 10.2307/3797016

Cappo, M. A., Harvey, E., Malcolm, H., and Speare, P. (2003). Potential of video techniques to monitor diversity, abundance and size of fish in studies of marine protected areas. Aquat. Protected Areas – What Works Best and How Do We Know 1, 455–464.

Cappo, M., Speare, P., Wassenberg, T., Harvey, E., Rees, M., Heyward, A., et al. (2001). Direct sensing of the size frequency and abundance of target and non-target fauna in Australian fisheries – a national workshop. 63–71.

Connolly, R. M., Fairclough, D. V., Jinks, E. L., Ditria, E. M., Jackson, G., Lopez-Marcano, S., et al. (2021). Improved accuracy for automated counting of a fish in baited underwater videos for stock assessment. Front. Mar. Sci. 8. doi: 10.3389/fmars.2021.658135

Cresswell, I., Janke, T., and Johnston, E. (2022). Australia State of the Environment 2021: Overview (Australia: Department of Agriculture, Water and the Environment).

Demer, D., Michaels, W., Cambronero Solano, S., Paramo, J., and Roa, C. (2020). Integrated optic-acoustic studies of reef fish: report of the 2018 GCFI field study and workshop. NOAA Technical Memorandum NMFS-F/SPO-209 (Washington, DC: National Oceanic and Atmospheric Administration).

Dowling, N. A., Wilson, J. R., Rudd, M. B., Babcock, E. A., Caillaux, M., Cope, J., et al. (2016). FishPath: a decision support system for assessing and managing data- and capacity-limited fisheries. doi: 10.4027/amdlfs.2016.03

Duarte, C. M., Agusti, S., Barbier, E., Britten, G. L., Castilla, J. C., Gattuso, J.-P., et al. (2020). Rebuilding marine life. Nature 580, 39–51. doi: 10.1038/s41586-020-2146-7

Edwards, A. M., Robinson, J. P. W., Plank, M. J., Baum, J. K., and Blanchard, J. L. (2017). Testing and recommending methods for fitting size spectra to data. Methods Ecol. Evol. 8, 57–67. doi: 10.1111/2041-210X.12641

Ellis, D. M., and DeMartini, E. E. (1995). Evaluation of a video camera technique for indexing abundances of juvenile pink snapper, Pristipomoides filamentosus, and other Hawaiian insular shelf fishes. Oceanographic Literature Rev. 9, 786.

French, G., Mackiewicz, M., Fisher, M., Holah, H., Kilburn, R., Campbell, N., et al. (2019). Deep neural networks for analysis of fisheries surveillance video and automated monitoring of fish discards. ICES J. Mar. Sci. 77, 1340–1353. doi: 10.1093/icesjms/fsz149

Friedlander, A. M., and DeMartini, E. E. (2002). Contrasts in density, size, and biomass of reef fishes between the northwestern and the main Hawaiian islands: the effects of fishing down apex predators. Mar. Ecol. Prog. Ser. 230, 253–264. doi: 10.3354/meps230253

Froese, R., Winker, H., Coro, G., Demirel, N., Tsikliras, A. C., Dimarchopoulou, D., et al. (2019). On the pile-up effect and priors for Linf and M/K: response to a comment by Hordyk et al. on “A new approach for estimating stock status from length frequency data”. ICES J. Mar. Sci. 76, 461–465. doi: 10.1093/icesjms/fsy199

Garcia-d’Urso, N., Galan-Cuenca, A., Pérez-Sánchez, P., Climent-Pérez, P., Fuster-Guillo, A., Azorin-Lopez, J., et al. (2022). The DeepFish computer vision dataset for fish instance segmentation, classification, and size estimation. Sci. Data 9, 1–7. doi: 10.1038/s41597-022-01416-0

Goetze, J. S., Bond, T., McLean, D. L., Saunders, B. J., Langlois, T. J., Lindfield, S., et al. (2019). A field and video analysis guide for diver operated stereo-video. Methods Ecol. Evol. 10 (7), 1083–1090. doi: 10.1111/2041-210X.13189

Harvey, E. S., Cappo, M., Butler, J. J., Hall, N., and Kendrick, G. A. (2007). Bait attraction affects the performance of remote underwater video stations in assessment of demersal fish community structure. Mar. Ecol. Prog. Ser. 350, 245–254. doi: 10.3354/meps07192

Harvey, E., Cappo, M., Shortis, M., Robson, S., Buchanan, J., and Speare, P. (2003). The accuracy and precision of underwater measurements of length and maximum body depth of southern bluefin tuna (Thunnus maccoyii) with a stereo-video camera system. Fish. Res. 63 (3), 315–326. doi: 10.1016/S0165-7836(03)00080-8

Harvey, E., Fletcher, D., and Shortis, M. (2001a). A comparison of the precision and accuracy of estimates of reef-fish lengths determined visually by divers with estimates produced by a stereo-video system. Fish. Bull. 99, 63.

Harvey, E., Fletcher, D., and Shortis, M. (2001b). Improving the statistical power of length estimates of reef fish: a comparison of estimates determined visually by divers with estimates produced by a stereo-video system. Fish. Bull. 99, 72–80.

Harvey, E., Fletcher, D., and Shortis, M. (2002). Estimation of reef fish length by divers and by stereo-video. A first comparison of the accuracy and precision in the field on living fish under operational conditions. Fish. Res. 57, 255–265. doi: 10.1016/S0165-7836(01)00356-3

Harvey, E., Goetze, J., McLaren, B., Langlois, T., and Shortis, M. (2010). Influence of range, angle of view, image resolution and image compression on underwater stereo-video measurements: high-definition and broadcast-resolution video cameras compared. Mar. Technol. Soc. J. 44, 75–85. doi: 10.4031/MTSJ.44.1.3

Harvey, E. S., McLean, D. L., Goetze, J. S., Saunders, B. J., Langlois, T. J., Monk, J., et al. (2021). The BRUVs workshop – an Australia-wide synthesis of baited remote underwater video data to answer broad-scale ecological questions about fish, sharks and rays. Mar. Policy 127, 104430. doi: 10.1016/j.marpol.2021.104430

Harvey, E., and Shortis, M. (1995). A system for stereo-video measurement of sub-tidal organisms. Mar. Technol. Soc. J. 29, 10–22.

Harvey, E. S., and Shortis, M. R. (1998). Calibration stability of an underwater stereo video system: implications for measurement accuracy and precision. Mar. Technol. Soc. J. 32 (2), 3–17.

Hellmrich, L. S., Saunders, B. J., Parker, J. R. C., Goetze, J. S., and Harvey, E. S. (2023). Stereo-ROV surveys of tropical reef fishes are comparable to stereo-DOVs with reduced behavioural biases. Estuar. Coast. Shelf Sci. 281, 108210. doi: 10.1016/j.ecss.2022.108210

Hu, W., Wang, Q., Zhang, L., Bertinetto, L., and Torr, P. H. S. (2022). SiamMask: a framework for fast online object tracking and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 45 (3), 3072–3089. doi: 10.1109/TPAMI.2022.3172932

Jennings, S., and Kaiser, M. J. (1998). “The effects of fishing on marine ecosystems,” in Advances in Marine Biology, vol. 34. Eds. J. H. S. Blaxter, A. J. Southward and P. A. Tyler (Academic Press), 201–352.

Jennings, S., and Polunin, N. V. C. (1997). Impacts of predator depletion by fishing on the biomass and diversity of non-target reef fish communities. Coral Reefs 16, 71–82. doi: 10.1007/s003380050061

Jessop, S. A., Saunders, B. J., Goetze, J. S., and Harvey, E. S. (2022). A comparison of underwater visual census, baited, diver operated and remotely operated stereo-video for sampling shallow water reef fishes. Estuar. Coast. Shelf Sci. 276, 108017. doi: 10.1016/j.ecss.2022.108017

Johansson, C., Stowar, M., and Cappo, M. (2008). The use of stereo BRUVS for measuring fish size. Marine and Tropical Sciences Research Facility Report Series (Cape Cleveland, Australia: Australian Institute of Marine Science).

Landero-Figueroa, M. M., Parnum, I., Saunders, B. J., and Parsons, M. (2016). Integrating echo-sounder and underwater video data for demersal fish assessment (Brisbane, Australia: Acoustics).

Langlois, T., Goetze, J., Bond, T., Monk, J., Abesamis, R. A., Asher, J., et al. (2020). A field and video annotation guide for baited remote underwater stereo-video surveys of demersal fish assemblages. Methods Ecol. Evol. 11, 1401–1409. doi: 10.1111/2041-210X.13470

Lezama-Cervantes, C., Godínez-Domínguez, E., Gómez-Morales, H., Ornelas-Luna, R., Morales-Blake, A. R., Patiño-Barragán, M., et al. (2017). A suitable ichthyometer for systemic application. Lat. Am. J. Aquat. Res. 45, 870–878. doi: 10.3856/vol45-issue5-fulltext-1

Li, S., Zhang, Z., Li, B., and Li, C. (2018). Multiscale rotated bounding box-based deep learning method for detecting ship targets in remote sensing images. Sensors 18. doi: 10.3390/s18082702

Little, R., and Hill, N. (2021). 2021 State of the Environment report marine chapter – expert assessment – management effectiveness – commercial fishing. doi: 10.26198/WWR3-4D52

Lopez-Marcano, S., Jinks, E. L., Buelow, C. A., Brown, C. J., Wang, D., Kusy, B., et al. (2021). Automatic detection of fish and tracking of movement for ecology. Ecol. Evol. 11, 8254–8263. doi: 10.1002/ece3.7656

MacNeil, M. A., Chapman, D. D., Heupel, M., Simpfendorfer, C. A., Heithaus, M., Meekan, M., et al. (2020). Global status and conservation potential of reef sharks. Nature 583, 801–806. doi: 10.1038/s41586-020-2519-y

Marrable, D., Barker, K., Tippaya, S., Wyatt, M., Bainbridge, S., Stowar, M., et al. (2022). Accelerating species recognition and labelling of fish from underwater video with machine-assisted deep learning. Front. Mar. Sci. 9. doi: 10.3389/fmars.2022.944582

McClanahan, T. R., Muthiga, N. A., Kamukuru, A. T., Machano, H., and Kiambo, R. W. (1999). The effects of marine parks and fishing on coral reefs of northern Tanzania. Biol. Conserv. 89, 161–182. doi: 10.1016/S0006-3207(98)00123-2

Melnychuk, M. C., Kurota, H., Mace, P. M., Pons, M., Minto, C., Osio, G. C., et al. (2021). Identifying management actions that promote sustainable fisheries. Nat. Sustainability 4, 440–449. doi: 10.1038/s41893-020-00668-1

Miranda, J. M., and Romero, M. (2017). A prototype to measure rainbow trout’s length using image processing. Aquacult. Eng. 76, 41–49. doi: 10.1016/j.aquaeng.2017.01.003

Monkman, G. G., Hyder, K., Kaiser, M. J., and Vidal, F. P. (2019). Using machine vision to estimate fish length from images using regional convolutional neural networks. Methods Ecol. Evol. 10, 2045–2056. doi: 10.1111/2041-210X.13282

Needle, C. L., Dinsdale, R., Buch, T. B., Catarino, R. M., Drewery, J., Butler, N., et al. (2015). Scottish science applications of remote electronic monitoring. ICES J. Mar. Sci. 72 (4), 1214–1229. doi: 10.1093/icesjms/fsu225

Pauly, D., Christensen, V., Guénette, S., Pitcher, T. J., Sumaila, U. R., Walters, C. J., et al. (2002). Towards sustainability in world fisheries. Nature 418, 689–695. doi: 10.1038/nature01017

Pauly, D., and Morgan, G. R. (1987). Length-based methods in fisheries research. WorldFish, 299.

Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016). “You only look once: unified, real-time object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 779–788.

Roberts, C. M. (1995). Rapid build-up of fish biomass in a Caribbean marine reserve. Conserv. Biol. 9, 815–826. doi: 10.1046/j.1523-1739.1995.09040815.x

Rochet, M.-J., and Trenkel, V. M. (2003). Which community indicators can measure the impact of fishing? A review and proposals. Can. J. Fish. Aquat. Sci. 60, 86–99. doi: 10.1139/f02-164

Rudd, M. B., and Thorson, J. T. (2018). Accounting for variable recruitment and fishing mortality in length-based stock assessments for data-limited fisheries. Can. J. Fish. Aquat. Sci. 75, 1019–1035. doi: 10.1139/cjfas-2017-0143

Ryan, T. E., and Kloser, R. J. (2016). Improved estimates of orange roughy biomass using an acoustic-optical system in commercial trawlnets. ICES J. Mar. Sci. 73, 2112–2124. doi: 10.1093/icesjms/fsw009

Schramm, K. D., Marnane, M. J., Elsdon, T. S., Jones, C., Saunders, B. J., Goetze, J. S., et al. (2020). A comparison of stereo-BRUVs and stereo-ROV techniques for sampling shallow water fish communities on and off pipelines. Mar. Environ. Res. 162, 105198. doi: 10.1016/j.marenvres.2020.105198

Seguin, R., Mouillot, D., Cinner, J. E., Stuart Smith, R. D., Maire, E., Graham, N. A. J., et al. (2022). Towards process-oriented management of tropical reefs in the Anthropocene. Nat. Sustain. 6 (2), 148–157. doi: 10.1038/s41893-022-00981-x

Shafait, F., Harvey, E. S., Shortis, M. R., Mian, A., et al. (2017). Towards automating underwater measurement of fish length: a comparison of semi-automatic and manual stereo-video measurements. ICES J. Mar. Sci. 74 (6), 1690–1701. doi: 10.1093/icesjms/fsx007

Shi, C., Wang, Q., He, X., Zhang, X., and Li, D. (2020). An automatic method of fish length estimation using underwater stereo system based on LabVIEW. Comput. Electron. Agric. 173, 105419. doi: 10.1016/j.compag.2020.105419

Shortis, M. (2015). Calibration techniques for accurate measurements by underwater camera systems. Sensors 15, 30810–30826. doi: 10.3390/s151229831

Steneck, R. S., and Pauly, D. (2019). Fishing through the Anthropocene. Curr. Biol. 29, R987–R992. doi: 10.1016/j.cub.2019.07.081

Suo, F., Huang, K., Ling, G., Li, Y., and Xiang, J. (2020). “Fish keypoints detection for ecology monitoring based on underwater visual intelligence,” in 2020 16th International Conference on Control, Automation, Robotics and Vision (ICARCV), 542–547. doi: 10.1109/ICARCV50220.2020.9305424

Tonachella, N., Martini, A., Martinoli, M., Pulcini, D., Romano, A., and Capoccioni, F. (2022). An affordable and easy-to-use tool for automatic fish length and weight estimation in mariculture. Sci. Rep. 12, 15642. doi: 10.1038/s41598-022-19932-9

Upton, K. R., and Riley, L. G. (2013). Acute stress inhibits food intake and alters ghrelin signaling in the brain of tilapia (Oreochromis mossambicus). Domest. Anim. Endocrinol. 44, 157–164. doi: 10.1016/j.domaniend.2012.10.001

Uranga, J., Arrizabalaga, H., Boyra, G., Hernandez, M. C., Goñi, N., Arregui, I., et al. (2017). Detecting the presence-absence of bluefin tuna by automated analysis of medium-range sonars on fishing vessels. PloS One 12, e0171382. doi: 10.1371/journal.pone.0171382

White, D. J., Svellingen, C., and Strachan, N. J. C. (2006). Automated measurement of species and length of fish by computer vision. Fish. Res. 80, 203–210. doi: 10.1016/j.fishres.2006.04.009

Whitmarsh, S. K., Fairweather, P. G., and Huveneers, C. (2017). What is big BRUVver up to? Methods and uses of baited underwater video. Rev. Fish Biol. Fish. 27, 53–73. doi: 10.1007/s11160-016-9450-1

Wibisono, E., Mous, P., Firmana, E., and Humphries, A. (2022). A crew-operated data recording system for length-based stock assessment of Indonesia’s deep demersal fisheries. PloS One 17, e0263646. doi: 10.1371/journal.pone.0263646

Wilson, S. K., Graham, N. A. J., Holmes, T. H., MacNeil, M. A., and Ryan, N. M. (2018). Visual versus video methods for estimating reef fish biomass. Ecol. Indic. 85, 146–152. doi: 10.1016/j.ecolind.2017.10.038

Yang, L., Liu, Y., Yu, H., Fang, X., Song, L., Li, D., et al. (2021). Computer vision models in intelligent aquaculture with emphasis on fish detection and behavior analysis: a review. Arch. Comput. Methods Eng. 28, 2785–2816. doi: 10.1007/s11831-020-09486-2

Zhao, S., Zhang, S., Liu, J., Wang, H., Zhu, J., Li, D., et al. (2021). Application of machine learning in intelligent fish aquaculture: a review. Aquaculture 540, 736724. doi: 10.1016/j.aquaculture.2021.736724