dev(narugo): doc update for person (75a0a2fb)

imgutils/detect/person.py

+44 −33

		"""
		Overview:
		Detect human bodies (including the entire body) in anime images.

		Trained on dataset `AniDet3 <https://universe.roboflow.com/university-of-michigan-ann-arbor/anidet3-ai42v>`_ \
		with YOLOv8.
		This module provides functionality for detecting human bodies (including the entire body) in anime images.
		It uses YOLOv8 models trained on the `AniDet3 <https://universe.roboflow.com/university-of-michigan-ann-arbor/anidet3-ai42v>`_
		dataset from Roboflow.

		.. image:: person_detect_demo.plot.py.svg
		:align: center

		The module includes a main function :func:`detect_person` for performing the detection task,
		and uses the ``yolo_predict`` function from the generic module for the actual prediction.

		The module supports different model levels and versions, allowing users to choose
		between speed and accuracy based on their requirements.

		This is an overall benchmark of all the person detection models:

		.. image:: person_detect_benchmark.plot.py.svg
		@@ -25,46 +30,52 @@ _REPO_ID = 'deepghs/anime_person_detection'
		def detect_person(image: ImageTyping, level: str = 'm', version: str = 'v1.1', model_name: Optional[str] = None,
		conf_threshold: float = 0.3, iou_threshold: float = 0.5):
		"""
		Overview:
		Detect human bodies (including the entire body) in anime images.

		:param image: Image to detect.
		:param level: The model level being used can be either ``n``, ``s``, ``m`` or ``x``.
		The ``n`` model runs faster with smaller system overhead, while the ``m`` model achieves higher accuracy.
		The default value is ``m``.
		:param version: Version of model, default is ``v1.1``. Available versions are ``v0``, ``v1`` and ``v1.1``.
		:param max_infer_size: The maximum image size used for model inference, if the image size exceeds this limit,
		the image will be resized and used for inference. The default value is ``640`` pixels.
		:param conf_threshold: The confidence threshold, only detection results with confidence scores above
		this threshold will be returned. The default value is `0.3`.
		:param iou_threshold: The detection area coverage overlap threshold, areas with overlaps above this threshold
		will be discarded. The default value is `0.5`.
		:return: The detection results list, each item includes the detected area `(x0, y0, x1, y1)`,
		the target type (always `person`) and the target confidence score.

		Examples::
		This function uses YOLOv8 models to detect human bodies in anime-style images.
		It supports different model levels and versions, allowing users to balance between
		detection speed and accuracy.

		:param image: The input image for detection. Can be various types as defined by ImageTyping.
		:type image: ImageTyping

		:param level: The model level to use. Options are 'n', 's', 'm', or 'x'.
		'n' is the fastest but least accurate, 'x' is the most accurate but slowest. Defaults to 'm'.
		:type level: str

		:param version: The version of the model to use. Available versions are 'v0', 'v1', and 'v1.1'. Defaults to 'v1.1'.
		:type version: str

		:param model_name: Optional custom model name. If provided, overrides the auto-generated model name.
		:type model_name: Optional[str]

		:param conf_threshold: Confidence threshold for detections. Only detections with
		confidence above this value are returned. Defaults to 0.3.
		:type conf_threshold: float

		:param iou_threshold: Intersection over Union (IoU) threshold for non-maximum suppression.
		Detections whose overlap exceeds this value are discarded. Defaults to 0.5.
		:type iou_threshold: float

		:return: A list of detection results. Each result is a tuple containing:
		((x0, y0, x1, y1), 'person', confidence_score)
		:rtype: List[Tuple[Tuple[int, int, int, int], str, float]]

		:raises ValueError: If an invalid level or version is provided.

		Example:
		>>> from imgutils.detect import detect_person, detection_visualize
		>>>
		>>> image = 'genshin_post.jpg'
		>>> result = detect_person(image)
		>>> result
		>>> print(result)
		[
		((371, 232, 564, 690), 'person', 0.7533698678016663),
		((30, 135, 451, 716), 'person', 0.6788613796234131),
		((614, 393, 830, 686), 'person', 0.5612757205963135),
		((614, 3, 1275, 654), 'person', 0.4047100841999054)
		]
		>>>
		>>> # visualize it
		>>> from matplotlib import pyplot as plt
		>>> plt.imshow(detection_visualize(image, result))
		>>> plt.show()

		.. note::
		Not every combination of version and level has a corresponding model.
		Before choosing one, check the benchmark chart at the top of this page, which lists
		the versions and models that are available.

		For visualization of results, you can use the :func:`imgutils.detect.visual.detection_visualize` function.
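As a rough illustration of what ``conf_threshold`` and ``iou_threshold`` mean for the returned tuples, here is a minimal sketch in plain Python. The helpers ``box_iou`` and ``filter_by_confidence`` are hypothetical names introduced only for this example and are not part of imgutils. The sample list is copied from the example output above, and the library's own suppression logic may differ in detail.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]
Detection = Tuple[Box, str, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two (x0, y0, x1, y1) boxes (hypothetical helper)."""
    # Corners of the overlapping rectangle, if any.
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_by_confidence(dets: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections whose score is above the threshold (hypothetical helper)."""
    return [d for d in dets if d[2] > threshold]


# Sample data copied from the docstring example; not produced by running a model here.
detections: List[Detection] = [
    ((371, 232, 564, 690), 'person', 0.7533698678016663),
    ((30, 135, 451, 716), 'person', 0.6788613796234131),
    ((614, 393, 830, 686), 'person', 0.5612757205963135),
    ((614, 3, 1275, 654), 'person', 0.4047100841999054),
]

# A stricter cut-off than the default 0.3 keeps only the two strongest boxes.
confident = filter_by_confidence(detections, 0.6)

# The two strongest boxes overlap only slightly, so their IoU is well below 0.5.
overlap = box_iou(detections[0][0], detections[1][0])
```

Raising the confidence cut-off drops the weaker detections. The IoU value shows why the first two boxes both survive suppression at the default threshold of 0.5.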
		"""
		return yolo_predict(
		image=image,